  1. Nov 23, 2019
  2. Nov 22, 2019
    • mac80211: Use Airtime-based Queue Limits (AQL) on packet dequeue · 7a89233a
      Toke Høiland-Jørgensen authored
      
      
      The previous commit added the ability to throttle stations when they queue
      too much airtime in the hardware. This commit enables the functionality by
      calculating the expected airtime usage of each packet that is dequeued from
      the TXQs in mac80211, and accounting that as pending airtime.
      
      The estimated airtime for each skb is stored in the tx_info, so we can
      subtract the same amount from the running total when the skb is freed or
      recycled. The throttling mechanism relies on this accounting to be
      accurate (i.e., that we are not freeing skbs without subtracting any
      airtime they were accounted for), so we put the subtraction into
      ieee80211_report_used_skb(). As an optimisation, we also subtract the
      airtime on regular TX completion, zeroing out the value stored in the
      packet afterwards, to avoid having to do an expensive lookup of the station
      from the packet data on every packet.
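
      The accounting described above can be sketched roughly as follows. This
      is a minimal illustration, not the mac80211 code: all struct and
      function names are hypothetical, and the airtime estimate is passed in
      rather than computed.

```c
#include <stdint.h>

/* Hypothetical stand-in for a station: tracks airtime queued in hardware. */
struct sta_sketch {
	uint32_t pending_airtime_us;
};

/* Hypothetical stand-in for a packet and the estimate stored in its tx_info. */
struct pkt_sketch {
	struct sta_sketch *sta;
	uint16_t airtime_est_us;
};

/* On dequeue: store the estimate in the packet and count it as pending. */
void sketch_dequeue(struct sta_sketch *sta, struct pkt_sketch *pkt,
		    uint16_t est_us)
{
	pkt->sta = sta;
	pkt->airtime_est_us = est_us;
	sta->pending_airtime_us += est_us;
}

/* Subtract whatever is still accounted for this packet, then zero it. */
void sketch_release_airtime(struct pkt_sketch *pkt)
{
	if (!pkt->airtime_est_us)
		return;
	pkt->sta->pending_airtime_us -= pkt->airtime_est_us;
	pkt->airtime_est_us = 0;
}

/* Fast path: regular TX completion. */
void sketch_tx_complete(struct pkt_sketch *pkt)
{
	sketch_release_airtime(pkt);
}

/* Catch-all path: runs whenever a packet is freed or recycled. */
void sketch_report_used(struct pkt_sketch *pkt)
{
	sketch_release_airtime(pkt);
}
```

      Zeroing the stored value is what makes it safe for both paths to run on
      the same packet: the second call finds nothing left to subtract.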
      
      This patch does *not* include any mechanism to wake a throttled TXQ again,
      on the assumption that this will happen anyway as a side effect of whatever
      freed the skb (most commonly a TX completion).
      
      Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/r/20191119060610.76681-5-kyan@google.com
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      7a89233a
    • mac80211: Implement Airtime-based Queue Limit (AQL) · 3ace10f5
      Kan Yan authored
      
      
      For the Fq_CoDel algorithm integrated in the mac80211 layer to control
      excessive queueing latency effectively, the CoDel algorithm requires an
      accurate measure of how long packets stay in the queue, a.k.a. the
      sojourn time. The sojourn time measured at the mac80211 layer doesn't
      include queueing latency in the lower layer (firmware/hardware), and
      CoDel expects the lower layer to have a short queue. However, most
      802.11ac chipsets offload tasks such as TX aggregation to firmware or
      hardware, and thus have a deep lower-layer queue.

      Without a mechanism to control the lower-layer queue size, packets only
      stay in the mac80211 layer transiently before being sent to the firmware
      queue.
      As a result, the sojourn time measured by CoDel in the mac80211 layer is
      almost always lower than the CoDel latency target, hence CoDel does little
      to control the latency, even when the lower layer queue causes excessive
      latency.
      
      The Byte Queue Limits (BQL) mechanism is commonly used to address a
      similar issue with wired network interfaces. However, this method cannot
      be applied directly to wireless network interfaces. "Bytes" is not a
      suitable measure of queue depth in a wireless network, as the data rate
      can vary dramatically from station to station in the same network, from
      a few Mbps to over a Gbps.
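
      To illustrate why a byte count is a poor proxy for queue depth here, the
      following sketch converts a byte count to airtime at a given rate. It is
      a simplified assumption (payload bits only, no preamble, aggregation or
      retries), and the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Airtime in microseconds for `bytes` of payload at `rate_mbps`.
 * 1 Mbps equals 1 bit per microsecond, so airtime = bits / rate.
 * Simplification: ignores preambles, headers and retransmissions.
 */
uint64_t sketch_airtime_us(uint64_t bytes, uint64_t rate_mbps)
{
	return (bytes * 8) / rate_mbps;
}
```

      With this approximation, the same 64 KiB of queued data occupies about
      87 ms of airtime at 6 Mbps but only about 0.5 ms at 1000 Mbps, so a
      single byte limit cannot suit both stations.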
      
      This patch implements an Airtime-based Queue Limit (AQL) to make CoDel work
      effectively with wireless drivers that utilize firmware/hardware
      offloading. AQL allows each txq to release just enough packets to the
      lower layer to form 1-2 large aggregations, which keeps the hardware
      fully utilized, and retains the rest of the frames in the mac80211 layer
      to be controlled by the CoDel algorithm.
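
      The release decision can be pictured with the sketch below. It is only
      an illustration under assumptions: the limit is a plain parameter rather
      than the value used by the patch, per-packet airtime estimates are given
      as an array, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Release packets to the lower layer while the station's pending airtime
 * is still below the limit. Returns how many packets were released; the
 * rest stay in the upper-layer queue where CoDel can act on them.
 */
size_t sketch_release_until_limit(const uint32_t *pkt_airtime_us, size_t n,
				  uint32_t pending_us, uint32_t limit_us)
{
	size_t released = 0;

	while (released < n && pending_us < limit_us) {
		pending_us += pkt_airtime_us[released];
		released++;
	}
	return released;
}
```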
      
      Signed-off-by: Kan Yan <kyan@google.com>
      [ Toke: Keep API to set pending airtime internal, fix nits in commit msg ]
      Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/r/20191119060610.76681-4-kyan@google.com
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      3ace10f5
    • mac80211: Import airtime calculation code from mt76 · db3e1c40
      Toke Høiland-Jørgensen authored
      
      
      Felix recently added code to calculate airtime of packets to the mt76
      driver. Import this into mac80211 so we can use it for airtime queue limit
      calculations.
      
      The airtime.c file is copied verbatim from the mt76 driver, and adjusted to
      be usable in mac80211. This involves:
      
      - Switching to mac80211 data structures.
      - Adding support for 160 MHz channels and HE mode.
      - Moving the symbol and duration calculations around a bit to avoid
        rounding with the higher rates and longer symbol times used for HE rates.
      
      The per-rate TX rate calculation is also split out to its own function so
      it can be used directly for the AQL calculations later.
      
      Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/r/20191119060610.76681-3-kyan@google.com
      [fix HE_GROUP_IDX() to use 3 * bw, since there are 3 _gi values]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      db3e1c40
    • virt_wifi: fix use-after-free in virt_wifi_newlink() · bc71d8b5
      Taehee Yoo authored
      When a virt_wifi interface is created, virt_wifi_newlink() is called and
      it calls register_netdevice().
      If register_netdevice() fails, it internally calls ->priv_destructor(),
      which is virt_wifi_net_device_destructor(), and that frees the netdev.
      However, virt_wifi_newlink() still uses the netdev afterwards, so a
      use-after-free occurs in virt_wifi_newlink().
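
      The failure pattern can be modelled in isolation as below. This is a
      simplified simulation with hypothetical names, not the virt_wifi code
      and not necessarily the fix applied by this commit: "freeing" only sets
      a flag so the stale access can be observed safely. The safer variant
      shows one common way to avoid the problem, which is to install the
      destructor only once registration has succeeded.

```c
#include <stdbool.h>
#include <stddef.h>

struct dev_sketch {
	bool freed;                                 /* set instead of really freeing */
	void (*destructor)(struct dev_sketch *dev); /* models ->priv_destructor() */
};

void sketch_destructor(struct dev_sketch *dev)
{
	dev->freed = true;
}

/* Models a register step that runs the destructor itself on failure. */
int sketch_register(struct dev_sketch *dev, bool fail)
{
	if (!fail)
		return 0;
	if (dev->destructor)
		dev->destructor(dev);
	return -1;
}

/* Returns true if the error path ends up touching an already freed device. */
bool sketch_newlink_buggy(struct dev_sketch *dev, bool fail)
{
	dev->destructor = sketch_destructor; /* installed before registering */
	if (sketch_register(dev, fail) < 0)
		return dev->freed;           /* caller still uses dev here */
	return false;
}

/* Same flow, but the destructor is installed only after success. */
bool sketch_newlink_safer(struct dev_sketch *dev, bool fail)
{
	dev->destructor = NULL;
	if (sketch_register(dev, fail) < 0)
		return dev->freed;           /* dev is still owned by the caller */
	dev->destructor = sketch_destructor;
	return false;
}
```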
      
      Test commands:
          ip link add dummy0 type dummy
          modprobe bonding
          ip link add bonding_masters link dummy0 type virt_wifi
      
      Splat looks like:
      [  202.220554] BUG: KASAN: use-after-free in virt_wifi_newlink+0x88b/0x9a0 [virt_wifi]
      [  202.221659] Read of size 8 at addr ffff888061629cb8 by task ip/852
      
      [  202.222896] CPU: 1 PID: 852 Comm: ip Not tainted 5.4.0-rc5 #3
      [  202.223765] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [  202.225073] Call Trace:
      [  202.225532]  dump_stack+0x7c/0xbb
      [  202.226869]  print_address_description.constprop.5+0x1be/0x360
      [  202.229362]  __kasan_report+0x12a/0x16f
      [  202.230714]  kasan_report+0xe/0x20
      [  202.232595]  virt_wifi_newlink+0x88b/0x9a0 [virt_wifi]
      [  202.233370]  __rtnl_newlink+0xb9f/0x11b0
      [  202.244909]  rtnl_newlink+0x65/0x90
      [ ... ]
      
      Cc: stable@vger.kernel.org
      Fixes: c7cdba31 ("mac80211-next: rtnetlink wifi simulation device")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Link: https://lore.kernel.org/r/20191121122645.9355-1-ap420073@gmail.com
      [trim stack dump a bit]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      bc71d8b5
    • mac80211: consider QoS Null frames for STA_NULLFUNC_ACKED · 08a5bdde
      Thomas Pedersen authored
      Commit 7b6ddeaf ("mac80211: use QoS NDP for AP probing")
      let STAs send QoS Null frames as PS triggers if the AP was
      a QoS STA.  However, the mac80211 PS stack relies on an
      interface flag IEEE80211_STA_NULLFUNC_ACKED for
      determining trigger frame ACK, which was only being set for
      acked non-QoS Null frames. The effect is an inability to
      trigger hardware sleep via IEEE80211_CONF_PS since the QoS
      Null frame was seemingly never acked.
      
      This bug only applies to drivers which set both
      IEEE80211_HW_REPORTS_TX_ACK_STATUS and
      IEEE80211_HW_PS_NULLFUNC_STACK.
      
      Detect the acked QoS Null frame to restore STA power save.
      
      Fixes: 7b6ddeaf ("mac80211: use QoS NDP for AP probing")
      Signed-off-by: Thomas Pedersen <thomas@adapt-ip.com>
      Link: https://lore.kernel.org/r/20191119053538.25979-4-thomas@adapt-ip.com
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      08a5bdde
    • mac80211: expose HW conf flags through debugfs · c90142a5
      Thomas Pedersen authored
      
      
      This is useful during testing, e.g. to check the currently
      configured HW power save state.
      
      Signed-off-by: Thomas Pedersen <thomas@adapt-ip.com>
      Link: https://lore.kernel.org/r/20191119053538.25979-3-thomas@adapt-ip.com
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      c90142a5
    • mac80211: Add new sta_info getter by sta/vif addrs · 5072f73c
      Toke Høiland-Jørgensen authored
      
      
      In ieee80211_tx_status() we don't have an sdata struct when looking up the
      destination sta. Instead, we just do a lookup by the vif addr that is the
      source of the packet being completed. Factor this out into a new sta_info
      getter helper, since we need to use it for accounting AQL as well.
      
      Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/r/20191112130835.382062-1-toke@redhat.com
      [remove internal rcu_read_lock(), document instead]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      5072f73c
    • mac80211: add a comment about monitor-to-dev injection · b226a826
      Johannes Berg authored
      
      
      Add a note with a use-case for the monitor-to-dev injection
      mechanism in mac80211, reported by Ben Greear.
      
      Change-Id: I6456997ef9bc40b24ede860b6ef2fed5af49cf44
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      b226a826
    • enetc: make enetc_setup_tc_mqprio static · 13baf667
      Mao Wenan authored
      When compiling with ARCH=mips CROSS_COMPILE=mips-linux-gnu- using:
      make C=2 drivers/net/ethernet/freescale/enetc/enetc.o

      one warning is reported:
      drivers/net/ethernet/freescale/enetc/enetc.c:1439:5:
      warning: symbol 'enetc_setup_tc_mqprio' was not declared.
      Should it be static?
      
      This patch makes the symbol enetc_setup_tc_mqprio static.

      Fixes: 34c6adf1 ("enetc: Configure the Time-Aware Scheduler via tc-taprio offload")
      Signed-off-by: Mao Wenan <maowenan@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      13baf667
    • Merge branch 'net-introduce-and-use-route-hint' · 7d75c0cb
      David S. Miller authored
      
      
      Paolo Abeni says:
      
      ====================
      net: introduce and use route hint
      
      This series leverages the listification infrastructure to avoid
      unnecessary route lookups on ingress packets. In the absence of custom
      rules, packets with equal daddr will usually land on the same dst.

      When processing packet bursts (lists) we can easily reference the previous
      dst entry. When we hit the 'same destination' condition we can avoid the
      route lookup, copying the already available dst.
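
      The idea can be sketched as a loop over a packet burst. This is a toy
      model under assumptions, not the kernel code: the types, the lookup
      function and the boolean that stands for "no custom rules" are all
      hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct rt_pkt_sketch {
	uint32_t daddr; /* destination address */
	int dst;        /* resolved route, modelled as a plain id */
};

/* Hypothetical full route lookup; counts how often it is invoked. */
int sketch_route_lookup(uint32_t daddr, unsigned int *lookups)
{
	(*lookups)++;
	return (int)(daddr & 0xff);
}

/*
 * Process a burst of packets. When hints are allowed and the previous
 * packet had the same daddr, reuse its dst instead of doing a lookup.
 * Returns the number of full lookups performed.
 */
unsigned int sketch_list_rcv(struct rt_pkt_sketch *pkts, size_t n,
			     bool hints_allowed)
{
	const struct rt_pkt_sketch *hint = NULL;
	unsigned int lookups = 0;

	for (size_t i = 0; i < n; i++) {
		if (hints_allowed && hint && hint->daddr == pkts[i].daddr)
			pkts[i].dst = hint->dst;
		else
			pkts[i].dst = sketch_route_lookup(pkts[i].daddr, &lookups);
		hint = &pkts[i];
	}
	return lookups;
}
```

      For a burst whose packets mostly share a destination, the number of
      lookups drops from one per packet to one per run of equal addresses.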
      
      Detailed performance numbers are available in the individual commit
      messages.
      
      v3 -> v4:
       - move helpers to their own patches (Eric D.)
       - enable hints for SUBTREE builds (David A.)
       - re-enable hints for ipv4 forward (David A.)
      
      v2 -> v3:
       - use fib*_has_custom_rules() helpers (David A.)
       - add ip*_extract_route_hint() helper (Edward C.)
       - use prev skb as hint instead of copying data (Willem)
      
      v1 -> v2:
       - fix build issue with !CONFIG_IP*_MULTIPLE_TABLES
       - fix potential race in ip6_list_rcv_finish()
      ====================
      
      Acked-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7d75c0cb
    • ipv4: use dst hint for ipv4 list receive · 02b24941
      Paolo Abeni authored
      
      
      This is similar to the previous change, with some additional
      IPv4-specific quirks. Even when using the route hint, we still have to
      perform additional per-packet checks on source address validity: a new
      helper is added to wrap them.

      Hints are explicitly disabled if the destination is a local broadcast.
      That keeps the code simple, and local broadcasts are a slower path anyway.

      UDP flood performance vs recvmmsg() receiver:
      
      vanilla		patched		delta
      Kpps		Kpps		%
      1683		1871		+11
      
      In the worst case scenario - each packet has a different
      destination address - the performance delta is within noise
      range.
      
      v3 -> v4:
       - re-enable hints for forward
      
      v2 -> v3:
       - really fix build (sic) and hint usage check
       - use fib4_has_custom_rules() helpers (David A.)
       - add ip_extract_route_hint() helper (Edward C.)
       - use prev skb as hint instead of copying data (Willem)
      
      v1 -> v2:
       - fix build issue with !CONFIG_IP_MULTIPLE_TABLES
      
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      02b24941
    • ipv4: move fib4_has_custom_rules() helper to public header · c43c3d76
      Paolo Abeni authored
      
      
      So that we can use it in the next patch.
      Additionally constify the helper argument.
      
      Suggested-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c43c3d76
    • ipv6: introduce and uses route look hints for list input. · 197dbf24
      Paolo Abeni authored
      
      
      When doing RX batch packet processing, we currently always repeat
      the route lookup for each ingress packet. When no custom rules are
      in place, and there aren't routes depending on source addresses,
      we know that packets with the same destination address will use
      the same dst.
      
      This change tries to avoid the per-packet route lookup by caching
      the destination address of the latest successful lookup and
      reusing it for the next packet when the above conditions are
      met. Ingress traffic for most servers should fit.

      The measured performance delta under UDP flood vs a recvmmsg
      receiver is as follows:
      
      vanilla		patched		delta
      Kpps		Kpps		%
      1431		1674		+17
      
      In the worst-case scenario - each packet has a different
      destination address - the performance delta is within noise
      range.
      
      v3 -> v4:
       - support hints for SUBFLOW build, too (David A.)
       - several style fixes (Eric)
      
      v2 -> v3:
       - add fib6_has_custom_rules() helpers (David A.)
       - add ip6_extract_route_hint() helper (Edward C.)
       - use hint directly in ip6_list_rcv_finish() (Willem)
      
      v1 -> v2:
       - fix build issue with !CONFIG_IPV6_MULTIPLE_TABLES
       - fix potential race when fib6_has_custom_rules is set
         while processing a packet batch
      
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      197dbf24
    • ipv6: keep track of routes using src · b9b33e7c
      Paolo Abeni authored
      
      
      Use a per namespace counter, increment it on successful creation
      of any route using the source address, decrement it on deletion
      of such routes.
      
      This allows us to check easily if the routing decision in the
      current namespace depends on the packet source. Will be used
      by the next patch.
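
      A minimal sketch of such a counter is shown below. The struct and
      function names are hypothetical and only model the bookkeeping
      described above, not the actual kernel data structures.

```c
#include <stdbool.h>

/* Hypothetical per-namespace state. */
struct netns_sketch {
	unsigned int src_routes; /* number of routes that match on source */
};

/* Hypothetical route: only records whether it uses the source address. */
struct route_sketch {
	bool uses_src;
};

/* Call after a route has been created successfully. */
void sketch_route_added(struct netns_sketch *ns, const struct route_sketch *rt)
{
	if (rt->uses_src)
		ns->src_routes++;
}

/* Call when a route is deleted. */
void sketch_route_deleted(struct netns_sketch *ns, const struct route_sketch *rt)
{
	if (rt->uses_src)
		ns->src_routes--;
}

/* Cheap check: does any routing decision here depend on the packet source? */
bool sketch_routing_depends_on_src(const struct netns_sketch *ns)
{
	return ns->src_routes != 0;
}
```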
      
      Suggested-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b9b33e7c
    • ipv6: add fib6_has_custom_rules() helper · 1f8ac570
      Paolo Abeni authored
      
      
      It wraps the namespace field with the same name, to easily
      access it regardless of build options.
      
      Suggested-by: David Ahern <dsahern@gmail.com>
      Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1f8ac570
    • Merge branch 'DSA-Felix-PTP' · 2c44713e
      David S. Miller authored
      
      
      Yangbo Lu says:
      
      ====================
      Support PTP clock and hardware timestamping for DSA Felix driver
      
      This patch set adds support for the PTP clock and hardware timestamping
      to the DSA Felix driver. Some functions in the ocelot.c/ocelot_board.c
      driver were reworked/exported so that the DSA Felix driver can
      reuse them as much as possible.

      On the TX path, timestamping is applied to packets that require a
      timestamp. The injection header is configured accordingly, and the skb
      clone that requires a timestamp is added to a list. The TX timestamp
      is finally handled in the threaded interrupt handler when the PTP
      timestamp FIFO is ready.
      On the RX path, timestamping is always enabled. The RX timestamp can
      be obtained from the extraction header.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2c44713e
    • net: dsa: ocelot: add hardware timestamping support for Felix · c0bcf537
      Yangbo Lu authored
      
      
      This patch reuses ocelot functions as much as possible to enable the PTP
      clock and to support hardware timestamping on Felix.
      On the TX path, timestamping is applied to packets that require a
      timestamp. The injection header is configured accordingly, and the skb
      clone that requires a timestamp is added to a list. The TX timestamp
      is finally handled in the threaded interrupt handler when the PTP
      timestamp FIFO is ready.
      On the RX path, timestamping is always enabled. The RX timestamp can
      be obtained from the extraction header.
      
      Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c0bcf537
    • net: dsa: ocelot: define PTP registers for felix_vsc9959 · 5df66c48
      Yangbo Lu authored
      
      
      This patch defines the PTP registers for felix_vsc9959.
      
      Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5df66c48
    • net: mscc: ocelot: convert to use ocelot_port_add_txtstamp_skb() · 400928bf
      Yangbo Lu authored
      
      
      Convert to use ocelot_port_add_txtstamp_skb() for adding skbs which
      require a TX timestamp to the list. Export it so that the DSA Felix
      driver can reuse it too.
      
      Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      400928bf
    • net: mscc: ocelot: convert to use ocelot_get_txtstamp() · e23a7b3e
      Yangbo Lu authored
      
      
      The method of getting the TX timestamp by reading the timestamp FIFO and
      matching it against the skb list is common to the DSA Felix driver too.
      So move the code out of ocelot_board.c, convert to use the
      ocelot_get_txtstamp() function and export it.
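
      The FIFO-to-list matching can be pictured with the sketch below. It is
      an illustration under assumptions, not the ocelot code: it assumes each
      pending packet and each FIFO entry carry a small id that is used for
      matching, and every name here is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_PENDING 8

/* A packet that asked for a TX timestamp and is waiting for it. */
struct pending_sketch {
	bool in_use;
	uint8_t id;
};

/* One entry read from the (modelled) timestamp FIFO. */
struct fifo_entry_sketch {
	uint8_t id;
	uint64_t tstamp;
};

/* TX path: remember the packet. Returns false if the list is full. */
bool sketch_add_pending(struct pending_sketch *list, uint8_t id)
{
	for (size_t i = 0; i < SKETCH_MAX_PENDING; i++) {
		if (!list[i].in_use) {
			list[i].in_use = true;
			list[i].id = id;
			return true;
		}
	}
	return false;
}

/*
 * Interrupt path: find the pending packet matching a FIFO entry, hand
 * back its timestamp and drop it from the list. Returns false when no
 * pending packet matches.
 */
bool sketch_match_fifo(struct pending_sketch *list,
		       const struct fifo_entry_sketch *entry,
		       uint64_t *tstamp_out)
{
	for (size_t i = 0; i < SKETCH_MAX_PENDING; i++) {
		if (list[i].in_use && list[i].id == entry->id) {
			*tstamp_out = entry->tstamp;
			list[i].in_use = false;
			return true;
		}
	}
	return false;
}
```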
      
      Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e23a7b3e
    • net: mscc: ocelot: export ocelot_hwstamp_get/set functions · f145922d
      Yangbo Lu authored
      
      
      Export the ocelot_hwstamp_get/set functions so that the DSA driver
      is able to reuse them.
      
      Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f145922d