  1. Mar 04, 2022
    • selftests: mlxsw: resource_scale: Fix return value · 196f9bc0
      Amit Cohen authored
      
      
      The test runs several test cases and is supposed to return an error in
      case at least one of them failed.
      
      Currently, the check of the return value of each test case is in the
      wrong place, which can result in the wrong return value. For example:
      
       # TESTS='tc_police' ./resource_scale.sh
       TEST: 'tc_police' [default] 968                                     [FAIL]
               tc police offload count failed
       Error: mlxsw_spectrum: Failed to allocate policer index.
       We have an error talking to the kernel
       Command failed /tmp/tmp.i7Oc5HwmXY:969
       TEST: 'tc_police' [default] overflow 969                            [ OK ]
       ...
       TEST: 'tc_police' [ipv4_max] overflow 969                           [ OK ]
      
       $ echo $?
       0
      
      Fix this by moving the check to be done after each test case.
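
      A minimal Python sketch of the failure mode, not the script itself. It assumes a per-case status that every test case overwrites and a final status that is meant to accumulate failures; run_suite, failing and passing are hypothetical names used only for illustration.

```python
def run_suite(cases, check_after_each):
    """Return 0 if every case passed, 1 if at least one failed."""
    final = 0
    ret = 0
    for case in cases:
        ret = case()  # each case overwrites the per-case status
        if check_after_each:
            final = final or ret  # failure is recorded before it is lost
    if not check_after_each:
        final = final or ret  # misplaced check: only sees the last case
    return final


def failing():
    return 1


def passing():
    return 0


# A failing case followed by a passing one, as in the example above.
cases = [failing, passing]
buggy = run_suite(cases, check_after_each=False)
fixed = run_suite(cases, check_after_each=True)
print(buggy, fixed)
```

      With the check in the wrong place the earlier failure is overwritten by the later pass; checking after each case keeps it.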
      
      Fixes: 059b18e2 ("selftests: mlxsw: Return correct error code in resource scale test")
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: mlxsw: tc_police_scale: Make test more robust · dc975207
      Amit Cohen authored
      
      
      The test adds tc filters and checks how many of them were offloaded by
      grepping for 'in_hw'.
      
      iproute2 commit f4cd4f127047 ("tc: add skip_hw and skip_sw to control
      action offload") added offload indication to tc actions, producing the
      following output:
      
       $ tc filter show dev swp2 ingress
       ...
       filter protocol ipv6 pref 1000 flower chain 0 handle 0x7c0
         eth_type ipv6
         dst_ip 2001:db8:1::7bf
         skip_sw
         in_hw in_hw_count 1
               action order 1:  police 0x7c0 rate 10Mbit burst 100Kb mtu 2Kb action drop overhead 0b
               ref 1 bind 1
               not_in_hw
               used_hw_stats immediate
      
      The current grep expression matches on both 'in_hw' and 'not_in_hw',
      resulting in incorrect results.
      
      Fix that by using JSON output instead.
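
      A rough Python sketch of why a structured check is more robust than a substring match. The JSON shape below (an "options" object with an "in_hw" flag, and actions carrying "not_in_hw") is an assumption modeled on the text output above, not a verified dump of the tc JSON output.

```python
import json

# Shortened copy of the text output shown above.
text_output = """\
filter protocol ipv6 pref 1000 flower chain 0 handle 0x7c0
  eth_type ipv6
  skip_sw
  in_hw in_hw_count 1
        action order 1:  police 0x7c0 rate 10Mbit burst 100Kb
        not_in_hw
"""

# Substring matching counts the filter line and the action line.
substring_count = sum(
    1 for line in text_output.splitlines() if "in_hw" in line
)

# Hypothetical JSON for the same single filter (shape is assumed).
json_output = json.dumps([
    {
        "protocol": "ipv6",
        "pref": 1000,
        "kind": "flower",
        "options": {
            "in_hw": True,
            "in_hw_count": 1,
            "actions": [{"kind": "police", "not_in_hw": True}],
        },
    }
])

# Only the filter-level flag is consulted, so actions cannot be miscounted.
filters = json.loads(json_output)
offloaded = sum(
    1 for f in filters if f.get("options", {}).get("in_hw") is True
)
print(substring_count, offloaded)
```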
      
      Fixes: 5061e773 ("selftests: mlxsw: Add scale test for tc-police")
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dcb: disable softirqs in dcbnl_flush_dev() · 10b6bb62
      Vladimir Oltean authored
      
      
      Ido Schimmel points out that since commit 52cff74e ("dcbnl : Disable
      software interrupts before taking dcb_lock"), the DCB API can be called
      by drivers from softirq context.
      
      One such in-tree example is the chelsio cxgb4 driver:
      dcb_rpl
      -> cxgb4_dcb_handle_fw_update
         -> dcb_ieee_setapp
      
      If the firmware for this driver happened to send an event which resulted
      in a call to dcb_ieee_setapp() at the exact same time as another
      DCB-enabled interface was unregistering on the same CPU, the softirq
      would deadlock, because the interrupted process was already holding the
      dcb_lock in dcbnl_flush_dev().
      
      Fix this unlikely event by using spin_lock_bh() in dcbnl_flush_dev() as
      in the rest of the dcbnl code.
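
      The scenario can be illustrated with a toy, single-CPU model in Python. This is not kernel code: ToyCpu and its methods are hypothetical, a threading.Lock stands in for dcb_lock, and a counter stands in for the point where a real spinlock would spin forever.

```python
import threading


class ToyCpu:
    """Toy model of one CPU; not kernel code."""

    def __init__(self):
        self.dcb_lock = threading.Lock()  # stands in for the spinlock
        self.bh_disabled = False          # True while softirqs are held off
        self.pending = []                 # softirqs deferred while disabled
        self.deadlocks = 0                # times a softirq met a held lock

    def raise_softirq(self, handler):
        # With bottom halves disabled the softirq waits; otherwise it
        # interrupts whatever the CPU was doing and runs immediately.
        if self.bh_disabled:
            self.pending.append(handler)
        else:
            handler()

    def setapp_from_softirq(self):
        # A real spinlock would spin forever here; the toy just counts.
        if not self.dcb_lock.acquire(blocking=False):
            self.deadlocks += 1
            return
        self.dcb_lock.release()

    def flush_dev(self, disable_bh):
        if disable_bh:
            self.bh_disabled = True       # models spin_lock_bh()
        self.dcb_lock.acquire()
        # A firmware event arrives while the lock is held.
        self.raise_softirq(self.setapp_from_softirq)
        self.dcb_lock.release()
        if disable_bh:
            self.bh_disabled = False      # models spin_unlock_bh()
            while self.pending:
                self.pending.pop(0)()     # deferred softirqs run now


plain = ToyCpu()
plain.flush_dev(disable_bh=False)

with_bh = ToyCpu()
with_bh.flush_dev(disable_bh=True)
print(plain.deadlocks, with_bh.deadlocks)
```

      In the model, deferring the softirq until the lock is released is what removes the self-deadlock.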
      
      Fixes: 91b0383f ("net: dcb: flush lingering app table entries for unregistered devices")
      Reported-by: Ido Schimmel <idosch@idosch.org>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/20220302193939.1368823-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. Mar 03, 2022
  3. Mar 02, 2022
    • batman-adv: Don't expect inter-netns unique iflink indices · 6c1f41af
      Sven Eckelmann authored
      
      
      The ifindex doesn't have to be unique for multiple network namespaces on
      the same machine.
      
        $ ip netns add test1
        $ ip -net test1 link add dummy1 type dummy
        $ ip netns add test2
        $ ip -net test2 link add dummy2 type dummy
      
        $ ip -net test1 link show dev dummy1
        6: dummy1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
            link/ether 96:81:55:1e:dd:85 brd ff:ff:ff:ff:ff:ff
        $ ip -net test2 link show dev dummy2
        6: dummy2: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
            link/ether 5a:3c:af:35:07:c3 brd ff:ff:ff:ff:ff:ff
      
      But the batman-adv code to walk through the various layers of virtual
      interfaces uses this assumption because dev_get_iflink handles it
      internally and doesn't return the actual netns of the iflink. And
      dev_get_iflink only documents the situation where ifindex == iflink for
      physical devices.
      
      Checking only for dev->netdev_ops->ndo_get_iflink is also not an option,
      because ipoib_get_iflink implements it but sometimes returns
      iflink != ifindex and sometimes iflink == ifindex. The caller must
      therefore itself check both the netns and iflink + ifindex for
      equality. Only when they are equal has a "physical" interface been
      detected, which should stop the traversal. On the other hand,
      vxcan_get_iflink can also return 0 when there is currently no valid
      peer. In this case, it is still necessary to stop.
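
      The stop condition described above can be sketched in Python. ToyDev and its fields (netns, link_netns) are hypothetical stand-ins for whatever the kernel uses to obtain the two namespaces; only the comparison logic is taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyDev:
    netns: str       # namespace the device lives in
    ifindex: int     # index of the device itself
    iflink: int      # index of the lower device, 0 if there is none
    link_netns: str  # namespace the lower device lives in


def stops_traversal(dev):
    """True when walking down the layers of interfaces must end at dev."""
    if dev.iflink == 0:
        return True  # no valid peer right now (the vxcan case)
    # A "physical" device links to itself in its own namespace.
    return dev.link_netns == dev.netns and dev.iflink == dev.ifindex


# Same numeric index, different namespaces: not the same device.
cross_netns = ToyDev(netns="test1", ifindex=6, iflink=6, link_netns="test2")
physical = ToyDev(netns="test1", ifindex=6, iflink=6, link_netns="test1")
no_peer = ToyDev(netns="test1", ifindex=6, iflink=0, link_netns="test1")
stacked = ToyDev(netns="test1", ifindex=7, iflink=6, link_netns="test1")
print(stops_traversal(cross_netns), stops_traversal(physical))
```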
      
      Fixes: b7eddd0b ("batman-adv: prevent using any virtual device created on batman-adv as hard-interface")
      Fixes: 5ed4a460 ("batman-adv: additional checks for virtual interfaces on top of WiFi")
      Reported-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
    • batman-adv: Request iflink once in batadv_get_real_netdevice · 6116ba09
      Sven Eckelmann authored
      
      
      There is no need to call dev_get_iflink multiple times for the same
      net_device in batadv_get_real_netdevice. And since some of the
      ndo_get_iflink callbacks are dynamic (for example via RCUs like in
      vxcan_get_iflink), it could easily happen that the returned values are not
      stable. The pre-checks before __dev_get_by_index are then of course bogus.
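
      A small Python sketch of the hazard, under the assumption that the callback may return a different value on each call. ToyDev, the two helper functions and the set of known indices are hypothetical and only model the read-once pattern.

```python
import itertools


class ToyDev:
    """Device whose iflink callback returns a new value on every call."""

    def __init__(self, ifindex, iflink_values):
        self.ifindex = ifindex
        self._iflinks = itertools.cycle(iflink_values)

    def get_iflink(self):
        return next(self._iflinks)


def real_index_reading_twice(dev, known):
    if dev.get_iflink() == dev.ifindex:  # pre-check uses the first read
        return dev.ifindex
    iflink = dev.get_iflink()            # lookup uses a second read
    return iflink if iflink in known else None


def real_index_reading_once(dev, known):
    iflink = dev.get_iflink()            # single read, reused below
    if iflink == dev.ifindex:
        return dev.ifindex
    return iflink if iflink in known else None


known = {3, 5}
# The peer disappears between the first and the second call (3, then 0).
twice = real_index_reading_twice(ToyDev(5, [3, 0]), known)
once = real_index_reading_once(ToyDev(5, [3, 0]), known)
print(twice, once)
```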
      
      Fixes: 5ed4a460 ("batman-adv: additional checks for virtual interfaces on top of WiFi")
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
    • batman-adv: Request iflink once in batadv-on-batadv check · 690bb6fb
      Sven Eckelmann authored
      
      
      There is no need to call dev_get_iflink multiple times for the same
      net_device in batadv_is_on_batman_iface. And since some of the
      .ndo_get_iflink callbacks are dynamic (for example via RCUs like in
      vxcan_get_iflink), it could easily happen that the returned values are not
      stable. The pre-checks before __dev_get_by_index are then of course bogus.
      
      Fixes: b7eddd0b ("batman-adv: prevent using any virtual device created on batman-adv as hard-interface")
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
    • net: dsa: restore error path of dsa_tree_change_tag_proto · 0b0e2ff1
      Vladimir Oltean authored
      
      
      When the DSA_NOTIFIER_TAG_PROTO returns an error, the user space process
      which initiated the protocol change exits the kernel processing while
      still holding the rtnl_mutex. So any other process attempting to lock
      the rtnl_mutex would deadlock after such an event.
      
      The error handling of DSA_NOTIFIER_TAG_PROTO was inadvertently changed
      by the blamed commit, introducing this regression. We must still call
      rtnl_unlock(), and we must still call DSA_NOTIFIER_TAG_PROTO for the old
      protocol. The latter is due to the limiting design of notifier chains
      for cross-chip operations, which don't have a built-in error recovery
      mechanism - we should look into using notifier_call_chain_robust for that.
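
      The required error path (notify the old protocol again, then always unlock) can be sketched in Python. This is a model only: the lock stands in for rtnl_mutex, and change_tag_proto, notify and the tree dictionary are hypothetical names.

```python
import threading

rtnl = threading.Lock()  # stands in for rtnl_mutex


def change_tag_proto(tree, new_proto, notify):
    """Switch tree["proto"], rolling back and unlocking on failure."""
    rtnl.acquire()
    old_proto = tree["proto"]
    try:
        err = notify(new_proto)
        if err:
            notify(old_proto)  # bring every switch back to the old protocol
            return err
        tree["proto"] = new_proto
        return 0
    finally:
        rtnl.release()  # runs on the success path and the error path alike


calls = []


def notify(proto):
    # Hypothetical notifier: records the call and rejects the "bad" protocol.
    calls.append(proto)
    return -1 if proto == "bad" else 0


tree = {"proto": "old"}
err = change_tag_proto(tree, "bad", notify)
print(err, calls, rtnl.locked())
```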
      
      Fixes: dc452a47 ("net: dsa: introduce tagger-owned storage for private and shared data")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/20220228141715.146485-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'for-net-2022-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth · 2e77551c
      Jakub Kicinski authored
      Luiz Augusto von Dentz says:
      
      ====================
      bluetooth pull request for net:
      
       - Fix regression with scanning not working in some systems.
      
      * tag 'for-net-2022-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
        Bluetooth: Fix not checking MGMT cmd pending queue
      ====================
      
      Link: https://lore.kernel.org/r/20220302004330.125536-1-luiz.dentz@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Bluetooth: Fix not checking MGMT cmd pending queue · 275f3f64
      Brian Gix authored
      In a number of places in the MGMT handlers, we examine the command queue
      for other commands (in progress but not yet complete) that will interact
      with the process being performed. However, not all commands go into the
      queue; a command is skipped if either:

      1. There is no negative side effect of consecutive or redundant commands
      2. The command is entirely performed "inline".
      
      This change examines each "pending command" check, and if it is not
      needed, deletes the check. Of the remaining pending command checks, we
      make sure that the command is in the pending queue by using the
      mgmt_pending_add/mgmt_pending_remove pair rather than the
      mgmt_pending_new/mgmt_pending_free pair.
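
      A toy Python model of the difference between the two pairs, assuming that the "add" variant allocates a command and queues it while the "new" variant only allocates. The class, helpers and the opcode string are hypothetical; they only show why a pending check cannot see a command that was never queued.

```python
class ToyHdev:
    """Stand-in for a controller with a pending command queue."""

    def __init__(self):
        self.mgmt_pending = []


def pending_new(hdev, opcode):
    return {"opcode": opcode}      # allocated, but not queued


def pending_add(hdev, opcode):
    cmd = pending_new(hdev, opcode)
    hdev.mgmt_pending.append(cmd)  # queued, so later checks can find it
    return cmd


def pending_remove(hdev, cmd):
    hdev.mgmt_pending.remove(cmd)


def pending_find(hdev, opcode):
    for cmd in hdev.mgmt_pending:
        if cmd["opcode"] == opcode:
            return cmd
    return None


hdev = ToyHdev()

unqueued = pending_new(hdev, "SET_POWERED")
found_unqueued = pending_find(hdev, "SET_POWERED")  # check misses it

queued = pending_add(hdev, "SET_POWERED")
found_queued = pending_find(hdev, "SET_POWERED")    # check sees it

pending_remove(hdev, queued)
found_after_remove = pending_find(hdev, "SET_POWERED")
print(found_unqueued, found_queued is queued, found_after_remove)
```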
      
      Link: https://lore.kernel.org/linux-bluetooth/f648f2e11bb3c2974c32e605a85ac3a9fac944f1.camel@redhat.com/T/
      Tested-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Brian Gix <brian.gix@intel.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 4761df52
      Jakub Kicinski authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      1) Use kfree_rcu(ptr, rcu) variant, using kfree_rcu(ptr) was not
         intentional. From Eric Dumazet.
      
      2) Use-after-free in netfilter hook core, from Eric Dumazet.
      
      3) Missing rcu read lock side for netfilter egress hook,
         from Florian Westphal.
      
      4) nf_queue assumes state->sk is a full socket while it might not be.
         Invoke sock_gen_put(), from Florian Westphal.
      
      5) Add selftest to exercise the reported KASAN splat in 4)
      
      6) Fix possible use-after-free in nf_queue in case sk_refcnt is 0.
         Also from Florian.
      
      7) Use input interface index only for hardware offload, not for
         the software plane. This breaks tc ct action. Patch from Paul Blakey.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        net/sched: act_ct: Fix flow table lookup failure with no originating ifindex
        netfilter: nf_queue: handle socket prefetch
        netfilter: nf_queue: fix possible use-after-free
        selftests: netfilter: add nfqueue TCP_NEW_SYN_RECV socket race test
        netfilter: nf_queue: don't assume sk is full socket
        netfilter: egress: silence egress hook lockdep splats
        netfilter: fix use-after-free in __nf_register_net_hook()
        netfilter: nf_tables: prefer kfree_rcu(ptr, rcu) variant
      ====================
      
      Link: https://lore.kernel.org/r/20220301215337.378405-1-pablo@netfilter.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/sched: act_ct: Fix flow table lookup failure with no originating ifindex · db6140e5
      Paul Blakey authored
      
      
      After the cited commit optimized hw insertion, flow table entries are
      populated with ifindex information which was intended to only be used
      for HW offload. This tuple ifindex is hashed in the flow table key, so
      it must be filled for lookup to be successful. But tuple ifindex is only
      relevant for the netfilter flowtables (nft), so it's not filled in
      act_ct flow table lookup, resulting in lookup failure, and no SW
      offload and no offload teardown for TCP connection FIN/RST packets.
      
      To fix this, add new tc ifindex field to tuple, which will
      only be used for offloading, not for lookup, as it will not be
      part of the tuple hash.
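
      The idea of keeping a field in the tuple but outside the hash can be sketched with Python dataclasses. OldTuple, NewTuple and their fields are hypothetical simplifications of the flow table tuple; compare=False is used here to exclude a field from both equality and hashing.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OldTuple:
    src: str
    dst: str
    iifidx: int = 0  # part of the hash, so lookups must fill it


@dataclass(frozen=True)
class NewTuple:
    src: str
    dst: str
    # Carried along for offload only; excluded from eq and hash.
    tc_ifidx: int = field(default=0, compare=False)


# Entry inserted with an ifindex, looked up without one.
old_table = {OldTuple("10.0.0.1", "10.0.0.2", iifidx=4): "flow"}
old_hit = old_table.get(OldTuple("10.0.0.1", "10.0.0.2"))

new_table = {NewTuple("10.0.0.1", "10.0.0.2", tc_ifidx=4): "flow"}
new_hit = new_table.get(NewTuple("10.0.0.1", "10.0.0.2"))
print(old_hit, new_hit)
```

      The stored key still carries the ifindex for the offload path, but a lookup that does not know it finds the entry anyway.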
      
      Fixes: 9795ded7 ("net/sched: act_ct: Fill offloading tuple iifidx")
      Signed-off-by: Paul Blakey <paulb@nvidia.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  4. Mar 01, 2022