Skip to content
  1. Apr 06, 2022
    • Eric Dumazet's avatar
      rxrpc: fix a race in rxrpc_exit_net() · 1946014c
      Eric Dumazet authored
      Current code can lead to the following race:
      
      CPU0                                                 CPU1
      
      rxrpc_exit_net()
                                                           rxrpc_peer_keepalive_worker()
                                                             if (rxnet->live)
      
        rxnet->live = false;
        del_timer_sync(&rxnet->peer_keepalive_timer);
      
                                                                   timer_reduce(&rxnet->peer_keepalive_timer, jiffies + delay);
      
        cancel_work_sync(&rxnet->peer_keepalive_work);
      
      rxrpc_exit_net() exits while peer_keepalive_timer is still armed,
      leading to use-after-free.
      
      syzbot report was:
      
      ODEBUG: free active (active state 0) object type: timer_list hint: rxrpc_peer_keepalive_timeout+0x0/0xb0
      WARNING: CPU: 0 PID: 3660 at lib/debugobjects.c:505 debug_print_object+0x16e/0x250 lib/debugobjects.c:505
      Modules linked in:
      CPU: 0 PID: 3660 Comm: kworker/u4:6 Not tainted 5.17.0-syzkaller-13993-g88e6c0207623 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: netns cleanup_net
      RIP: 0010:debug_print_object+0x16e/0x250 lib/debugobjects.c:505
      Code: ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 af 00 00 00 48 8b 14 dd 00 1c 26 8a 4c 89 ee 48 c7 c7 00 10 26 8a e8 b1 e7 28 05 <0f> 0b 83 05 15 eb c5 09 01 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e c3
      RSP: 0018:ffffc9000353fb00 EFLAGS: 00010082
      RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
      RDX: ffff888029196140 RSI: ffffffff815efad8 RDI: fffff520006a7f52
      RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
      R10: ffffffff815ea4ae R11: 0000000000000000 R12: ffffffff89ce23e0
      R13: ffffffff8a2614e0 R14: ffffffff816628c0 R15: dffffc0000000000
      FS:  0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fe1f2908924 CR3: 0000000043720000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       __debug_check_no_obj_freed lib/debugobjects.c:992 [inline]
       debug_check_no_obj_freed+0x301/0x420 lib/debugobjects.c:1023
       kfree+0xd6/0x310 mm/slab.c:3809
       ops_free_list.part.0+0x119/0x370 net/core/net_namespace.c:176
       ops_free_list net/core/net_namespace.c:174 [inline]
       cleanup_net+0x591/0xb00 net/core/net_namespace.c:598
       process_one_work+0x996/0x1610 kernel/workqueue.c:2289
       worker_thread+0x665/0x1080 kernel/workqueue.c:2436
       kthread+0x2e9/0x3a0 kernel/kthread.c:376
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298
       </TASK>
      
      Fixes: ace45bec
      
       ("rxrpc: Fix firewall route keepalive")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Marc Dionne <marc.dionne@auristor.com>
      Cc: linux-afs@lists.infradead.org
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1946014c
    • Ilya Maximets's avatar
      net: openvswitch: fix leak of nested actions · 1f30fb91
      Ilya Maximets authored
      While parsing user-provided actions, openvswitch module may dynamically
      allocate memory and store pointers in the internal copy of the actions.
      So this memory has to be freed while destroying the actions.
      
      Currently there are only two such actions: ct() and set().  However,
      there are many actions that can hold nested lists of actions and
      ovs_nla_free_flow_actions() just jumps over them leaking the memory.
      
      For example, removal of the flow with the following actions will lead
      to a leak of the memory allocated by nf_ct_tmpl_alloc():
      
        actions:clone(ct(commit),0)
      
      Non-freed set() action may also leak the 'dst' structure for the
      tunnel info including device references.
      
      Under certain conditions with a high rate of flow rotation that may
      cause significant memory leak problem (2MB per second in reporter's
      case).  The problem is also hard to mitigate, because the user doesn't
      have direct control over the datapath flows generated by OVS.
      
      Fix that by iterating over all the nested actions and freeing
      everything that needs to be freed recursively.
      
      New build time assertion should protect us from this problem if new
      actions will be added in the future.
      
      Unfortunately, openvswitch module doesn't use NLA_F_NESTED, so all
      attributes has to be explicitly checked.  sample() and clone() actions
      are mixing extra attributes into the user-provided action list.  That
      prevents some code generalization too.
      
      Fixes: 34ae932a
      
       ("openvswitch: Make tunnel set action attach a metadata dst")
      Link: https://mail.openvswitch.org/pipermail/ovs-dev/2022-March/392922.html
      Reported-by: default avatarStéphane Graber <stgraber@ubuntu.com>
      Signed-off-by: default avatarIlya Maximets <i.maximets@ovn.org>
      Acked-by: default avatarAaron Conole <aconole@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1f30fb91
    • Andrew Lunn's avatar
      net: ethernet: mv643xx: Fix over zealous checking of_get_mac_address() · 11f8e7c1
      Andrew Lunn authored
      
      
      There is often not a MAC address available in an EEPROM accessible by
      Linux with Marvell devices. Instead the bootload has the MAC address
      and directly programs it into the hardware. So don't consider an error
      from of_get_mac_address() has fatal. However, the check was added for
      the case where there is a MAC address in an the EEPROM, but the EEPROM
      has not probed yet, and -EPROBE_DEFER is returned. In that case the
      error should be returned. So make the check specific to this error
      code.
      
      Cc: Mauri Sandberg <maukka@ext.kapsi.fi>
      Reported-by: default avatarThomas Walther <walther-it@gmx.de>
      Fixes: 42404d8f
      
       ("net: mv643xx_eth: process retval from of_get_mac_address")
      Signed-off-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20220405000404.3374734-1-andrew@lunn.ch
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      11f8e7c1
    • Ilya Maximets's avatar
      net: openvswitch: don't send internal clone attribute to the userspace. · 3f2a3050
      Ilya Maximets authored
      'OVS_CLONE_ATTR_EXEC' is an internal attribute that is used for
      performance optimization inside the kernel.  It's added by the kernel
      while parsing user-provided actions and should not be sent during the
      flow dump as it's not part of the uAPI.
      
      The issue doesn't cause any significant problems to the ovs-vswitchd
      process, because reported actions are not really used in the
      application lifecycle and only supposed to be shown to a human via
      ovs-dpctl flow dump.  However, the action list is still incorrect
      and causes the following error if the user wants to look at the
      datapath flows:
      
        # ovs-dpctl add-dp system@ovs-system
        # ovs-dpctl add-flow "<flow match>" "clone(ct(commit),0)"
        # ovs-dpctl dump-flows
        <flow match>, packets:0, bytes:0, used:never,
          actions:clone(bad length 4, expected -1 for: action0(01 00 00 00),
                        ct(commit),0)
      
      With the fix:
      
        # ovs-dpctl dump-flows
        <flow match>, packets:0, bytes:0, used:never,
          actions:clone(ct(commit),0)
      
      Additionally fixed an incorrect attribute name in the comment.
      
      Fixes: b2335040
      
       ("openvswitch: kernel datapath clone action")
      Signed-off-by: default avatarIlya Maximets <i.maximets@ovn.org>
      Acked-by: default avatarAaron Conole <aconole@redhat.com>
      Link: https://lore.kernel.org/r/20220404104150.2865736-1-i.maximets@ovn.org
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      3f2a3050
    • Horatiu Vultur's avatar
      net: micrel: Fix KS8851 Kconfig · 1d7e4fd7
      Horatiu Vultur authored
      
      
      KS8851 selects MICREL_PHY, which depends on PTP_1588_CLOCK_OPTIONAL, so
      make KS8851 also depend on PTP_1588_CLOCK_OPTIONAL.
      
      Fixes kconfig warning and build errors:
      
      WARNING: unmet direct dependencies detected for MICREL_PHY
        Depends on [m]: NETDEVICES [=y] && PHYLIB [=y] && PTP_1588_CLOCK_OPTIONAL [=m]
          Selected by [y]:
            - KS8851 [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_MICREL [=y] && SPI [=y]
      
      ld.lld: error: undefined symbol: ptp_clock_register referenced by micrel.c
      net/phy/micrel.o:(lan8814_probe) in archive drivers/built-in.a
      ld.lld: error: undefined symbol: ptp_clock_index referenced by micrel.c
      net/phy/micrel.o:(lan8814_ts_info) in archive drivers/built-in.a
      
      Reported-by: default avatarkernel test robot <lkp@intel.com>
      Fixes: ece19502
      
       ("net: phy: micrel: 1588 support for LAN8814 phy")
      Signed-off-by: default avatarHoratiu Vultur <horatiu.vultur@microchip.com>
      Tested-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Acked-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Link: https://lore.kernel.org/r/20220405065936.4105272-1-horatiu.vultur@microchip.com
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      1d7e4fd7
    • Jakub Kicinski's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 03eb7dae
      Jakub Kicinski authored
      
      
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      1) Incorrect comparison in bitmask .reduce, from Jeremy Sowden.
      
      2) Missing GFP_KERNEL_ACCOUNT for dynamically allocated objects,
         from Vasily Averin.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: nf_tables: memcg accounting for dynamically allocated objects
        netfilter: bitwise: fix reduce comparisons
      ====================
      
      Link: https://lore.kernel.org/r/20220405100923.7231-1-pablo@netfilter.org
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      03eb7dae
  2. Apr 05, 2022
    • David Ahern's avatar
      ipv6: Fix stats accounting in ip6_pkt_drop · 1158f79f
      David Ahern authored
      VRF devices are the loopbacks for VRFs, and a loopback can not be
      assigned to a VRF. Accordingly, the condition in ip6_pkt_drop should
      be '||' not '&&'.
      
      Fixes: 1d3fd8a1
      
       ("vrf: Use orig netdev to count Ip6InNoRoutes and a fresh route lookup when sending dest unreach")
      Reported-by: default avatarPudak, Filip <Filip.Pudak@windriver.com>
      Reported-by: default avatarXiao, Jiguang <Jiguang.Xiao@windriver.com>
      Signed-off-by: default avatarDavid Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20220404150908.2937-1-dsahern@kernel.org
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      1158f79f
    • Paolo Abeni's avatar
      Merge branch 'ice-bug-fixes' · 61fb3eee
      Paolo Abeni authored
      Tony Nguyen says:
      
      ====================
      ice bug fixes
      
      Alice Michael says:
      
      There were a couple of bugs that have been found and
      fixed by Anatolii in the ice driver.  First he fixed
      a bug on ring creation by setting the default value
      for the teid.  Anatolli also fixed a bug with deleting
      queues in ice_vc_dis_qs_msg based on their enablement.
      ---
      v2: Remove empty lines between tags
      
      The following are changes since commit 458f5d92
      
      :
        sfc: Do not free an empty page_ring
      and are available in the git repository at:
        git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue 100GbE
      ====================
      
      Link: https://lore.kernel.org/r/20220404183548.3422851-1-anthony.l.nguyen@intel.com
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      61fb3eee
    • Anatolii Gerasymenko's avatar
      ice: Do not skip not enabled queues in ice_vc_dis_qs_msg · 05ef6813
      Anatolii Gerasymenko authored
      Disable check for queue being enabled in ice_vc_dis_qs_msg, because
      there could be a case when queues were created, but were not enabled.
      We still need to delete those queues.
      
      Normal workflow for VF looks like:
      Enable path:
      VIRTCHNL_OP_ADD_ETH_ADDR (opcode 10)
      VIRTCHNL_OP_CONFIG_VSI_QUEUES (opcode 6)
      VIRTCHNL_OP_ENABLE_QUEUES (opcode 8)
      
      Disable path:
      VIRTCHNL_OP_DISABLE_QUEUES (opcode 9)
      VIRTCHNL_OP_DEL_ETH_ADDR (opcode 11)
      
      The issue appears only in stress conditions when VF is enabled and
      disabled very fast.
      Eventually there will be a case, when queues are created by
      VIRTCHNL_OP_CONFIG_VSI_QUEUES, but are not enabled by
      VIRTCHNL_OP_ENABLE_QUEUES.
      In turn, these queues are not deleted by VIRTCHNL_OP_DISABLE_QUEUES,
      because there is a check whether queues are enabled in
      ice_vc_dis_qs_msg.
      
      When we bring up the VF again, we will see the "Failed to set LAN Tx queue
      context" error during VIRTCHNL_OP_CONFIG_VSI_QUEUES step. This
      happens because old 16 queues were not deleted and VF requests to create
      16 more, but ice_sched_get_free_qparent in ice_ena_vsi_txq would fail to
      find a parent node for first newly requested queue (because all nodes
      are allocated to 16 old queues).
      
      Testing Hints:
      
      Just enable and disable VF fast enough, so it would be disabled before
      reaching VIRTCHNL_OP_ENABLE_QUEUES.
      
      while true; do
              ip link set dev ens785f0v0 up
              sleep 0.065 # adjust delay value for you machine
              ip link set dev ens785f0v0 down
      done
      
      Fixes: 77ca27c4
      
       ("ice: add support for virtchnl_queue_select.[tx|rx]_queues bitmap")
      Signed-off-by: default avatarAnatolii Gerasymenko <anatolii.gerasymenko@intel.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarAlice Michael <alice.michael@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      05ef6813
    • Anatolii Gerasymenko's avatar
      ice: Set txq_teid to ICE_INVAL_TEID on ring creation · ccfee182
      Anatolii Gerasymenko authored
      When VF is freshly created, but not brought up, ring->txq_teid
      value is by default set to 0.
      But 0 is a valid TEID. On some platforms the Root Node of
      Tx scheduler has a TEID = 0. This can cause issues as shown below.
      
      The proper way is to set ring->txq_teid to ICE_INVAL_TEID (0xFFFFFFFF).
      
      Testing Hints:
      echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
      ip link set dev ens785f0v0 up
      ip link set dev ens785f0v0 down
      
      If we have freshly created VF and quickly turn it on and off, so there
      would be no time to reach VIRTCHNL_OP_CONFIG_VSI_QUEUES stage, then
      VIRTCHNL_OP_DISABLE_QUEUES stage will fail with error:
      [  639.531454] disable queue 89 failed 14
      [  639.532233] Failed to disable LAN Tx queues, error: ICE_ERR_AQ_ERROR
      [  639.533107] ice 0000:02:00.0: Failed to stop Tx ring 0 on VSI 5
      
      The reason for the fail is that we are trying to send AQ command to
      delete queue 89, which has never been created and receive an "invalid
      argument" error from firmware.
      
      As this queue has never been created, it's teid and ring->txq_teid
      have default value 0.
      ice_dis_vsi_txq has a check against non-existent queues:
      
      node = ice_sched_find_node_by_teid(pi->root, q_teids[i]);
      if (!node)
      	continue;
      
      But on some platforms the Root Node of Tx scheduler has a teid = 0.
      Hence, ice_sched_find_node_by_teid finds a node with teid = 0 (it is
      pi->root), and we go further to submit an erroneous request to firmware.
      
      Fixes: 37bb8390
      
       ("ice: Move common functions out of ice_main.c part 7/7")
      Signed-off-by: default avatarAnatolii Gerasymenko <anatolii.gerasymenko@intel.com>
      Reviewed-by: default avatarMaciej Fijalkowski <maciej.fijalkowski@intel.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarAlice Michael <alice.michael@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      ccfee182
    • Miaoqian Lin's avatar
      dpaa2-ptp: Fix refcount leak in dpaa2_ptp_probe · 2b04bd4f
      Miaoqian Lin authored
      This node pointer is returned by of_find_compatible_node() with
      refcount incremented. Calling of_node_put() to aovid the refcount leak.
      
      Fixes: d346c9e8
      
       ("dpaa2-ptp: reuse ptp_qoriq driver")
      Signed-off-by: default avatarMiaoqian Lin <linmq006@gmail.com>
      Link: https://lore.kernel.org/r/20220404125336.13427-1-linmq006@gmail.com
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      2b04bd4f
    • Vasily Averin's avatar
      netfilter: nf_tables: memcg accounting for dynamically allocated objects · 42193ffd
      Vasily Averin authored
      
      
      nft_*.c files whose NFT_EXPR_STATEFUL flag is set on need to
      use __GFP_ACCOUNT flag for objects that are dynamically
      allocated from the packet path.
      
      Such objects are allocated inside nft_expr_ops->init() callbacks
      executed in task context while processing netlink messages.
      
      In addition, this patch adds accounting to nft_set_elem_expr_clone()
      used for the same purposes.
      
      Signed-off-by: default avatarVasily Averin <vvs@openvz.org>
      Acked-by: default avatarRoman Gushchin <roman.gushchin@linux.dev>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      42193ffd
    • Jamie Bainbridge's avatar
      sctp: count singleton chunks in assoc user stats · e3d37210
      Jamie Bainbridge authored
      Singleton chunks (INIT, HEARTBEAT PMTU probes, and SHUTDOWN-
      COMPLETE) are not counted in SCTP_GET_ASOC_STATS "sas_octrlchunks"
      counter available to the assoc owner.
      
      These are all control chunks so they should be counted as such.
      
      Add counting of singleton chunks so they are properly accounted for.
      
      Fixes: 196d6759
      
       ("sctp: Add support to per-association statistics via a new SCTP_GET_ASSOC_STATS call")
      Signed-off-by: default avatarJamie Bainbridge <jamie.bainbridge@gmail.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Link: https://lore.kernel.org/r/c9ba8785789880cf07923b8a5051e174442ea9ee.1649029663.git.jamie.bainbridge@gmail.com
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      e3d37210
  3. Apr 04, 2022
  4. Apr 03, 2022
  5. Apr 02, 2022
  6. Apr 01, 2022
    • David S. Miller's avatar
      Merge branch 'nexthop-route-deletye-warning' · 37391cc8
      David S. Miller authored
      
      
      Nikolay Aleksandrov says:
      
      ====================
      net: ipv4: fix nexthop route delete warning
      
      The first patch fixes a warning that can be triggered by deleting a
      nexthop route and specifying a device (more info in its commit msg).
      And the second patch adds a selftest for that case.
      
      Chose this way to fix it because we should match when deleting without
      nh spec and should fail when deleting a nexthop route with old-style nh
      spec because nexthop objects are managed separately, e.g.:
      $ ip r show 1.2.3.4/32
      1.2.3.4 nhid 12 via 192.168.11.2 dev dummy0
      
      $ ip r del 1.2.3.4/32
      $ ip r del 1.2.3.4/32 nhid 12
      <both should work>
      
      $ ip r del 1.2.3.4/32 dev dummy0
      <should fail with ESRCH>
      
      v2: addded more to patch 01's commit message
          adjusted the test comment in patch 02
      ====================
      
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      37391cc8
    • Nikolay Aleksandrov's avatar
      selftests: net: add delete nexthop route warning test · 392baa33
      Nikolay Aleksandrov authored
      
      
      Add a test which causes a WARNING on kernels which treat a
      nexthop route like a normal route when comparing for deletion and a
      device is specified. That is, a route is found but we hit a warning while
      matching it. The warning is from fib_info_nh() in include/net/nexthop.h
      because we run it on a fib_info with nexthop object. The call chain is:
       inet_rtm_delroute -> fib_table_delete -> fib_nh_match (called with a
      nexthop fib_info and also with fc_oif set thus calling fib_info_nh on
      the fib_info and triggering the warning).
      
      Repro steps:
       $ ip nexthop add id 12 via 172.16.1.3 dev veth1
       $ ip route add 172.16.101.1/32 nhid 12
       $ ip route delete 172.16.101.1/32 dev veth1
      
      Signed-off-by: default avatarNikolay Aleksandrov <razor@blackwall.org>
      Reviewed-by: default avatarDavid Ahern <dsahern@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      392baa33
    • Nikolay Aleksandrov's avatar
      net: ipv4: fix route with nexthop object delete warning · 6bf92d70
      Nikolay Aleksandrov authored
      FRR folks have hit a kernel warning[1] while deleting routes[2] which is
      caused by trying to delete a route pointing to a nexthop id without
      specifying nhid but matching on an interface. That is, a route is found
      but we hit a warning while matching it. The warning is from
      fib_info_nh() in include/net/nexthop.h because we run it on a fib_info
      with nexthop object. The call chain is:
       inet_rtm_delroute -> fib_table_delete -> fib_nh_match (called with a
      nexthop fib_info and also with fc_oif set thus calling fib_info_nh on
      the fib_info and triggering the warning). The fix is to not do any
      matching in that branch if the fi has a nexthop object because those are
      managed separately. I.e. we should match when deleting without nh spec and
      should fail when deleting a nexthop route with old-style nh spec because
      nexthop objects are managed separately, e.g.:
       $ ip r show 1.2.3.4/32
       1.2.3.4 nhid 12 via 192.168.11.2 dev dummy0
      
       $ ip r del 1.2.3.4/32
       $ ip r del 1.2.3.4/32 nhid 12
       <both should work>
      
       $ ip r del 1.2.3.4/32 dev dummy0
       <should fail with ESRCH>
      
      [1]
       [  523.462226] ------------[ cut here ]------------
       [  523.462230] WARNING: CPU: 14 PID: 22893 at include/net/nexthop.h:468 fib_nh_match+0x210/0x460
       [  523.462236] Modules linked in: dummy rpcsec_gss_krb5 xt_socket nf_socket_ipv4 nf_socket_ipv6 ip6table_raw iptable_raw bpf_preload xt_statistic ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs xt_mark nf_tables xt_nat veth nf_conntrack_netlink nfnetlink xt_addrtype br_netfilter overlay dm_crypt nfsv3 nfs fscache netfs vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE xt_conntrack 8021q garp mrp ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bridge stp llc rfcomm snd_seq_dummy snd_hrtimer rpcrdma rdma_cm iw_cm ib_cm ib_core ip6table_filter xt_comment ip6_tables vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) qrtr bnep binfmt_misc xfs vfat fat squashfs loop nvidia_drm(POE) nvidia_modeset(POE) nvidia_uvm(POE) nvidia(POE) intel_rapl_msr intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi btusb btrtl iwlmvm uvcvideo btbcm snd_hda_intel edac_mce_amd
       [  523.462274]  videobuf2_vmalloc videobuf2_memops btintel snd_intel_dspcfg videobuf2_v4l2 snd_intel_sdw_acpi bluetooth snd_usb_audio snd_hda_codec mac80211 snd_usbmidi_lib joydev snd_hda_core videobuf2_common kvm_amd snd_rawmidi snd_hwdep snd_seq videodev ccp snd_seq_device libarc4 ecdh_generic mc snd_pcm kvm iwlwifi snd_timer drm_kms_helper snd cfg80211 cec soundcore irqbypass rapl wmi_bmof i2c_piix4 rfkill k10temp pcspkr acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc drm zram ip_tables crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel nvme sp5100_tco r8169 nvme_core wmi ipmi_devintf ipmi_msghandler fuse
       [  523.462300] CPU: 14 PID: 22893 Comm: ip Tainted: P           OE     5.16.18-200.fc35.x86_64 #1
       [  523.462302] Hardware name: Micro-Star International Co., Ltd. MS-7C37/MPG X570 GAMING EDGE WIFI (MS-7C37), BIOS 1.C0 10/29/2020
       [  523.462303] RIP: 0010:fib_nh_match+0x210/0x460
       [  523.462304] Code: 7c 24 20 48 8b b5 90 00 00 00 e8 bb ee f4 ff 48 8b 7c 24 20 41 89 c4 e8 ee eb f4 ff 45 85 e4 0f 85 2e fe ff ff e9 4c ff ff ff <0f> 0b e9 17 ff ff ff 3c 0a 0f 85 61 fe ff ff 48 8b b5 98 00 00 00
       [  523.462306] RSP: 0018:ffffaa53d4d87928 EFLAGS: 00010286
       [  523.462307] RAX: 0000000000000000 RBX: ffffaa53d4d87a90 RCX: ffffaa53d4d87bb0
       [  523.462308] RDX: ffff9e3d2ee6be80 RSI: ffffaa53d4d87a90 RDI: ffffffff920ed380
       [  523.462309] RBP: ffff9e3d2ee6be80 R08: 0000000000000064 R09: 0000000000000000
       [  523.462310] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000031
       [  523.462310] R13: 0000000000000020 R14: 0000000000000000 R15: ffff9e3d331054e0
       [  523.462311] FS:  00007f245517c1c0(0000) GS:ffff9e492ed80000(0000) knlGS:0000000000000000
       [  523.462313] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [  523.462313] CR2: 000055e5dfdd8268 CR3: 00000003ef488000 CR4: 0000000000350ee0
       [  523.462315] Call Trace:
       [  523.462316]  <TASK>
       [  523.462320]  fib_table_delete+0x1a9/0x310
       [  523.462323]  inet_rtm_delroute+0x93/0x110
       [  523.462325]  rtnetlink_rcv_msg+0x133/0x370
       [  523.462327]  ? _copy_to_iter+0xb5/0x6f0
       [  523.462330]  ? rtnl_calcit.isra.0+0x110/0x110
       [  523.462331]  netlink_rcv_skb+0x50/0xf0
       [  523.462334]  netlink_unicast+0x211/0x330
       [  523.462336]  netlink_sendmsg+0x23f/0x480
       [  523.462338]  sock_sendmsg+0x5e/0x60
       [  523.462340]  ____sys_sendmsg+0x22c/0x270
       [  523.462341]  ? import_iovec+0x17/0x20
       [  523.462343]  ? sendmsg_copy_msghdr+0x59/0x90
       [  523.462344]  ? __mod_lruvec_page_state+0x85/0x110
       [  523.462348]  ___sys_sendmsg+0x81/0xc0
       [  523.462350]  ? netlink_seq_start+0x70/0x70
       [  523.462352]  ? __dentry_kill+0x13a/0x180
       [  523.462354]  ? __fput+0xff/0x250
       [  523.462356]  __sys_sendmsg+0x49/0x80
       [  523.462358]  do_syscall_64+0x3b/0x90
       [  523.462361]  entry_SYSCALL_64_after_hwframe+0x44/0xae
       [  523.462364] RIP: 0033:0x7f24552aa337
       [  523.462365] Code: 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
       [  523.462366] RSP: 002b:00007fff7f05a838 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
       [  523.462368] RAX: ffffffffffffffda RBX: 000000006245bf91 RCX: 00007f24552aa337
       [  523.462368] RDX: 0000000000000000 RSI: 00007fff7f05a8a0 RDI: 0000000000000003
       [  523.462369] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
       [  523.462370] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000001
       [  523.462370] R13: 00007fff7f05ce08 R14: 0000000000000000 R15: 000055e5dfdd1040
       [  523.462373]  </TASK>
       [  523.462374] ---[ end trace ba537bc16f6bf4ed ]---
      
      [2] https://github.com/FRRouting/frr/issues/6412
      
      Fixes: 4c7e8084
      
       ("ipv4: Plumb support for nexthop object in a fib_info")
      Signed-off-by: default avatarNikolay Aleksandrov <razor@blackwall.org>
      Reviewed-by: default avatarDavid Ahern <dsahern@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6bf92d70
    • Randy Dunlap's avatar
      net: micrel: fix KS8851_MLL Kconfig · c3efcedd
      Randy Dunlap authored
      
      
      KS8851_MLL selects MICREL_PHY, which depends on PTP_1588_CLOCK_OPTIONAL,
      so make KS8851_MLL also depend on PTP_1588_CLOCK_OPTIONAL since
      'select' does not follow any dependency chains.
      
      Fixes kconfig warning and build errors:
      
      WARNING: unmet direct dependencies detected for MICREL_PHY
        Depends on [m]: NETDEVICES [=y] && PHYLIB [=y] && PTP_1588_CLOCK_OPTIONAL [=m]
        Selected by [y]:
        - KS8851_MLL [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_MICREL [=y] && HAS_IOMEM [=y]
      
      ld: drivers/net/phy/micrel.o: in function `lan8814_ts_info':
      micrel.c:(.text+0xb35): undefined reference to `ptp_clock_index'
      ld: drivers/net/phy/micrel.o: in function `lan8814_probe':
      micrel.c:(.text+0x2586): undefined reference to `ptp_clock_register'
      
      Signed-off-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c3efcedd
    • David S. Miller's avatar
      Merge branch 'MCTP-fixes' · f41bdd49
      David S. Miller authored
      
      
      Matt Johnston says:
      
      ====================
      MCTP fixes
      
      The following are fixes for the mctp core and mctp-i2c driver.
      ====================
      
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f41bdd49
    • Matt Johnston's avatar
      mctp: Use output netdev to allocate skb headroom · 4a9dda1c
      Matt Johnston authored
      Previously the skb was allocated with headroom MCTP_HEADER_MAXLEN,
      but that isn't sufficient if we are using devs that are not MCTP
      specific.
      
      This also adds a check that the smctp_halen provided to sendmsg for
      extended addressing is the correct size for the netdev.
      
      Fixes: 833ef3b9
      
       ("mctp: Populate socket implementation")
      Reported-by: default avatarMatthew Rinaldi <mjrinal@g.clemson.edu>
      Signed-off-by: default avatarMatt Johnston <matt@codeconstruct.com.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4a9dda1c
    • Matt Johnston's avatar
      mctp i2c: correct mctp_i2c_header_create result · 8ce40a2f
      Matt Johnston authored
      header_ops.create should return the length of the header,
      instead mctp_i2c_head_create() returned 0.
      This didn't cause any problem because the MCTP stack accepted
      0 as success.
      
      Fixes: f5b8abf9
      
       ("mctp i2c: MCTP I2C binding driver")
      Signed-off-by: default avatarMatt Johnston <matt@codeconstruct.com.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8ce40a2f
    • Matt Johnston's avatar
      mctp: Fix check for dev_hard_header() result · 60be976a
      Matt Johnston authored
      dev_hard_header() returns the length of the header, so
      we need to test for negative errors rather than non-zero.
      
      Fixes: 889b7da2
      
       ("mctp: Add initial routing framework")
      Signed-off-by: default avatarMatt Johnston <matt@codeconstruct.com.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      60be976a
    • David S. Miller's avatar
      Merge branch 'ice-fixups' · 4298a62f
      David S. Miller authored
      
      
      Tony Nguyen says:
      
      ====================
      ice-fixups
      
      This series handles a handful of cleanups for the ice
      driver.  Ivan fixed a problem on the VSI during a release,
      fixing a MAC address setting, and a broken IFF_ALLMULTI
      handling.
      ====================
      
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4298a62f
    • Ivan Vecera's avatar
      ice: Fix broken IFF_ALLMULTI handling · 1273f895
      Ivan Vecera authored
      Handling of all-multicast flag and associated multicast promiscuous
      mode is broken in ice driver. When an user switches allmulticast
      flag on or off the driver checks whether any VLANs are configured
      over the interface (except default VLAN 0).
      
      If any extra VLANs are registered it enables multicast promiscuous
      mode for all these VLANs (including default VLAN 0) using
      ICE_SW_LKUP_PROMISC_VLAN look-up type. In this situation all
      multicast packets tagged with known VLAN ID or untagged are received
      and multicast packets tagged with unknown VLAN ID ignored.
      
      If no extra VLANs are registered (so only VLAN 0 exists) it enables
      multicast promiscuous mode for VLAN 0 and uses ICE_SW_LKUP_PROMISC
      look-up type. In this situation any multicast packets including
      tagged ones are received.
      
      The driver handles IFF_ALLMULTI in ice_vsi_sync_fltr() this way:
      
      ice_vsi_sync_fltr() {
        ...
        if (changed_flags & IFF_ALLMULTI) {
          if (netdev->flags & IFF_ALLMULTI) {
            if (vsi->num_vlans > 1)
              ice_set_promisc(..., ICE_MCAST_VLAN_PROMISC_BITS);
            else
              ice_set_promisc(..., ICE_MCAST_PROMISC_BITS);
          } else {
            if (vsi->num_vlans > 1)
              ice_clear_promisc(..., ICE_MCAST_VLAN_PROMISC_BITS);
            else
              ice_clear_promisc(..., ICE_MCAST_PROMISC_BITS);
          }
        }
        ...
      }
      
      The code above depends on value vsi->num_vlan that specifies number
      of VLANs configured over the interface (including VLAN 0) and
      this is problem because that value is modified in NDO callbacks
      ice_vlan_rx_add_vid() and ice_vlan_rx_kill_vid().
      
      Scenario 1:
      1. ip link set ens7f0 allmulticast on
      2. ip link add vlan10 link ens7f0 type vlan id 10
      3. ip link set ens7f0 allmulticast off
      4. ip link set ens7f0 allmulticast on
      
      [1] In this scenario IFF_ALLMULTI is enabled and the driver calls
          ice_set_promisc(..., ICE_MCAST_PROMISC_BITS) that installs
          multicast promisc rule with non-VLAN look-up type.
      [2] Then VLAN with ID 10 is added and vsi->num_vlan incremented to 2
      [3] Command switches IFF_ALLMULTI off and the driver calls
          ice_clear_promisc(..., ICE_MCAST_VLAN_PROMISC_BITS) but this
          call is effectively NOP because it looks for multicast promisc
          rules for VLAN 0 and VLAN 10 with VLAN look-up type but no such
          rules exist. So the all-multicast remains enabled silently
          in hardware.
      [4] Command tries to switch IFF_ALLMULTI on and the driver calls
          ice_clear_promisc(..., ICE_MCAST_PROMISC_BITS) but this call
          fails (-EEXIST) because non-VLAN multicast promisc rule already
          exists.
      
      Scenario 2:
      1. ip link add vlan10 link ens7f0 type vlan id 10
      2. ip link set ens7f0 allmulticast on
      3. ip link add vlan20 link ens7f0 type vlan id 20
      4. ip link del vlan10 ; ip link del vlan20
      5. ip link set ens7f0 allmulticast off
      
      [1] VLAN with ID 10 is added and vsi->num_vlan==2
      [2] Command switches IFF_ALLMULTI on and driver installs multicast
          promisc rules with VLAN look-up type for VLAN 0 and 10
      [3] VLAN with ID 20 is added and vsi->num_vlan==3 but no multicast
          promisc rules is added for this new VLAN so the interface does
          not receive MC packets from VLAN 20
      [4] Both VLANs are removed but multicast rule for VLAN 10 remains
          installed so interface receives multicast packets from VLAN 10
      [5] Command switches IFF_ALLMULTI off and because vsi->num_vlan is 1
          the driver tries to remove multicast promisc rule for VLAN 0
          with non-VLAN look-up that does not exist.
          All-multicast looks disabled from user point of view but it
          is partially enabled in HW (interface receives all multicast
          packets either untagged or tagged with VLAN ID 10)
      
      To resolve these issues the patch introduces these changes:
      1. Adds handling for IFF_ALLMULTI to ice_vlan_rx_add_vid() and
         ice_vlan_rx_kill_vid() callbacks. So when VLAN is added/removed
         and IFF_ALLMULTI is enabled an appropriate multicast promisc
         rule for that VLAN ID is added/removed.
      2. In ice_vlan_rx_add_vid() when first VLAN besides VLAN 0 is added
         so (vsi->num_vlan == 2) and IFF_ALLMULTI is enabled then look-up
         type for existing multicast promisc rule for VLAN 0 is updated
         to ICE_MCAST_VLAN_PROMISC_BITS.
      3. In ice_vlan_rx_kill_vid() when last VLAN besides VLAN 0 is removed
         so (vsi->num_vlan == 1) and IFF_ALLMULTI is enabled then look-up
         type for existing multicast promisc rule for VLAN 0 is updated
         to ICE_MCAST_PROMISC_BITS.
      4. Both ice_vlan_rx_{add,kill}_vid() have to run under ICE_CFG_BUSY
         bit protection to avoid races with ice_vsi_sync_fltr() that runs
         in ice_service_task() context.
      5. Bit ICE_VSI_VLAN_FLTR_CHANGED is use-less and can be removed.
      6. Error messages added to ice_fltr_*_vsi_promisc() helper functions
         to avoid them in their callers
      7. Small improvements to increase readability
      
      Fixes: 5eda8afd
      
       ("ice: Add support for PF/VF promiscuous mode")
      Signed-off-by: default avatarIvan Vecera <ivecera@redhat.com>
      Reviewed-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: default avatarAlice Michael <alice.michael@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1273f895
    • Ivan Vecera's avatar
      ice: Fix MAC address setting · 2c0069f3
      Ivan Vecera authored
      Commit 2ccc1c1c ("ice: Remove excess error variables") merged
      the usage of 'status' and 'err' variables into single one in
      function ice_set_mac_address(). Unfortunately this causes
      a regression when call of ice_fltr_add_mac() returns -EEXIST because
      this return value does not indicate an error in this case but
      value of 'err' remains to be -EEXIST till the end of the function
      and is returned to caller.
      
      Prior mentioned commit this does not happen because return value of
      ice_fltr_add_mac() was stored to 'status' variable first and
      if it was -EEXIST then 'err' remains to be zero.
      
      Fix the problem by reset 'err' to zero when ice_fltr_add_mac()
      returns -EEXIST.
      
      Fixes: 2ccc1c1c
      
       ("ice: Remove excess error variables")
      Signed-off-by: default avatarIvan Vecera <ivecera@redhat.com>
      Reviewed-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Acked-by: default avatarAlexander Lobakin <alexandr.lobakin@intel.com>
      Signed-off-by: default avatarAlice Michael <alice.michael@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2c0069f3
    • Ivan Vecera's avatar
      ice: Clear default forwarding VSI during VSI release · bd8c624c
      Ivan Vecera authored
      VSI is set as default forwarding one when promisc mode is set for
      PF interface, when PF is switched to switchdev mode or when VF
      driver asks to enable allmulticast or promisc mode for the VF
      interface (when vf-true-promisc-support priv flag is off).
      The third case is buggy because in that case VSI associated with
      VF remains as default one after VF removal.
      
      Reproducer:
      1. Create VF
         echo 1 > sys/class/net/ens7f0/device/sriov_numvfs
      2. Enable allmulticast or promisc mode on VF
         ip link set ens7f0v0 allmulticast on
         ip link set ens7f0v0 promisc on
      3. Delete VF
         echo 0 > sys/class/net/ens7f0/device/sriov_numvfs
      4. Try to enable promisc mode on PF
         ip link set ens7f0 promisc on
      
      Although it looks that promisc mode on PF is enabled the opposite
      is true because ice_vsi_sync_fltr() responsible for IFF_PROMISC
      handling first checks if any other VSI is set as default forwarding
      one and if so the function does not do anything. At this point
      it is not possible to enable promisc mode on PF without re-probe
      device.
      
      To resolve the issue this patch clear default forwarding VSI
      during ice_vsi_release() when the VSI to be released is the default
      one.
      
      Fixes: 01b5e89a
      
       ("ice: Add VF promiscuous support")
      Signed-off-by: default avatarIvan Vecera <ivecera@redhat.com>
      Reviewed-by: default avatarMichal Swiatkowski <michal.swiatkowski@linux.intel.com>
      Reviewed-by: default avatarMaciej Fijalkowski <maciej.fijalkowski@intel.com>
      Signed-off-by: default avatarAlice Michael <alice.michael@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bd8c624c