Skip to content
  1. Mar 21, 2024
  2. Mar 20, 2024
    • David S. Miller's avatar
      Merge branch 'octeontx2-pf-mbox-fixes' · 9c6a5954
      David S. Miller authored
      
      
      Subbaraya Sundeep says:
      
      ====================
      octeontx2-pf: RVU Mailbox fixes
      
      This patchset fixes the problems related to RVU mailbox.
      During long run tests some times VF commands like setting
      MTU or toggling interface fails because VF mailbox is timedout
      waiting for response from PF.
      
      Below are the fixes
      Patch 1: There are two types of messages in RVU mailbox namely up and down
      messages. Down messages are synchronous messages where a PF/VF sends
      a message to AF and AF replies back with response. UP messages are
      notifications and are asynchronous like AF sending link events to
      PF. When VF sends a down message to PF, PF forwards to AF and sends
      the response from AF back to VF. PF has to forward VF messages since
      there is no path in hardware for VF to send directly to AF.
      There is one mailbox interrupt from AF to PF when raised could mean
      two scenarios one is where AF sending reply to PF for a down message
      sent by PF and another one is AF sending up message asynchronously
      when link changed for that PF. Receiving the up message interrupt while
      PF is in middle of forwarding down message causes mailbox errors.
      Fix this by receiver detecting the type of message from the mbox data register
      set by sender.
      
      Patch 2:
      During VF driver remove, VF has to wait until last message is
      completed and then turn off mailbox interrupts from PF.
      
      Patch 3:
      Do not use ordered workqueue for message processing since multiple works are
      queued simultaneously by all the VFs and PF link UP messages.
      
      Patch 4:
      When sending link event to VF by PF check whether VF is really up to
      receive this message.
      
      Patch 5:
      In AF driver, use separate interrupt handlers for the AF-VF interrupt and
      AF-PF interrupt. Sometimes both interrupts are raised to two CPUs at same
      time and both CPUs execute same function at same time corrupting the data.
      
      v2 changes:
      	Added missing mutex unlock in error path in patch 1
      	Refactored if else logic in patch 1 as suggested by Paolo Abeni
      ====================
      
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9c6a5954
    • Subbaraya Sundeep's avatar
      octeontx2-af: Use separate handlers for interrupts · 50e60de3
      Subbaraya Sundeep authored
      
      
      For PF to AF interrupt vector and VF to AF vector same
      interrupt handler is registered which is causing race condition.
      When two interrupts are raised to two CPUs at same time
      then two cores serve same event corrupting the data.
      
      Fixes: 7304ac45 ("octeontx2-af: Add mailbox IRQ and msg handlers")
      Signed-off-by: default avatarSubbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      50e60de3
    • Subbaraya Sundeep's avatar
      octeontx2-pf: Send UP messages to VF only when VF is up. · dfcf6355
      Subbaraya Sundeep authored
      
      
      When PF sending link status messages to VF, it is possible
      that by the time link_event_task work function is executed
      VF might have brought down. Hence before sending VF link
      status message check whether VF is up to receive it.
      
      Fixes: ad513ed9 ("octeontx2-vf: Link event notification support")
      Signed-off-by: default avatarSubbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      dfcf6355
    • Subbaraya Sundeep's avatar
      octeontx2-pf: Use default max_active works instead of one · 7558ce0d
      Subbaraya Sundeep authored
      
      
      Only one execution context for the workqueue used for PF and
      VFs mailbox communication is incorrect since multiple works are
      queued simultaneously by all the VFs and PF link UP messages.
      Hence use default number of execution contexts by passing zero
      as max_active to alloc_workqueue function. With this fix in place,
      modify UP messages also to wait until completion.
      
      Fixes: d424b6c0 ("octeontx2-pf: Enable SRIOV and added VF mbox handling")
      Signed-off-by: default avatarSubbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7558ce0d
    • Subbaraya Sundeep's avatar
      octeontx2-pf: Wait till detach_resources msg is complete · cbf2f249
      Subbaraya Sundeep authored
      
      
      During VF driver remove, a message is sent to detach VF
      resources to PF but VF is not waiting until message is
      complete. Also mailbox interrupts need to be turned off
      after the detach resource message is complete. This patch
      fixes that problem.
      
      Fixes: 05fcc9e0 ("octeontx2-pf: Attach NIX and NPA block LFs")
      Signed-off-by: default avatarSubbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cbf2f249
    • Subbaraya Sundeep's avatar
      octeontx2: Detect the mbox up or down message via register · a88e0f93
      Subbaraya Sundeep authored
      
      
      A single line of interrupt is used to receive up notifications
      and down reply messages from AF to PF (similarly from PF to its VF).
      PF acts as bridge and forwards VF messages to AF and sends respsones
      back from AF to VF. When an async event like link event is received
      by up message when PF is in middle of forwarding VF message then
      mailbox errors occur because PF state machine is corrupted.
      Since VF is a separate driver or VF driver can be in a VM it is
      not possible to serialize from the start of communication at VF.
      Hence to differentiate between type of messages at PF this patch makes
      sender to set mbox data register with distinct values for up and down
      messages. Sender also checks whether previous interrupt is received
      before triggering current interrupt by waiting for mailbox data register
      to become zero.
      
      Fixes: 5a6d7c9d ("octeontx2-pf: Mailbox communication with AF")
      Signed-off-by: default avatarSubbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a88e0f93
    • Jakub Kicinski's avatar
      Merge tag 'ipsec-2024-03-19' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 94e3ca2f
      Jakub Kicinski authored
      Steffen Klassert says:
      
      ====================
      pull request (net): ipsec 2024-03-19
      
      1) Fix possible page_pool leak triggered by esp_output.
         From Dragos Tatulea.
      
      2) Fix UDP encapsulation in software GSO path.
         From Leon Romanovsky.
      
      * tag 'ipsec-2024-03-19' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
        xfrm: Allow UDP encapsulation only in offload modes
        net: esp: fix bad handling of pages from page_pool
      ====================
      
      Link: https://lore.kernel.org/r/20240319110151.409825-1-steffen.klassert@secunet.com
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      94e3ca2f
    • Jiri Pirko's avatar
      devlink: fix port new reply cmd type · 78a2f5e6
      Jiri Pirko authored
      
      
      Due to a c&p error, port new reply fills-up cmd with wrong value,
      any other existing port command replies and notifications.
      
      Fix it by filling cmd with value DEVLINK_CMD_PORT_NEW.
      
      Skimmed through devlink userspace implementations, none of them cares
      about this cmd value.
      
      Reported-by: default avatarChenyuan Yang <chenyuan0y@gmail.com>
      Closes: https://lore.kernel.org/all/ZfZcDxGV3tSy4qsV@cy-server/
      
      
      Fixes: cd76dcd6 ("devlink: Support add and delete devlink port")
      Signed-off-by: default avatarJiri Pirko <jiri@nvidia.com>
      Reviewed-by: default avatarParav Pandit <parav@nvidia.com>
      Reviewed-by: default avatarKalesh AP <kalesh-anakkur.purayil@broadcom.com>
      Link: https://lore.kernel.org/r/20240318091908.2736542-1-jiri@resnulli.us
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      78a2f5e6
    • Kuniyuki Iwashima's avatar
      tcp: Clear req->syncookie in reqsk_alloc(). · 956c0d61
      Kuniyuki Iwashima authored
      
      
      syzkaller reported a read of uninit req->syncookie. [0]
      
      Originally, req->syncookie was used only in tcp_conn_request()
      to indicate if we need to encode SYN cookie in SYN+ACK, so the
      field remains uninitialised in other places.
      
      The commit 695751e3 ("bpf: tcp: Handle BPF SYN Cookie in
      cookie_v[46]_check().") added another meaning in ACK path;
      req->syncookie is set true if SYN cookie is validated by BPF
      kfunc.
      
      After the change, cookie_v[46]_check() always read req->syncookie,
      but it is not initialised in the normal SYN cookie case as reported
      by KMSAN.
      
      Let's make sure we always initialise req->syncookie in reqsk_alloc().
      
      [0]:
      BUG: KMSAN: uninit-value in cookie_v4_check+0x22b7/0x29e0
       net/ipv4/syncookies.c:477
       cookie_v4_check+0x22b7/0x29e0 net/ipv4/syncookies.c:477
       tcp_v4_cookie_check net/ipv4/tcp_ipv4.c:1855 [inline]
       tcp_v4_do_rcv+0xb17/0x10b0 net/ipv4/tcp_ipv4.c:1914
       tcp_v4_rcv+0x4ce4/0x5420 net/ipv4/tcp_ipv4.c:2322
       ip_protocol_deliver_rcu+0x2a3/0x13d0 net/ipv4/ip_input.c:205
       ip_local_deliver_finish+0x332/0x500 net/ipv4/ip_input.c:233
       NF_HOOK include/linux/netfilter.h:314 [inline]
       ip_local_deliver+0x21f/0x490 net/ipv4/ip_input.c:254
       dst_input include/net/dst.h:460 [inline]
       ip_rcv_finish+0x4a2/0x520 net/ipv4/ip_input.c:449
       NF_HOOK include/linux/netfilter.h:314 [inline]
       ip_rcv+0xcd/0x380 net/ipv4/ip_input.c:569
       __netif_receive_skb_one_core net/core/dev.c:5538 [inline]
       __netif_receive_skb+0x319/0x9e0 net/core/dev.c:5652
       process_backlog+0x480/0x8b0 net/core/dev.c:5981
       __napi_poll+0xe7/0x980 net/core/dev.c:6632
       napi_poll net/core/dev.c:6701 [inline]
       net_rx_action+0x89d/0x1820 net/core/dev.c:6813
       __do_softirq+0x1c0/0x7d7 kernel/softirq.c:554
       do_softirq+0x9a/0x100 kernel/softirq.c:455
       __local_bh_enable_ip+0x9f/0xb0 kernel/softirq.c:382
       local_bh_enable include/linux/bottom_half.h:33 [inline]
       rcu_read_unlock_bh include/linux/rcupdate.h:820 [inline]
       __dev_queue_xmit+0x2776/0x52c0 net/core/dev.c:4362
       dev_queue_xmit include/linux/netdevice.h:3091 [inline]
       neigh_hh_output include/net/neighbour.h:526 [inline]
       neigh_output include/net/neighbour.h:540 [inline]
       ip_finish_output2+0x187a/0x1b70 net/ipv4/ip_output.c:235
       __ip_finish_output+0x287/0x810
       ip_finish_output+0x4b/0x550 net/ipv4/ip_output.c:323
       NF_HOOK_COND include/linux/netfilter.h:303 [inline]
       ip_output+0x15f/0x3f0 net/ipv4/ip_output.c:433
       dst_output include/net/dst.h:450 [inline]
       ip_local_out net/ipv4/ip_output.c:129 [inline]
       __ip_queue_xmit+0x1e93/0x2030 net/ipv4/ip_output.c:535
       ip_queue_xmit+0x60/0x80 net/ipv4/ip_output.c:549
       __tcp_transmit_skb+0x3c70/0x4890 net/ipv4/tcp_output.c:1462
       tcp_transmit_skb net/ipv4/tcp_output.c:1480 [inline]
       tcp_write_xmit+0x3ee1/0x8900 net/ipv4/tcp_output.c:2792
       __tcp_push_pending_frames net/ipv4/tcp_output.c:2977 [inline]
       tcp_send_fin+0xa90/0x12e0 net/ipv4/tcp_output.c:3578
       tcp_shutdown+0x198/0x1f0 net/ipv4/tcp.c:2716
       inet_shutdown+0x33f/0x5b0 net/ipv4/af_inet.c:923
       __sys_shutdown_sock net/socket.c:2425 [inline]
       __sys_shutdown net/socket.c:2437 [inline]
       __do_sys_shutdown net/socket.c:2445 [inline]
       __se_sys_shutdown+0x2a4/0x440 net/socket.c:2443
       __x64_sys_shutdown+0x6c/0xa0 net/socket.c:2443
       do_syscall_64+0xd5/0x1f0
       entry_SYSCALL_64_after_hwframe+0x6d/0x75
      
      Uninit was stored to memory at:
       reqsk_alloc include/net/request_sock.h:148 [inline]
       inet_reqsk_alloc+0x651/0x7a0 net/ipv4/tcp_input.c:6978
       cookie_tcp_reqsk_alloc+0xd4/0x900 net/ipv4/syncookies.c:328
       cookie_tcp_check net/ipv4/syncookies.c:388 [inline]
       cookie_v4_check+0x289f/0x29e0 net/ipv4/syncookies.c:420
       tcp_v4_cookie_check net/ipv4/tcp_ipv4.c:1855 [inline]
       tcp_v4_do_rcv+0xb17/0x10b0 net/ipv4/tcp_ipv4.c:1914
       tcp_v4_rcv+0x4ce4/0x5420 net/ipv4/tcp_ipv4.c:2322
       ip_protocol_deliver_rcu+0x2a3/0x13d0 net/ipv4/ip_input.c:205
       ip_local_deliver_finish+0x332/0x500 net/ipv4/ip_input.c:233
       NF_HOOK include/linux/netfilter.h:314 [inline]
       ip_local_deliver+0x21f/0x490 net/ipv4/ip_input.c:254
       dst_input include/net/dst.h:460 [inline]
       ip_rcv_finish+0x4a2/0x520 net/ipv4/ip_input.c:449
       NF_HOOK include/linux/netfilter.h:314 [inline]
       ip_rcv+0xcd/0x380 net/ipv4/ip_input.c:569
       __netif_receive_skb_one_core net/core/dev.c:5538 [inline]
       __netif_receive_skb+0x319/0x9e0 net/core/dev.c:5652
       process_backlog+0x480/0x8b0 net/core/dev.c:5981
       __napi_poll+0xe7/0x980 net/core/dev.c:6632
       napi_poll net/core/dev.c:6701 [inline]
       net_rx_action+0x89d/0x1820 net/core/dev.c:6813
       __do_softirq+0x1c0/0x7d7 kernel/softirq.c:554
      
      Uninit was created at:
       __alloc_pages+0x9a7/0xe00 mm/page_alloc.c:4592
       __alloc_pages_node include/linux/gfp.h:238 [inline]
       alloc_pages_node include/linux/gfp.h:261 [inline]
       alloc_slab_page mm/slub.c:2175 [inline]
       allocate_slab mm/slub.c:2338 [inline]
       new_slab+0x2de/0x1400 mm/slub.c:2391
       ___slab_alloc+0x1184/0x33d0 mm/slub.c:3525
       __slab_alloc mm/slub.c:3610 [inline]
       __slab_alloc_node mm/slub.c:3663 [inline]
       slab_alloc_node mm/slub.c:3835 [inline]
       kmem_cache_alloc+0x6d3/0xbe0 mm/slub.c:3852
       reqsk_alloc include/net/request_sock.h:131 [inline]
       inet_reqsk_alloc+0x66/0x7a0 net/ipv4/tcp_input.c:6978
       tcp_conn_request+0x484/0x44e0 net/ipv4/tcp_input.c:7135
       tcp_v4_conn_request+0x16f/0x1d0 net/ipv4/tcp_ipv4.c:1716
       tcp_rcv_state_process+0x2e5/0x4bb0 net/ipv4/tcp_input.c:6655
       tcp_v4_do_rcv+0xbfd/0x10b0 net/ipv4/tcp_ipv4.c:1929
       tcp_v4_rcv+0x4ce4/0x5420 net/ipv4/tcp_ipv4.c:2322
       ip_protocol_deliver_rcu+0x2a3/0x13d0 net/ipv4/ip_input.c:205
       ip_local_deliver_finish+0x332/0x500 net/ipv4/ip_input.c:233
       NF_HOOK include/linux/netfilter.h:314 [inline]
       ip_local_deliver+0x21f/0x490 net/ipv4/ip_input.c:254
       dst_input include/net/dst.h:460 [inline]
       ip_sublist_rcv_finish net/ipv4/ip_input.c:580 [inline]
       ip_list_rcv_finish net/ipv4/ip_input.c:631 [inline]
       ip_sublist_rcv+0x15f3/0x17f0 net/ipv4/ip_input.c:639
       ip_list_rcv+0x9ef/0xa40 net/ipv4/ip_input.c:674
       __netif_receive_skb_list_ptype net/core/dev.c:5581 [inline]
       __netif_receive_skb_list_core+0x15c5/0x1670 net/core/dev.c:5629
       __netif_receive_skb_list net/core/dev.c:5681 [inline]
       netif_receive_skb_list_internal+0x106c/0x16f0 net/core/dev.c:5773
       gro_normal_list include/net/gro.h:438 [inline]
       napi_complete_done+0x425/0x880 net/core/dev.c:6113
       virtqueue_napi_complete drivers/net/virtio_net.c:465 [inline]
       virtnet_poll+0x149d/0x2240 drivers/net/virtio_net.c:2211
       __napi_poll+0xe7/0x980 net/core/dev.c:6632
       napi_poll net/core/dev.c:6701 [inline]
       net_rx_action+0x89d/0x1820 net/core/dev.c:6813
       __do_softirq+0x1c0/0x7d7 kernel/softirq.c:554
      
      CPU: 0 PID: 16792 Comm: syz-executor.2 Not tainted 6.8.0-syzkaller-05562-g61387b8dcf1d #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
      
      Fixes: 695751e3 ("bpf: tcp: Handle BPF SYN Cookie in cookie_v[46]_check().")
      Reported-by: default avatarsyzkaller <syzkaller@googlegroups.com>
      Reported-by: default avatarEric Dumazet <edumazet@google.com>
      Closes: https://lore.kernel.org/bpf/CANn89iKdN9c+C_2JAUbc+VY3DDQjAQukMtiBbormAmAk9CdvQA@mail.gmail.com/
      
      
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarMartin KaFai Lau <martin.lau@kernel.org>
      Link: https://lore.kernel.org/r/20240315224710.55209-1-kuniyu@amazon.com
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      956c0d61
    • Thinh Tran's avatar
      net/bnx2x: Prevent access to a freed page in page_pool · d27e2da9
      Thinh Tran authored
      
      
      Fix race condition leading to system crash during EEH error handling
      
      During EEH error recovery, the bnx2x driver's transmit timeout logic
      could cause a race condition when handling reset tasks. The
      bnx2x_tx_timeout() schedules reset tasks via bnx2x_sp_rtnl_task(),
      which ultimately leads to bnx2x_nic_unload(). In bnx2x_nic_unload()
      SGEs are freed using bnx2x_free_rx_sge_range(). However, this could
      overlap with the EEH driver's attempt to reset the device using
      bnx2x_io_slot_reset(), which also tries to free SGEs. This race
      condition can result in system crashes due to accessing freed memory
      locations in bnx2x_free_rx_sge()
      
      799  static inline void bnx2x_free_rx_sge(struct bnx2x *bp,
      800				struct bnx2x_fastpath *fp, u16 index)
      801  {
      802	struct sw_rx_page *sw_buf = &fp->rx_page_ring[index];
      803     struct page *page = sw_buf->page;
      ....
      where sw_buf was set to NULL after the call to dma_unmap_page()
      by the preceding thread.
      
          EEH: Beginning: 'slot_reset'
          PCI 0011:01:00.0#10000: EEH: Invoking bnx2x->slot_reset()
          bnx2x: [bnx2x_io_slot_reset:14228(eth1)]IO slot reset initializing...
          bnx2x 0011:01:00.0: enabling device (0140 -> 0142)
          bnx2x: [bnx2x_io_slot_reset:14244(eth1)]IO slot reset --> driver unload
          Kernel attempted to read user page (0) - exploit attempt? (uid: 0)
          BUG: Kernel NULL pointer dereference on read at 0x00000000
          Faulting instruction address: 0xc0080000025065fc
          Oops: Kernel access of bad area, sig: 11 [#1]
          .....
          Call Trace:
          [c000000003c67a20] [c00800000250658c] bnx2x_io_slot_reset+0x204/0x610 [bnx2x] (unreliable)
          [c000000003c67af0] [c0000000000518a8] eeh_report_reset+0xb8/0xf0
          [c000000003c67b60] [c000000000052130] eeh_pe_report+0x180/0x550
          [c000000003c67c70] [c00000000005318c] eeh_handle_normal_event+0x84c/0xa60
          [c000000003c67d50] [c000000000053a84] eeh_event_handler+0xf4/0x170
          [c000000003c67da0] [c000000000194c58] kthread+0x1c8/0x1d0
          [c000000003c67e10] [c00000000000cf64] ret_from_kernel_thread+0x5c/0x64
      
      To solve this issue, we need to verify page pool allocations before
      freeing.
      
      Fixes: 4cace675 ("bnx2x: Alloc 4k fragment for each rx ring buffer element")
      Signed-off-by: default avatarThinh Tran <thinhtr@linux.ibm.com>
      Reviewed-by: default avatarJiri Pirko <jiri@nvidia.com>
      Link: https://lore.kernel.org/r/20240315205535.1321-1-thinhtr@linux.ibm.com
      
      
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      d27e2da9
  3. Mar 19, 2024