  1. Aug 11, 2023
    • net: hns3: fix strscpy causing content truncation issue · 5e3d2061
      Hao Chen authored
      
      
      hns3_dbg_fill_content()/hclge_dbg_fill_content() are meant to assemble
      several items into one content string, and we add '\n' and '\0' in the
      last two bytes of the content.
      
      strscpy() adds a '\0' in the last byte of the destination buffer (one
      of the items), which terminates the content string early and truncates
      the dumped output.
      
      An example of the truncated output:
      cat mac_list/uc
      UC MAC_LIST:
      
      Expected:
      UC MAC_LIST:
      FUNC_ID  MAC_ADDR            STATE
      pf       00:2b:19:05:03:00   ACTIVE
      
      The destination buffer is length-bounded and not required to be
      NUL-terminated, so just change strscpy() to memcpy() to fix it.
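
      The difference is easy to reproduce in plain C. The sketch below is a
      userspace approximation, not the driver code (the field width and the
      items are made up), contrasting a NUL-terminating copy with memcpy()
      when assembling a fixed-width dump line:

      #include <stdio.h>
      #include <string.h>

      #define ITEM_LEN  20              /* illustrative field width only */
      #define NUM_ITEMS 3

      int main(void)
      {
              /* room for three fixed-width fields plus the trailing "\n\0" */
              char content[NUM_ITEMS * ITEM_LEN + 2];
              const char *items[NUM_ITEMS] = { "pf", "00:2b:19:05:03:00", "ACTIVE" };
              int i;

              /* NUL-terminating copy (strscpy()-style): each field copy puts a
               * '\0' inside the content buffer, so printing stops after "pf". */
              memset(content, ' ', sizeof(content));
              for (i = 0; i < NUM_ITEMS; i++)
                      snprintf(&content[i * ITEM_LEN], ITEM_LEN, "%s", items[i]);
              content[sizeof(content) - 2] = '\n';
              content[sizeof(content) - 1] = '\0';
              printf("terminating copy: %s\n", content);

              /* memcpy(): copies the bytes without adding a terminator; the one
               * '\0' at the very end keeps the whole assembled line intact. */
              memset(content, ' ', sizeof(content));
              for (i = 0; i < NUM_ITEMS; i++)
                      memcpy(&content[i * ITEM_LEN], items[i], strlen(items[i]));
              content[sizeof(content) - 2] = '\n';
              content[sizeof(content) - 1] = '\0';
              printf("memcpy:           %s", content);

              return 0;
      }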
      
      Fixes: 1cf3d556 ("net: hns3: fix strncpy() not using dest-buf length as length issue")
      Signed-off-by: Hao Chen <chenhao418@huawei.com>
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Link: https://lore.kernel.org/r/20230809020902.1941471-1-shaojijie@huawei.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: tls: set MSG_SPLICE_PAGES consistently · 6b486676
      Jakub Kicinski authored
      
      
      We used to change the flags for the last segment, because
      non-last segments had the MSG_SENDPAGE_NOTLAST flag set.
      That flag no longer exists, so remove the per-segment flag rewrite.
      
      Since the caller's flags most likely don't have MSG_SPLICE_PAGES set,
      this avoids passing parts of the sg as spliced and parts
      as non-spliced. Before the commit under Fixes we would have called
      tcp_sendpage(), which added MSG_SPLICE_PAGES.
      
      Why this leads to trouble remains unclear, but Tariq
      reports hitting the WARN_ON(!sendpage_ok()) due to a
      page refcount of 0.
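
      For context, the resulting pattern is roughly the following. This is a
      simplified sketch, not the actual net/tls code (the function name and
      loop structure are invented and partial sends are ignored), though
      bvec_set_page(), iov_iter_bvec() and tcp_sendmsg_locked() are the real
      interfaces involved: every segment goes out with the same msg_flags,
      with MSG_SPLICE_PAGES set once up front.

      #include <linux/bvec.h>
      #include <linux/scatterlist.h>
      #include <linux/uio.h>
      #include <net/tcp.h>

      /* Simplified sketch: send every sg entry with identical msg_flags. */
      static int push_sg_sketch(struct sock *sk, struct scatterlist *sg, int flags)
      {
              struct bio_vec bvec;
              struct msghdr msg = {
                      .msg_flags = MSG_SPLICE_PAGES | flags, /* set once, for all */
              };
              int ret;

              while (sg) {
                      bvec_set_page(&bvec, sg_page(sg), sg->length, sg->offset);
                      iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, sg->length);

                      /* No special-casing of the last segment any more:
                       * MSG_SENDPAGE_NOTLAST is gone, so the flags never
                       * need to be rewritten here. */
                      ret = tcp_sendmsg_locked(sk, &msg, sg->length);
                      if (ret <= 0)
                              return ret;
                      sg = sg_next(sg);
              }
              return 0;
      }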
      
      Fixes: e117dcfd ("tls: Inline do_tcp_sendpages()")
      Reported-by: Tariq Toukan <tariqt@nvidia.com>
      Link: https://lore.kernel.org/all/4c49176f-147a-4283-f1b1-32aac7b4b996@gmail.com/
      
      
      Tested-by: Tariq Toukan <tariqt@nvidia.com>
      Link: https://lore.kernel.org/r/20230808180917.1243540-1-kuba@kernel.org
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'nf-23-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 3e91b0eb
      Jakub Kicinski authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The existing attempt to resolve races between the control plane and GC work
      is error prone. As reported by Bien Pham <phamnnb@sea.com>, some places
      forgot to call nft_set_elem_mark_busy(), leading to double-deactivation
      of elements.
      
      This series contains the following patches:
      
      1) Do not skip expired elements during a walk, otherwise elements might
         never decrement the reference counter on their data, leading to a memleak.
      
      2) Add a GC transaction API to replace the former attempt to deal with
         races between the control plane and GC. The GC worker sets the
         NFT_SET_ELEM_DEAD_BIT on elements and creates a GC transaction to
         remove the expired elements; the GC transaction can abort in case of
         interference with the control plane and be retried later (async GC).
         Set backends such as rbtree and pipapo also perform GC from the
         control plane (sync GC); in that case element deactivation and
         removal are safe because the mutex is held, and collected elements
         are then released via call_rcu(). A sketch of this pattern follows
         the list below.
      
      3) Adapt existing set backends to use the GC transaction API.
      
      4) Update the rhash set backend to set the _DEAD bit to report elements
         deleted from the datapath to GC.
      
      5) Remove old GC batch API and the NFT_SET_ELEM_BUSY_BIT.
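
      The scheme in 2) can be illustrated with a short, purely hypothetical
      sketch. All names below (gc_txn, gc_set, gc_elem and the gc_* helpers)
      are invented for illustration and are not the nf_tables API added by
      this series; only the shape matters: mark elements dead, collect them
      into a transaction, and commit the removal only if the set generation
      has not moved underneath it, otherwise abort and let async GC retry.

      #include <linux/bitops.h>
      #include <linux/rcupdate.h>
      #include <linux/types.h>

      #define GC_ELEM_DEAD_BIT 0
      #define GC_BATCH         16

      struct gc_set;                                  /* opaque, hypothetical */
      struct gc_elem { struct rcu_head rcu; unsigned long flags; };

      unsigned int gc_set_read_seq(const struct gc_set *set);  /* hypothetical */
      void gc_set_unlink(struct gc_set *set, struct gc_elem *elem);
      void gc_elem_free_rcu(struct rcu_head *rcu);

      struct gc_txn {
              struct gc_set   *set;
              unsigned int    seq;   /* set generation seen at collection time */
              unsigned int    count;
              struct gc_elem  *dead[GC_BATCH];
      };

      /* Collector side (GC worker or datapath): mark dead, queue for removal. */
      static void gc_txn_collect(struct gc_txn *txn, struct gc_elem *elem)
      {
              set_bit(GC_ELEM_DEAD_BIT, &elem->flags);
              if (txn->count < GC_BATCH)
                      txn->dead[txn->count++] = elem;
      }

      /* Commit side: remove only if the control plane did not interfere,
       * otherwise abort; collected elements are freed after an RCU grace
       * period, mirroring the call_rcu() release described above. */
      static bool gc_txn_commit(struct gc_txn *txn)
      {
              unsigned int i;

              if (txn->seq != gc_set_read_seq(txn->set))
                      return false;

              for (i = 0; i < txn->count; i++) {
                      gc_set_unlink(txn->set, txn->dead[i]);
                      call_rcu(&txn->dead[i]->rcu, gc_elem_free_rcu);
              }
              return true;
      }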
      
      * tag 'nf-23-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: nf_tables: remove busy mark and gc batch API
        netfilter: nft_set_hash: mark set element as dead when deleting from packet path
        netfilter: nf_tables: adapt set backend to use GC transaction API
        netfilter: nf_tables: GC transaction API to avoid race with control plane
        netfilter: nf_tables: don't skip expired elements during walk
      ====================
      
      Link: https://lore.kernel.org/r/20230810070830.24064-1-pablo@netfilter.org
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 62d02fca
      Jakub Kicinski authored
      Martin KaFai Lau says:
      
      ====================
      pull-request: bpf 2023-08-09
      
      We've added 5 non-merge commits during the last 7 day(s) which contain
      a total of 6 files changed, 102 insertions(+), 8 deletions(-).
      
      The main changes are:
      
      1) A bpf sockmap memleak fix and a fix for accessing the programs of
         a sockmap under the incorrect map type, from Xu Kuohai.
      
      2) A refcount underflow fix in xsk from Magnus Karlsson.
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        selftests/bpf: Add sockmap test for redirecting partial skb data
        selftests/bpf: fix a CI failure caused by vsock sockmap test
        bpf, sockmap: Fix bug that strp_done cannot be called
        bpf, sockmap: Fix map type error in sock_map_del_link
        xsk: fix refcount underflow in error path
      ====================
      
      Link: https://lore.kernel.org/r/20230810055303.120917-1-martin.lau@linux.dev
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ibmvnic: Ensure login failure recovery is safe from other resets · 6db541ae
      Nick Child authored
      
      
      If a login request fails, the recovery process should be protected
      against parallel resets. It is a known issue that freeing and
      registering CRQs in quick succession can result in a failover CRQ from
      the VIOS. Processing a failover during login recovery is dangerous for
      two reasons:
       1. It results in two parallel initialization processes, which can
       cause serious issues during login.
       2. It is possible that the failover CRQ is received but never executed.
       We get notified of a pending failover through a transport event CRQ.
       The reset is not performed until an INIT CRQ request is received.
       Previously, if CRQ initialization failed during login recovery, the
       ibmvnic IRQ was freed and the login process returned an error. If
       failover_pending was true (a transport event was received), the
       ibmvnic device would never be able to process the reset since it
       could not receive the CRQ_INIT request due to the IRQ being freed.
       This left the device in an inoperable state.
      
      Therefore, the login failure recovery process must be hardened against
      these possible issues. Possible failovers (due to quick CRQ free and
      init) must be avoided and any issues during re-initialization should be
      dealt with instead of being propagated up the stack. This logic is
      similar to that of ibmvnic_probe().
      
      Fixes: dff515a3 ("ibmvnic: Harden device login requests")
      Signed-off-by: Nick Child <nnac123@linux.ibm.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20230809221038.51296-5-nnac123@linux.ibm.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ibmvnic: Do partial reset on login failure · 23cc5f66
      Nick Child authored
      
      
      Perform a partial reset before sending a login request if any of the
      following are true:
       1. A previous request timed out. This can be dangerous because the
       	VIOS could still receive the old login request at any point after
       	the timeout. Therefore, it is best to re-register the CRQs and
       	sub-CRQs before retrying.
       2. The previous request returned an error that is not described in
       	PAPR. PAPR provides procedures if the login returns with partial
       	success or aborted return codes (section L.5.1) but other values
       	do not have a defined procedure. Previously, these conditions
       	just returned an error from the login function rather than trying
       	to resolve the issue.
       	This can cause further issues since most callers of the login
       	function are not prepared to handle an error when logging in. This
       	improper cleanup can lead to the device being permanently DOWN'd.
       	For example, if the VIOS believes that the device is already logged
       	in, then it will return INVALID_STATE (-7). If we never re-register
       	CRQs, then it will always think that the device is already logged
       	in. This leaves the device inoperable.
      
      The partial reset involves freeing the sub-CRQs, freeing the CRQ then
      registering and initializing a new CRQ and sub-CRQs. This essentially
      restarts all communication with VIOS to allow for a fresh login attempt
      that will be unhindered by any previous failed attempts.
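
      In code the sequence is simply tear-down followed by re-registration.
      The sketch below is hypothetical (the type and helper names are
      invented, and the real driver's functions and error handling differ);
      it only illustrates the order of operations described above:

      /* Hypothetical partial-reset order; not the ibmvnic driver code. */
      struct vnic_sketch;

      void sketch_free_sub_crqs(struct vnic_sketch *adapter);
      void sketch_free_crq(struct vnic_sketch *adapter);
      int  sketch_register_crq(struct vnic_sketch *adapter);
      int  sketch_init_crq(struct vnic_sketch *adapter);
      int  sketch_init_sub_crqs(struct vnic_sketch *adapter);

      static int login_partial_reset_sketch(struct vnic_sketch *adapter)
      {
              int rc;

              /* Tear down all communication with the VIOS... */
              sketch_free_sub_crqs(adapter);
              sketch_free_crq(adapter);

              /* ...then bring up a fresh CRQ and sub-CRQs so the retried
               * login is not confused by the earlier failed attempt. */
              rc = sketch_register_crq(adapter);
              if (rc)
                      return rc;
              rc = sketch_init_crq(adapter);
              if (rc)
                      return rc;
              return sketch_init_sub_crqs(adapter);
      }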
      
      Fixes: dff515a3 ("ibmvnic: Harden device login requests")
      Signed-off-by: Nick Child <nnac123@linux.ibm.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20230809221038.51296-4-nnac123@linux.ibm.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ibmvnic: Handle DMA unmapping of login buffs in release functions · d78a671e
      Nick Child authored
      
      
      Rather than leaving the DMA unmapping of the login buffers to the
      login response handler, move this work into the login release functions.
      Previously, these functions were only used for freeing the allocated
      buffers. This could lead to issues if there is more than one
      outstanding login buffer request, which is possible if a login request
      times out.
      
      If a login request times out, then there is another call to send login.
      The send login function makes a call to the login buffer release
      function. In the past, this freed the buffers but did not DMA unmap
      them. Therefore, the VIOS could still write to the old login (now
      freed) buffer. It is for this reason that the DMA unmap call should be
      left to the login buffer release functions.
      
      Since the login buffer release functions now handle DMA unmapping,
      remove the duplicate DMA unmapping in handle_login_rsp().
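
      The resulting shape of the release helper is roughly as follows. This
      is a hedged sketch with invented struct and function names, not the
      ibmvnic code; the point is that dma_unmap_single() and kfree() are now
      paired in the release path, so a mapping can never outlive the buffer
      it refers to:

      #include <linux/dma-mapping.h>
      #include <linux/slab.h>

      /* Hypothetical login response buffer state; field names are invented. */
      struct login_rsp_buf_sketch {
              struct device   *dev;
              void            *buf;
              dma_addr_t      buf_token;
              size_t          buf_len;
      };

      /* Release helper: unmap and free in one place. */
      static void release_login_rsp_buffer_sketch(struct login_rsp_buf_sketch *lb)
      {
              if (!lb->buf)
                      return;

              /* Previously only the kfree() lived here and the unmap was left
               * to the response handler; pairing them closes that window. */
              dma_unmap_single(lb->dev, lb->buf_token, lb->buf_len,
                               DMA_FROM_DEVICE);
              kfree(lb->buf);
              lb->buf = NULL;
      }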
      
      Fixes: dff515a3 ("ibmvnic: Harden device login requests")
      Signed-off-by: Nick Child <nnac123@linux.ibm.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20230809221038.51296-3-nnac123@linux.ibm.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ibmvnic: Unmap DMA login rsp buffer on send login fail · 411c565b
      Nick Child authored
      
      
      If the LOGIN CRQ fails to send then we must DMA unmap the response
      buffer. Previously, if the CRQ failed then the memory was freed without
      DMA unmapping.
      
      Fixes: c98d9cc4 ("ibmvnic: send_login should check for crq errors")
      Signed-off-by: Nick Child <nnac123@linux.ibm.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20230809221038.51296-2-nnac123@linux.ibm.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ibmvnic: Enforce stronger sanity checks on login response · db17ba71
      Nick Child authored
      
      
      Ensure that all offsets in a login response buffer are within the size
      of the allocated response buffer. Any offsets or lengths that surpass
      the allocation are likely the result of an incomplete response buffer.
      In these cases, a full reset is necessary.
      
      When attempting to login, the ibmvnic device will allocate a response
      buffer and pass a reference to the VIOS. The VIOS will then send the
      ibmvnic device a LOGIN_RSP CRQ to signal that the buffer has been filled
      with data. If the ibmvnic device does not get a response in 20 seconds,
      the old buffer is freed and a new login request is sent. With two
      outstanding requests, any LOGIN_RSP CRQ could be for the older
      login request. If this is the case, then the login response buffer (which
      is for the newer login request) could be incomplete and contain invalid
      data. Therefore, we must enforce strict sanity checks on the response
      buffer values.
      
      Testing has shown that the `off_rxadd_buff_size` value is filled in last
      by the VIOS and will be the smoking gun for these circumstances.
      
      Until VIOS can implement a mechanism for tracking outstanding response
      buffers and a method for mapping a LOGIN_RSP CRQ to a particular login
      response buffer, the best ibmvnic can do in this situation is to
      perform a full reset.
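
      The check itself is plain bounds validation: every offset/length pair
      read from the response must describe a range that fits inside the
      buffer that was actually allocated. A minimal sketch of that idea
      (hypothetical helper, not the actual login response layout):

      #include <linux/types.h>

      /* Return true only if [off, off + len * entry_size) lies inside the
       * allocated buffer; entry_size must be non-zero. The division form
       * also avoids overflow in len * entry_size. */
      static bool login_rsp_range_ok(size_t buf_size, u32 off, u32 len,
                                     size_t entry_size)
      {
              if (off > buf_size)
                      return false;
              return len <= (buf_size - off) / entry_size;
      }

      Any pair failing such a check is treated as evidence of an incomplete
      response, and the full reset described above is scheduled instead of
      parsing any further.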
      
      Fixes: dff515a3 ("ibmvnic: Harden device login requests")
      Signed-off-by: Nick Child <nnac123@linux.ibm.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20230809221038.51296-1-nnac123@linux.ibm.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: mana: Fix MANA VF unload when hardware is unresponsive · a7dfeda6
      Souradeep Chakrabarti authored
      
      
      When unloading the MANA driver, mana_dealloc_queues() waits for the MANA
      hardware to complete any inflight packets and set the pending send count
      to zero. But if the hardware has failed, mana_dealloc_queues()
      could wait forever.
      
      Fix this by adding a timeout to the wait. Set the timeout to 120 seconds,
      which is a somewhat arbitrary value that is more than long enough for
      functional hardware to complete any sends.
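
      In kernel terms the change amounts to turning an unbounded wait into a
      deadline-bounded poll, roughly like the sketch below (illustrative
      only; the real loop, counters and sleep granularity in
      mana_dealloc_queues() differ):

      #include <linux/atomic.h>
      #include <linux/delay.h>
      #include <linux/errno.h>
      #include <linux/jiffies.h>

      #define VF_UNLOAD_TIMEOUT_SEC 120  /* generous bound for healthy hardware */

      /* Wait for the pending send counter to drain, but give up after the
       * timeout so a dead device cannot block the unload path forever. */
      static int wait_for_pending_sends_sketch(atomic_t *pending_sends)
      {
              unsigned long deadline = jiffies + VF_UNLOAD_TIMEOUT_SEC * HZ;

              while (atomic_read(pending_sends) > 0) {
                      if (time_after(jiffies, deadline))
                              return -ETIMEDOUT;
                      msleep(100);
              }
              return 0;
      }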
      
      Cc: stable@vger.kernel.org
      Fixes: ca9c54d2 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
      Signed-off-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
      Link: https://lore.kernel.org/r/1691576525-24271-1-git-send-email-schakrabarti@linux.microsoft.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. Aug 10, 2023