  1. Apr 20, 2023
    • skbuff: Fix a race between coalescing and releasing SKBs · 71850b5a
      Liang Chen authored
      
      
      [ Upstream commit 0646dc31 ]
      
      Commit 1effe8ca ("skbuff: fix coalescing for page_pool fragment
      recycling") allowed coalescing to proceed with non page pool page and page
      pool page when @from is cloned, i.e.
      
      to->pp_recycle    --> false
      from->pp_recycle  --> true
      skb_cloned(from)  --> true
      
      However, it actually requires skb_cloned(@from) to hold true until
      coalescing finishes in this situation. If the other cloned SKB is
      released while the merge is in progress, from_shinfo->nr_frags will be
      set to 0 toward the end of the function, causing the increment of the
      frag page _refcount to be unexpectedly skipped and resulting in
      inconsistent reference counts. Later, when SKB(@to) is released, it
      frees the page directly even though the page pool page is still in use,
      leading to use-after-free or double-free errors. So this combination
      should be prohibited.
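
The condition described above can be modelled in userspace. This is a minimal sketch, not the kernel code: struct skb_model and both helper names are hypothetical, and it assumes the fix amounts to refusing to coalesce whenever the two pp_recycle flags differ, regardless of the clone state.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the two sk_buff properties discussed above. */
struct skb_model {
	bool pp_recycle; /* frags come from a page pool */
	bool cloned;     /* another SKB shares the data; can change at any time */
};

/* Relaxed check: the answer depends on a clone state that may flip mid-merge. */
static bool can_coalesce_relaxed(const struct skb_model *to,
				 const struct skb_model *from)
{
	assert(to && from);
	return to->pp_recycle == (from->pp_recycle && !from->cloned);
}

/* Strict check (assumed fix): never mix page pool and non page pool SKBs. */
static bool can_coalesce_strict(const struct skb_model *to,
				const struct skb_model *from)
{
	assert(to && from);
	return to->pp_recycle == from->pp_recycle;
}
```

With to->pp_recycle false, from->pp_recycle true and from cloned, the relaxed check allows the merge while the strict one refuses it, which removes the dependency on the clone staying alive.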
      
      The double-free error message below prompted us to investigate:
      BUG: Bad page state in process swapper/1  pfn:0e0d1
      page:00000000c6548b28 refcount:-1 mapcount:0 mapping:0000000000000000
      index:0x2 pfn:0xe0d1
      flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
      raw: 000fffffc0000000 0000000000000000 ffffffff00000101 0000000000000000
      raw: 0000000000000002 0000000000000000 ffffffffffffffff 0000000000000000
      page dumped because: nonzero _refcount
      
      CPU: 1 PID: 0 Comm: swapper/1 Tainted: G            E      6.2.0+
      Call Trace:
       <IRQ>
      dump_stack_lvl+0x32/0x50
      bad_page+0x69/0xf0
      free_pcp_prepare+0x260/0x2f0
      free_unref_page+0x20/0x1c0
      skb_release_data+0x10b/0x1a0
      napi_consume_skb+0x56/0x150
      net_rx_action+0xf0/0x350
      ? __napi_schedule+0x79/0x90
      __do_softirq+0xc8/0x2b1
      __irq_exit_rcu+0xb9/0xf0
      common_interrupt+0x82/0xa0
      </IRQ>
      <TASK>
      asm_common_interrupt+0x22/0x40
      RIP: 0010:default_idle+0xb/0x20
      
      Fixes: 53e0961d ("page_pool: add frag page recycling support in page pool")
      Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20230413090353.14448-1-liangchen.linux@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      71850b5a
    • net: macb: fix a memory corruption in extended buffer descriptor mode · 9412a9bf
      Roman Gushchin authored
      
      
      [ Upstream commit e8b74453 ]
      
      For quite some time we were chasing a bug which looked like a sudden
      permanent failure of networking and mmc on some of our devices.
      The bug was very sensitive to any software changes and even more to
      any kernel debug options.
      
      Finally we got a setup where the problem was reproducible with
      CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma:
      
      [   16.992082] ------------[ cut here ]------------
      [   16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes]
      [   17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900
      [   17.018977] Modules linked in: xxxxx
      [   17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 #28
      [   17.045345] Hardware name: xxxxx
      [   17.049528] pstate: 60000005 (nZCv daif -PAN -UAO)
      [   17.054322] pc : check_unmap+0x6a0/0x900
      [   17.058243] lr : check_unmap+0x6a0/0x900
      [   17.062163] sp : ffffffc010003c40
      [   17.065470] x29: ffffffc010003c40 x28: 000000004000c03c
      [   17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800
      [   17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8
      [   17.081407] x23: 0000000000000000 x22: ffffffc010a08750
      [   17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000
      [   17.092032] x19: 0000000875e3e244 x18: 0000000000000010
      [   17.097343] x17: 0000000000000000 x16: 0000000000000000
      [   17.102647] x15: ffffff8879e4a988 x14: 0720072007200720
      [   17.107959] x13: 0720072007200720 x12: 0720072007200720
      [   17.113261] x11: 0720072007200720 x10: 0720072007200720
      [   17.118565] x9 : 0720072007200720 x8 : 000000000000022d
      [   17.123869] x7 : 0000000000000015 x6 : 0000000000000098
      [   17.129173] x5 : 0000000000000000 x4 : 0000000000000000
      [   17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370
      [   17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000
      [   17.145082] Call trace:
      [   17.147524]  check_unmap+0x6a0/0x900
      [   17.151091]  debug_dma_unmap_page+0x88/0x90
      [   17.155266]  gem_rx+0x114/0x2f0
      [   17.158396]  macb_poll+0x58/0x100
      [   17.161705]  net_rx_action+0x118/0x400
      [   17.165445]  __do_softirq+0x138/0x36c
      [   17.169100]  irq_exit+0x98/0xc0
      [   17.172234]  __handle_domain_irq+0x64/0xc0
      [   17.176320]  gic_handle_irq+0x5c/0xc0
      [   17.179974]  el1_irq+0xb8/0x140
      [   17.183109]  xiic_process+0x5c/0xe30
      [   17.186677]  irq_thread_fn+0x28/0x90
      [   17.190244]  irq_thread+0x208/0x2a0
      [   17.193724]  kthread+0x130/0x140
      [   17.196945]  ret_from_fork+0x10/0x20
      [   17.200510] ---[ end trace 7240980785f81d6f ]---
      
      [  237.021490] ------------[ cut here ]------------
      [  237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b
      [  237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240
      [  237.041802] Modules linked in: xxxxx
      [  237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0 #28
      [  237.068941] Hardware name: xxxxx
      [  237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO)
      [  237.077900] pc : add_dma_entry+0x214/0x240
      [  237.081986] lr : add_dma_entry+0x214/0x240
      [  237.086072] sp : ffffffc010003c30
      [  237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00
      [  237.094683] x27: 0000000000000180 x26: ffffff8878e387c0
      [  237.099987] x25: 0000000000000002 x24: 0000000000000000
      [  237.105290] x23: 000000000000003b x22: ffffffc010a0fa00
      [  237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600
      [  237.115897] x19: 00000000ffffffef x18: 0000000000000010
      [  237.121201] x17: 0000000000000000 x16: 0000000000000000
      [  237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720
      [  237.131807] x13: 0720072007200720 x12: 0720072007200720
      [  237.137111] x11: 0720072007200720 x10: 0720072007200720
      [  237.142415] x9 : 0720072007200720 x8 : 0000000000000259
      [  237.147718] x7 : 0000000000000001 x6 : 0000000000000000
      [  237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001
      [  237.158325] x3 : 0000000000000006 x2 : 0000000000000007
      [  237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000
      [  237.168932] Call trace:
      [  237.171373]  add_dma_entry+0x214/0x240
      [  237.175115]  debug_dma_map_page+0xf8/0x120
      [  237.179203]  gem_rx_refill+0x190/0x280
      [  237.182942]  gem_rx+0x224/0x2f0
      [  237.186075]  macb_poll+0x58/0x100
      [  237.189384]  net_rx_action+0x118/0x400
      [  237.193125]  __do_softirq+0x138/0x36c
      [  237.196780]  irq_exit+0x98/0xc0
      [  237.199914]  __handle_domain_irq+0x64/0xc0
      [  237.204000]  gic_handle_irq+0x5c/0xc0
      [  237.207654]  el1_irq+0xb8/0x140
      [  237.210789]  arch_cpu_idle+0x40/0x200
      [  237.214444]  default_idle_call+0x18/0x30
      [  237.218359]  do_idle+0x200/0x280
      [  237.221578]  cpu_startup_entry+0x20/0x30
      [  237.225493]  rest_init+0xe4/0xf0
      [  237.228713]  arch_call_rest_init+0xc/0x14
      [  237.232714]  start_kernel+0x47c/0x4a8
      [  237.236367] ---[ end trace 7240980785f81d70 ]---
      
      Lars was fast to find an explanation: according to the datasheet
      bit 2 of the rx buffer descriptor entry has a different meaning in the
      extended mode:
        Address [2] of beginning of buffer, or
        in extended buffer descriptor mode (DMA configuration register [28] = 1),
        indicates a valid timestamp in the buffer descriptor entry.
      
      The macb driver didn't mask this bit while getting an address and it
      eventually caused a memory corruption and a dma failure.
      
      The problem is resolved by explicitly clearing the problematic bit
      if hw timestamping is used.
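
The masking step can be sketched as follows. This is an illustrative userspace model rather than the driver code: the macro and function names are hypothetical, and treating bits 0-1 as non-address flag bits is an assumption based on the datasheet text quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit 2 follows the datasheet text quoted above. */
#define RX_DESC_FLAG_BITS 0x3u      /* assumed: bits 0-1 carry flags, not address */
#define RX_DESC_TS_VALID  (1u << 2) /* extended mode: "valid timestamp" marker */

/* Rebuild the rx buffer address from the two descriptor address words. */
static uint64_t rx_desc_buf_addr(uint32_t addr_lo, uint32_t addr_hi,
				 bool ext_desc_mode)
{
	uint32_t lo = addr_lo & ~RX_DESC_FLAG_BITS;

	/* In extended mode bit 2 is not an address bit, so drop it too. */
	if (ext_desc_mode)
		lo &= ~RX_DESC_TS_VALID;

	return ((uint64_t)addr_hi << 32) | lo;
}
```

For instance, a low word ending in 0x244 with the extended mode enabled yields an address ending in 0x240, whereas leaving bit 2 in place gives an address that is off by 4 bytes from the one that was mapped.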
      
      Fixes: 7b429614 ("net: macb: Add support for PTP timestamps in DMA descriptors")
      Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
      Co-developed-by: Lars-Peter Clausen <lars@metafoo.de>
      Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
      Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Link: https://lore.kernel.org/r/20230412232144.770336-1-roman.gushchin@linux.dev
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      9412a9bf
    • udp6: fix potential access to stale information · ecdf42c2
      Eric Dumazet authored
      
      
      [ Upstream commit 1c5950fc ]
      
      lena wang reported an issue caused by udpv6_sendmsg()
      mangling msg->msg_name and msg->msg_namelen, which
      are later read from ____sys_sendmsg() :
      
      	/*
      	 * If this is sendmmsg() and sending to current destination address was
      	 * successful, remember it.
      	 */
      	if (used_address && err >= 0) {
      		used_address->name_len = msg_sys->msg_namelen;
      		if (msg_sys->msg_name)
      			memcpy(&used_address->name, msg_sys->msg_name,
      			       used_address->name_len);
      	}
      
      udpv6_sendmsg() wants to pretend the remote address family
      is AF_INET in order to call udp_sendmsg().
      
      A fix would be to modify the address in-place, instead
      of using a local variable, but this could have other side effects.
      
      Instead, restore initial values before we return from udpv6_sendmsg().
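
The save-and-restore approach can be sketched in userspace as below. All types and function names here are hypothetical stand-ins (struct msg_model for the two msghdr fields, send_v4() for the IPv4 send path); the sketch only shows that the caller sees its original values again after the call.

```c
#include <stddef.h>

/* Stand-in for the two msghdr fields that get mangled. */
struct msg_model {
	void *name;
	int   namelen;
};

/* Records what the hypothetical IPv4 path was handed. */
struct seen_addr {
	void *name;
	int   namelen;
};

/* Hypothetical IPv4 send path: only notes the address it received. */
static int send_v4(const struct msg_model *msg, struct seen_addr *seen)
{
	seen->name = msg->name;
	seen->namelen = msg->namelen;
	return 0;
}

/* Swap in a temporary IPv4 address for the call, then restore the originals. */
static int send_v4mapped(struct msg_model *msg, void *v4_addr, int v4_len,
			 struct seen_addr *seen)
{
	void *saved_name = msg->name;
	int saved_len = msg->namelen;
	int err;

	msg->name = v4_addr;
	msg->namelen = v4_len;
	err = send_v4(msg, seen);

	/* Restore, so later readers of msg never observe the temporary values. */
	msg->name = saved_name;
	msg->namelen = saved_len;
	return err;
}
```

Restoring on the way out matters here because the temporary address lives in the callee's scope, so any later read through the mangled pointer would refer to stale data.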
      
      Fixes: c71d8ebe ("net: Fix security_socket_sendmsg() bypass problem.")
      Reported-by: lena wang <lena.wang@mediatek.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Maciej Żenczykowski <maze@google.com>
      Link: https://lore.kernel.org/r/20230412130308.1202254-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ecdf42c2
    • selftests: openvswitch: adjust datapath NL message declaration · 6985701e
      Aaron Conole authored
      
      
      [ Upstream commit 306dc213 ]
      
      The netlink message for creating a new datapath takes an array
      of ports for the PID creation.  This shouldn't cause much of an issue,
      but correct it for future cases where we need to decode
      datapath information that could include the per-cpu PID map.
      
      Fixes: 25f16c87 ("selftests: add openvswitch selftest suite")
      Signed-off-by: Aaron Conole <aconole@redhat.com>
      Link: https://lore.kernel.org/r/20230412115828.3991806-1-aconole@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      6985701e
    • RDMA/core: Fix GID entry ref leak when create_ah fails · 370280c6
      Saravanan Vajravel authored
      [ Upstream commit aca3b0fa ]
      
      If the AH create request fails, release sgid_attr to avoid the GID entry
      reference leak reported while releasing the GID table.
      
      Fixes: 1a1f460f ("RDMA: Hold the sgid_attr inside the struct ib_ah/qp")
      Link: https://lore.kernel.org/r/20230401063424.342204-1-saravanan.vajravel@broadcom.com
      Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
      Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      370280c6
    • sctp: fix a potential overflow in sctp_ifwdtsn_skip · 5c9367ac
      Xin Long authored
      
      
      [ Upstream commit 32832a2c ]
      
      Currently, when traversing ifwdtsn skips with _sctp_walk_ifwdtsn, it only
      checks the pos against the end of the chunk. However, the data left for
      the last pos may be < sizeof(struct sctp_ifwdtsn_skip), and dereferencing
      it as struct sctp_ifwdtsn_skip may cause an overflow.
      
      This patch fixes it by checking the pos against "the end of the chunk -
      sizeof(struct sctp_ifwdtsn_skip)" in sctp_ifwdtsn_skip, similar to
      sctp_fwdtsn_skip.
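
The difference between the two bounds checks can be shown with a small userspace model. The struct layout and both function names below are assumptions made for illustration; only the comparison against "end minus entry size" reflects the description above.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct sctp_ifwdtsn_skip; this layout is an assumption. */
struct skip_model {
	uint16_t stream;
	uint8_t  reserved;
	uint8_t  flags;
	uint32_t mid;
};

/* Old-style walk: only checks that pos is before the end of the payload. */
static size_t count_skips_naive(size_t payload_len)
{
	size_t pos, n = 0;

	for (pos = 0; pos < payload_len; pos += sizeof(struct skip_model))
		n++; /* may count an entry whose tail lies past the end */
	return n;
}

/* Fixed-style walk: pos must leave room for one whole entry. */
static size_t count_skips_checked(size_t payload_len)
{
	size_t pos, n = 0;

	if (payload_len < sizeof(struct skip_model))
		return 0;
	for (pos = 0; pos <= payload_len - sizeof(struct skip_model);
	     pos += sizeof(struct skip_model))
		n++;
	return n;
}
```

With a payload of two full entries plus a few trailing bytes, the naive walk visits three positions while the checked walk stops after two.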
      
      Fixes: 0fc2ea92 ("sctp: implement validate_ftsn for sctp_stream_interleave")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Link: https://lore.kernel.org/r/2a71bffcd80b4f2c61fac6d344bb2f11c8fd74f7.1681155810.git.lucien.xin@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      5c9367ac
    • net: qrtr: Fix an uninit variable access bug in qrtr_tx_resume() · bef57c22
      Ziyang Xuan authored
      
      
      [ Upstream commit 64170709 ]
      
      Syzbot reported the following bug:
      
      =====================================================
      BUG: KMSAN: uninit-value in qrtr_tx_resume+0x185/0x1f0 net/qrtr/af_qrtr.c:230
       qrtr_tx_resume+0x185/0x1f0 net/qrtr/af_qrtr.c:230
       qrtr_endpoint_post+0xf85/0x11b0 net/qrtr/af_qrtr.c:519
       qrtr_tun_write_iter+0x270/0x400 net/qrtr/tun.c:108
       call_write_iter include/linux/fs.h:2189 [inline]
       aio_write+0x63a/0x950 fs/aio.c:1600
       io_submit_one+0x1d1c/0x3bf0 fs/aio.c:2019
       __do_sys_io_submit fs/aio.c:2078 [inline]
       __se_sys_io_submit+0x293/0x770 fs/aio.c:2048
       __x64_sys_io_submit+0x92/0xd0 fs/aio.c:2048
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Uninit was created at:
       slab_post_alloc_hook mm/slab.h:766 [inline]
       slab_alloc_node mm/slub.c:3452 [inline]
       __kmem_cache_alloc_node+0x71f/0xce0 mm/slub.c:3491
       __do_kmalloc_node mm/slab_common.c:967 [inline]
       __kmalloc_node_track_caller+0x114/0x3b0 mm/slab_common.c:988
       kmalloc_reserve net/core/skbuff.c:492 [inline]
       __alloc_skb+0x3af/0x8f0 net/core/skbuff.c:565
       __netdev_alloc_skb+0x120/0x7d0 net/core/skbuff.c:630
       qrtr_endpoint_post+0xbd/0x11b0 net/qrtr/af_qrtr.c:446
       qrtr_tun_write_iter+0x270/0x400 net/qrtr/tun.c:108
       call_write_iter include/linux/fs.h:2189 [inline]
       aio_write+0x63a/0x950 fs/aio.c:1600
       io_submit_one+0x1d1c/0x3bf0 fs/aio.c:2019
       __do_sys_io_submit fs/aio.c:2078 [inline]
       __se_sys_io_submit+0x293/0x770 fs/aio.c:2048
       __x64_sys_io_submit+0x92/0xd0 fs/aio.c:2048
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      This happens because qrtr_tx_resume() requires skb->len to be at least
      sizeof(struct qrtr_ctrl_pkt), and skb->len equals size in
      qrtr_endpoint_post(). In the syzbot scenario, however, size is less than
      sizeof(struct qrtr_ctrl_pkt) when qrtr_cb->type equals
      QRTR_TYPE_RESUME_TX in qrtr_endpoint_post(). This triggers the uninit
      variable access bug.

      Add a size check when qrtr_cb->type equals QRTR_TYPE_RESUME_TX in
      qrtr_endpoint_post() to fix the bug.
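
A minimal userspace sketch of such a size check follows. The constant value, the struct layout and the function name are hypothetical; the point is only that a RESUME_TX message shorter than a full control packet is rejected before anything reads it.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_TYPE_DATA      1 /* hypothetical values, for illustration only */
#define MODEL_TYPE_RESUME_TX 7

/* Stand-in for struct qrtr_ctrl_pkt; the real layout is not reproduced here. */
struct ctrl_pkt_model {
	uint32_t cmd;
	uint32_t node;
	uint32_t port;
};

/* Reject RESUME_TX payloads too short to hold a whole control packet. */
static int check_post_size(int type, size_t size)
{
	if (type == MODEL_TYPE_RESUME_TX &&
	    size < sizeof(struct ctrl_pkt_model))
		return -EINVAL;
	return 0;
}
```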
      
      Fixes: 5fdeb0d3 ("net: qrtr: Implement outgoing flow control")
      Reported-by: <syzbot+4436c9630a45820fda76@syzkaller.appspotmail.com>
      Link: https://syzkaller.appspot.com/bug?id=c14607f0963d27d5a3d5f4c8639b500909e43540
      Suggested-by: Manivannan Sadhasivam <mani@kernel.org>
      Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20230410012352.3997823-1-william.xuanziyang@huawei.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      bef57c22
    • cgroup,freezer: hold cpu_hotplug_lock before freezer_mutex · 3756171b
      Tetsuo Handa authored
      
      
      [ Upstream commit 57dcd64c ]
      
      syzbot is reporting circular locking dependency between cpu_hotplug_lock
      and freezer_mutex, for commit f5d39b02 ("freezer,sched: Rewrite core
      freezer logic") replaced atomic_inc() in freezer_apply_state() with
      static_branch_inc() which holds cpu_hotplug_lock.
      
      cpu_hotplug_lock => cgroup_threadgroup_rwsem => freezer_mutex
      
        cgroup_file_write() {
          cgroup_procs_write() {
            __cgroup_procs_write() {
              cgroup_procs_write_start() {
                cgroup_attach_lock() {
                  cpus_read_lock() {
                    percpu_down_read(&cpu_hotplug_lock);
                  }
                  percpu_down_write(&cgroup_threadgroup_rwsem);
                }
              }
              cgroup_attach_task() {
                cgroup_migrate() {
                  cgroup_migrate_execute() {
                    freezer_attach() {
                      mutex_lock(&freezer_mutex);
                      (...snipped...)
                    }
                  }
                }
              }
              (...snipped...)
            }
          }
        }
      
      freezer_mutex => cpu_hotplug_lock
      
        cgroup_file_write() {
          freezer_write() {
            freezer_change_state() {
              mutex_lock(&freezer_mutex);
              freezer_apply_state() {
                static_branch_inc(&freezer_active) {
                  static_key_slow_inc() {
                    cpus_read_lock();
                    static_key_slow_inc_cpuslocked();
                    cpus_read_unlock();
                  }
                }
              }
              mutex_unlock(&freezer_mutex);
            }
          }
        }
      
      Swap locking order by moving cpus_read_lock() in freezer_apply_state()
      to before mutex_lock(&freezer_mutex) in freezer_change_state().
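
The two call chains above can be reduced to a toy lock-order checker. This is a rough sketch in the spirit of lockdep, not how lockdep works internally: every name is hypothetical, locks are only ranked and never released, and the ranks simply encode the order cpu_hotplug_lock, cgroup_threadgroup_rwsem, freezer_mutex.

```c
#include <stdbool.h>

/* Agreed acquisition order, lowest rank first. */
enum lock_rank {
	RANK_NONE = 0,
	RANK_CPU_HOTPLUG,       /* cpu_hotplug_lock */
	RANK_THREADGROUP_RWSEM, /* cgroup_threadgroup_rwsem */
	RANK_FREEZER_MUTEX,     /* freezer_mutex */
};

struct lock_trace {
	enum lock_rank highest_held;
	bool inverted;
};

/* Record an acquisition; flag it if it goes against the agreed order. */
static void take(struct lock_trace *t, enum lock_rank rank)
{
	if (rank <= t->highest_held)
		t->inverted = true;
	else
		t->highest_held = rank;
}

/* Attach path: hotplug lock, then threadgroup rwsem, then freezer mutex. */
static bool attach_path_ok(void)
{
	struct lock_trace t = { RANK_NONE, false };

	take(&t, RANK_CPU_HOTPLUG);
	take(&t, RANK_THREADGROUP_RWSEM);
	take(&t, RANK_FREEZER_MUTEX);
	return !t.inverted;
}

/* State-change path, with the hotplug lock taken first or last. */
static bool change_state_ok(bool hotplug_lock_first)
{
	struct lock_trace t = { RANK_NONE, false };

	if (hotplug_lock_first) {
		take(&t, RANK_CPU_HOTPLUG);
		take(&t, RANK_FREEZER_MUTEX);
	} else {
		take(&t, RANK_FREEZER_MUTEX);
		take(&t, RANK_CPU_HOTPLUG); /* as done inside static_branch_inc() */
	}
	return !t.inverted;
}
```

In this model the state-change path only agrees with the attach path when the hotplug lock is taken first, which is the ordering the patch establishes.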
      
      Reported-by: syzbot <syzbot+c39682e86c9d84152f93@syzkaller.appspotmail.com>
      Link: https://syzkaller.appspot.com/bug?extid=c39682e86c9d84152f93
      Suggested-by: Hillf Danton <hdanton@sina.com>
      Fixes: f5d39b02 ("freezer,sched: Rewrite core freezer logic")
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      3756171b
    • net: wwan: iosm: Fix error handling path in ipc_pcie_probe() · eabf778f
      Harshit Mogalapalli authored
      
      
      [ Upstream commit a56ef256 ]
      
      Smatch reports:
      	drivers/net/wwan/iosm/iosm_ipc_pcie.c:298 ipc_pcie_probe()
      	warn: missing unwind goto?
      
      When dma_set_mask fails it directly returns without disabling the pci
      device and freeing ipc_pcie. Fix this by calling a correct goto label.

      As dma_set_mask returns either 0 or -EIO, we can use a goto label, as
      it finally returns -EIO.

      Add a set_mask_fail goto label which stands consistent with other goto
      labels in this function.
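
The unwind pattern can be illustrated with a userspace model. Apart from the set_mask_fail label name taken from the text above, everything here is hypothetical: struct dev_model, the probe function and the way the failure is injected are stand-ins, not the driver's code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Tracks the state a real probe would have to undo on failure. */
struct dev_model {
	bool enabled;        /* stands in for an enabled pci device */
	bool set_mask_fails; /* test knob: make the DMA mask step fail */
};

static int probe_model(struct dev_model *dev, void **priv_out)
{
	void *priv = malloc(16); /* stands in for the ipc_pcie allocation */
	int ret;

	*priv_out = NULL;
	if (!priv)
		return -ENOMEM;

	dev->enabled = true;

	if (dev->set_mask_fails) {
		ret = -EIO;
		goto set_mask_fail; /* unwind instead of returning directly */
	}

	*priv_out = priv;
	return 0;

set_mask_fail:
	dev->enabled = false; /* undo the enable step */
	free(priv);           /* and release the private data */
	return ret;
}
```

A direct return at the failing step would leave the device enabled and the allocation leaked; jumping to the label runs both cleanup steps before returning the error.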
      
      Fixes: 035e3bef ("net: wwan: iosm: fix driver not working with INTEL_IOMMU disabled")
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      eabf778f
    • qlcnic: check pci_reset_function result · f517b5ee
      Denis Plotnikov authored
      
      
      [ Upstream commit 7573099e ]
      
      A static code analyzer complains about an unchecked return value:
      the result of pci_reset_function() is unchecked.
      Although the issue is on the FLR-supported code path, where the reset
      could be done with pcie_flr(), the patch takes the less invasive
      approach of adding a check of the pci_reset_function() result.
      
      Found by Linux Verification Center (linuxtesting.org) with SVACE.
      
      Fixes: 7e2cf4fe ("qlcnic: change driver hardware interface mechanism")
      Signed-off-by: Denis Plotnikov <den-plotnikov@yandex-team.ru>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      f517b5ee
    • drm/armada: Fix a potential double free in an error handling path · 09f4dec1
      Christophe JAILLET authored
      
      
      [ Upstream commit b89ce117 ]
      
      'priv' is a managed resource, so there is no need to free it explicitly or
      there will be a double free().
      
      Fixes: 90ad200b ("drm/armada: Use devm_drm_dev_alloc")
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/c4f3c9207a9fce35cb6dd2cc60e755275961588a.1640536364.git.christophe.jaillet@wanadoo.fr
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      09f4dec1
    • Bluetooth: Set ISO Data Path on broadcast sink · a3f1344a
      Claudia Draghicescu authored
      
      
      [ Upstream commit d2e4f1b1 ]
      
      This patch enables ISO data rx on broadcast sink.
      
      Fixes: eca0ae4a ("Bluetooth: Add initial implementation of BIS connections")
      Signed-off-by: Claudia Draghicescu <claudia.rosu@nxp.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      a3f1344a
    • Bluetooth: SCO: Fix possible circular locking dependency sco_sock_getsockopt · 2fcfd51a
      Luiz Augusto von Dentz authored
      
      
      [ Upstream commit 975abc0c ]
      
      This attempts to fix the following trace:
      
      ======================================================
      WARNING: possible circular locking dependency detected
      6.3.0-rc2-g68fcb3a7bf97 #4706 Not tainted
      ------------------------------------------------------
      sco-tester/31 is trying to acquire lock:
      ffff8880025b8070 (&hdev->lock){+.+.}-{3:3}, at:
      sco_sock_getsockopt+0x1fc/0xa90
      
      but task is already holding lock:
      ffff888001eeb130 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0}, at:
      sco_sock_getsockopt+0x104/0xa90
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #2 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0}:
             lock_sock_nested+0x32/0x80
             sco_connect_cfm+0x118/0x4a0
             hci_sync_conn_complete_evt+0x1e6/0x3d0
             hci_event_packet+0x55c/0x7c0
             hci_rx_work+0x34c/0xa00
             process_one_work+0x575/0x910
             worker_thread+0x89/0x6f0
             kthread+0x14e/0x180
             ret_from_fork+0x2b/0x50
      
      -> #1 (hci_cb_list_lock){+.+.}-{3:3}:
             __mutex_lock+0x13b/0xcc0
             hci_sync_conn_complete_evt+0x1ad/0x3d0
             hci_event_packet+0x55c/0x7c0
             hci_rx_work+0x34c/0xa00
             process_one_work+0x575/0x910
             worker_thread+0x89/0x6f0
             kthread+0x14e/0x180
             ret_from_fork+0x2b/0x50
      
      -> #0 (&hdev->lock){+.+.}-{3:3}:
             __lock_acquire+0x18cc/0x3740
             lock_acquire+0x151/0x3a0
             __mutex_lock+0x13b/0xcc0
             sco_sock_getsockopt+0x1fc/0xa90
             __sys_getsockopt+0xe9/0x190
             __x64_sys_getsockopt+0x5b/0x70
             do_syscall_64+0x42/0x90
             entry_SYSCALL_64_after_hwframe+0x70/0xda
      
      other info that might help us debug this:
      
      Chain exists of:
        &hdev->lock --> hci_cb_list_lock --> sk_lock-AF_BLUETOOTH-BTPROTO_SCO
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(sk_lock-AF_BLUETOOTH-BTPROTO_SCO);
                                     lock(hci_cb_list_lock);
                                     lock(sk_lock-AF_BLUETOOTH-BTPROTO_SCO);
        lock(&hdev->lock);
      
       *** DEADLOCK ***
      
      1 lock held by sco-tester/31:
       #0: ffff888001eeb130 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0},
       at: sco_sock_getsockopt+0x104/0xa90
      
      Fixes: 248733e8 ("Bluetooth: Allow querying of supported offload codecs over SCO socket")
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      2fcfd51a
    • Bluetooth: Fix printing errors if LE Connection times out · 99f1bc32
      Luiz Augusto von Dentz authored
      [ Upstream commit b62e7220 ]
      
      This fixes errors like the ones below when an LE Connection times out,
      since that is actually not a controller error:
      
       Bluetooth: hci0: Opcode 0x200d failed: -110
       Bluetooth: hci0: request failed to create LE connection: err -110
      
      Instead the code shall properly detect if -ETIMEDOUT is returned and
      send HCI_OP_LE_CREATE_CONN_CANCEL to give up on the connection.
      
      Link: https://github.com/bluez/bluez/issues/340
      
      
      Fixes: 8e8b92ee ("Bluetooth: hci_sync: Add hci_le_create_conn_sync")
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      99f1bc32
    • Bluetooth: hci_conn: Fix not cleaning up on LE Connection failure · 7c90d783
      Luiz Augusto von Dentz authored
      
      
      [ Upstream commit 19cf60bf ]
      
      hci_connect_le_scan_cleanup shall always be invoked to clean up the
      states and re-enable passive scanning if necessary; otherwise the
      pending action may stay active, causing multiple attempts to
      connect.
      
      Fixes: 9b3628d7 ("Bluetooth: hci_sync: Cleanup hci_conn if it cannot be aborted")
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7c90d783
    • net: openvswitch: fix race on port output · 644b3051
      Felix Huettner authored
      
      
      [ Upstream commit 066b8678 ]
      
      assume the following setup on a single machine:
      1. An openvswitch instance with one bridge and default flows
      2. two network namespaces "server" and "client"
      3. two ovs interfaces "server" and "client" on the bridge
      4. for each ovs interface a veth pair with a matching name and 32 rx and
         tx queues
      5. move the ends of the veth pairs to the respective network namespaces
      6. assign ip addresses to each of the veth ends in the namespaces (needs
         to be the same subnet)
      7. start some http server on the server network namespace
      8. test if a client in the client namespace can reach the http server
      
      when following the actions below the host has a chance of getting a cpu
      stuck in an infinite loop:
      1. send a large amount of parallel requests to the http server (around
         3000 curls should work)
      2. in parallel delete the network namespace (do not delete interfaces or
         stop the server, just kill the namespace)
      
      there is a low chance that this will cause the below kernel cpu stuck
      message. If this does not happen just retry.
      Below there is also the output of bpftrace for the functions mentioned
      in the output.
      
      The series of events happening here is:
      1. the network namespace is deleted calling
         `unregister_netdevice_many_notify` somewhere in the process
      2. this sets first `NETREG_UNREGISTERING` on both ends of the veth and
         then runs `synchronize_net`
      3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
      4. this is then handled by `dp_device_event` which calls
         `ovs_netdev_detach_dev` (if a vport is found, which is the case for
         the veth interface attached to ovs)
      5. this removes the rx_handlers of the device but does not prevent
         packets from being sent to the device
      6. `dp_device_event` then queues the vport deletion to work in
         background as the ovs_lock is needed, which we do not hold in the
         unregistration path
      7. `unregister_netdevice_many_notify` continues to call
         `netdev_unregister_kobject` which sets `real_num_tx_queues` to 0
      8. port deletion continues (but details are not relevant for this issue)
      9. at some future point the background task deletes the vport
      
      If after 7. but before 9. a packet is sent to the ovs vport (which is
      not deleted at this point in time), the vport forwards it to the
      `dev_queue_xmit` flow even though the device is unregistering.
      In `skb_tx_hash` (which is called in the `dev_queue_xmit` path) there is
      a while loop (if the packet has a rx_queue recorded) that is infinite if
      `dev->real_num_tx_queues` is zero.
      
      To prevent this from happening we update `do_output` to handle devices
      without carrier the same as if the device is not found (which would
      be the code path after 9. is done).
      
      Additionally we now produce a warning in `skb_tx_hash` if we will hit
      the infinite loop.
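
The loop and the carrier check can be modelled in userspace as follows. This is a sketch under stated assumptions: the struct and function names are hypothetical, and returning -1 for a zero queue count is a choice made here so the example terminates (the text above only says the kernel now warns in that case).

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduce a recorded rx queue index into [0, qcount); -1 if no tx queues. */
static int pick_tx_queue(uint32_t rx_queue, uint32_t qcount)
{
	if (qcount == 0)
		return -1; /* without a guard, the loop below would never end */

	while (rx_queue >= qcount)
		rx_queue -= qcount;

	return (int)rx_queue;
}

/* Stand-in for the output port state that matters in this race. */
struct port_model {
	bool present;    /* vport still exists */
	bool carrier_ok; /* device still has carrier */
	uint32_t real_num_tx_queues;
};

/* Returns the chosen tx queue, or -1 when the packet is dropped. */
static int output_model(const struct port_model *port, uint32_t rx_queue)
{
	/* Treat "no carrier" exactly like "port not found". */
	if (!port || !port->present || !port->carrier_ok)
		return -1;

	return pick_tx_queue(rx_queue, port->real_num_tx_queues);
}
```

In this model a port that is still present but has already lost its carrier never reaches the queue selection, so a zero queue count is not hit on that path.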
      
      bpftrace (first word is function name):
      
      __dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
      netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
      dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 2, reg_state: 1
      synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
      synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
      synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
      synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
      dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 6, reg_state: 2
      ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, reg_state: 2
      netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
      synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
      netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
      dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 27, reg_state: 2
      dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 22, reg_state: 2
      dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 18, reg_state: 2
      netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
      synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
      ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
      __dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
      netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
      broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
      ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, reg_state: 2
      synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604
      
      stuck message:
      
      watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [curl:1929279]
      Modules linked in: veth pktgen bridge stp llc ip_set_hash_net nft_counter xt_set nft_compat nf_tables ip_set_hash_ip ip_set nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tls binfmt_misc nls_iso8859_1 input_leds joydev serio_raw dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore virtio_rng ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net ahci net_failover crypto_simd cryptd psmouse libahci virtio_blk failover
      CPU: 5 PID: 1929279 Comm: curl Not tainted 5.15.0-67-generic #74-Ubuntu
      Hardware name: OpenStack Foundation OpenStack Nova, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      RIP: 0010:netdev_pick_tx+0xf1/0x320
      Code: 00 00 8d 48 ff 0f b7 c1 66 39 ca 0f 86 e9 01 00 00 45 0f b7 ff 41 39 c7 0f 87 5b 01 00 00 44 29 f8 41 39 c7 0f 87 4f 01 00 00 <eb> f2 0f 1f 44 00 00 49 8b 94 24 28 04 00 00 48 85 d2 0f 84 53 01
      RSP: 0018:ffffb78b40298820 EFLAGS: 00000246
      RAX: 0000000000000000 RBX: ffff9c8773adc2e0 RCX: 000000000000083f
      RDX: 0000000000000000 RSI: ffff9c8773adc2e0 RDI: ffff9c870a25e000
      RBP: ffffb78b40298858 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c870a25e000
      R13: ffff9c870a25e000 R14: ffff9c87fe043480 R15: 0000000000000000
      FS:  00007f7b80008f00(0000) GS:ffff9c8e5f740000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f7b80f6a0b0 CR3: 0000000329d66000 CR4: 0000000000350ee0
      Call Trace:
       <IRQ>
       netdev_core_pick_tx+0xa4/0xb0
       __dev_queue_xmit+0xf8/0x510
       ? __bpf_prog_exit+0x1e/0x30
       dev_queue_xmit+0x10/0x20
       ovs_vport_send+0xad/0x170 [openvswitch]
       do_output+0x59/0x180 [openvswitch]
       do_execute_actions+0xa80/0xaa0 [openvswitch]
       ? kfree+0x1/0x250
       ? kfree+0x1/0x250
       ? kprobe_perf_func+0x4f/0x2b0
       ? flow_lookup.constprop.0+0x5c/0x110 [openvswitch]
       ovs_execute_actions+0x4c/0x120 [openvswitch]
       ovs_dp_process_packet+0xa1/0x200 [openvswitch]
       ? ovs_ct_update_key.isra.0+0xa8/0x120 [openvswitch]
       ? ovs_ct_fill_key+0x1d/0x30 [openvswitch]
       ? ovs_flow_key_extract+0x2db/0x350 [openvswitch]
       ovs_vport_receive+0x77/0xd0 [openvswitch]
       ? __htab_map_lookup_elem+0x4e/0x60
       ? bpf_prog_680e8aff8547aec1_kfree+0x3b/0x714
       ? trace_call_bpf+0xc8/0x150
       ? kfree+0x1/0x250
       ? kfree+0x1/0x250
       ? kprobe_perf_func+0x4f/0x2b0
       ? kprobe_perf_func+0x4f/0x2b0
       ? __mod_memcg_lruvec_state+0x63/0xe0
       netdev_port_receive+0xc4/0x180 [openvswitch]
       ? netdev_port_receive+0x180/0x180 [openvswitch]
       netdev_frame_hook+0x1f/0x40 [openvswitch]
       __netif_receive_skb_core.constprop.0+0x23d/0xf00
       __netif_receive_skb_one_core+0x3f/0xa0
       __netif_receive_skb+0x15/0x60
       process_backlog+0x9e/0x170
       __napi_poll+0x33/0x180
       net_rx_action+0x126/0x280
       ? ttwu_do_activate+0x72/0xf0
       __do_softirq+0xd9/0x2e7
       ? rcu_report_exp_cpu_mult+0x1b0/0x1b0
       do_softirq+0x7d/0xb0
       </IRQ>
       <TASK>
       __local_bh_enable_ip+0x54/0x60
       ip_finish_output2+0x191/0x460
       __ip_finish_output+0xb7/0x180
       ip_finish_output+0x2e/0xc0
       ip_output+0x78/0x100
       ? __ip_finish_output+0x180/0x180
       ip_local_out+0x5e/0x70
       __ip_queue_xmit+0x184/0x440
       ? tcp_syn_options+0x1f9/0x300
       ip_queue_xmit+0x15/0x20
       __tcp_transmit_skb+0x910/0x9c0
       ? __mod_memcg_state+0x44/0xa0
       tcp_connect+0x437/0x4e0
       ? ktime_get_with_offset+0x60/0xf0
       tcp_v4_connect+0x436/0x530
       __inet_stream_connect+0xd4/0x3a0
       ? kprobe_perf_func+0x4f/0x2b0
       ? aa_sk_perm+0x43/0x1c0
       inet_stream_connect+0x3b/0x60
       __sys_connect_file+0x63/0x70
       __sys_connect+0xa6/0xd0
       ? setfl+0x108/0x170
       ? do_fcntl+0xe8/0x5a0
       __x64_sys_connect+0x18/0x20
       do_syscall_64+0x5c/0xc0
       ? __x64_sys_fcntl+0xa9/0xd0
       ? exit_to_user_mode_prepare+0x37/0xb0
       ? syscall_exit_to_user_mode+0x27/0x50
       ? do_syscall_64+0x69/0xc0
       ? __sys_setsockopt+0xea/0x1e0
       ? exit_to_user_mode_prepare+0x37/0xb0
       ? syscall_exit_to_user_mode+0x27/0x50
       ? __x64_sys_setsockopt+0x1f/0x30
       ? do_syscall_64+0x69/0xc0
       ? irqentry_exit+0x1d/0x30
       ? exc_page_fault+0x89/0x170
       entry_SYSCALL_64_after_hwframe+0x61/0xcb
      RIP: 0033:0x7f7b8101c6a7
      Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89
      RSP: 002b:00007ffffd6b2198 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b8101c6a7
      RDX: 0000000000000010 RSI: 00007ffffd6b2360 RDI: 0000000000000005
      RBP: 0000561f1370d560 R08: 00002795ad21d1ac R09: 0030312e302e302e
      R10: 00007ffffd73f080 R11: 0000000000000246 R12: 0000561f1370c410
      R13: 0000000000000000 R14: 0000000000000005 R15: 0000000000000000
       </TASK>
      
      Fixes: 7f8a436e ("openvswitch: Add conntrack action")
      Co-developed-by: Luca Czesla <luca.czesla@mail.schwarz>
      Signed-off-by: Luca Czesla <luca.czesla@mail.schwarz>
      Signed-off-by: Felix Huettner <felix.huettner@mail.schwarz>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      644b3051
    • Ahmed Zaki's avatar
      iavf: remove active_cvlans and active_svlans bitmaps · d10c9511
      Ahmed Zaki authored
      
      
      [ Upstream commit 9c85b7fa ]
      
      The VLAN filters info is currently being held in a list and 2 bitmaps
      (active_cvlans and active_svlans). We are experiencing some racing where
      data is not in sync in the list and bitmaps. For example, the VLAN is
      initially added to the list but only when the PF replies, it is added to
      the bitmap. If a user adds many V2 VLANS before the PF responds:
      
          while [ $((i++)) ]; do
              ip link add link eth0 name eth0.$i type vlan id $i
          done

      we might end up with more VLAN list entries than the designated limit.
      Also, "ip link show" will show more links added than the PF limit.

      On the other hand, the bitmaps are only used to check the number of VLAN
      filters and to re-enable the filters when the interface goes from DOWN to
      UP.
      
      This patch gets rid of the bitmaps and uses the list only. To do that,
      the states of the VLAN filter are modified:
      1 - IAVF_VLAN_REMOVE: the entry needs to be totally removed after informing
        the PF. This is the "ip link del eth0.$i" path.
      2 - IAVF_VLAN_DISABLE: (new) the netdev went down. The filter needs to be
        removed from the PF and then marked INACTIVE.
      3 - IAVF_VLAN_INACTIVE: (new) no PF filter exists, but the user did not
        delete the VLAN.
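
The list-only bookkeeping described above can be sketched as a small state machine. This is an illustrative Python model, not the driver code: the `ADD` and `ACTIVE` states, the helper names, and the exact transitions on PF reply and on link up are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum, auto


class VlanState(Enum):
    ADD = auto()       # assumed: waiting to be sent to the PF
    ACTIVE = auto()    # assumed: accepted by the PF
    DISABLE = auto()   # netdev went down; remove from the PF, then mark INACTIVE
    INACTIVE = auto()  # no PF filter exists, but the user did not delete the VLAN
    REMOVE = auto()    # remove from the PF and drop the entry entirely


@dataclass
class VlanFilter:
    vid: int
    state: VlanState


def netdev_down(filters):
    """Mark every PF-backed filter for removal from the PF."""
    for f in filters:
        if f.state is VlanState.ACTIVE:
            f.state = VlanState.DISABLE


def pf_replied(filters):
    """Apply the PF's acknowledgement and return the surviving entries."""
    kept = []
    for f in filters:
        if f.state is VlanState.REMOVE:
            continue  # entry disappears from the list
        if f.state is VlanState.DISABLE:
            f.state = VlanState.INACTIVE
        elif f.state is VlanState.ADD:
            f.state = VlanState.ACTIVE
        kept.append(f)
    return kept


def netdev_up(filters):
    """Queue inactive filters to be re-added to the PF."""
    for f in filters:
        if f.state is VlanState.INACTIVE:
            f.state = VlanState.ADD


def can_add(filters, limit):
    """Check the limit against the list itself, the only source of truth."""
    return len(filters) < limit
```

Because the limit check reads the same list that new entries are appended to, there is no second structure that can lag behind while the PF has not yet replied.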
      
      Fixes: 48ccc43e ("iavf: Add support VIRTCHNL_VF_OFFLOAD_VLAN_V2 during netdev config")
      Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
      Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d10c9511
    • Ahmed Zaki's avatar
      iavf: refactor VLAN filter states · aa0f377c
      Ahmed Zaki authored
      
      
      [ Upstream commit 0c0da0e9 ]
      
      The VLAN filter states are currently being saved as individual bits.
      This is error prone as multiple bits might be mistakenly set.
      
      Fix by replacing the bits with a single state enum. Also, add an
      "ACTIVE" state for filters that are accepted by the PF.
      
      Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
      Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Stable-dep-of: 9c85b7fa ("iavf: remove active_cvlans and active_svlans bitmaps")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      aa0f377c
    • Hangbin Liu's avatar
      bonding: fix ns validation on backup slaves · 4f6c08c2
      Hangbin Liu authored
      
      
      [ Upstream commit 4598380f ]
      
      When arp_validate is set to 2, 3, or 6, validation is performed for
      backup slaves as well. As stated in the bond documentation, validation
      involves checking the broadcast ARP request sent out via the active
      slave. This helps determine which slaves are more likely to function in
      the event of an active slave failure.
      
      However, when the target is an IPv6 address, the NS message sent from
      the active interface is not checked on backup slaves. Additionally,
      based on the bond_arp_rcv() rule b, we must reverse the saddr and daddr
      when checking the NS message.
      
      Note that when checking the NS message, the destination address is a
      multicast address. Therefore, we must convert the target address to
      solicited multicast in the bond_get_targets_ip6() function.
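
The conversion mentioned above follows the solicited-node multicast rule from RFC 4291: the low 24 bits of the target address are appended to the prefix ff02::1:ff00:0/104. A minimal Python sketch of that mapping follows; the function name is hypothetical and this is not the code in bond_get_targets_ip6().

```python
import ipaddress


def solicited_node_multicast(target: str) -> ipaddress.IPv6Address:
    """Map a unicast IPv6 address to its solicited-node multicast address."""
    addr = int(ipaddress.IPv6Address(target))
    prefix = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    # Keep only the low 24 bits of the target and place them under the prefix.
    return ipaddress.IPv6Address(prefix | (addr & 0xFFFFFF))


if __name__ == "__main__":
    print(solicited_node_multicast("2001:db8::1234:5678"))
```

A backup slave that compares the destination of a received NS message against this derived address, rather than against the unicast target, can recognize the probe sent by the active slave.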
      
      Prior to the fix, the backup slaves had a mii status of "down", but
      after the fix, all of the slaves' mii status was updated to "UP".
      
      Fixes: 4e24be01 ("bonding: add new parameter ns_targets")
      Reviewed-by: Jonathan Toppins <jtoppins@redhat.com>
      Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      4f6c08c2
    • YueHaibing's avatar
      tcp: restrict net.ipv4.tcp_app_win · 9d776563
      YueHaibing authored
      
      
      [ Upstream commit dc5110c2 ]
      
      UBSAN: shift-out-of-bounds in net/ipv4/tcp_input.c:555:23
      shift exponent 255 is too large for 32-bit type 'int'
      CPU: 1 PID: 7907 Comm: ssh Not tainted 6.3.0-rc4-00161-g62bad54b26db-dirty #206
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
      Call Trace:
       <TASK>
       dump_stack_lvl+0x136/0x150
       __ubsan_handle_shift_out_of_bounds+0x21f/0x5a0
       tcp_init_transfer.cold+0x3a/0xb9
       tcp_finish_connect+0x1d0/0x620
       tcp_rcv_state_process+0xd78/0x4d60
       tcp_v4_do_rcv+0x33d/0x9d0
       __release_sock+0x133/0x3b0
       release_sock+0x58/0x1b0
      
      'maxwin' is int, shifting int for 32 or more bits is undefined behaviour.
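
The restriction can be sketched as a range check on the sysctl value. The Python model below is illustrative only: the helper names are hypothetical, the 0..31 bound is inferred from the 32-bit 'int' in the UBSAN report, and the reservation formula assumes the documented tcp_app_win semantics (reserve max(window / 2^tcp_app_win, mss), with 0 meaning nothing is reserved).

```python
INT_BITS = 32  # width of the C 'int' named in the UBSAN report


def validate_tcp_app_win(value: int) -> int:
    """Accept only shift counts that are defined for a 32-bit int."""
    if not 0 <= value < INT_BITS:
        raise ValueError(f"tcp_app_win must be in 0..{INT_BITS - 1}, got {value}")
    return value


def reserved_for_app(maxwin: int, app_win: int, mss: int) -> int:
    """Bytes of the window reserved for application buffering (assumed semantics)."""
    app_win = validate_tcp_app_win(app_win)
    if app_win == 0:
        return 0  # 0 is special: nothing is reserved
    return max(maxwin >> app_win, mss)
```

Rejecting the value when it is written means the shift in the data path never sees an exponent such as 255.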
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      9d776563
    • Harshit Mogalapalli's avatar
      niu: Fix missing unwind goto in niu_alloc_channels() · 53a22fa7
      Harshit Mogalapalli authored
      
      
      [ Upstream commit 8ce07be7 ]
      
      Smatch reports: drivers/net/ethernet/sun/niu.c:4525
      	niu_alloc_channels() warn: missing unwind goto?
      
      If niu_rbr_fill() fails, then we are directly returning 'err' without
      freeing the channels.
      
      Fix this by changing direct return to a goto 'out_err'.
      
      Fixes: a3138df9 ("[NIU]: Add Sun Neptune ethernet driver.")
      Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      53a22fa7
    • Fuad Tabba's avatar
      KVM: arm64: Advertise ID_AA64PFR0_EL1.CSV2/3 to protected VMs · 24da5765
      Fuad Tabba authored
      
      
      [ Upstream commit e8162521 ]
      
      The existing pKVM code attempts to advertise CSV2/3 using values
      initialized to 0, but never set. To advertise CSV2/3 to protected
      guests, pass the CSV2/3 values to hyp when initializing hyp's
      view of guests' ID_AA64PFR0_EL1.
      
      Similar to non-protected KVM, these are system-wide, rather than
      per cpu, for simplicity.
      
      Fixes: 6c30bfb1 ("KVM: arm64: Add handlers for protected VM System Registers")
      Signed-off-by: Fuad Tabba <tabba@google.com>
      Link: https://lore.kernel.org/r/20230404152321.413064-1-tabba@google.com
      
      
      Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      24da5765
    • Will Deacon's avatar
      KVM: arm64: Initialise hypervisor copies of host symbols unconditionally · 361b02e6
      Will Deacon authored
      
      
      [ Upstream commit 6c165223 ]
      
      The nVHE object at EL2 maintains its own copies of some host variables
      so that, when pKVM is enabled, the host cannot directly modify the
      hypervisor state. When running in normal nVHE mode, however, these
      variables are still mirrored at EL2 but are not initialised.
      
      Initialise the hypervisor symbols from the host copies regardless of
      pKVM, ensuring that any reference to this data at EL2 with normal nVHE
      will return a sensibly initialised value.
      
      Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
      Tested-by: Vincent Donnefort <vdonnefort@google.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20221110190259.26861-16-will@kernel.org
      
      
      Stable-dep-of: e8162521 ("KVM: arm64: Advertise ID_AA64PFR0_EL1.CSV2/3 to protected VMs")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      361b02e6
    • Xu Kuohai's avatar
      bpf, arm64: Fixed a BTI error on returning to patched function · 8b9c6494
      Xu Kuohai authored
      
      
      [ Upstream commit 738a96c4 ]
      
      When BPF_TRAMP_F_CALL_ORIG is set, the BPF trampoline uses BLR to jump
      back to the instruction next to the call site to call the patched
      function. For a BTI-enabled kernel, the instruction next to the call
      site is usually PACIASP; in this case, it's safe to jump back with BLR.
      But when the call site is not followed by a PACIASP or BTI, a BTI
      exception is triggered.
      
      Here is a fault log:
      
       Unhandled 64-bit el1h sync exception on CPU0, ESR 0x0000000034000002 -- BTI
       CPU: 0 PID: 263 Comm: test_progs Tainted: GF
       Hardware name: linux,dummy-virt (DT)
       pstate: 40400805 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=-c)
       pc : bpf_fentry_test1+0xc/0x30
       lr : bpf_trampoline_6442573892_0+0x48/0x1000
       sp : ffff80000c0c3a50
       x29: ffff80000c0c3a90 x28: ffff0000c2e6c080 x27: 0000000000000000
       x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000050
       x23: 0000000000000000 x22: 0000ffffcfd2a7f0 x21: 000000000000000a
       x20: 0000ffffcfd2a7f0 x19: 0000000000000000 x18: 0000000000000000
       x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffcfd2a7f0
       x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
       x11: 0000000000000000 x10: ffff80000914f5e4 x9 : ffff8000082a1528
       x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0101010101010101
       x5 : 0000000000000000 x4 : 00000000fffffff2 x3 : 0000000000000001
       x2 : ffff8001f4b82000 x1 : 0000000000000000 x0 : 0000000000000001
       Kernel panic - not syncing: Unhandled exception
       CPU: 0 PID: 263 Comm: test_progs Tainted: GF
       Hardware name: linux,dummy-virt (DT)
       Call trace:
        dump_backtrace+0xec/0x144
        show_stack+0x24/0x7c
        dump_stack_lvl+0x8c/0xb8
        dump_stack+0x18/0x34
        panic+0x1cc/0x3ec
        __el0_error_handler_common+0x0/0x130
        el1h_64_sync_handler+0x60/0xd0
        el1h_64_sync+0x78/0x7c
        bpf_fentry_test1+0xc/0x30
        bpf_fentry_test1+0xc/0x30
        bpf_prog_test_run_tracing+0xdc/0x2a0
        __sys_bpf+0x438/0x22a0
        __arm64_sys_bpf+0x30/0x54
        invoke_syscall+0x78/0x110
        el0_svc_common.constprop.0+0x6c/0x1d0
        do_el0_svc+0x38/0xe0
        el0_svc+0x30/0xd0
        el0t_64_sync_handler+0x1ac/0x1b0
        el0t_64_sync+0x1a0/0x1a4
       Kernel Offset: disabled
       CPU features: 0x0000,00034c24,f994fdab
       Memory Limit: none
      
      And the instruction next to call site of bpf_fentry_test1 is ADD,
      not PACIASP:
      
      <bpf_fentry_test1>:
      	bti     c
      	nop
      	nop
      	add     w0, w0, #0x1
      	paciasp
      
      For BPF prog, JIT always puts a PACIASP after call site for BTI-enabled
      kernel, so there is no problem. To fix it, replace BLR with RET to bypass
      the branch target check.
      
      Fixes: efc9909f ("bpf, arm64: Add bpf trampoline for arm64")
      Reported-by: Florent Revest <revest@chromium.org>
      Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Florent Revest <revest@chromium.org>
      Acked-by: Florent Revest <revest@chromium.org>
      Link: https://lore.kernel.org/bpf/20230401234144.3719742-1-xukuohai@huaweicloud.com
      
      
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8b9c6494
    • Zheng Wang's avatar
      9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition · c4002b9d
      Zheng Wang authored
      
      
      [ Upstream commit ea4f1009 ]
      
      xen_9pfs_front_probe calls xen_9pfs_front_alloc_dataring to
      initialize priv->rings and bind &ring->work to p9_xen_response.

      When xen_9pfs_front_event_handler is called to handle IRQ requests,
      it eventually calls schedule_work to start the work.

      When xen_9pfs_front_remove is called to remove the driver, the
      following sequence may occur:

      CPU0                  CPU1

                           |p9_xen_response
      xen_9pfs_front_remove|
        xen_9pfs_front_free|
      kfree(priv)          |
      //free priv          |
                           |p9_tag_lookup
                           |//use priv->client

      Fix it by finishing the work before cleanup in xen_9pfs_front_free.

      Note that this bug was found by static analysis and might be a
      false positive.
      
      Fixes: 71ebd719 ("xen/9pfs: connect to the backend")
      Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
      Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
      Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c4002b9d
    • Martin Povišer's avatar
      dmaengine: apple-admac: Fix 'current_tx' not getting freed · b7abd535
      Martin Povišer authored
      
      
      [ Upstream commit d9503be5 ]
      
      In terminate_all we should queue up all submitted descriptors to be
      freed. We do that for the content of the 'issued' and 'submitted' lists,
      but the 'current_tx' descriptor falls through the cracks as it's
      removed from the 'issued' list once it gets assigned to be the current
      descriptor. Explicitly queue up freeing of the 'current_tx' descriptor
      to address a memory leak that is otherwise present.
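
The leak and its fix can be pictured with a small model of the channel bookkeeping. This Python sketch is illustrative only: the `Channel` class and its `to_free` list are hypothetical stand-ins for the driver's descriptor lists, and the descriptors are plain strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Channel:
    submitted: List[str] = field(default_factory=list)
    issued: List[str] = field(default_factory=list)
    current_tx: Optional[str] = None  # taken off 'issued' once it is in flight
    to_free: List[str] = field(default_factory=list)


def terminate_all(chan: Channel) -> None:
    """Queue every outstanding descriptor for freeing."""
    # The in-flight descriptor is on neither list, so it is handled explicitly.
    if chan.current_tx is not None:
        chan.to_free.append(chan.current_tx)
        chan.current_tx = None
    chan.to_free.extend(chan.submitted)
    chan.to_free.extend(chan.issued)
    chan.submitted.clear()
    chan.issued.clear()
```

Without the explicit `current_tx` branch, the model would clear both lists and still hold one descriptor that nothing ever frees.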
      
      Fixes: b127315d ("dmaengine: apple-admac: Add Apple ADMAC driver")
      Signed-off-by: Martin Povišer <povik+lin@cutebit.org>
      Link: https://lore.kernel.org/r/20230224152222.26732-2-povik+lin@cutebit.org
      
      
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      b7abd535
    • Martin Povišer's avatar
      dmaengine: apple-admac: Set src_addr_widths capability · fdbd0392
      Martin Povišer authored
      
      
      [ Upstream commit 6e96adca ]
      
      Add missing setting of 'src_addr_widths', which is the same as for the
      other direction.
      
      Fixes: b127315d ("dmaengine: apple-admac: Add Apple ADMAC driver")
      Signed-off-by: Martin Povišer <povik+lin@cutebit.org>
      Link: https://lore.kernel.org/r/20230224152222.26732-3-povik+lin@cutebit.org
      
      
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      fdbd0392
    • Martin Povišer's avatar
      dmaengine: apple-admac: Handle 'global' interrupt flags · c7bb0859
      Martin Povišer authored
      
      
      [ Upstream commit a288fd15 ]
      
      In addition to TX channel and RX channel interrupt flags there's
      another class of 'global' interrupt flags with unknown semantics. Those
      weren't being handled up to now, and they are the suspected cause of
      stuck IRQ states that have been sporadically occurring. Check the global
      flags and clear them if raised.
      
      Fixes: b127315d ("dmaengine: apple-admac: Add Apple ADMAC driver")
      Signed-off-by: Martin Povišer <povik+lin@cutebit.org>
      Link: https://lore.kernel.org/r/20230224152222.26732-1-povik+lin@cutebit.org
      
      
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c7bb0859
    • George Guo's avatar
      LoongArch, bpf: Fix jit to skip speculation barrier opcode · 37b39345
      George Guo authored
      
      
      [ Upstream commit a6f6a95f ]
      
      Just skip the opcode (BPF_ST | BPF_NOSPEC) in the BPF JIT instead of
      failing to JIT the entire program, given LoongArch currently has no
      counterpart of a speculation barrier instruction. To verify the issue,
      use the ltp testcase as shown below.

      Also, Wang says:

        I can confirm there's currently no speculation barrier equivalent
        on LoongArch. (Loongson says there are builtin mitigations for
        Spectre-V1 and V2 on their chips, and AFAIK efforts to port the
        exploits to mips/LoongArch have all failed a few years ago.)
      
      Without this patch:
      
        $ ./bpf_prog02
        [...]
        bpf_common.c:123: TBROK: Failed verification: ??? (524)
        [...]
        Summary:
        passed   0
        failed   0
        broken   1
        skipped  0
        warnings 0
      
      With this patch:
      
        $ ./bpf_prog02
        [...]
        Summary:
        passed   0
        failed   0
        broken   0
        skipped  0
        warnings 0
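
The behavioural change can be sketched as a dispatch loop in which the barrier opcode emits nothing instead of aborting. This Python model is illustrative, not the LoongArch JIT: `jit_insn`, `jit_program`, and the `emitters` table are hypothetical, and the opcode and errno values are assumed from the kernel's BPF and errno headers (524 matches the code shown in the failing log).

```python
BPF_ST = 0x02      # store-immediate instruction class (assumed value)
BPF_NOSPEC = 0xC0  # speculation barrier marker (assumed value)
ENOTSUPP = 524     # kernel-internal "operation not supported" code


def jit_insn(opcode: int, emitters: dict) -> list:
    """Translate one BPF opcode into a list of native instructions."""
    if opcode == (BPF_ST | BPF_NOSPEC):
        return []  # no barrier instruction exists: emit nothing and continue
    if opcode not in emitters:
        raise NotImplementedError(f"opcode {opcode:#x} not supported ({ENOTSUPP})")
    return list(emitters[opcode])


def jit_program(opcodes: list, emitters: dict) -> list:
    """JIT a whole program; a single unsupported opcode fails all of it."""
    out = []
    for op in opcodes:
        out.extend(jit_insn(op, emitters))
    return out
```

Treating the barrier as a no-op keeps the rest of the program eligible for JIT compilation, which is the difference between the two test runs.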
      
      Fixes: 5dc61552 ("LoongArch: Add BPF JIT support")
      Signed-off-by: George Guo <guodongtai@kylinos.cn>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: WANG Xuerui <git@xen0n.name>
      Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
      Link: https://lore.kernel.org/bpf/20230328071335.2664966-1-guodongtai@kylinos.cn
      
      
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      37b39345
    • Martin KaFai Lau's avatar
      bpf: tcp: Use sock_gen_put instead of sock_put in bpf_iter_tcp · db9c9086
      Martin KaFai Lau authored
      
      
      [ Upstream commit 580031ff ]
      
      While reviewing the udp-iter batching patches, I noticed that
      bpf_iter_tcp calling sock_put() is incorrect. It should call
      sock_gen_put() instead because bpf_iter_tcp iterates the ehash table,
      which contains req sk and tw sk. This patch replaces all sock_put()
      calls with sock_gen_put() in the bpf_iter_tcp codepath.
      
      Fixes: 04c7820b ("bpf: tcp: Bpf iter batching and lock_sock")
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20230328004232.2134233-1-martin.lau@linux.dev
      
      
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      db9c9086
    • Mark Zhang's avatar
      RDMA/cma: Allow UD qp_type to join multicast only · 02eabb63
      Mark Zhang authored
      [ Upstream commit 58e84f6b ]
      
      As for multicast:
      - SIDR is the only mode that makes sense;
      - Besides PS_UDP, other port spaces like PS_IB are also allowed, as they
        are UD compatible. In this case qkey also needs to be set [1].

      This patch allows only UD qp_type to join multicast, and sets qkey to
      the default if it's not set, to fix an uninit-value error: the
      ib->rec.qkey field is accessed without being initialized.
      
      =====================================================
      BUG: KMSAN: uninit-value in cma_set_qkey drivers/infiniband/core/cma.c:510 [inline]
      BUG: KMSAN: uninit-value in cma_make_mc_event+0xb73/0xe00 drivers/infiniband/core/cma.c:4570
       cma_set_qkey drivers/infiniband/core/cma.c:510 [inline]
       cma_make_mc_event+0xb73/0xe00 drivers/infiniband/core/cma.c:4570
       cma_iboe_join_multicast drivers/infiniband/core/cma.c:4782 [inline]
       rdma_join_multicast+0x2b83/0x30a0 drivers/infiniband/core/cma.c:4814
       ucma_process_join+0xa76/0xf60 drivers/infiniband/core/ucma.c:1479
       ucma_join_multicast+0x1e3/0x250 drivers/infiniband/core/ucma.c:1546
       ucma_write+0x639/0x6d0 drivers/infiniband/core/ucma.c:1732
       vfs_write+0x8ce/0x2030 fs/read_write.c:588
       ksys_write+0x28c/0x520 fs/read_write.c:643
       __do_sys_write fs/read_write.c:655 [inline]
       __se_sys_write fs/read_write.c:652 [inline]
       __ia32_sys_write+0xdb/0x120 fs/read_write.c:652
       do_syscall_32_irqs_on arch/x86/entry/common.c:114 [inline]
       __do_fast_syscall_32+0x96/0xf0 arch/x86/entry/common.c:180
       do_fast_syscall_32+0x34/0x70 arch/x86/entry/common.c:205
       do_SYSENTER_32+0x1b/0x20 arch/x86/entry/common.c:248
       entry_SYSENTER_compat_after_hwframe+0x4d/0x5c
      
      Local variable ib.i created at:
      cma_iboe_join_multicast drivers/infiniband/core/cma.c:4737 [inline]
      rdma_join_multicast+0x586/0x30a0 drivers/infiniband/core/cma.c:4814
      ucma_process_join+0xa76/0xf60 drivers/infiniband/core/ucma.c:1479
      
      CPU: 0 PID: 29874 Comm: syz-executor.3 Not tainted 5.16.0-rc3-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      =====================================================
      
      [1] https://lore.kernel.org/linux-rdma/20220117183832.GD84788@nvidia.com/
      
      
      
      Fixes: b5de0c60 ("RDMA/cma: Fix use after free race in roce multicast join")
      Reported-by: <syzbot+8fcbb77276d43cc8b693@syzkaller.appspotmail.com>
      Signed-off-by: Mark Zhang <markzhang@nvidia.com>
      Link: https://lore.kernel.org/r/58a4a98323b5e6b1282e83f6b76960d06e43b9fa.1679309909.git.leon@kernel.org
      
      
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      02eabb63
    • Alexander Stein's avatar
      clk: rs9: Fix suspend/resume · 74f4471a
      Alexander Stein authored
      
      
      [ Upstream commit 632e0473 ]
      
      Disabling the cache in commit 2ff4ba9e ("clk: rs9: Fix I2C accessors")
      without removing cache synchronization in the resume path results in a
      kernel panic, as map->cache_ops is unset due to REGCACHE_NONE.
      Re-enable the flat cache so that resume works again. num_reg_defaults_raw
      is necessary to read the cache defaults from hardware, because some
      registers are strapped in hardware and cannot be provided in software.
      
      Fixes: 2ff4ba9e ("clk: rs9: Fix I2C accessors")
      Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
      Link: https://lore.kernel.org/r/20230310074940.3475703-1-alexander.stein@ew.tq-group.com
      
      
      Signed-off-by: Stephen Boyd <sboyd@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      74f4471a
    • Cheng Xu's avatar
      RDMA/erdma: Defer probing if netdevice can not be found · 132918e0
      Cheng Xu authored
      
      
      [ Upstream commit 6bd1bca8 ]
      
      The ERDMA device may be probed before its associated netdevice.
      Returning -EPROBE_DEFER allows the OS to retry probing the erdma
      device later.
      
      Fixes: d55e6fb4 ("RDMA/erdma: Add the erdma module")
      Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
      Link: https://lore.kernel.org/r/20230320084652.16807-5-chengyou@linux.alibaba.com
      
      
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • RDMA/erdma: Inline mtt entries into WQE if supported · d682c9bc
      Cheng Xu authored
      
      
      [ Upstream commit 0dd83a4d ]
      
      The maximum inline MTT count supported is ERDMA_MAX_INLINE_MTT_ENTRIES,
      so inline MTT is also supported when
      mr->mem.mtt_nents == ERDMA_MAX_INLINE_MTT_ENTRIES. Fix this case.
      
      Fixes: 15505577 ("RDMA/erdma: Add verbs implementation")
      Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
      Link: https://lore.kernel.org/r/20230320084652.16807-4-chengyou@linux.alibaba.com
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • RDMA/erdma: Update default EQ depth to 4096 and max_send_wr to 8192 · 3570f3cc
      Cheng Xu authored
      
      
      [ Upstream commit 6256aa9a ]
      
      The maximum EQ depth of the hardware is 32K, and the current default EQ
      depth is too small for some applications, so change the default depth to
      4096. The hardware can support up to 8K send WRs, but the driver limits
      the value to 4K. Remove this limitation.
      
      Fixes: be3cff0f ("RDMA/erdma: Add the hardware related definitions")
      Fixes: db23ae64 ("RDMA/erdma: Add verbs header file")
      Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
      Link: https://lore.kernel.org/r/20230320084652.16807-3-chengyou@linux.alibaba.com
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • IB/mlx5: Add support for 400G_8X lane speed · 9554a6b5
      Maher Sanalla authored
      
      
      [ Upstream commit 88c9483f ]
      
      Currently, when the driver queries PTYS to report which link speed is
      being used on its RoCE ports, it does not check for the case of 400Gbps
      transmitted over 8 lanes. Thus it fails to report that speed and
      instead defaults to reporting 10G over 4 lanes.

      Add a check for that speed when querying PTYS and report it back
      correctly when needed.
      
      Fixes: 08e8676f ("IB/mlx5: Add support for 50Gbps per lane link modes")
      Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
      Reviewed-by: Aya Levin <ayal@nvidia.com>
      Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
      Link: https://lore.kernel.org/r/ec9040548d119d22557d6a4b4070d6f421701fd4.1678973994.git.leon@kernel.org
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • RDMA/irdma: Add ipv4 check to irdma_find_listener() · 6ea322a1
      Tatyana Nikolova authored
      
      
      [ Upstream commit e4522c09 ]
      
      Add an ipv4 check to irdma_find_listener(). Otherwise, if a listener with
      a zero IP address and the same port as the one searched for has already
      been created, the function incorrectly finds and returns that listener
      even though it has a different address family.
      
      Fixes: 146b9756 ("RDMA/irdma: Add connection manager")
      Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
      Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
      Link: https://lore.kernel.org/r/20230315145231.931-5-shiraz.saleem@intel.com
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • RDMA/irdma: Increase iWARP CM default rexmit count · ad960ae9
      Mustafa Ismail authored
      
      
      [ Upstream commit 8385a875 ]
      
      When running perftest with a large number of connections in iWARP mode,
      the passive side can be slow to respond. Increase the default rexmit
      count to allow the number of connections to scale.
      
      Fixes: 146b9756 ("RDMA/irdma: Add connection manager")
      Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
      Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
      Link: https://lore.kernel.org/r/20230315145231.931-4-shiraz.saleem@intel.com
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • RDMA/irdma: Fix memory leak of PBLE objects · ee02fa4a
      Mustafa Ismail authored
      
      
      [ Upstream commit b69a6979 ]
      
      On rmmod of irdma, the PBLE object memory is not being freed. Unlike
      other HMC objects, PBLE object memory is not statically pre-allocated at
      function initialization time. PBLE objects and their Segment Descriptors
      (SDs) can be dynamically allocated during scale-up, and the SDs remain
      allocated until function deinitialization.

      Fix this leak by adding IRDMA_HMC_IW_PBLE to the iw_hmc_obj_types[] table
      and skipping PBLEs in irdma_create_hmc_obj() but not in
      irdma_del_hmc_objects().
      
      Fixes: 44d9e529 ("RDMA/irdma: Implement device initialization definitions")
      Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
      Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
      Link: https://lore.kernel.org/r/20230315145231.931-3-shiraz.saleem@intel.com
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • RDMA/irdma: Do not generate SW completions for NOPs · 6d61b0cc
      Mustafa Ismail authored
      
      
      [ Upstream commit 30ed9ee9 ]
      
      Currently, artificial SW completions are generated for NOP WQEs, which
      can produce unexpected completions with wr_id = 0. Skip the generation of
      artificial completions for NOPs.
      
      Fixes: 81091d76 ("RDMA/irdma: Add SW mechanism to generate completions on error")
      Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
      Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
      Link: https://lore.kernel.org/r/20230315145231.931-2-shiraz.saleem@intel.com
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>