  1. Jun 11, 2024
  2. Jun 10, 2024
  3. Jun 09, 2024
  4. Jun 07, 2024
    • liquidio: Adjust a NULL pointer handling path in lio_vf_rep_copy_packet · c44711b7
      Aleksandr Mishin authored
      In lio_vf_rep_copy_packet() pg_info->page is compared to a NULL value,
      but then it is unconditionally passed to skb_add_rx_frag() which looks
      strange and could lead to null pointer dereference.
      
      lio_vf_rep_copy_packet() call trace looks like:
      	octeon_droq_process_packets
      	 octeon_droq_fast_process_packets
      	  octeon_droq_dispatch_pkt
      	   octeon_create_recv_info
      	    ...search in the dispatch_list...
      	     ->disp_fn(rdisp->rinfo, ...)
      	      lio_vf_rep_pkt_recv(struct octeon_recv_info *recv_info, ...)
      In this path there is no code which sets pg_info->page to NULL.
      So this check looks unneeded and doesn't solve potential problem.
      But I guess the author had reason to add a check and I have no such card
      and can't do real test.
      In addition, the code in the function liquidio_push_packet() in
      liquidio/lio_core.c does exactly the same.
      
      Based on this, I consider the most acceptable compromise to be
      moving skb_add_rx_frag() into the conditional scope.
      
      Found by Linux Verification Center (linuxtesting.org) with SVACE.
      
      Fixes: 1f233f32 ("liquidio: switchdev support for LiquidIO NIC")
      Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'hns3-fixes' · dbfb8864
      David S. Miller authored
      
      
      Jijie Shao says:
      
      ====================
      There are some bugfixes for the HNS3 ethernet driver
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: add cond_resched() to hns3 ring buffer init process · 968fde83
      Jie Wang authored
      Currently the hns3 ring buffer init process can hold the CPU for too
      long when the Tx/Rx ring depth is large. This could cause a soft lockup.

      So this patch adds cond_resched() to the process, so that the CPU can
      break off to run other tasks instead of busy looping.
      
      Fixes: a723fb8e ("net: hns3: refine for set ring parameters")
      Signed-off-by: Jie Wang <wangjie125@huawei.com>
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: fix kernel crash problem in concurrent scenario · 12cda920
      Yonglong Liu authored
      When the link status changes, the nic driver needs to notify the roce
      driver to handle this event, but at this time the roce driver may be
      uninitializing, which causes a kernel crash.

      To fix the problem, when the link status changes, check whether
      roce is registered, and on uninit, wait for the link update to
      finish.
      
      Fixes: 45e92b7e ("net: hns3: add calling roce callback function when link status change")
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dt-bindings: net: dp8386x: Add MIT license along with GPL-2.0 · b472b996
      Udit Kumar authored
      
      
      Modify license to include dual licensing as GPL-2.0-only OR MIT
      license for TI specific phy header files. This allows for Linux
      kernel files to be used in other Operating System ecosystems
      such as Zephyr or FreeBSD.
      
      While at it, update GPL-2.0 to GPL-2.0-only to be in sync
      with the latest SPDX conventions (GPL-2.0 is deprecated).

      Also update the TI copyright year to the current year
      to indicate the license change.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Trent Piepho <tpiepho@impinj.com>
      Cc: Wadim Egorov <w.egorov@phytec.de>
      Cc: Kip Broadhurst <kbroadhurst@ti.com>
      Signed-off-by: Udit Kumar <u-kumar1@ti.com>
      Acked-by: Wadim Egorov <w.egorov@phytec.de>
      Acked-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sfp: Always call `sfp_sm_mod_remove()` on remove · e96b2933
      Csókás, Bence authored
      If the module is in SFP_MOD_ERROR, `sfp_sm_mod_remove()` will
      not be run. As a consequence, `sfp_hwmon_remove()` is not getting
      run either, leaving a stale `hwmon` device behind. `sfp_sm_mod_remove()`
      itself checks `sfp->sm_mod_state` anyways, so this check was not
      really needed in the first place.
      
      Fixes: d2e816c0 ("net: sfp: handle module remove outside state machine")
      Signed-off-by: "Csókás, Bence" <csokas.bence@prolan.hu>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20240605084251.63502-1-csokas.bence@prolan.hu
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · d30d0e49
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from BPF and big collection of fixes for WiFi core and
        drivers.
      
        Current release - regressions:
      
         - vxlan: fix regression when dropping packets due to invalid src
           addresses
      
         - bpf: fix a potential use-after-free in bpf_link_free()
      
         - xdp: revert support for redirect to any xsk socket bound to the
           same UMEM as it can result in a corruption
      
         - virtio_net:
            - add missing lock protection when reading return code from
              control_buf
            - fix false-positive lockdep splat in DIM
            - Revert "wifi: wilc1000: convert list management to RCU"
      
         - wifi: ath11k: fix error path in ath11k_pcic_ext_irq_config
      
        Previous releases - regressions:
      
         - rtnetlink: make the "split" NLM_DONE handling generic, restore the
           old behavior for two cases where we started coalescing those
           messages with normal messages, breaking sloppily-coded userspace
      
         - wifi:
            - cfg80211: validate HE operation element parsing
            - cfg80211: fix 6 GHz scan request building
            - mt76: mt7615: add missing chanctx ops
            - ath11k: move power type check to ASSOC stage, fix connecting to
              6 GHz AP
            - ath11k: fix WCN6750 firmware crash caused by 17 num_vdevs
            - rtlwifi: ignore IEEE80211_CONF_CHANGE_RETRY_LIMITS
            - iwlwifi: mvm: fix a crash on 7265
      
        Previous releases - always broken:
      
         - ncsi: prevent multi-threaded channel probing, a spec violation
      
         - vmxnet3: disable rx data ring on dma allocation failure
      
         - ethtool: init tsinfo stats if requested, prevent unintentionally
           reporting all-zero stats on devices which don't implement any
      
         - dst_cache: fix possible races in less common IPv6 features
      
         - tcp: auth: don't consider TCP_CLOSE to be in TCP_AO_ESTABLISHED
      
         - ax25: fix two refcounting bugs
      
         - eth: ionic: fix kernel panic in XDP_TX action
      
        Misc:
      
         - tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB"
      
      * tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (107 commits)
        selftests: net: lib: set 'i' as local
        selftests: net: lib: avoid error removing empty netns name
        selftests: net: lib: support errexit with busywait
        net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool()
        ipv6: fix possible race in __fib6_drop_pcpu_from()
        af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill().
        af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen().
        af_unix: Use skb_queue_empty_lockless() in unix_release_sock().
        af_unix: Use unix_recvq_full_lockless() in unix_stream_connect().
        af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen.
        af_unix: Annotate data-races around sk->sk_sndbuf.
        af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG.
        af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb().
        af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg().
        af_unix: Annotate data-race of sk->sk_state in unix_accept().
        af_unix: Annotate data-race of sk->sk_state in unix_stream_connect().
        af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll().
        af_unix: Annotate data-race of sk->sk_state in unix_inq_len().
        af_unix: Annodate data-races around sk->sk_state for writers.
        af_unix: Set sk->sk_state under unix_state_lock() for truly disconencted peer.
        ...
    • Merge tag 'tomoyo-pr-20240606' of git://git.code.sf.net/p/tomoyo/tomoyo · 2faf6332
      Linus Torvalds authored
      Pull tomoyo fixlet from Tetsuo Handa:
       "Single patch to update project links, no behavior changes"
      
      * tag 'tomoyo-pr-20240606' of git://git.code.sf.net/p/tomoyo/tomoyo:
        tomoyo: update project links
    • Merge tag 'efi-fixes-for-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi · a34adf60
      Linus Torvalds authored
      Pull EFI fixes from Ard Biesheuvel:
      
       - Ensure that .discard sections are really discarded in the EFI zboot
         image build
      
       - Return proper error numbers from efi-pstore
      
       - Add __nocfi annotations to EFI runtime wrappers
      
      * tag 'efi-fixes-for-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
        efi: Add missing __nocfi annotations to runtime wrappers
        efi: pstore: Return proper errors on UEFI failures
        efi/libstub: zboot.lds: Discard .discard sections
  5. Jun 06, 2024
    • Merge branch 'selftests-net-lib-small-fixes' · 27bc8654
      Jakub Kicinski authored
      
      
      Matthieu Baerts says:
      
      ====================
      selftests: net: lib: small fixes
      
      While looking at using 'lib.sh' for the MPTCP selftests [1], we found
      some small issues with 'lib.sh'. Here they are:
      
      - Patch 1: fix 'errexit' (set -e) support with busywait. 'errexit' is
        supported in some functions, not all. A fix for v6.8+.
      
      - Patch 2: avoid confusing error messages linked to the cleaning part
        when the netns setup fails. A fix for v6.8+.
      
      - Patch 3: set a variable as local to avoid accidentally changing the
        value of another one with the same name on the caller side. A fix
        for v6.10-rc1+.
      
      Link: https://lore.kernel.org/mptcp/5f4615c3-0621-43c5-ad25-55747a4350ce@kernel.org/T/ [1]
      ====================
      
      Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-0-b3afadd368c9@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: net: lib: set 'i' as local · 84a8bc3e
      Matthieu Baerts (NGI0) authored
      Without this, the 'i' variable declared before could be overridden by
      accident, e.g.
      
        for i in "${@}"; do
            __ksft_status_merge "${i}"  ## 'i' has been modified
            foo "${i}"                  ## using 'i' with an unexpected value
        done
      
      After a quick look, it looks like 'i' is currently not used after having
      been modified in __ksft_status_merge(), but still, better be safe than
      sorry. I saw this while modifying the same file, not because I suspected
      an issue somewhere.
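
      As a minimal sketch of the problem described above, the script below
      compares a helper that loops over 'i' without declaring it local with
      one that does. The helper names are hypothetical and are not taken
      from lib.sh; only the scoping behaviour is the point.

```shell
#!/bin/bash
# Hypothetical helpers: they only mimic the shape of the problem.

# Without 'local', the loop variable leaks into the caller's scope.
merge_leaky() {
    for i in "$@"; do :; done
}

# With 'local', the caller's 'i' is left untouched.
merge_safe() {
    local i
    for i in "$@"; do :; done
}

i="caller"
merge_leaky a b c
leaky_result="${i}"     # 'i' was overwritten by the helper's loop

i="caller"
merge_safe a b c
safe_result="${i}"      # 'i' still holds the caller's value

echo "leaky: ${leaky_result}, safe: ${safe_result}"
```

      In the leaky variant the caller ends up with the last value the
      helper iterated over, which is the kind of accidental override the
      'local' declaration is meant to prevent.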
      
      Fixes: 596c8819 ("selftests: forwarding: Have RET track kselftest framework constants")
      Acked-by: Geliang Tang <geliang@kernel.org>
      Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
      Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
      Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-3-b3afadd368c9@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: net: lib: avoid error removing empty netns name · 79322174
      Matthieu Baerts (NGI0) authored
      If creating the first netns with 'setup_ns()' fails, 'cleanup_ns()'
      will be called with an empty string as first parameter.

      The consequence is that 'cleanup_ns()' will try to delete an invalid
      netns, and wait 20 seconds if the netns list is empty.
      
      Instead of just checking if the name is not empty, convert the string
      separated by spaces to an array. Manipulating the array is cleaner, and
      calling 'cleanup_ns()' with an empty array will be a no-op.
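
      The difference between the string form and the array form can be
      sketched as follows. This is an illustration only: 'cleanup_names'
      is a hypothetical stand-in that merely records its arguments, not
      the real 'cleanup_ns()' from lib.sh, and the variable names are
      assumptions.

```shell
#!/bin/bash
# Hypothetical stand-in for the real helper: it only records what it
# was asked to delete instead of touching any network namespace.
deleted=()
cleanup_names() {
    local ns
    for ns in "$@"; do
        deleted+=("${ns}")
    done
}

# String form: a quoted empty string is still passed as one (empty) argument.
ns_list=""
cleanup_names "${ns_list}"
string_calls="${#deleted[@]}"

# Array form: an empty array expands to zero arguments, so the loop never runs.
deleted=()
ns_array=()
cleanup_names "${ns_array[@]}"
array_calls="${#deleted[@]}"

echo "string form: ${string_calls} deletion(s), array form: ${array_calls} deletion(s)"
```

      With the array form there is no need for a separate emptiness
      check: an empty list simply results in no work being done.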
      
      Fixes: 25ae948b ("selftests/net: add lib.sh")
      Cc: stable@vger.kernel.org
      Acked-by: Geliang Tang <geliang@kernel.org>
      Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
      Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-2-b3afadd368c9@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: net: lib: support errexit with busywait · 41b02ea4
      Matthieu Baerts (NGI0) authored
      If errexit is enabled ('set -e'), loopy_wait -- or busywait and others
      using it -- will stop after the first failure.
      
      Note that if the returned status of loopy_wait is checked, and even if
      errexit is enabled, Bash will not stop at the first error.
      
      Fixes: 25ae948b ("selftests/net: add lib.sh")
      Cc: stable@vger.kernel.org
      Acked-by: Geliang Tang <geliang@kernel.org>
      Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
      Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
      Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-1-b3afadd368c9@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool() · 0dcc53ab
      Su Hui authored
      Clang static checker (scan-build) warning:
      net/ethtool/ioctl.c:line 2233, column 2
      Called function pointer is null (null dereference).
      
      Return '-EOPNOTSUPP' when 'ops->get_ethtool_phy_stats' is NULL to fix
      this typo error.
      
      Fixes: 201ed315 ("net/ethtool/ioctl: split ethtool_get_phy_stats into multiple helpers")
      Signed-off-by: Su Hui <suhui@nfschina.com>
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Reviewed-by: Hariprasad Kelam <hkelam@marvell.com>
      Link: https://lore.kernel.org/r/20240605034742.921751-1-suhui@nfschina.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • ipv6: fix possible race in __fib6_drop_pcpu_from() · b01e1c03
      Eric Dumazet authored
      syzbot found a race in __fib6_drop_pcpu_from() [1]
      
      If the compiler reads (*ppcpu_rt) more than once, the
      second read could return NULL if another CPU clears
      the value in rt6_get_pcpu_route().
      
      Add a READ_ONCE() to prevent this race.
      
      Also add rcu_read_lock()/rcu_read_unlock() because
      we rely on RCU protection while dereferencing pcpu_rt.
      
      [1]
      
      Oops: general protection fault, probably for non-canonical address 0xdffffc0000000012: 0000 [#1] PREEMPT SMP KASAN PTI
      KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097]
      CPU: 0 PID: 7543 Comm: kworker/u8:17 Not tainted 6.10.0-rc1-syzkaller-00013-g2bfcfd584ff5 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
      Workqueue: netns cleanup_net
       RIP: 0010:__fib6_drop_pcpu_from.part.0+0x10a/0x370 net/ipv6/ip6_fib.c:984
      Code: f8 48 c1 e8 03 80 3c 28 00 0f 85 16 02 00 00 4d 8b 3f 4d 85 ff 74 31 e8 74 a7 fa f7 49 8d bf 90 00 00 00 48 89 f8 48 c1 e8 03 <80> 3c 28 00 0f 85 1e 02 00 00 49 8b 87 90 00 00 00 48 8b 0c 24 48
      RSP: 0018:ffffc900040df070 EFLAGS: 00010206
      RAX: 0000000000000012 RBX: 0000000000000001 RCX: ffffffff89932e16
      RDX: ffff888049dd1e00 RSI: ffffffff89932d7c RDI: 0000000000000091
      RBP: dffffc0000000000 R08: 0000000000000005 R09: 0000000000000007
      R10: 0000000000000001 R11: 0000000000000006 R12: ffff88807fa080b8
      R13: fffffbfff1a9a07d R14: ffffed100ff41022 R15: 0000000000000001
      FS:  0000000000000000(0000) GS:ffff8880b9200000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000001b32c26000 CR3: 000000005d56e000 CR4: 00000000003526f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
        __fib6_drop_pcpu_from net/ipv6/ip6_fib.c:966 [inline]
        fib6_drop_pcpu_from net/ipv6/ip6_fib.c:1027 [inline]
        fib6_purge_rt+0x7f2/0x9f0 net/ipv6/ip6_fib.c:1038
        fib6_del_route net/ipv6/ip6_fib.c:1998 [inline]
        fib6_del+0xa70/0x17b0 net/ipv6/ip6_fib.c:2043
        fib6_clean_node+0x426/0x5b0 net/ipv6/ip6_fib.c:2205
        fib6_walk_continue+0x44f/0x8d0 net/ipv6/ip6_fib.c:2127
        fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2175
        fib6_clean_tree+0xd7/0x120 net/ipv6/ip6_fib.c:2255
        __fib6_clean_all+0x100/0x2d0 net/ipv6/ip6_fib.c:2271
        rt6_sync_down_dev net/ipv6/route.c:4906 [inline]
        rt6_disable_ip+0x7ed/0xa00 net/ipv6/route.c:4911
        addrconf_ifdown.isra.0+0x117/0x1b40 net/ipv6/addrconf.c:3855
        addrconf_notify+0x223/0x19e0 net/ipv6/addrconf.c:3778
        notifier_call_chain+0xb9/0x410 kernel/notifier.c:93
        call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:1992
        call_netdevice_notifiers_extack net/core/dev.c:2030 [inline]
        call_netdevice_notifiers net/core/dev.c:2044 [inline]
        dev_close_many+0x333/0x6a0 net/core/dev.c:1585
        unregister_netdevice_many_notify+0x46d/0x19f0 net/core/dev.c:11193
        unregister_netdevice_many net/core/dev.c:11276 [inline]
        default_device_exit_batch+0x85b/0xae0 net/core/dev.c:11759
        ops_exit_list+0x128/0x180 net/core/net_namespace.c:178
        cleanup_net+0x5b7/0xbf0 net/core/net_namespace.c:640
        process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231
        process_scheduled_works kernel/workqueue.c:3312 [inline]
        worker_thread+0x6c8/0xf70 kernel/workqueue.c:3393
        kthread+0x2c1/0x3a0 kernel/kthread.c:389
        ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
        ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
      
      Fixes: d52d3997 ("ipv6: Create percpu rt6_info")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/r/20240604193549.981839-1-edumazet@google.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge branch 'af_unix-fix-lockless-access-of-sk-sk_state-and-others-fields' · 411c0ea6
      Paolo Abeni authored
      
      
      Kuniyuki Iwashima says:
      
      ====================
      af_unix: Fix lockless access of sk->sk_state and others fields.
      
      Patch 1 fixes a bug where SOCK_DGRAM's sk->sk_state is changed
      to TCP_CLOSE even if the socket is connect()ed to another socket.
      
      The rest of this series annotates lockless accesses to the following
      fields.
      
        * sk->sk_state
        * sk->sk_sndbuf
        * net->unx.sysctl_max_dgram_qlen
        * sk->sk_receive_queue.qlen
        * sk->sk_shutdown
      
      Note that with this series there is skb_queue_empty() left in
      unix_dgram_disconnected() that needs to be changed to lockless
      version, and unix_peer(other) access there should be protected
      by unix_state_lock().
      
      This will require some refactoring, so another series will follow.
      
      Changes:
        v2:
          * Patch 1: Fix wrong double lock
      
        v1: https://lore.kernel.org/netdev/20240603143231.62085-1-kuniyu@amazon.com/
      ====================
      
      Link: https://lore.kernel.org/r/20240604165241.44758-1-kuniyu@amazon.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill(). · efaf24e3
      Kuniyuki Iwashima authored
      While dumping sockets via UNIX_DIAG, we do not hold unix_state_lock().
      
      Let's use READ_ONCE() to read sk->sk_shutdown.
      
      Fixes: e4e541a8 ("sock-diag: Report shutdown for inet and unix sockets (v2)")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen(). · 5d915e58
      Kuniyuki Iwashima authored
      We can dump the socket queue length via UNIX_DIAG by specifying
      UDIAG_SHOW_RQLEN.
      
      If sk->sk_state is TCP_LISTEN, we return the recv queue length,
      but here we do not hold recvq lock.
      
      Let's use skb_queue_len_lockless() in sk_diag_show_rqlen().
      
      Fixes: c9da99e6 ("unix_diag: Fixup RQLEN extension report")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Use skb_queue_empty_lockless() in unix_release_sock(). · 83690b82
      Kuniyuki Iwashima authored
      If the socket type is SOCK_STREAM or SOCK_SEQPACKET, unix_release_sock()
      checks the length of the peer socket's recvq under unix_state_lock().
      
      However, unix_stream_read_generic() calls skb_unlink() after releasing
      the lock.  Also, for SOCK_SEQPACKET, __skb_try_recv_datagram() unlinks
      skb without unix_state_lock().
      
      Thus, unix_state_lock() does not protect qlen.
      
      Let's use skb_queue_empty_lockless() in unix_release_sock().
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Use unix_recvq_full_lockless() in unix_stream_connect(). · 45d872f0
      Kuniyuki Iwashima authored
      Once sk->sk_state is changed to TCP_LISTEN, it never changes.
      
      unix_accept() takes advantage of this characteristic; it does not
      hold the listener's unix_state_lock() and only acquires recvq lock
      to pop one skb.
      
      It means unix_state_lock() does not prevent the queue length from
      changing in unix_stream_connect().
      
      Thus, we need to use unix_recvq_full_lockless() to avoid data-race.
      
      Now we remove unix_recvq_full() as no one uses it.
      
      Note that we can remove READ_ONCE() for sk->sk_max_ack_backlog in
      unix_recvq_full_lockless() because of the following reasons:
      
        (1) For SOCK_DGRAM, it is a written-once field in unix_create1()
      
        (2) For SOCK_STREAM and SOCK_SEQPACKET, it is changed under the
            listener's unix_state_lock() in unix_listen(), and we hold
            the lock in unix_stream_connect()
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen. · bd9f2d05
      Kuniyuki Iwashima authored
      net->unx.sysctl_max_dgram_qlen is exposed as a sysctl knob and can be
      changed concurrently.
      
      Let's use READ_ONCE() in unix_create1().
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Annotate data-races around sk->sk_sndbuf. · b0632e53
      Kuniyuki Iwashima authored
      sk_setsockopt() changes sk->sk_sndbuf under lock_sock(), but it's
      not used in af_unix.c.
      
      Let's use READ_ONCE() to read sk->sk_sndbuf in unix_writable(),
      unix_dgram_sendmsg(), and unix_stream_sendmsg().
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG. · 0aa3be7b
      Kuniyuki Iwashima authored
      While dumping AF_UNIX sockets via UNIX_DIAG, sk->sk_state is read
      locklessly.
      
      Let's use READ_ONCE() there.
      
      Note that the result could be inconsistent if the socket is dumped
      during the state change.  This is common for other SOCK_DIAG and
      similar interfaces.
      
      Fixes: c9da99e6 ("unix_diag: Fixup RQLEN extension report")
      Fixes: 2aac7a2c ("unix_diag: Pending connections IDs NLA")
      Fixes: 45a96b9b ("unix_diag: Dumping all sockets core")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb(). · af4c733b
      Kuniyuki Iwashima authored
      unix_stream_read_skb() is called from sk->sk_data_ready() context
      where unix_state_lock() is not held.
      
      Let's use READ_ONCE() there.
      
      Fixes: 77462de1 ("af_unix: Add read_sock for stream socket types")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg(). · 8a34d4e8
      Kuniyuki Iwashima authored
      The following functions read sk->sk_state locklessly and proceed only if
      the state is TCP_ESTABLISHED.
      
        * unix_stream_sendmsg
        * unix_stream_read_generic
        * unix_seqpacket_sendmsg
        * unix_seqpacket_recvmsg
      
      Let's use READ_ONCE() there.
      
      Fixes: a05d2ad1 ("af_unix: Only allow recv on connected seqpacket sockets.")
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Annotate data-race of sk->sk_state in unix_accept(). · 1b536948
      Kuniyuki Iwashima authored
      Once sk->sk_state is changed to TCP_LISTEN, it never changes.
      
      unix_accept() takes advantage of this and reads sk->sk_state without
      holding unix_state_lock().
      
      Let's use READ_ONCE() there.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Annotate data-race of sk->sk_state in unix_stream_connect(). · a9bf9c7d
      Kuniyuki Iwashima authored
      As a small optimisation, unix_stream_connect() prefetches the client's
      sk->sk_state without unix_state_lock() and checks if it's TCP_CLOSE.
      
      Later, sk->sk_state is checked again under unix_state_lock().
      
      Let's use READ_ONCE() for the first check and TCP_CLOSE directly for
      the second check.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll(). · eb0718fb
      Kuniyuki Iwashima authored
      unix_poll() and unix_dgram_poll() read sk->sk_state locklessly and
      call unix_writable(), which also reads sk->sk_state without holding
      unix_state_lock().
      
      Let's use READ_ONCE() in unix_poll() and unix_dgram_poll() and pass
      it to unix_writable().
      
      While at it, we remove the TCP_SYN_SENT check in unix_dgram_poll(), as
      that state has never existed for AF_UNIX sockets since the code was added.
      
      Fixes: 1586a587 ("af_unix: do not report POLLOUT on listeners")
      Fixes: 3c73419c ("af_unix: fix 'poll for write'/ connected DGRAM sockets")
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Annotate data-race of sk->sk_state in unix_inq_len(). · 3a0f38eb
      Kuniyuki Iwashima authored
      ioctl(SIOCINQ) calls unix_inq_len() that checks sk->sk_state first
      and returns -EINVAL if it's TCP_LISTEN.
      
      Then, for SOCK_STREAM sockets, unix_inq_len() returns the number of
      bytes in recvq.
      
      However, unix_inq_len() does not hold unix_state_lock(), and the
      concurrent listen() might change the state after checking sk->sk_state.
      
      If the race occurs, 0 is returned for the listener, instead of -EINVAL,
      because the length of skb with embryo is 0.
      
      We could hold unix_state_lock() in unix_inq_len(), but it's overkill
      given the result is true for pre-listen() TCP_CLOSE state.
      
      So, let's use READ_ONCE() for sk->sk_state in unix_inq_len().
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • af_unix: Annodate data-races around sk->sk_state for writers. · 942238f9
      Kuniyuki Iwashima authored
      sk->sk_state is changed under unix_state_lock(), but it's read locklessly
      in many places.
      
      This patch adds WRITE_ONCE() on the writer side.
      
      We will add READ_ONCE() to the lockless readers in the following patches.
      
      Fixes: 83301b53 ("af_unix: Set TCP_ESTABLISHED for datagram sockets too")
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>