  1. Nov 16, 2023
    • Maher Sanalla's avatar
      net/mlx5: Free used cpus mask when an IRQ is released · 7d2f74d1
      Maher Sanalla authored
      Each EQ table maintains a cpumask of the already used CPUs that are mapped
      to IRQs to ensure that each IRQ gets mapped to a unique CPU.
      
      However, on IRQ release, the said cpumask is not updated by clearing the
      CPU from the mask to allow future IRQ requests, causing the following
      error when an SF is reloaded after it has utilized all CPUs for its IRQs:
      
      mlx5_irq_affinity_request:135:(pid 306010): Didn't find a matching IRQ.
      err = -28
      
      Thus, when releasing an IRQ, clear its mapped CPU from the used CPUs
      mask, to prevent the case described above.
      
      While at it, move the used cpumask update to the EQ layer as it is more
      fitting and preserves the symmetry of the IRQ request/release API.
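The request/release symmetry described above can be sketched in plain C. This is a minimal illustration, not the mlx5 code: `struct eq_table`, `irq_request()`, `irq_release()` and `NCPUS` are hypothetical names, and a 32-bit word stands in for the kernel cpumask.

```c
#include <assert.h>
#include <stdint.h>

#define NCPUS 8                 /* hypothetical CPU count for this sketch */

struct eq_table {
    uint32_t used_cpus;         /* bit n set => CPU n already backs an IRQ */
};

/* Pick the lowest free CPU, mark it used and return it.
 * Returns -28 (the -ENOSPC value seen in the log) when every CPU is taken. */
static int irq_request(struct eq_table *t)
{
    for (int cpu = 0; cpu < NCPUS; cpu++) {
        if (!(t->used_cpus & (1u << cpu))) {
            t->used_cpus |= 1u << cpu;
            return cpu;
        }
    }
    return -28;
}

/* Release must clear the bit; otherwise the CPU stays "used" forever
 * and a later request fails even though no IRQ is mapped to it. */
static void irq_release(struct eq_table *t, int cpu)
{
    assert(cpu >= 0 && cpu < NCPUS);
    t->used_cpus &= ~(1u << cpu);
}
```

Without the clearing step in `irq_release()`, a request made after all CPUs were used once would keep returning -28, which matches the failure mode the commit message describes.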
      
      Fixes: a1772de7 ("net/mlx5: Refactor completion IRQ request/release API")
      Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Shay Drory <shayd@nvidia.com>
      Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Link: https://lore.kernel.org/r/20231114215846.5902-3-saeed@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      7d2f74d1
    • Itamar Gozlan's avatar
      Revert "net/mlx5: DR, Supporting inline WQE when possible" · df3aafe5
      Itamar Gozlan authored
      This reverts commit 95c337cc.
      The revert is required because the commit is suspected of causing some
      tests to fail; it will be investigated further.
      
      Fixes: 95c337cc ("net/mlx5: DR, Supporting inline WQE when possible")
      Signed-off-by: Itamar Gozlan <igozlan@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Link: https://lore.kernel.org/r/20231114215846.5902-2-saeed@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      df3aafe5
    • Jakub Kicinski's avatar
      Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · a6a6a0a9
      Jakub Kicinski authored
      
      
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf 2023-11-15
      
      We've added 7 non-merge commits during the last 6 day(s) which contain
      a total of 9 files changed, 200 insertions(+), 49 deletions(-).
      
      The main changes are:
      
      1) Do not allocate bpf specific percpu memory unconditionally, from Yonghong.
      
      2) Fix precision backtracking instruction iteration, from Andrii.
      
      3) Fix control flow graph checking, from Andrii.
      
      4) Fix xskxceiver selftest build, from Anders.
      
      * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        bpf: Do not allocate percpu memory at init stage
        selftests/bpf: add more test cases for check_cfg()
        bpf: fix control-flow graph checking in privileged mode
        selftests/bpf: add edge case backtracking logic test
        bpf: fix precision backtracking instruction iteration
        bpf: handle ldimm64 properly in check_cfg()
        selftests: bpf: xskxceiver: ksft_print_msg: fix format type error
      ====================
      
      Link: https://lore.kernel.org/r/20231115214949.48854-1-alexei.starovoitov@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      a6a6a0a9
  2. Nov 15, 2023
    • Yonghong Song's avatar
      bpf: Do not allocate percpu memory at init stage · 1fda5bb6
      Yonghong Song authored
      Kirill Shutemov reported a significant increase in percpu memory consumption
      after booting a 288-cpu VM ([1]) due to commit 41a5db8d ("bpf: Add support for
      non-fix-size percpu mem allocation"). The percpu memory consumption
      increased from 111MB to 969MB. The numbers are from /proc/meminfo.
      
      I tried to reproduce the issue with my local VM, which supports at most
      255 cpus. With 252 cpus, the percpu memory consumption immediately after
      boot is 57MB without the above commit and 231MB with it.
      
      This is not good since percpu memory from the bpf memory allocator is not
      widely used yet. Let us change pre-allocation at the init stage to on-demand
      allocation when the verifier detects that a bpf program needs percpu memory.
      With this change, percpu memory consumption after boot can be reduced
      significantly.
      
        [1] https://lore.kernel.org/lkml/20231109154934.4saimljtqx625l3v@box.shutemov.name/
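The switch from pre-allocation to on-demand allocation can be illustrated with a small user-space sketch. Everything here is an assumption made for illustration: `struct percpu_cache`, `cache_ensure()`, `verify_program()`, `NCPUS` and `CACHE_BYTES` are hypothetical names, and `calloc()` stands in for the kernel percpu allocator.

```c
#include <assert.h>
#include <stdlib.h>

#define NCPUS 4          /* hypothetical CPU count */
#define CACHE_BYTES 1024 /* hypothetical per-CPU cache size */

struct percpu_cache {
    void *per_cpu[NCPUS];
    int ready;
};

/* Allocate the per-CPU caches on first use; later calls are no-ops. */
static int cache_ensure(struct percpu_cache *c)
{
    if (c->ready)
        return 0;
    for (int i = 0; i < NCPUS; i++) {
        c->per_cpu[i] = calloc(1, CACHE_BYTES);
        if (!c->per_cpu[i]) {
            /* Undo the partial allocation before reporting failure. */
            while (i-- > 0) {
                free(c->per_cpu[i]);
                c->per_cpu[i] = NULL;
            }
            return -1;
        }
    }
    c->ready = 1;
    return 0;
}

/* Stand-in for the verifier: only programs that need percpu memory
 * trigger the allocation, so an idle system pays nothing at boot. */
static int verify_program(struct percpu_cache *c, int needs_percpu)
{
    return needs_percpu ? cache_ensure(c) : 0;
}
```

The cost moves from every boot to the first program that actually needs the memory, which is the trade-off the commit message argues for.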
      
      Fixes: 41a5db8d ("bpf: Add support for non-fix-size percpu mem allocation")
      Reported-and-tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
      Acked-by: Hou Tao <houtao1@huawei.com>
      Link: https://lore.kernel.org/r/20231111013928.948838-1-yonghong.song@linux.dev
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      1fda5bb6
    • Gal Pressman's avatar
      net: Fix undefined behavior in netdev name allocation · 674e3180
      Gal Pressman authored
      Cited commit removed the strscpy() call and kept the snprintf() only.
      
      It is common to use 'dev->name' as the format string before a netdev is
      registered; this results in 'res' and 'name' pointers being equal.
      According to POSIX, if copying takes place between objects that overlap
      as a result of a call to sprintf() or snprintf(), the results are
      undefined.
      
      Add back the strscpy() and use 'buf' as an intermediate buffer.
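The intermediate-buffer pattern can be sketched in user-space C. This is an illustration under stated assumptions rather than the kernel change itself: `format_name()` and `NAME_SZ` are hypothetical, and `memcpy()` stands in for the kernel-only strscpy().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_SZ 16   /* hypothetical size, stands in for the kernel name limit */

/* Expand a "%d" template into res. Because res and fmt may be the same
 * buffer, the output is formatted into a scratch buffer first and copied
 * afterwards, so snprintf() never reads and writes overlapping objects.
 * Returns 0 on success, -1 on error or truncation. */
static int format_name(char *res, const char *fmt, int unit)
{
    char buf[NAME_SZ];
    int n = snprintf(buf, sizeof(buf), fmt, unit);

    if (n < 0 || (size_t)n >= sizeof(buf))
        return -1;
    memcpy(res, buf, (size_t)n + 1);   /* copy includes the terminating NUL */
    return 0;
}
```

Calling `format_name(name, name, 3)` with `name` holding "eth%d" is then well defined, whereas passing the same pointer as both destination and format to snprintf() directly is not.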
      
      Fixes: 7ad17b04 ("net: trust the bitmap in __dev_alloc_name()")
      Cc: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Gal Pressman <gal@nvidia.com>
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      674e3180
    • Niklas Söderlund's avatar
      dt-bindings: net: ethernet-controller: Fix formatting error · efc0c836
      Niklas Söderlund authored
      
      
      When moving the *-internal-delay-ps properties to only apply for RGMII
      interface modes there was a typo in the text formatting.
      
      Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      efc0c836
    • Johnathan Mantey's avatar
      Revert ncsi: Propagate carrier gain/loss events to the NCSI controller · 9e2e7efb
      Johnathan Mantey authored
      This reverts commit 3780bb29.
      
      The cited commit introduced unwanted behavior.
      
      The intent of the commit was to be able to detect carrier loss/gain
      for just the NIC connected to the BMC. The unwanted effect is that a
      carrier loss on auxiliary paths also causes the BMC to lose
      carrier. The BMC never regains carrier despite the secondary NIC
      regaining a link.
      
      This change, when merged, needs to be backported to stable kernels.
      5.4-stable, 5.10-stable, 5.15-stable, 6.1-stable, 6.5-stable
      
      Fixes: 3780bb29 ("ncsi: Propagate carrier gain/loss events to the NCSI controller")
      CC: stable@vger.kernel.org
      Signed-off-by: Johnathan Mantey <johnathanx.mantey@intel.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9e2e7efb
    • Jakub Kicinski's avatar
      Merge branch 'mptcp-misc-fixes-for-v6-7' · a133eae8
      Jakub Kicinski authored
      
      
      Matthieu Baerts says:
      
      ====================
      mptcp: misc. fixes for v6.7
      
      Here are a few fixes related to MPTCP:
      
      - Patch 1 limits GSO max size to ~64K when MPTCP is being used due to a
        spec limit. 'gso_max_size' can exceed the max value supported by MPTCP
        since v5.19.
      
      - Patch 2 fixes a possible NULL pointer dereference on close that can
        happen since v6.7-rc1.
      
      - Patch 3 avoids sending a RM_ADDR when the corresponding address is no
        longer tracked locally. A regression for a fix backported to v5.19.
      
      - Patch 4 adds a missing lock when changing the IP TOS with setsockopt().
        A fix for v5.17.
      
      - Patch 5 fixes an expectation when running MPTCP Join selftest with the
        checksum option (-C). An issue present since v6.1.
      ====================
      
      Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-0-7b9cd6a7b7f4@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      a133eae8
    • Paolo Abeni's avatar
      selftests: mptcp: fix fastclose with csum failure · 7cefbe5e
      Paolo Abeni authored
      
      
      Running the mp_join selftest manually with the following command line:
      
        ./mptcp_join.sh -z -C
      
      leads to some failures:
      
        002 fastclose server test
        # ...
        rtx                                 [fail] got 1 MP_RST[s] TX expected 0
        # ...
        rstrx                               [fail] got 1 MP_RST[s] RX expected 0
      
      The problem is really in the wrong expectations for the RST checks
      implied by the csum validation. Note that the same check is repeated
      explicitly in the same test-case, with the correct expectation, and
      passes successfully.
      
      Address the issue by explicitly setting the correct expectation for
      the failing checks.
      
      Reported-by: Xiumei Mu <xmu@redhat.com>
      Fixes: 6bf41020 ("selftests: mptcp: update and extend fastclose test-cases")
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
      Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
      Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-5-7b9cd6a7b7f4@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      7cefbe5e
    • Paolo Abeni's avatar
      mptcp: fix setsockopt(IP_TOS) subflow locking · 7679d34f
      Paolo Abeni authored
      The MPTCP implementation of the IP_TOS socket option uses the lockless
      variant of the TOS manipulation helper and does not hold the required
      subflow lock when the helper is invoked.
      
      Add the required locking.
      
      Fixes: ffcacff8 ("mptcp: Support for IP_TOS for MPTCP setsockopt()")
      Cc: stable@vger.kernel.org
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/457
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Mat Martineau <martineau@kernel.org>
      Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
      Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-4-7b9cd6a7b7f4@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      7679d34f
    • Geliang Tang's avatar
      mptcp: add validity check for sending RM_ADDR · 8df220b2
      Geliang Tang authored
      This patch adds a validity check for sending RM_ADDRs for userspace PM
      in mptcp_pm_remove_addrs(): an RM_ADDR is only sent when the address is
      in the anno_list or conn_list.
      
      Fixes: 8b1c94da ("mptcp: only send RM_ADDR in nl_cmd_remove")
      Cc: stable@vger.kernel.org
      Signed-off-by: Geliang Tang <geliang.tang@suse.com>
      Reviewed-by: Mat Martineau <martineau@kernel.org>
      Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
      Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-3-7b9cd6a7b7f4@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      8df220b2
    • Paolo Abeni's avatar
      mptcp: fix possible NULL pointer dereference on close · d109a776
      Paolo Abeni authored
      After the blamed commit below, the MPTCP release callback can
      dereference the first subflow pointer via __mptcp_set_connected()
      and send buffer auto-tuning. Such pointer is always expected to be
      valid, except at socket destruction time, when the first subflow is
      deleted and the pointer zeroed.
      
      If the connect event is handled by the release callback while the
      msk socket is finally released, MPTCP hits the following splat:
      
        general protection fault, probably for non-canonical address 0xdffffc00000000f2: 0000 [#1] PREEMPT SMP KASAN
        KASAN: null-ptr-deref in range [0x0000000000000790-0x0000000000000797]
        CPU: 1 PID: 26719 Comm: syz-executor.2 Not tainted 6.6.0-syzkaller-10102-gff269e2cd5ad #0
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
        RIP: 0010:mptcp_subflow_ctx net/mptcp/protocol.h:542 [inline]
        RIP: 0010:__mptcp_propagate_sndbuf net/mptcp/protocol.h:813 [inline]
        RIP: 0010:__mptcp_set_connected+0x57/0x3e0 net/mptcp/subflow.c:424
        RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff8a62323c
        RDX: 00000000000000f2 RSI: ffffffff8a630116 RDI: 0000000000000790
        RBP: ffff88803334b100 R08: 0000000000000001 R09: 0000000000000000
        R10: 0000000000000001 R11: 0000000000000034 R12: ffff88803334b198
        R13: ffff888054f0b018 R14: 0000000000000000 R15: ffff88803334b100
        FS:  0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007fbcb4f75198 CR3: 000000006afb5000 CR4: 00000000003506f0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         <TASK>
         mptcp_release_cb+0xa2c/0xc40 net/mptcp/protocol.c:3405
         release_sock+0xba/0x1f0 net/core/sock.c:3537
         mptcp_close+0x32/0xf0 net/mptcp/protocol.c:3084
         inet_release+0x132/0x270 net/ipv4/af_inet.c:433
         inet6_release+0x4f/0x70 net/ipv6/af_inet6.c:485
         __sock_release+0xae/0x260 net/socket.c:659
         sock_close+0x1c/0x20 net/socket.c:1419
         __fput+0x270/0xbb0 fs/file_table.c:394
         task_work_run+0x14d/0x240 kernel/task_work.c:180
         exit_task_work include/linux/task_work.h:38 [inline]
         do_exit+0xa92/0x2a20 kernel/exit.c:876
         do_group_exit+0xd4/0x2a0 kernel/exit.c:1026
         get_signal+0x23ba/0x2790 kernel/signal.c:2900
         arch_do_signal_or_restart+0x90/0x7f0 arch/x86/kernel/signal.c:309
         exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
         exit_to_user_mode_prepare+0x11f/0x240 kernel/entry/common.c:204
         __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
         syscall_exit_to_user_mode+0x1d/0x60 kernel/entry/common.c:296
         do_syscall_64+0x4b/0x110 arch/x86/entry/common.c:88
         entry_SYSCALL_64_after_hwframe+0x63/0x6b
        RIP: 0033:0x7fb515e7cae9
        Code: Unable to access opcode bytes at 0x7fb515e7cabf.
        RSP: 002b:00007fb516c560c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
        RAX: 000000000000003c RBX: 00007fb515f9c120 RCX: 00007fb515e7cae9
        RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000006
        RBP: 00007fb515ec847a R08: 0000000000000000 R09: 0000000000000000
        R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
        R13: 000000000000006e R14: 00007fb515f9c120 R15: 00007ffc631eb968
         </TASK>
      
      To avoid sprinkling unneeded conditionals, address the issue by explicitly
      checking msk->first only in the critical place.
      
      Fixes: 8005184f ("mptcp: refactor sndbuf auto-tuning")
      Cc: stable@vger.kernel.org
      Reported-by: <syzbot+9dfbaedb6e6baca57a32@syzkaller.appspotmail.com>
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/454
      Reported-by: Eric Dumazet <edumazet@google.com>
      Closes: https://lore.kernel.org/netdev/CANn89iLZUA6S2a=K8GObnS62KK6Jt4B7PsAs7meMFooM8xaTgw@mail.gmail.com/
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Mat Martineau <martineau@kernel.org>
      Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
      Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-2-7b9cd6a7b7f4@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      d109a776
    • Paolo Abeni's avatar
      mptcp: deal with large GSO size · 9fce92f0
      Paolo Abeni authored
      
      
      After the blamed commit below, the TCP sockets (and the MPTCP subflows)
      can build egress packets larger than 64K. That exceeds the maximum DSS
      data size, so the length is misrepresented on the wire and the stream is
      corrupted, as later observed on the receiver:
      
        WARNING: CPU: 0 PID: 9696 at net/mptcp/protocol.c:705 __mptcp_move_skbs_from_subflow+0x2604/0x26e0
        CPU: 0 PID: 9696 Comm: syz-executor.7 Not tainted 6.6.0-rc5-gcd8bdf563d46 #45
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
        netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
        RIP: 0010:__mptcp_move_skbs_from_subflow+0x2604/0x26e0 net/mptcp/protocol.c:705
        RSP: 0018:ffffc90000006e80 EFLAGS: 00010246
        RAX: ffffffff83e9f674 RBX: ffff88802f45d870 RCX: ffff888102ad0000
        netlink: 8 bytes leftover after parsing attributes in process `syz-executor.4'.
        RDX: 0000000080000303 RSI: 0000000000013908 RDI: 0000000000003908
        RBP: ffffc90000007110 R08: ffffffff83e9e078 R09: 1ffff1100e548c8a
        R10: dffffc0000000000 R11: ffffed100e548c8b R12: 0000000000013908
        R13: dffffc0000000000 R14: 0000000000003908 R15: 000000000031cf29
        FS:  00007f239c47e700(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007f239c45cd78 CR3: 000000006a66c006 CR4: 0000000000770ef0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
        PKRU: 55555554
        Call Trace:
         <IRQ>
         mptcp_data_ready+0x263/0xac0 net/mptcp/protocol.c:819
         subflow_data_ready+0x268/0x6d0 net/mptcp/subflow.c:1409
         tcp_data_queue+0x21a1/0x7a60 net/ipv4/tcp_input.c:5151
         tcp_rcv_established+0x950/0x1d90 net/ipv4/tcp_input.c:6098
         tcp_v6_do_rcv+0x554/0x12f0 net/ipv6/tcp_ipv6.c:1483
         tcp_v6_rcv+0x2e26/0x3810 net/ipv6/tcp_ipv6.c:1749
         ip6_protocol_deliver_rcu+0xd6b/0x1ae0 net/ipv6/ip6_input.c:438
         ip6_input+0x1c5/0x470 net/ipv6/ip6_input.c:483
         ipv6_rcv+0xef/0x2c0 include/linux/netfilter.h:304
         __netif_receive_skb+0x1ea/0x6a0 net/core/dev.c:5532
         process_backlog+0x353/0x660 net/core/dev.c:5974
         __napi_poll+0xc6/0x5a0 net/core/dev.c:6536
         net_rx_action+0x6a0/0xfd0 net/core/dev.c:6603
         __do_softirq+0x184/0x524 kernel/softirq.c:553
         do_softirq+0xdd/0x130 kernel/softirq.c:454
      
      Address the issue by explicitly bounding the maximum GSO size to what MPTCP
      actually allows.
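Why an oversized packet corrupts the stream, and what bounding the size does, can be sketched as follows. The sketch assumes only that the DSS data-level length is carried in a 16-bit field; `DSS_MAX_DATA_LEN`, `clamp_gso_size()` and `dss_len_on_wire()` are hypothetical names, and the real limit chosen by the fix may be lower to leave room for headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the DSS data-level length is a 16-bit field, so this is
 * the largest payload length that can be represented on the wire. */
#define DSS_MAX_DATA_LEN UINT16_MAX

/* Bound the device GSO size to what the mapping can describe. */
static uint32_t clamp_gso_size(uint32_t dev_gso_max_size)
{
    return dev_gso_max_size < DSS_MAX_DATA_LEN ? dev_gso_max_size
                                               : DSS_MAX_DATA_LEN;
}

/* What a 16-bit length field carries when the real length is larger:
 * the upper bits are silently lost. */
static uint16_t dss_len_on_wire(uint32_t len)
{
    return (uint16_t)len;
}
```

For example, a length of 0x13908 (a value picked for illustration) would be transmitted as 0x3908, so sender and receiver disagree about how much data the mapping covers.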
      
      Reported-by: Christoph Paasch <cpaasch@apple.com>
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/450
      Fixes: 7c4e983c ("net: allow gso_max_size to exceed 65536")
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Mat Martineau <martineau@kernel.org>
      Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
      Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-1-7b9cd6a7b7f4@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      9fce92f0
    • Ziwei Xiao's avatar
      gve: Fixes for napi_poll when budget is 0 · 278a370c
      Ziwei Xiao authored
      Netpoll will explicitly pass the polling call with a budget of 0 to
      indicate it's clearing the Tx path only. gve_rx_poll and gve_xdp_poll
      mistakenly took the 0 budget as an indication to do all the work. Add
      a check to avoid calling the rx path and xdp path when budget is 0.
      Also avoid calling napi_complete_done when budget is 0 for netpoll.
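The budget handling described above can be sketched as a small poll routine. This is a simplified model, not the gve driver code: `struct ring`, `sketch_poll()` and the `completed` flag (standing in for a call to napi_complete_done()) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct ring {
    int tx_pending;   /* completed Tx descriptors waiting for cleanup */
    int rx_pending;   /* received packets waiting to be processed */
    bool completed;   /* stands in for napi_complete_done() having run */
};

static int sketch_poll(struct ring *r, int budget)
{
    int work_done = 0;

    /* Tx cleanup always runs, including for a netpoll call. */
    r->tx_pending = 0;

    /* Budget 0 means "Tx only": touch no Rx work and do not complete. */
    if (budget == 0)
        return 0;

    while (r->rx_pending > 0 && work_done < budget) {
        r->rx_pending--;
        work_done++;
    }

    /* Only signal completion when the budget was not exhausted. */
    if (work_done < budget)
        r->completed = true;

    return work_done;
}
```

The early return is the important part: treating a zero budget as "no limit" would process Rx packets from a context where that is not allowed.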
      
      Fixes: f5cedc84 ("gve: Add transmit and receive support")
      Signed-off-by: Ziwei Xiao <ziweixiao@google.com>
      Link: https://lore.kernel.org/r/20231114004144.2022268-1-ziweixiao@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      278a370c
    • Jakub Kicinski's avatar
      Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 67af0bdc
      Jakub Kicinski authored
      
      
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2023-11-13 (ice)
      
      This series contains updates to ice driver only.
      
      Arkadiusz ensures the device is initialized with valid lock status
      value. He also removes range checking of dpll priority to allow firmware
      to process the request; supported values are firmware dependent.
      Finally, he removes setting of can change capability for pins that
      cannot be changed.
      
      Dan restores ability to load a package which doesn't contain a signature
      segment.
      
      * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        ice: fix DDP package download for packages without signature segment
        ice: dpll: fix output pin capabilities
        ice: dpll: fix check for dpll input priority range
        ice: dpll: fix initial lock status of dpll
      ====================
      
      Link: https://lore.kernel.org/r/20231113230551.548489-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      67af0bdc
    • Jakub Kicinski's avatar
      Merge branch 'pds_core-fix-irq-index-bug-and-compiler-warnings' · 9d350b2b
      Jakub Kicinski authored
      
      
      Shannon Nelson says:
      
      ====================
      pds_core: fix irq index bug and compiler warnings
      
      The first patch fixes a bug in our interrupt masking where we used the
      wrong index.  The second patch addresses a couple of kernel test robot
      string truncation warnings.
      ====================
      
      Link: https://lore.kernel.org/r/20231113183257.71110-1-shannon.nelson@amd.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      9d350b2b
    • Shannon Nelson's avatar
      pds_core: fix up some format-truncation complaints · 7c02f6ae
      Shannon Nelson authored
      
      
      Our friendly kernel test robot pointed out a couple of potential
      string truncation issues.  We were not worried about any of them,
      but they can be fixed relatively easily to quiet the complaints.
      
      Reported-by: kernel test robot <lkp@intel.com>
      Closes: https://lore.kernel.org/oe-kbuild-all/202310211736.66syyDpp-lkp@intel.com/
      Fixes: 45d76f49 ("pds_core: set up device and adminq")
      Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
      Link: https://lore.kernel.org/r/20231113183257.71110-3-shannon.nelson@amd.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      7c02f6ae
    • Shannon Nelson's avatar
      pds_core: use correct index to mask irq · 09d4c14c
      Shannon Nelson authored
      
      
      Use the qcq's interrupt index, not the irq number, to mask
      the interrupt.  Since the irq number can be out of range from
      the number of possible interrupts, we can end up accessing
      and potentially scribbling on out-of-range and/or unmapped
      memory, making the kernel angry.
      
          [ 3116.039364] BUG: unable to handle page fault for address: ffffbeea1c3edf84
          [ 3116.047059] #PF: supervisor write access in kernel mode
          [ 3116.052895] #PF: error_code(0x0002) - not-present page
          [ 3116.058636] PGD 100000067 P4D 100000067 PUD 1001f2067 PMD 10f82e067 PTE 0
          [ 3116.066221] Oops: 0002 [#1] SMP NOPTI
          [ 3116.092948] RIP: 0010:iowrite32+0x9/0x76
          [ 3116.190452] Call Trace:
          [ 3116.193185]  <IRQ>
          [ 3116.195430]  ? show_trace_log_lvl+0x1d6/0x2f9
          [ 3116.200298]  ? show_trace_log_lvl+0x1d6/0x2f9
          [ 3116.205166]  ? pdsc_adminq_isr+0x43/0x55 [pds_core]
          [ 3116.210618]  ? __die_body.cold+0x8/0xa
          [ 3116.214806]  ? page_fault_oops+0x16d/0x1ac
          [ 3116.219382]  ? exc_page_fault+0xbe/0x13b
          [ 3116.223764]  ? asm_exc_page_fault+0x22/0x27
          [ 3116.228440]  ? iowrite32+0x9/0x76
          [ 3116.232143]  pdsc_adminq_isr+0x43/0x55 [pds_core]
          [ 3116.237627]  __handle_irq_event_percpu+0x3a/0x184
          [ 3116.243088]  handle_irq_event+0x57/0xb0
          [ 3116.247575]  handle_edge_irq+0x87/0x225
          [ 3116.252062]  __common_interrupt+0x3e/0xbc
          [ 3116.256740]  common_interrupt+0x7b/0x98
          [ 3116.261216]  </IRQ>
          [ 3116.263745]  <TASK>
          [ 3116.266268]  asm_common_interrupt+0x22/0x27
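The difference between the interrupt index and the irq number can be sketched with a small table. The names are hypothetical (`struct intr_ctrl`, `struct qcq_sketch`, `mask_intr()`, `NINTRS`) and a plain array stands in for the device's memory-mapped interrupt control registers.

```c
#include <assert.h>
#include <stdint.h>

#define NINTRS 4   /* hypothetical number of interrupt control slots */

struct intr_ctrl {
    uint32_t mask;
};

struct qcq_sketch {
    int intx;   /* slot in the device interrupt table */
    int irq;    /* OS-assigned irq number, unrelated to the table size */
};

/* Mask the queue's interrupt using its table index. Indexing with
 * q->irq instead could land far outside the table. */
static int mask_intr(struct intr_ctrl *tbl, const struct qcq_sketch *q)
{
    if (q->intx < 0 || q->intx >= NINTRS)
        return -1;
    tbl[q->intx].mask = 1;
    return 0;
}
```

With `intx = 2` and `irq = 141`, indexing by the irq number would write well past a four-entry table, which is the kind of out-of-range write behind the page fault shown above.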
      
      Reported-by: Joao Martins <joao.m.martins@oracle.com>
      Fixes: 01ba61b5 ("pds_core: Add adminq processing and commands")
      Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
      Link: https://lore.kernel.org/r/20231113183257.71110-2-shannon.nelson@amd.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      09d4c14c
    • Alex Pakhunov's avatar
      tg3: Increment tx_dropped in tg3_tso_bug() · 17dd5efe
      Alex Pakhunov authored
      
      
      tg3_tso_bug() drops a packet if it cannot be segmented for any reason.
      The number of discarded frames should be incremented accordingly.
      
      Signed-off-by: Alex Pakhunov <alexey.pakhunov@spacex.com>
      Signed-off-by: Vincent Wong <vincent.wong2@spacex.com>
      Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
      Link: https://lore.kernel.org/r/20231113182350.37472-2-alexey.pakhunov@spacex.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      17dd5efe
    • Alex Pakhunov's avatar
      tg3: Move the [rt]x_dropped counters to tg3_napi · 907d1bdb
      Alex Pakhunov authored
      
      
      This change moves [rt]x_dropped counters to tg3_napi so that they can be
      updated by a single writer, race-free.
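The single-writer idea can be sketched as per-queue counters that are only summed when statistics are read. This is an illustration with hypothetical names (`struct queue_ctx`, `sum_dropped()`), not the tg3 data structures.

```c
#include <assert.h>
#include <stdint.h>

/* One context per queue; only that queue's poll routine writes to it,
 * so increments need no lock and cannot race with another writer. */
struct queue_ctx {
    uint64_t rx_dropped;
    uint64_t tx_dropped;
};

/* The reader aggregates the per-queue values on demand. */
static void sum_dropped(const struct queue_ctx *q, int nqueues,
                        uint64_t *rx, uint64_t *tx)
{
    *rx = 0;
    *tx = 0;
    for (int i = 0; i < nqueues; i++) {
        *rx += q[i].rx_dropped;
        *tx += q[i].tx_dropped;
    }
}
```

A single shared counter incremented from several queues would need atomics or a lock; splitting it per queue moves that cost to the rarely used read path.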
      
      Signed-off-by: Alex Pakhunov <alexey.pakhunov@spacex.com>
      Signed-off-by: Vincent Wong <vincent.wong2@spacex.com>
      Reviewed-by: Michael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/r/20231113182350.37472-1-alexey.pakhunov@spacex.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      907d1bdb
    • Baruch Siach's avatar
      net: stmmac: avoid rx queue overrun · b6cb4541
      Baruch Siach authored
      
      
      dma_rx_size can be set as low as 64. Rx budget might be higher than
      that. Make sure to not overrun allocated rx buffers when budget is
      larger.
      
      Leave one descriptor unused to avoid wrap around of 'dirty_rx' vs
      'cur_rx'.
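The bound described above amounts to capping the per-poll limit at the ring size minus one. A minimal sketch, with the hypothetical helper name `rx_limit()`:

```c
#include <assert.h>

/* Cap the number of descriptors handled in one poll. One descriptor is
 * left unused so that 'dirty_rx' can never wrap around onto 'cur_rx'. */
static unsigned int rx_limit(unsigned int budget, unsigned int ring_size)
{
    assert(ring_size > 0);

    unsigned int cap = ring_size - 1;

    return budget < cap ? budget : cap;
}
```

With the minimum ring size of 64 mentioned above, any budget of 64 or more is reduced to 63, while smaller budgets pass through unchanged.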
      
      Signed-off-by: Baruch Siach <baruch@tkos.co.il>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Fixes: 47dd7a54 ("net: add support for STMicroelectronics Ethernet controllers.")
      Link: https://lore.kernel.org/r/d95413e44c97d4692e72cec13a75f894abeb6998.1699897370.git.baruch@tkos.co.il
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b6cb4541
    • Baruch Siach's avatar
      net: stmmac: fix rx budget limit check · fa02de9e
      Baruch Siach authored
      The while loop condition verifies 'count < limit'. Neither value changes
      before the 'count >= limit' check. As is, this check is dead code. But
      code inspection reveals a code path that modifies 'count', then jumps to
      'drain_data' and back to 'read_again'. So there is a need to verify
      the sanity of the count value after 'read_again'.
      
      Move 'read_again' up to fix the count limit check.
      
      Fixes: ec222003 ("net: stmmac: Prepare to add Split Header support")
      Signed-off-by: Baruch Siach <baruch@tkos.co.il>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Link: https://lore.kernel.org/r/d9486296c3b6b12ab3a0515fcd47d56447a07bfc.1699897370.git.baruch@tkos.co.il
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      fa02de9e
  3. Nov 14, 2023
    • Eric Dumazet's avatar
      af_unix: fix use-after-free in unix_stream_read_actor() · 4b7b4926
      Eric Dumazet authored
      syzbot reported the following crash [1]
      
      After releasing unix socket lock, u->oob_skb can be changed
      by another thread. We must temporarily increase skb refcount
      to make sure this other thread will not free the skb under us.
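The "take a reference before dropping the lock" pattern can be sketched in a single-threaded model. All names are hypothetical (`struct buf`, `buf_hold()`, `buf_release()`, `read_len()`), the `freed` flag stands in for the memory actually being released, and a callback plays the role of the other thread running after the lock is dropped.

```c
#include <assert.h>
#include <stdbool.h>

struct buf {
    int refcnt;
    int len;
    bool freed;   /* stands in for the memory having been released */
};

static void buf_hold(struct buf *b)
{
    assert(!b->freed);
    b->refcnt++;
}

static void buf_release(struct buf *b)
{
    assert(b->refcnt > 0);
    if (--b->refcnt == 0)
        b->freed = true;
}

/* Reader: pin the buffer while the lock is still held, use it after the
 * lock is dropped, then unpin. Returns -1 if the buffer was already gone. */
static int read_len(struct buf *b, void (*other_thread)(struct buf *))
{
    int len;

    buf_hold(b);          /* taken under the lock */
    other_thread(b);      /* lock dropped: someone else may release */
    len = b->freed ? -1 : b->len;
    buf_release(b);       /* drop the temporary reference */
    return len;
}

/* Models the other thread dropping the owner's reference. */
static void drop_owner_ref(struct buf *b)
{
    buf_release(b);
}
```

Because the reader's temporary reference keeps the count above zero, the concurrent release cannot free the buffer until the reader is done with it.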
      
      [1]
      
      BUG: KASAN: slab-use-after-free in unix_stream_read_actor+0xa7/0xc0 net/unix/af_unix.c:2866
      Read of size 4 at addr ffff88801f3b9cc4 by task syz-executor107/5297
      
      CPU: 1 PID: 5297 Comm: syz-executor107 Not tainted 6.6.0-syzkaller-15910-gb8e3a87a627b #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
      Call Trace:
      <TASK>
      __dump_stack lib/dump_stack.c:88 [inline]
      dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
      print_address_description mm/kasan/report.c:364 [inline]
      print_report+0xc4/0x620 mm/kasan/report.c:475
      kasan_report+0xda/0x110 mm/kasan/report.c:588
      unix_stream_read_actor+0xa7/0xc0 net/unix/af_unix.c:2866
      unix_stream_recv_urg net/unix/af_unix.c:2587 [inline]
      unix_stream_read_generic+0x19a5/0x2480 net/unix/af_unix.c:2666
      unix_stream_recvmsg+0x189/0x1b0 net/unix/af_unix.c:2903
      sock_recvmsg_nosec net/socket.c:1044 [inline]
      sock_recvmsg+0xe2/0x170 net/socket.c:1066
      ____sys_recvmsg+0x21f/0x5c0 net/socket.c:2803
      ___sys_recvmsg+0x115/0x1a0 net/socket.c:2845
      __sys_recvmsg+0x114/0x1e0 net/socket.c:2875
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      RIP: 0033:0x7fc67492c559
      Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007fc6748ab228 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
      RAX: ffffffffffffffda RBX: 000000000000001c RCX: 00007fc67492c559
      RDX: 0000000040010083 RSI: 0000000020000140 RDI: 0000000000000004
      RBP: 00007fc6749b6348 R08: 00007fc6748ab6c0 R09: 00007fc6748ab6c0
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc6749b6340
      R13: 00007fc6749b634c R14: 00007ffe9fac52a0 R15: 00007ffe9fac5388
      </TASK>
      
      Allocated by task 5295:
      kasan_save_stack+0x33/0x50 mm/kasan/common.c:45
      kasan_set_track+0x25/0x30 mm/kasan/common.c:52
      __kasan_slab_alloc+0x81/0x90 mm/kasan/common.c:328
      kasan_slab_alloc include/linux/kasan.h:188 [inline]
      slab_post_alloc_hook mm/slab.h:763 [inline]
      slab_alloc_node mm/slub.c:3478 [inline]
      kmem_cache_alloc_node+0x180/0x3c0 mm/slub.c:3523
      __alloc_skb+0x287/0x330 net/core/skbuff.c:641
      alloc_skb include/linux/skbuff.h:1286 [inline]
      alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331
      sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780
      sock_alloc_send_skb include/net/sock.h:1884 [inline]
      queue_oob net/unix/af_unix.c:2147 [inline]
      unix_stream_sendmsg+0xb5f/0x10a0 net/unix/af_unix.c:2301
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0xd5/0x180 net/socket.c:745
      ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584
      ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638
      __sys_sendmsg+0x117/0x1e0 net/socket.c:2667
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Freed by task 5295:
      kasan_save_stack+0x33/0x50 mm/kasan/common.c:45
      kasan_set_track+0x25/0x30 mm/kasan/common.c:52
      kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:522
      ____kasan_slab_free mm/kasan/common.c:236 [inline]
      ____kasan_slab_free+0x15b/0x1b0 mm/kasan/common.c:200
      kasan_slab_free include/linux/kasan.h:164 [inline]
      slab_free_hook mm/slub.c:1800 [inline]
      slab_free_freelist_hook+0x114/0x1e0 mm/slub.c:1826
      slab_free mm/slub.c:3809 [inline]
      kmem_cache_free+0xf8/0x340 mm/slub.c:3831
      kfree_skbmem+0xef/0x1b0 net/core/skbuff.c:1015
      __kfree_skb net/core/skbuff.c:1073 [inline]
      consume_skb net/core/skbuff.c:1288 [inline]
      consume_skb+0xdf/0x170 net/core/skbuff.c:1282
      queue_oob net/unix/af_unix.c:2178 [inline]
      unix_stream_sendmsg+0xd49/0x10a0 net/unix/af_unix.c:2301
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0xd5/0x180 net/socket.c:745
      ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584
      ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638
      __sys_sendmsg+0x117/0x1e0 net/socket.c:2667
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      The buggy address belongs to the object at ffff88801f3b9c80
      which belongs to the cache skbuff_head_cache of size 240
      The buggy address is located 68 bytes inside of
      freed 240-byte region [ffff88801f3b9c80, ffff88801f3b9d70)
      
      The buggy address belongs to the physical page:
      page:ffffea00007cee40 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1f3b9
      flags: 0xfff00000000800(slab|node=0|zone=1|lastcpupid=0x7ff)
      page_type: 0xffffffff()
      raw: 00fff00000000800 ffff888142a60640 dead000000000122 0000000000000000
      raw: 0000000000000000 00000000000c000c 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      page_owner tracks the page as allocated
      page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY), pid 5299, tgid 5283 (syz-executor107), ts 103803840339, free_ts 103600093431
      set_page_owner include/linux/page_owner.h:31 [inline]
      post_alloc_hook+0x2cf/0x340 mm/page_alloc.c:1537
      prep_new_page mm/page_alloc.c:1544 [inline]
      get_page_from_freelist+0xa25/0x36c0 mm/page_alloc.c:3312
      __alloc_pages+0x1d0/0x4a0 mm/page_alloc.c:4568
      alloc_pages_mpol+0x258/0x5f0 mm/mempolicy.c:2133
      alloc_slab_page mm/slub.c:1870 [inline]
      allocate_slab+0x251/0x380 mm/slub.c:2017
      new_slab mm/slub.c:2070 [inline]
      ___slab_alloc+0x8c7/0x1580 mm/slub.c:3223
      __slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3322
      __slab_alloc_node mm/slub.c:3375 [inline]
      slab_alloc_node mm/slub.c:3468 [inline]
      kmem_cache_alloc_node+0x132/0x3c0 mm/slub.c:3523
      __alloc_skb+0x287/0x330 net/core/skbuff.c:641
      alloc_skb include/linux/skbuff.h:1286 [inline]
      alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331
      sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780
      sock_alloc_send_skb include/net/sock.h:1884 [inline]
      queue_oob net/unix/af_unix.c:2147 [inline]
      unix_stream_sendmsg+0xb5f/0x10a0 net/unix/af_unix.c:2301
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0xd5/0x180 net/socket.c:745
      ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584
      ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638
      __sys_sendmsg+0x117/0x1e0 net/socket.c:2667
      page last free stack trace:
      reset_page_owner include/linux/page_owner.h:24 [inline]
      free_pages_prepare mm/page_alloc.c:1137 [inline]
      free_unref_page_prepare+0x4f8/0xa90 mm/page_alloc.c:2347
      free_unref_page+0x33/0x3b0 mm/page_alloc.c:2487
      __unfreeze_partials+0x21d/0x240 mm/slub.c:2655
      qlink_free mm/kasan/quarantine.c:168 [inline]
      qlist_free_all+0x6a/0x170 mm/kasan/quarantine.c:187
      kasan_quarantine_reduce+0x18e/0x1d0 mm/kasan/quarantine.c:294
      __kasan_slab_alloc+0x65/0x90 mm/kasan/common.c:305
      kasan_slab_alloc include/linux/kasan.h:188 [inline]
      slab_post_alloc_hook mm/slab.h:763 [inline]
      slab_alloc_node mm/slub.c:3478 [inline]
      slab_alloc mm/slub.c:3486 [inline]
      __kmem_cache_alloc_lru mm/slub.c:3493 [inline]
      kmem_cache_alloc+0x15d/0x380 mm/slub.c:3502
      vm_area_dup+0x21/0x2f0 kernel/fork.c:500
      __split_vma+0x17d/0x1070 mm/mmap.c:2365
      split_vma mm/mmap.c:2437 [inline]
      vma_modify+0x25d/0x450 mm/mmap.c:2472
      vma_modify_flags include/linux/mm.h:3271 [inline]
      mprotect_fixup+0x228/0xc80 mm/mprotect.c:635
      do_mprotect_pkey+0x852/0xd60 mm/mprotect.c:809
      __do_sys_mprotect mm/mprotect.c:830 [inline]
      __se_sys_mprotect mm/mprotect.c:827 [inline]
      __x64_sys_mprotect+0x78/0xb0 mm/mprotect.c:827
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Memory state around the buggy address:
      ffff88801f3b9b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ffff88801f3b9c00: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
      >ffff88801f3b9c80: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ^
      ffff88801f3b9d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc
      ffff88801f3b9d80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
      
      Fixes: 876c14ad ("af_unix: fix holding spinlock in oob handling")
      Reported-and-tested-by: <syzbot+7a2d546fa43e49315ed3@syzkaller.appspotmail.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Rao Shoaib <rao.shoaib@oracle.com>
      Reviewed-by: Rao shoaib <rao.shoaib@oracle.com>
      Link: https://lore.kernel.org/r/20231113134938.168151-1-edumazet@google.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge branch 'r8169-fix-dash-devices-network-lost-issue' · 48c205c6
      Jakub Kicinski authored
      
      
      ChunHao Lin says:
      
      ====================
      r8169: fix DASH devices network lost issue
      
      This series fixes a network loss issue on systems that support
      DASH. It has been tested on rtl8168ep and rtl8168fp.
      ====================
      
      Link: https://lore.kernel.org/r/20231109173400.4573-1-hau@realtek.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • r8169: fix network lost after resume on DASH systems · 868c3b95
      ChunHao Lin authored
      Devices that support DASH may be reset or powered off during suspend,
      so the driver needs to handle DASH during system suspend and resume.
      Otherwise the DASH firmware will influence device behavior and cause
      network loss.
      
      Fixes: b646d900 ("r8169: magic.")
      Cc: stable@vger.kernel.org
      Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: ChunHao Lin <hau@realtek.com>
      Link: https://lore.kernel.org/r/20231109173400.4573-3-hau@realtek.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • r8169: add handling DASH when DASH is disabled · 0ab0c45d
      ChunHao Lin authored
      For devices that support DASH, a default firmware that influences
      device behavior may still exist even when DASH is disabled.
      The driver therefore needs to handle DASH for devices that support
      it, regardless of the DASH status.
      
      This patch also prepares for "fix network lost after resume on DASH
      systems".
      
      Fixes: ee7a1beb ("r8169:call "rtl8168_driver_start" "rtl8168_driver_stop" only when hardware dash function is enabled")
      Cc: stable@vger.kernel.org
      Signed-off-by: ChunHao Lin <hau@realtek.com>
      Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
      Link: https://lore.kernel.org/r/20231109173400.4573-2-hau@realtek.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'fix-large-frames-in-the-gemini-ethernet-driver' · 334e90b8
      Jakub Kicinski authored
      
      
      Linus Walleij says:
      
      ====================
      Fix large frames in the Gemini ethernet driver
      
      This is the result of a bug hunt for a problem with the
      RTL8366RB DSA switch leading me wrong all over the place.
      
      I am indebted to Vladimir Oltean who as usual pointed
      out where the real problem was, many thanks!
      
      Trying to actually use big ("jumbo") frames on this
      hardware uncovered the real bugs. Then I tested it on
      the DSA switch and it indeed fixes the issue.
      
      To make sure it also works fine with big frames on
      non-DSA devices, I also copied a large video file over
      scp to a device with maximum frame size. The data
      was transported in large TCP packets ending up in
      0x7ff sized frames using software checksumming at
      ~2.0 MB/s.
      
      When I lowered the MTU to the standard 1500 bytes so
      that hardware checksumming is used, the scp transfer
      of the same file was slightly slower, ~1.8-1.9 MB/s.
      
      Despite this not being the best test it shows that
      we can now stress the hardware with large frames
      and that software checksum works fine.
      
      v3: https://lore.kernel.org/r/20231107-gemini-largeframe-fix-v3-0-e3803c080b75@linaro.org
      v2: https://lore.kernel.org/r/20231105-gemini-largeframe-fix-v2-0-cd3a5aa6c496@linaro.org
      v1: https://lore.kernel.org/r/20231104-gemini-largeframe-fix-v1-0-9c5513f22f33@linaro.org
      ====================
      
      Link: https://lore.kernel.org/r/20231109-gemini-largeframe-fix-v4-0-6e611528db08@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: ethernet: cortina: Fix MTU max setting · dc6c0bfb
      Linus Walleij authored
      The RX max frame size is over 10000 for the Gemini ethernet,
      but the TX max frame size is actually just 2047 (0x7ff after
      checking the datasheet). Reflect this in what we offer to Linux
      by capping the MTU at the TX max frame minus ethernet headers.
      
      We delete the code disabling the hardware checksum for large
      MTUs as netdev->mtu can no longer be larger than
      netdev->max_mtu meaning the if()-clause in gmac_fix_features()
      is never true.
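The cap described above is simple arithmetic: the largest MTU is the TX frame limit minus whatever link-layer header bytes share the frame. The sketch below shows the calculation only. The commit text does not say which headers are subtracted, so the 14-byte Ethernet header and the optional 4-byte VLAN tag are assumptions, and `max_mtu()` is a hypothetical helper rather than a driver function.

```c
#include <assert.h>

#define TX_MAX_FRAME  0x7ff  /* 2047 bytes, the TX limit quoted in the commit text */
#define ETH_HDR_LEN   14     /* destination MAC + source MAC + EtherType */
#define VLAN_TAG_LEN  4      /* assumption: room for one 802.1Q tag */

/* Largest payload (MTU) that still fits in a frame of tx_max_frame bytes. */
static int max_mtu(int tx_max_frame, int header_len)
{
    return tx_max_frame - header_len;
}
```

Under these assumptions `max_mtu(TX_MAX_FRAME, ETH_HDR_LEN)` gives 2033, and budgeting for a VLAN tag as well gives 2029.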
      
      Fixes: 4d5ae32f ("net: ethernet: Add a driver for Gemini gigabit ethernet")
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20231109-gemini-largeframe-fix-v4-3-6e611528db08@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: ethernet: cortina: Handle large frames · d4d0c5b4
      Linus Walleij authored
      The Gemini ethernet controller provides hardware checksumming
      for frames up to 1514 bytes including ethernet headers but not
      FCS.
      
      If we start sending bigger frames (after first bumping up the MTU
      on both interfaces sending and receiving the frames), truncated
      packets start to appear on the target such as in this tcpdump
      resulting from ping -s 1474:
      
      23:34:17.241983 14:d6:4d:a8:3c:4f (oui Unknown) > bc:ae:c5:6b:a8:3d (oui Unknown),
      ethertype IPv4 (0x0800), length 1514: truncated-ip - 2 bytes missing!
      (tos 0x0, ttl 64, id 32653, offset 0, flags [DF], proto ICMP (1), length 1502)
      OpenWrt.lan > Fecusia: ICMP echo request, id 1672, seq 50, length 1482
      
      If we bypass the hardware checksumming and provide a software
      fallback, everything starts working fine up to the max TX MTU
      of 2047 bytes, for example ping -s2000 192.168.1.2:
      
      00:44:29.587598 bc:ae:c5:6b:a8:3d (oui Unknown) > 14:d6:4d:a8:3c:4f (oui Unknown),
      ethertype IPv4 (0x0800), length 2042:
      (tos 0x0, ttl 64, id 51828, offset 0, flags [none], proto ICMP (1), length 2028)
      Fecusia > OpenWrt.lan: ICMP echo reply, id 1683, seq 4, length 2008
      
      Neither the bit that bypasses the hardware checksum nor any of the
      "TSS" bits is documented in the hardware reference manual.
      The entire hardware checksum unit appears undocumented. The
      conclusion that we need to use the "bypass" bit was reached by
      trial-and-error.
      
      Since no hardware checksum will happen, we slot in a software
      checksum fallback.
      
      Check for the condition where we need to compute checksum on the
      skb with either hardware or software using == CHECKSUM_PARTIAL instead
      of != CHECKSUM_NONE which is an incomplete check according to
      <linux/skbuff.h>.
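The decision described above can be sketched as a small pure function: only `CHECKSUM_PARTIAL` means a checksum still has to be filled in, and frames above the 1514-byte hardware limit fall back to software. The enum below re-declares the names from `<linux/skbuff.h>` so the sketch compiles in user space. `pick_csum()`, `enum csum_action` and `HW_CSUM_MAX_FRAME` are hypothetical names and not taken from the driver.

```c
#include <assert.h>

/* Names mirror <linux/skbuff.h>; redefined here for a user-space sketch. */
enum csum_state {
    CHECKSUM_NONE,
    CHECKSUM_UNNECESSARY,
    CHECKSUM_COMPLETE,
    CHECKSUM_PARTIAL
};

/* Hypothetical outcome of the transmit-path decision. */
enum csum_action { CSUM_SKIP, CSUM_HARDWARE, CSUM_SOFTWARE };

#define HW_CSUM_MAX_FRAME 1514  /* from the commit text: headers included, FCS excluded */

static enum csum_action pick_csum(enum csum_state ip_summed, unsigned int frame_len)
{
    /* Test for == CHECKSUM_PARTIAL rather than != CHECKSUM_NONE:
     * the other states do not ask the driver to compute anything. */
    if (ip_summed != CHECKSUM_PARTIAL)
        return CSUM_SKIP;
    /* Too large for the hardware unit: bypass it and checksum in software. */
    if (frame_len > HW_CSUM_MAX_FRAME)
        return CSUM_SOFTWARE;
    return CSUM_HARDWARE;
}
```

A 1518-byte frame (1500 bytes of payload plus headers and a 4-byte DSA tag) with `CHECKSUM_PARTIAL` would take the software path in this sketch, which matches the DIR-685 case described in the commit.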
      
      On the D-Link DIR-685 router this fixes a bug on the conduit
      interface to the RTL8366RB DSA switch: as the switch needs to add
      space for its tag it increases the MTU on the conduit interface
      to 1504 and that means that when the router sends packets
      of 1500 bytes these get an extra 4 bytes of DSA tag and the
      transfer fails because of the erroneous hardware checksumming,
      affecting such basic functionality as the LuCI web interface.
      
      Fixes: 4d5ae32f ("net: ethernet: Add a driver for Gemini gigabit ethernet")
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20231109-gemini-largeframe-fix-v4-2-6e611528db08@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: ethernet: cortina: Fix max RX frame define · 510e35fb
      Linus Walleij authored
      Enumerator 3 is 1548 bytes according to the datasheet.
      Not 1542.
      
      Fixes: 4d5ae32f ("net: ethernet: Add a driver for Gemini gigabit ethernet")
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20231109-gemini-largeframe-fix-v4-1-6e611528db08@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bonding: stop the device in bond_setup_by_slave() · 3cffa2dd
      Eric Dumazet authored
      Commit 9eed321c ("net: lapbether: only support ethernet devices")
      has been able to keep syzbot away from net/lapb, until today.
      
      In the following splat [1], the issue is that a lapbether device has
      been created on a bonding device without members. Then adding a non
      ARPHRD_ETHER member forced the bonding master to change its type.
      
      The fix is to make sure we call dev_close() in bond_setup_by_slave()
      so that the potential linked lapbether devices (or any other devices
      having assumptions on the physical device) are removed.
      
      A similar bug has been addressed in commit 40baec22
      ("bonding: fix panic on non-ARPHRD_ETHER enslave failure")
      
      [1]
      skbuff: skb_under_panic: text:ffff800089508810 len:44 put:40 head:ffff0000c78e7c00 data:ffff0000c78e7bea tail:0x16 end:0x140 dev:bond0
      kernel BUG at net/core/skbuff.c:192 !
      Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
      Modules linked in:
      CPU: 0 PID: 6007 Comm: syz-executor383 Not tainted 6.6.0-rc3-syzkaller-gbf6547d8715b #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/04/2023
      pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      pc : skb_panic net/core/skbuff.c:188 [inline]
      pc : skb_under_panic+0x13c/0x140 net/core/skbuff.c:202
      lr : skb_panic net/core/skbuff.c:188 [inline]
      lr : skb_under_panic+0x13c/0x140 net/core/skbuff.c:202
      sp : ffff800096a06aa0
      x29: ffff800096a06ab0 x28: ffff800096a06ba0 x27: dfff800000000000
      x26: ffff0000ce9b9b50 x25: 0000000000000016 x24: ffff0000c78e7bea
      x23: ffff0000c78e7c00 x22: 000000000000002c x21: 0000000000000140
      x20: 0000000000000028 x19: ffff800089508810 x18: ffff800096a06100
      x17: 0000000000000000 x16: ffff80008a629a3c x15: 0000000000000001
      x14: 1fffe00036837a32 x13: 0000000000000000 x12: 0000000000000000
      x11: 0000000000000201 x10: 0000000000000000 x9 : cb50b496c519aa00
      x8 : cb50b496c519aa00 x7 : 0000000000000001 x6 : 0000000000000001
      x5 : ffff800096a063b8 x4 : ffff80008e280f80 x3 : ffff8000805ad11c
      x2 : 0000000000000001 x1 : 0000000100000201 x0 : 0000000000000086
      Call trace:
      skb_panic net/core/skbuff.c:188 [inline]
      skb_under_panic+0x13c/0x140 net/core/skbuff.c:202
      skb_push+0xf0/0x108 net/core/skbuff.c:2446
      ip6gre_header+0xbc/0x738 net/ipv6/ip6_gre.c:1384
      dev_hard_header include/linux/netdevice.h:3136 [inline]
      lapbeth_data_transmit+0x1c4/0x298 drivers/net/wan/lapbether.c:257
      lapb_data_transmit+0x8c/0xb0 net/lapb/lapb_iface.c:447
      lapb_transmit_buffer+0x178/0x204 net/lapb/lapb_out.c:149
      lapb_send_control+0x220/0x320 net/lapb/lapb_subr.c:251
      __lapb_disconnect_request+0x9c/0x17c net/lapb/lapb_iface.c:326
      lapb_device_event+0x288/0x4e0 net/lapb/lapb_iface.c:492
      notifier_call_chain+0x1a4/0x510 kernel/notifier.c:93
      raw_notifier_call_chain+0x3c/0x50 kernel/notifier.c:461
      call_netdevice_notifiers_info net/core/dev.c:1970 [inline]
      call_netdevice_notifiers_extack net/core/dev.c:2008 [inline]
      call_netdevice_notifiers net/core/dev.c:2022 [inline]
      __dev_close_many+0x1b8/0x3c4 net/core/dev.c:1508
      dev_close_many+0x1e0/0x470 net/core/dev.c:1559
      dev_close+0x174/0x250 net/core/dev.c:1585
      lapbeth_device_event+0x2e4/0x958 drivers/net/wan/lapbether.c:466
      notifier_call_chain+0x1a4/0x510 kernel/notifier.c:93
      raw_notifier_call_chain+0x3c/0x50 kernel/notifier.c:461
      call_netdevice_notifiers_info net/core/dev.c:1970 [inline]
      call_netdevice_notifiers_extack net/core/dev.c:2008 [inline]
      call_netdevice_notifiers net/core/dev.c:2022 [inline]
      __dev_close_many+0x1b8/0x3c4 net/core/dev.c:1508
      dev_close_many+0x1e0/0x470 net/core/dev.c:1559
      dev_close+0x174/0x250 net/core/dev.c:1585
      bond_enslave+0x2298/0x30cc drivers/net/bonding/bond_main.c:2332
      bond_do_ioctl+0x268/0xc64 drivers/net/bonding/bond_main.c:4539
      dev_ifsioc+0x754/0x9ac
      dev_ioctl+0x4d8/0xd34 net/core/dev_ioctl.c:786
      sock_do_ioctl+0x1d4/0x2d0 net/socket.c:1217
      sock_ioctl+0x4e8/0x834 net/socket.c:1322
      vfs_ioctl fs/ioctl.c:51 [inline]
      __do_sys_ioctl fs/ioctl.c:871 [inline]
      __se_sys_ioctl fs/ioctl.c:857 [inline]
      __arm64_sys_ioctl+0x14c/0x1c8 fs/ioctl.c:857
      __invoke_syscall arch/arm64/kernel/syscall.c:37 [inline]
      invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:51
      el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:136
      do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:155
      el0_svc+0x58/0x16c arch/arm64/kernel/entry-common.c:678
      el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:696
      el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591
      Code: aa1803e6 aa1903e7 a90023f5 94785b8b (d4210000)
      
      Fixes: 872254dd ("net/bonding: Enable bonding to enslave non ARPHRD_ETHER")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
      Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
      Link: https://lore.kernel.org/r/20231109180102.4085183-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ptp: annotate data-race around q->head and q->tail · 73bde5a3
      Eric Dumazet authored
      As I was working on a syzbot report, I found that KCSAN would
      probably complain that reading q->head or q->tail without
      barriers could lead to invalid results.
      
      Add corresponding READ_ONCE() and WRITE_ONCE() to avoid
      load-store tearing.
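As a rough user-space illustration of the annotation, the sketch below reads each index exactly once through a volatile access and publishes updates the same way. The macros are simplified stand-ins for the kernel's `READ_ONCE()`/`WRITE_ONCE()` (they rely on the GCC/Clang `__typeof__` extension), and `struct ts_queue`, `QUEUE_LEN`, `queue_cnt()` and `queue_advance_tail()` are hypothetical names chosen for the example.

```c
#include <assert.h>

/* Simplified stand-ins for the kernel macros: force a single volatile access. */
#define READ_ONCE(x)      (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)  (*(volatile __typeof__(x) *)&(x) = (v))

#define QUEUE_LEN 128  /* hypothetical ring size */

struct ts_queue {
    int head;  /* consumer index */
    int tail;  /* producer index */
};

/* Number of queued entries, computed from one snapshot of each index. */
static int queue_cnt(struct ts_queue *q)
{
    int head = READ_ONCE(q->head);
    int tail = READ_ONCE(q->tail);
    int cnt = tail - head;

    return cnt < 0 ? QUEUE_LEN + cnt : cnt;
}

/* Producer side: publish the new tail with a single store. */
static void queue_advance_tail(struct ts_queue *q)
{
    WRITE_ONCE(q->tail, (READ_ONCE(q->tail) + 1) % QUEUE_LEN);
}
```

The annotations stop the compiler from tearing or repeating the accesses. They do not by themselves order accesses between CPUs, so any locking around the queue is still needed.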
      
      Fixes: d94ba80e ("ptp: Added a brand new class driver for ptp clocks.")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Link: https://lore.kernel.org/r/20231109174859.3995880-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Revert "ptp: Fixes a null pointer dereference in ptp_ioctl" · 4b3812d9
      Jakub Kicinski authored
      This reverts commit 8a4f030d.
      
      Richard says:
      
        The test itself is harmless, but keeping it will make people think,
        "oh this pointer can be invalid."
      
        In fact the core stack ensures that ioctl() can't be invoked after
        release(), otherwise Bad Stuff happens.
      
      Fixes: 8a4f030d ("ptp: Fixes a null pointer dereference in ptp_ioctl")
      Link: https://lore.kernel.org/all/ZVAf_qdRfDAQYUt-@hoboy.vegasvil.org/
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ice: fix DDP package download for packages without signature segment · a778616e
      Dan Nowlin authored
      Commit 3cbdb034 ("ice: Add support for E830 DDP package segment")
      incorrectly removed support for package download for packages without a
      signature segment. These packages include the signature buffer inline
      in the configurations buffers, and not in a signature segment.
      
      Fix package download by providing download support for both packages
      with (ice_download_pkg_with_sig_seg()) and without signature segment
      (ice_download_pkg_without_sig_seg()).
      
      Fixes: 3cbdb034 ("ice: Add support for E830 DDP package segment")
      Reported-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Closes: https://lore.kernel.org/netdev/ZUT50a94kk2pMGKb@boxer/
      Tested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: Dan Nowlin <dan.nowlin@intel.com>
      Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: dpll: fix output pin capabilities · 6db5f2cd
      Arkadiusz Kubalewski authored
      The dpll output pins which are used to feed clock signal of PHY and MAC
      circuits cannot be disconnected, as those integrated circuits require a
      clock signal for operation.
      Stop assigning the DPLL_PIN_CAPABILITIES_STATE_CAN_CHANGE pin capability
      to those pins, which prevents the user from invoking the state set
      callback on them. Setting the state on those pins already returns an
      error, as firmware doesn't allow the change of their state.
      
      Fixes: d7999f5e ("ice: implement dpll interface to control cgu")
      Fixes: 8a3a565f ("ice: add admin commands to access cgu configuration")
      Reviewed-by: Andrii Staikov <andrii.staikov@intel.com>
      Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
      Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: dpll: fix check for dpll input priority range · 4a4027f2
      Arkadiusz Kubalewski authored
      Supported priority values for input pins may differ depending on the NIC
      firmware version. E810T NICs with 3.20/4.00 FW versions would accept
      priority range 0-31, whereas firmware 4.10+ would support the range 0-9
      and an extra value of 255.
      Remove the in-range check, as the driver has no information on supported
      values from the running firmware. Let firmware decide if the given value
      is correct and return an extack error if the value is not supported.
      
      Fixes: d7999f5e ("ice: implement dpll interface to control cgu")
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
      Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: dpll: fix initial lock status of dpll · 7a1aba89
      Arkadiusz Kubalewski authored
      When a dpll device is registered and the dpll subsystem sends the
      notification for the new device, the lock state value provided to the
      dpll subsystem equals 0, which is an invalid value for the
      `enum dpll_lock_status`.
      Provide correct value by obtaining it from firmware before registering
      the dpll device.
      
      Fixes: d7999f5e ("ice: implement dpll interface to control cgu")
      Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
      Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
      Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  4. Nov 13, 2023
    • ppp: limit MRU to 64K · c0a2a1b0
      Willem de Bruijn authored
      ppp_sync_ioctl allows setting device MRU, but does not sanity check
      this input.
      
      Limit to a sane upper bound of 64KB.
      
      No implementation I could find generates larger than 64KB frames.
      RFC 2823 mentions an upper bound of PPP over SDL of 64KB based on the
      16-bit length field. Other protocols will be smaller, such as PPPoE
      (9KB jumbo frame) and PPPoA (18190 maximum CPCS-SDU size, RFC 2364).
      PPTP and L2TP encapsulate in IP.
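The bound check can be sketched as follows. This is an illustration rather than the patch itself: `struct chan` and `set_mru()` are hypothetical, and using `UINT16_MAX` (65535) as the limit is an assumption based on the 16-bit length field mentioned above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-channel state; only the MRU matters for this sketch. */
struct chan {
    unsigned int mru;
};

/* Reject MRU values that cannot be expressed in a 16-bit length field. */
static int set_mru(struct chan *c, unsigned int val)
{
    if (val > UINT16_MAX)
        return -EINVAL;   /* leave the previous MRU untouched */
    c->mru = val;
    return 0;
}
```

With this check, the value from the syzbot reproducer (`0x5e6417a8`) is refused with `-EINVAL` and never reaches an skb allocation.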
      
      Syzbot managed to trigger alloc warning in __alloc_pages:
      
      	if (WARN_ON_ONCE_GFP(order > MAX_ORDER, gfp))
      
          WARNING: CPU: 1 PID: 37 at mm/page_alloc.c:4544 __alloc_pages+0x3ab/0x4a0 mm/page_alloc.c:4544
      
          __alloc_skb+0x12b/0x330 net/core/skbuff.c:651
          __netdev_alloc_skb+0x72/0x3f0 net/core/skbuff.c:715
          netdev_alloc_skb include/linux/skbuff.h:3225 [inline]
          dev_alloc_skb include/linux/skbuff.h:3238 [inline]
          ppp_sync_input drivers/net/ppp/ppp_synctty.c:669 [inline]
          ppp_sync_receive+0xff/0x680 drivers/net/ppp/ppp_synctty.c:334
          tty_ldisc_receive_buf+0x14c/0x180 drivers/tty/tty_buffer.c:390
          tty_port_default_receive_buf+0x70/0xb0 drivers/tty/tty_port.c:37
          receive_buf drivers/tty/tty_buffer.c:444 [inline]
          flush_to_ldisc+0x261/0x780 drivers/tty/tty_buffer.c:494
          process_one_work+0x884/0x15c0 kernel/workqueue.c:2630
      
      With call
      
          ioctl$PPPIOCSMRU1(r1, 0x40047452, &(0x7f0000000100)=0x5e6417a8)
      
      Similar code exists in other drivers that implement ppp_channel_ops
      ioctl PPPIOCSMRU. Those might also be in scope. Notably excluded from
      this are pppol2tp_ioctl and pppoe_ioctl.
      
      This code goes back to the start of git history.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: <syzbot+6177e1f90d92583bcc58@syzkaller.appspotmail.com>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mvneta: fix calls to page_pool_get_stats · ca8add92
      Sven Auhagen authored
      Calling page_pool_get_stats in the mvneta driver without checks
      leads to kernel crashes.
      First the page pool is only available if the bm is not used.
      The page pool is also not allocated when the port is stopped.
      It can also be not allocated in case of errors.
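The three conditions above amount to guarding every access to the pool. The sketch below shows such a guard in user space. All types and names (`struct pool`, `struct rxq`, `struct port`, `collect_pool_stats()`) are hypothetical stand-ins and do not reflect the mvneta structures or the real `page_pool_get_stats()` signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures. */
struct pool {
    unsigned long alloc_fast;   /* one example counter */
};

struct rxq {
    struct pool *page_pool;     /* NULL while the port is down or after an error */
};

struct port {
    int uses_bm;                /* hardware buffer manager in use: no page pool */
    int nrxq;
    struct rxq *rxqs;
};

/* Sum a counter over all queues, skipping every case where no pool exists. */
static unsigned long collect_pool_stats(const struct port *pp)
{
    unsigned long total = 0;
    int i;

    if (pp->uses_bm)
        return 0;               /* page pools are never created with the bm */

    for (i = 0; i < pp->nrxq; i++) {
        if (!pp->rxqs[i].page_pool)
            continue;           /* port stopped or allocation failed */
        total += pp->rxqs[i].page_pool->alloc_fast;
    }
    return total;
}
```

Skipping a missing pool yields zeroed statistics for that queue, so no NULL pointer is ever dereferenced.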
      
      The current implementation leads to the following crash calling
      ethstats on a port that is down or when calling it at the wrong moment:
      
      Unable to handle kernel NULL pointer dereference at virtual address 00000070
      [00000070] *pgd=00000000
      Internal error: Oops: 5 [#1] SMP ARM
      Hardware name: Marvell Armada 380/385 (Device Tree)
      PC is at page_pool_get_stats+0x18/0x1cc
      LR is at mvneta_ethtool_get_stats+0xa0/0xe0 [mvneta]
      pc : [<c0b413cc>]    lr : [<bf0a98d8>]    psr: a0000013
      sp : f1439d48  ip : f1439dc0  fp : 0000001d
      r10: 00000100  r9 : c4816b80  r8 : f0d75150
      r7 : bf0b400c  r6 : c238f000  r5 : 00000000  r4 : f1439d68
      r3 : c2091040  r2 : ffffffd8  r1 : f1439d68  r0 : 00000000
      Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
      Control: 10c5387d  Table: 066b004a  DAC: 00000051
      Register r0 information: NULL pointer
      Register r1 information: 2-page vmalloc region starting at 0xf1438000 allocated at kernel_clone+0x9c/0x390
      Register r2 information: non-paged memory
      Register r3 information: slab kmalloc-2k start c2091000 pointer offset 64 size 2048
      Register r4 information: 2-page vmalloc region starting at 0xf1438000 allocated at kernel_clone+0x9c/0x390
      Register r5 information: NULL pointer
      Register r6 information: slab kmalloc-cg-4k start c238f000 pointer offset 0 size 4096
      Register r7 information: 15-page vmalloc region starting at 0xbf0a8000 allocated at load_module+0xa30/0x219c
      Register r8 information: 1-page vmalloc region starting at 0xf0d75000 allocated at ethtool_get_stats+0x138/0x208
      Register r9 information: slab task_struct start c4816b80 pointer offset 0
      Register r10 information: non-paged memory
      Register r11 information: non-paged memory
      Register r12 information: 2-page vmalloc region starting at 0xf1438000 allocated at kernel_clone+0x9c/0x390
      Process snmpd (pid: 733, stack limit = 0x38de3a88)
      Stack: (0xf1439d48 to 0xf143a000)
      9d40:                   000000c0 00000001 c238f000 bf0b400c f0d75150 c4816b80
      9d60: 00000100 bf0a98d8 00000000 00000000 00000000 00000000 00000000 00000000
      9d80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      9da0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      9dc0: 00000dc0 5335509c 00000035 c238f000 bf0b2214 01067f50 f0d75000 c0b9b9c8
      9de0: 0000001d 00000035 c2212094 5335509c c4816b80 c238f000 c5ad6e00 01067f50
      9e00: c1b0be80 c4816b80 00014813 c0b9d7f0 00000000 00000000 0000001d 0000001d
      9e20: 00000000 00001200 00000000 00000000 c216ed90 c73943b8 00000000 00000000
      9e40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      9e60: 00000000 c0ad9034 00000000 00000000 00000000 00000000 00000000 00000000
      9e80: 00000000 00000000 00000000 5335509c c1b0be80 f1439ee4 00008946 c1b0be80
      9ea0: 01067f50 f1439ee3 00000000 00000046 b6d77ae0 c0b383f0 00008946 becc83e8
      9ec0: c1b0be80 00000051 0000000b c68ca480 c7172d00 c0ad8ff0 f1439ee3 cf600e40
      9ee0: 01600e40 32687465 00000000 00000000 00000000 01067f50 00000000 00000000
      9f00: 00000000 5335509c 00008946 00008946 00000000 c68ca480 becc83e8 c05e2de0
      9f20: f1439fb0 c03002f0 00000006 5ac3c35a c4816b80 00000006 b6d77ae0 c030caf0
      9f40: c4817350 00000014 f1439e1c 0000000c 00000000 00000051 01000000 00000014
      9f60: 00003fec f1439edc 00000001 c0372abc b6d77ae0 c0372abc cf600e40 5335509c
      9f80: c21e6800 01015c9c 0000000b 00008946 00000036 c03002f0 c4816b80 00000036
      9fa0: b6d77ae0 c03000c0 01015c9c 0000000b 0000000b 00008946 becc83e8 00000000
      9fc0: 01015c9c 0000000b 00008946 00000036 00000035 010678a0 b6d797ec b6d77ae0
      9fe0: b6dbf738 becc838c b6d186d7 b6baa858 40000030 0000000b 00000000 00000000
       page_pool_get_stats from mvneta_ethtool_get_stats+0xa0/0xe0 [mvneta]
       mvneta_ethtool_get_stats [mvneta] from ethtool_get_stats+0x154/0x208
       ethtool_get_stats from dev_ethtool+0xf48/0x2480
       dev_ethtool from dev_ioctl+0x538/0x63c
       dev_ioctl from sock_ioctl+0x49c/0x53c
       sock_ioctl from sys_ioctl+0x134/0xbd8
       sys_ioctl from ret_fast_syscall+0x0/0x1c
      Exception stack(0xf1439fa8 to 0xf1439ff0)
      9fa0:                   01015c9c 0000000b 0000000b 00008946 becc83e8 00000000
      9fc0: 01015c9c 0000000b 00008946 00000036 00000035 010678a0 b6d797ec b6d77ae0
      9fe0: b6dbf738 becc838c b6d186d7 b6baa858
      Code: e28dd004 e1a05000 e2514000 0a00006a (e5902070)
      
      This commit adds the proper checks before calling page_pool_get_stats.
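
      The three conditions listed above (bm in use, port stopped, allocation
      failed) can be modelled in plain userspace C. This is only a sketch under
      assumptions: every type and function name below is a hypothetical
      stand-in, not the mvneta or page_pool API, and the real patch may be
      structured differently. It shows the general shape of skipping the
      stats call whenever the pool pointer may be NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the real kernel types. */
struct pool_stats {
	uint64_t alloc_fast;
	uint64_t alloc_slow;
};

struct page_pool_mock {
	struct pool_stats stats;
};

struct rxq_mock {
	/* NULL while the port is stopped or if allocation failed. */
	struct page_pool_mock *page_pool;
};

struct port_mock {
	int uses_bm;            /* nonzero: buffer manager in use, no page pool */
	size_t nrxqs;
	struct rxq_mock *rxqs;
};

/* Accumulates one pool's counters; dereferences pool unconditionally,
 * which is why the caller has to check it first. */
static void pool_get_stats(const struct page_pool_mock *pool,
			   struct pool_stats *out)
{
	out->alloc_fast += pool->stats.alloc_fast;
	out->alloc_slow += pool->stats.alloc_slow;
}

/* Guarded wrapper: reports zeroed stats instead of crashing when no
 * page pool exists for the port or for an individual queue. */
static void port_get_pool_stats(const struct port_mock *pp,
				struct pool_stats *out)
{
	assert(pp != NULL && out != NULL);
	memset(out, 0, sizeof(*out));

	if (pp->uses_bm || pp->rxqs == NULL)
		return;

	for (size_t i = 0; i < pp->nrxqs; i++) {
		if (pp->rxqs[i].page_pool != NULL)
			pool_get_stats(pp->rxqs[i].page_pool, out);
	}
}
```

      With this shape, a stats request on a stopped port returns zeroed
      counters instead of dereferencing a NULL pool pointer.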
      
      Fixes: b3fc7922 ("net: mvneta: add support for page_pool_get_stats")
      Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
      Reported-by: Paulo Da Silva <Paulo.DaSilva@kyberna.com>
      Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ca8add92
    • tipc: Fix kernel-infoleak due to uninitialized TLV value · fb317eb2
      Shigeru Yoshida authored
      KMSAN reported the following kernel-infoleak issue:
      
      =====================================================
      BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline]
      BUG: KMSAN: kernel-infoleak in copy_to_user_iter lib/iov_iter.c:24 [inline]
      BUG: KMSAN: kernel-infoleak in iterate_ubuf include/linux/iov_iter.h:29 [inline]
      BUG: KMSAN: kernel-infoleak in iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
      BUG: KMSAN: kernel-infoleak in iterate_and_advance include/linux/iov_iter.h:271 [inline]
      BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186
       instrument_copy_to_user include/linux/instrumented.h:114 [inline]
       copy_to_user_iter lib/iov_iter.c:24 [inline]
       iterate_ubuf include/linux/iov_iter.h:29 [inline]
       iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
       iterate_and_advance include/linux/iov_iter.h:271 [inline]
       _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186
       copy_to_iter include/linux/uio.h:197 [inline]
       simple_copy_to_iter net/core/datagram.c:532 [inline]
       __skb_datagram_iter.5+0x148/0xe30 net/core/datagram.c:420
       skb_copy_datagram_iter+0x52/0x210 net/core/datagram.c:546
       skb_copy_datagram_msg include/linux/skbuff.h:3960 [inline]
       netlink_recvmsg+0x43d/0x1630 net/netlink/af_netlink.c:1967
       sock_recvmsg_nosec net/socket.c:1044 [inline]
       sock_recvmsg net/socket.c:1066 [inline]
       __sys_recvfrom+0x476/0x860 net/socket.c:2246
       __do_sys_recvfrom net/socket.c:2264 [inline]
       __se_sys_recvfrom net/socket.c:2260 [inline]
       __x64_sys_recvfrom+0x130/0x200 net/socket.c:2260
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Uninit was created at:
       slab_post_alloc_hook+0x103/0x9e0 mm/slab.h:768
       slab_alloc_node mm/slub.c:3478 [inline]
       kmem_cache_alloc_node+0x5f7/0xb50 mm/slub.c:3523
       kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:560
       __alloc_skb+0x2fd/0x770 net/core/skbuff.c:651
       alloc_skb include/linux/skbuff.h:1286 [inline]
       tipc_tlv_alloc net/tipc/netlink_compat.c:156 [inline]
       tipc_get_err_tlv+0x90/0x5d0 net/tipc/netlink_compat.c:170
       tipc_nl_compat_recv+0x1042/0x15d0 net/tipc/netlink_compat.c:1324
       genl_family_rcv_msg_doit net/netlink/genetlink.c:972 [inline]
       genl_family_rcv_msg net/netlink/genetlink.c:1052 [inline]
       genl_rcv_msg+0x1220/0x12c0 net/netlink/genetlink.c:1067
       netlink_rcv_skb+0x4a4/0x6a0 net/netlink/af_netlink.c:2545
       genl_rcv+0x41/0x60 net/netlink/genetlink.c:1076
       netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline]
       netlink_unicast+0xf4b/0x1230 net/netlink/af_netlink.c:1368
       netlink_sendmsg+0x1242/0x1420 net/netlink/af_netlink.c:1910
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg net/socket.c:745 [inline]
       ____sys_sendmsg+0x997/0xd60 net/socket.c:2588
       ___sys_sendmsg+0x271/0x3b0 net/socket.c:2642
       __sys_sendmsg net/socket.c:2671 [inline]
       __do_sys_sendmsg net/socket.c:2680 [inline]
       __se_sys_sendmsg net/socket.c:2678 [inline]
       __x64_sys_sendmsg+0x2fa/0x4a0 net/socket.c:2678
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Bytes 34-35 of 36 are uninitialized
      Memory access of size 36 starts at ffff88802d464a00
      Data copied to user address 00007ff55033c0a0
      
      CPU: 0 PID: 30322 Comm: syz-executor.0 Not tainted 6.6.0-14500-g1c41041124bd #10
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
      =====================================================
      
      tipc_add_tlv() puts a TLV descriptor and value onto `skb`. The space
      this takes is calculated with the TLV_SPACE() macro, which adds the size
      of struct tlv_desc to the length of the TLV value passed as an argument
      and aligns the result to a multiple of TLV_ALIGNTO, i.e., 4 bytes.
      
      If the size of struct tlv_desc plus the length of TLV value is not aligned,
      the current implementation leaves the remaining bytes uninitialized. This
      is the cause of the above kernel-infoleak issue.
      
      This patch resolves this issue by clearing data up to an aligned size.
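
      The alignment arithmetic and the clearing step can be illustrated in
      userspace C. This is a sketch, not the kernel code: the macros are local
      copies modelled on the TLV_* macros named above, while
      `struct tlv_desc_sketch` and `tlv_put()` are hypothetical names. It
      assumes a 4-byte descriptor and stores the fields in host byte order
      for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-byte descriptor, modelled on struct tlv_desc. */
struct tlv_desc_sketch {
	uint16_t tlv_len;   /* descriptor plus value, without padding */
	uint16_t tlv_type;
};

/* Local copies of the alignment macros described in the text. */
#define TLV_ALIGNTO   ((size_t)4)
#define TLV_ALIGN(n)  (((n) + (TLV_ALIGNTO - 1)) & ~(TLV_ALIGNTO - 1))
#define TLV_LENGTH(n) (sizeof(struct tlv_desc_sketch) + (n))
#define TLV_SPACE(n)  TLV_ALIGN(TLV_LENGTH(n))

/* Writes one TLV into buf and returns the number of bytes consumed,
 * or 0 if it does not fit. The whole aligned span is zeroed first, so
 * padding bytes after the value never carry stale memory. */
static size_t tlv_put(unsigned char *buf, size_t cap, uint16_t type,
		      const void *data, size_t len)
{
	size_t space = TLV_SPACE(len);
	struct tlv_desc_sketch desc;

	assert(buf != NULL);
	if (space > cap || TLV_LENGTH(len) > UINT16_MAX)
		return 0;

	memset(buf, 0, space);                /* clears the padding too */

	desc.tlv_len = (uint16_t)TLV_LENGTH(len);
	desc.tlv_type = type;
	memcpy(buf, &desc, sizeof(desc));
	if (len > 0)
		memcpy(buf + sizeof(desc), data, len);

	return space;
}
```

      For a 10-byte value, TLV_LENGTH is 14 and TLV_SPACE is 16, so the last
      two bytes are padding. Without the memset they would keep whatever the
      allocator left in the buffer.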
      
      Fixes: d0796d1e ("tipc: convert legacy nl bearer dump to nl compat")
      Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fb317eb2