  1. Aug 24, 2023
    • rtnetlink: Reject negative ifindexes in RTM_NEWLINK · 30188bd7
      Ido Schimmel authored
      Negative ifindexes are illegal, but the kernel does not validate the
      ifindex in the ancillary header of RTM_NEWLINK messages, resulting in
      the kernel generating a warning [1] when such an ifindex is specified.
      
      Fix by rejecting negative ifindexes.
      
      [1]
      WARNING: CPU: 0 PID: 5031 at net/core/dev.c:9593 dev_index_reserve+0x1a2/0x1c0 net/core/dev.c:9593
      [...]
      Call Trace:
       <TASK>
       register_netdevice+0x69a/0x1490 net/core/dev.c:10081
       br_dev_newlink+0x27/0x110 net/bridge/br_netlink.c:1552
       rtnl_newlink_create net/core/rtnetlink.c:3471 [inline]
       __rtnl_newlink+0x115e/0x18c0 net/core/rtnetlink.c:3688
       rtnl_newlink+0x67/0xa0 net/core/rtnetlink.c:3701
       rtnetlink_rcv_msg+0x439/0xd30 net/core/rtnetlink.c:6427
       netlink_rcv_skb+0x16b/0x440 net/netlink/af_netlink.c:2545
       netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline]
       netlink_unicast+0x536/0x810 net/netlink/af_netlink.c:1368
       netlink_sendmsg+0x93c/0xe40 net/netlink/af_netlink.c:1910
       sock_sendmsg_nosec net/socket.c:728 [inline]
       sock_sendmsg+0xd9/0x180 net/socket.c:751
       ____sys_sendmsg+0x6ac/0x940 net/socket.c:2538
       ___sys_sendmsg+0x135/0x1d0 net/socket.c:2592
       __sys_sendmsg+0x117/0x1e0 net/socket.c:2621
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Fixes: 38f7b870 ("[RTNETLINK]: Link creation API")
      Reported-by: syzbot+5ba06978f34abb058571@syzkaller.appspotmail.com
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Link: https://lore.kernel.org/r/20230823064348.2252280-1-idosch@nvidia.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
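The rule this commit enforces can be sketched in user-space C. This is an illustration only, not the kernel patch: `toy_ifinfo` and `toy_validate_ifindex` are hypothetical names, and returning `-EINVAL` is an assumption. Only the rule itself (a negative ifindex is rejected) comes from the commit message.

```c
#include <errno.h>

/* Hypothetical stand-in for the ifindex field of the RTM_NEWLINK header. */
struct toy_ifinfo {
	int ifi_index;
};

/*
 * Return 0 if the requested ifindex is acceptable, or a negative errno.
 * Zero conventionally means "let the kernel pick an index", so only
 * negative values are rejected here.
 */
static int toy_validate_ifindex(const struct toy_ifinfo *ifm)
{
	if (ifm->ifi_index < 0)
		return -EINVAL;	/* assumed error code for this sketch */
	return 0;
}
```

Doing the check before any device is registered means a bad request fails with an error instead of reaching the warning shown in the trace.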
  2. Aug 23, 2023
  3. Aug 22, 2023
    • tg3: Use slab_build_skb() when needed · 99b415fe
      Kees Cook authored
      The tg3 driver will use kmalloc() under some conditions. Check the
      frag_size and use slab_build_skb() when frag_size is 0. Silences
      the warning introduced by commit ce098da1 ("skbuff: Introduce
      slab_build_skb()"):
      
      	Use slab_build_skb() instead
      	...
      	tg3_poll_work+0x638/0xf90 [tg3]
      
      Fixes: ce098da1 ("skbuff: Introduce slab_build_skb()")
      Reported-by: Fiona Ebner <f.ebner@proxmox.com>
      Closes: https://lore.kernel.org/all/1bd4cb9c-4eb8-3bdb-3e05-8689817242d1@proxmox.com
      Cc: Siva Reddy Kallam <siva.kallam@broadcom.com>
      Cc: Prashant Sreedharan <prashant@broadcom.com>
      Cc: Michael Chan <mchan@broadcom.com>
      Cc: Bagas Sanjaya <bagasdotme@gmail.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
      Link: https://lore.kernel.org/r/20230818175417.never.273-kees@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: bonding: do not set port down before adding to bond · be809424
      Hangbin Liu authored
      Before adding a port to a bond, it needs to be set down first. In the
      lacpdu test the author set the port down explicitly. But commit
      a4abfa62 ("net: rtnetlink: Enslave device before bringing it up")
      changed the order of operations: the kernel now sets the port down
      _after_ adding it to the bond. As a result, all the ports end up down
      and the test fails.
      
      In fact, the veth interfaces are already inactive when added. This
      means there's no need to set them down again before adding to the bond.
      Let's just remove the link down operation.
      
      Fixes: a4abfa62 ("net: rtnetlink: Enslave device before bringing it up")
      Reported-by: Zhengchao Shao <shaozhengchao@huawei.com>
      Closes: https://lore.kernel.org/netdev/a0ef07c7-91b0-94bd-240d-944a330fcabd@huawei.com/
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Link: https://lore.kernel.org/r/20230817082459.1685972-1-liuhangbin@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ice: Fix NULL pointer deref during VF reset · 67f6317d
      Petr Oros authored
      During a stress test that attached and detached a VF from KVM while
      simultaneously changing the VF's spoofcheck and trust settings, a
      NULL pointer dereference occurred in ice_reset_vf() because the VF's
      VSI was NULL.
      
      More than one instance of ice_reset_vf() can be running at a given
      time. While we rebuild the VSI in ice_reset_vf(), another reset can be
      triggered from ice_service_task(). In that case we can access the
      not-yet-initialized VSI and cause a panic. The window for this race
      condition has been around for a long time, but it became much worse
      after commit 227bf450 ("ice: move VSI delete outside deconfig") because
      the reset runs faster. ice_reset_vf() uses vf->cfg_lock; taking this
      lock before accessing the VF's VSI fixes the bug in all cases.
      
      The panic occurs sometimes in ice_vsi_is_rx_queue_active() and sometimes
      in ice_vsi_stop_all_rx_rings().
      
      With our reproducer, we hit the bug after:
      ~8h before commit 227bf450 ("ice: move VSI delete outside deconfig").
      ~20m after commit 227bf450 ("ice: move VSI delete outside deconfig").
      With this fix we were not able to reproduce it in ~48h.
      
      There was commit cf90b743 ("ice: Fix call trace with null VSI during
      VF reset") which also tried to fix this issue, but it was only
      partially resolved and the bug still exists.
      
      [ 6420.658415] BUG: kernel NULL pointer dereference, address: 0000000000000000
      [ 6420.665382] #PF: supervisor read access in kernel mode
      [ 6420.670521] #PF: error_code(0x0000) - not-present page
      [ 6420.675659] PGD 0
      [ 6420.677679] Oops: 0000 [#1] PREEMPT SMP NOPTI
      [ 6420.682038] CPU: 53 PID: 326472 Comm: kworker/53:0 Kdump: loaded Not tainted 5.14.0-317.el9.x86_64 #1
      [ 6420.691250] Hardware name: Dell Inc. PowerEdge R750/04V528, BIOS 1.6.5 04/15/2022
      [ 6420.698729] Workqueue: ice ice_service_task [ice]
      [ 6420.703462] RIP: 0010:ice_vsi_is_rx_queue_active+0x2d/0x60 [ice]
      [ 6420.705860] ice 0000:ca:00.0: VF 0 is now untrusted
      [ 6420.709494] Code: 00 00 66 83 bf 76 04 00 00 00 48 8b 77 10 74 3e 31 c0 eb 0f 0f b7 97 76 04 00 00 48 83 c0 01 39 c2 7e 2b 48 8b 97 68 04 00 00 <0f> b7 0c 42 48 8b 96 20 13 00 00 48 8d 94 8a 00 00 12 00 8b 12 83
      [ 6420.714426] ice 0000:ca:00.0 ens7f0: Setting MAC 22:22:22:22:22:00 on VF 0. VF driver will be reinitialized
      [ 6420.733120] RSP: 0018:ff778d2ff383fdd8 EFLAGS: 00010246
      [ 6420.733123] RAX: 0000000000000000 RBX: ff2acf1916294000 RCX: 0000000000000000
      [ 6420.733125] RDX: 0000000000000000 RSI: ff2acf1f2c6401a0 RDI: ff2acf1a27301828
      [ 6420.762346] RBP: ff2acf1a27301828 R08: 0000000000000010 R09: 0000000000001000
      [ 6420.769476] R10: ff2acf1916286000 R11: 00000000019eba3f R12: ff2acf19066460d0
      [ 6420.776611] R13: ff2acf1f2c6401a0 R14: ff2acf1f2c6401a0 R15: 00000000ffffffff
      [ 6420.783742] FS:  0000000000000000(0000) GS:ff2acf28ffa80000(0000) knlGS:0000000000000000
      [ 6420.791829] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 6420.797575] CR2: 0000000000000000 CR3: 00000016ad410003 CR4: 0000000000773ee0
      [ 6420.804708] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 6420.811034] vfio-pci 0000:ca:01.0: enabling device (0000 -> 0002)
      [ 6420.811840] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 6420.811841] PKRU: 55555554
      [ 6420.811842] Call Trace:
      [ 6420.811843]  <TASK>
      [ 6420.811844]  ice_reset_vf+0x9a/0x450 [ice]
      [ 6420.811876]  ice_process_vflr_event+0x8f/0xc0 [ice]
      [ 6420.841343]  ice_service_task+0x23b/0x600 [ice]
      [ 6420.845884]  ? __schedule+0x212/0x550
      [ 6420.849550]  process_one_work+0x1e2/0x3b0
      [ 6420.853563]  ? rescuer_thread+0x390/0x390
      [ 6420.857577]  worker_thread+0x50/0x3a0
      [ 6420.861242]  ? rescuer_thread+0x390/0x390
      [ 6420.865253]  kthread+0xdd/0x100
      [ 6420.868400]  ? kthread_complete_and_exit+0x20/0x20
      [ 6420.873194]  ret_from_fork+0x1f/0x30
      [ 6420.876774]  </TASK>
      [ 6420.878967] Modules linked in: vfio_pci vfio_pci_core vfio_iommu_type1 vfio iavf vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables bridge stp llc sctp ip6_udp_tunnel udp_tunnel nfp tls nfnetlink bluetooth mlx4_en mlx4_core rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common i10nm_edac nfit libnvdimm ipmi_ssif x86_pkg_temp_thermal intel_powerclamp coretemp irdma kvm_intel i40e kvm iTCO_wdt dcdbas ib_uverbs irqbypass iTCO_vendor_support mgag200 mei_me ib_core dell_smbios isst_if_mmio isst_if_mbox_pci rapl i2c_algo_bit drm_shmem_helper intel_cstate drm_kms_helper syscopyarea sysfillrect isst_if_common sysimgblt intel_uncore fb_sys_fops dell_wmi_descriptor wmi_bmof intel_vsec mei i2c_i801 acpi_ipmi ipmi_si i2c_smbus ipmi_devintf intel_pch_thermal acpi_power_meter pcspk
       r
      
      Fixes: efe41860 ("ice: Fix memory corruption in VF driver")
      Fixes: f23df522 ("ice: Fix spurious interrupt during removal of trusted VF")
      Signed-off-by: Petr Oros <poros@redhat.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • Revert "ice: Fix ice VF reset during iavf initialization" · 0ecff05e
      Petr Oros authored
      This reverts commit 7255355a.
      
      After this commit we are not able to attach VF to VM:
      virsh attach-interface v0 hostdev --managed 0000:41:01.0 --mac 52:52:52:52:52:52
      error: Failed to attach interface
      error: Cannot set interface MAC to 52:52:52:52:52:52 for ifname enp65s0f0np0 vf 0: Resource temporarily unavailable
      
      ice_check_vf_ready_for_cfg() already waits for the reset to complete.
      The new condition in ice_check_vf_ready_for_reset() only causes problems.
      
      Fixes: 7255355a ("ice: Fix ice VF reset during iavf initialization")
      Signed-off-by: Petr Oros <poros@redhat.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: fix receive buffer size miscalculation · 10083aef
      Jesse Brandeburg authored
      The driver is misconfiguring the hardware for some values of MTU such that
      it could use multiple descriptors to receive a packet when it could have
      simply used one.
      
      Change the driver to use a round-up instead of the result of a shift, as
      the shift can truncate the lower bits of the size, and result in the
      problem noted above. It also aligns this driver with similar code in i40e.
      
      The insidiousness of this problem is that everything works with the wrong
      size, it's just not working as well as it could, as some MTU sizes end up
      using two or more descriptors, and there is no way to tell that is
      happening without looking at ice_trace or a bus analyzer.
      
      Fixes: efc2214b ("ice: Add support for XDP")
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
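The difference between the shift and the round-up described in this commit can be shown with a small user-space sketch. The names `toy_dbuf_shift` and `toy_dbuf_round_up` are hypothetical, and the 128-byte unit (a shift of 7) is an assumption made for illustration. The commit message only states that a shift was replaced by a round-up.

```c
#include <stdint.h>

/* Assumed for illustration: the hardware field counts 128-byte units. */
#define TOY_DBUF_SHIFT 7
#define TOY_DBUF_UNIT  (1u << TOY_DBUF_SHIFT)

/* Old approach: a plain shift drops the low bits, i.e. rounds down. */
static uint32_t toy_dbuf_shift(uint32_t buf_len)
{
	return buf_len >> TOY_DBUF_SHIFT;
}

/* New approach: round up so the programmed size covers buf_len. */
static uint32_t toy_dbuf_round_up(uint32_t buf_len)
{
	return (buf_len + TOY_DBUF_UNIT - 1) / TOY_DBUF_UNIT;
}
```

Under these assumptions, a 3000-byte buffer is programmed as 23 units (2944 bytes) by the shift and as 24 units (3072 bytes) by the round-up. A size that is already a multiple of the unit, such as 2048, gives 16 units either way. That matches the commit's point that only some MTU values are affected.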
  4. Aug 21, 2023
    • wifi: mac80211: limit reorder_buf_filtered to avoid UBSAN warning · b98c1610
      Ping-Ke Shih authored
      Commit 06470f74 ("mac80211: add API to allow filtering frames in BA
      sessions") added reorder_buf_filtered to mark frames filtered by
      firmware. It only works correctly if
      hw.max_rx_aggregation_subframes <= 64, since it stores the bitmap in a
      u64 variable.
      
      However, new HE or EHT devices can support a BlockAck size of up to 256
      or 1024, and using a higher subframe index then leads to a UBSAN warning:
      
       UBSAN: shift-out-of-bounds in net/mac80211/rx.c:1129:39
       shift exponent 215 is too large for 64-bit type 'long long unsigned int'
       Call Trace:
        <IRQ>
        dump_stack_lvl+0x48/0x70
        dump_stack+0x10/0x20
        __ubsan_handle_shift_out_of_bounds+0x1ac/0x360
        ieee80211_release_reorder_frame.constprop.0.cold+0x64/0x69 [mac80211]
        ieee80211_sta_reorder_release+0x9c/0x400 [mac80211]
        ieee80211_prepare_and_rx_handle+0x1234/0x1420 [mac80211]
        ieee80211_rx_list+0xaef/0xf60 [mac80211]
        ieee80211_rx_napi+0x53/0xd0 [mac80211]
      
      Only old hardware that supports a BlockAck size of <= 64 uses
      ieee80211_mark_rx_ba_filtered_frames(), so keep its use limited to that
      case: add a WARN_ONCE() and a comment noting that the function must not
      be used if the hardware capability is not suitable.
      
      Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
      Link: https://lore.kernel.org/r/20230818014004.16177-1-pkshih@realtek.com
      [edit commit message]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
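The hazard behind the UBSAN report can be sketched in user-space C: a bitmap held in a 64-bit variable cannot represent an index of 64 or more, and shifting by that much is undefined behaviour. `toy_mark_filtered` is a hypothetical helper, not the mac80211 function. The commit itself adds a WARN_ONCE() and a comment, whereas this sketch simply refuses out-of-range indexes.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_BITMAP_BITS 64

/*
 * Set bit `index` in a 64-bit "filtered" bitmap.
 * Shifting a 64-bit value by 64 or more is undefined behaviour in C,
 * so out-of-range indexes are refused instead of shifted.
 */
static bool toy_mark_filtered(uint64_t *bitmap, unsigned int index)
{
	if (index >= TOY_BITMAP_BITS)
		return false;
	*bitmap |= UINT64_C(1) << index;
	return true;
}
```

With this guard, an index such as the 215 seen in the report leaves the bitmap untouched and is reported to the caller.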
    • MAINTAINERS: add entry for macsec · d1cdbf66
      Sabrina Dubroca authored
      Jakub asked if I'd be willing to be the maintainer of the macsec code
      and review the driver code adding macsec offload, so let's add the
      corresponding entry.
      
      The keyword lines are meant to catch selftests and patches adding HW
      offload support to other drivers.
      
      Suggested-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. Aug 20, 2023
    • selftests/net: Add log.txt and tools to .gitignore · 144e22e7
      Anh Tuan Phan authored
      Update .gitignore to ignore the tools directory and log.txt. "tools" is
      generated by "selftests/net/Makefile" and log.txt is generated by
      "selftests/net/gro.sh" when executing run_all_tests.
      
      Signed-off-by: Anh Tuan Phan <tuananhlfc@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: fix data-races around inet->inet_id · f866fbc8
      Eric Dumazet authored
      UDP sendmsg() is lockless, so ip_select_ident_segs()
      can very well be run from multiple cpus [1]
      
      Convert inet->inet_id to an atomic_t, but implement
      a dedicated path for TCP, avoiding cost of a locked
      instruction (atomic_add_return())
      
      Note that this patch will cause a trivial merge conflict
      because we added inet->flags in net-next tree.
      
      v2: added missing change in
      drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c
      (David Ahern)
      
      [1]
      
      BUG: KCSAN: data-race in __ip_make_skb / __ip_make_skb
      
      read-write to 0xffff888145af952a of 2 bytes by task 7803 on cpu 1:
      ip_select_ident_segs include/net/ip.h:542 [inline]
      ip_select_ident include/net/ip.h:556 [inline]
      __ip_make_skb+0x844/0xc70 net/ipv4/ip_output.c:1446
      ip_make_skb+0x233/0x2c0 net/ipv4/ip_output.c:1560
      udp_sendmsg+0x1199/0x1250 net/ipv4/udp.c:1260
      inet_sendmsg+0x63/0x80 net/ipv4/af_inet.c:830
      sock_sendmsg_nosec net/socket.c:725 [inline]
      sock_sendmsg net/socket.c:748 [inline]
      ____sys_sendmsg+0x37c/0x4d0 net/socket.c:2494
      ___sys_sendmsg net/socket.c:2548 [inline]
      __sys_sendmmsg+0x269/0x500 net/socket.c:2634
      __do_sys_sendmmsg net/socket.c:2663 [inline]
      __se_sys_sendmmsg net/socket.c:2660 [inline]
      __x64_sys_sendmmsg+0x57/0x60 net/socket.c:2660
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      read to 0xffff888145af952a of 2 bytes by task 7804 on cpu 0:
      ip_select_ident_segs include/net/ip.h:541 [inline]
      ip_select_ident include/net/ip.h:556 [inline]
      __ip_make_skb+0x817/0xc70 net/ipv4/ip_output.c:1446
      ip_make_skb+0x233/0x2c0 net/ipv4/ip_output.c:1560
      udp_sendmsg+0x1199/0x1250 net/ipv4/udp.c:1260
      inet_sendmsg+0x63/0x80 net/ipv4/af_inet.c:830
      sock_sendmsg_nosec net/socket.c:725 [inline]
      sock_sendmsg net/socket.c:748 [inline]
      ____sys_sendmsg+0x37c/0x4d0 net/socket.c:2494
      ___sys_sendmsg net/socket.c:2548 [inline]
      __sys_sendmmsg+0x269/0x500 net/socket.c:2634
      __do_sys_sendmmsg net/socket.c:2663 [inline]
      __se_sys_sendmmsg net/socket.c:2660 [inline]
      __x64_sys_sendmmsg+0x57/0x60 net/socket.c:2660
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      value changed: 0x184d -> 0x184e
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 7804 Comm: syz-executor.1 Not tainted 6.5.0-rc6-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
      ==================================================================
      
      Fixes: 23f57406 ("ipv4: avoid using shared IP generator for connected sockets")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
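The two paths described in this commit (an atomic read-modify-write for lockless senders, and a cheaper path when the socket lock is already held) can be sketched with C11 atomics. This is a user-space illustration only. `toy_sock` and `toy_select_ident` are hypothetical names, the `lock_held` flag stands in for "this is the TCP path", and the choice to return the first reserved ID on both paths is made for this sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-socket IP ID counter. */
struct toy_sock {
	atomic_uint id;
};

/*
 * Reserve `segs` consecutive IDs and return the first one.
 * lock_held == true models the TCP case, where the caller already owns
 * the socket lock, so a plain load followed by a store is enough.
 * Otherwise (lockless UDP-style senders) a single atomic
 * read-modify-write keeps concurrent callers from getting the same ID.
 */
static uint16_t toy_select_ident(struct toy_sock *sk, unsigned int segs,
				 bool lock_held)
{
	unsigned int old;

	if (lock_held) {
		old = atomic_load_explicit(&sk->id, memory_order_relaxed);
		atomic_store_explicit(&sk->id, old + segs,
				      memory_order_relaxed);
	} else {
		old = atomic_fetch_add_explicit(&sk->id, segs,
						memory_order_relaxed);
	}
	return (uint16_t)old;
}
```

The separate load and store are only safe because the lock excludes other writers. Without the lock, two callers could read the same value, which is the race KCSAN reported.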
    • net: validate veth and vxcan peer ifindexes · f534f658
      Jakub Kicinski authored
      veth and vxcan need to make sure the ifindexes of the peer
      are not negative; the core does not validate this.
      
      Using iproute2 with user-space-level checking removed:
      
      Before:
      
        # ./ip link add index 10 type veth peer index -1
        # ip link show
        1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
          link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
          link/ether 52:54:00:74:b2:03 brd ff:ff:ff:ff:ff:ff
        10: veth1@veth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
          link/ether 8a:90:ff:57:6d:5d brd ff:ff:ff:ff:ff:ff
        -1: veth0@veth1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
          link/ether ae:ed:18:e6:fa:7f brd ff:ff:ff:ff:ff:ff
      
      Now:
      
        $ ./ip link add index 10 type veth peer index -1
        Error: ifindex can't be negative.
      
      This problem surfaced in net-next because an explicit WARN()
      was added; the root cause is older.
      
      Fixes: e6f8f1a7 ("veth: Allow to create peer link with given ifindex")
      Fixes: a8f820a3 ("can: add Virtual CAN Tunnel driver (vxcan)")
      Reported-by: syzbot+5ba06978f34abb058571@syzkaller.appspotmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'fixed_phy_register-return-value' · c727c6f7
      David S. Miller authored
      Ruan Jinjie says:
      
      ====================
      net: Fix return value check for fixed_phy_register()
      
      The fixed_phy_register() function returns error pointers and never
      returns NULL. Update the checks accordingly.
      
      Changes in v3:
      - Drop the error fix patch for fixed_phy_get_gpiod().
      - Split the error code update code into another patch set as suggested.
      - Update the commit title and message.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
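The error-pointer convention this series relies on can be sketched in user-space C. The helpers below imitate the kernel's ERR_PTR(), IS_ERR() and PTR_ERR() but are hypothetical re-implementations. `toy_register` is a made-up stand-in for a function that, like fixed_phy_register(), never returns NULL, and `-ENODEV` is an arbitrary error chosen for the example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* User-space imitations of the kernel's error-pointer helpers. */
static void *toy_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static bool toy_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

static long toy_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Hypothetical registration function: never returns NULL. */
static int toy_device;

static void *toy_register(bool fail)
{
	return fail ? toy_err_ptr(-ENODEV) : (void *)&toy_device;
}

/* Correct caller: test with toy_is_err(), not against NULL. */
static int toy_probe(bool fail)
{
	void *phy = toy_register(fail);

	if (toy_is_err(phy))
		return (int)toy_ptr_err(phy);
	return 0;
}
```

A caller that only tests `!phy` would treat the failing case as success, because the encoded error is a non-NULL pointer. That is the class of bug the two patches in this series correct.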
    • net: bcmgenet: Fix return value check for fixed_phy_register() · 32bbe64a
      Ruan Jinjie authored
      The fixed_phy_register() function returns error pointers and never
      returns NULL. Update the checks accordingly.
      
      Fixes: b0ba512e ("net: bcmgenet: enable driver to work without a device tree")
      Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Acked-by: Doug Berger <opendmb@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bgmac: Fix return value check for fixed_phy_register() · 23a14488
      Ruan Jinjie authored
      The fixed_phy_register() function returns error pointers and never
      returns NULL. Update the checks accordingly.
      
      Fixes: c25b23b8 ("bgmac: register fixed PHY for ARM BCM470X / BCM5301X chipsets")
      Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: Fix deadlocking in phy_error() invocation · a0e026e7
      Serge Semin authored
      Since commit 91a7cda1 ("net: phy: Fix race condition on link status
      change"), phy_error() invocations have been causing a nested-mutex-lock
      deadlock, because phy_error() is normally called from the PHY-driver
      threaded IRQ handlers, which since that change run with the
      phydev->lock mutex held. Here is the call chain:
      
      IRQ: phy_interrupt()
           +-> mutex_lock(&phydev->lock); <--------------------+
               drv->handle_interrupt()                         | Deadlock due
               +-> ERROR: phy_error()                          + to the nested
                          +-> phy_process_error()              | mutex lock
                              +-> mutex_lock(&phydev->lock); <-+
                                  phydev->state = PHY_ERROR;
                                  mutex_unlock(&phydev->lock);
               mutex_unlock(&phydev->lock);
      
      The problem can be easily reproduced just by calling phy_error() from any
      PHY-device threaded interrupt handler. Fix it by dropping the phydev->lock
      mutex lock from the phy_process_error() method and printing a nasty error
      message to the system log if the mutex isn't held in the caller execution
      context.
      
      Note for the fix to work correctly in the PHY-subsystem itself the
      phydev->lock mutex locking must be added to the phy_error_precise()
      function.
      
      Link: https://lore.kernel.org/netdev/20230816180944.19262-1-fancer.lancer@gmail.com
      Fixes: 91a7cda1 ("net: phy: Fix race condition on link status change")
      Suggested-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
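The shape of this fix can be modelled with a toy, single-threaded sketch in C. Everything here is hypothetical: `toy_mutex` is just a flag standing in for a non-recursive mutex, and `toy_process_error` and `toy_interrupt` only mirror the roles of phy_process_error() and phy_interrupt() as described in the commit message. The point is that the error path no longer takes the lock itself and instead complains if the caller does not hold it.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy non-recursive lock: a second acquisition by the holder would hang. */
struct toy_mutex {
	bool held;
};

enum toy_state { TOY_RUNNING, TOY_ERROR };

struct toy_phy {
	struct toy_mutex lock;
	enum toy_state state;
};

/* Returns false where a real non-recursive mutex would deadlock. */
static bool toy_mutex_lock(struct toy_mutex *m)
{
	if (m->held)
		return false;
	m->held = true;
	return true;
}

static void toy_mutex_unlock(struct toy_mutex *m)
{
	m->held = false;
}

/* Fixed error path: expects the caller to hold the lock, never takes it. */
static void toy_process_error(struct toy_phy *phy)
{
	if (!phy->lock.held)
		fprintf(stderr, "toy_process_error: called without lock held\n");
	phy->state = TOY_ERROR;
}

/* Interrupt path: takes the lock once, then reports an error under it. */
static bool toy_interrupt(struct toy_phy *phy)
{
	if (!toy_mutex_lock(&phy->lock))
		return false;
	toy_process_error(phy);
	toy_mutex_unlock(&phy->lock);
	return true;
}
```

If `toy_process_error` called `toy_mutex_lock` itself, as the pre-fix code did with the real mutex, the second acquisition inside `toy_interrupt` would fail in this model and hang in the kernel.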
    • net: sfp: handle 100G/25G active optical cables in sfp_parse_support · db1a6ad7
      Josua Mayer authored
      Handle extended compliance code 0x1 (SFF8024_ECC_100G_25GAUI_C2M_AOC)
      for active optical cables supporting 25G and 100G speeds.
      
      Since the specification makes no statement about transmitter range, and
      the specific SFP module that was tested features only 2 m of fiber,
      short-range (SR) modes are selected.
      
      The 100G speed is irrelevant because it would require multiple fibers /
      multiple SFP28 modules combined under one netdev.
      sfp-bus.c only handles a single module per netdev, so only 25Gbps modes
      are selected.
      
      sfp_parse_support already handles SFF8024_ECC_100GBASE_SR4_25GBASE_SR
      with compatible properties, however that entry is a contradiction in
      itself since with SFP(28) 100GBASE_SR4 is impossible - that would likely
      be a mode for qsfp modules only.
      
      Add a case for SFF8024_ECC_100G_25GAUI_C2M_AOC selecting 25gbase-r
      interface mode and 25000baseSR link mode.
      Also enforce SFP28 bitrate limits on the values read from sfp eeprom as
      requested by Russell King.
      
      Tested with fs.com S28-AO02 AOC SFP28 module.
      
      Signed-off-by: Josua Mayer <josua@solid-run.com>
      Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. Aug 19, 2023