  1. Jul 18, 2023
    • vrf: Fix lockdep splat in output path · 2033ab90
      Ido Schimmel authored
      Cited commit converted the neighbour code to use the standard RCU
      variant instead of the RCU-bh variant, but the VRF code still uses
      rcu_read_lock_bh() / rcu_read_unlock_bh() around the neighbour lookup
      code in its IPv4 and IPv6 output paths, resulting in lockdep splats
      [1][2]. Can be reproduced using [3].
      
      Fix by switching to rcu_read_lock() / rcu_read_unlock().
      
      [1]
      =============================
      WARNING: suspicious RCU usage
      6.5.0-rc1-custom-g9c099e6dbf98 #403 Not tainted
      -----------------------------
      include/net/neighbour.h:302 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      2 locks held by ping/183:
       #0: ffff888105ea1d80 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0xc6c/0x33c0
       #1: ffffffff85b46820 (rcu_read_lock_bh){....}-{1:2}, at: vrf_output+0x2e3/0x2030
      
      stack backtrace:
      CPU: 0 PID: 183 Comm: ping Not tainted 6.5.0-rc1-custom-g9c099e6dbf98 #403
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
      Call Trace:
       <TASK>
       dump_stack_lvl+0xc1/0xf0
       lockdep_rcu_suspicious+0x211/0x3b0
       vrf_output+0x1380/0x2030
       ip_push_pending_frames+0x125/0x2a0
       raw_sendmsg+0x200d/0x33c0
       inet_sendmsg+0xa2/0xe0
       __sys_sendto+0x2aa/0x420
       __x64_sys_sendto+0xe5/0x1c0
       do_syscall_64+0x38/0x80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      [2]
      =============================
      WARNING: suspicious RCU usage
      6.5.0-rc1-custom-g9c099e6dbf98 #403 Not tainted
      -----------------------------
      include/net/neighbour.h:302 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      2 locks held by ping6/182:
       #0: ffff888114b63000 (sk_lock-AF_INET6){+.+.}-{0:0}, at: rawv6_sendmsg+0x1602/0x3e50
       #1: ffffffff85b46820 (rcu_read_lock_bh){....}-{1:2}, at: vrf_output6+0xe9/0x1310
      
      stack backtrace:
      CPU: 0 PID: 182 Comm: ping6 Not tainted 6.5.0-rc1-custom-g9c099e6dbf98 #403
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
      Call Trace:
       <TASK>
       dump_stack_lvl+0xc1/0xf0
       lockdep_rcu_suspicious+0x211/0x3b0
       vrf_output6+0xd32/0x1310
       ip6_local_out+0xb4/0x1a0
       ip6_send_skb+0xbc/0x340
       ip6_push_pending_frames+0xe5/0x110
       rawv6_sendmsg+0x2e6e/0x3e50
       inet_sendmsg+0xa2/0xe0
       __sys_sendto+0x2aa/0x420
       __x64_sys_sendto+0xe5/0x1c0
       do_syscall_64+0x38/0x80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      [3]
      #!/bin/bash
      
      ip link add name vrf-red up numtxqueues 2 type vrf table 10
      ip link add name swp1 up master vrf-red type dummy
      ip address add 192.0.2.1/24 dev swp1
      ip address add 2001:db8:1::1/64 dev swp1
      ip neigh add 192.0.2.2 lladdr 00:11:22:33:44:55 nud perm dev swp1
      ip neigh add 2001:db8:1::2 lladdr 00:11:22:33:44:55 nud perm dev swp1
      ip vrf exec vrf-red ping 192.0.2.2 -c 1 &> /dev/null
      ip vrf exec vrf-red ping6 2001:db8:1::2 -c 1 &> /dev/null
      
      Fixes: 09eed119 ("neighbour: switch to standard rcu, instead of rcu_bh")
      Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Link: https://lore.kernel.org/netdev/CA+G9fYtEr-=GbcXNDYo3XOkwR+uYgehVoDjsP0pFLUpZ_AZcyg@mail.gmail.com/
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20230715153605.4068066-1-idosch@nvidia.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 03803083
      Paolo Abeni authored
      
      
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2023-07-14 (ice)
      
      This series contains updates to ice driver only.
      
      Petr Oros removes multiple calls made to unregister netdev and
      devlink_port.
      
      Michal fixes null pointer dereference that can occur during reload.
      ====================
      
      Link: https://lore.kernel.org/r/20230714201041.1717834-1-anthony.l.nguyen@intel.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
  2. Jul 17, 2023
  3. Jul 15, 2023
    • Merge branch 'net-fix-kernel-doc-problems-in-include-net' · 0dd1805f
      Jakub Kicinski authored
      
      
      Randy Dunlap says:
      
      ====================
      net: fix kernel-doc problems in include/net/
      
      Fix many (but not all) kernel-doc warnings in include/net/.
      
       [PATCH v2 net 1/9] net: bonding: remove kernel-doc comment marker
       [PATCH v2 net 2/9] net: cfg802154: fix kernel-doc notation warnings
       [PATCH v2 net 3/9] codel: fix kernel-doc notation warnings
       [PATCH v2 net 4/9] devlink: fix kernel-doc notation warnings
       [PATCH v2 net 5/9] inet: frags: remove kernel-doc comment marker
       [PATCH v2 net 6/9] net: llc: fix kernel-doc notation warnings
       [PATCH v2 net 7/9] net: NSH: fix kernel-doc notation warning
       [PATCH v2 net 8/9] pie: fix kernel-doc notation warning
       [PATCH v2 net 9/9] rsi: remove kernel-doc comment marker
      ====================
      
      Link: https://lore.kernel.org/r/20230714045127.18752-1-rdunlap@infradead.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • rsi: remove kernel-doc comment marker · 04be3c95
      Randy Dunlap authored
      Change an errant kernel-doc comment marker (/**) to a regular
      comment to prevent a kernel-doc warning.
      
      rsi_91x.h:3: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
       * Copyright (c) 2017 Redpine Signals Inc.
      
      Fixes: 4c10d56a ("rsi: add header file rsi_91x")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Prameela Rani Garnepudi <prameela.j04cs@gmail.com>
      Cc: Siva Rebbagondla <siva.rebbagondla@redpinesignals.com>
      Acked-by: Kalle Valo <kvalo@kernel.org>
      Link: https://lore.kernel.org/r/20230714045127.18752-10-rdunlap@infradead.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • pie: fix kernel-doc notation warning · d1cca974
      Randy Dunlap authored
      Spell a struct member's name correctly to prevent a kernel-doc
      warning.
      
      pie.h:38: warning: Function parameter or member 'tupdate' not described in 'pie_params'
      
      Fixes: b42a3d7c ("pie: improve comments and commenting style")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Leslie Monis <lesliemonis@gmail.com>
      Cc: "Mohit P. Tahiliani" <tahiliani@nitk.edu.in>
      Cc: Gautam Ramakrishnan <gautamramk@gmail.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Link: https://lore.kernel.org/r/20230714045127.18752-9-rdunlap@infradead.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: NSH: fix kernel-doc notation warning · d1533d72
      Randy Dunlap authored
      Use the struct member's name and the correct format to prevent a
      kernel-doc warning.
      
      nsh.h:200: warning: Function parameter or member 'context' not described in 'nsh_md1_ctx'
      
      Fixes: 1f0b7744 ("net: add NSH header structures and helpers")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Jiri Benc <jbenc@redhat.com>
      Link: https://lore.kernel.org/r/20230714045127.18752-8-rdunlap@infradead.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: llc: fix kernel-doc notation warnings · 201a0883
      Randy Dunlap authored
      Use the correct function parameter name or format to prevent
      kernel-doc warnings.
      Add 2 function parameter descriptions to prevent kernel-doc warnings.
      
      llc_pdu.h:278: warning: Function parameter or member 'da' not described in 'llc_pdu_decode_da'
      llc_pdu.h:278: warning: Excess function parameter 'sa' description in 'llc_pdu_decode_da'
      llc_pdu.h:330: warning: Function parameter or member 'skb' not described in 'llc_pdu_init_as_test_cmd'
      llc_pdu.h:379: warning: Function parameter or member 'svcs_supported' not described in 'llc_pdu_init_as_xid_cmd'
      llc_pdu.h:379: warning: Function parameter or member 'rx_window' not described in 'llc_pdu_init_as_xid_cmd'
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Link: https://lore.kernel.org/r/20230714045127.18752-7-rdunlap@infradead.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • inet: frags: eliminate kernel-doc warning · d20909a0
      Randy Dunlap authored
      Modify the anonymous enum kernel-doc content so that it doesn't cause
      a kernel-doc warning.
      
      inet_frag.h:33: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
      
      Fixes: 1ab1934e ("inet: frags: enum the flag definitions and add descriptions")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Nikolay Aleksandrov <razor@blackwall.org>
      Link: https://lore.kernel.org/r/20230714045127.18752-6-rdunlap@infradead.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • devlink: fix kernel-doc notation warnings · 839f55c5
      Randy Dunlap authored
      Spell function or struct member names correctly.
      Use ':' instead of '-' for struct member entries.
      Mark one field as private in kernel-doc.
      Add a few entries that were missing.
      Fix a typo.
      
      These changes prevent kernel-doc warnings:
      
      devlink.h:252: warning: Function parameter or member 'field_id' not described in 'devlink_dpipe_match'
      devlink.h:267: warning: Function parameter or member 'field_id' not described in 'devlink_dpipe_action'
      devlink.h:310: warning: Function parameter or member 'match_values_count' not described in 'devlink_dpipe_entry'
      devlink.h:355: warning: Function parameter or member 'list' not described in 'devlink_dpipe_table'
      devlink.h:374: warning: Function parameter or member 'actions_dump' not described in 'devlink_dpipe_table_ops'
      devlink.h:374: warning: Function parameter or member 'matches_dump' not described in 'devlink_dpipe_table_ops'
      devlink.h:374: warning: Function parameter or member 'entries_dump' not described in 'devlink_dpipe_table_ops'
      devlink.h:374: warning: Function parameter or member 'counters_set_update' not described in 'devlink_dpipe_table_ops'
      devlink.h:374: warning: Function parameter or member 'size_get' not described in 'devlink_dpipe_table_ops'
      devlink.h:384: warning: Function parameter or member 'headers' not described in 'devlink_dpipe_headers'
      devlink.h:384: warning: Function parameter or member 'headers_count' not described in 'devlink_dpipe_headers'
      devlink.h:398: warning: Function parameter or member 'unit' not described in 'devlink_resource_size_params'
      devlink.h:487: warning: Function parameter or member 'id' not described in 'devlink_param'
      devlink.h:645: warning: Function parameter or member 'overwrite_mask' not described in 'devlink_flash_update_params'
      
      Fixes: 1555d204 ("devlink: Support for pipeline debug (dpipe)")
      Fixes: d9f9b9a4 ("devlink: Add support for resource abstraction")
      Fixes: eabaef18 ("devlink: Add devlink_param register and unregister")
      Fixes: 5d5b4128 ("devlink: introduce flash update overwrite mask")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Cc: Moshe Shemesh <moshe@mellanox.com>
      Cc: Jacob Keller <jacob.e.keller@intel.com>
      Link: https://lore.kernel.org/r/20230714045127.18752-5-rdunlap@infradead.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • codel: fix kernel-doc notation warnings · cfe57122
      Randy Dunlap authored
      Use '@' before the struct member names in kernel-doc notation
      to prevent kernel-doc warnings.
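
For illustration only, a small sketch of the notation in question. The struct and helper below are hypothetical (they are not the codel definitions); they only show the '@member:' form that kernel-doc expects for struct members and function parameters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/**
 * struct demo_stats - marking counters (hypothetical example struct)
 * @ecn_mark: number of packets that were ECN-marked
 * @ce_mark:  number of packets that were CE-marked
 *
 * Each member line starts with '@', then the exact member name and a
 * colon. A line without the '@' is not matched to the member, so the
 * member is reported as "not described".
 */
struct demo_stats {
    uint32_t ecn_mark;
    uint32_t ce_mark;
};

/**
 * demo_stats_total() - sum both marking counters
 * @st: statistics to read; must not be NULL
 *
 * Return: total number of marked packets.
 */
uint32_t demo_stats_total(const struct demo_stats *st)
{
    assert(st != NULL);
    return st->ecn_mark + st->ce_mark;
}
```

The same '@name:' form applies to function parameters, as the helper's comment shows.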
      
      codel.h:158: warning: Function parameter or member 'ecn_mark' not described in 'codel_stats'
      codel.h:158: warning: Function parameter or member 'ce_mark' not described in 'codel_stats'
      
      Fixes: 76e3cc12 ("codel: Controlled Delay AQM")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Cc: Dave Taht <dave.taht@bufferbloat.net>
      Link: https://lore.kernel.org/r/20230714045127.18752-4-rdunlap@infradead.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: cfg802154: fix kernel-doc notation warnings · a63e4044
      Randy Dunlap authored
      Add an enum heading to the kernel-doc comments to prevent
      kernel-doc warnings.
      
      cfg802154.h:174: warning: Cannot understand  * @WPAN_PHY_FLAG_TRANSMIT_POWER: Indicates that transceiver will support
       on line 174 - I thought it was a doc line
      
      cfg802154.h:192: warning: Enum value 'WPAN_PHY_FLAG_TXPOWER' not described in enum 'wpan_phy_flags'
      cfg802154.h:192: warning: Excess enum value 'WPAN_PHY_FLAG_TRANSMIT_POWER' description in 'wpan_phy_flags'
      
      Fixes: edea8f7c ("cfg802154: introduce wpan phy flags")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Alexander Aring <alex.aring@gmail.com>
      Cc: Stefan Schmidt <stefan@datenfreihafen.org>
      Cc: Marcel Holtmann <marcel@holtmann.org>
      Acked-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Link: https://lore.kernel.org/r/20230714045127.18752-3-rdunlap@infradead.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: bonding: remove kernel-doc comment marker · a66557c7
      Randy Dunlap authored
      Change an errant kernel-doc comment marker (/**) to a regular
      comment to prevent a kernel-doc warning.
      
      bonding.h:282: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
       * Returns NULL if the net_device does not belong to any of the bond's slaves
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Jay Vosburgh <j.vosburgh@gmail.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Link: https://lore.kernel.org/r/20230714045127.18752-2-rdunlap@infradead.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ice: prevent NULL pointer deref during reload · b3e7b3a6
      Michal Swiatkowski authored
      Calling ethtool during reload can lead to a call trace, because the VSI
      isn't configured for some time while the netdev is still alive.
      
      To fix it, add the rtnl lock around VSI deconfig and config. Set
      ::num_q_vectors to 0 after freeing and add a check for ::tx/rx_rings in
      ring-related ethtool ops.
      
      Add proper unroll of filters in ice_start_eth().
      
      Reproduction:
      $watch -n 0.1 -d 'ethtool -g enp24s0f0np0'
      $devlink dev reload pci/0000:18:00.0 action driver_reinit
      
      Call trace before fix:
      [66303.926205] BUG: kernel NULL pointer dereference, address: 0000000000000000
      [66303.926259] #PF: supervisor read access in kernel mode
      [66303.926286] #PF: error_code(0x0000) - not-present page
      [66303.926311] PGD 0 P4D 0
      [66303.926332] Oops: 0000 [#1] PREEMPT SMP PTI
      [66303.926358] CPU: 4 PID: 933821 Comm: ethtool Kdump: loaded Tainted: G           OE      6.4.0-rc5+ #1
      [66303.926400] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.00.01.0014.070920180847 07/09/2018
      [66303.926446] RIP: 0010:ice_get_ringparam+0x22/0x50 [ice]
      [66303.926649] Code: 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 48 8b 87 c0 09 00 00 c7 46 04 e0 1f 00 00 c7 46 10 e0 1f 00 00 48 8b 50 20 <48> 8b 12 0f b7 52 3a 89 56 14 48 8b 40 28 48 8b 00 0f b7 40 58 48
      [66303.926722] RSP: 0018:ffffad40472f39c8 EFLAGS: 00010246
      [66303.926749] RAX: ffff98a8ada05828 RBX: ffff98a8c46dd060 RCX: ffffad40472f3b48
      [66303.926781] RDX: 0000000000000000 RSI: ffff98a8c46dd068 RDI: ffff98a8b23c4000
      [66303.926811] RBP: ffffad40472f3b48 R08: 00000000000337b0 R09: 0000000000000000
      [66303.926843] R10: 0000000000000001 R11: 0000000000000100 R12: ffff98a8b23c4000
      [66303.926874] R13: ffff98a8c46dd060 R14: 000000000000000f R15: ffffad40472f3a50
      [66303.926906] FS:  00007f6397966740(0000) GS:ffff98b390900000(0000) knlGS:0000000000000000
      [66303.926941] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [66303.926967] CR2: 0000000000000000 CR3: 000000011ac20002 CR4: 00000000007706e0
      [66303.926999] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [66303.927029] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [66303.927060] PKRU: 55555554
      [66303.927075] Call Trace:
      [66303.927094]  <TASK>
      [66303.927111]  ? __die+0x23/0x70
      [66303.927140]  ? page_fault_oops+0x171/0x4e0
      [66303.927176]  ? exc_page_fault+0x7f/0x180
      [66303.927209]  ? asm_exc_page_fault+0x26/0x30
      [66303.927244]  ? ice_get_ringparam+0x22/0x50 [ice]
      [66303.927433]  rings_prepare_data+0x62/0x80
      [66303.927469]  ethnl_default_doit+0xe2/0x350
      [66303.927501]  genl_family_rcv_msg_doit.isra.0+0xe3/0x140
      [66303.927538]  genl_rcv_msg+0x1b1/0x2c0
      [66303.927561]  ? __pfx_ethnl_default_doit+0x10/0x10
      [66303.927590]  ? __pfx_genl_rcv_msg+0x10/0x10
      [66303.927615]  netlink_rcv_skb+0x58/0x110
      [66303.927644]  genl_rcv+0x28/0x40
      [66303.927665]  netlink_unicast+0x19e/0x290
      [66303.927691]  netlink_sendmsg+0x254/0x4d0
      [66303.927717]  sock_sendmsg+0x93/0xa0
      [66303.927743]  __sys_sendto+0x126/0x170
      [66303.927780]  __x64_sys_sendto+0x24/0x30
      [66303.928593]  do_syscall_64+0x5d/0x90
      [66303.929370]  ? __count_memcg_events+0x60/0xa0
      [66303.930146]  ? count_memcg_events.constprop.0+0x1a/0x30
      [66303.930920]  ? handle_mm_fault+0x9e/0x350
      [66303.931688]  ? do_user_addr_fault+0x258/0x740
      [66303.932452]  ? exc_page_fault+0x7f/0x180
      [66303.933193]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      Fixes: 5b246e53 ("ice: split probe into smaller functions")
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: Unregister netdev and devlink_port only once · 24a3298a
      Petr Oros authored
      Since commit 6624e780 ("ice: split ice_vsi_setup into smaller
      functions") ice_vsi_release() does things twice. It unregisters the
      netdev, which is also unregistered in ice_deinit_eth().
      
      It also unregisters the devlink_port twice which is also unregistered
      in ice_deinit_eth(). This double deregistration is hidden because
      devl_port_unregister ignores the return value of xa_erase.
      
      [   68.642167] Call Trace:
      [   68.650385]  ice_devlink_destroy_pf_port+0xe/0x20 [ice]
      [   68.655656]  ice_vsi_release+0x445/0x690 [ice]
      [   68.660147]  ice_deinit+0x99/0x280 [ice]
      [   68.664117]  ice_remove+0x1b6/0x5c0 [ice]
      
      [  171.103841] Call Trace:
      [  171.109607]  ice_devlink_destroy_pf_port+0xf/0x20 [ice]
      [  171.114841]  ice_remove+0x158/0x270 [ice]
      [  171.118854]  pci_device_remove+0x3b/0xc0
      [  171.122779]  device_release_driver_internal+0xc7/0x170
      [  171.127912]  driver_detach+0x54/0x8c
      [  171.131491]  bus_remove_driver+0x77/0xd1
      [  171.135406]  pci_unregister_driver+0x2d/0xb0
      [  171.139670]  ice_module_exit+0xc/0x55f [ice]
      
      Fixes: 6624e780 ("ice: split ice_vsi_setup into smaller functions")
      Signed-off-by: Petr Oros <poros@redhat.com>
      Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  4. Jul 14, 2023
    • gso: fix dodgy bit handling for GSO_UDP_L4 · 98400367
      Yan Zhai authored
      Commit 1fd54773 ("udp: allow header check for dodgy GSO_UDP_L4
      packets.") checks DODGY bit for UDP, but for packets that can be fed
      directly to the device after gso_segs reset, it actually falls through
      to fragmentation:
      
      https://lore.kernel.org/all/CAJPywTKDdjtwkLVUW6LRA2FU912qcDmQOQGt2WaDo28KzYDg+A@mail.gmail.com/
      
      This change restores the expected behavior of GSO_UDP_L4 packets.
      
      Fixes: 1fd54773 ("udp: allow header check for dodgy GSO_UDP_L4 packets.")
      Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
      Signed-off-by: Yan Zhai <yan@cloudflare.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: Remove repeating expression · a822551c
      Wang Ming authored
      
      
      The tests/doublebitand.cocci semantic patch identifies a duplicated
      expression in an if statement. Remove the duplicate.
      
      Signed-off-by: Wang Ming <machel@vivo.com>
      Reviewed-by: Jiawen Wu <jiawenwu@trustnetic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bna: Remove error checking for debugfs_create_dir() · 4ad23d23
      Wang Ming authored
      
      
      It is expected that most callers should _ignore_ the errors returned by
      debugfs_create_dir() in bnad_debugfs_init().
      
      Signed-off-by: Wang Ming <machel@vivo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: mtk_eth_soc: handle probe deferral · 1d6d537d
      Daniel Golle authored
      Move the call to of_get_ethdev_address to mtk_add_mac which is part of
      the probe function and can hence itself return -EPROBE_DEFER should
      of_get_ethdev_address return -EPROBE_DEFER. This allows us to entirely
      get rid of the mtk_init function.
      
      The problem of of_get_ethdev_address returning -EPROBE_DEFER surfaced
      in situations in which the NVMEM provider holding the MAC address has
      not yet been loaded at the time mtk_eth_soc is initially probed. In this
      case probing of mtk_eth_soc should be deferred instead of falling back
      to a random MAC address, so that probing can be repeated once the NVMEM
      provider becomes available.
      
      Fixes: 656e7052 ("net-next: mediatek: add support for MT7623 ethernet")
      Signed-off-by: Daniel Golle <daniel@makrotopia.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: Add extack warning when enabling STP in netns. · 56a16035
      Kuniyuki Iwashima authored
      When we create an L2 loop on a bridge in netns, we will see a packet
      storm even if STP is enabled.
      
        # unshare -n
        # ip link add br0 type bridge
        # ip link add veth0 type veth peer name veth1
        # ip link set veth0 master br0 up
        # ip link set veth1 master br0 up
        # ip link set br0 type bridge stp_state 1
        # ip link set br0 up
        # sleep 30
        # ip -s link show br0
        2: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
            link/ether b6:61:98:1c:1c:b5 brd ff:ff:ff:ff:ff:ff
            RX: bytes  packets  errors  dropped missed  mcast
            956553768  12861249 0       0       0       12861249  <-. Keep
            TX: bytes  packets  errors  dropped carrier collsns     |  increasing
            1027834    11951    0       0       0       0         <-'   rapidly
      
      This is because llc_rcv() drops all packets in non-root netns and BPDU
      is dropped.
      
      Let's add extack warning when enabling STP in netns.
      
        # unshare -n
        # ip link add br0 type bridge
        # ip link set br0 type bridge stp_state 1
        Warning: bridge: STP does not work in non-root netns.
      
      Note this commit will be reverted later when we namespacify the whole LLC
      infra.
      
      Fixes: e730c155 ("[NET]: Make packet reception network namespace safe")
      Suggested-by: Harry Coin <hcoin@quietfountain.com>
      Link: https://lore.kernel.org/netdev/0f531295-e289-022d-5add-5ceffa0df9bc@quietfountain.com/
      Suggested-by: Ido Schimmel <idosch@idosch.org>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field()/cpsw_ale_set_field() · b685f1a5
      Tanmay Patil authored
      CPSW ALE has 75-bit ALE entries which are stored within three 32-bit words.
      The cpsw_ale_get_field() and cpsw_ale_set_field() functions assume that the
      field will be strictly contained within one word. However, this is not
      guaranteed to be the case and it is possible for ALE field entries to span
      up to two words.
      
      Fix the methods to handle getting/setting fields spanning up to two words.
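
A minimal user-space sketch of the idea, not the driver code. It assumes word 0 holds bits 0..31 of the entry, which may not match the word order the hardware or the driver uses; `field_get()`, `field_set()` and `low_mask()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRY_WORDS 3u  /* a 75-bit entry held in three 32-bit words */

/* Mask with the low `bits` bits set; valid for 1 <= bits <= 32. */
static uint32_t low_mask(unsigned bits)
{
    return bits == 32 ? 0xffffffffu : ((1u << bits) - 1u);
}

/* Read a field of `bits` bits starting at bit `start` of the entry. */
uint32_t field_get(const uint32_t *entry, unsigned start, unsigned bits)
{
    unsigned idx = start / 32;    /* word holding the field's lowest bit */
    unsigned shift = start % 32;  /* bit offset inside that word */
    uint32_t val;

    assert(bits >= 1 && bits <= 32 && start + bits <= ENTRY_WORDS * 32);

    val = entry[idx] >> shift;
    if (shift + bits > 32)        /* field continues in the next word */
        val |= entry[idx + 1] << (32 - shift);
    return val & low_mask(bits);
}

/* Write `value` into the same field, leaving all other bits untouched. */
void field_set(uint32_t *entry, unsigned start, unsigned bits, uint32_t value)
{
    unsigned idx = start / 32;
    unsigned shift = start % 32;
    uint32_t mask = low_mask(bits);

    assert(bits >= 1 && bits <= 32 && start + bits <= ENTRY_WORDS * 32);

    value &= mask;
    entry[idx] = (entry[idx] & ~(mask << shift)) | (value << shift);
    if (shift + bits > 32) {
        unsigned spill = 32 - shift;  /* bits that fit in the low word */

        entry[idx + 1] = (entry[idx + 1] & ~(mask >> spill)) |
                         (value >> spill);
    }
}
```

With an 8-bit field starting at bit 28, field_set(e, 28, 8, 0xAB) on a zeroed entry leaves 0xB0000000 in e[0] and 0x0000000A in e[1], and field_get(e, 28, 8) reads 0xAB back.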
      
      Fixes: db82173f ("netdev: driver: ethernet: add cpsw address lookup engine support")
      Signed-off-by: Tanmay Patil <t-patil@ti.com>
      [s-vadapalli@ti.com: rephrased commit message and added Fixes tag]
      Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: ar9331: Use explict flags for regmap single read/write · 9845217d
      Mark Brown authored
      
      
      The ar9331 is only able to read or write a single register at once.  The
      driver has a custom regmap bus and chooses to tell the regmap core about
      this by reporting the maximum transfer sizes rather than the explicit
      flags that exist at the regmap level.  Since there are a number of
      problems with the raw transfer limits and the regmap level flags are
      better integrated anyway, convert the driver to use the flags.
      
      No functional change.
      
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: usbnet: Fix WARNING in usbnet_start_xmit/usb_submit_urb · 5e1627cb
      Alan Stern authored
      
      
      The syzbot fuzzer identified a problem in the usbnet driver:
      
      usb 1-1: BOGUS urb xfer, pipe 3 != type 1
      WARNING: CPU: 0 PID: 754 at drivers/usb/core/urb.c:504 usb_submit_urb+0xed6/0x1880 drivers/usb/core/urb.c:504
      Modules linked in:
      CPU: 0 PID: 754 Comm: kworker/0:2 Not tainted 6.4.0-rc7-syzkaller-00014-g692b7dc87ca6 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023
      Workqueue: mld mld_ifc_work
      RIP: 0010:usb_submit_urb+0xed6/0x1880 drivers/usb/core/urb.c:504
      Code: 7c 24 18 e8 2c b4 5b fb 48 8b 7c 24 18 e8 42 07 f0 fe 41 89 d8 44 89 e1 4c 89 ea 48 89 c6 48 c7 c7 a0 c9 fc 8a e8 5a 6f 23 fb <0f> 0b e9 58 f8 ff ff e8 fe b3 5b fb 48 81 c5 c0 05 00 00 e9 84 f7
      RSP: 0018:ffffc9000463f568 EFLAGS: 00010086
      RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
      RDX: ffff88801eb28000 RSI: ffffffff814c03b7 RDI: 0000000000000001
      RBP: ffff8881443b7190 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000003
      R13: ffff88802a77cb18 R14: 0000000000000003 R15: ffff888018262500
      FS:  0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000556a99c15a18 CR3: 0000000028c71000 CR4: 0000000000350ef0
      Call Trace:
       <TASK>
       usbnet_start_xmit+0xfe5/0x2190 drivers/net/usb/usbnet.c:1453
       __netdev_start_xmit include/linux/netdevice.h:4918 [inline]
       netdev_start_xmit include/linux/netdevice.h:4932 [inline]
       xmit_one net/core/dev.c:3578 [inline]
       dev_hard_start_xmit+0x187/0x700 net/core/dev.c:3594
      ...
      
      This bug is caused by the fact that usbnet trusts the bulk endpoint
      addresses its probe routine receives in the driver_info structure, and
      it does not check to see that these endpoints actually exist and have
      the expected type and directions.
      
      The fix is simply to add such a check.
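
A rough user-space sketch of what such a check could look like. It is not the usbnet code: `struct ep_desc`, `has_bulk_ep()` and `bulk_pair_ok()` are hypothetical, and only the descriptor bit layout (direction in bit 7 of bEndpointAddress, transfer type in the low two bits of bmAttributes, bulk = 2) follows the USB endpoint descriptor format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EP_DIR_IN        0x80u  /* bEndpointAddress bit 7: 1 = IN */
#define EP_NUM_MASK      0x0fu  /* bEndpointAddress bits 0..3 */
#define EP_XFERTYPE_MASK 0x03u  /* bmAttributes bits 0..1 */
#define EP_XFER_BULK     0x02u

/* Hypothetical, reduced endpoint descriptor. */
struct ep_desc {
    uint8_t bEndpointAddress;
    uint8_t bmAttributes;
};

/* True if endpoint `num` exists with the given direction and is bulk. */
static bool has_bulk_ep(const struct ep_desc *eps, size_t n,
                        uint8_t num, bool in)
{
    for (size_t i = 0; i < n; i++) {
        bool ep_in = (eps[i].bEndpointAddress & EP_DIR_IN) != 0;

        if ((eps[i].bEndpointAddress & EP_NUM_MASK) != num || ep_in != in)
            continue;
        return (eps[i].bmAttributes & EP_XFERTYPE_MASK) == EP_XFER_BULK;
    }
    return false;  /* no such endpoint at all */
}

/* Validate the bulk IN/OUT endpoint numbers a driver was told to use. */
bool bulk_pair_ok(const struct ep_desc *eps, size_t n,
                  uint8_t in_num, uint8_t out_num)
{
    assert(eps != NULL || n == 0);
    return has_bulk_ep(eps, n, in_num, true) &&
           has_bulk_ep(eps, n, out_num, false);
}
```

A probe routine in this style would refuse to bind when bulk_pair_ok() returns false, instead of submitting URBs to an endpoint of the wrong type.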
      
      Reported-and-tested-by: <syzbot+63ee658b9a100ffadbe2@syzkaller.appspotmail.com>
      Closes: https://lore.kernel.org/linux-usb/000000000000a56e9105d0cec021@google.com/
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      CC: Oliver Neukum <oneukum@suse.com>
      Link: https://lore.kernel.org/r/ea152b6d-44df-4f8a-95c6-4db51143dcc1@rowland.harvard.edu
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • dsa: mv88e6xxx: Do a final check before timing out · 95ce158b
      Linus Walleij authored
      
      
      I get sporadic timeouts from the driver when using the
      MV88E6352. Reading the status again after the loop fixes the
      problem: the operation is successful but goes undetected.
      
      Some added prints show things like this:
      
      [   58.356209] mv88e6085 mdio_mux-0.1:00: Timeout while waiting
          for switch, addr 1b reg 0b, mask 8000, val 0000, data c000
      [   58.367487] mv88e6085 mdio_mux-0.1:00: Timeout waiting for
          ATU op 4000, fid 0001
      (...)
      [   61.826293] mv88e6085 mdio_mux-0.1:00: Timeout while waiting
          for switch, addr 1c reg 18, mask 8000, val 0000, data 9860
      [   61.837560] mv88e6085 mdio_mux-0.1:00: Timeout waiting
          for PHY command 1860 to complete
      
      The reason is probably not the commands: I think those are
      mostly fine with the 50+50ms timeout, but the problem
      appears when OpenWrt brings up several interfaces in
      parallel on a system with 7 populated ports: if one of
      them takes more than 50 ms and waits, one or more of the
      others can get stuck on the mutex for the switch, and
      this can easily multiply.
      
      As we sleep and wait, the function needs a final check
      after exiting the loop to see whether we were successful.
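
The pattern can be sketched in user space as follows. This is an illustration under stated assumptions, not the driver code: `wait_mask()`, `read_reg_fn`, `struct fake_switch` and `fake_read()` are hypothetical, and a poll count stands in for the jiffies-based deadline.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical register accessor: returns 0 on success and fills *data. */
typedef int (*read_reg_fn)(void *ctx, uint16_t *data);

/*
 * Poll until (data & mask) == val. The loop body may sleep past the
 * deadline, so the register is read one more time after the loop before
 * a timeout is reported.
 */
static int wait_mask(read_reg_fn read_reg, void *ctx,
                     uint16_t mask, uint16_t val, int max_polls)
{
    uint16_t data;
    int err;

    for (int i = 0; i < max_polls; i++) {
        err = read_reg(ctx, &data);
        if (err)
            return err;
        if ((data & mask) == val)
            return 0;
        /* A real driver would sleep here. */
    }

    /* Final check: the operation may have completed during the last sleep. */
    err = read_reg(ctx, &data);
    if (err)
        return err;
    if ((data & mask) == val)
        return 0;

    return -ETIMEDOUT;
}

/* Simulated device: the busy bit (0x8000) clears after `ready_after` reads. */
struct fake_switch {
    int reads;
    int ready_after;
};

static int fake_read(void *ctx, uint16_t *data)
{
    struct fake_switch *sw = ctx;

    sw->reads++;
    *data = (sw->reads > sw->ready_after) ? 0x0000 : 0x8000;
    return 0;
}
```

With `ready_after` equal to `max_polls`, every read inside the loop still sees the busy bit, and only the read after the loop observes completion.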
      
      Suggested-by: Andrew Lunn <andrew@lunn.ch>
      Cc: Tobias Waldekranz <tobias@waldekranz.com>
      Fixes: 35da1dfd ("net: dsa: mv88e6xxx: Improve performance of busy bit polling")
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20230712223405.861899-1-linus.walleij@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      95ce158b
    • Linus Torvalds's avatar
      Merge tag 'net-6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · b1983d42
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from netfilter, wireless and ebpf.
      
        Current release - regressions:
      
         - netfilter: conntrack: gre: don't set assured flag for clash entries
      
         - wifi: iwlwifi: remove 'use_tfh' config to fix crash
      
        Previous releases - regressions:
      
         - ipv6: fix a potential refcount underflow for idev
      
         - icmp6: fix null-ptr-deref of ip6_null_entry->rt6i_idev in
           icmp6_dev()
      
         - bpf: fix max stack depth check for async callbacks
      
         - eth: mlx5e:
            - check for NOT_READY flag state after locking
            - fix page_pool page fragment tracking for XDP
      
         - eth: igc:
            - fix tx hang issue when QBV gate is closed
            - fix corner cases for TSN offload
      
         - eth: octeontx2-af: Move validation of ptp pointer before its usage
      
         - eth: ena: fix shift-out-of-bounds in exponential backoff
      
        Previous releases - always broken:
      
         - core: prevent skb corruption on frag list segmentation
      
         - sched:
            - cls_fw: fix improper refcount update leads to use-after-free
            - sch_qfq: account for stab overhead in qfq_enqueue
      
         - netfilter:
            - report use refcount overflow
            - prevent OOB access in nft_byteorder_eval
      
         - wifi: mt7921e: fix init command fail with enabled device
      
         - eth: ocelot: fix oversize frame dropping for preemptible TCs
      
         - eth: fec: recycle pages for transmitted XDP frames"
      
      * tag 'net-6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (79 commits)
        selftests: tc-testing: add test for qfq with stab overhead
        net/sched: sch_qfq: account for stab overhead in qfq_enqueue
        selftests: tc-testing: add tests for qfq mtu sanity check
        net/sched: sch_qfq: reintroduce lmax bound check for MTU
        wifi: cfg80211: fix receiving mesh packets without RFC1042 header
        wifi: rtw89: debug: fix error code in rtw89_debug_priv_send_h2c_set()
        net: txgbe: fix eeprom calculation error
        net/sched: make psched_mtu() RTNL-less safe
        net: ena: fix shift-out-of-bounds in exponential backoff
        netdevsim: fix uninitialized data in nsim_dev_trap_fa_cookie_write()
        net/sched: flower: Ensure both minimum and maximum ports are specified
        MAINTAINERS: Add another mailing list for QUALCOMM ETHQOS ETHERNET DRIVER
        docs: netdev: update the URL of the status page
        wifi: iwlwifi: remove 'use_tfh' config to fix crash
        xdp: use trusted arguments in XDP hints kfuncs
        bpf: cpumap: Fix memory leak in cpu_map_update_elem
        wifi: airo: avoid uninitialized warning in airo_get_rate()
        octeontx2-pf: Add additional check for MCAM rules
        net: dsa: Removed unneeded of_node_put in felix_parse_ports_node
        net: fec: use netdev_err_once() instead of netdev_err()
        ...
      b1983d42
    • Linus Torvalds's avatar
      Merge tag 'trace-v6.5-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · ebc27aac
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
      
       - Fix some missing-prototype warnings
      
       - Fix user events struct args (did not include size of struct)
      
         When creating a user event, the "struct" keyword is to denote that
         the size of the field will be passed in. But the parsing failed to
         handle this case.
      
       - Add selftest to struct sizes for user events
      
       - Fix sample code for direct trampolines.
      
         The sample code for direct trampolines attached to handle_mm_fault().
         But the prototype changed and the direct trampoline sample code was
         not updated. Direct trampolines need to have the correct arguments,
         otherwise they can fail or crash the system.
      
       - Remove unused ftrace_regs_caller_ret() prototype.
      
       - Quiet false positive of FORTIFY_SOURCE
      
         Due to backward compatibility, the structure used to save stack
         traces in the kernel had a fixed size of 8. This structure is
         exported to user space via the tracing format file. A change was made
         to allow more than 8 functions to be recorded, and user space now
         uses the size field to know how many functions are actually in the
         stack.
      
         But the structure still has a size of 8 (even though it points into
         the ring buffer that has the required amount allocated to hold a
         full stack).
      
         This was fine until the fortifier noticed that the
         memcpy(&entry->caller, stack, size) was greater than the 8 functions
         and would complain at runtime about it.
      
         Hide this by using a pointer to the stack location on the ring buffer
         instead of using the address of the entry structure caller field.
      
       - Fix a deadloop in reading trace_pipe that was caused by a mismatch:
         ring_buffer_empty() returned false, which then asked to read the
         data, but the read code uses rb_num_of_entries(), which returned
         zero, causing an infinite "retry".
      
       - Fix a warning caused by not using all pages allocated to store ftrace
         functions, where this can happen if the linker inserts a bunch of
         "NULL" entries, causing the accounting of how many pages needed to be
         off.
      
       - Fix histogram synthetic event crashing when the start event is
         removed and the end event is still using a variable from it
      
       - Fix memory leak in freeing iter->temp in tracing_release_pipe()
      
      * tag 'trace-v6.5-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        tracing: Fix memory leak of iter->temp when reading trace_pipe
        tracing/histograms: Add histograms to hist_vars if they have referenced variables
        tracing: Stop FORTIFY_SOURCE complaining about stack trace caller
        ftrace: Fix possible warning on checking all pages used in ftrace_process_locs()
        ring-buffer: Fix deadloop issue on reading trace_pipe
        tracing: arm64: Avoid missing-prototype warnings
        selftests/user_events: Test struct size match cases
        tracing/user_events: Fix struct arg size match check
        x86/ftrace: Remove unsued extern declaration ftrace_regs_caller_ret()
        arm64: ftrace: Add direct call trampoline samples support
        samples: ftrace: Save required argument registers in sample trampolines
      ebc27aac
    • Linus Torvalds's avatar
      Merge tag 'for-linus-6.5-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 15999328
      Linus Torvalds authored
      Pull xen fixes from Juergen Gross:
      
       - a cleanup of the Xen related ELF-notes
      
       - a fix for virtio handling in Xen dom0 when running Xen in a VM
      
      * tag 'for-linus-6.5-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/virtio: Fix NULL deref when a bridge of PCI root bus has no parent
        x86/Xen: tidy xen-head.S
      15999328
    • Linus Torvalds's avatar
      Merge tag 'sh-for-v6.5-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux · 9350cd01
      Linus Torvalds authored
      Pull sh fixes from John Paul Adrian Glaubitz:
       "The sh updates introduced multiple regressions.
      
        In particular, the change a8ac2961 ("sh: Avoid using IRQ0 on SH3
        and SH4") causes several boards to hang during boot due to incorrect
        IRQ numbers.
      
        Geert Uytterhoeven has contributed patches that handle the virq offset
        in the IRQ code for the dreamcast, highlander and r2d boards while
        Artur Rojek has contributed a patch which handles the virq offset for
        the hd64461 companion chip"
      
      * tag 'sh-for-v6.5-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
        sh: hd64461: Handle virq offset for offchip IRQ base and HD64461 IRQ
        sh: mach-dreamcast: Handle virq offset in cascaded IRQ demux
        sh: mach-highlander: Handle virq offset in cascaded IRL demux
        sh: mach-r2d: Handle virq offset in cascaded IRL demux
      9350cd01
  5. Jul 13, 2023
    • Zheng Yejian's avatar
      tracing: Fix memory leak of iter->temp when reading trace_pipe · d5a82189
      Zheng Yejian authored
      kmemleak reports:
        unreferenced object 0xffff88814d14e200 (size 256):
          comm "cat", pid 336, jiffies 4294871818 (age 779.490s)
          hex dump (first 32 bytes):
            04 00 01 03 00 00 00 00 08 00 00 00 00 00 00 00  ................
            0c d8 c8 9b ff ff ff ff 04 5a ca 9b ff ff ff ff  .........Z......
          backtrace:
            [<ffffffff9bdff18f>] __kmalloc+0x4f/0x140
            [<ffffffff9bc9238b>] trace_find_next_entry+0xbb/0x1d0
            [<ffffffff9bc9caef>] trace_print_lat_context+0xaf/0x4e0
            [<ffffffff9bc94490>] print_trace_line+0x3e0/0x950
            [<ffffffff9bc95499>] tracing_read_pipe+0x2d9/0x5a0
            [<ffffffff9bf03a43>] vfs_read+0x143/0x520
            [<ffffffff9bf04c2d>] ksys_read+0xbd/0x160
            [<ffffffff9d0f0edf>] do_syscall_64+0x3f/0x90
            [<ffffffff9d2000aa>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
      
      When reading the file 'trace_pipe', 'iter->temp' is allocated or
      reallocated in trace_find_next_entry() but not freed before
      'trace_pipe' is closed.
      
      To fix it, free 'iter->temp' in tracing_release_pipe().
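
The ownership rule behind the fix can be sketched in user space as below. The names `struct iter_sketch`, `iter_reserve_temp()` and `iter_release()` are hypothetical stand-ins, and `realloc()`/`free()` replace the kernel allocator; the point is only that the release path must free whatever the read path allocated.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical iterator with a lazily grown scratch buffer. */
struct iter_sketch {
    void *temp;
    size_t temp_size;
};

/* Read path: grow the scratch buffer on demand, keep it if large enough. */
static int iter_reserve_temp(struct iter_sketch *it, size_t need)
{
    void *p;

    if (need <= it->temp_size)
        return 0;

    p = realloc(it->temp, need);
    if (!p)
        return -1;

    it->temp = p;
    it->temp_size = need;
    return 0;
}

/* Release path: free the scratch buffer so closing the file leaks nothing. */
static void iter_release(struct iter_sketch *it)
{
    free(it->temp);
    it->temp = NULL;
    it->temp_size = 0;
}
```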
      
      Link: https://lore.kernel.org/linux-trace-kernel/20230713141435.1133021-1-zhengyejian1@huawei.com
      
      Cc: stable@vger.kernel.org
      Fixes: ff895103 ("tracing: Save off entry when peeking at next entry")
      Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      d5a82189
    • Paolo Abeni's avatar
      Merge branch 'net-sched-fixes-for-sch_qfq' · 9d23aac8
      Paolo Abeni authored
      
      
      Pedro Tammela says:
      
      ====================
      net/sched: fixes for sch_qfq
      
      Patch 1 fixes a regression introduced in 6.4 where the MTU size could be
      bigger than 'lmax'.
      
      Patch 3 fixes an issue where the code doesn't account for qdisc_pkt_len()
      returning a size bigger than 'lmax'.
      
      Patches 2 and 4 are selftests for the issues above.
      ====================
      
      Link: https://lore.kernel.org/r/20230711210103.597831-1-pctammela@mojatatu.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      9d23aac8
    • Pedro Tammela's avatar
      selftests: tc-testing: add test for qfq with stab overhead · 137f6219
      Pedro Tammela authored
      
      
      A packet with stab overhead greater than QFQ_MAX_LMAX should be dropped
      by the QFQ qdisc as it can't handle such lengths.
      
      Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Tested-by: Zhengchao Shao <shaozhengchao@huawei.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      137f6219
    • Pedro Tammela's avatar
      net/sched: sch_qfq: account for stab overhead in qfq_enqueue · 3e337087
      Pedro Tammela authored
      Lion says:
      -------
      In the QFQ scheduler a similar issue to CVE-2023-31436
      persists.
      
      Consider the following code in net/sched/sch_qfq.c:
      
      static int qfq_enqueue(struct sk_buff *skb, struct Qdisc *sch,
                      struct sk_buff **to_free)
      {
           unsigned int len = qdisc_pkt_len(skb), gso_segs;
      
          // ...
      
           if (unlikely(cl->agg->lmax < len)) {
               pr_debug("qfq: increasing maxpkt from %u to %u for class %u",
                    cl->agg->lmax, len, cl->common.classid);
               err = qfq_change_agg(sch, cl, cl->agg->class_weight, len);
               if (err) {
                   cl->qstats.drops++;
                   return qdisc_drop(skb, sch, to_free);
               }
      
          // ...
      
           }
      
      Similarly to CVE-2023-31436, "lmax" is increased without any bounds
      checks according to the packet length "len". Usually this would not
      pose a problem because packet sizes are naturally limited.
      
      This is however not the actual packet length, rather the
      "qdisc_pkt_len(skb)" which might apply size transformations according to
      "struct qdisc_size_table" as created by "qdisc_get_stab()" in
      net/sched/sch_api.c if the TCA_STAB option was set when modifying the qdisc.
      
      A user may choose virtually any size using such a table.
      
      As a result the same issue as in CVE-2023-31436 can occur, allowing heap
      out-of-bounds read / writes in the kmalloc-8192 cache.
      -------
      
      We can create the issue with the following commands:
      
      tc qdisc add dev $DEV root handle 1: stab mtu 2048 tsize 512 mpu 0 \
      overhead 999999999 linklayer ethernet qfq
      tc class add dev $DEV parent 1: classid 1:1 htb rate 6mbit burst 15k
      tc filter add dev $DEV parent 1: matchall classid 1:1
      ping -I $DEV 1.1.1.2
      
      This is caused by incorrectly assuming that qdisc_pkt_len() returns a
      length within the range QFQ_MIN_LMAX < len < QFQ_MAX_LMAX.
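
A minimal sketch of the missing bound check, assuming an upper limit of 1 << 16. The names `SKETCH_MAX_LMAX` and `sketch_grow_lmax()` are hypothetical and the limit is an assumption for illustration, not taken from the text above; the actual patch may place the check elsewhere.

```c
#include <errno.h>

/* Assumed upper bound for illustration; stands in for QFQ_MAX_LMAX. */
#define SKETCH_MAX_LMAX (1u << 16)

/*
 * Grow *lmax to cover a packet of length `len`, but reject lengths above
 * the supported maximum instead of growing without bound. On rejection
 * the caller is expected to drop the packet.
 */
static int sketch_grow_lmax(unsigned int *lmax, unsigned int len)
{
    if (len <= *lmax)
        return 0;           /* already large enough */

    if (len > SKETCH_MAX_LMAX)
        return -EINVAL;     /* e.g. inflated by a stab overhead */

    *lmax = len;
    return 0;
}
```

With a check like this, the inflated length produced by the `overhead 999999999` size table in the reproducer would be rejected rather than used to resize the aggregate.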
      
      Fixes: 462dbc91 ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost")
      Reported-by: Lion <nnamrec@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      3e337087
    • Pedro Tammela's avatar
      selftests: tc-testing: add tests for qfq mtu sanity check · c5a06fdc
      Pedro Tammela authored
      
      
      QFQ only supports a certain bound of MTU size so make sure
      we check for this requirement in the tests.
      
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Tested-by: Zhengchao Shao <shaozhengchao@huawei.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      c5a06fdc