  1. Feb 02, 2023
    • can: isotp: handle wait_event_interruptible() return values · 823b2e42
      Oliver Hartkopp authored
      When wait_event_interruptible() has been interrupted by a signal, the
      tx.state value might not be ISOTP_IDLE. Force the state machines
      into the idle state to stop the timer handlers from continuing to work.
      
      Fixes: 86633786 ("can: isotp: fix tx state handling for echo tx processing")
      Cc: stable@vger.kernel.org
      Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Link: https://lore.kernel.org/all/20230112192347.1944-1-socketcan@hartkopp.net
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    • can: raw: fix CAN FD frame transmissions over CAN XL devices · 3793301c
      Oliver Hartkopp authored
      A CAN XL device is always capable of processing CAN FD frames. The former
      check when sending CAN FD frames relied on the existence of a CAN FD
      device and did not check for a CAN XL device, which would be correct
      too.

      With this patch the CAN FD feature is enabled automatically when CAN
      XL is switched on - and CAN FD cannot be switched off while CAN XL is
      enabled.
      
      This precondition also leads to a clean up and reduction of checks in
      the hot path in raw_rcv() and raw_sendmsg(). Some conditions are
      reordered to handle simple checks first.
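
The precondition described in this message (switching CAN XL on implies CAN FD, and CAN FD cannot be switched off while CAN XL is enabled) can be sketched as a small userspace model. The struct and function names below are hypothetical and are not taken from the kernel's raw socket code; only the two rules themselves come from the commit text.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-socket options, loosely modeled on the description. */
struct raw_opts {
    bool fd_frames;   /* CAN FD frames enabled */
    bool xl_frames;   /* CAN XL frames enabled */
};

/* Switching CAN XL on also switches CAN FD on. */
int raw_set_xl(struct raw_opts *ro, bool on)
{
    assert(ro != NULL);
    ro->xl_frames = on;
    if (on)
        ro->fd_frames = true;
    return 0;
}

/* CAN FD cannot be switched off while CAN XL is enabled. */
int raw_set_fd(struct raw_opts *ro, bool on)
{
    assert(ro != NULL);
    if (!on && ro->xl_frames)
        return -EINVAL;
    ro->fd_frames = on;
    return 0;
}
```

With this invariant in place, a hot path only needs to test the XL flag to know that FD frames are acceptable as well, which is the kind of check reduction the message mentions.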
      
      changes since v1: https://lore.kernel.org/all/20230131091012.50553-1-socketcan@hartkopp.net
      - fixed typo: devive -> device
      changes since v2: https://lore.kernel.org/all/20230131091824.51026-1-socketcan@hartkopp.net/
      - reorder checks in if statements to handle simple checks first
      
      Fixes: 62633269 ("can: raw: add CAN XL support")
      Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Link: https://lore.kernel.org/all/20230131105613.55228-1-socketcan@hartkopp.net
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    • can: j1939: fix errant WARN_ON_ONCE in j1939_session_deactivate · d0553680
      Ziyang Xuan authored
      The conclusion "j1939_session_deactivate() should be called with a
      session ref-count of at least 2" is incorrect. In some concurrent
      scenarios, j1939_session_deactivate() can be called with a session
      ref-count of less than 2. This is not a problem, because
      j1939_session_deactivate_locked() checks the session's active state
      before putting the session.
      
      Here is the concurrent scenario of the problem reported by syzbot
      and my reproduction log.
      
              cpu0                            cpu1
                                      j1939_xtp_rx_eoma
      j1939_xtp_rx_abort_one
                                      j1939_session_get_by_addr [kref == 2]
      j1939_session_get_by_addr [kref == 3]
      j1939_session_deactivate [kref == 2]
      j1939_session_put [kref == 1]
                                      j1939_session_completed
                                      j1939_session_deactivate
                                      WARN_ON_ONCE(kref < 2)
      
      =====================================================
      WARNING: CPU: 1 PID: 21 at net/can/j1939/transport.c:1088 j1939_session_deactivate+0x5f/0x70
      CPU: 1 PID: 21 Comm: ksoftirqd/1 Not tainted 5.14.0-rc7+ #32
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014
      RIP: 0010:j1939_session_deactivate+0x5f/0x70
      Call Trace:
       j1939_session_deactivate_activate_next+0x11/0x28
       j1939_xtp_rx_eoma+0x12a/0x180
       j1939_tp_recv+0x4a2/0x510
       j1939_can_recv+0x226/0x380
       can_rcv_filter+0xf8/0x220
       can_receive+0x102/0x220
       ? process_backlog+0xf0/0x2c0
       can_rcv+0x53/0xf0
       __netif_receive_skb_one_core+0x67/0x90
       ? process_backlog+0x97/0x2c0
       __netif_receive_skb+0x22/0x80
      
      Fixes: 0c71437d ("can: j1939: j1939_session_deactivate(): clarify lifetime of session object")
      Reported-by: <syzbot+9981a614060dcee6eeca@syzkaller.appspotmail.com>
      Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
      Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Link: https://lore.kernel.org/all/20210906094200.95868-1-william.xuanziyang@huawei.com
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    • octeontx2-af: Fix devlink unregister · 917d5e04
      Ratheesh Kannoth authored
      Exact match feature is only available in CN10K-B.
      Unregister exact match devlink entry only for
      this silicon variant.
      
      Fixes: 87e4ea29 ("octeontx2-af: Debugsfs support for exact match.")
      Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Link: https://lore.kernel.org/r/20230131061659.1025137-1-rkannoth@marvell.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • igc: return an error if the mac type is unknown in igc_ptp_systim_to_hwtstamp() · a2df8463
      Tom Rix authored
      clang static analysis reports
      drivers/net/ethernet/intel/igc/igc_ptp.c:673:3: warning: The left operand of
        '+' is a garbage value [core.UndefinedBinaryOperatorResult]
         ktime_add_ns(shhwtstamps.hwtstamp, adjust);
         ^            ~~~~~~~~~~~~~~~~~~~~
      
      igc_ptp_systim_to_hwtstamp() silently returns without setting the hwtstamp
      if the mac type is unknown.  This should be treated as an error.
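
The pattern described here, returning an error instead of silently leaving the output unset, can be sketched in userspace as follows. The enum, the function names and the placeholder conversion are assumptions made for illustration; they do not reproduce the igc driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MAC types; the real driver has its own enum. */
enum mac_type { MAC_KNOWN, MAC_UNKNOWN };

/*
 * Convert a raw timestamp. Returns 0 on success or -EINVAL when the
 * MAC type is not handled. The output is written only on success.
 */
int systim_to_hwtstamp(enum mac_type mac, uint64_t systim, uint64_t *hwtstamp)
{
    assert(hwtstamp != NULL);
    switch (mac) {
    case MAC_KNOWN:
        *hwtstamp = systim;   /* placeholder conversion */
        return 0;
    default:
        return -EINVAL;
    }
}

/* Caller: use the converted value only if the conversion succeeded. */
int adjusted_timestamp(enum mac_type mac, uint64_t systim, uint64_t adjust,
                       uint64_t *out)
{
    uint64_t ts;
    int err = systim_to_hwtstamp(mac, systim, &ts);

    if (err)
        return err;           /* ts was never written, so do not read it */
    *out = ts + adjust;
    return 0;
}
```

Because the caller checks the return value before the addition, the left operand is never read uninitialized, which is the situation the static analyzer warning points at.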
      
      Fixes: 81b05520 ("igc: Add support for RX timestamping")
      Signed-off-by: Tom Rix <trix@redhat.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Acked-by: Sasha Neftin <sasha.neftin@intel.com>
      Tested-by: Naama Meir <naamax.meir@linux.intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Link: https://lore.kernel.org/r/20230131215437.1528994-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • nfp: flower: avoid taking mutex in atomic context · 9c6b9cba
      Yanguo Li authored
      A mutex may sleep, which is not permitted in atomic context.
      Avoid a case where this may arise by moving the call to
      nfp_flower_lag_get_info_from_netdev() out of the spinlock-protected
      section in nfp_tun_write_neigh().
      
      Fixes: abc21095 ("nfp: flower: tunnel neigh support bond offload")
      Reported-by: Dan Carpenter <error27@gmail.com>
      Signed-off-by: Yanguo Li <yanguo.li@corigine.com>
      Signed-off-by: Simon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20230131080313.2076060-1-simon.horman@corigine.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'ip-ip6_gre-fix-gre-tunnels-not-generating-ipv6-link-local-addresses' · cca6e9ff
      Jakub Kicinski authored
      Thomas Winter says:
      
      ====================
      ip/ip6_gre: Fix GRE tunnels not generating IPv6 link local addresses
      
      For our point-to-point GRE tunnels, they have IN6_ADDR_GEN_MODE_NONE
      when they are created then we set IN6_ADDR_GEN_MODE_EUI64 when they
      come up to generate the IPv6 link local address for the interface.
      Recently we found that they were no longer generating IPv6 addresses.
      
      Also, non-point-to-point tunnels were not generating any IPv6 link
      local address and instead generating an IPv6 compat address,
      breaking IPv6 communication on the tunnel.
      
      These failures were caused by commit e5dd7294 and this patch set
      aims to resolve these issues.
      ====================
      
      Link: https://lore.kernel.org/r/20230131034646.237671-1-Thomas.Winter@alliedtelesis.co.nz
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ip/ip6_gre: Fix non-point-to-point tunnel not generating IPv6 link local address · 30e2291f
      Thomas Winter authored
      We recently found that our non-point-to-point tunnels were not
      generating any IPv6 link local address and instead generating an
      IPv6 compat address, breaking IPv6 communication on the tunnel.
      
      Previously, addrconf_gre_config would always call addrconf_addr_gen
      and generate an EUI64 link local address for the tunnel.
      Then commit e5dd7294 changed the code path so that add_v4_addrs
      is called, but this only generates a compat IPv6 address for
      non-point-to-point tunnels.
      
      I assume the compat address is specifically for SIT tunnels so
      have kept that only for SIT - GRE tunnels now always generate link
      local addresses.
      
      Fixes: e5dd7294 ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL address")
      Signed-off-by: Thomas Winter <Thomas.Winter@alliedtelesis.co.nz>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ip/ip6_gre: Fix changing addr gen mode not generating IPv6 link local address · 23ca0c2c
      Thomas Winter authored
      For our point-to-point GRE tunnels, they have IN6_ADDR_GEN_MODE_NONE
      when they are created then we set IN6_ADDR_GEN_MODE_EUI64 when they
      come up to generate the IPv6 link local address for the interface.
      Recently we found that they were no longer generating IPv6 addresses.
      This issue would also have affected SIT tunnels.
      
      Commit e5dd7294 changed the code path so that GRE tunnels
      generate an IPv6 address based on the tunnel source address.
      It also changed the code path so GRE tunnels don't call addrconf_addr_gen
      in addrconf_dev_config which is called by addrconf_sysctl_addr_gen_mode
      when the IN6_ADDR_GEN_MODE is changed.
      
      This patch aims to fix this issue by moving the code in addrconf_notify
      which calls the addr gen for GRE and SIT into a separate function
      and calling it in the places that expect the IPv6 address to be
      generated.
      
      The previous addrconf_dev_config is renamed to addrconf_eth_config
      since it only expected eth type interfaces and follows the
      addrconf_gre/sit_config format.
      
      Part of this change means that configuring the loopback address
      will be attempted when changing addr_gen_mode for lo.
      This should not be a problem because the address should exist anyway,
      and if it does already exist then no error is produced.
      
      Fixes: e5dd7294 ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL address")
      Signed-off-by: Thomas Winter <Thomas.Winter@alliedtelesis.co.nz>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. Feb 01, 2023
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 64466c40
      Jakub Kicinski authored
      
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      1) Release bridge info once packet escapes the br_netfilter path,
         from Florian Westphal.
      
      2) Revert incorrect fix for the SCTP connection tracking chunk
         iterator, also from Florian.
      
      The first patch fixes a long-standing issue; the second patch addresses
      a mistake in the previous pull request for net.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        Revert "netfilter: conntrack: fix bug in for_each_sctp_chunk"
        netfilter: br_netfilter: disable sabotage_in hook after first suppression
      ====================
      
      Link: https://lore.kernel.org/r/20230131133158.4052-1-pablo@netfilter.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: phy: meson-gxl: Add generic dummy stubs for MMD register access · afc2336f
      Chris Healy authored
      The Meson G12A Internal PHY does not support standard IEEE MMD extended
      register access, therefore add generic dummy stubs to fail the read and
      write MMD calls. This is necessary to prevent the core PHY code from
      erroneously believing that EEE is supported by this PHY even though this
      PHY does not support EEE, as MMD register access returns all FFFFs.
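
Why failing stubs help can be shown with a small userspace model. Everything below is a simplified sketch under stated assumptions: the ops table, the probe function and the register address are illustrative, and the failing stub only mirrors the idea of the kernel's generic "MMD unsupported" helpers rather than reproducing them.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical PHY ops table; the kernel's driver structure is far larger. */
struct phy_ops {
    int (*read_mmd)(int devad, int regnum);
};

/* Models hardware without MMD access: every read comes back as all ones. */
int read_mmd_all_ones(int devad, int regnum)
{
    (void)devad;
    (void)regnum;
    return 0xffff;
}

/* Dummy stub that fails the access instead of returning garbage. */
int read_mmd_unsupported(int devad, int regnum)
{
    (void)devad;
    (void)regnum;
    return -EOPNOTSUPP;
}

/* Simplified capability probe: any non-zero ability bits count as EEE. */
bool eee_supported(const struct phy_ops *ops)
{
    assert(ops != NULL && ops->read_mmd != NULL);
    int val = ops->read_mmd(3, 20);   /* illustrative register address */

    if (val < 0)
        return false;                 /* access failed: no EEE */
    return val != 0;
}
```

With the all-ones reader the probe wrongly reports EEE as supported; with the failing stub it reports no support, which matches the behavior the message asks for.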
      
      Fixes: 5c3407ab ("net: phy: meson-gxl: add g12a support")
      Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: Chris Healy <healych@amazon.com>
      Reviewed-by: Jerome Brunet <jbrunet@baylibre.com>
      Link: https://lore.kernel.org/r/20230130231402.471493-1-cphealy@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: fix NULL pointer in skb_segment_list · 876e8ca8
      Yan Zhai authored
      Commit 3a1296a3 ("net: Support GRO/GSO fraglist chaining.")
      introduced UDP listified GRO. The segmentation relies on frag_list being
      untouched when passing through the network stack. This assumption can
      sometimes be broken, where frag_list itself gets pulled into the linear
      area, leaving frag_list NULL. When this happens it can trigger the
      following NULL pointer dereference and panic the kernel. Reversing the
      test condition should fix it.
      
      [19185.577801][    C1] BUG: kernel NULL pointer dereference, address:
      ...
      [19185.663775][    C1] RIP: 0010:skb_segment_list+0x1cc/0x390
      ...
      [19185.834644][    C1] Call Trace:
      [19185.841730][    C1]  <TASK>
      [19185.848563][    C1]  __udp_gso_segment+0x33e/0x510
      [19185.857370][    C1]  inet_gso_segment+0x15b/0x3e0
      [19185.866059][    C1]  skb_mac_gso_segment+0x97/0x110
      [19185.874939][    C1]  __skb_gso_segment+0xb2/0x160
      [19185.883646][    C1]  udp_queue_rcv_skb+0xc3/0x1d0
      [19185.892319][    C1]  udp_unicast_rcv_skb+0x75/0x90
      [19185.900979][    C1]  ip_protocol_deliver_rcu+0xd2/0x200
      [19185.910003][    C1]  ip_local_deliver_finish+0x44/0x60
      [19185.918757][    C1]  __netif_receive_skb_one_core+0x8b/0xa0
      [19185.927834][    C1]  process_backlog+0x88/0x130
      [19185.935840][    C1]  __napi_poll+0x27/0x150
      [19185.943447][    C1]  net_rx_action+0x27e/0x5f0
      [19185.951331][    C1]  ? mlx5_cq_tasklet_cb+0x70/0x160 [mlx5_core]
      [19185.960848][    C1]  __do_softirq+0xbc/0x25d
      [19185.968607][    C1]  irq_exit_rcu+0x83/0xb0
      [19185.976247][    C1]  common_interrupt+0x43/0xa0
      [19185.984235][    C1]  asm_common_interrupt+0x22/0x40
      ...
      [19186.094106][    C1]  </TASK>
      
      Fixes: 3a1296a3 ("net: Support GRO/GSO fraglist chaining.")
      Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Yan Zhai <yan@cloudflare.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/r/Y9gt5EUizK1UImEP@debian
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: fman: memac: free mdio device if lynx_pcs_create() fails · efec2e2a
      Vladimir Oltean authored
      When memory allocation fails in lynx_pcs_create() and it returns NULL,
      there remains a dangling reference to the mdiodev returned by
      of_mdio_find_device() which is leaked as soon as memac_pcs_create()
      returns empty-handed.
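
The leak pattern described above, a reference taken by a lookup and never dropped when a later allocation fails, can be sketched in userspace. The refcounted `struct dev`, the `fail` parameter and all function names are hypothetical stand-ins; they are not the fman or lynx PCS APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical refcounted device standing in for the mdio device. */
struct dev { int refs; };

struct dev *dev_get(struct dev *d) { d->refs++; return d; }
void dev_put(struct dev *d) { d->refs--; }

struct pcs { struct dev *d; };

/* Stand-in constructor; 'fail' simulates a failed memory allocation. */
struct pcs *pcs_create(struct dev *d, int fail)
{
    struct pcs *p = fail ? NULL : malloc(sizeof(*p));

    if (p)
        p->d = d;             /* on success the pcs owns the reference */
    return p;
}

void pcs_destroy(struct pcs *p)
{
    dev_put(p->d);
    free(p);
}

/* Look up the device, then drop the reference again if creation fails. */
struct pcs *lookup_and_create(struct dev *d, int fail)
{
    assert(d != NULL);
    struct dev *found = dev_get(d);          /* lookup takes a reference */
    struct pcs *p = pcs_create(found, fail);

    if (!p)
        dev_put(found);       /* without this, the reference leaks */
    return p;
}
```

The design point is ownership: the reference belongs to the caller until the constructor succeeds, so the caller is the one that must release it on the failure path.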
      
      Fixes: a7c2a32e ("net: fman: memac: Use lynx pcs driver")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Sean Anderson <sean.anderson@seco.com>
      Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
      Link: https://lore.kernel.org/r/20230130193051.563315-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • sctp: do not check hb_timer.expires when resetting hb_timer · 8f35ae17
      Xin Long authored
      Commit ba6f5e33 ("sctp: avoid refreshing heartbeat timer too often")
      tries to avoid refreshing hb_timer too frequently, and only allows
      mod_timer when the new expiry is after hb_timer.expires. This means
      that even when a much shorter interval is applied to the hb timer, it
      has to wait until the current hb timer times out.

      In sctp_do_8_2_transport_strike(), when a transport enters PF state, it
      expects to update the hb timer to resend a heartbeat every rto after
      calling sctp_transport_reset_hb_timer(), which does not work because of
      the change mentioned above.
      
      The frequent hb_timer refresh was caused by sctp_transport_reset_timers()
      being called in sctp_outq_flush(), and that call was already removed in
      the commit above.
      So we don't have to check hb_timer.expires when resetting hb_timer as it is
      now not called very often.
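
The difference between the two reset policies can be modeled in a few lines. This is a userspace sketch with a hypothetical `struct timer`; it ignores jiffies wraparound and does not reproduce the kernel's timer API or the sctp code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical timer model: 'expires' is an absolute tick count and
 * 'pending' tells whether the timer is currently armed.
 */
struct timer {
    unsigned long expires;
    bool pending;
};

/* Older behavior as described: a pending timer is only pushed later. */
void reset_hb_timer_checked(struct timer *t, unsigned long expires)
{
    assert(t != NULL);
    if (t->pending && expires <= t->expires)
        return;               /* a shorter interval is ignored */
    t->expires = expires;
    t->pending = true;
}

/* Behavior after the change: the new deadline is always applied. */
void reset_hb_timer(struct timer *t, unsigned long expires)
{
    assert(t != NULL);
    t->expires = expires;
    t->pending = true;
}
```

With a timer armed for tick 1000, asking the checked variant for tick 200 leaves the deadline at 1000, while the unchecked variant moves it to 200, which is what a transport entering PF state needs.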
      
      Fixes: ba6f5e33 ("sctp: avoid refreshing heartbeat timer too often")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Link: https://lore.kernel.org/r/d958c06985713ec84049a2d5664879802710179a.1675095933.git.lucien.xin@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  3. Jan 31, 2023
    • Revert "netfilter: conntrack: fix bug in for_each_sctp_chunk" · bd0e06f0
      Florian Westphal authored
      There is no bug.  If sch->length == 0, this would result in an infinite
      loop, but the first caller, do_basic_checks(), errors out in this case.
      
      After this change, packets with bogus zero-length chunks are no longer
      detected as invalid, so revert & add comment wrt. 0 length check.
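
The interaction between a length-driven chunk walk and a zero-length check can be illustrated with a userspace sketch. The function below is hypothetical and is not the conntrack iterator; it only assumes the usual SCTP chunk layout, a 4-byte header whose last two bytes hold a big-endian length, with chunks padded to 4 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Walk SCTP-style chunks in buf. Returns the number of chunks, or -1
 * if a chunk is invalid (zero length, or longer than the remaining data).
 */
int count_chunks(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    size_t off = 0;
    int n = 0;

    while (off + 4 <= len) {
        size_t clen = ((size_t)buf[off + 2] << 8) | buf[off + 3];

        /* A zero length would never advance 'off', so reject it here. */
        if (clen == 0 || clen > len - off)
            return -1;

        n++;
        off += (clen + 3) & ~(size_t)3;   /* step to the next padded chunk */
    }
    return n;
}
```

As long as the first pass over the chunks performs this rejection, later walks over the same data cannot spin on a zero-length chunk, which is the argument the message makes for the revert.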
      
      Fixes: 98ee0077 ("netfilter: conntrack: fix bug in for_each_sctp_chunk")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: br_netfilter: disable sabotage_in hook after first suppression · 2b272bb5
      Florian Westphal authored
      When using a xfrm interface in a bridged setup (the outgoing device is
      bridged), the incoming packets in the xfrm interface are only tracked
      in the outgoing direction.
      
      $ brctl show
      bridge name     interfaces
      br_eth1         eth1
      
      $ conntrack -L
      tcp 115 SYN_SENT src=192... dst=192... [UNREPLIED] ...
      
      If br_netfilter is enabled, the first (encrypted) packet is received on
      eth1, conntrack hooks are called from br_netfilter emulation which
      allocates nf_bridge info for this skb.

      If the packet is for local machine, skb gets passed up the ip stack.
      The skb passes through ip prerouting a second time. br_netfilter
      ip_sabotage_in suppresses the re-invocation of the hooks.
      
      After this, skb gets decrypted in xfrm layer and appears in
      network stack a second time (after decryption).
      
      Then, ip_sabotage_in is called again and suppresses netfilter
      hook invocation, even though the bridge layer never called them
      for the plaintext incarnation of the packet.
      
      Free the bridge info after the first suppression to avoid this.
      
      I was unable to figure out where the regression comes from, as far as I
      can see br_netfilter always had this problem; I did not expect that skb
      is looped again with different headers.
      
      Fixes: c4b0e771 ("netfilter: avoid using skb->nf_bridge directly")
      Reported-and-tested-by: Wolfgang Nothdurft <wolfgang@linogate.de>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • net: sched: sch: Bounds check priority · de5ca4c3
      Kees Cook authored
      
      Nothing was explicitly bounds checking the priority index used to access
      clprio[]. WARN and bail out early if it's pathological. Seen with GCC 13:
      
      ../net/sched/sch_htb.c: In function 'htb_activate_prios':
      ../net/sched/sch_htb.c:437:44: warning: array subscript [0, 31] is outside array bounds of 'struct htb_prio[8]' [-Warray-bounds=]
        437 |                         if (p->inner.clprio[prio].feed.rb_node)
            |                             ~~~~~~~~~~~~~~~^~~~~~
      ../net/sched/sch_htb.c:131:41: note: while referencing 'clprio'
        131 |                         struct htb_prio clprio[TC_HTB_NUMPRIO];
            |                                         ^~~~~~
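
A bounds check of the kind described can be sketched in userspace as follows. The array size of 8 and the names `clprio` and `TC_HTB_NUMPRIO` come from the warning above; the reduced structs, the function name and the `fprintf` stand-in for WARN are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define TC_HTB_NUMPRIO 8

/* Reduced stand-ins for the structures named in the warning. */
struct htb_prio { void *feed; };
struct htb_class { struct htb_prio clprio[TC_HTB_NUMPRIO]; };

/*
 * Returns true if the class has a feed at 'prio'. Out-of-range indices
 * are reported and treated as "no feed" instead of indexing past the array.
 */
bool prio_has_feed(const struct htb_class *cl, int prio)
{
    assert(cl != NULL);
    if (prio < 0 || prio >= TC_HTB_NUMPRIO) {
        /* userspace stand-in for a kernel WARN */
        fprintf(stderr, "warning: prio %d out of range\n", prio);
        return false;
    }
    return cl->clprio[prio].feed != NULL;
}
```

An explicit range test before the subscript also gives the compiler the information it needs to prove the access stays inside the array.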
      
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: netdev@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Reviewed-by: Cong Wang <cong.wang@bytedance.com>
      Link: https://lore.kernel.org/r/20230127224036.never.561-kees@kernel.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: ethernet: mtk_eth_soc: Avoid truncating allocation · f3eceaed
      Kees Cook authored
      
      There doesn't appear to be a reason to truncate the allocation used for
      flow_info, so do a full allocation and remove the unused empty struct.
      GCC does not like having a reference to an object that has been
      partially allocated, as bounds checking may become impossible when
      such an object is passed to other code. Seen with GCC 13:
      
      ../drivers/net/ethernet/mediatek/mtk_ppe.c: In function 'mtk_foe_entry_commit_subflow':
      ../drivers/net/ethernet/mediatek/mtk_ppe.c:623:18: warning: array subscript 'struct mtk_flow_entry[0]' is partly outside array bounds of 'unsigned char[48]' [-Warray-bounds=]
        623 |         flow_info->l2_data.base_flow = entry;
            |                  ^~
      
      Cc: Felix Fietkau <nbd@nbd.name>
      Cc: John Crispin <john@phrozen.org>
      Cc: Sean Wang <sean.wang@mediatek.com>
      Cc: Mark Lee <Mark-MC.Lee@mediatek.com>
      Cc: Lorenzo Bianconi <lorenzo@kernel.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Matthias Brugger <matthias.bgg@gmail.com>
      Cc: netdev@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-mediatek@lists.infradead.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20230127223853.never.014-kees@kernel.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge tag 'ieee802154-for-net-2023-01-30' of... · 9b3fc325
      Jakub Kicinski authored
      
      Merge tag 'ieee802154-for-net-2023-01-30' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan
      
      Stefan Schmidt says:
      
      ====================
      ieee802154 for net 2023-01-30
      
      Only one fix this time around.
      
      Miquel Raynal fixed a potential double free spotted by Dan Carpenter.
      
      * tag 'ieee802154-for-net-2023-01-30' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan:
        mac802154: Fix possible double free upon parsing error
      ====================
      
      Link: https://lore.kernel.org/r/20230130095646.301448-1-stefan@datenfreihafen.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/tls: tls_is_tx_ready() checked list_entry · ffe2a225
      Pietro Borrello authored
      tls_is_tx_ready() checks that list_first_entry() does not return NULL.
      This condition can never happen. For empty lists, list_first_entry()
      returns the list_entry() of the head, which is a type confusion.
      Use list_first_entry_or_null() which returns NULL in case of empty
      lists.
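
The difference between the two accessors can be shown with a minimal userspace list. The macros below are simplified re-implementations written for this sketch (the kernel versions differ in detail), and `struct record` and `is_tx_ready()` are hypothetical stand-ins for the TLS record list.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Never NULL: for an empty list this points at memory around the head. */
#define list_first_entry(head, type, member) \
    container_of((head)->next, type, member)

/* Returns NULL when the list is empty. */
#define list_first_entry_or_null(head, type, member) \
    ((head)->next != (head) ? list_first_entry(head, type, member) : NULL)

struct record {
    int ready;
    struct list_head list;
};

void list_init(struct list_head *h) { h->next = h->prev = h; }

void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Ready only if there is a first record and it is marked ready. */
int is_tx_ready(struct list_head *tx_list)
{
    assert(tx_list != NULL);
    struct record *rec =
        list_first_entry_or_null(tx_list, struct record, list);

    return rec != NULL && rec->ready;
}
```

With the plain accessor, the NULL test in such a function would be dead code and the empty-list case would read a field from an object that is not a record at all.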
      
      Fixes: a42055e8 ("net/tls: Add support for async encryption of records for performance")
      Signed-off-by: Pietro Borrello <borrello@diag.uniroma1.it>
      Link: https://lore.kernel.org/r/20230128-list-entry-null-check-tls-v1-1-525bbfe6f0d0@diag.uniroma1.it
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 84115f0e
      Jakub Kicinski authored
      
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2023-01-27 (ice)
      
      This series contains updates to ice driver only.
      
      Dave prevents modifying channels when RDMA is active as this will break
      RDMA traffic.
      
      Michal fixes a broken URL.
      
      * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        ice: Fix broken link in ice NAPI doc
        ice: Prevent set_channel from changing queues while RDMA active
      ====================
      
      Link: https://lore.kernel.org/r/20230127225333.1534783-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  4. Jan 30, 2023
    • net: phy: fix null dereference in phy_attach_direct · 73a87602
      Colin Foster authored
      Commit bc66fa87 ("net: phy: Add link between phy dev and mac dev")
      introduced a link between net devices and phy devices. It fails to check
      whether dev is NULL, leading to a NULL dereference error.
      
      Fixes: bc66fa87 ("net: phy: Add link between phy dev and mac dev")
      Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netrom: Fix use-after-free caused by accept on already connected socket · 61179292
      Hyunwoo Kim authored
      
      If you call listen() and accept() on an already connect()ed
      AF_NETROM socket, accept() can successfully connect.
      This is because when the peer socket sends data to sendmsg,
      the skb with its own sk stored in the connected socket's
      sk->sk_receive_queue is connected, and nr_accept() dequeues
      the skb waiting in the sk->sk_receive_queue.
      
      As a result, nr_accept() allocates and returns a sock with
      the sk of the parent AF_NETROM socket.
      
      And here use-after-free can happen through complex race conditions:
      ```
                        cpu0                                                     cpu1
                                                                     1. socket_2 = socket(AF_NETROM)
                                                                              .
                                                                              .
                                                                        listen(socket_2)
                                                                        accepted_socket = accept(socket_2)
             2. socket_1 = socket(AF_NETROM)
                  nr_create()    // sk refcount : 1
                connect(socket_1)
                                                                     3. write(accepted_socket)
                                                                          nr_sendmsg()
                                                                          nr_output()
                                                                          nr_kick()
                                                                          nr_send_iframe()
                                                                          nr_transmit_buffer()
                                                                          nr_route_frame()
                                                                          nr_loopback_queue()
                                                                          nr_loopback_timer()
                                                                          nr_rx_frame()
                                                                          nr_process_rx_frame(sk, skb);    // sk : socket_1's sk
                                                                          nr_state3_machine()
                                                                          nr_queue_rx_frame()
                                                                          sock_queue_rcv_skb()
                                                                          sock_queue_rcv_skb_reason()
                                                                          __sock_queue_rcv_skb()
                                                                          __skb_queue_tail(list, skb);    // list : socket_1's sk->sk_receive_queue
             4. listen(socket_1)
                  nr_listen()
                uaf_socket = accept(socket_1)
                  nr_accept()
                  skb_dequeue(&sk->sk_receive_queue);
                                                                     5. close(accepted_socket)
                                                                          nr_release()
                                                                          nr_write_internal(sk, NR_DISCREQ)
                                                                          nr_transmit_buffer()    // NR_DISCREQ
                                                                          nr_route_frame()
                                                                          nr_loopback_queue()
                                                                          nr_loopback_timer()
                                                                          nr_rx_frame()    // sk : socket_1's sk
                                                                          nr_process_rx_frame()  // NR_STATE_3
                                                                          nr_state3_machine()    // NR_DISCREQ
                                                                          nr_disconnect()
                                                                          nr_sk(sk)->state = NR_STATE_0;
             6. close(socket_1)    // sk refcount : 3
                  nr_release()    // NR_STATE_0
                  sock_put(sk);    // sk refcount : 0
                  sk_free(sk);
                close(uaf_socket)
                  nr_release()
                  sock_hold(sk);    // UAF
      ```
      
      KASAN report by syzbot:
      ```
      BUG: KASAN: use-after-free in nr_release+0x66/0x460 net/netrom/af_netrom.c:520
      Write of size 4 at addr ffff8880235d8080 by task syz-executor564/5128
      
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106
       print_address_description mm/kasan/report.c:306 [inline]
       print_report+0x15e/0x461 mm/kasan/report.c:417
       kasan_report+0xbf/0x1f0 mm/kasan/report.c:517
       check_region_inline mm/kasan/generic.c:183 [inline]
       kasan_check_range+0x141/0x190 mm/kasan/generic.c:189
       instrument_atomic_read_write include/linux/instrumented.h:102 [inline]
       atomic_fetch_add_relaxed include/linux/atomic/atomic-instrumented.h:116 [inline]
       __refcount_add include/linux/refcount.h:193 [inline]
       __refcount_inc include/linux/refcount.h:250 [inline]
       refcount_inc include/linux/refcount.h:267 [inline]
       sock_hold include/net/sock.h:775 [inline]
       nr_release+0x66/0x460 net/netrom/af_netrom.c:520
       __sock_release+0xcd/0x280 net/socket.c:650
       sock_close+0x1c/0x20 net/socket.c:1365
       __fput+0x27c/0xa90 fs/file_table.c:320
       task_work_run+0x16f/0x270 kernel/task_work.c:179
       exit_task_work include/linux/task_work.h:38 [inline]
       do_exit+0xaa8/0x2950 kernel/exit.c:867
       do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
       get_signal+0x21c3/0x2450 kernel/signal.c:2859
       arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
       exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
       exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
       __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
       syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
       do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      RIP: 0033:0x7f6c19e3c9b9
      Code: Unable to access opcode bytes at 0x7f6c19e3c98f.
      RSP: 002b:00007fffd4ba2ce8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
      RAX: 0000000000000116 RBX: 0000000000000003 RCX: 00007f6c19e3c9b9
      RDX: 0000000000000318 RSI: 00000000200bd000 RDI: 0000000000000006
      RBP: 0000000000000003 R08: 000000000000000d R09: 000000000000000d
      R10: 0000000000000000 R11: 0000000000000246 R12: 000055555566a2c0
      R13: 0000000000000011 R14: 0000000000000000 R15: 0000000000000000
       </TASK>
      
      Allocated by task 5128:
       kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
       kasan_set_track+0x25/0x30 mm/kasan/common.c:52
       ____kasan_kmalloc mm/kasan/common.c:371 [inline]
       ____kasan_kmalloc mm/kasan/common.c:330 [inline]
       __kasan_kmalloc+0xa3/0xb0 mm/kasan/common.c:380
       kasan_kmalloc include/linux/kasan.h:211 [inline]
       __do_kmalloc_node mm/slab_common.c:968 [inline]
       __kmalloc+0x5a/0xd0 mm/slab_common.c:981
       kmalloc include/linux/slab.h:584 [inline]
       sk_prot_alloc+0x140/0x290 net/core/sock.c:2038
       sk_alloc+0x3a/0x7a0 net/core/sock.c:2091
       nr_create+0xb6/0x5f0 net/netrom/af_netrom.c:433
       __sock_create+0x359/0x790 net/socket.c:1515
       sock_create net/socket.c:1566 [inline]
       __sys_socket_create net/socket.c:1603 [inline]
       __sys_socket_create net/socket.c:1588 [inline]
       __sys_socket+0x133/0x250 net/socket.c:1636
       __do_sys_socket net/socket.c:1649 [inline]
       __se_sys_socket net/socket.c:1647 [inline]
       __x64_sys_socket+0x73/0xb0 net/socket.c:1647
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Freed by task 5128:
       kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
       kasan_set_track+0x25/0x30 mm/kasan/common.c:52
       kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:518
       ____kasan_slab_free mm/kasan/common.c:236 [inline]
       ____kasan_slab_free+0x13b/0x1a0 mm/kasan/common.c:200
       kasan_slab_free include/linux/kasan.h:177 [inline]
       __cache_free mm/slab.c:3394 [inline]
       __do_kmem_cache_free mm/slab.c:3580 [inline]
       __kmem_cache_free+0xcd/0x3b0 mm/slab.c:3587
       sk_prot_free net/core/sock.c:2074 [inline]
       __sk_destruct+0x5df/0x750 net/core/sock.c:2166
       sk_destruct net/core/sock.c:2181 [inline]
       __sk_free+0x175/0x460 net/core/sock.c:2192
       sk_free+0x7c/0xa0 net/core/sock.c:2203
       sock_put include/net/sock.h:1991 [inline]
       nr_release+0x39e/0x460 net/netrom/af_netrom.c:554
       __sock_release+0xcd/0x280 net/socket.c:650
       sock_close+0x1c/0x20 net/socket.c:1365
       __fput+0x27c/0xa90 fs/file_table.c:320
       task_work_run+0x16f/0x270 kernel/task_work.c:179
       exit_task_work include/linux/task_work.h:38 [inline]
       do_exit+0xaa8/0x2950 kernel/exit.c:867
       do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
       get_signal+0x21c3/0x2450 kernel/signal.c:2859
       arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
       exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
       exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
       __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
       syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
       do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      ```
      
      To fix this issue, make nr_listen() return -EINVAL for sockets that
      have already completed nr_connect() successfully.
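The check described here can be sketched in userspace C. This is a toy model, not the kernel code: `toy_sock`, `toy_sock_state` and `toy_listen()` are hypothetical names, and the state field only stands in for the socket state that the real nr_listen() would inspect.

```c
#include <errno.h>

/* Toy stand-ins for the kernel's socket states; not the real definitions. */
enum toy_sock_state { TOY_SS_UNCONNECTED, TOY_SS_CONNECTED };

struct toy_sock {
	enum toy_sock_state state;
	int listening;
};

/* Hypothetical model: refuse listen() once connect() has succeeded. */
static int toy_listen(struct toy_sock *sk)
{
	if (sk->state != TOY_SS_UNCONNECTED)
		return -EINVAL;

	sk->listening = 1;
	return 0;
}
```

With such a check, a connected socket never becomes a listener, so a later accept() has no queue to dequeue the looped-back skb from.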
      
      Reported-by: <syzbot+caa188bdfc1eeafeb418@syzkaller.appspotmail.com>
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      61179292
    • Andrey Konovalov's avatar
      net: stmmac: do not stop RX_CLK in Rx LPI state for qcs404 SoC · 54aa39a5
      Andrey Konovalov authored
      
      
      Currently in phy_init_eee() the driver unconditionally configures the PHY
      to stop RX_CLK after entering Rx LPI state. This causes an LPI interrupt
      storm on my qcs404-base board.
      
      Change the PHY initialization so that, for devices compatible with
      "qcom,qcs404-ethqos", RX_CLK continues to run even in Rx LPI state.
      
      Signed-off-by: Andrey Konovalov <andrey.konovalov@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      54aa39a5
  5. Jan 28, 2023
    • Andrei Gherzan's avatar
      selftest: net: Improve IPV6_TCLASS/IPV6_HOPLIMIT tests apparmor compatibility · a6efc42a
      Andrei Gherzan authored
      
      
      These tests use "tcpdump" to capture traffic into a random, temporary
      file that has no suffix. This can conflict with apparmor configurations
      in which the tool is only allowed to read from files with 'known'
      extensions.
      
      The MIME type application/vnd.tcpdump.pcap was registered with IANA for
      pcap files, and .pcap is the extension that is both the most common and
      aligned with standard apparmor configurations. See TCPDUMP(8) for more
      details.
      
      This improves compatibility with standard apparmor configurations by
      using ".pcap" as the file extension for the tests' temporary files.
      
      Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a6efc42a
    • David S. Miller's avatar
      Merge branch 't7xx-pm-fixes' · 906ad3c9
      David S. Miller authored
      Kornel Dulęba says:
      
      ====================
      net: wwan: t7xx: Fix Runtime PM implementation
      
      d10b3a69 ("net: wwan: t7xx: Runtime PM") introduced support for
      Runtime PM for this driver, but due to a bug in the initialization logic
      the usage refcount would never reach 0, leaving the feature unused.
      This patchset addresses that, together with a bug found after runtime
      suspend was enabled.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      906ad3c9
    • Kornel Dulęba's avatar
      net: wwan: t7xx: Fix Runtime PM initialization · e3d6d152
      Kornel Dulęba authored
      For PCI devices the Runtime PM refcount is incremented twice:
      1. During device enumeration with a call to pm_runtime_forbid.
      2. Just before the driver's probe logic is called.
      Because of that, in order to enable Runtime PM on a given device
      we have to call both pm_runtime_allow and pm_runtime_put_noidle
      once it's ready to be runtime suspended.
      The former was missing, causing the pm refcount to never reach 0.
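The refcount arithmetic can be illustrated with a small userspace model. It is only a sketch: `toy_dev` and the `toy_*` helpers are hypothetical stand-ins that mimic how the named Runtime PM calls move the usage counter, and they do none of the real suspend or resume work.

```c
/* Toy model of a device's Runtime PM usage counter. */
struct toy_dev {
	int usage_count;   /* models the Runtime PM usage refcount */
	int forbidden;     /* 1 after toy_forbid(), 0 after toy_allow() */
};

/* Models pm_runtime_forbid(): takes one reference, only once. */
static void toy_forbid(struct toy_dev *d)
{
	if (d->forbidden)
		return;
	d->forbidden = 1;
	d->usage_count++;
}

/* Models pm_runtime_allow(): drops the reference taken by forbid. */
static void toy_allow(struct toy_dev *d)
{
	if (!d->forbidden)
		return;
	d->forbidden = 0;
	d->usage_count--;
}

/* Models the reference taken just before probe is called. */
static void toy_get_noresume(struct toy_dev *d)
{
	d->usage_count++;
}

/* Models pm_runtime_put_noidle(): drops one reference. */
static void toy_put_noidle(struct toy_dev *d)
{
	d->usage_count--;
}

/* Runtime suspend is only possible once the counter is back at 0. */
static int toy_may_suspend(const struct toy_dev *d)
{
	return d->usage_count == 0;
}
```

In this model, dropping only the probe-time reference leaves the counter at 1, which matches the symptom described: the device never becomes eligible for runtime suspend until the allow call is added.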
      
      Fixes: d10b3a69 ("net: wwan: t7xx: Runtime PM")
      Signed-off-by: Kornel Dulęba <mindal@semihalf.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e3d6d152
    • Kornel Dulęba's avatar
      net: wwan: t7xx: Fix Runtime PM resume sequence · 364d0221
      Kornel Dulęba authored
      Resume the device before calling napi_schedule, instead of doing so in
      the napi poll routine. Polling is done in softirq context. We can't call
      the PM resume logic from there as it's blocking and not irq safe.
      In order to make this work, modify the interrupt handler to run from an
      irq handler thread.
      
      Fixes: 5545b7b9 ("net: wwan: t7xx: Add NAPI support")
      Signed-off-by: Kornel Dulęba <mindal@semihalf.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      364d0221
    • Jeremy Kerr's avatar
      net: mctp: purge receive queues on sk destruction · 60bd1d90
      Jeremy Kerr authored
      We may have pending skbs in the receive queue when the sk is being
      destroyed; add a destructor to purge the queue.
      
      MCTP doesn't use the error queue, so only the receive_queue is purged.
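A destructor that purges a receive queue can be sketched as follows. This is a userspace toy, assuming a singly linked list in place of the kernel's skb queue; `toy_skb`, `toy_sock`, `toy_enqueue()` and `toy_sk_destruct()` are hypothetical names.

```c
#include <stdlib.h>

/* Toy buffer and socket; the list stands in for sk_receive_queue. */
struct toy_skb {
	struct toy_skb *next;
};

struct toy_sock {
	struct toy_skb *rx_head;
	int freed;              /* counts buffers released by the destructor */
};

/* Queue one pending buffer on the socket. Returns 0 on success. */
static int toy_enqueue(struct toy_sock *sk)
{
	struct toy_skb *skb = malloc(sizeof(*skb));

	if (!skb)
		return -1;
	skb->next = sk->rx_head;
	sk->rx_head = skb;
	return 0;
}

/* Destructor: free every buffer still pending when the socket goes away. */
static void toy_sk_destruct(struct toy_sock *sk)
{
	while (sk->rx_head) {
		struct toy_skb *skb = sk->rx_head;

		sk->rx_head = skb->next;
		free(skb);
		sk->freed++;
	}
}
```

Without the purge step, any buffer still queued at destruction time would simply be leaked.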
      
      Fixes: 833ef3b9 ("mctp: Populate socket implementation")
      Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
      Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
      Link: https://lore.kernel.org/r/20230126064551.464468-1-jk@codeconstruct.com.au
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      60bd1d90
    • Andre Kalb's avatar
      net: phy: dp83822: Fix null pointer access on DP83825/DP83826 devices · 422ae7d9
      Andre Kalb authored
      The probe() function is only used for the DP83822 PHY, leaving the
      private data pointer uninitialized for the smaller DP83825/26 models.
      While all uses of the private data structure are hidden in DP83822-specific
      callbacks, configuring the interrupt is shared across all models.
      This causes a NULL pointer dereference on the smaller PHYs as it accesses
      the private data unchecked. Verifying the pointer avoids that.
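The pattern of guarding a shared callback against a missing private pointer can be sketched like this. It is a toy model: `toy_phy`, `toy_priv`, the `extra_irq` field and the mask bits are all hypothetical and do not mirror the driver's real fields or registers.

```c
#include <stddef.h>

#define TOY_IRQ_LINK  0x1u   /* enabled on every model */
#define TOY_IRQ_EXTRA 0x2u   /* enabled only when private data asks for it */

/* Private data that only some models allocate in probe(). */
struct toy_priv {
	int extra_irq;
};

struct toy_phy {
	struct toy_priv *priv;   /* NULL on models without a probe() */
	unsigned int irq_mask;
};

/* Shared interrupt setup: check priv before dereferencing it. */
static void toy_config_intr(struct toy_phy *phy)
{
	struct toy_priv *priv = phy->priv;

	phy->irq_mask = TOY_IRQ_LINK;
	if (priv && priv->extra_irq)
		phy->irq_mask |= TOY_IRQ_EXTRA;
}
```

The same callback then works for models with and without private data, at the cost of one pointer test.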
      
      Fixes: 5dc39fd5 ("net: phy: DP83822: Add ability to advertise Fiber connection")
      Signed-off-by: Andre Kalb <andre.kalb@sma.de>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/Y9FzniUhUtbaGKU7@pc6682
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      422ae7d9
    • Natalia Petrova's avatar
      net: qrtr: free memory on error path in radix_tree_insert() · 29de68c2
      Natalia Petrova authored
      Function radix_tree_insert() returns errors if the node hasn't
      been initialized and added to the tree.
      
      Freeing the node with "kfree(node)" and returning "NULL" from node_get()
      on that path keeps other calls from using a node that is not in the tree.
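The error path can be sketched in userspace C. This is an illustration under assumptions: `toy_tree_insert()` is a hypothetical stand-in for the tree insertion whose failure is controlled by a test knob, and `toy_node_get()` only mirrors the allocate, insert, free-on-failure shape described here.

```c
#include <stdlib.h>

struct toy_node {
	unsigned int id;
};

/* Test knob: when non-zero, the stand-in insert reports failure. */
static int toy_insert_should_fail;

/* Hypothetical stand-in for the tree insertion; returns 0 on success. */
static int toy_tree_insert(struct toy_node *node)
{
	(void)node;
	return toy_insert_should_fail ? -1 : 0;
}

/* Allocate a node and insert it; on failure free it and return NULL. */
static struct toy_node *toy_node_get(unsigned int id)
{
	struct toy_node *node = calloc(1, sizeof(*node));

	if (!node)
		return NULL;
	node->id = id;

	if (toy_tree_insert(node)) {
		free(node);     /* do not leak the node ... */
		return NULL;    /* ... and do not hand it to callers */
	}
	return node;
}
```

Callers then only ever see either a node that is in the tree or NULL.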
      
      Found by Linux Verification Center (linuxtesting.org) with SVACE.
      
      Cc: <stable@vger.kernel.org> # 5.7
      Fixes: 0c2204a4 ("net: qrtr: Migrate nameservice to kernel from userspace")
      Signed-off-by: Natalia Petrova <n.petrova@fintech.ru>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
      Link: https://lore.kernel.org/r/20230125134831.8090-1-n.petrova@fintech.ru
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      29de68c2
    • Hyunwoo Kim's avatar
      net/rose: Fix to not accept on connected socket · 14caefcf
      Hyunwoo Kim authored
      
      
      If you call listen() and accept() on an already connect()ed
      rose socket, accept() can succeed.
      This is because when the peer socket sends data with sendmsg(),
      an skb carrying its own sk is queued on the connected socket's
      sk->sk_receive_queue, and rose_accept() dequeues
      the skb waiting in the sk->sk_receive_queue.
      
      This creates a child socket with the sk of the parent
      rose socket, which can cause confusion.
      
      Fix rose_listen() to return -EINVAL if the socket has
      already been successfully connected, and add lock_sock
      to prevent this issue.
      
      Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20230125105944.GA133314@ubuntu
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      14caefcf
    • Íñigo Huguet's avatar
      sfc: correctly advertise tunneled IPv6 segmentation · ffffd245
      Íñigo Huguet authored
      Recent sfc NICs are TSO capable for some tunnel protocols. However, it
      was not working properly because the feature was not advertised in
      hw_enc_features, but in hw_features only.
      
      Setting up a GENEVE tunnel and using iperf3 to send IPv4 and IPv6 traffic
      through it showed, with tcpdump, that the IPv4 packets still had ~64k
      size but the IPv6 ones had only ~1500 bytes (they had been segmented by
      software, not offloaded). With this patch segmentation is offloaded as
      expected and the traffic is correctly received at the other end.
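Why a feature listed only in hw_features is not applied to tunneled traffic can be shown with a simplified bitmask model. This is a sketch, not the networking core: `toy_netdev`, the `TOY_F_*` bits and `toy_can_offload()` are hypothetical, and the model assumes encapsulated packets are checked against hw_enc_features while plain packets are checked against hw_features.

```c
#include <stdint.h>

#define TOY_F_TSO  (1u << 0)   /* stands in for IPv4 TCP segmentation */
#define TOY_F_TSO6 (1u << 1)   /* stands in for IPv6 TCP segmentation */

struct toy_netdev {
	uint32_t hw_features;       /* offloads for plain packets */
	uint32_t hw_enc_features;   /* offloads for encapsulated packets */
};

/* Returns non-zero when the requested offload applies to this packet. */
static int toy_can_offload(const struct toy_netdev *dev, uint32_t feature,
			   int encapsulated)
{
	uint32_t feats = encapsulated ? dev->hw_enc_features
				      : dev->hw_features;

	return (feats & feature) == feature;
}
```

In this model a device with TOY_F_TSO6 set only in hw_features offloads plain IPv6 traffic but falls back to software segmentation inside a tunnel, which matches the ~1500 byte packets seen in the capture.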
      
      Fixes: 24b2c375 ("sfc: advertise encapsulated offloads on EF10")
      Reported-by: Tianhao Zhao <tizhao@redhat.com>
      Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
      Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
      Link: https://lore.kernel.org/r/20230125143513.25841-1-ihuguet@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      ffffd245
    • Jakub Kicinski's avatar
      Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 0548c5f2
      Jakub Kicinski authored
      
      
      Daniel Borkmann says:
      
      ====================
      bpf 2023-01-27
      
      We've added 10 non-merge commits during the last 9 day(s) which contain
      a total of 10 files changed, 170 insertions(+), 59 deletions(-).
      
      The main changes are:
      
      1) Fix preservation of register's parent/live fields when copying
         range-info, from Eduard Zingerman.
      
      2) Fix an off-by-one bug in bpf_mem_cache_idx() to select the right
         cache, from Hou Tao.
      
      3) Fix stack overflow from infinite recursion in sock_map_close(),
         from Jakub Sitnicki.
      
      4) Fix missing btf_put() in register_btf_id_dtor_kfuncs()'s error path,
         from Jiri Olsa.
      
      5) Fix a splat from bpf_setsockopt() via lsm_cgroup/socket_sock_rcv_skb,
         from Kui-Feng Lee.
      
      6) Fix bpf_send_signal[_thread]() helpers to hold a reference on the task,
         from Yonghong Song.
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        bpf: Fix the kernel crash caused by bpf_setsockopt().
        selftests/bpf: Cover listener cloning with progs attached to sockmap
        selftests/bpf: Pass BPF skeleton to sockmap_listen ops tests
        bpf, sockmap: Check for any of tcp_bpf_prots when cloning a listener
        bpf, sockmap: Don't let sock_map_{close,destroy,unhash} call itself
        bpf: Add missing btf_put to register_btf_id_dtor_kfuncs
        selftests/bpf: Verify copy_register_state() preserves parent/live fields
        bpf: Fix to preserve reg parent/live fields when copying range info
        bpf: Fix a possible task gone issue with bpf_send_signal[_thread]() helpers
        bpf: Fix off-by-one error in bpf_mem_cache_idx()
      ====================
      
      Link: https://lore.kernel.org/r/20230127215820.4993-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      0548c5f2
    • Alexander Duyck's avatar
      skb: Do mix page pool and page referenced frags in GRO · 7d2c89b3
      Alexander Duyck authored
      GSO should not merge page pool recycled frames with standard reference
      counted frames. Traditionally this didn't occur, at least not often.
      However as we start looking at adding support for wireless adapters there
      becomes the potential to mix the two due to A-MSDU repartitioning frames in
      the receive path. There are possibly other places where this may have
      occurred however I suspect they must be few and far between as we have not
      seen this issue until now.
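The rule of not merging the two kinds of frames can be sketched as a guard in a merge routine. This is a toy model only: `toy_skb` and `toy_gro_merge()` are hypothetical, the single-bit flag stands in for a "pages come from a page pool" marker, and the error code is chosen for illustration.

```c
#include <errno.h>

struct toy_skb {
	unsigned int pp_recycle:1;   /* 1 if frags are page pool recycled */
	unsigned int nr_frags;
};

/* Merge skb's frags into p, refusing to mix the two page ownership models. */
static int toy_gro_merge(struct toy_skb *p, const struct toy_skb *skb)
{
	if (p->pp_recycle != skb->pp_recycle)
		return -ETOOMANYREFS;

	p->nr_frags += skb->nr_frags;
	return 0;
}
```

Refusing the merge keeps each packet's frags under a single release rule, either returned to the pool or dropped by reference count, never a mix of both.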
      
      Fixes: 53e0961d ("page_pool: add frag page recycling support in page pool")
      Reported-by: Felix Fietkau <nbd@nbd.name>
      Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
      Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/167475990764.1934330.11960904198087757911.stgit@localhost.localdomain
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      7d2c89b3
    • Arınç ÜNAL's avatar
      net: dsa: mt7530: fix tristate and help description · ff445b83
      Arınç ÜNAL authored
      
      
      Fix description for tristate and help sections which include inaccurate
      information.
      
      Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
      Link: https://lore.kernel.org/r/20230126190110.9124-1-arinc.unal@arinc9.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      ff445b83
    • Jakub Kicinski's avatar
      Merge branch 'net-xdp-execute-xdp_do_flush-before-napi_complete_done' · 3ac77ecd
      Jakub Kicinski authored
      
      
      Magnus Karlsson says:
      
      ====================
      net: xdp: execute xdp_do_flush() before napi_complete_done()
      
      Make sure that xdp_do_flush() is always executed before
      napi_complete_done(). This is important for two reasons. First, a
      redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
      napi context X on CPU Y will be followed by a xdp_do_flush() from the
      same napi context and CPU. This is not guaranteed if the
      napi_complete_done() is executed before xdp_do_flush(), as it tells
      the napi logic that it is fine to schedule napi context X on another
      CPU. Details from a production system triggering this bug using the
      veth driver can be found in [1].
      
      The second reason is that the XDP_REDIRECT logic in itself relies on
      being inside a single NAPI instance through to the xdp_do_flush() call
      for RCU protection of all in-kernel data structures. Details can be
      found in [2].
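The required ordering in a poll routine can be sketched with a toy model that records which step ran first. Everything here is hypothetical (`toy_napi`, the `toy_*` functions and the event log); the functions only log the order and do none of the real NAPI or XDP work.

```c
enum toy_event { TOY_FLUSH, TOY_COMPLETE };

struct toy_napi {
	enum toy_event log[2];   /* order in which the two steps ran */
	int n;                   /* number of logged events */
	int redirected;          /* non-zero if any frame was redirected */
};

/* Stands in for xdp_do_flush(). */
static void toy_xdp_do_flush(struct toy_napi *napi)
{
	napi->log[napi->n++] = TOY_FLUSH;
}

/* Stands in for napi_complete_done(). */
static void toy_napi_complete_done(struct toy_napi *napi)
{
	napi->log[napi->n++] = TOY_COMPLETE;
}

/* Poll routine: flush pending redirects before completing the poll. */
static int toy_poll(struct toy_napi *napi, int work_done, int budget)
{
	if (napi->redirected)
		toy_xdp_do_flush(napi);

	if (work_done < budget)
		toy_napi_complete_done(napi);

	return work_done;
}
```

Placing the flush first means it always runs while the poll still owns the NAPI context, before that context may be scheduled on another CPU.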
      
      The drivers have only been compile-tested since I do not own any of
      the HW below. So if you are a maintainer, it would be great if you
      could take a quick look to make sure I did not mess something up.
      
      Note that these were the drivers I found that violated the ordering by
      running a simple script and manually checking the ones that came up as
      potential offenders. But the script was not perfect in any way. There
      might still be offenders out there, since the script can generate
      false negatives.
      
      [1] https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
      [2] https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/
      ====================
      
      Link: https://lore.kernel.org/r/20230125074901.2737-1-magnus.karlsson@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      3ac77ecd
    • Magnus Karlsson's avatar
      dpaa2-eth: execute xdp_do_flush() before napi_complete_done() · a3191c4d
      Magnus Karlsson authored
      Make sure that xdp_do_flush() is always executed before
      napi_complete_done(). This is important for two reasons. First, a
      redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
      napi context X on CPU Y will be followed by a xdp_do_flush() from the
      same napi context and CPU. This is not guaranteed if the
      napi_complete_done() is executed before xdp_do_flush(), as it tells
      the napi logic that it is fine to schedule napi context X on another
      CPU. Details from a production system triggering this bug using the
      veth driver can be found following the first link below.
      
      The second reason is that the XDP_REDIRECT logic in itself relies on
      being inside a single NAPI instance through to the xdp_do_flush() call
      for RCU protection of all in-kernel data structures. Details can be
      found in the second link below.
      
      Fixes: d678be1d ("dpaa2-eth: add XDP_REDIRECT support")
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
      Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      a3191c4d
    • Magnus Karlsson's avatar
      dpaa_eth: execute xdp_do_flush() before napi_complete_done() · b5340137
      Magnus Karlsson authored
      Make sure that xdp_do_flush() is always executed before
      napi_complete_done(). This is important for two reasons. First, a
      redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
      napi context X on CPU Y will be followed by a xdp_do_flush() from the
      same napi context and CPU. This is not guaranteed if the
      napi_complete_done() is executed before xdp_do_flush(), as it tells
      the napi logic that it is fine to schedule napi context X on another
      CPU. Details from a production system triggering this bug using the
      veth driver can be found following the first link below.
      
      The second reason is that the XDP_REDIRECT logic in itself relies on
      being inside a single NAPI instance through to the xdp_do_flush() call
      for RCU protection of all in-kernel data structures. Details can be
      found in the second link below.
      
      Fixes: a1e031ff ("dpaa_eth: add XDP_REDIRECT support")
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
      Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/
      Acked-by: Camelia Groza <camelia.groza@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b5340137
    • Magnus Karlsson's avatar
      virtio-net: execute xdp_do_flush() before napi_complete_done() · ad7e615f
      Magnus Karlsson authored
      Make sure that xdp_do_flush() is always executed before
      napi_complete_done(). This is important for two reasons. First, a
      redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
      napi context X on CPU Y will be followed by a xdp_do_flush() from the
      same napi context and CPU. This is not guaranteed if the
      napi_complete_done() is executed before xdp_do_flush(), as it tells
      the napi logic that it is fine to schedule napi context X on another
      CPU. Details from a production system triggering this bug using the
      veth driver can be found following the first link below.
      
      The second reason is that the XDP_REDIRECT logic in itself relies on
      being inside a single NAPI instance through to the xdp_do_flush() call
      for RCU protection of all in-kernel data structures. Details can be
      found in the second link below.
      
      Fixes: 186b3c99 ("virtio-net: support XDP_REDIRECT")
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
      Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      ad7e615f