  1. Aug 22, 2022
  2. Aug 20, 2022
  3. Aug 19, 2022
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 268603d7
      Jakub Kicinski authored
      
      
      No conflicts.
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • igc: add xdp frags support to ndo_xdp_xmit · 8c78c1e5
      Lorenzo Bianconi authored
      
      
      Add the capability to map non-linear xdp frames in XDP_TX and
      ndo_xdp_xmit callback.
      
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Tested-by: Naama Meir <naamax.meir@linux.intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Link: https://lore.kernel.org/r/20220817173628.109102-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'selftests-mlxsw-add-ordering-tests-for-unified-bridge-model' · bafe1adb
      Jakub Kicinski authored
      Petr Machata says:
      
      ====================
      selftests: mlxsw: Add ordering tests for unified bridge model
      
      Amit Cohen writes:
      
      Commit 798661c7 ("Merge branch 'mlxsw-unified-bridge-conversion-part-6'")
      converted the mlxsw driver to use the unified bridge model. In the legacy model,
      when a RIF was created / destroyed, it was firmware's responsibility to
      update it in the relevant FID classification records. In the unified bridge
      model, this responsibility moved to software.
      
      This set adds tests to check the order of configuration for the following
      classifications:
      1. {Port, VID} -> FID
      2. VID -> FID
      3. VNI -> FID (after decapsulation)
      
      In addition, in the unified bridge model, software is responsible for
      updating a table which is used to determine the packet's egress VID. Add
      a test to check that the order of configuration does not impact switch
      behavior.
      
      See more details in the commit messages.
      
      Note that the tests are also supposed to pass with the legacy model; they
      are added now because with the new model they test the driver and not the
      firmware.
      
      Patch set overview:
      Patch #1 adds test for {Port, VID} -> FID
      Patch #2 adds test for VID -> FID
      Patch #3 adds test for VNI -> FID
      Patch #4 adds test for egress VID classification
      ====================
      
      Link: https://lore.kernel.org/r/cover.1660747162.git.petrm@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: mlxsw: Add egress VID classification test · 1623d571
      Amit Cohen authored
      
      
      After routing, the device always consults a table that determines the
      packet's egress VID based on {egress RIF, egress local port}. In the
      unified bridge model, it is up to software to maintain this table via
      REIV register.
      
      The table needs to be updated in the following flows:
      1. When a RIF is set on a FID, for each FID's {Port, VID} mapping, a new
         {RIF, Port}->VID mapping should be created.
      2. When a {Port, VID} is mapped to a FID and the FID already has a RIF,
         a new {RIF, Port}->VID mapping should be created.
      
      Add a test to verify that packets get the correct VID after routing,
      regardless of the order of the configuration.
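The two update flows above can be sketched as a small user-space model. This is only an illustration of the ordering property the test checks, not the mlxsw driver code or the REIV register interface; `Fid`, `EgressVidTable`, `configure` and the port, VID and RIF numbers are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class Fid:
    """Hypothetical model of a FID: its {port, VID} members and optional RIF."""
    rif: Optional[int] = None
    members: Set[Tuple[int, int]] = field(default_factory=set)  # (port, vid)


class EgressVidTable:
    """Toy stand-in for the {RIF, port} -> VID table (not the REIV register API)."""

    def __init__(self) -> None:
        self.entries: Dict[Tuple[int, int], int] = {}

    def set_rif(self, fid: Fid, rif: int) -> None:
        # Flow 1: a RIF is set on a FID, so every existing {port, VID}
        # mapping of that FID gets a {RIF, port} -> VID entry.
        fid.rif = rif
        for port, vid in fid.members:
            self.entries[(rif, port)] = vid

    def map_port_vid(self, fid: Fid, port: int, vid: int) -> None:
        # Flow 2: a {port, VID} is mapped to a FID; if the FID already
        # has a RIF, the matching entry is created right away.
        fid.members.add((port, vid))
        if fid.rif is not None:
            self.entries[(fid.rif, port)] = vid


def configure(rif_first: bool) -> Dict[Tuple[int, int], int]:
    """Apply the same configuration in either order and return the table."""
    table, fid = EgressVidTable(), Fid()
    if rif_first:
        table.set_rif(fid, rif=7)
        table.map_port_vid(fid, port=1, vid=10)
    else:
        table.map_port_vid(fid, port=1, vid=10)
        table.set_rif(fid, rif=7)
    return table.entries
```

Calling `configure(True)` and `configure(False)` yields the same table in this model, which is the order independence that the two test cases below exercise on real hardware.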
      
       # ./egress_vid_classification.sh
       TEST: Add RIF for existing {port, VID}->FID mapping                 [ OK ]
       TEST: Add {port, VID}->FID mapping for FID with a RIF               [ OK ]
      
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: mlxsw: Add ingress RIF configuration test for VXLAN · cbeb6e11
      Amit Cohen authored
      
      
      Before layer 2 forwarding, the device classifies an incoming packet to a
      FID. After classification, the FID is known, but also all the attributes of
      the FID, such as the router interface (RIF) via which a packet that needs
      to be routed will ingress the router block.
      
      For VXLAN decapsulation, the FID classification is done according to the
      VNI. When a RIF is added on top of a FID, the existing VNI->FID mapping
      should be updated by the software with the new RIF. In addition, when a new
      mapping is added for FID which already has a RIF, the correct RIF should
      be used for it.
      
      Add a test to verify that packets can be routed after decapsulation which
      is done after VNI->FID classification, regardless of the order of the
      configuration.
      
       # ./ingress_rif_conf_vxlan.sh
       TEST: Add RIF for existing VNI->FID mapping                         [ OK ]
       TEST: Add VNI->FID mapping for FID with a RIF                       [ OK ]
      
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: mlxsw: Add ingress RIF configuration test for 802.1Q bridge · 3a5ddc88
      Amit Cohen authored
      
      
      Before layer 2 forwarding, the device classifies an incoming packet to a
      FID. After classification, the FID is known, but also all the attributes of
      the FID, such as the router interface (RIF) via which a packet that needs
      to be routed will ingress the router block.
      
      For VLAN-aware bridges (802.1Q), the FID classification is done according
      to VID. When a RIF is added on top of a FID, the existing VID->FID mapping
      should be updated by the software with the new RIF.
      
      We never map multiple VLANs to the same FID using VID->FID, so with
      802.1Q we cannot create a VID->FID mapping for a FID which already has
      a RIF. Still, verify that packets can be routed via a port which is
      added after the FID already has a RIF.
      
      Add a test to verify that packets can be routed after VID->FID
      classification, regardless of the order of the configuration.
      
       # ./ingress_rif_conf_1q.sh
       TEST: Add RIF for existing VID->FID mapping                         [ OK ]
       TEST: Add port to VID->FID mapping for FID with a RIF               [ OK ]
      
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: mlxsw: Add ingress RIF configuration test for 802.1D bridge · 2cd87cea
      Amit Cohen authored
      
      
      Before layer 2 forwarding, the device classifies an incoming packet to a
      FID. After classification, the FID is known, but also all the attributes of
      the FID, such as the router interface (RIF) via which a packet that needs
      to be routed will ingress the router block.
      
      For VLAN-unaware bridges (802.1D), the FID classification is done according
      to {Port, VID}. When a RIF is added on top of a FID, all the existing
      {Port, VID}->FID mappings should be updated by the software with the new
      RIF. In addition, when a new mapping is added for FID which already has a
      RIF, the correct RIF should be used for it.
      
      Add a test to verify that packets can be routed after {Port, VID}->FID
      classification, regardless of the order of the configuration.
      
       # ./ingress_rif_conf_1d.sh
       TEST: Add RIF for existing {port, VID}->FID mapping                 [ OK ]
       TEST: Add {port, VID}->FID mapping for FID with a RIF               [ OK ]
      
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: ethernet: mtk_eth_soc: remove unused txd_pdma pointer in mtk_xdp_submit_frame · a64bb2b0
      Lorenzo Bianconi authored
      
      
      Get rid of the unnecessary txd_pdma pointer in the mtk_xdp_submit_frame()
      for loop, since it is actually used at the end of the routine, with the
      latest consumed mtk_tx_dma pointer as reference.
      
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Link: https://lore.kernel.org/r/2c40b0fbb9163a0d62ff897abae17db84a9f3b99.1660669138.git.lorenzo@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: macsec: Expose MACSEC_SALT_LEN definition to user space · 5d817578
      Emeel Hakim authored
      
      
      Expose MACSEC_SALT_LEN definition to user space to be
      used in various user space applications such as iproute.
      Iproute will use this as part of adding macsec extended
      packet number support.
      
      Reviewed-by: Raed Salem <raeds@nvidia.com>
      Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
      Link: https://lore.kernel.org/r/20220818153229.4721-1-ehakim@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'net-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 4c2d0b03
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from netfilter.
      
        Current release - regressions:
      
         - tcp: fix cleanup and leaks in tcp_read_skb() (the new way BPF
           socket maps get data out of the TCP stack)
      
         - tls: rx: react to strparser initialization errors
      
         - netfilter: nf_tables: fix scheduling-while-atomic splat
      
         - net: fix suspicious RCU usage in bpf_sk_reuseport_detach()
      
        Current release - new code bugs:
      
         - mlxsw: ptp: fix a couple of races, static checker warnings and
           error handling
      
        Previous releases - regressions:
      
         - netfilter:
            - nf_tables: fix possible module reference underflow in error path
            - make conntrack helpers deal with BIG TCP (skbs > 64kB)
            - nfnetlink: re-enable conntrack expectation events
      
         - net: fix potential refcount leak in ndisc_router_discovery()
      
        Previous releases - always broken:
      
         - sched: cls_route: disallow handle of 0
      
         - neigh: fix possible local DoS due to net iface start/stop loop
      
         - rtnetlink: fix module refcount leak in rtnetlink_rcv_msg
      
         - sched: fix adding qlen to qcpu->backlog in gnet_stats_add_queue_cpu
      
         - virtio_net: fix endian-ness for RSS
      
         - dsa: mv88e6060: prevent crash on an unused port
      
         - fec: fix timer capture timing in `fec_ptp_enable_pps()`
      
         - ocelot: stats: fix races, integer wrapping and reading incorrect
           registers (the change of register definitions here accounts for
           bulk of the changed LoC in this PR)"
      
      * tag 'net-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits)
        net: moxa: MAC address reading, generating, validity checking
        tcp: handle pure FIN case correctly
        tcp: refactor tcp_read_skb() a bit
        tcp: fix tcp_cleanup_rbuf() for tcp_read_skb()
        tcp: fix sock skb accounting in tcp_read_skb()
        igb: Add lock to avoid data race
        dt-bindings: Fix incorrect "the the" corrections
        net: genl: fix error path memory leak in policy dumping
        stmmac: intel: Add a missing clk_disable_unprepare() call in intel_eth_pci_remove()
        net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_xdp_run
        net/mlx5e: Allocate flow steering storage during uplink initialization
        net: mscc: ocelot: report ndo_get_stats64 from the wraparound-resistant ocelot->stats
        net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset
        net: mscc: ocelot: make struct ocelot_stat_layout array indexable
        net: mscc: ocelot: fix race between ndo_get_stats64 and ocelot_check_stats_work
        net: mscc: ocelot: turn stats_lock into a spinlock
        net: mscc: ocelot: fix address of SYS_COUNT_TX_AGING counter
        net: mscc: ocelot: fix incorrect ndo_get_stats64 packet counters
        net: dsa: felix: fix ethtool 256-511 and 512-1023 TX packet counters
        net: dsa: don't warn in dsa_port_set_state_now() when driver doesn't support it
        ...
    • Merge tag 'linux-kselftest-next-6.0-rc2' of... · 90b6b686
      Linus Torvalds authored
      Merge tag 'linux-kselftest-next-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull Kselftest fix from Shuah Khan:
      
       - fix landlock test build regression
      
      * tag 'linux-kselftest-next-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        selftests/landlock: fix broken include of linux/landlock.h
    • Merge tag 'trace-rtla-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 0de277d4
      Linus Torvalds authored
      Pull rtla tool fixes from Steven Rostedt:
       "Fixes for the Real-Time Linux Analysis tooling:
      
         - Fix tracer name in comments and prints
      
         - Fix setting up symlinks
      
         - Allow extra flags to be set in build
      
         - Consolidate and show all necessary libraries not found in build
           error"
      
      * tag 'trace-rtla-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        rtla: Consolidate and show all necessary libraries that failed for building
        tools/rtla: Build with EXTRA_{C,LD}FLAGS
        tools/rtla: Fix command symlinks
        rtla: Fix tracer name
    • Merge branch 'add-dt-property-to-disable-hibernation-mode' · aa447a87
      Jakub Kicinski authored
      
      
      Wei Fang says:
      
      ====================
      Add DT property to disable hibernation mode
      
      The patches add the ability to disable the hibernation mode of AR803x
      PHYs. Hibernation mode defaults to enabled after hardware reset on
      these PHYs. If the AR803x PHYs enter hibernation mode, they will not
      provide any clock. Some MACs, however, need the clocks provided by
      the PHYs to support their own hardware logic.
      So, the patches add support for disabling hibernation mode through
      a boolean property:
      
              qca,disable-hibernation-mode
      
      To disable hibernation mode so that the PHY better matches a
      specific MAC, just add this property to the PHY node in the DT.
      ====================
      
      Link: https://lore.kernel.org/r/20220818030054.1010660-1-wei.fang@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: phy: at803x: add disable hibernation mode support · 9ecf0401
      Wei Fang authored
      
      
      When the cable is unplugged, the Atheros AR803x PHYs will enter
      hibernation mode after about 10 seconds if the hibernation mode
      is enabled and will not provide any clock to the MAC. But for
      some MACs, this feature might cause unexpected issues due to the
      logic of MACs.
      Taking SYNP MAC (stmmac) as an example, if the cable is unplugged
      and the "eth0" interface is down, the AR803x PHY will enter
      hibernation mode. If "ifconfig eth0 up" is then run, the stmmac
      cannot complete the software reset operation and fails to
      initialize its own DMA, so the "eth0" interface fails to come up.
      The cause is that the software reset operation of the stmmac is
      designed to depend on the RX_CLK of the PHY.
      So, this patch offers an option for the user to determine whether
      to disable the hibernation mode of AR803x PHYs.
      
      Signed-off-by: Wei Fang <wei.fang@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • dt-bindings: net: ar803x: add disable-hibernation-mode propetry · 2e7f0899
      Wei Fang authored
      
      
      The hibernation mode of Atheros AR803x PHYs is enabled by default
      after hardware reset. When the cable is unplugged,
      the PHY will enter hibernation mode after about 10 seconds
      and the PHY clocks will be stopped to save power.
      However, some MACs need the phy output clock for proper
      functioning of their logic. For instance, stmmac needs the
      RX_CLK of PHY for software reset to complete.
      Therefore, add a DT property to configure the PHY to disable
      this hardware hibernation mode.
      
      Signed-off-by: Wei Fang <wei.fang@nxp.com>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: moxa: MAC address reading, generating, validity checking · f4693b81
      Sergei Antonov authored
      
      
      This device does not remember its MAC address, so add a possibility
      to get it from the platform. If it fails, generate a random address.
      This will provide a MAC address early during boot without user space
      being involved.
      
      Also remove extra calls to is_valid_ether_addr().
      
      Made after suggestions by Andrew Lunn:
      1) Use eth_hw_addr_random() to assign a random MAC address during probe.
      2) Remove is_valid_ether_addr() from moxart_mac_open()
      3) Add a call to platform_get_ethdev_address() during probe
      4) Remove is_valid_ether_addr() from moxart_set_mac_address(). The core does this
      
      v1 -> v2:
      Handle EPROBE_DEFER returned from platform_get_ethdev_address().
      Move MAC reading code to the beginning of the probe function.
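The address selection described above can be sketched as a user-space model. It is not the driver code: `probe_mac`, `PlatformLookup` and `random_ether_addr` are hypothetical names, the platform lookup is passed in as a plain callable standing in for platform_get_ethdev_address(), and the fallback mimics what eth_hw_addr_random() is described as doing. The value 517 mirrors the kernel-internal EPROBE_DEFER constant.

```python
import os
from typing import Callable, Optional, Tuple

EPROBE_DEFER = 517  # kernel-internal errno value, mirrored here for the model


def is_valid_ether_addr(addr: bytes) -> bool:
    """Six bytes, not multicast (group bit clear), and not all zeroes."""
    return len(addr) == 6 and not (addr[0] & 0x01) and any(addr)


def random_ether_addr() -> bytes:
    """Random unicast address with the locally administered bit set."""
    raw = bytearray(os.urandom(6))
    raw[0] = (raw[0] & 0xFE) | 0x02  # clear group bit, set local-admin bit
    return bytes(raw)


# Hypothetical stand-in for the platform lookup: returns (error, address).
PlatformLookup = Callable[[], Tuple[int, Optional[bytes]]]


def probe_mac(platform_lookup: PlatformLookup) -> Tuple[int, Optional[bytes]]:
    """Pick a MAC address early in probe, as the commit message describes."""
    err, addr = platform_lookup()
    if err == -EPROBE_DEFER:
        return err, None            # provider not ready: let the probe be retried
    if err:
        addr = random_ether_addr()  # any other failure: fall back to a random address
    return 0, addr
```

In this model a deferred lookup is passed back to the caller unchanged, while every other failure still leaves the device with a usable address, so no user space involvement is needed.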
      
      Signed-off-by: Sergei Antonov <saproj@gmail.com>
      Suggested-by: Andrew Lunn <andrew@lunn.ch>
      CC: Yang Yingliang <yangyingliang@huawei.com>
      CC: Pavel Skripkin <paskripkin@gmail.com>
      CC: Guobin Huang <huangguobin4@huawei.com>
      CC: Yang Wei <yang.wei9@zte.com.cn>
      CC: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20220818092317.529557-1-saproj@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'tcp-some-bug-fixes-for-tcp_read_skb' · 267ef48e
      Jakub Kicinski authored
      
      
      Cong Wang says:
      
      ====================
      tcp: some bug fixes for tcp_read_skb()
      
      This patchset contains 3 bug fixes and 1 minor refactor patch for
      tcp_read_skb(). V1 only had the first patch. As Eric prefers to fix all
      of them together, I have grouped them into one series.
      ====================
      
      Link: https://lore.kernel.org/r/20220817195445.151609-1-xiyou.wangcong@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tcp: handle pure FIN case correctly · 2e23acd9
      Cong Wang authored
      When skb->len==0, the recv_actor() returns 0 too, but we also use 0
      for error conditions. This patch amends this by propagating the errors
      to tcp_read_skb() so that we can distinguish skb->len==0 case from
      error cases.
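The ambiguity described above can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the kernel's tcp_read_skb(): `Segment`, `read_segment` and the two actor callables are hypothetical, and the only convention taken from the text is that an error is reported as a negative value while an empty segment reports 0.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

EIO = 5  # example errno used by the failing actor below


@dataclass
class Segment:
    """Hypothetical stand-in for an skb: payload bytes plus a FIN flag."""
    payload: bytes
    fin: bool = False


def read_segment(seq: int, seg: Segment,
                 recv_actor: Callable[[Segment], int]) -> Tuple[int, int]:
    """Hand one segment to recv_actor and return (result, new_seq).

    A negative result is an error and leaves seq untouched. A result of 0
    is a legitimate empty segment, so a pure FIN still advances seq.
    """
    copied = recv_actor(seg)
    if copied >= 0:
        seq += copied
        if seg.fin:
            seq += 1  # a FIN consumes one sequence number
    return copied, seq


def consume_all(seg: Segment) -> int:
    """Actor that accepts the whole payload."""
    return len(seg.payload)


def always_fail(seg: Segment) -> int:
    """Actor that reports an error as a negative errno."""
    return -EIO
```

With errors kept negative, `read_segment(100, Segment(b"", fin=True), consume_all)` and the same call with `always_fail` give different results, whereas an actor that returned 0 in both situations would make them indistinguishable.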
      
      Fixes: 04919bed ("tcp: Introduce tcp_read_skb()")
      Reported-by: Eric Dumazet <edumazet@google.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Cong Wang <cong.wang@bytedance.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tcp: refactor tcp_read_skb() a bit · a8688821
      Cong Wang authored
      
      
      As tcp_read_skb() only reads one skb at a time, the while loop is
      unnecessary, we can turn it into an if. This also simplifies the
      code logic.
      
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Cong Wang <cong.wang@bytedance.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tcp: fix tcp_cleanup_rbuf() for tcp_read_skb() · c457985a
      Cong Wang authored
      tcp_cleanup_rbuf() retrieves the skb from sk_receive_queue, it
      assumes the skb is not yet dequeued. This is no longer true for
      tcp_read_skb() case where we dequeue the skb first.
      
      Fix this by introducing a helper __tcp_cleanup_rbuf() which does
      not require any skb and calling it in tcp_read_skb().
      
      Fixes: 04919bed ("tcp: Introduce tcp_read_skb()")
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Cong Wang <cong.wang@bytedance.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tcp: fix sock skb accounting in tcp_read_skb() · e9c6e797
      Cong Wang authored
      Before commit 965b57b4 ("net: Introduce a new proto_ops
      ->read_skb()"), the skb was not dequeued from the receive queue, hence
      when we closed the TCP socket the skb could just be flushed synchronously.
      
      After this commit, we have to uncharge skb immediately after being
      dequeued, otherwise it is still charged in the original sock. And we
      still need to retain skb->sk, as eBPF programs may extract sock
      information from skb->sk. Therefore, we have to call
      skb_set_owner_sk_safe() here.
      
      Fixes: 965b57b4 ("net: Introduce a new proto_ops ->read_skb()")
      Reported-and-tested-by: <syzbot+a0e6f8738b58f7654417@syzkaller.appspotmail.com>
      Tested-by: Stanislav Fomichev <sdf@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Cong Wang <cong.wang@bytedance.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>