  1. Mar 17, 2023
    • Nikolay Aleksandrov's avatar
      bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether type change · 9ec7eb60
      Nikolay Aleksandrov authored
      
      
      Add bond_ether_setup helper which is used to fix ether_setup() calls in the
      bonding driver. It takes care of both IFF_MASTER and IFF_SLAVE flags, the
      former is always restored and the latter only if it was set.
      If the bond enslaves a non-ARPHRD_ETHER device (changing its type), then
      releases it and enslaves an ARPHRD_ETHER device (changing back), we use
      ether_setup() to restore the bond device type, but that also resets its
      flags and removes IFF_MASTER and IFF_SLAVE[1]. Use the bond_ether_setup
      helper to restore both after such a transition.
      
      [1] reproduce (nlmon is non-ARPHRD_ETHER):
       $ ip l add nlmon0 type nlmon
       $ ip l add bond2 type bond mode active-backup
       $ ip l set nlmon0 master bond2
       $ ip l set nlmon0 nomaster
       $ ip l add bond1 type bond
       (we use bond1 as ARPHRD_ETHER device to restore bond2's mode)
       $ ip l set bond1 master bond2
       $ ip l sh dev bond2
       37: bond2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
          link/ether be:d7:c5:40:5b:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 1500
       (notice bond2's IFF_MASTER is missing)
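
      The flag handling described above can be modelled in a few lines. This is
      a minimal Python sketch, not the driver code: the two functions only
      imitate what the message says ether_setup() and bond_ether_setup() do to
      the flags, and the flag values are the ones from the Linux UAPI header
      <linux/if.h>.

```python
# Interface flag bits as defined in the Linux UAPI header <linux/if.h>.
IFF_BROADCAST = 0x2
IFF_MASTER = 0x400
IFF_SLAVE = 0x800
IFF_MULTICAST = 0x1000


def ether_setup(flags: int) -> int:
    """Model of ether_setup(): old flags are discarded, Ethernet defaults set."""
    return IFF_BROADCAST | IFF_MULTICAST


def bond_ether_setup(flags: int) -> int:
    """Model of the helper: always restore IFF_MASTER, keep IFF_SLAVE if set."""
    slave_flag = flags & IFF_SLAVE  # remember IFF_SLAVE before the reset
    return ether_setup(flags) | IFF_MASTER | slave_flag
```

      With this model, a bond that is itself enslaved keeps both flags after the
      type change, while a plain bond only regains IFF_MASTER.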
      
      Fixes: e36b9d16 ("bonding: clean muticast addresses when device changes type")
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9ec7eb60
    • David S. Miller's avatar
      Merge branch 'net-renesas-rswitch-fixes' · 53515a05
      David S. Miller authored
      
      
      Yoshihiro Shimoda says:
      
      ====================
      net: renesas: rswitch: Fix rx and timestamp
      
      I got reports locally about issues on the rswitch driver.
      So, fix the issues.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      53515a05
    • Yoshihiro Shimoda's avatar
      net: renesas: rswitch: Fix GWTSDIE register handling · 2c59e993
      Yoshihiro Shimoda authored
      
      
      Since the GWCA has the TX timestamp feature, this driver
      should not disable it if one of the ports is opened. So, fix it.
      
      Reported-by: Phong Hoang <phong.hoang.wz@renesas.com>
      Fixes: 33f5d733 ("net: renesas: rswitch: Improve TX timestamp accuracy")
      Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2c59e993
    • Yoshihiro Shimoda's avatar
      net: renesas: rswitch: Fix the output value of quote from rswitch_rx() · e05bb97d
      Yoshihiro Shimoda authored
      
      
      If the RX descriptor doesn't have any data, the output value of quote
      from rswitch_rx() will be increased unexpectedly. So, fix it.
      
      Reported-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
      Fixes: 3590918b ("net: ethernet: renesas: Add support for "Ethernet Switch"")
      Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e05bb97d
    • Liang He's avatar
      ethernet: sun: add check for the mdesc_grab() · 90de546d
      Liang He authored
      
      
      In vnet_port_probe() and vsw_port_probe(), we should
      check the return value of mdesc_grab(), as it may
      return NULL, which can cause NULL pointer dereference (NPD) bugs.
      
      Fixes: 5d01fa0c ("ldmvsw: Add ldmvsw.c driver code")
      Fixes: 43fdf274 ("[SPARC64]: Abstract out mdesc accesses for better MD update handling.")
      Signed-off-by: Liang He <windhl@126.com>
      Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      90de546d
    • Jakub Kicinski's avatar
      Merge branch 'net-ipa-minor-bug-fixes' · 0c98b8bc
      Jakub Kicinski authored
      Alex Elder says:
      
      ====================
      net: ipa: minor bug fixes
      
      The four patches in this series fix some errors, though none of them
      cause any compile or runtime problems.
      
      The first changes the files included by "drivers/net/ipa/reg.h" to
      ensure everything it requires is included with the file.  It also
      stops unnecessarily including another file.  The prerequisites are
      apparently satisfied other ways, currently.
      
      The second adds two struct declarations to "gsi_reg.h", to ensure
      they're declared before they're used later in the file.  Again, it
      seems these declarations are currently resolved wherever this file
      is included.
      
      The third removes register definitions that were added for IPA v5.0
      that are not needed.  And the last updates some validity checks for
      IPA v5.0 registers.  No IPA v5.0 platforms are yet supported, so the
      issues resolved here were never harmful.
      
      Versions 2 and 3 of this series change the "Fixes" tags in patches
      so they supply legitimate commit hashes.
      ====================
      
      Link: https://lore.kernel.org/r/20230316145136.1795469-1-elder@linaro.org
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      0c98b8bc
    • Alex Elder's avatar
      net: ipa: fix some register validity checks · 21e8aaca
      Alex Elder authored
      
      
      A recent commit defined HW_PARAM_4 as a GSI register ID but did not
      add it to gsi_reg_id_valid() to indicate it's valid (for IPA v5.0+).
      Add version checks for the HW_PARAM_2 and INTER_EE IRQ GSI registers
      there as well.
      
      IPA v5.0 supports up to 8 source and destination resource groups.
      Update the validity check (and the comments where the register IDs
      are defined) to reflect that.  Similarly update comments and
      validity checks for the hash/cache-related registers.
      
      Note that this patch fixes an omission and constrains things
      further, but these don't technically represent bugs.
      
      Fixes: f651334e ("net: ipa: add HW_PARAM_4 GSI register")
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      21e8aaca
    • Alex Elder's avatar
      net: ipa: kill FILT_ROUT_CACHE_CFG IPA register · 786bbe50
      Alex Elder authored
      
      
      A recent commit defined a few IPA registers used for IPA v5.0+.
      One of those was a mistake.  Although the filter and router caches
      get *flushed* using a single register, they use distinct registers
      (ENDP_FILTER_CACHE_CFG and ENDP_ROUTER_CACHE_CFG) for configuration.
      
      And although there *exists* a FILT_ROUT_CACHE_CFG register, it is
      not needed in upstream code.  So get rid of definitions related to
      FILT_ROUT_CACHE_CFG, because they are not needed.
      
      Fixes: 8ba59716 ("net: ipa: define IPA v5.0+ registers")
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      786bbe50
    • Alex Elder's avatar
      net: ipa: add two missing declarations · 55c49e5c
      Alex Elder authored
      
      
      When gsi_reg_init() got added, its declaration was added to
      "gsi_reg.h" without declaring the two struct pointer types it uses.
      Add these struct declarations to "gsi_reg.h".
      
      Fixes: 3c506add ("net: ipa: introduce gsi_reg_init()")
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      55c49e5c
    • Alex Elder's avatar
      net: ipa: reg: include <linux/bug.h> · dd172d0c
      Alex Elder authored
      
      
      When "reg.h" got created, it included calls to WARN() and WARN_ON().
      Those macros are defined via <linux/bug.h>.  In addition, it uses
      is_power_of_2(), which is defined in <linux/log2.h>.  Include those
      files so IPA "reg.h" has access to all definitions it requires.
      
      Meanwhile, <linux/bits.h> is included but nothing defined therein
      is required directly in "reg.h", so get rid of that.
      
      Fixes: 81772e44 ("net: ipa: start generalizing "ipa_reg"")
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      dd172d0c
    • Jakub Kicinski's avatar
      net: xdp: don't call notifiers during driver init · 769639c1
      Jakub Kicinski authored
      Drivers will commonly perform feature setting during init, if they use
      the xdp_set_features_flag() helper they'll likely run into an ASSERT_RTNL()
      inside call_netdevice_notifiers_info().
      
      Don't call the notifier until the device is actually registered.
      Nothing should be tracking the device until it's registered or
      after its unregistration has started.
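
      The guard described above can be sketched as follows. This is a small
      Python model under stated assumptions, not the kernel implementation: the
      NetDevice class, the reg_state constants and the notifications list are
      hypothetical stand-ins, and only the "registered" half of the condition is
      modelled.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for the kernel's reg_state values.
NETREG_UNINITIALIZED = 0
NETREG_REGISTERED = 1


@dataclass
class NetDevice:
    reg_state: int = NETREG_UNINITIALIZED
    xdp_features: int = 0
    notifications: list = field(default_factory=list)  # records notifier calls


def xdp_set_features_flag(dev: NetDevice, val: int) -> None:
    """Update the feature flags; notify only once the device is registered."""
    if dev.xdp_features == val:
        return  # nothing changed, nothing to announce
    dev.xdp_features = val
    if dev.reg_state == NETREG_REGISTERED:
        dev.notifications.append(("NETDEV_XDP_FEAT_CHANGE", val))
```

      A driver calling the helper during init still gets its flags stored, but
      no notifier runs until the device has been registered.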
      
      Fixes: 4d5ab0ad ("net/mlx5e: take into account device reconfiguration for xdp_features flag")
      Link: https://lore.kernel.org/r/20230316220234.598091-1-kuba@kernel.org
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      769639c1
    • Jakub Kicinski's avatar
      Merge branch 'net-sched-fix-parsing-of-tca_ext_warn_msg-for-tc-action' · 85578fe4
      Jakub Kicinski authored
      Hangbin Liu says:
      
      ====================
      net/sched: fix parsing of TCA_EXT_WARN_MSG for tc action
      
      In my previous commit 0349b877 ("sched: add new attr TCA_EXT_WARN_MSG
      to report tc extact message") I didn't notice that tc actions use a
      different enum from filters. So we can't use TCA_EXT_WARN_MSG directly
      for tc actions.

      Let's revert the previous fix 923b2e30 ("net/sched: act_api: move
      TCA_EXT_WARN_MSG to the correct hierarchy") and add a new
      TCA_ROOT_EXT_WARN_MSG for tc actions specifically.
      
      Here is the tdc test result:
      
      1..1119
      ok 1 d959 - Add cBPF action with valid bytecode
      ok 2 f84a - Add cBPF action with invalid bytecode
      ok 3 e939 - Add eBPF action with valid object-file
      ok 4 282d - Add eBPF action with invalid object-file
      ok 5 d819 - Replace cBPF bytecode and action control
      ok 6 6ae3 - Delete cBPF action
      ok 7 3e0d - List cBPF actions
      ok 8 55ce - Flush BPF actions
      ok 9 ccc3 - Add cBPF action with duplicate index
      ok 10 89c7 - Add cBPF action with invalid index
      [...]
      ok 1115 2348 - Show TBF class
      ok 1116 84a0 - Create TEQL with default setting
      ok 1117 7734 - Create TEQL with multiple device
      ok 1118 34a9 - Delete TEQL with valid handle
      ok 1119 6289 - Show TEQL stats
      ====================
      
      Link: https://lore.kernel.org/r/20230316033753.2320557-1-liuhangbin@gmail.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      85578fe4
    • Hangbin Liu's avatar
      net/sched: act_api: add specific EXT_WARN_MSG for tc action · 2f59823f
      Hangbin Liu authored
      
      
      In my previous commit 0349b877 ("sched: add new attr TCA_EXT_WARN_MSG
      to report tc extact message") I didn't notice that tc actions use a
      different enum from filters. So we can't use TCA_EXT_WARN_MSG directly
      for tc actions. Let's add a TCA_ROOT_EXT_WARN_MSG for tc actions
      specifically and put this param before going to the TCA_ACT_TAB nest.
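
      Why an attribute ID from one enum cannot be reused in a message that is
      decoded with another enum can be illustrated with a toy decoder. This is a
      Python sketch only: the two enums are hypothetical stand-ins for the
      filter and tc action attribute namespaces, and their numeric values are
      illustrative, not copied from the kernel headers.

```python
from enum import IntEnum


class FilterAttr(IntEnum):
    """Stand-in for the filter attribute enum (values illustrative)."""
    KIND = 1
    OPTIONS = 2
    EXT_WARN_MSG = 16


class ActionRootAttr(IntEnum):
    """Stand-in for the tc action root attribute enum (values illustrative)."""
    TAB = 1
    FLAGS = 2
    EXT_WARN_MSG = 5


def parse_action_reply(attrs: dict) -> dict:
    """Decode the attribute IDs of a tc action message with the action enum."""
    decoded = {}
    for attr_id, payload in attrs.items():
        try:
            decoded[ActionRootAttr(attr_id).name] = payload
        except ValueError:
            # The ID is not part of this namespace, so the receiver cannot name it.
            decoded[f"unknown({attr_id})"] = payload
    return decoded
```

      A warning sent under the filter enum's ID is not recognised by the action
      decoder, whereas one sent under the action enum's own ID is.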
      
      Fixes: 0349b877 ("sched: add new attr TCA_EXT_WARN_MSG to report tc extact message")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      2f59823f
    • Hangbin Liu's avatar
      Revert "net/sched: act_api: move TCA_EXT_WARN_MSG to the correct hierarchy" · 8de2bd02
      Hangbin Liu authored
      
      
      This reverts commit 923b2e30.
      
      This is not a correct fix, as TCA_EXT_WARN_MSG does not belong to the
      TCA_ACT_TAB hierarchy. I didn't notice that TC actions use a different
      enum when adding TCA_EXT_WARN_MSG. To fix the difference I will add a
      new WARN enum in TCA_ROOT_MAX as Jamal suggested.
      
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      8de2bd02
    • Marek Vasut's avatar
      net: dsa: microchip: fix RGMII delay configuration on KSZ8765/KSZ8794/KSZ8795 · 5ae06327
      Marek Vasut authored
      
      
      The blamed commit has replaced a ksz_write8() call to address
      REG_PORT_5_CTRL_6 (0x56) with a ksz_set_xmii() -> ksz_pwrite8() call to
      regs[P_XMII_CTRL_1], which is also defined as 0x56 for ksz8795_regs[].
      
      The trouble is that, when compared to ksz_write8(), ksz_pwrite8() also
      adjusts the register offset with the port base address. So in reality,
      ksz_pwrite8(offset=0x56) accesses register 0x56 + 0x50 = 0xa6, which in
      this switch appears to be unmapped, and the RGMII delay configuration on
      the CPU port does nothing.
      
      So if the switch wasn't fine with the RGMII delay configuration done
      through pin strapping and relied on Linux to apply a different one in
      order to pass traffic, this is now broken.
      
      Using the offset translation logic imposed by ksz_pwrite8(), the correct
      value for regs[P_XMII_CTRL_1] should have been 0x6 on ksz8795_regs[], in
      order to really end up accessing register 0x56.
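
      The offset arithmetic above can be checked with a short sketch. This is a
      Python model, assuming only what the message states: ksz_write8() uses the
      offset as the absolute address, ksz_pwrite8() adds the port base address,
      and that base is 0x50 for the port in question. The function names ending
      in _addr are hypothetical helpers for illustration.

```python
PORT_BASE = 0x50  # port base address, as given in the commit message


def ksz_write8_addr(offset: int) -> int:
    """Model of ksz_write8(): the offset is used as the absolute address."""
    return offset


def ksz_pwrite8_addr(offset: int, port_base: int = PORT_BASE) -> int:
    """Model of ksz_pwrite8(): the port base address is added to the offset."""
    return port_base + offset
```

      With the old table value 0x56 the port write lands on 0xa6, while the
      corrected value 0x6 lands on the intended register 0x56.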
      
      Static code analysis shows that, despite there being multiple other
      accesses to regs[P_XMII_CTRL_1] in this driver, the only code path that
      is applicable to ksz8795_regs[] and ksz8_dev_ops is ksz_set_xmii().
      Therefore, the problem is isolated to RGMII delays.
      
      In its current form, ksz8795_regs[] contains the same value for
      P_XMII_CTRL_0 and for P_XMII_CTRL_1, and this raises valid suspicions
      that writes made by the driver to regs[P_XMII_CTRL_0] might overwrite
      writes made to regs[P_XMII_CTRL_1] or vice versa.
      
      Again, static analysis shows that the only accesses to P_XMII_CTRL_0
      from the driver are made from code paths which are not reachable with
      ksz8_dev_ops. So the accesses made by ksz_set_xmii() are safe for this
      switch family.
      
      [ vladimiroltean: rewrote commit message ]
      
      Fixes: c476bede ("net: dsa: microchip: ksz8795: use common xmii function")
      Signed-off-by: Marek Vasut <marex@denx.de>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Link: https://lore.kernel.org/r/20230315231916.2998480-1-vladimir.oltean@nxp.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      5ae06327
    • Jakub Kicinski's avatar
      Merge branch 'ynl-another-license-adjustment' · 816b1e8c
      Jakub Kicinski authored
      Jakub Kicinski says:
      
      ====================
      ynl: another license adjustment
      
      Hopefully the last adjustment to the licensing of the specs.
      I'm still the author so should be fine to do this.
      ====================
      
      Link: https://lore.kernel.org/r/20230315230351.478320-1-kuba@kernel.org
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      816b1e8c
    • Jakub Kicinski's avatar
      ynl: make the tooling check the license · cfab77c0
      Jakub Kicinski authored
      
      
      The (only recently documented) expectation is that all specs
      are under a certain license, but we don't actually enforce it.
      What's worse we then go ahead and assume the license was right,
      outputting the expected license into generated files.
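
      A license check of this kind could look roughly like the sketch below.
      It is a hypothetical Python illustration, not the ynl tooling itself: the
      function name, the exception class and the assumption that the license
      sits on an SPDX header line are all invented for the example, and the
      expected license string is simply the one named in the Fixes tag.

```python
# Assumption: the expected license is the one named in the Fixes tag.
EXPECTED_LICENSE = "GPL-2.0 OR BSD-3-Clause"
SPDX_PREFIX = "# SPDX-License-Identifier:"


class LicenseError(ValueError):
    """Raised when a spec does not carry the expected license."""


def check_license(first_line: str, expected: str = EXPECTED_LICENSE) -> str:
    """Return the license from an SPDX header line, or raise if it differs."""
    if not first_line.startswith(SPDX_PREFIX):
        raise LicenseError("missing SPDX-License-Identifier header")
    found = first_line[len(SPDX_PREFIX):].strip()
    if found != expected:
        raise LicenseError(f"unexpected license {found!r}, expected {expected!r}")
    return found
```

      Failing early means generated files never get stamped with a license the
      spec did not actually declare.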
      
      Fixes: 37d9df22 ("ynl: re-license uniformly under GPL-2.0 OR BSD-3-Clause")
      Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      cfab77c0
    • Jakub Kicinski's avatar
      ynl: broaden the license even more · 4e16b6a7
      Jakub Kicinski authored
      
      
      I relicensed Netlink spec code to GPL-2.0 OR BSD-3-Clause but
      we still put a slightly different license on the uAPI header
      than the rest of the code. Use the Linux-syscall-note on all
      the specs and all generated code. It's moot for kernel code,
      but should not hurt. This way the licenses match everywhere.
      
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Fixes: 37d9df22 ("ynl: re-license uniformly under GPL-2.0 OR BSD-3-Clause")
      Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      4e16b6a7
    • Jakub Kicinski's avatar
      tools: ynl: make definitions optional again · 054abb51
      Jakub Kicinski authored
      
      
      Definitions are optional; the commit in question breaks the CLI for ethtool.
      
      Fixes: 6517a60b ("tools: ynl: move the enum classes to shared code")
      Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      054abb51
    • Jakub Kicinski's avatar
      Merge tag 'mlx5-fixes-2023-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 0d2be75c
      Jakub Kicinski authored
      Saeed Mahameed says:
      
      ====================
      mlx5 fixes 2023-03-15
      
      This series provides bug fixes to mlx5 driver.
      
      * tag 'mlx5-fixes-2023-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
        net/mlx5e: TC, Remove error message log print
        net/mlx5e: TC, fix cloned flow attribute
        net/mlx5e: TC, fix missing error code
        net/sched: TC, fix raw counter initialization
        net/mlx5e: Lower maximum allowed MTU in XSK to match XDP prerequisites
        net/mlx5: Set BREAK_FW_WAIT flag first when removing driver
        net/mlx5e: kTLS, Fix missing error unwind on unsupported cipher type
        net/mlx5e: Fix cleanup null-ptr deref on encap lock
        net/mlx5: E-switch, Fix missing set of split_count when forward to ovs internal port
        net/mlx5: E-switch, Fix wrong usage of source port rewrite in split rules
        net/mlx5: Disable eswitch before waiting for VF pages
        net/mlx5: Fix setting ec_function bit in MANAGE_PAGES
        net/mlx5e: Don't cache tunnel offloads capability
        net/mlx5e: Fix macsec ASO context alignment
      ====================
      
      Link: https://lore.kernel.org/r/20230315225847.360083-1-saeed@kernel.org
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      0d2be75c
    • Matthieu Baerts's avatar
      hsr: ratelimit only when errors are printed · 1b0120e4
      Matthieu Baerts authored
      Recently, when automatically merging -net and net-next in MPTCP devel
      tree, our CI reported [1] a conflict in hsr, the same as the one
      reported by Stephen in netdev [2].
      
      When looking at the conflict, I noticed it is in fact the v1 [3] that
      has been applied in -net and the v2 [4] in net-next. Maybe the v1 was
      applied by accident.
      
      As mentioned by Jakub Kicinski [5], the new condition makes more sense
      before the net_ratelimit(), not to update net_ratelimit's state which is
      unnecessary if we're not going to print either way.
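
      The ordering argument can be shown with a toy rate limiter. This is a
      Python sketch under simplifying assumptions: RateLimiter is a hypothetical
      stand-in for net_ratelimit() that allows a fixed burst and never refills,
      and should_log() is an invented name for the combined condition.

```python
class RateLimiter:
    """Toy limiter: allows `burst` calls, then refuses (no time-based refill)."""

    def __init__(self, burst: int) -> None:
        self.remaining = burst

    def allow(self) -> bool:
        if self.remaining > 0:
            self.remaining -= 1  # every call consumes limiter state
            return True
        return False


def should_log(is_error: bool, limiter: RateLimiter) -> bool:
    """Test the cheap condition first so the limiter is only touched on errors."""
    return is_error and limiter.allow()
```

      Because `and` short-circuits, a non-error call leaves the limiter
      untouched and the budget stays available for real errors.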
      
      Here, this modification applies the v2 but in -net.
      
      Link: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/4423171069 [1]
      Link: https://lore.kernel.org/netdev/20230315100914.53fc1760@canb.auug.org.au/ [2]
      Link: https://lore.kernel.org/netdev/20230307133229.127442-1-koverskeid@gmail.com/ [3]
      Link: https://lore.kernel.org/netdev/20230309092302.179586-1-koverskeid@gmail.com/ [4]
      Link: https://lore.kernel.org/netdev/20230308232001.2fb62013@kernel.org/ [5]
      Fixes: 28e8cabe ("net: hsr: Don't log netdev_err message on unknown prp dst node")
      Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
      Link: https://lore.kernel.org/r/20230315-net-20230315-hsr_framereg-ratelimit-v1-1-61d2ef176d11@tessares.net
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      1b0120e4
    • Daniil Tatianin's avatar
      qed/qed_mng_tlv: correctly zero out ->min instead of ->hour · 470efd68
      Daniil Tatianin authored
      
      
      This fixes an issue where ->hour would erroneously get zeroed out
      instead of ->min because of a bad copy paste.
      
      Found by Linux Verification Center (linuxtesting.org) with the SVACE
      static analysis tool.
      
      Fixes: f240b688 ("qed: Add support for processing fcoe tlv request.")
      Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
      Link: https://lore.kernel.org/r/20230315194618.579286-1-d-tatianin@yandex-team.ru
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      470efd68
    • Po-Hsu Lin's avatar
      selftests: net: devlink_port_split.py: skip test if no suitable device available · 24994513
      Po-Hsu Lin authored
      The `devlink -j port show` command output may not contain the "flavour"
      key, an example from Ubuntu 22.10 s390x LPAR(5.19.0-37-generic), with
      mlx4 driver and iproute2-5.15.0:
        {"port":{"pci/0001:00:00.0/1":{"type":"eth","netdev":"ens301"},
                 "pci/0001:00:00.0/2":{"type":"eth","netdev":"ens301d1"},
                 "pci/0002:00:00.0/1":{"type":"eth","netdev":"ens317"},
                 "pci/0002:00:00.0/2":{"type":"eth","netdev":"ens317d1"}}}
      
      This will cause a KeyError exception.
      
      Create a validate_devlink_output() to check for this "flavour" from
      devlink command output to avoid this KeyError exception. Also let
      it handle the check for `devlink -j dev show` output in main().
      
      Apart from this, if the test was not started because the max lanes of
      the designated device is 0, the script will still return 0, thus
      causing a false-negative test result.
      
      Use a found_max_lanes flag to determine if these tests were skipped
      due to this reason and return KSFT_SKIP to make it more clear.
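
      The validation step can be sketched as below. This is an illustrative
      Python version, not the selftest's actual code: the function body, its
      return convention (a skip reason or None) and the SAMPLE constant are
      assumptions for the example. The sample data is a shortened copy of the
      output quoted above, and 4 is the kselftest exit code for a skipped test.

```python
import json
from typing import Optional

KSFT_SKIP = 4  # kselftest exit code for a skipped test


def validate_devlink_output(devlink_data: dict,
                            target_property: Optional[str] = None) -> Optional[str]:
    """Return a skip reason, or None when the devlink output is usable."""
    if not any(devlink_data.values()):
        return "devlink output is empty, test skipped"
    if target_property is None:
        return None
    for devices in devlink_data.values():
        for attrs in devices.values():
            if target_property in attrs:
                return None  # at least one entry carries the property
    return f"{target_property} not found in devlink output, test skipped"


# Shortened copy of the `devlink -j port show` output quoted above.
SAMPLE = json.loads(
    '{"port":{"pci/0001:00:00.0/1":{"type":"eth","netdev":"ens301"},'
    '"pci/0001:00:00.0/2":{"type":"eth","netdev":"ens301d1"}}}'
)
```

      A caller would print the returned reason and exit with KSFT_SKIP instead
      of indexing the missing key and hitting a KeyError.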
      
      Link: https://bugs.launchpad.net/bugs/1937133
      
      
      Fixes: f3348a82 ("selftests: net: Add port split test")
      Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
      Link: https://lore.kernel.org/r/20230315165353.229590-1-po-hsu.lin@canonical.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      24994513
    • Thomas Bogendoerfer's avatar
      i825xx: sni_82596: use eth_hw_addr_set() · f3837334
      Thomas Bogendoerfer authored
      
      
      netdev->dev_addr is now const, we can't write to it directly.
      Copy the scrambled MAC address octets into an array, then call eth_hw_addr_set().
      
      Fixes: adeef3e3 ("net: constify netdev->dev_addr")
      Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
      Link: https://lore.kernel.org/r/20230315134117.79511-1-tsbogend@alpha.franken.de
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      f3837334
    • Alexandra Winter's avatar
      net/iucv: Fix size of interrupt data · 3d87debb
      Alexandra Winter authored
      
      
      iucv_irq_data needs to be 4 bytes larger.
      These bytes are not used by the iucv module, but written by
      the z/VM hypervisor in case a CPU is deconfigured.
      
      Reported as:
      BUG dma-kmalloc-64 (Not tainted): kmalloc Redzone overwritten
      -----------------------------------------------------------------------------
      0x0000000000400564-0x0000000000400567 @offset=1380. First byte 0x80 instead of 0xcc
      Allocated in iucv_cpu_prepare+0x44/0xd0 age=167839 cpu=2 pid=1
      __kmem_cache_alloc_node+0x166/0x450
      kmalloc_node_trace+0x3a/0x70
      iucv_cpu_prepare+0x44/0xd0
      cpuhp_invoke_callback+0x156/0x2f0
      cpuhp_issue_call+0xf0/0x298
      __cpuhp_setup_state_cpuslocked+0x136/0x338
      __cpuhp_setup_state+0xf4/0x288
      iucv_init+0xf4/0x280
      do_one_initcall+0x78/0x390
      do_initcalls+0x11a/0x140
      kernel_init_freeable+0x25e/0x2a0
      kernel_init+0x2e/0x170
      __ret_from_fork+0x3c/0x58
      ret_from_fork+0xa/0x40
      Freed in iucv_init+0x92/0x280 age=167839 cpu=2 pid=1
      __kmem_cache_free+0x308/0x358
      iucv_init+0x92/0x280
      do_one_initcall+0x78/0x390
      do_initcalls+0x11a/0x140
      kernel_init_freeable+0x25e/0x2a0
      kernel_init+0x2e/0x170
      __ret_from_fork+0x3c/0x58
      ret_from_fork+0xa/0x40
      Slab 0x0000037200010000 objects=32 used=30 fp=0x0000000000400640 flags=0x1ffff00000010200(slab|head|node=0|zone=0|
      Object 0x0000000000400540 @offset=1344 fp=0x0000000000000000
      Redzone  0000000000400500: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ................
      Redzone  0000000000400510: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ................
      Redzone  0000000000400520: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ................
      Redzone  0000000000400530: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ................
      Object   0000000000400540: 00 01 00 03 00 00 00 00 00 00 00 00 00 00 00 00  ................
      Object   0000000000400550: f3 86 81 f2 f4 82 f8 82 f0 f0 f0 f0 f0 f0 f0 f2  ................
      Object   0000000000400560: 00 00 00 00 80 00 00 00 cc cc cc cc cc cc cc cc  ................
      Object   0000000000400570: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ................
      Redzone  0000000000400580: cc cc cc cc cc cc cc cc                          ........
      Padding  00000000004005d4: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
      Padding  00000000004005e4: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
      Padding  00000000004005f4: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a              ZZZZZZZZZZZZ
      CPU: 6 PID: 121030 Comm: 116-pai-crypto. Not tainted 6.3.0-20230221.rc0.git4.99b8246b2d71.300.fc37.s390x+debug #1
      Hardware name: IBM 3931 A01 704 (z/VM 7.3.0)
      Call Trace:
      [<000000032aa034ec>] dump_stack_lvl+0xac/0x100
      [<0000000329f5a6cc>] check_bytes_and_report+0x104/0x140
      [<0000000329f5aa78>] check_object+0x370/0x3c0
      [<0000000329f5ede6>] free_debug_processing+0x15e/0x348
      [<0000000329f5f06a>] free_to_partial_list+0x9a/0x2f0
      [<0000000329f5f4a4>] __slab_free+0x1e4/0x3a8
      [<0000000329f61768>] __kmem_cache_free+0x308/0x358
      [<000000032a91465c>] iucv_cpu_dead+0x6c/0x88
      [<0000000329c2fc66>] cpuhp_invoke_callback+0x156/0x2f0
      [<000000032aa062da>] _cpu_down.constprop.0+0x22a/0x5e0
      [<0000000329c3243e>] cpu_device_down+0x4e/0x78
      [<000000032a61dee0>] device_offline+0xc8/0x118
      [<000000032a61e048>] online_store+0x60/0xe0
      [<000000032a08b6b0>] kernfs_fop_write_iter+0x150/0x1e8
      [<0000000329fab65c>] vfs_write+0x174/0x360
      [<0000000329fab9fc>] ksys_write+0x74/0x100
      [<000000032aa03a5a>] __do_syscall+0x1da/0x208
      [<000000032aa177b2>] system_call+0x82/0xb0
      INFO: lockdep is turned off.
      FIX dma-kmalloc-64: Restoring kmalloc Redzone 0x0000000000400564-0x0000000000400567=0xcc
      FIX dma-kmalloc-64: Object at 0x0000000000400540 not freed
      
      Fixes: 2356f4cb ("[S390]: Rewrite of the IUCV base code, part 2")
      Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
      Link: https://lore.kernel.org/r/20230315131435.4113889-1-wintera@linux.ibm.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      3d87debb
    • Toke Høiland-Jørgensen's avatar
      net: atlantic: Fix crash when XDP is enabled but no program is loaded · 37d01039
      Toke Høiland-Jørgensen authored
      
      
      The aq_xdp_run_prog() function falls back to the XDP_ABORTED action
      handler (using a goto) if the operations for any of the other actions fail.
      The XDP_ABORTED handler in turn calls the bpf_warn_invalid_xdp_action()
      tracepoint. However, the function also jumps into the XDP_PASS helper if no
      XDP program is loaded on the device, which means the XDP_ABORTED handler
      can be run with a NULL program pointer. This results in a NULL pointer
      deref because the tracepoint dereferences the 'prog' pointer passed to it.
      
      This situation can happen in multiple ways:
      - If a packet arrives between the removal of the program from the interface
        and the static_branch_dec() in aq_xdp_setup()
      - If there are multiple devices using the same driver in the system and
        one of them has an XDP program loaded and the other does not.
      
      Fix this by refactoring the aq_xdp_run_prog() function to remove the 'goto
      pass' handling if there is no XDP program loaded. Instead, factor out the
      skb building in a separate small helper function.
      
      Fixes: 26efaef7 ("net: atlantic: Implement xdp data plane")
      Reported-by: Freysteinn Alfredsson <Freysteinn.Alfredsson@kau.se>
      Tested-by: Freysteinn Alfredsson <Freysteinn.Alfredsson@kau.se>
      Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/r/20230315125539.103319-1-toke@redhat.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      37d01039
    • Szymon Heidrich's avatar
      net: usb: smsc75xx: Move packet length check to prevent kernel panic in skb_pull · 43ffe6ca
      Szymon Heidrich authored
      
      
      Packet length check needs to be located after size and align_count
      calculation to prevent kernel panic in skb_pull() in case
      rx_cmd_a & RX_CMD_A_RED evaluates to true.
      
      Fixes: d8b22831 ("net: usb: smsc75xx: Limit packet length to skb->len")
      Signed-off-by: Szymon Heidrich <szymon.heidrich@gmail.com>
      Link: https://lore.kernel.org/r/20230316110540.77531-1-szymon.heidrich@gmail.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      43ffe6ca
    • Ido Schimmel's avatar
      ipv4: Fix incorrect table ID in IOCTL path · 8a2618e1
      Ido Schimmel authored
      
      
      Commit f96a3d74 ("ipv4: Fix incorrect route flushing when source
      address is deleted") started to take the table ID field in the FIB info
      structure into account when determining if two structures are identical
      or not. This field is initialized using the 'fc_table' field in the
      route configuration structure, which is not set when adding a route via
      IOCTL.
      
      The above can result in user space being able to install two identical
      routes that only differ in the table ID field of their associated FIB
      info.
      
      Fix by initializing the table ID field in the route configuration
      structure in the IOCTL path.
      
      Before the fix:
      
       # ip route add default via 192.0.2.2
       # route add default gw 192.0.2.2
       # ip -4 r show default
       default via 192.0.2.2 dev dummy10
       default via 192.0.2.2 dev dummy10
      
      After the fix:
      
       # ip route add default via 192.0.2.2
       # route add default gw 192.0.2.2
       SIOCADDRT: File exists
       # ip -4 r show default
       default via 192.0.2.2 dev dummy10
      
      Audited the code paths to ensure there are no other paths that do not
      properly initialize the route configuration structure when installing a
      route.
      
      Fixes: 5a56a0b3 ("net: Don't delete routes in different VRFs")
      Fixes: f96a3d74 ("ipv4: Fix incorrect route flushing when source address is deleted")
      Reported-by: gaoxingwang <gaoxingwang1@huawei.com>
      Link: https://lore.kernel.org/netdev/20230314144159.2354729-1-gaoxingwang1@huawei.com/
      
      
      Tested-by: gaoxingwang <gaoxingwang1@huawei.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20230315124009.4015212-1-idosch@nvidia.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      8a2618e1
    • Jakub Kicinski's avatar
      Merge tag 'ipsec-2023-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 84770d12
      Jakub Kicinski authored
      Steffen Klassert says:
      
      ====================
      pull request (net): ipsec 2023-03-15
      
      1) Fix an information leak when dumping algos and encap.
         From Herbert Xu.
      
      2) Allow transport-mode states with AF_UNSPEC selector
         to allow for nested transport-mode states.
         From Herbert Xu.
      
      * tag 'ipsec-2023-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
        xfrm: Allow transport-mode states with AF_UNSPEC selector
        xfrm: Zero padding when dumping algos and encap
      ====================
      
      Link: https://lore.kernel.org/r/20230315105623.1396491-1-steffen.klassert@secunet.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      84770d12
    • Jakub Kicinski's avatar
      Merge branch 'net-renesas-set-mac_managed_pm-at-probe-time' · c782c7f1
      Jakub Kicinski authored
      Wolfram Sang says:
      
      ====================
      net: renesas: set 'mac_managed_pm' at probe time
      
      When suspending/resuming an interface which was not up, we saw mdiobus
      related PM handling despite 'mac_managed_pm' being set for RAVB/SH_ETH.
      Heiner kindly suggested the fix to set this flag at probe time, not at
      init/open time. I implemented his suggestion and it works fine on these
      two Renesas drivers.
      ====================
      
      Link: https://lore.kernel.org/r/20230315074115.3008-1-wsa+renesas@sang-engineering.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      c782c7f1
    • Wolfram Sang's avatar
      sh_eth: avoid PHY being resumed when interface is not up · c6be7136
      Wolfram Sang authored
      
      
      SH_ETH doesn't need mdiobus suspend/resume, which is why it sets
      'mac_managed_pm'. However, setting it needs to be moved from init to
      probe, so mdiobus PM functions will really never be called (e.g. when
      the interface is not up yet during suspend/resume).
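
The effect of where the flag is set can be shown with a small model. Everything here is hypothetical (struct names, function names, the resume counter); it only assumes that the bus-level resume hook skips PHY handling when 'mac_managed_pm' is set, and that "init" runs only when the interface is opened.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the PHY device and the MAC driver state. */
struct sketch_phy {
	bool mac_managed_pm;
	int bus_resumes; /* how often the bus-level PM code touched the PHY */
};

struct sketch_mac {
	struct sketch_phy phy;
	bool opened;
};

/* Probe: flag_at_probe selects the fixed behaviour. */
static void sketch_probe(struct sketch_mac *mac, bool flag_at_probe)
{
	mac->phy.mac_managed_pm = flag_at_probe;
	mac->phy.bus_resumes = 0;
	mac->opened = false;
}

/* Open/init: the place where the flag used to be set. */
static void sketch_open(struct sketch_mac *mac)
{
	mac->phy.mac_managed_pm = true;
	mac->opened = true;
}

/* Bus-level resume: does nothing when the MAC manages PHY PM itself. */
static void sketch_bus_resume(struct sketch_mac *mac)
{
	if (mac->phy.mac_managed_pm)
		return;
	mac->phy.bus_resumes++;
}
```

If the flag is only set in the open path, a suspend/resume cycle on an interface that was never brought up still goes through the bus-level PHY handling; setting it at probe time avoids that.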
      
      Fixes: 6a1dbfef ("net: sh_eth: Fix PHY state warning splat during system resume")
      Suggested-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
      Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      c6be7136
    • Wolfram Sang's avatar
      ravb: avoid PHY being resumed when interface is not up · 7f5ebf5d
      Wolfram Sang authored
      
      
      RAVB doesn't need mdiobus suspend/resume, which is why it sets
      'mac_managed_pm'. However, setting it needs to be moved from init to
      probe, so mdiobus PM functions will really never be called (e.g. when
      the interface is not up yet during suspend/resume).
      
      Fixes: 4924c0cd ("net: ravb: Fix PHY state warning splat during system resume")
      Suggested-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
      Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      7f5ebf5d
    • David S. Miller's avatar
      Merge branch 'virtio_net-xdp-bugs' · 04504793
      David S. Miller authored
      
      
      Xuan Zhuo says:
      
      ====================
      virtio_net: fix two bugs related to XDP
      
      This patch set fixes two bugs related to XDP.
      The two patches are independent of each other.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      04504793
    • Xuan Zhuo's avatar
      virtio_net: free xdp shinfo frags when build_skb_from_xdp_buff() fails · 1a3bd6ea
      Xuan Zhuo authored
      
      
      build_skb_from_xdp_buff() may return NULL. In that case
      we need to free the frags of the xdp shinfo.
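
The error path can be sketched as follows. This is a userspace model with hypothetical names, in which fragment ownership is represented by simple reference counts; it is not the driver's code.

```c
#include <stddef.h>

#define SKETCH_MAX_FRAGS 4

/* Hypothetical model of an XDP buffer that owns references on its frags. */
struct sketch_xdp_buff {
	int nr_frags;
	int frag_refcount[SKETCH_MAX_FRAGS];
};

/* Drops the reference held on every fragment. */
static void sketch_put_frags(struct sketch_xdp_buff *xdp)
{
	int i;

	for (i = 0; i < xdp->nr_frags; i++)
		xdp->frag_refcount[i]--;
	xdp->nr_frags = 0;
}

/*
 * skb is the result of the (modelled) skb construction and may be NULL.
 * Returns 0 on success, -1 on failure. On failure the frags are released
 * here, because no skb exists that could take ownership of them.
 */
static int sketch_finish_pass(struct sketch_xdp_buff *xdp, const void *skb)
{
	if (!skb) {
		sketch_put_frags(xdp);
		return -1;
	}
	return 0;
}
```

On success the references stay in place, since the frags now belong to the skb; only the failure path has to drop them.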
      
      Fixes: fab89baf ("virtio-net: support multi-buffer xdp")
      Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1a3bd6ea
    • Xuan Zhuo's avatar
      virtio_net: fix page_to_skb() miss headroom · fa0f1ba7
      Xuan Zhuo authored
      
      
      Because headroom is not passed to page_to_skb(), the shinfo is placed
      beyond the end of the buffer, where its frags can be overwritten by
      another user of that memory.
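
The arithmetic behind the overrun can be sketched in userspace. The model assumes that the skb is built in place over the receive buffer, that the shared info sits at the tail of what the builder believes to be the buffer, and that the believed buffer start is the data pointer minus the headroom it was told about. All names are hypothetical.

```c
#include <stddef.h>

/*
 * data_off:         offset of the packet data from the real buffer start
 * assumed_headroom: headroom the skb builder was told about
 *                   (must not exceed data_off)
 * buf_size:         size of the real buffer
 *
 * Returns how many bytes the tail of the built skb (where the shared
 * info lives) extends past the real end of the buffer.
 */
static size_t sketch_shinfo_overrun(size_t data_off, size_t assumed_headroom,
				    size_t buf_size)
{
	/* Where the builder thinks the buffer starts. */
	size_t believed_start = data_off - assumed_headroom;
	/* One past the end of the shared info, relative to the real start. */
	size_t believed_end = believed_start + buf_size;

	return believed_end > buf_size ? believed_end - buf_size : 0;
}
```

With 256 bytes of real headroom and an assumed headroom of 0, the tail lands 256 bytes past the buffer; passing the real headroom brings the overrun to 0.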
      
      [  157.724634] stack segment: 0000 [#1] PREEMPT SMP NOPTI
      [  157.725358] CPU: 3 PID: 679 Comm: xdp_pass_user_f Tainted: G            E      6.2.0+ #150
      [  157.726401] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/4
      [  157.727820] RIP: 0010:skb_release_data+0x11b/0x180
      [  157.728449] Code: 44 24 02 48 83 c3 01 39 d8 7e be 48 89 d8 48 c1 e0 04 41 80 7d 7e 00 49 8b 6c 04 30 79 0c 48 89 ef e8 89 b
      [  157.730751] RSP: 0018:ffffc90000178b48 EFLAGS: 00010202
      [  157.731383] RAX: 0000000000000010 RBX: 0000000000000001 RCX: 0000000000000000
      [  157.732270] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff888100dd0b00
      [  157.733117] RBP: 5d5d76010f6e2408 R08: ffff888100dd0b2c R09: 0000000000000000
      [  157.734013] R10: ffffffff82effd30 R11: 000000000000a14e R12: ffff88810981ffc0
      [  157.734904] R13: ffff888100dd0b00 R14: 0000000000000002 R15: 0000000000002310
      [  157.735793] FS:  00007f06121d9740(0000) GS:ffff88842fcc0000(0000) knlGS:0000000000000000
      [  157.736794] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  157.737522] CR2: 00007ffd9a56c084 CR3: 0000000104bda001 CR4: 0000000000770ee0
      [  157.738420] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  157.739283] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  157.740146] PKRU: 55555554
      [  157.740502] Call Trace:
      [  157.740843]  <IRQ>
      [  157.741117]  kfree_skb_reason+0x50/0x120
      [  157.741613]  __udp4_lib_rcv+0x52b/0x5e0
      [  157.742132]  ip_protocol_deliver_rcu+0xaf/0x190
      [  157.742715]  ip_local_deliver_finish+0x77/0xa0
      [  157.743280]  ip_sublist_rcv_finish+0x80/0x90
      [  157.743834]  ip_list_rcv_finish.constprop.0+0x16f/0x190
      [  157.744493]  ip_list_rcv+0x126/0x140
      [  157.744952]  __netif_receive_skb_list_core+0x29b/0x2c0
      [  157.745602]  __netif_receive_skb_list+0xed/0x160
      [  157.746190]  ? udp4_gro_receive+0x275/0x350
      [  157.746732]  netif_receive_skb_list_internal+0xf2/0x1b0
      [  157.747398]  napi_gro_receive+0xd1/0x210
      [  157.747911]  virtnet_receive+0x75/0x1c0
      [  157.748422]  virtnet_poll+0x48/0x1b0
      [  157.748878]  __napi_poll+0x29/0x1b0
      [  157.749330]  net_rx_action+0x27a/0x340
      [  157.749812]  __do_softirq+0xf3/0x2fb
      [  157.750298]  do_softirq+0xa2/0xd0
      [  157.750745]  </IRQ>
      [  157.751563]  <TASK>
      [  157.752329]  __local_bh_enable_ip+0x6d/0x80
      [  157.753178]  virtnet_xdp_set+0x482/0x860
      [  157.754159]  ? __pfx_virtnet_xdp+0x10/0x10
      [  157.755129]  dev_xdp_install+0xa4/0xe0
      [  157.756033]  dev_xdp_attach+0x20b/0x5e0
      [  157.756933]  do_setlink+0x82e/0xc90
      [  157.757777]  ? __nla_validate_parse+0x12b/0x1e0
      [  157.758744]  rtnl_setlink+0xd8/0x170
      [  157.759549]  ? mod_objcg_state+0xcb/0x320
      [  157.760328]  ? security_capable+0x37/0x60
      [  157.761209]  ? security_capable+0x37/0x60
      [  157.762072]  rtnetlink_rcv_msg+0x145/0x3d0
      [  157.762929]  ? ___slab_alloc+0x327/0x610
      [  157.763754]  ? __alloc_skb+0x141/0x170
      [  157.764533]  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
      [  157.765422]  netlink_rcv_skb+0x58/0x110
      [  157.766229]  netlink_unicast+0x21f/0x330
      [  157.766951]  netlink_sendmsg+0x240/0x4a0
      [  157.767654]  sock_sendmsg+0x93/0xa0
      [  157.768434]  ? sockfd_lookup_light+0x12/0x70
      [  157.769245]  __sys_sendto+0xfe/0x170
      [  157.770079]  ? handle_mm_fault+0xe9/0x2d0
      [  157.770859]  ? preempt_count_add+0x51/0xa0
      [  157.771645]  ? up_read+0x3c/0x80
      [  157.772340]  ? do_user_addr_fault+0x1e9/0x710
      [  157.773166]  ? kvm_read_and_reset_apf_flags+0x49/0x60
      [  157.774087]  __x64_sys_sendto+0x29/0x30
      [  157.774856]  do_syscall_64+0x3c/0x90
      [  157.775518]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
      [  157.776382] RIP: 0033:0x7f06122def70
      
      Fixes: 18117a84 ("virtio-net: remove xdp related info from page_to_skb()")
      Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fa0f1ba7
    • Rob Herring's avatar
      net: Use of_property_read_bool() for boolean properties · 1a87e641
      Rob Herring authored
      
      
      It is preferred to use typed property access functions (i.e.
      of_property_read_<type> functions) rather than low-level
      of_get_property/of_find_property functions for reading properties.
      Convert reading boolean properties to of_property_read_bool().
      
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for net/can
      Acked-by: Kalle Valo <kvalo@kernel.org>
      Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Acked-by: Francois Romieu <romieu@fr.zoreil.com>
      Reviewed-by: Wei Fang <wei.fang@nxp.com>
      Signed-off-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1a87e641
    • David S. Miller's avatar
      Merge branch 'net-dsa-marvell-mtu-reporting' · 65d63e82
      David S. Miller authored
      
      
      Vladimir Oltean says:
      
      ====================
      Fix MTU reporting for Marvell DSA switches where we can't change it
      
      As explained in patch 2, the driver doesn't know how to change the MTU
      on MV88E6165, MV88E6191, MV88E6220, MV88E6250 and MV88E6290, and there
      is a regression where it actually reports an MTU value below the
      Ethernet standard (1500).
      
      Fixing that shows another issue where DSA is unprepared to be told that
      a switch supports an MTU of only 1500, and still errors out. That is
      addressed by patch 1.
      
      Testing was not done on "real" hardware, but on a different Marvell DSA
      switch, with code modified such that the driver doesn't know how to
      change the MTU on that, either.
      
      A key assumption is that these switches don't need any MTU configuration
      to pass full MTU-sized, DSA-tagged packets, which seems like a
      reasonable assumption to make. My 6390 and 6190 switches, with
      .port_set_jumbo_size commented out, certainly don't seem to have any
      problem passing MTU-sized traffic, as can be seen in this iperf3 session
      captured with tcpdump on the DSA master:
      
      $MAC > $MAC, Marvell DSA mode Forward, dev 2, port 8, untagged, VID 1000,
      	FPri 0, ethertype IPv4 (0x0800), length 1518:
      	10.0.0.69.49590 > 10.0.0.1.5201: Flags [.], seq 81088:82536,
      	ack 1, win 502, options [nop,nop,TS val 2221498829 ecr 3012859850],
      	length 1448
      
      I don't want to go all the way and say that the adjustment made by
      commit b9c587fe ("dsa: mv88e6xxx: Include tagger overhead when
      setting MTU for DSA and CPU ports") is completely unnecessary, just that
      there's an equally good chance that the switches with unknown MTU
      configuration procedure "just work".
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      65d63e82
    • Vladimir Oltean's avatar
      net: dsa: mv88e6xxx: fix max_mtu of 1492 on 6165, 6191, 6220, 6250, 6290 · 7e951737
      Vladimir Oltean authored
      
      
      There are 3 classes of switch families that the driver is aware of, as
      far as mv88e6xxx_change_mtu() is concerned:
      
      - MTU configuration is available per port. Here, the
        chip->info->ops->port_set_jumbo_size() method will be present.
      
      - MTU configuration is global to the switch. Here, the
        chip->info->ops->set_max_frame_size() method will be present.
      
      - We don't know how to change the MTU. Here, none of the above methods
        will be present.
      
      Switch families MV88E6165, MV88E6191, MV88E6220, MV88E6250 and MV88E6290
      fall in category 3.
      
      The blamed commit has adjusted the MTU for all 3 categories by EDSA_HLEN
      (8 bytes), resulting in a new maximum MTU of 1492 being reported by the
      driver for these switches.
      
      I don't have the hardware to test, but I do have a MV88E6390 switch on
      which I can simulate this by commenting out its .port_set_jumbo_size
      definition from mv88e6390_ops. The result is this set of messages at
      probe time:
      
      mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 1
      mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 2
      mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 3
      mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 4
      mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 5
      mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 6
      mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 7
      mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 8
      
      It is highly implausible that there exist Ethernet switches which don't
      support the standard MTU of 1500 octets, and this is what the DSA
      framework says as well - the error comes from dsa_slave_create() ->
      dsa_slave_change_mtu(slave_dev, ETH_DATA_LEN).
      
      But the error messages are alarming, and it would be good to suppress
      them.
      
      Because such switches are unlikely to exist, we reimplement mv88e6xxx_get_max_mtu()
      and mv88e6xxx_change_mtu() on switches from the 3rd category as follows:
      the maximum supported MTU is 1500, and any request to set the MTU to a
      value larger than that fails in dev_validate_mtu().
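
The numbers can be checked with a small sketch. It assumes that the reported 1492 came from subtracting the VLAN Ethernet header (18 bytes), EDSA_HLEN (8 bytes) and the FCS (4 bytes) from a 1522-byte maximum frame; that derivation, the enum and the function names are assumptions for illustration, not the driver's code.

```c
#define SKETCH_ETH_DATA_LEN  1500 /* standard Ethernet MTU */
#define SKETCH_VLAN_ETH_HLEN 18
#define SKETCH_EDSA_HLEN     8
#define SKETCH_ETH_FCS_LEN   4

enum sketch_mtu_class {
	SKETCH_MTU_PER_PORT, /* per-port jumbo size method present */
	SKETCH_MTU_GLOBAL,   /* global max frame size method present */
	SKETCH_MTU_UNKNOWN,  /* neither method present */
};

/* Old behaviour for the third class: overhead subtracted from 1522. */
static int sketch_old_max_mtu_unknown(void)
{
	return 1522 - SKETCH_VLAN_ETH_HLEN - SKETCH_EDSA_HLEN -
	       SKETCH_ETH_FCS_LEN;
}

/*
 * New behaviour: the third class reports the standard MTU, the other two
 * derive it from the largest frame the hardware can be configured for
 * (max_frame_size, supplied by the caller in this sketch).
 */
static int sketch_max_mtu(enum sketch_mtu_class cls, int max_frame_size)
{
	if (cls == SKETCH_MTU_UNKNOWN)
		return SKETCH_ETH_DATA_LEN;

	return max_frame_size - SKETCH_VLAN_ETH_HLEN - SKETCH_EDSA_HLEN -
	       SKETCH_ETH_FCS_LEN;
}
```

Under these assumptions the old calculation yields 1492 for the third class, while the reimplemented one yields 1500 regardless of the frame size argument.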
      
      Fixes: b9c587fe ("dsa: mv88e6xxx: Include tagger overhead when setting MTU for DSA and CPU ports")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7e951737
    • Vladimir Oltean's avatar
      net: dsa: don't error out when drivers return ETH_DATA_LEN in .port_max_mtu() · 636e8adf
      Vladimir Oltean authored
      
      
      Currently, when dsa_slave_change_mtu() is called on a user port where
      dev->max_mtu is 1500 (as returned by ds->ops->port_max_mtu()), the code
      will stumble upon this check:
      
      	if (new_master_mtu > mtu_limit)
      		return -ERANGE;
      
      because new_master_mtu is adjusted for the tagger overhead but mtu_limit
      is not.
      
      But it would be good if the logic went through, for example if the DSA
      master really depends on an MTU adjustment to accept DSA-tagged frames.
      
      To make the code pass through the check, we need to adjust mtu_limit for
      the overhead as well, if the minimum restriction was caused by the DSA
      user port's MTU (dev->max_mtu). A DSA user port MTU and a DSA master MTU
      are always offset by the protocol overhead.
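
The check quoted above can be modelled in userspace to show why the limit needs the same adjustment as the new master MTU. The function and parameter names are hypothetical, and the limit is reduced to the minimum of the master's maximum and the user port's maximum, which is an assumption about the surrounding code.

```c
#include <errno.h>

static int sketch_min(int a, int b)
{
	return a < b ? a : b;
}

/*
 * master_max_mtu: largest MTU the DSA master accepts
 * user_max_mtu:   dev->max_mtu of the user port
 * largest_mtu:    largest MTU requested across the user ports
 * overhead:       tagger overhead in bytes
 * adjust_limit:   non-zero to add the overhead to the user port limit
 *
 * Returns 0 if the new master MTU passes the check, -ERANGE otherwise.
 */
static int sketch_check_master_mtu(int master_max_mtu, int user_max_mtu,
				   int largest_mtu, int overhead,
				   int adjust_limit)
{
	int user_limit = user_max_mtu + (adjust_limit ? overhead : 0);
	int mtu_limit = sketch_min(master_max_mtu, user_limit);
	int new_master_mtu = largest_mtu + overhead;

	if (new_master_mtu > mtu_limit)
		return -ERANGE;

	return 0;
}
```

With a user port limit of 1500, a requested MTU of 1500 and an 8-byte overhead, the unadjusted limit rejects the request (1508 > 1500) and the adjusted one accepts it.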
      
      Currently no drivers return 1500 from .port_max_mtu(), but this is only
      temporary and a bug in itself - mv88e6xxx should have done that, but
      since commit b9c587fe ("dsa: mv88e6xxx: Include tagger overhead when
      setting MTU for DSA and CPU ports") it no longer does. This is a
      preparation for fixing that.
      
      Fixes: bfcb8132 ("net: dsa: configure the MTU for switch ports")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      636e8adf
    • Maciej Fijalkowski's avatar
      ice: xsk: disable txq irq before flushing hw · b830c964
      Maciej Fijalkowski authored
      
      
      ice_qp_dis() is meant to stop a given queue pair that is the target of an
      xsk pool attach/detach. One of the steps is to disable interrupts on these
      queues. This is currently broken: the txq irq is turned off *after* the
      HW flush, so disabling it has no effect.
      
      ice_qp_dis():
      -> ice_qvec_dis_irq()
      --> disable rxq irq
      --> flush hw
      -> ice_vsi_stop_tx_ring()
      --> disable txq irq
      
      The splat below can be triggered by the following steps:
      - start xdpsock WITHOUT loading xdp prog
      - run xdp_rxq_info with XDP_TX action on this interface
      - start traffic
      - terminate xdpsock
      
      [  256.312485] BUG: kernel NULL pointer dereference, address: 0000000000000018
      [  256.319560] #PF: supervisor read access in kernel mode
      [  256.324775] #PF: error_code(0x0000) - not-present page
      [  256.329994] PGD 0 P4D 0
      [  256.332574] Oops: 0000 [#1] PREEMPT SMP NOPTI
      [  256.337006] CPU: 3 PID: 32 Comm: ksoftirqd/3 Tainted: G           OE      6.2.0-rc5+ #51
      [  256.345218] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
      [  256.355807] RIP: 0010:ice_clean_rx_irq_zc+0x9c/0x7d0 [ice]
      [  256.361423] Code: b7 8f 8a 00 00 00 66 39 ca 0f 84 f1 04 00 00 49 8b 47 40 4c 8b 24 d0 41 0f b7 45 04 66 25 ff 3f 66 89 04 24 0f 84 85 02 00 00 <49> 8b 44 24 18 0f b7 14 24 48 05 00 01 00 00 49 89 04 24 49 89 44
      [  256.380463] RSP: 0018:ffffc900088bfd20 EFLAGS: 00010206
      [  256.385765] RAX: 000000000000003c RBX: 0000000000000035 RCX: 000000000000067f
      [  256.393012] RDX: 0000000000000775 RSI: 0000000000000000 RDI: ffff8881deb3ac80
      [  256.400256] RBP: 000000000000003c R08: ffff889847982710 R09: 0000000000010000
      [  256.407500] R10: ffffffff82c060c0 R11: 0000000000000004 R12: 0000000000000000
      [  256.414746] R13: ffff88811165eea0 R14: ffffc9000d255000 R15: ffff888119b37600
      [  256.421990] FS:  0000000000000000(0000) GS:ffff8897e0cc0000(0000) knlGS:0000000000000000
      [  256.430207] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  256.436036] CR2: 0000000000000018 CR3: 0000000005c0a006 CR4: 00000000007706e0
      [  256.443283] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  256.450527] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  256.457770] PKRU: 55555554
      [  256.460529] Call Trace:
      [  256.463015]  <TASK>
      [  256.465157]  ? ice_xmit_zc+0x6e/0x150 [ice]
      [  256.469437]  ice_napi_poll+0x46d/0x680 [ice]
      [  256.473815]  ? _raw_spin_unlock_irqrestore+0x1b/0x40
      [  256.478863]  __napi_poll+0x29/0x160
      [  256.482409]  net_rx_action+0x136/0x260
      [  256.486222]  __do_softirq+0xe8/0x2e5
      [  256.489853]  ? smpboot_thread_fn+0x2c/0x270
      [  256.494108]  run_ksoftirqd+0x2a/0x50
      [  256.497747]  smpboot_thread_fn+0x1c1/0x270
      [  256.501907]  ? __pfx_smpboot_thread_fn+0x10/0x10
      [  256.506594]  kthread+0xea/0x120
      [  256.509785]  ? __pfx_kthread+0x10/0x10
      [  256.513597]  ret_from_fork+0x29/0x50
      [  256.517238]  </TASK>
      
      In fact, irqs were not disabled and napi managed to be scheduled and run
      while xsk_pool pointer was still valid, but SW ring of xdp_buff pointers
      was already freed.
      
      To fix this, call ice_qvec_dis_irq() after ice_vsi_stop_tx_ring(). Also
      while at it, remove redundant ice_clean_rx_ring() call - this is handled
      in ice_qp_clean_rings().
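
The ordering problem can be modelled in a few lines. This sketch follows the call sequence described in this message only; the struct, the function names and the "flush observed both irqs off" flag are hypothetical and do not reflect the driver's internals.

```c
#include <stdbool.h>

/* Hypothetical model of one queue pair. */
struct sketch_qp {
	bool rxq_irq_on;
	bool txq_irq_on;
	bool flush_saw_all_irqs_off;
};

/* Models ice_qvec_dis_irq(): disable rxq irq, then flush hw. */
static void sketch_qvec_dis_irq(struct sketch_qp *qp)
{
	qp->rxq_irq_on = false;
	/* The flush only covers what has been disabled so far. */
	qp->flush_saw_all_irqs_off = !qp->rxq_irq_on && !qp->txq_irq_on;
}

/* Models the part of ice_vsi_stop_tx_ring() that disables the txq irq. */
static void sketch_stop_tx_ring(struct sketch_qp *qp)
{
	qp->txq_irq_on = false;
}

/* Returns true when the flush happened with both irqs already disabled. */
static bool sketch_qp_dis(struct sketch_qp *qp, bool fixed_order)
{
	qp->rxq_irq_on = true;
	qp->txq_irq_on = true;
	qp->flush_saw_all_irqs_off = false;

	if (fixed_order) {
		sketch_stop_tx_ring(qp);
		sketch_qvec_dis_irq(qp);
	} else {
		sketch_qvec_dis_irq(qp);
		sketch_stop_tx_ring(qp);
	}

	return qp->flush_saw_all_irqs_off;
}
```

In the original order the flush runs while the txq irq is still enabled; calling the irq-disable step after stopping the Tx ring lets the flush cover both.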
      
      Fixes: 2d4238f5 ("ice: Add support for AF_XDP")
      Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
      Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b830c964