  1. Apr 10, 2024
    • ata: sata_mv: Fix PCI device ID table declaration compilation warning · 00f75760
      Arnd Bergmann authored
      [ Upstream commit 3137b83a ]
      
      Building with W=1 shows a warning for an unused variable when CONFIG_PCI
      is disabled:
      
      drivers/ata/sata_mv.c:790:35: error: unused variable 'mv_pci_tbl' [-Werror,-Wunused-const-variable]
      static const struct pci_device_id mv_pci_tbl[] = {
      
      Move the table into the same block that contains the pci_driver
      definition.
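
The idea of keeping a table next to its only user can be sketched in plain C. This is a minimal illustration, not the driver code: `DEMO_CONFIG_PCI`, `struct demo_device_id`, `demo_pci_tbl` and `demo_pci_id_count()` are hypothetical stand-ins, and the ID values are illustrative only.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's CONFIG_PCI option. */
#define DEMO_CONFIG_PCI 1

/* Hypothetical, simplified stand-in for struct pci_device_id. */
struct demo_device_id {
    unsigned int vendor;
    unsigned int device;
};

#if DEMO_CONFIG_PCI
/* The table lives in the same conditional block as its only user,
 * so it is never defined without being referenced. */
static const struct demo_device_id demo_pci_tbl[] = {
    { 0x11ab, 0x5040 },  /* illustrative values */
    { 0x11ab, 0x6041 },
};

static size_t demo_pci_id_count(void)
{
    return sizeof(demo_pci_tbl) / sizeof(demo_pci_tbl[0]);
}
#else
/* Without PCI support there is no table and nothing left unused. */
static size_t demo_pci_id_count(void)
{
    return 0;
}
#endif
```

Setting `DEMO_CONFIG_PCI` to 0 removes both the table and its user together, which is what avoids the unused-variable warning.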
      
      Fixes: 7bb3c529 ("sata_mv: Remove PCI dependency")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • scsi: mylex: Fix sysfs buffer lengths · e0ad4c27
      Arnd Bergmann authored
      [ Upstream commit 1197c5b2 ]
      
      The myrb and myrs drivers use an odd way of implementing their sysfs files,
      calling snprintf() with a fixed length of 32 bytes to print into a page
      sized buffer. One of the strings is actually longer than 32 bytes, which
      clang can warn about:
      
      drivers/scsi/myrb.c:1906:10: error: 'snprintf' will always be truncated; specified size is 32, but format string expands to at least 34 [-Werror,-Wformat-truncation]
      drivers/scsi/myrs.c:1089:10: error: 'snprintf' will always be truncated; specified size is 32, but format string expands to at least 34 [-Werror,-Wformat-truncation]
      
      These could all be plain sprintf() without a length as the buffer is always
      long enough. On the other hand, sysfs files should not be overly long
      either, so just double the length to make sure the longest strings don't
      get truncated here.
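
The truncation can be reproduced in userspace. The sketch below assumes a hypothetical 38-byte status string (`demo_status`) and a hypothetical helper `demo_show()`; it is not the myrb/myrs code, only an illustration of how the size argument of snprintf() limits the output.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical status text, 38 bytes long including the newline. */
static const char demo_status[] = "rebuild in progress on logical device\n";

/* Prints the status into buf, writing at most limit bytes including the
 * terminating NUL. Returns snprintf()'s result, which is the length the
 * complete string would need, regardless of truncation. */
static int demo_show(char *buf, size_t limit)
{
    return snprintf(buf, limit, "%s", demo_status);
}
```

With a limit of 32 the return value exceeds the limit and only 31 characters are stored; with a limit of 64 the whole string fits.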
      
      Fixes: 77266186 ("scsi: myrs: Add Mylex RAID controller (SCSI interface)")
      Fixes: 081ff398 ("scsi: myrb: Add Mylex RAID controller (block interface)")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Link: https://lore.kernel.org/r/20240326223825.4084412-8-arnd@kernel.org
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ata: sata_sx4: fix pdc20621_get_from_dimm() on 64-bit · 117d7ef3
      Arnd Bergmann authored
      [ Upstream commit 52f80bb1 ]
      
      gcc warns about a memcpy() with overlapping pointers because of an
      incorrect size calculation:
      
      In file included from include/linux/string.h:369,
                       from drivers/ata/sata_sx4.c:66:
      In function 'memcpy_fromio',
          inlined from 'pdc20621_get_from_dimm.constprop' at drivers/ata/sata_sx4.c:962:2:
      include/linux/fortify-string.h:97:33: error: '__builtin_memcpy' accessing 4294934464 bytes at offsets 0 and [16, 16400] overlaps 6442385281 bytes at offset -2147450817 [-Werror=restrict]
         97 | #define __underlying_memcpy     __builtin_memcpy
            |                                 ^
      include/linux/fortify-string.h:620:9: note: in expansion of macro '__underlying_memcpy'
        620 |         __underlying_##op(p, q, __fortify_size);                        \
            |         ^~~~~~~~~~~~~
      include/linux/fortify-string.h:665:26: note: in expansion of macro '__fortify_memcpy_chk'
        665 | #define memcpy(p, q, s)  __fortify_memcpy_chk(p, q, s,                  \
            |                          ^~~~~~~~~~~~~~~~~~~~
      include/asm-generic/io.h:1184:9: note: in expansion of macro 'memcpy'
       1184 |         memcpy(buffer, __io_virt(addr), size);
            |         ^~~~~~
      
      The problem here is the overflow of an unsigned 32-bit number to a
      negative that gets converted into a signed 'long', keeping a large
      positive number.
      
      Replace the complex calculation with a more readable min() variant
      that avoids the warning.
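
The difference between the two size calculations can be sketched as follows. This is a model under stated assumptions, not the driver source: `demo_dist_old()` and `demo_dist_new()` are hypothetical names, `int64_t` plays the role of a 64-bit 'long', and `offset` is assumed to be smaller than `window`.

```c
#include <stdint.h>

/* Models the old pattern: the 32-bit difference wraps around before it is
 * widened, so the widened value is never negative and the fallback branch
 * is never taken on a 64-bit system. */
static uint32_t demo_dist_old(uint32_t window, uint32_t offset, uint32_t size)
{
    int64_t room = (int64_t)(uint32_t)(window - (offset + size));

    return room >= 0 ? size : window - offset;
}

/* Models the min() variant: copy at most what is left in the window. */
static uint32_t demo_dist_new(uint32_t window, uint32_t offset, uint32_t size)
{
    uint32_t room = window - offset;

    return size < room ? size : room;
}
```

For a request that crosses the end of the window, the old form returns the full size while the min() form clamps it to the remaining room.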
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ASoC: ops: Fix wraparound for mask in snd_soc_get_volsw · e9b71370
      Stephen Lee authored
      [ Upstream commit fc563aa9 ]
      
      In snd_soc_info_volsw(), mask is generated by figuring out the index of
      the most significant bit set in max and converting the index to a
      bitmask through bit shift 1. Unintended wraparound occurs when max is an
      integer value with the MSB set. Since the bit shift value 1 is treated
      as an integer type, the left shift operation will wrap around and set
      mask to 0 instead of all 1's. To fix this, we type cast 1 as
      `1ULL` to prevent the wraparound.
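
A minimal userspace sketch of the mask computation with the 64-bit shift. `demo_fls()` is a hypothetical stand-in for a "find last set bit" helper and `demo_mask()` is a hypothetical name; the exact expression used in the ASoC code is assumed, not quoted.

```c
/* Hypothetical stand-in for a "find last set" helper: returns the index
 * of the most significant set bit counting from 1, or 0 if x is 0. */
static int demo_fls(unsigned int x)
{
    int bits = 0;

    while (x) {
        bits++;
        x >>= 1;
    }
    return bits;
}

/* Builds a mask covering every bit up to the MSB of max. Using 1ULL keeps
 * the shift in 64 bits, so a shift count of 32 is still well defined. */
static unsigned int demo_mask(unsigned int max)
{
    return (unsigned int)((1ULL << demo_fls(max)) - 1);
}
```

With a plain `1` the shift by 32 would not be valid for a 32-bit int, which is the case the commit describes for values with the top bit set.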
      
      Fixes: 7077148f ("ASoC: core: Split ops out of soc-core.c")
      Signed-off-by: Stephen Lee <slee08177@gmail.com>
      Link: https://msgid.link/r/20240326010131.6211-1-slee08177@gmail.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ASoC: rt711-sdw: fix locking sequence · 562adaf7
      Pierre-Louis Bossart authored
      [ Upstream commit aae86cfd ]
      
      The disable_irq_lock protects the 'disable_irq' value, we need to lock
      before testing it.
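
The corrected ordering can be sketched in userspace with a pthread mutex. `struct demo_codec`, `demo_resume()` and `irq_enable_count` are hypothetical; the counter merely stands in for re-enabling the interrupt.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the codec private data. */
struct demo_codec {
    pthread_mutex_t disable_irq_lock;
    bool disable_irq;
    int irq_enable_count;
};

static void demo_resume(struct demo_codec *c)
{
    /* Take the lock first: the flag is only meaningful while it is held. */
    pthread_mutex_lock(&c->disable_irq_lock);
    if (c->disable_irq) {
        c->irq_enable_count++;  /* stands in for re-enabling the interrupt */
        c->disable_irq = false;
    }
    pthread_mutex_unlock(&c->disable_irq_lock);
}
```

Testing the flag before taking the lock would let another thread change it between the test and the update.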
      
      Fixes: b69de265 ("ASoC: rt711: fix for JD event handling in ClockStop Mode0")
      Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
      Reviewed-by: Chao Song <chao.song@linux.intel.com>
      Link: https://msgid.link/r/20240325221817.206465-4-pierre-louis.bossart@linux.intel.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ASoC: rt711-sdca: fix locking sequence · bcf894d7
      Pierre-Louis Bossart authored
      [ Upstream commit ee287771 ]
      
      The disable_irq_lock protects the 'disable_irq' value, we need to lock
      before testing it.
      
      Fixes: 23adeb70 ("ASoC: rt711-sdca: fix for JD event handling in ClockStop Mode0")
      Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
      Reviewed-by: Chao Song <chao.song@linux.intel.com>
      Link: https://msgid.link/r/20240325221817.206465-3-pierre-louis.bossart@linux.intel.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ASoC: rt5682-sdw: fix locking sequence · b53cf951
      Pierre-Louis Bossart authored
      [ Upstream commit 310a5caa ]
      
      The disable_irq_lock protects the 'disable_irq' value, we need to lock
      before testing it.
      
      Fixes: 02fb23d7 ("ASoC: rt5682-sdw: fix for JD event handling in ClockStop Mode0")
      Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
      Reviewed-by: Chao Song <chao.song@linux.intel.com>
      Link: https://msgid.link/r/20240325221817.206465-2-pierre-louis.bossart@linux.intel.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: ravb: Always process TX descriptor ring · 9df33e57
      Paul Barker authored
      [ Upstream commit 596a4254 ]
      
      The TX queue should be serviced each time the poll function is called,
      even if the full RX work budget has been consumed. This prevents
      starvation of the TX queue when RX bandwidth usage is high.
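
The poll ordering described above can be modelled with a toy queue. Everything here is hypothetical (`struct demo_queue`, `demo_poll()`); it only shows TX being serviced before the early return taken when the RX budget is exhausted.

```c
/* Hypothetical model of a device queue: counts of pending work items. */
struct demo_queue {
    int rx_pending;
    int tx_pending;
};

/* Processes up to budget RX items and returns how many were handled. */
static int demo_poll(struct demo_queue *q, int budget)
{
    int work_done = q->rx_pending < budget ? q->rx_pending : budget;

    q->rx_pending -= work_done;

    /* Service TX unconditionally, before any early return. */
    q->tx_pending = 0;

    if (work_done == budget)
        return budget;  /* RX work may remain; the caller polls again */

    /* A real driver would complete NAPI and re-enable interrupts here. */
    return work_done;
}
```

If the TX cleanup sat after the early return instead, a steady stream of full RX budgets would keep it from ever running.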
      
      Fixes: c156633f ("Renesas Ethernet AVB driver proper")
      Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
      Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
      Link: https://lore.kernel.org/r/20240402145305.82148-1-paul.barker.ct@bp.renesas.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: fec: Set mac_managed_pm during probe · fcc739d7
      Wei Fang authored
      [ Upstream commit cbc17e78 ]
      
      Setting mac_managed_pm during interface up is too late.
      
      In situations where the link is not brought up yet and the system suspends
      the regular PHY power management will run. Since the FEC ETHEREN control
      bit is cleared (automatically) on suspend the controller is off in resume.
      When the regular PHY power management resume path runs in this context it
      will write to the MII_DATA register but nothing will be transmitted on the
      MDIO bus.
      
      This can be observed by the following log:
      
          fec 5b040000.ethernet eth0: MDIO read timeout
          Microchip LAN87xx T1 5b040000.ethernet-1:04: PM: dpm_run_callback(): mdio_bus_phy_resume+0x0/0xc8 returns -110
          Microchip LAN87xx T1 5b040000.ethernet-1:04: PM: failed to resume: error -110
      
      The data written will however remain in the MII_DATA register.
      
      When the link later is set to administrative up it will trigger a call to
      fec_restart() which will restore the MII_SPEED register. This triggers the
      quirk explained in f166f890 ("net: ethernet: fec: Replace interrupt
      driven MDIO with polled IO") causing an extra MII_EVENT.
      
      This extra event desynchronizes all the MDIO register reads, causing them
      to complete too early. As a result, all reads return 0 because
      fec_enet_mdio_wait() returns too early.
      
      When a Microchip LAN8700R PHY is connected to the FEC, the 0 reads cause
      the PHY to be initialized incorrectly and the PHY will not transmit any
      ethernet signal in this state. It cannot be brought out of this state
      without a power cycle of the PHY.
      
      Fixes: 557d5dc8 ("net: fec: use mac-managed PHY PM")
      Closes: https://lore.kernel.org/netdev/1f45bdbe-eab1-4e59-8f24-add177590d27@actia.se/
      Signed-off-by: Wei Fang <wei.fang@nxp.com>
      [jernberg: commit message]
      Signed-off-by: John Ernberg <john.ernberg@actia.se>
      Link: https://lore.kernel.org/r/20240328155909.59613-2-john.ernberg@actia.se
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • drivers: net: convert to boolean for the mac_managed_pm flag · 498cc233
      Denis Kirjanov authored
      [ Upstream commit eca485d2 ]
      
      Signed-off-by: Dennis Kirjanov <dkirjanov@suse.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Stable-dep-of: cbc17e78 ("net: fec: Set mac_managed_pm during probe")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: usb: asix: suspend embedded PHY if external is used · 0985fbfb
      Oleksij Rempel authored
      [ Upstream commit 4d17d43d ]
      
      When an external PHY is used, we need to take care of the embedded PHY.
      Since there is no way to disable this PHY from the MAC side while
      keeping the RMII reference clock, we need to suspend it.
      
      This patch will reduce electrical noise (the PHY keeps sending FLPs)
      and power consumption by 0.22 W.
      
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Stable-dep-of: cbc17e78 ("net: fec: Set mac_managed_pm during probe")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • i40e: Enforce software interrupt during busy-poll exit · 342cb04d
      Ivan Vecera authored
      [ Upstream commit ea558de7 ]
      
      As with the ice bug fixed by commit b7306b42 ("ice: manage interrupts
      during poll exit") followed by commit 23be7075 ("ice: fix software
      generating extra interrupts"), I'm seeing a similar issue with the
      i40e driver.
      
      In certain situations when busy-loop is enabled together with adaptive
      coalescing, the driver occasionally misses that there are outstanding
      descriptors to clean when exiting busy poll.
      
      Try to catch the remaining work by triggering a software interrupt
      when exiting busy poll. No extra interrupts will be generated when
      busy polling is not used.
      
      The issue was found when running a sockperf ping-pong TCP test with
      adaptive coalescing and busy poll enabled (50 as the value of the
      busy_poll and busy_read sysctl knobs) and results in huge latency
      spikes of more than 100000us.
      
      The fix is inspired by the ice driver and does the following:
      1) During napi poll exit in case of busy-poll (napi_complete_done()
         returns false), records in the q_vector that we were in busy
         loop.
      2) Extends i40e_buildreg_itr() to be able to add an enforced software
         interrupt into the built value.
      3) In i40e_update_enable_itr(), enforces a software interrupt trigger
         if we are exiting busy poll to catch any pending clean-ups.
      4) Reuses the unused 3rd ITR (interrupt throttle) index and sets it to
         20K interrupts per second to limit the number of these sw interrupts.
      
      Test results
      ============
      Prior:
      [root@dell-per640-07 net]# sockperf ping-pong -i 10.9.9.1 --tcp -m 1000 --mps=max -t 120
      sockperf: == version #3.10-no.git ==
      sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)
      
      [ 0] IP = 10.9.9.1        PORT = 11111 # TCP
      sockperf: Warmup stage (sending a few dummy messages)...
      sockperf: Starting test...
      sockperf: Test end (interrupted by timer)
      sockperf: Test ended
      sockperf: [Total Run] RunTime=119.999 sec; Warm up time=400 msec; SentMessages=2438563; ReceivedMessages=2438562
      sockperf: ========= Printing statistics for Server No: 0
      sockperf: [Valid Duration] RunTime=119.549 sec; SentMessages=2429473; ReceivedMessages=2429473
      sockperf: ====> avg-latency=24.571 (std-dev=93.297, mean-ad=4.904, median-ad=1.510, siqr=1.063, cv=3.797, std-error=0.060, 99.0% ci=[24.417, 24.725])
      sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
      sockperf: Summary: Latency is 24.571 usec
      sockperf: Total 2429473 observations; each percentile contains 24294.73 observations
      sockperf: ---> <MAX> observation = 103294.331
      sockperf: ---> percentile 99.999 =   45.633
      sockperf: ---> percentile 99.990 =   37.013
      sockperf: ---> percentile 99.900 =   35.910
      sockperf: ---> percentile 99.000 =   33.390
      sockperf: ---> percentile 90.000 =   28.626
      sockperf: ---> percentile 75.000 =   27.741
      sockperf: ---> percentile 50.000 =   26.743
      sockperf: ---> percentile 25.000 =   25.614
      sockperf: ---> <MIN> observation =   12.220
      
      After:
      [root@dell-per640-07 net]# sockperf ping-pong -i 10.9.9.1 --tcp -m 1000 --mps=max -t 120
      sockperf: == version #3.10-no.git ==
      sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)
      
      [ 0] IP = 10.9.9.1        PORT = 11111 # TCP
      sockperf: Warmup stage (sending a few dummy messages)...
      sockperf: Starting test...
      sockperf: Test end (interrupted by timer)
      sockperf: Test ended
      sockperf: [Total Run] RunTime=119.999 sec; Warm up time=400 msec; SentMessages=2400055; ReceivedMessages=2400054
      sockperf: ========= Printing statistics for Server No: 0
      sockperf: [Valid Duration] RunTime=119.549 sec; SentMessages=2391186; ReceivedMessages=2391186
      sockperf: ====> avg-latency=24.965 (std-dev=5.934, mean-ad=4.642, median-ad=1.485, siqr=1.067, cv=0.238, std-error=0.004, 99.0% ci=[24.955, 24.975])
      sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
      sockperf: Summary: Latency is 24.965 usec
      sockperf: Total 2391186 observations; each percentile contains 23911.86 observations
      sockperf: ---> <MAX> observation =  195.841
      sockperf: ---> percentile 99.999 =   45.026
      sockperf: ---> percentile 99.990 =   39.009
      sockperf: ---> percentile 99.900 =   35.922
      sockperf: ---> percentile 99.000 =   33.482
      sockperf: ---> percentile 90.000 =   28.902
      sockperf: ---> percentile 75.000 =   27.821
      sockperf: ---> percentile 50.000 =   26.860
      sockperf: ---> percentile 25.000 =   25.685
      sockperf: ---> <MIN> observation =   12.277
      
      Fixes: 0bcd952f ("ethernet/intel: consolidate NAPI and NAPI exit")
      Reported-by: Hugo Ferreira <hferreir@redhat.com>
      Reviewed-by: Michal Schmidt <mschmidt@redhat.com>
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • i40e: Remove _t suffix from enum type names · c9bcd646
      Ivan Vecera authored
      [ Upstream commit addca917 ]
      
      Enum type names should not be suffixed by '_t'. The suffix only makes
      sense with 'typedef enum name name_t', which allows plain 'name_t var'
      instead of 'enum name_t var'.
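
The naming distinction can be shown in a few lines of C. The names `demo_state`, `demo_state_t` and `demo_next()` are hypothetical and unrelated to the i40e types.

```c
/* Plain enum: no '_t' suffix, and every use spells out 'enum'. */
enum demo_state {
    DEMO_IDLE,
    DEMO_BUSY,
};

/* A typedef is what makes a '_t' name usable without the 'enum' keyword. */
typedef enum demo_state demo_state_t;

static enum demo_state demo_next(enum demo_state s)
{
    return s == DEMO_IDLE ? DEMO_BUSY : DEMO_IDLE;
}
```

Both spellings name the same type; the point is only that a bare enum tag ending in `_t` suggests a typedef that does not exist.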
      
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Link: https://lore.kernel.org/r/20231113231047.548659-6-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Stable-dep-of: ea558de7 ("i40e: Enforce software interrupt during busy-poll exit")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • i40e: Store the irq number in i40e_q_vector · 2a0a64c9
      Joe Damato authored
      [ Upstream commit 6b85a4f3 ]
      
      Make it easy to figure out the IRQ number for a particular i40e_q_vector by
      storing the assigned IRQ in the structure itself.
      
      Signed-off-by: Joe Damato <jdamato@fastly.com>
      Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
      Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Stable-dep-of: ea558de7 ("i40e: Enforce software interrupt during busy-poll exit")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • Revert "usb: phy: generic: Get the vbus supply" · bf7396ec
      Alexander Stein authored
      [ Upstream commit fdada0db ]
      
      This reverts commit 75fd6485.
      This patch was applied twice by accident, causing probe failures.
      Revert the accident.
      
      Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
      Fixes: 75fd6485 ("usb: phy: generic: Get the vbus supply")
      Cc: stable <stable@kernel.org>
      Reviewed-by: Sean Anderson <sean.anderson@seco.com>
      Link: https://lore.kernel.org/r/20240314092628.1869414-1-alexander.stein@ew.tq-group.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • scsi: qla2xxx: Update manufacturer detail · 506a9ec5
      Bikash Hazarika authored
      [ Upstream commit 688fa069 ]
      
      Update manufacturer detail from "Marvell Semiconductor, Inc." to
      "Marvell".
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Bikash Hazarika <bhazarika@marvell.com>
      Signed-off-by: Nilesh Javali <njavali@marvell.com>
      Link: https://lore.kernel.org/r/20240227164127.36465-5-njavali@marvell.com
      Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • scsi: qla2xxx: Update manufacturer details · 315c4527
      Bikash Hazarika authored
      [ Upstream commit 1ccad277 ]
      
      Update manufacturer details to indicate Marvell Semiconductors.
      
      Link: https://lore.kernel.org/r/20220713052045.10683-10-njavali@marvell.com
      Cc: stable@vger.kernel.org
      Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
      Signed-off-by: Bikash Hazarika <bhazarika@marvell.com>
      Signed-off-by: Nilesh Javali <njavali@marvell.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Stable-dep-of: 688fa069 ("scsi: qla2xxx: Update manufacturer detail")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • i40e: fix vf may be used uninitialized in this function warning · 951d2748
      Aleksandr Loktionov authored
      commit f37c4eac upstream.
      
      Fix the regression introduced by commit 52424f97, which causes
      servers to hang under very hard to reproduce conditions involving
      reset races. Using two sources for the same information is the root
      cause. Before this fix, bumping v in this function did not mean bumping
      the vf pointer, yet the code used these variables interchangeably, so a
      stale vf could point to a different, unintended VF.
      
      Remove redundant "v" variable and iterate via single VF pointer across
      whole function instead to guarantee VF pointer validity.
      
      Fixes: 52424f97 ("i40e: Fix VF hang when reset is triggered on another VF")
      Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
      Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
      Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • i40e: fix i40e_count_filters() to count only active/new filters · 8db472e1
      Aleksandr Loktionov authored
      commit eb58c598 upstream.
      
      The bug usually affects untrusted VFs: because they are limited to 18
      MACs, it hits them hard, preventing them from creating all of their MAC
      filters. It is not reliably reproducible; it happens when the VF user
      creates MAC filters while other MACVLAN operations run in parallel.
      The consequence is that the VF can't receive the desired traffic.
      
      Fix counter to be bumped only for new or active filters.
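
A simplified sketch of counting only new or active filters. The state names and the array-based storage are assumptions made for illustration; the driver's real state list and data structure may differ.

```c
#include <stddef.h>

/* Hypothetical filter states; the driver's real state list may differ. */
enum demo_filter_state {
    DEMO_FILTER_NEW,
    DEMO_FILTER_ACTIVE,
    DEMO_FILTER_FAILED,
    DEMO_FILTER_REMOVE,
};

/* Counts only filters that are in use or about to be: new or active. */
static int demo_count_filters(const enum demo_filter_state *filters, size_t n)
{
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        if (filters[i] == DEMO_FILTER_NEW || filters[i] == DEMO_FILTER_ACTIVE)
            count++;
    }
    return count;
}
```

Counting entries that are failed or pending removal would inflate the total and could push a VF over its MAC limit even though fewer filters are really in use.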
      
      Fixes: 621650ca ("i40e: Refactoring VF MAC filters counting to make more reliable")
      Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
      Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
      Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
      Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • octeontx2-pf: check negative error code in otx2_open() · f53bea1c
      Su Hui authored
      commit e709acbd upstream.
      
      otx2_rxtx_enable() returns a negative error code such as -EIO, so
      check for -EIO rather than EIO to fix this problem.
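
The sign convention can be sketched as follows. `demo_rxtx_enable()` and `demo_open_needs_cleanup()` are hypothetical helpers, not the driver functions; they only show why the comparison must use the negated constant.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper following the kernel convention of returning
 * 0 on success and a negative errno value on failure. */
static int demo_rxtx_enable(bool hw_ok)
{
    return hw_ok ? 0 : -EIO;
}

/* The comparison has to use -EIO; comparing against the positive EIO
 * can never match a value produced by demo_rxtx_enable(). */
static bool demo_open_needs_cleanup(bool hw_ok)
{
    int err = demo_rxtx_enable(hw_ok);

    return err == -EIO;
}
```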
      
      Fixes: c9262522 ("octeontx2-pf: Disable packet I/O for graceful exit")
      Signed-off-by: Su Hui <suhui@nfschina.com>
      Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
      Link: https://lore.kernel.org/r/20240328020620.4054692-1-suhui@nfschina.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • octeontx2-af: Fix issue with loading coalesced KPU profiles · ec694ca1
      Hariprasad Kelam authored
      commit 0ba80d96 upstream.
      
      The current implementation for loading coalesced KPU profiles has
      a limitation.  The "offset" field, which is used to locate profiles
      within the profile is restricted to a u16.
      
      This restricts the number of profiles that can be loaded. This patch
      addresses this limitation by increasing the size of the "offset" field.
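
The effect of a 16-bit offset field can be shown with a small sketch. The helper names are hypothetical, and the 32-bit width of the wider variant is an assumption; the commit message only says the field was enlarged.

```c
#include <stdint.h>

/* Storing an offset in a 16-bit field silently drops the upper bits. */
static uint16_t demo_store_offset_u16(uint32_t offset)
{
    return (uint16_t)offset;
}

/* A wider field (32 bits here, as an assumption) keeps the full value. */
static uint32_t demo_store_offset_u32(uint32_t offset)
{
    return offset;
}
```

Any offset above 65535 wraps in the 16-bit variant, so a lookup would land at the wrong place in the profile data.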
      
      Fixes: 11c730bf ("octeontx2-af: support for coalescing KPU profiles")
      Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
      Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • udp: prevent local UDP tunnel packets from being GROed · 73a328df
      Antoine Tenart authored
      commit 64235eab upstream.
      
      GRO has a fundamental issue with UDP tunnel packets as it can't detect
      those in a foolproof way and GRO could happen before they reach the
      tunnel endpoint. Previous commits have fixed issues when UDP tunnel
      packets come from a remote host, but if those packets are issued locally
      they could run into checksum issues.
      
      If the inner packet has a partial checksum the information will be lost
      in the GRO logic, either in udp4/6_gro_complete or in
      udp_gro_complete_segment and packets will have an invalid checksum when
      leaving the host.
      
      Prevent local UDP tunnel packets from ever being GROed at the outer UDP
      level.
      
      Due to skb->encapsulation being wrongly used in some drivers this is
      actually only preventing UDP tunnel packets with a partial checksum to
      be GROed (see iptunnel_handle_offloads) but those were also the packets
      triggering issues so in practice this should be sufficient.
      
      Fixes: 9fd1ff5d ("udp: Support UDP fraglist GRO/GSO.")
      Fixes: 36707061 ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets")
      Suggested-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • udp: do not transition UDP GRO fraglist partial checksums to unnecessary · 7223f4ee
      Antoine Tenart authored
      commit f0b8c303 upstream.
      
      UDP GRO validates checksums and in udp4/6_gro_complete fraglist packets
      are converted to CHECKSUM_UNNECESSARY to avoid later checks. However
      this is an issue for CHECKSUM_PARTIAL packets as they can be looped in
      an egress path and then their partial checksums are not fixed.
      
      Different issues can be observed, from invalid checksum on packets to
      traces like:
      
        gen01: hw csum failure
        skb len=3008 headroom=160 headlen=1376 tailroom=0
        mac=(106,14) net=(120,40) trans=160
        shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
        csum(0xffff232e ip_summed=2 complete_sw=0 valid=0 level=0)
        hash(0x77e3d716 sw=1 l4=1) proto=0x86dd pkttype=0 iif=12
        ...
      
      Fix this by only converting CHECKSUM_NONE packets to
      CHECKSUM_UNNECESSARY by reusing __skb_incr_checksum_unnecessary. All
      other checksum types are kept as-is, including CHECKSUM_COMPLETE as
      fraglist packets being segmented back would have their skb->csum valid.
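
The resulting rule can be sketched as a pure function over the checksum state. The enum and function names are hypothetical stand-ins for the kernel's CHECKSUM_* values and helper; only the decision itself is modelled.

```c
/* Hypothetical stand-ins for the kernel's CHECKSUM_* states. */
enum demo_csum {
    DEMO_CHECKSUM_NONE,
    DEMO_CHECKSUM_UNNECESSARY,
    DEMO_CHECKSUM_COMPLETE,
    DEMO_CHECKSUM_PARTIAL,
};

/* Only a packet with no checksum state is promoted to "unnecessary";
 * partial and complete checksums are left exactly as they were. */
static enum demo_csum demo_gro_complete_csum(enum demo_csum state)
{
    if (state == DEMO_CHECKSUM_NONE)
        return DEMO_CHECKSUM_UNNECESSARY;
    return state;
}
```

Leaving the partial state untouched is what lets a later egress path still fill in the checksum.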
      
      Fixes: 9fd1ff5d ("udp: Support UDP fraglist GRO/GSO.")
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • udp: do not accept non-tunnel GSO skbs landing in a tunnel · d49ae15a
      Antoine Tenart authored
      commit 3d010c80 upstream.
      
      When rx-udp-gro-forwarding is enabled UDP packets might be GROed when
      being forwarded. If such packets might land in a tunnel this can cause
      various issues and udp_gro_receive makes sure this isn't the case by
      looking for a matching socket. This is performed in
      udp4/6_gro_lookup_skb but only in the current netns. This is an issue
      with tunneled packets when the endpoint is in another netns. In such
      cases the packets will be GROed at the UDP level, which leads to various
      issues later on. The same thing can happen with rx-gro-list.
      
      We saw this with geneve packets being GROed at the UDP level. In such
      case gso_size is set; later the packet goes through the geneve rx path,
      the geneve header is pulled, the offsets are adjusted and frag_list skbs
      are not adjusted with regard to geneve. When those skbs hit
      skb_fragment, it will misbehave. Different outcomes are possible
      depending on what the GROed skbs look like; from corrupted packets to
      kernel crashes.
      
      One example is a BUG_ON[1] triggered in skb_segment while processing the
      frag_list. Because gso_size is wrong (geneve header was pulled)
      skb_segment thinks there is "geneve header size" of data in frag_list,
      although it's in fact the next packet. The BUG_ON itself has nothing to
      do with the issue. This is only one of the potential issues.
      
      Looking up a matching socket in udp_gro_receive is fragile: the
      lookup could be extended to all netns (leaving performance aside)
      but nothing prevents those packets from being modified in between and we
      could still not find a matching socket. It's OK to keep the current
      logic there as it should cover most cases but we also need to make sure
      we handle tunnel packets being GROed too early.
      
      This is done by extending the checks in udp_unexpected_gso: GSO packets
      lacking the SKB_GSO_UDP_TUNNEL/_CSUM bits and landing in a tunnel must
      be segmented.
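      
      The rule above can be sketched as a small userspace model. This is not
      the kernel code: the struct names, the flag values and the is_tunnel
      field are assumptions made for illustration, and the pre-existing
      checks in udp_unexpected_gso are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the SKB_GSO_* bits; the values are illustrative. */
enum {
	GSO_UDP_L4          = 1 << 0,
	GSO_FRAGLIST        = 1 << 1,
	GSO_UDP_TUNNEL      = 1 << 2,
	GSO_UDP_TUNNEL_CSUM = 1 << 3,
};

/* Hypothetical, reduced view of a received packet. */
struct pkt {
	bool is_gso;
	unsigned int gso_type;
};

/* Hypothetical, reduced view of the receiving socket. */
struct sock_model {
	bool is_tunnel;	/* socket is a UDP tunnel endpoint */
};

/* Returns true when the packet has to be segmented before delivery. */
static bool unexpected_gso(const struct sock_model *sk, const struct pkt *p)
{
	if (!p->is_gso)
		return false;

	/* GSO packet without the tunnel bits landing in a tunnel socket. */
	if (sk->is_tunnel &&
	    !(p->gso_type & (GSO_UDP_TUNNEL | GSO_UDP_TUNNEL_CSUM)))
		return true;

	return false;
}
```

      In this model a packet aggregated at the plain UDP level (only
      GSO_UDP_L4 or GSO_FRAGLIST set) is flagged as soon as it reaches a
      tunnel socket, whatever netns the lookup in udp_gro_receive ran in.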
      
      [1] kernel BUG at net/core/skbuff.c:4408!
          RIP: 0010:skb_segment+0xd2a/0xf70
          __udp_gso_segment+0xaa/0x560
      
      Fixes: 9fd1ff5d ("udp: Support UDP fraglist GRO/GSO.")
      Fixes: 36707061 ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets")
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d49ae15a
    • David Thompson's avatar
      mlxbf_gige: stop interface during shutdown · 63a10b53
      David Thompson authored
      commit 09ba28e1 upstream.
      
      The mlxbf_gige driver intermittently encounters a NULL pointer
      exception while the system is shutting down via the "reboot" command.
      The mlxbf_gige driver will experience an exception right after executing
      its shutdown() method.  One example of this exception is:
      
      Unable to handle kernel NULL pointer dereference at virtual address 0000000000000070
      Mem abort info:
        ESR = 0x0000000096000004
        EC = 0x25: DABT (current EL), IL = 32 bits
        SET = 0, FnV = 0
        EA = 0, S1PTW = 0
        FSC = 0x04: level 0 translation fault
      Data abort info:
        ISV = 0, ISS = 0x00000004
        CM = 0, WnR = 0
      user pgtable: 4k pages, 48-bit VAs, pgdp=000000011d373000
      [0000000000000070] pgd=0000000000000000, p4d=0000000000000000
      Internal error: Oops: 96000004 [#1] SMP
      CPU: 0 PID: 13 Comm: ksoftirqd/0 Tainted: G S         OE     5.15.0-bf.6.gef6992a #1
      Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS 4.0.2.12669 Apr 21 2023
      pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      pc : mlxbf_gige_handle_tx_complete+0xc8/0x170 [mlxbf_gige]
      lr : mlxbf_gige_poll+0x54/0x160 [mlxbf_gige]
      sp : ffff8000080d3c10
      x29: ffff8000080d3c10 x28: ffffcce72cbb7000 x27: ffff8000080d3d58
      x26: ffff0000814e7340 x25: ffff331cd1a05000 x24: ffffcce72c4ea008
      x23: ffff0000814e4b40 x22: ffff0000814e4d10 x21: ffff0000814e4128
      x20: 0000000000000000 x19: ffff0000814e4a80 x18: ffffffffffffffff
      x17: 000000000000001c x16: ffffcce72b4553f4 x15: ffff80008805b8a7
      x14: 0000000000000000 x13: 0000000000000030 x12: 0101010101010101
      x11: 7f7f7f7f7f7f7f7f x10: c2ac898b17576267 x9 : ffffcce720fa5404
      x8 : ffff000080812138 x7 : 0000000000002e9a x6 : 0000000000000080
      x5 : ffff00008de3b000 x4 : 0000000000000000 x3 : 0000000000000001
      x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
      Call trace:
       mlxbf_gige_handle_tx_complete+0xc8/0x170 [mlxbf_gige]
       mlxbf_gige_poll+0x54/0x160 [mlxbf_gige]
       __napi_poll+0x40/0x1c8
       net_rx_action+0x314/0x3a0
       __do_softirq+0x128/0x334
       run_ksoftirqd+0x54/0x6c
       smpboot_thread_fn+0x14c/0x190
       kthread+0x10c/0x110
       ret_from_fork+0x10/0x20
      Code: 8b070000 f9000ea0 f95056c0 f86178a1 (b9407002)
      ---[ end trace 7cc3941aa0d8e6a4 ]---
      Kernel panic - not syncing: Oops: Fatal exception in interrupt
      Kernel Offset: 0x4ce722520000 from 0xffff800008000000
      PHYS_OFFSET: 0x80000000
      CPU features: 0x000005c1,a3330e5a
      Memory Limit: none
      ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
      
      During system shutdown, the mlxbf_gige driver's shutdown() is always executed.
      However, the driver's stop() method will only execute if networking interface
      configuration logic within the Linux distribution has been set up to do so.
      
      If shutdown() executes but stop() does not execute, NAPI remains enabled
      and this can lead to an exception if NAPI is scheduled while the hardware
      interface has only been partially deinitialized.
      
      The networking interface managed by the mlxbf_gige driver must be properly
      stopped during system shutdown so that IFF_UP is cleared, the hardware
      interface is put into a clean state, and NAPI is fully deinitialized.
      
      Fixes: f92e1869 ("Add Mellanox BlueField Gigabit Ethernet driver")
      Signed-off-by: David Thompson <davthompson@nvidia.com>
      Link: https://lore.kernel.org/r/20240325210929.25362-1-davthompson@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      63a10b53
    • Kuniyuki Iwashima's avatar
      ipv6: Fix infinite recursion in fib6_dump_done(). · 40a344b2
      Kuniyuki Iwashima authored
      commit d21d4060 upstream.
      
      syzkaller reported infinite recursive calls of fib6_dump_done() during
      netlink socket destruction.  [1]
      
      From the log, syzkaller sent an AF_UNSPEC RTM_GETROUTE message, and then
      the response was generated.  The following recvmmsg() resumed the dump
      for IPv6, but the first call of inet6_dump_fib() failed at kzalloc() due
      to the fault injection.  [0]
      
        12:01:34 executing program 3:
        r0 = socket$nl_route(0x10, 0x3, 0x0)
        sendmsg$nl_route(r0, ... snip ...)
        recvmmsg(r0, ... snip ...) (fail_nth: 8)
      
      Here, fib6_dump_done() was set to nlk_sk(sk)->cb.done, and the next call
      of inet6_dump_fib() set it to nlk_sk(sk)->cb.args[3].  syzkaller stopped
      receiving the response halfway through, and finally netlink_sock_destruct()
      called nlk_sk(sk)->cb.done().
      
      fib6_dump_done() calls fib6_dump_end() and nlk_sk(sk)->cb.done() if it
      is still not NULL.  fib6_dump_end() rewrites nlk_sk(sk)->cb.done() by
      nlk_sk(sk)->cb.args[3], but it has the same function, not NULL, calling
      itself recursively and hitting the stack guard page.
      
      To avoid the issue, let's set the destructor after kzalloc().
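      
      The ordering problem can be reproduced in a reduced userspace model.
      Everything below is an assumption made for illustration: struct
      dump_cb, saved_done (playing the role of cb.args[3]), has_walker and
      the hook_first switch are hypothetical, and dump_done() reports the
      self-reference instead of recursing so that the model stays safe to run.

```c
#include <stdbool.h>
#include <stddef.h>

struct dump_cb;
typedef int (*done_fn)(struct dump_cb *);

/* Hypothetical, reduced model of a netlink dump callback. */
struct dump_cb {
	done_fn done;		/* destructor run at socket destruction */
	done_fn saved_done;	/* plays the role of cb.args[3] */
	bool has_walker;	/* plays the role of the kzalloc()ed walker */
};

static int dump_done(struct dump_cb *cb)
{
	/* Restore the saved destructor, as fib6_dump_end() does. */
	cb->done = cb->saved_done;

	/* Model-only guard: report the self-reference instead of recursing. */
	if (cb->done == dump_done)
		return -1;

	return cb->done ? cb->done(cb) : 0;
}

/*
 * One dump attempt. hook_first selects the old ordering (hook the
 * destructor, then allocate); alloc_ok models the kzalloc() result.
 */
static int dump_start(struct dump_cb *cb, bool hook_first, bool alloc_ok)
{
	if (cb->has_walker)
		return 0;	/* dump already set up */

	if (hook_first) {
		cb->saved_done = cb->done;
		cb->done = dump_done;
	}

	if (!alloc_ok)
		return -1;	/* -ENOMEM in the kernel */

	if (!hook_first) {
		cb->saved_done = cb->done;
		cb->done = dump_done;
	}

	cb->has_walker = true;
	return 0;
}
```

      With hook_first set, a failed attempt followed by a second attempt
      leaves saved_done pointing at dump_done itself. With the hook moved
      after the allocation, a failed attempt leaves the callback untouched.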
      
      [0]:
      FAULT_INJECTION: forcing a failure.
      name failslab, interval 1, probability 0, space 0, times 0
      CPU: 1 PID: 432110 Comm: syz-executor.3 Not tainted 6.8.0-12821-g537c2e91d354-dirty #11
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      Call Trace:
       <TASK>
       dump_stack_lvl (lib/dump_stack.c:117)
       should_fail_ex (lib/fault-inject.c:52 lib/fault-inject.c:153)
       should_failslab (mm/slub.c:3733)
       kmalloc_trace (mm/slub.c:3748 mm/slub.c:3827 mm/slub.c:3992)
       inet6_dump_fib (./include/linux/slab.h:628 ./include/linux/slab.h:749 net/ipv6/ip6_fib.c:662)
       rtnl_dump_all (net/core/rtnetlink.c:4029)
       netlink_dump (net/netlink/af_netlink.c:2269)
       netlink_recvmsg (net/netlink/af_netlink.c:1988)
       ____sys_recvmsg (net/socket.c:1046 net/socket.c:2801)
       ___sys_recvmsg (net/socket.c:2846)
       do_recvmmsg (net/socket.c:2943)
       __x64_sys_recvmmsg (net/socket.c:3041 net/socket.c:3034 net/socket.c:3034)
      
      [1]:
      BUG: TASK stack guard page was hit at 00000000f2fa9af1 (stack is 00000000b7912430..000000009a436beb)
      stack guard page: 0000 [#1] PREEMPT SMP KASAN
      CPU: 1 PID: 223719 Comm: kworker/1:3 Not tainted 6.8.0-12821-g537c2e91d354-dirty #11
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      Workqueue: events netlink_sock_destruct_work
      RIP: 0010:fib6_dump_done (net/ipv6/ip6_fib.c:570)
      Code: 3c 24 e8 f3 e9 51 fd e9 28 fd ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 41 57 41 56 41 55 41 54 55 48 89 fd <53> 48 8d 5d 60 e8 b6 4d 07 fd 48 89 da 48 b8 00 00 00 00 00 fc ff
      RSP: 0018:ffffc9000d980000 EFLAGS: 00010293
      RAX: 0000000000000000 RBX: ffffffff84405990 RCX: ffffffff844059d3
      RDX: ffff8881028e0000 RSI: ffffffff84405ac2 RDI: ffff88810c02f358
      RBP: ffff88810c02f358 R08: 0000000000000007 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000224 R12: 0000000000000000
      R13: ffff888007c82c78 R14: ffff888007c82c68 R15: ffff888007c82c68
      FS:  0000000000000000(0000) GS:ffff88811b100000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffc9000d97fff8 CR3: 0000000102309002 CR4: 0000000000770ef0
      PKRU: 55555554
      Call Trace:
       <#DF>
       </#DF>
       <TASK>
       fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
       fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
       ...
       fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
       fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
       netlink_sock_destruct (net/netlink/af_netlink.c:401)
       __sk_destruct (net/core/sock.c:2177 (discriminator 2))
       sk_destruct (net/core/sock.c:2224)
       __sk_free (net/core/sock.c:2235)
       sk_free (net/core/sock.c:2246)
       process_one_work (kernel/workqueue.c:3259)
       worker_thread (kernel/workqueue.c:3329 kernel/workqueue.c:3416)
       kthread (kernel/kthread.c:388)
       ret_from_fork (arch/x86/kernel/process.c:153)
       ret_from_fork_asm (arch/x86/entry/entry_64.S:256)
      Modules linked in:
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzkaller <syzkaller@googlegroups.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20240401211003.25274-1-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      40a344b2
    • Jakub Kicinski's avatar
      selftests: reuseaddr_conflict: add missing new line at the end of the output · 61f5b43b
      Jakub Kicinski authored
      commit 31974122 upstream.
      
      The netdev CI runs in a VM and captures serial, so stdout and
      stderr get combined. Because there's a missing new line in
      stderr the test ends up corrupting KTAP:
      
        # Successok 1 selftests: net: reuseaddr_conflict
      
      which should have been:
      
        # Success
        ok 1 selftests: net: reuseaddr_conflict
      
      Fixes: 422d8dc6 ("selftest: add a reuseaddr test")
      Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
      Link: https://lore.kernel.org/r/20240329160559.249476-1-kuba@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      61f5b43b
    • Eric Dumazet's avatar
      erspan: make sure erspan_base_hdr is present in skb->head · ee008810
      Eric Dumazet authored
      commit 17af4205 upstream.
      
      syzbot reported a problem in ip6erspan_rcv() [1]
      
      Issue is that ip6erspan_rcv() (and erspan_rcv()) no longer make
      sure erspan_base_hdr is present in skb linear part (skb->head)
      before getting @ver field from it.
      
      Add the missing pskb_may_pull() calls.
      
      v2: Reload iph pointer in erspan_rcv() after pskb_may_pull()
          because skb->head might have changed.
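      
      Why the pointer has to be reloaded can be shown with a userspace
      model of a buffer split into a linear head and one fragment. The
      struct pkt_buf type and the may_pull() and read_ver() helpers are
      hypothetical and only mimic the idea behind pskb_may_pull(); they are
      not the kernel implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet model: a linear head plus one non-linear fragment. */
struct pkt_buf {
	unsigned char *head;		/* linear part, heap-allocated */
	size_t linear_len;
	const unsigned char *frag;	/* remaining non-linear data */
	size_t frag_len;
};

/*
 * Make sure the first len bytes are linear. Returns 1 on success and 0
 * otherwise. On success b->head may point to a new allocation.
 */
static int may_pull(struct pkt_buf *b, size_t len)
{
	size_t need;
	unsigned char *nh;

	if (len <= b->linear_len)
		return 1;

	need = len - b->linear_len;
	if (need > b->frag_len)
		return 0;

	nh = malloc(len);
	if (!nh)
		return 0;

	memcpy(nh, b->head, b->linear_len);
	memcpy(nh + b->linear_len, b->frag, need);
	free(b->head);

	b->head = nh;
	b->linear_len = len;
	b->frag += need;
	b->frag_len -= need;
	return 1;
}

/* Read the byte at off only once it is known to sit in the linear part. */
static int read_ver(struct pkt_buf *b, size_t off)
{
	if (!may_pull(b, off + 1))
		return -1;

	/* b->head is read after the pull, never cached from before it. */
	return b->head[off];
}
```

      A pointer into the old head taken before may_pull() would dangle once
      the head has been reallocated, which is the reason the v2 note reloads
      iph.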
      
      [1]
      
       BUG: KMSAN: uninit-value in pskb_may_pull_reason include/linux/skbuff.h:2742 [inline]
       BUG: KMSAN: uninit-value in pskb_may_pull include/linux/skbuff.h:2756 [inline]
       BUG: KMSAN: uninit-value in ip6erspan_rcv net/ipv6/ip6_gre.c:541 [inline]
       BUG: KMSAN: uninit-value in gre_rcv+0x11f8/0x1930 net/ipv6/ip6_gre.c:610
        pskb_may_pull_reason include/linux/skbuff.h:2742 [inline]
        pskb_may_pull include/linux/skbuff.h:2756 [inline]
        ip6erspan_rcv net/ipv6/ip6_gre.c:541 [inline]
        gre_rcv+0x11f8/0x1930 net/ipv6/ip6_gre.c:610
        ip6_protocol_deliver_rcu+0x1d4c/0x2ca0 net/ipv6/ip6_input.c:438
        ip6_input_finish net/ipv6/ip6_input.c:483 [inline]
        NF_HOOK include/linux/netfilter.h:314 [inline]
        ip6_input+0x15d/0x430 net/ipv6/ip6_input.c:492
        ip6_mc_input+0xa7e/0xc80 net/ipv6/ip6_input.c:586
        dst_input include/net/dst.h:460 [inline]
        ip6_rcv_finish+0x955/0x970 net/ipv6/ip6_input.c:79
        NF_HOOK include/linux/netfilter.h:314 [inline]
        ipv6_rcv+0xde/0x390 net/ipv6/ip6_input.c:310
        __netif_receive_skb_one_core net/core/dev.c:5538 [inline]
        __netif_receive_skb+0x1da/0xa00 net/core/dev.c:5652
        netif_receive_skb_internal net/core/dev.c:5738 [inline]
        netif_receive_skb+0x58/0x660 net/core/dev.c:5798
        tun_rx_batched+0x3ee/0x980 drivers/net/tun.c:1549
        tun_get_user+0x5566/0x69e0 drivers/net/tun.c:2002
        tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048
        call_write_iter include/linux/fs.h:2108 [inline]
        new_sync_write fs/read_write.c:497 [inline]
        vfs_write+0xb63/0x1520 fs/read_write.c:590
        ksys_write+0x20f/0x4c0 fs/read_write.c:643
        __do_sys_write fs/read_write.c:655 [inline]
        __se_sys_write fs/read_write.c:652 [inline]
        __x64_sys_write+0x93/0xe0 fs/read_write.c:652
       do_syscall_64+0xd5/0x1f0
       entry_SYSCALL_64_after_hwframe+0x6d/0x75
      
      Uninit was created at:
        slab_post_alloc_hook mm/slub.c:3804 [inline]
        slab_alloc_node mm/slub.c:3845 [inline]
        kmem_cache_alloc_node+0x613/0xc50 mm/slub.c:3888
        kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:577
        __alloc_skb+0x35b/0x7a0 net/core/skbuff.c:668
        alloc_skb include/linux/skbuff.h:1318 [inline]
        alloc_skb_with_frags+0xc8/0xbf0 net/core/skbuff.c:6504
        sock_alloc_send_pskb+0xa81/0xbf0 net/core/sock.c:2795
        tun_alloc_skb drivers/net/tun.c:1525 [inline]
        tun_get_user+0x209a/0x69e0 drivers/net/tun.c:1846
        tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048
        call_write_iter include/linux/fs.h:2108 [inline]
        new_sync_write fs/read_write.c:497 [inline]
        vfs_write+0xb63/0x1520 fs/read_write.c:590
        ksys_write+0x20f/0x4c0 fs/read_write.c:643
        __do_sys_write fs/read_write.c:655 [inline]
        __se_sys_write fs/read_write.c:652 [inline]
        __x64_sys_write+0x93/0xe0 fs/read_write.c:652
       do_syscall_64+0xd5/0x1f0
       entry_SYSCALL_64_after_hwframe+0x6d/0x75
      
      CPU: 1 PID: 5045 Comm: syz-executor114 Not tainted 6.9.0-rc1-syzkaller-00021-g962490525cff #0
      
      Fixes: cb73ee40 ("net: ip_gre: use erspan key field for tunnel lookup")
      Reported-by: <syzbot+1c1cf138518bf0c53d68@syzkaller.appspotmail.com>
      Closes: https://lore.kernel.org/netdev/000000000000772f2c0614b66ef7@google.com/
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Lorenzo Bianconi <lorenzo@kernel.org>
      Link: https://lore.kernel.org/r/20240328112248.1101491-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ee008810
    • Antoine Tenart's avatar
      selftests: net: gro fwd: update vxlan GRO test expectations · 3f9a8b79
      Antoine Tenart authored
      commit 0fb101be upstream.
      
      UDP tunnel packets can't be GROed in between their endpoints as this
      causes different issues. The UDP GRO fwd vxlan tests were relying on
      this and their expectations have to be fixed.
      
      We keep both vxlan tests and now expect no GRO to happen. The vxlan
      UDP GRO bench test was removed as it's not providing any valuable
      information now.
      
      Fixes: a062260a ("selftests: net: add UDP GRO forwarding self-tests")
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3f9a8b79
    • Piotr Wejman's avatar
      net: stmmac: fix rx queue priority assignment · 457c832a
      Piotr Wejman authored
      commit b3da86d4 upstream.
      
      The driver should ensure that the same priority is not mapped to multiple
      rx queues. From DesignWare Cores Ethernet Quality-of-Service
      Databook, section 17.1.29 MAC_RxQ_Ctrl2:
      "[...]The software must ensure that the content of this field is
      mutually exclusive to the PSRQ fields for other queues, that is,
      the same priority is not mapped to multiple Rx queues[...]"
      
      Previously, the rx_queue_priority() function was:
      - clearing all priorities from a queue
      - adding new priorities to that queue
      After this patch it will:
      - first assign new priorities to a queue
      - then remove those priorities from all other queues
      - keep other priorities previously assigned to that queue
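      
      The three steps listed for the patched behaviour can be sketched on a
      plain array. This is a model only: NUM_RX_QUEUES, the psrq[] array
      (one 8-bit priority mask per queue, standing in for the PSRQ register
      fields) and this userspace rx_queue_priority() are assumptions, not
      the driver's register accessors.

```c
#include <stdint.h>

#define NUM_RX_QUEUES 4	/* illustrative; the real count depends on the hardware */

/* One 8-bit priority mask per RX queue, standing in for the PSRQ fields. */
static uint8_t psrq[NUM_RX_QUEUES];

static void rx_queue_priority(unsigned int queue, uint8_t prio)
{
	unsigned int q;

	if (queue >= NUM_RX_QUEUES)
		return;

	/* Assign the new priorities, keeping those the queue already had. */
	psrq[queue] |= prio;

	/* Remove the same priorities from every other queue. */
	for (q = 0; q < NUM_RX_QUEUES; q++)
		if (q != queue)
			psrq[q] &= (uint8_t)~prio;
}
```

      After any sequence of calls, no priority bit is set in more than one
      entry of psrq[], which is the mutual exclusion the databook asks for.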
      
      Fixes: a8f5102a ("net: stmmac: TX and RX queue priority configuration")
      Fixes: 2142754f ("net: stmmac: Add MAC related callbacks for XGMAC2")
      Signed-off-by: Piotr Wejman <piotrwejman90@gmail.com>
      Link: https://lore.kernel.org/r/20240401192239.33942-1-piotrwejman90@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      457c832a
    • Eric Dumazet's avatar
      net/sched: act_skbmod: prevent kernel-infoleak · a097fc19
      Eric Dumazet authored
      commit d313eb8b upstream.
      
      syzbot found that tcf_skbmod_dump() was copying four bytes
      from kernel stack to user space [1].
      
      The issue here is that 'struct tc_skbmod' has a four-byte hole.
      
      We need to clear the structure before filling fields.
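      
      A userspace sketch of the same pattern follows. The struct opt_model
      layout is an assumption chosen to produce a hole (five 32-bit fields
      followed by a 64-bit one, on an ABI that aligns uint64_t to 8 bytes);
      it is not a copy of the kernel definition, and fill_opt() and
      hole_is_clear() are hypothetical helpers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical layout: on ABIs that align uint64_t to 8 bytes, the
 * compiler leaves a 4-byte hole between bindcnt and flags.
 */
struct opt_model {
	uint32_t index;
	uint32_t capab;
	int32_t action;
	int32_t refcnt;
	int32_t bindcnt;
	uint64_t flags;
};

/* Fill out for copying elsewhere; the memset() also zeroes any padding. */
static void fill_opt(struct opt_model *out, uint32_t index, uint64_t flags)
{
	memset(out, 0, sizeof(*out));
	out->index = index;
	out->flags = flags;
}

/* Returns 1 if every padding byte between bindcnt and flags is zero. */
static int hole_is_clear(const struct opt_model *o)
{
	const unsigned char *p = (const unsigned char *)o;
	size_t i = offsetof(struct opt_model, bindcnt) + sizeof(o->bindcnt);

	for (; i < offsetof(struct opt_model, flags); i++)
		if (p[i] != 0)
			return 0;

	return 1;
}
```

      Assigning the fields one by one without the memset() would leave the
      hole holding whatever was on the stack before, and copying
      sizeof(struct opt_model) bytes out would then expose those bytes.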
      
      [1]
      BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline]
       BUG: KMSAN: kernel-infoleak in copy_to_user_iter lib/iov_iter.c:24 [inline]
       BUG: KMSAN: kernel-infoleak in iterate_ubuf include/linux/iov_iter.h:29 [inline]
       BUG: KMSAN: kernel-infoleak in iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
       BUG: KMSAN: kernel-infoleak in iterate_and_advance include/linux/iov_iter.h:271 [inline]
       BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x366/0x2520 lib/iov_iter.c:185
        instrument_copy_to_user include/linux/instrumented.h:114 [inline]
        copy_to_user_iter lib/iov_iter.c:24 [inline]
        iterate_ubuf include/linux/iov_iter.h:29 [inline]
        iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
        iterate_and_advance include/linux/iov_iter.h:271 [inline]
        _copy_to_iter+0x366/0x2520 lib/iov_iter.c:185
        copy_to_iter include/linux/uio.h:196 [inline]
        simple_copy_to_iter net/core/datagram.c:532 [inline]
        __skb_datagram_iter+0x185/0x1000 net/core/datagram.c:420
        skb_copy_datagram_iter+0x5c/0x200 net/core/datagram.c:546
        skb_copy_datagram_msg include/linux/skbuff.h:4050 [inline]
        netlink_recvmsg+0x432/0x1610 net/netlink/af_netlink.c:1962
        sock_recvmsg_nosec net/socket.c:1046 [inline]
        sock_recvmsg+0x2c4/0x340 net/socket.c:1068
        __sys_recvfrom+0x35a/0x5f0 net/socket.c:2242
        __do_sys_recvfrom net/socket.c:2260 [inline]
        __se_sys_recvfrom net/socket.c:2256 [inline]
        __x64_sys_recvfrom+0x126/0x1d0 net/socket.c:2256
       do_syscall_64+0xd5/0x1f0
       entry_SYSCALL_64_after_hwframe+0x6d/0x75
      
      Uninit was stored to memory at:
        pskb_expand_head+0x30f/0x19d0 net/core/skbuff.c:2253
        netlink_trim+0x2c2/0x330 net/netlink/af_netlink.c:1317
        netlink_unicast+0x9f/0x1260 net/netlink/af_netlink.c:1351
        nlmsg_unicast include/net/netlink.h:1144 [inline]
        nlmsg_notify+0x21d/0x2f0 net/netlink/af_netlink.c:2610
        rtnetlink_send+0x73/0x90 net/core/rtnetlink.c:741
        rtnetlink_maybe_send include/linux/rtnetlink.h:17 [inline]
        tcf_add_notify net/sched/act_api.c:2048 [inline]
        tcf_action_add net/sched/act_api.c:2071 [inline]
        tc_ctl_action+0x146e/0x19d0 net/sched/act_api.c:2119
        rtnetlink_rcv_msg+0x1737/0x1900 net/core/rtnetlink.c:6595
        netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2559
        rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6613
        netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
        netlink_unicast+0xf4c/0x1260 net/netlink/af_netlink.c:1361
        netlink_sendmsg+0x10df/0x11f0 net/netlink/af_netlink.c:1905
        sock_sendmsg_nosec net/socket.c:730 [inline]
        __sock_sendmsg+0x30f/0x380 net/socket.c:745
        ____sys_sendmsg+0x877/0xb60 net/socket.c:2584
        ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638
        __sys_sendmsg net/socket.c:2667 [inline]
        __do_sys_sendmsg net/socket.c:2676 [inline]
        __se_sys_sendmsg net/socket.c:2674 [inline]
        __x64_sys_sendmsg+0x307/0x4a0 net/socket.c:2674
       do_syscall_64+0xd5/0x1f0
       entry_SYSCALL_64_after_hwframe+0x6d/0x75
      
      Uninit was stored to memory at:
        __nla_put lib/nlattr.c:1041 [inline]
        nla_put+0x1c6/0x230 lib/nlattr.c:1099
        tcf_skbmod_dump+0x23f/0xc20 net/sched/act_skbmod.c:256
        tcf_action_dump_old net/sched/act_api.c:1191 [inline]
        tcf_action_dump_1+0x85e/0x970 net/sched/act_api.c:1227
        tcf_action_dump+0x1fd/0x460 net/sched/act_api.c:1251
        tca_get_fill+0x519/0x7a0 net/sched/act_api.c:1628
        tcf_add_notify_msg net/sched/act_api.c:2023 [inline]
        tcf_add_notify net/sched/act_api.c:2042 [inline]
        tcf_action_add net/sched/act_api.c:2071 [inline]
        tc_ctl_action+0x1365/0x19d0 net/sched/act_api.c:2119
        rtnetlink_rcv_msg+0x1737/0x1900 net/core/rtnetlink.c:6595
        netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2559
        rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6613
        netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
        netlink_unicast+0xf4c/0x1260 net/netlink/af_netlink.c:1361
        netlink_sendmsg+0x10df/0x11f0 net/netlink/af_netlink.c:1905
        sock_sendmsg_nosec net/socket.c:730 [inline]
        __sock_sendmsg+0x30f/0x380 net/socket.c:745
        ____sys_sendmsg+0x877/0xb60 net/socket.c:2584
        ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638
        __sys_sendmsg net/socket.c:2667 [inline]
        __do_sys_sendmsg net/socket.c:2676 [inline]
        __se_sys_sendmsg net/socket.c:2674 [inline]
        __x64_sys_sendmsg+0x307/0x4a0 net/socket.c:2674
       do_syscall_64+0xd5/0x1f0
       entry_SYSCALL_64_after_hwframe+0x6d/0x75
      
      Local variable opt created at:
        tcf_skbmod_dump+0x9d/0xc20 net/sched/act_skbmod.c:244
        tcf_action_dump_old net/sched/act_api.c:1191 [inline]
        tcf_action_dump_1+0x85e/0x970 net/sched/act_api.c:1227
      
      Bytes 188-191 of 248 are uninitialized
      Memory access of size 248 starts at ffff888117697680
      Data copied to user address 00007ffe56d855f0
      
      Fixes: 86da71b5 ("net_sched: Introduce skbmod action")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Link: https://lore.kernel.org/r/20240403130908.93421-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a097fc19
    • Jakub Sitnicki's avatar
      bpf, sockmap: Prevent lock inversion deadlock in map delete elem · d1e73fb1
      Jakub Sitnicki authored
      commit ff910599 upstream.
      
      syzkaller started using corpuses where a BPF tracing program deletes
      elements from a sockmap/sockhash map. Because BPF tracing programs can be
      invoked from any interrupt context, locks taken during a map_delete_elem
      operation must be hardirq-safe. Otherwise a deadlock due to lock inversion
      is possible, as reported by lockdep:
      
             CPU0                    CPU1
             ----                    ----
        lock(&htab->buckets[i].lock);
                                     local_irq_disable();
                                     lock(&host->lock);
                                     lock(&htab->buckets[i].lock);
        <Interrupt>
          lock(&host->lock);
      
      Locks in sockmap are hardirq-unsafe by design. We expect elements to be
      deleted from sockmap/sockhash only in task (normal) context with interrupts
      enabled, or in softirq context.
      
      Detect when map_delete_elem operation is invoked from a context which is
      _not_ hardirq-unsafe, that is interrupts are disabled, and bail out with an
      error.
      
      Note that map updates are not affected by this issue. BPF verifier does not
      allow updating sockmap/sockhash from a BPF tracing program today.
      
      Fixes: 604326b4 ("bpf, sockmap: convert to generic sk_msg interface")
      Reported-by: xingwei lee <xrivendell7@gmail.com>
      Reported-by: yue sun <samsun1006219@gmail.com>
      Reported-by: <syzbot+bc922f476bd65abbd466@syzkaller.appspotmail.com>
      Reported-by: <syzbot+d4066896495db380182e@syzkaller.appspotmail.com>
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: <syzbot+d4066896495db380182e@syzkaller.appspotmail.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Closes: https://syzkaller.appspot.com/bug?extid=d4066896495db380182e
      Closes: https://syzkaller.appspot.com/bug?extid=bc922f476bd65abbd466
      Link: https://lore.kernel.org/bpf/20240402104621.1050319-1-jakub@cloudflare.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d1e73fb1
    • Christophe JAILLET's avatar
      vboxsf: Avoid an spurious warning if load_nls_xxx() fails · 465abe8a
      Christophe JAILLET authored
      commit de3f64b7 upstream.
      
      If a load_nls_xxx() function fails a few lines above, 'sbi->bdi_id' is
      still 0. So, in the error handling path, we will call
      ida_simple_remove(..., 0) for an id which is not allocated yet.
      
      In order to prevent a spurious "ida_free called for id=0 which is not
      allocated." message, tweak the error handling path and add a new label.
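      
      The label layout can be illustrated with a generic userspace sketch
      of goto-based unwinding. The counters and the load_nls(), alloc_id(),
      free_id() and setup() helpers are hypothetical stand-ins; this is not
      the vboxsf code, only the shape of an error path that skips releasing
      an id that was never allocated.

```c
#include <stdbool.h>

/* Hypothetical counters so that the unwinding can be observed. */
static int nls_loaded, id_allocated, id_freed_without_alloc;

static bool load_nls(bool ok)
{
	if (ok)
		nls_loaded++;
	return ok;
}

static void unload_nls(void)
{
	if (nls_loaded)
		nls_loaded--;
}

static bool alloc_id(bool ok)
{
	if (ok)
		id_allocated++;
	return ok;
}

static void free_id(void)
{
	if (id_allocated)
		id_allocated--;
	else
		id_freed_without_alloc++;	/* where the warning would fire */
}

/* Each failure jumps to the label that undoes only what was acquired. */
static int setup(bool nls_ok, bool id_ok, bool late_ok)
{
	if (!load_nls(nls_ok))
		goto fail;		/* nothing acquired yet */
	if (!alloc_id(id_ok))
		goto fail_nls;		/* id not allocated: skip free_id() */
	if (!late_ok)
		goto fail_id;
	return 0;

fail_id:
	free_id();
fail_nls:
	unload_nls();
fail:
	return -1;
}
```

      With a single shared label, an early failure would also run
      free_id() and trip the "not allocated" path counted above.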
      
      Fixes: 0fd16957 ("fs: Add VirtualBox guest shared folder (vboxsf) support")
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Link: https://lore.kernel.org/r/d09eaaa4e2e08206c58a1a27ca9b3e81dc168773.1698835730.git.christophe.jaillet@wanadoo.fr
      Reviewed-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      465abe8a
    • Eric Dumazet's avatar
      netfilter: validate user input for expected length · 440e948c
      Eric Dumazet authored
      commit 0c83842d upstream.
      
      I got multiple syzbot reports showing old bugs exposed
      by BPF after commit 20f2505f ("bpf: Try to avoid kzalloc
      in cgroup/{s,g}etsockopt")
      
      setsockopt() @optlen argument should be taken into account
      before copying data.
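      
      A minimal userspace sketch of such a length check is given below.
      struct replace_hdr and copy_hdr() are hypothetical names used for
      illustration; the kernel handlers copy through sockptr helpers rather
      than memcpy().

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size request header a setsockopt() handler expects. */
struct replace_hdr {
	char name[32];
	unsigned int num_entries;
};

/* Copy the header only if the caller supplied at least sizeof(*hdr) bytes. */
static int copy_hdr(struct replace_hdr *hdr, const void *optval, size_t optlen)
{
	if (optlen < sizeof(*hdr))
		return -EINVAL;

	memcpy(hdr, optval, sizeof(*hdr));
	return 0;
}
```

      In the report below, a 96-byte read hits a 1-byte allocation; a check
      of this kind rejects the request before any copy takes place.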
      
       BUG: KASAN: slab-out-of-bounds in copy_from_sockptr_offset include/linux/sockptr.h:49 [inline]
       BUG: KASAN: slab-out-of-bounds in copy_from_sockptr include/linux/sockptr.h:55 [inline]
       BUG: KASAN: slab-out-of-bounds in do_replace net/ipv4/netfilter/ip_tables.c:1111 [inline]
       BUG: KASAN: slab-out-of-bounds in do_ipt_set_ctl+0x902/0x3dd0 net/ipv4/netfilter/ip_tables.c:1627
      Read of size 96 at addr ffff88802cd73da0 by task syz-executor.4/7238
      
      CPU: 1 PID: 7238 Comm: syz-executor.4 Not tainted 6.9.0-rc2-next-20240403-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
      Call Trace:
       <TASK>
        __dump_stack lib/dump_stack.c:88 [inline]
        dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
        print_address_description mm/kasan/report.c:377 [inline]
        print_report+0x169/0x550 mm/kasan/report.c:488
        kasan_report+0x143/0x180 mm/kasan/report.c:601
        kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
        __asan_memcpy+0x29/0x70 mm/kasan/shadow.c:105
        copy_from_sockptr_offset include/linux/sockptr.h:49 [inline]
        copy_from_sockptr include/linux/sockptr.h:55 [inline]
        do_replace net/ipv4/netfilter/ip_tables.c:1111 [inline]
        do_ipt_set_ctl+0x902/0x3dd0 net/ipv4/netfilter/ip_tables.c:1627
        nf_setsockopt+0x295/0x2c0 net/netfilter/nf_sockopt.c:101
        do_sock_setsockopt+0x3af/0x720 net/socket.c:2311
        __sys_setsockopt+0x1ae/0x250 net/socket.c:2334
        __do_sys_setsockopt net/socket.c:2343 [inline]
        __se_sys_setsockopt net/socket.c:2340 [inline]
        __x64_sys_setsockopt+0xb5/0xd0 net/socket.c:2340
       do_syscall_64+0xfb/0x240
       entry_SYSCALL_64_after_hwframe+0x72/0x7a
      RIP: 0033:0x7fd22067dde9
      Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007fd21f9ff0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
      RAX: ffffffffffffffda RBX: 00007fd2207abf80 RCX: 00007fd22067dde9
      RDX: 0000000000000040 RSI: 0000000000000000 RDI: 0000000000000003
      RBP: 00007fd2206ca47a R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000020000880 R11: 0000000000000246 R12: 0000000000000000
      R13: 000000000000000b R14: 00007fd2207abf80 R15: 00007ffd2d0170d8
       </TASK>
      
      Allocated by task 7238:
        kasan_save_stack mm/kasan/common.c:47 [inline]
        kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
        poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
        __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387
        kasan_kmalloc include/linux/kasan.h:211 [inline]
        __do_kmalloc_node mm/slub.c:4069 [inline]
        __kmalloc_noprof+0x200/0x410 mm/slub.c:4082
        kmalloc_noprof include/linux/slab.h:664 [inline]
        __cgroup_bpf_run_filter_setsockopt+0xd47/0x1050 kernel/bpf/cgroup.c:1869
        do_sock_setsockopt+0x6b4/0x720 net/socket.c:2293
        __sys_setsockopt+0x1ae/0x250 net/socket.c:2334
        __do_sys_setsockopt net/socket.c:2343 [inline]
        __se_sys_setsockopt net/socket.c:2340 [inline]
        __x64_sys_setsockopt+0xb5/0xd0 net/socket.c:2340
       do_syscall_64+0xfb/0x240
       entry_SYSCALL_64_after_hwframe+0x72/0x7a
      
      The buggy address belongs to the object at ffff88802cd73da0
       which belongs to the cache kmalloc-8 of size 8
      The buggy address is located 0 bytes inside of
       allocated 1-byte region [ffff88802cd73da0, ffff88802cd73da1)
      
      The buggy address belongs to the physical page:
      page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88802cd73020 pfn:0x2cd73
      flags: 0xfff80000000000(node=0|zone=1|lastcpupid=0xfff)
      page_type: 0xffffefff(slab)
      raw: 00fff80000000000 ffff888015041280 dead000000000100 dead000000000122
      raw: ffff88802cd73020 000000008080007f 00000001ffffefff 0000000000000000
      page dumped because: kasan: bad access detected
      page_owner tracks the page as allocated
      page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY), pid 5103, tgid 2119833701 (syz-executor.4), ts 5103, free_ts 70804600828
        set_page_owner include/linux/page_owner.h:32 [inline]
        post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1490
        prep_new_page mm/page_alloc.c:1498 [inline]
        get_page_from_freelist+0x2e7e/0x2f40 mm/page_alloc.c:3454
        __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4712
        __alloc_pages_node_noprof include/linux/gfp.h:244 [inline]
        alloc_pages_node_noprof include/linux/gfp.h:271 [inline]
        alloc_slab_page+0x5f/0x120 mm/slub.c:2249
        allocate_slab+0x5a/0x2e0 mm/slub.c:2412
        new_slab mm/slub.c:2465 [inline]
        ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3615
        __slab_alloc+0x58/0xa0 mm/slub.c:3705
        __slab_alloc_node mm/slub.c:3758 [inline]
        slab_alloc_node mm/slub.c:3936 [inline]
        __do_kmalloc_node mm/slub.c:4068 [inline]
        kmalloc_node_track_caller_noprof+0x286/0x450 mm/slub.c:4089
        kstrdup+0x3a/0x80 mm/util.c:62
        device_rename+0xb5/0x1b0 drivers/base/core.c:4558
        dev_change_name+0x275/0x860 net/core/dev.c:1232
        do_setlink+0xa4b/0x41f0 net/core/rtnetlink.c:2864
        __rtnl_newlink net/core/rtnetlink.c:3680 [inline]
        rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3727
        rtnetlink_rcv_msg+0x89b/0x10d0 net/core/rtnetlink.c:6594
        netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
        netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
        netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
      page last free pid 5146 tgid 5146 stack trace:
        reset_page_owner include/linux/page_owner.h:25 [inline]
        free_pages_prepare mm/page_alloc.c:1110 [inline]
        free_unref_page+0xd3c/0xec0 mm/page_alloc.c:2617
        discard_slab mm/slub.c:2511 [inline]
        __put_partials+0xeb/0x130 mm/slub.c:2980
        put_cpu_partial+0x17c/0x250 mm/slub.c:3055
        __slab_free+0x2ea/0x3d0 mm/slub.c:4254
        qlink_free mm/kasan/quarantine.c:163 [inline]
        qlist_free_all+0x9e/0x140 mm/kasan/quarantine.c:179
        kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286
        __kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:322
        kasan_slab_alloc include/linux/kasan.h:201 [inline]
        slab_post_alloc_hook mm/slub.c:3888 [inline]
        slab_alloc_node mm/slub.c:3948 [inline]
        __do_kmalloc_node mm/slub.c:4068 [inline]
        __kmalloc_node_noprof+0x1d7/0x450 mm/slub.c:4076
        kmalloc_node_noprof include/linux/slab.h:681 [inline]
        kvmalloc_node_noprof+0x72/0x190 mm/util.c:634
        bucket_table_alloc lib/rhashtable.c:186 [inline]
        rhashtable_rehash_alloc+0x9e/0x290 lib/rhashtable.c:367
        rht_deferred_worker+0x4e1/0x2440 lib/rhashtable.c:427
        process_one_work kernel/workqueue.c:3218 [inline]
        process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3299
        worker_thread+0x86d/0xd70 kernel/workqueue.c:3380
        kthread+0x2f0/0x390 kernel/kthread.c:388
        ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
        ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
      
      Memory state around the buggy address:
       ffff88802cd73c80: 07 fc fc fc 05 fc fc fc 05 fc fc fc fa fc fc fc
       ffff88802cd73d00: fa fc fc fc fa fc fc fc fa fc fc fc fa fc fc fc
      >ffff88802cd73d80: fa fc fc fc 01 fc fc fc fa fc fc fc fa fc fc fc
                                     ^
       ffff88802cd73e00: fa fc fc fc fa fc fc fc 05 fc fc fc 07 fc fc fc
       ffff88802cd73e80: 07 fc fc fc 07 fc fc fc 07 fc fc fc 07 fc fc fc
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Link: https://lore.kernel.org/r/20240404122051.2303764-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      440e948c
    • Ziyang Xuan's avatar
      netfilter: nf_tables: Fix potential data-race in __nft_flowtable_type_get() · 2485bcfe
      Ziyang Xuan authored
      commit 24225011 upstream.
      
      nft_unregister_flowtable_type() within nf_flow_inet_module_exit() can
      run concurrently with __nft_flowtable_type_get() within
      nf_tables_newflowtable(). There is no protection when iterating over the
      nf_tables_flowtables list in __nft_flowtable_type_get(), so there is a
      potential data race on nf_tables_flowtables list entries.
      
      Use list_for_each_entry_rcu() to iterate over nf_tables_flowtables list
      in __nft_flowtable_type_get(), and use rcu_read_lock() in the caller
      nft_flowtable_type_get() to protect the entire type query process.
      
      Fixes: 3b49e2e9 ("netfilter: nf_tables: add flow table netlink frontend")
      Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2485bcfe
    • Pablo Neira Ayuso's avatar
      netfilter: nf_tables: flush pending destroy work before exit_net release · f7e3c88c
      Pablo Neira Ayuso authored
      commit 24cea967 upstream.
      
      Similar to 2c9f0293 ("netfilter: nf_tables: flush pending destroy
      work before netlink notifier"), this addresses a race between exit_net
      and the destroy workqueue.
      
      The trace below shows an element being released via the destroy
      workqueue after the exit_net path (triggered via module removal) has
      already released the set used in that transaction.
      
      [ 1360.547789] BUG: KASAN: slab-use-after-free in nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
      [ 1360.547861] Read of size 8 at addr ffff888140500cc0 by task kworker/4:1/152465
      [ 1360.547870] CPU: 4 PID: 152465 Comm: kworker/4:1 Not tainted 6.8.0+ #359
      [ 1360.547882] Workqueue: events nf_tables_trans_destroy_work [nf_tables]
      [ 1360.547984] Call Trace:
      [ 1360.547991]  <TASK>
      [ 1360.547998]  dump_stack_lvl+0x53/0x70
      [ 1360.548014]  print_report+0xc4/0x610
      [ 1360.548026]  ? __virt_addr_valid+0xba/0x160
      [ 1360.548040]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
      [ 1360.548054]  ? nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
      [ 1360.548176]  kasan_report+0xae/0xe0
      [ 1360.548189]  ? nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
      [ 1360.548312]  nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
      [ 1360.548447]  ? __pfx_nf_tables_trans_destroy_work+0x10/0x10 [nf_tables]
      [ 1360.548577]  ? _raw_spin_unlock_irq+0x18/0x30
      [ 1360.548591]  process_one_work+0x2f1/0x670
      [ 1360.548610]  worker_thread+0x4d3/0x760
      [ 1360.548627]  ? __pfx_worker_thread+0x10/0x10
      [ 1360.548640]  kthread+0x16b/0x1b0
      [ 1360.548653]  ? __pfx_kthread+0x10/0x10
      [ 1360.548665]  ret_from_fork+0x2f/0x50
      [ 1360.548679]  ? __pfx_kthread+0x10/0x10
      [ 1360.548690]  ret_from_fork_asm+0x1a/0x30
      [ 1360.548707]  </TASK>
      
      [ 1360.548719] Allocated by task 192061:
      [ 1360.548726]  kasan_save_stack+0x20/0x40
      [ 1360.548739]  kasan_save_track+0x14/0x30
      [ 1360.548750]  __kasan_kmalloc+0x8f/0xa0
      [ 1360.548760]  __kmalloc_node+0x1f1/0x450
      [ 1360.548771]  nf_tables_newset+0x10c7/0x1b50 [nf_tables]
      [ 1360.548883]  nfnetlink_rcv_batch+0xbc4/0xdc0 [nfnetlink]
      [ 1360.548909]  nfnetlink_rcv+0x1a8/0x1e0 [nfnetlink]
      [ 1360.548927]  netlink_unicast+0x367/0x4f0
      [ 1360.548935]  netlink_sendmsg+0x34b/0x610
      [ 1360.548944]  ____sys_sendmsg+0x4d4/0x510
      [ 1360.548953]  ___sys_sendmsg+0xc9/0x120
      [ 1360.548961]  __sys_sendmsg+0xbe/0x140
      [ 1360.548971]  do_syscall_64+0x55/0x120
      [ 1360.548982]  entry_SYSCALL_64_after_hwframe+0x55/0x5d
      
      [ 1360.548994] Freed by task 192222:
      [ 1360.548999]  kasan_save_stack+0x20/0x40
      [ 1360.549009]  kasan_save_track+0x14/0x30
      [ 1360.549019]  kasan_save_free_info+0x3b/0x60
      [ 1360.549028]  poison_slab_object+0x100/0x180
      [ 1360.549036]  __kasan_slab_free+0x14/0x30
      [ 1360.549042]  kfree+0xb6/0x260
      [ 1360.549049]  __nft_release_table+0x473/0x6a0 [nf_tables]
      [ 1360.549131]  nf_tables_exit_net+0x170/0x240 [nf_tables]
      [ 1360.549221]  ops_exit_list+0x50/0xa0
      [ 1360.549229]  free_exit_list+0x101/0x140
      [ 1360.549236]  unregister_pernet_operations+0x107/0x160
      [ 1360.549245]  unregister_pernet_subsys+0x1c/0x30
      [ 1360.549254]  nf_tables_module_exit+0x43/0x80 [nf_tables]
      [ 1360.549345]  __do_sys_delete_module+0x253/0x370
      [ 1360.549352]  do_syscall_64+0x55/0x120
      [ 1360.549360]  entry_SYSCALL_64_after_hwframe+0x55/0x5d
      
      (gdb) list *__nft_release_table+0x473
      0x1e033 is in __nft_release_table (net/netfilter/nf_tables_api.c:11354).
      11349           list_for_each_entry_safe(flowtable, nf, &table->flowtables, list) {
      11350                   list_del(&flowtable->list);
      11351                   nft_use_dec(&table->use);
      11352                   nf_tables_flowtable_destroy(flowtable);
      11353           }
      11354           list_for_each_entry_safe(set, ns, &table->sets, list) {
      11355                   list_del(&set->list);
      11356                   nft_use_dec(&table->use);
      11357                   if (set->flags & (NFT_SET_MAP | NFT_SET_OBJECT))
      11358                           nft_map_deactivate(&ctx, set);
      (gdb)
      
      [ 1360.549372] Last potentially related work creation:
      [ 1360.549376]  kasan_save_stack+0x20/0x40
      [ 1360.549384]  __kasan_record_aux_stack+0x9b/0xb0
      [ 1360.549392]  __queue_work+0x3fb/0x780
      [ 1360.549399]  queue_work_on+0x4f/0x60
      [ 1360.549407]  nft_rhash_remove+0x33b/0x340 [nf_tables]
      [ 1360.549516]  nf_tables_commit+0x1c6a/0x2620 [nf_tables]
      [ 1360.549625]  nfnetlink_rcv_batch+0x728/0xdc0 [nfnetlink]
      [ 1360.549647]  nfnetlink_rcv+0x1a8/0x1e0 [nfnetlink]
      [ 1360.549671]  netlink_unicast+0x367/0x4f0
      [ 1360.549680]  netlink_sendmsg+0x34b/0x610
      [ 1360.549690]  ____sys_sendmsg+0x4d4/0x510
      [ 1360.549697]  ___sys_sendmsg+0xc9/0x120
      [ 1360.549706]  __sys_sendmsg+0xbe/0x140
      [ 1360.549715]  do_syscall_64+0x55/0x120
      [ 1360.549725]  entry_SYSCALL_64_after_hwframe+0x55/0x5d
      
      Fixes: 0935d558 ("netfilter: nf_tables: asynchronous release")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f7e3c88c
    • Pablo Neira Ayuso's avatar
      netfilter: nf_tables: reject new basechain after table flag update · 8ba81dca
      Pablo Neira Ayuso authored
      commit 994209dd upstream.
      
      When dormant flag is toggled, hooks are disabled in the commit phase by
      iterating over current chains in table (existing and new).
      
      The following configuration allows for an inconsistent state:
      
        add table x
        add chain x y { type filter hook input priority 0; }
        add table x { flags dormant; }
        add chain x w { type filter hook input priority 1; }
      
      which triggers the following warning when trying to unregister chain w
      which is already unregistered.
      
      [  127.322252] WARNING: CPU: 7 PID: 1211 at net/netfilter/core.c:501 __nf_unregister_net_hook+0x21a/0x260
      [...]
      [  127.322519] Call Trace:
      [  127.322521]  <TASK>
      [  127.322524]  ? __warn+0x9f/0x1a0
      [  127.322531]  ? __nf_unregister_net_hook+0x21a/0x260
      [  127.322537]  ? report_bug+0x1b1/0x1e0
      [  127.322545]  ? handle_bug+0x3c/0x70
      [  127.322552]  ? exc_invalid_op+0x17/0x40
      [  127.322556]  ? asm_exc_invalid_op+0x1a/0x20
      [  127.322563]  ? kasan_save_free_info+0x3b/0x60
      [  127.322570]  ? __nf_unregister_net_hook+0x6a/0x260
      [  127.322577]  ? __nf_unregister_net_hook+0x21a/0x260
      [  127.322583]  ? __nf_unregister_net_hook+0x6a/0x260
      [  127.322590]  ? __nf_tables_unregister_hook+0x8a/0xe0 [nf_tables]
      [  127.322655]  nft_table_disable+0x75/0xf0 [nf_tables]
      [  127.322717]  nf_tables_commit+0x2571/0x2620 [nf_tables]
      
      Fixes: 179d9ba5 ("netfilter: nf_tables: fix table flag updates")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8ba81dca
    • Sean Christopherson's avatar
      KVM: x86: Mark target gfn of emulated atomic instruction as dirty · a9bd6bb6
      Sean Christopherson authored
      commit 910c57df upstream.
      
      When emulating an atomic access on behalf of the guest, mark the target
      gfn dirty if the CMPXCHG by KVM is attempted and doesn't fault.  This
      fixes a bug where KVM effectively corrupts guest memory during live
      migration by writing to guest memory without informing userspace that the
      page is dirty.
      
      Marking the page dirty got unintentionally dropped when KVM's emulated
      CMPXCHG was converted to do a user access.  Before that, KVM explicitly
      mapped the guest page into kernel memory, and marked the page dirty during
      the unmap phase.
      
      Mark the page dirty even if the CMPXCHG fails, as the old data is written
      back on failure, i.e. the page is still written.  The value written is
      guaranteed to be the same because the operation is atomic, but KVM's ABI
      is that all writes are dirty logged regardless of the value written.  And
      more importantly, that's what KVM did before the buggy commit.
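
      The dirty-logging rule above can be illustrated with a small userspace
      model. This is only a sketch and not KVM's code: struct toy_guest,
      emulate_cmpxchg() and the per-page dirty array are hypothetical names,
      and C11 atomics stand in for the CMPXCHG on the userspace address. The
      point it shows is that the page is marked dirty whenever the exchange
      was attempted without faulting, whether or not the comparison succeeded.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_NR_PAGES   4
#define TOY_WORDS_PER_PAGE ((1u << TOY_PAGE_SHIFT) / sizeof(uint64_t))

/* Hypothetical toy guest: a few pages of 8-byte words plus a dirty log. */
struct toy_guest {
	_Atomic uint64_t mem[TOY_NR_PAGES * TOY_WORDS_PER_PAGE];
	bool dirty[TOY_NR_PAGES];
};

/*
 * Emulate an 8-byte atomic compare-and-exchange at guest physical address
 * gpa. Returns -1 to model a fault (nothing attempted, nothing dirtied),
 * 0 otherwise. *exchanged reports whether the comparison succeeded.
 */
static int emulate_cmpxchg(struct toy_guest *g, uint64_t gpa,
			   uint64_t expected, uint64_t desired,
			   bool *exchanged)
{
	uint64_t gfn = gpa >> TOY_PAGE_SHIFT;

	assert(exchanged != NULL);

	if (gfn >= TOY_NR_PAGES || (gpa & 7))
		return -1;

	*exchanged = atomic_compare_exchange_strong(&g->mem[gpa / 8],
						    &expected, desired);

	/* Attempted and did not fault: log the page even if the compare failed. */
	g->dirty[gfn] = true;
	return 0;
}
```

      In this model a failed comparison at address 0x1000 still sets dirty[1],
      while an out-of-range address returns -1 and leaves the dirty log
      untouched.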
      
      Huge kudos to the folks on the Cc list (and many others), who did all the
      actual work of triaging and debugging.
      
      Fixes: 1c2361f6 ("KVM: x86: Use __try_cmpxchg_user() to emulate atomic accesses")
      Cc: stable@vger.kernel.org
      Cc: David Matlack <dmatlack@google.com>
      Cc: Pasha Tatashin <tatashin@google.com>
      Cc: Michael Krebs <mkrebs@google.com>
      base-commit: 6769ea8da8a93ed4630f1ce64df6aafcaabfce64
      Reviewed-by: Jim Mattson <jmattson@google.com>
      Link: https://lore.kernel.org/r/20240215010004.1456078-2-seanjc@google.com
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a9bd6bb6
    • Sean Christopherson's avatar
      KVM: x86: Bail to userspace if emulation of atomic user access faults · bd9a25a0
      Sean Christopherson authored
      commit 5d6c7de6 upstream.
      
      Exit to userspace when emulating an atomic guest access if the CMPXCHG on
      the userspace address faults.  Emulating the access as a write and thus
      likely treating it as emulated MMIO is wrong, as KVM has already
      confirmed there is a valid, writable memslot.
      
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220202004945.2540433-6-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      bd9a25a0
    • Ye Zhang's avatar
      thermal: devfreq_cooling: Fix perf state when calculate dfc res_util · a7c6a643
      Ye Zhang authored
      commit a26de34b upstream.
      
      The issue occurs when the devfreq cooling device uses the EM power model
      and the get_real_power() callback is provided by the driver.
      
      The EM power table is sorted in ascending order, so it cannot be indexed
      directly by cooling device state. Convert the cooling state to a
      performance state with dfc->max_state - dfc->capped_state.
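
      The index conversion can be sketched in plain C. This is an
      illustration under stated assumptions, not the driver code: struct
      perf_entry, cooling_to_perf_state() and power_at_cooling_state() are
      hypothetical names, and the sketch assumes cooling state 0 means no
      throttling, so it must map to the last (highest-frequency) entry of an
      ascending table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one row of a power table sorted by ascending frequency. */
struct perf_entry {
	unsigned long freq_khz;
	unsigned long power_mw;
};

/*
 * Mirror the index: cooling state 0 (no throttling) selects the last table
 * entry, and cooling state max_state selects entry 0.
 */
static size_t cooling_to_perf_state(size_t max_state, size_t cooling_state)
{
	assert(cooling_state <= max_state);
	return max_state - cooling_state;
}

/* Look up the power for a cooling state in an ascending table of nr_entries rows. */
static unsigned long power_at_cooling_state(const struct perf_entry *table,
					    size_t nr_entries,
					    size_t cooling_state)
{
	size_t max_state = nr_entries - 1;

	return table[cooling_to_perf_state(max_state, cooling_state)].power_mw;
}
```

      Indexing the table with the cooling state directly would return the
      lowest-frequency entry for an unthrottled device, which is the kind of
      mismatch the conversion avoids.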
      
      Fixes: 615510fe ("thermal: devfreq_cooling: remove old power model and use EM")
      Cc: 5.11+ <stable@vger.kernel.org> # 5.11+
      Signed-off-by: Ye Zhang <ye.zhang@rock-chips.com>
      Reviewed-by: Dhruva Gole <d-gole@ti.com>
      Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a7c6a643