  1. Apr 10, 2024
    • virtchnl: Add header dependencies · 0c52a50a
      Ivan Vecera authored
      [ Upstream commit 7151d87a ]
      
      The <linux/avf/virtchnl.h> header uses the BIT, struct_size and ETH_ALEN
      macros but does not include the header files that define them.
      Add these dependencies so this header file can be included anywhere.
      
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Stable-dep-of: 6dbdd4de ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • i40e: Refactor I40E_MDIO_CLAUSE* macros · 45116a7c
      Ivan Vecera authored
      [ Upstream commit 8196b5fd ]
      
      The macros I40E_MDIO_CLAUSE22* and I40E_MDIO_CLAUSE45* are using I40E_MASK
      together with the same values I40E_GLGEN_MSCA_STCODE_SHIFT and
      I40E_GLGEN_MSCA_OPCODE_SHIFT to define masks.
      Introduce I40E_GLGEN_MSCA_OPCODE_MASK and I40E_GLGEN_MSCA_STCODE_MASK
      for both shifts in i40e_register.h and use them to refactor the macros
      mentioned above.
      
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
      Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Stable-dep-of: 6dbdd4de ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • i40e: Remove back pointer from i40e_hw structure · f629cf15
      Ivan Vecera authored
      [ Upstream commit 39ec612a ]
      
      The .back field placed in i40e_hw is used to get a pointer to the i40e_pf
      instance, but it is not necessary: i40e_hw is a part of i40e_pf, so the
      container_of macro can be used to obtain the pointer to i40e_pf.
      Remove the .back field from the i40e_hw structure, introduce the
      i40e_hw_to_pf() and i40e_hw_to_dev() helpers and use them.
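
A minimal userspace sketch of the container_of approach described above. The structures are heavily simplified stand-ins (their fields are hypothetical, not the driver's real layout), and the macro is a plain-C approximation of the kernel's container_of():

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(): recover the address
 * of the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, simplified versions of the driver structures. */
struct i40e_hw {
	int mac_type;
};

struct i40e_pf {
	int id;
	struct i40e_hw hw;	/* embedded member, so no back pointer is needed */
};

/* Sketch of the helper named in the commit message. */
static inline struct i40e_pf *i40e_hw_to_pf(struct i40e_hw *hw)
{
	return container_of(hw, struct i40e_pf, hw);
}
```

Because hw is embedded in i40e_pf at a fixed offset, the conversion is pure pointer arithmetic and cannot go stale the way a separately stored back pointer could.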
      
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
      Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Stable-dep-of: 6dbdd4de ("e1000e: Workaround for sporadic MDI error on Meteor Lake systems")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • i40e: Enforce software interrupt during busy-poll exit · 66ca011a
      Ivan Vecera authored
      [ Upstream commit ea558de7 ]
      
      As with the ice bug fixed by commit b7306b42 ("ice: manage interrupts
      during poll exit") followed by commit 23be7075 ("ice: fix software
      generating extra interrupts"), I'm seeing a similar issue with the
      i40e driver.
      
      In certain situations, when busy-loop is enabled together with adaptive
      coalescing, the driver occasionally misses that there are outstanding
      descriptors to clean when exiting busy poll.
      
      Try to catch the remaining work by triggering a software interrupt
      when exiting busy poll. No extra interrupts will be generated when
      busy polling is not used.
      
      The issue was found when running a sockperf ping-pong tcp test with
      adaptive coalescing and busy poll enabled (busy_poll and busy_read
      sysctl knobs set to 50); it results in huge latency spikes of
      more than 100000us.
      
      The fix is inspired by the ice driver and does the following:
      1) During napi poll exit in case of busy-poll (napi_complete_done()
         returns false), records in the q_vector that we were in busy
         loop.
      2) Extends i40e_buildreg_itr() to be able to add an enforced software
         interrupt into the built value.
      3) In i40e_update_enable_itr(), enforces a software interrupt trigger
         if we are exiting busy poll, to catch any pending clean-ups.
      4) Reuses the unused 3rd ITR (interrupt throttle) index and sets it to
         20K interrupts per second to limit the number of these sw interrupts.
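
The steps listed above can be modelled roughly as follows. This is only a sketch under stated assumptions: every demo_ name and every bit position is hypothetical and chosen for illustration, not taken from the driver's register definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register layout, for illustration only. */
#define DEMO_INTENA		(1u << 0)
#define DEMO_SWINT_TRIG		(1u << 2)
#define DEMO_ITR_INDX_SHIFT	3
#define DEMO_SW_ITR_INDX_SHIFT	25
#define DEMO_SW_ITR_IDX		2u	/* the otherwise unused 3rd ITR index */

struct demo_q_vector {
	bool in_busy_poll;	/* set when a poll ended inside busy poll */
};

/* Build the interrupt-enable value, optionally forcing a software
 * interrupt that is throttled by its own ITR index. */
static uint32_t demo_buildreg_itr(uint32_t itr_idx, bool force_swint)
{
	uint32_t val = DEMO_INTENA | (itr_idx << DEMO_ITR_INDX_SHIFT);

	if (force_swint)
		val |= DEMO_SWINT_TRIG |
		       (DEMO_SW_ITR_IDX << DEMO_SW_ITR_INDX_SHIFT);
	return val;
}

/* Re-enable the interrupt; if the previous poll exited from busy poll,
 * force one software interrupt to catch any work that was missed. */
static uint32_t demo_update_enable_itr(struct demo_q_vector *qv,
				       uint32_t itr_idx)
{
	bool force = qv->in_busy_poll;

	qv->in_busy_poll = false;
	return demo_buildreg_itr(itr_idx, force);
}

/* Poll exit: complete_done mirrors the napi_complete_done() result.
 * Returns the value that would be written, or 0 if nothing is written. */
static uint32_t demo_napi_poll_exit(struct demo_q_vector *qv,
				    bool complete_done, uint32_t itr_idx)
{
	if (!complete_done) {
		qv->in_busy_poll = true;	/* remember the busy-poll exit */
		return 0;
	}
	return demo_update_enable_itr(qv, itr_idx);
}
```

In this model the software interrupt is forced only once after a busy-poll exit, which matches the statement that no extra interrupts are generated when busy polling is not used.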
      
      Test results
      ============
      Prior:
      [root@dell-per640-07 net]# sockperf ping-pong -i 10.9.9.1 --tcp -m 1000 --mps=max -t 120
      sockperf: == version #3.10-no.git ==
      sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)
      
      [ 0] IP = 10.9.9.1        PORT = 11111 # TCP
      sockperf: Warmup stage (sending a few dummy messages)...
      sockperf: Starting test...
      sockperf: Test end (interrupted by timer)
      sockperf: Test ended
      sockperf: [Total Run] RunTime=119.999 sec; Warm up time=400 msec; SentMessages=2438563; ReceivedMessages=2438562
      sockperf: ========= Printing statistics for Server No: 0
      sockperf: [Valid Duration] RunTime=119.549 sec; SentMessages=2429473; ReceivedMessages=2429473
      sockperf: ====> avg-latency=24.571 (std-dev=93.297, mean-ad=4.904, median-ad=1.510, siqr=1.063, cv=3.797, std-error=0.060, 99.0% ci=[24.417, 24.725])
      sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
      sockperf: Summary: Latency is 24.571 usec
      sockperf: Total 2429473 observations; each percentile contains 24294.73 observations
      sockperf: ---> <MAX> observation = 103294.331
      sockperf: ---> percentile 99.999 =   45.633
      sockperf: ---> percentile 99.990 =   37.013
      sockperf: ---> percentile 99.900 =   35.910
      sockperf: ---> percentile 99.000 =   33.390
      sockperf: ---> percentile 90.000 =   28.626
      sockperf: ---> percentile 75.000 =   27.741
      sockperf: ---> percentile 50.000 =   26.743
      sockperf: ---> percentile 25.000 =   25.614
      sockperf: ---> <MIN> observation =   12.220
      
      After:
      [root@dell-per640-07 net]# sockperf ping-pong -i 10.9.9.1 --tcp -m 1000 --mps=max -t 120
      sockperf: == version #3.10-no.git ==
      sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)
      
      [ 0] IP = 10.9.9.1        PORT = 11111 # TCP
      sockperf: Warmup stage (sending a few dummy messages)...
      sockperf: Starting test...
      sockperf: Test end (interrupted by timer)
      sockperf: Test ended
      sockperf: [Total Run] RunTime=119.999 sec; Warm up time=400 msec; SentMessages=2400055; ReceivedMessages=2400054
      sockperf: ========= Printing statistics for Server No: 0
      sockperf: [Valid Duration] RunTime=119.549 sec; SentMessages=2391186; ReceivedMessages=2391186
      sockperf: ====> avg-latency=24.965 (std-dev=5.934, mean-ad=4.642, median-ad=1.485, siqr=1.067, cv=0.238, std-error=0.004, 99.0% ci=[24.955, 24.975])
      sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
      sockperf: Summary: Latency is 24.965 usec
      sockperf: Total 2391186 observations; each percentile contains 23911.86 observations
      sockperf: ---> <MAX> observation =  195.841
      sockperf: ---> percentile 99.999 =   45.026
      sockperf: ---> percentile 99.990 =   39.009
      sockperf: ---> percentile 99.900 =   35.922
      sockperf: ---> percentile 99.000 =   33.482
      sockperf: ---> percentile 90.000 =   28.902
      sockperf: ---> percentile 75.000 =   27.821
      sockperf: ---> percentile 50.000 =   26.860
      sockperf: ---> percentile 25.000 =   25.685
      sockperf: ---> <MIN> observation =   12.277
      
      Fixes: 0bcd952f ("ethernet/intel: consolidate NAPI and NAPI exit")
      Reported-by: Hugo Ferreira <hferreir@redhat.com>
      Reviewed-by: Michal Schmidt <mschmidt@redhat.com>
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • i40e: Remove _t suffix from enum type names · e6d25dbd
      Ivan Vecera authored
      [ Upstream commit addca917 ]
      
      Enum type names should not be suffixed by '_t'. The alternative would
      be to use 'typedef enum name name_t' so that plain 'name_t var' can be
      written instead of 'enum name_t var'.
      
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Link: https://lore.kernel.org/r/20231113231047.548659-6-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Stable-dep-of: ea558de7 ("i40e: Enforce software interrupt during busy-poll exit")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • drm/amd: Flush GFXOFF requests in prepare stage · 3da10e91
      Mario Limonciello authored
      [ Upstream commit ca299b45 ]
      
      If the system hasn't entered GFXOFF when suspend starts it can cause
      hangs accessing GC and RLC during the suspend stage.
      
      Cc: <stable@vger.kernel.org> # 6.1.y: 5095d541 ("drm/amd: Evict resources during PM ops prepare() callback")
      Cc: <stable@vger.kernel.org> # 6.1.y: cb11ca32 ("drm/amd: Add concept of running prepare_suspend() sequence for IP blocks")
      Cc: <stable@vger.kernel.org> # 6.1.y: 2ceec37b ("drm/amd: Add missing kernel doc for prepare_suspend()")
      Cc: <stable@vger.kernel.org> # 6.1.y: 3a9626c8 ("drm/amd: Stop evicting resources on APUs in suspend")
      Cc: <stable@vger.kernel.org> # 6.6.y: 5095d541 ("drm/amd: Evict resources during PM ops prepare() callback")
      Cc: <stable@vger.kernel.org> # 6.6.y: cb11ca32 ("drm/amd: Add concept of running prepare_suspend() sequence for IP blocks")
      Cc: <stable@vger.kernel.org> # 6.6.y: 2ceec37b ("drm/amd: Add missing kernel doc for prepare_suspend()")
      Cc: <stable@vger.kernel.org> # 6.6.y: 3a9626c8 ("drm/amd: Stop evicting resources on APUs in suspend")
      Cc: <stable@vger.kernel.org> # 6.1+
      Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3132
      Fixes: ab475033 ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks")
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • drm/amd: Add concept of running prepare_suspend() sequence for IP blocks · da67a113
      Mario Limonciello authored
      [ Upstream commit cb11ca32 ]
      
      If any IP blocks allocate memory during their hw_fini() sequence
      this can cause the suspend to fail under memory pressure.  Introduce
      a new phase that IP blocks can use to allocate memory before suspend
      starts so that it can potentially be evicted into swap instead.
      
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Stable-dep-of: ca299b45 ("drm/amd: Flush GFXOFF requests in prepare stage")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • drm/amd: Evict resources during PM ops prepare() callback · 8b5f7204
      Mario Limonciello authored
      [ Upstream commit 5095d541 ]
      
      Linux PM core has a prepare() callback run before suspend.
      
      If the system is under high memory pressure, the resources may need
      to be evicted into swap instead.  If the storage backing for swap
      is offlined during the suspend() step then such a call may fail.
      
      So move this step into prepare() to evict the majority of
      resources, and update all non-pmops callers to call the same callback.
      
      Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Stable-dep-of: ca299b45 ("drm/amd: Flush GFXOFF requests in prepare stage")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • drm/amd/display: Prevent crash when disable stream · 4356a2c3
      Chris Park authored
      [ Upstream commit 72d72e8f ]
      
      [Why]
      Disabling stream encoder invokes a function that no longer exists.
      
      [How]
      Check whether the function pointer is NULL before calling it when
      disabling the stream encoder.
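
A minimal sketch of the kind of guard described in [How], assuming a hypothetical ops table; the demo_ types and hook name are illustrative and are not the display driver's real interfaces:

```c
#include <stddef.h>

struct demo_encoder;

/* Hypothetical ops table; a hook may be NULL on hardware that lacks it. */
struct demo_encoder_funcs {
	void (*disable_stream)(struct demo_encoder *enc);
};

struct demo_encoder {
	const struct demo_encoder_funcs *funcs;
	int disabled;
};

/* Example hook implementation used for hardware that provides one. */
static void demo_disable(struct demo_encoder *enc)
{
	enc->disabled = 1;
}

/* Call the hook only when it exists.
 * Returns 1 if the hook ran, 0 if it was skipped. */
static int demo_disable_stream_encoder(struct demo_encoder *enc)
{
	if (!enc->funcs || !enc->funcs->disable_stream)
		return 0;
	enc->funcs->disable_stream(enc);
	return 1;
}
```

Skipping the call when the pointer is NULL turns what would be a NULL dereference into a no-op for encoders without that hook.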
      
      Cc: Mario Limonciello <mario.limonciello@amd.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: Charlene Liu <charlene.liu@amd.com>
      Acked-by: Wayne Lin <wayne.lin@amd.com>
      Signed-off-by: Chris Park <chris.park@amd.com>
      Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • drm/amd/display: Fix DPSTREAM CLK on and off sequence · 8dc9a275
      Dmytro Laktyushkin authored
      [ Upstream commit e8d13128 ]
      
      [Why]
      Secondary DP2 display fails to light up in some instances
      
      [How]
      The clock needs to be on when DPSTREAMCLK*_EN = 1. This change
      moves the dtbclk_p enable/disable point to make sure this is
      the case.
      
      Reviewed-by: Charlene Liu <charlene.liu@amd.com>
      Reviewed-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
      Acked-by: Tom Chung <chiahsuan.chung@amd.com>
      Signed-off-by: Daniel Miess <daniel.miess@amd.com>
      Signed-off-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
      Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Stable-dep-of: 72d72e8f ("drm/amd/display: Prevent crash when disable stream")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • usb: typec: ucsi: Fix race between typec_switch and role_switch · 113b12e1
      Krishna Kurapati authored
      [ Upstream commit f5e9bda0 ]
      
      When orientation switch is enabled in ucsi glink, an xhci probe
      failure is seen when booting up in host mode in reverse
      orientation.
      
      During bootup the following things happen in multiple drivers:
      
      a) DWC3 controller driver initializes the core in device mode when the
      dr_mode is set to DRD. It relies on role_switch call to change role to
      host.
      
      b) QMP driver initializes the lanes to TYPEC_ORIENTATION_NORMAL as a
      normal routine. It relies on the typec_switch_set call to get notified
      of orientation changes.
      
      c) UCSI core reads the UCSI_GET_CONNECTOR_STATUS via the glink and
      provides initial role switch to dwc3 controller.
      
      When booting up in host mode with orientation TYPEC_ORIENTATION_REVERSE,
      then we see the following things happening in order:
      
      a) UCSI gives initial role as host to dwc3 controller ucsi_register_port.
      Upon receiving this notification, the dwc3 core needs to program GCTL from
      PRTCAP_DEVICE to PRTCAP_HOST and as part of this change, it asserts GCTL
      Core soft reset and waits for it to be completed before shifting it to
      host. Only after the reset is done will the dwc3_host_init be invoked and
      xhci is probed. DWC3 controller expects that the usb phy's are stable
      during this process i.e., the phy init is already done.
      
      b) During the 100ms wait for GCTL core soft reset, the actual notification
      from PPM is received by ucsi_glink via pmic glink for changing role to
      host. The pmic_glink_ucsi_notify routine first sends the orientation
      change to QMP and then sends role to dwc3 via ucsi framework. This is
      happening exactly at the time GCTL core soft reset is being processed.
      
      c) When QMP driver receives typec switch to TYPEC_ORIENTATION_REVERSE, it
      then re-programs the phy at the instant GCTL core soft reset has been
      asserted by dwc3 controller due to which the QMP PLL lock fails in
      qmp_combo_usb_power_on.
      
      d) After the 100ms of GCTL core soft reset is completed, the dwc3 core
      goes for initializing the host mode and invokes xhci probe. But at this
      point the QMP is non-responsive and as a result, the xhci plat probe fails
      during xhci_reset.
      
      Fix this by passing the orientation switch to available ucsi instances,
      if their gpio configuration is available, before ucsi_register is invoked.
      By the time pmic_glink_ucsi_notify provides typec_switch to QMP, the
      lane is then already configured and the call is a NOP, so it does not
      race with the role switch.
      
      Cc: stable@vger.kernel.org
      Fixes: c6165ed2 ("usb: ucsi: glink: use the connector orientation GPIO to provide switch events")
      Suggested-by: Wesley Cheng <quic_wcheng@quicinc.com>
      Signed-off-by: Krishna Kurapati <quic_kriskura@quicinc.com>
      Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
      Link: https://lore.kernel.org/r/20240301040914.458492-1-quic_kriskura@quicinc.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • i40e: fix vf may be used uninitialized in this function warning · 0dcf573f
      Aleksandr Loktionov authored
      commit f37c4eac upstream.
      
      Fix the regression introduced by commit 52424f97, which causes
      servers to hang in very hard to reproduce conditions involving reset races.
      Using two sources for the same information is the root cause.
      Before the fix, bumping v in this function did not mean bumping the vf
      pointer, but the code used these variables interchangeably, so a stale vf
      could point to a different, unintended VF.
      
      Remove redundant "v" variable and iterate via single VF pointer across
      whole function instead to guarantee VF pointer validity.
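
A rough sketch of iterating with a single pointer instead of a separate index and pointer. The demo_vf structure and the counting loop are hypothetical and only illustrate the pattern, not the driver function that was changed:

```c
#include <stddef.h>

/* Hypothetical, simplified VF record. */
struct demo_vf {
	int id;
	int needs_reset;
};

/* The pointer is the only loop variable, so there is no second index
 * that could get out of step with it. */
static int demo_count_pending(struct demo_vf *vfs, size_t num_vfs)
{
	struct demo_vf *vf;
	int pending = 0;

	for (vf = &vfs[0]; vf < &vfs[num_vfs]; vf++)
		if (vf->needs_reset)
			pending++;
	return pending;
}
```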
      
      Fixes: 52424f97 ("i40e: Fix VF hang when reset is triggered on another VF")
      Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
      Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
      Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • i40e: fix i40e_count_filters() to count only active/new filters · 89e29416
      Aleksandr Loktionov authored
      commit eb58c598 upstream.
      
      The bug usually affects untrusted VFs, because they are limited to 18 MACs;
      it affects them badly, preventing them from creating all MAC filters.
      It does not reproduce reliably; it happens when a VF user creates MAC
      filters while other MACVLAN operations are happening in parallel.
      The consequence is that the VF can't receive the desired traffic.
      
      Fix counter to be bumped only for new or active filters.
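
A minimal sketch of counting only new or active filters. The enum values and the array-based storage are assumptions made for illustration; the driver's own state names and filter container may differ:

```c
#include <stddef.h>

/* Hypothetical filter states, named after the ones the message mentions. */
enum demo_filter_state {
	DEMO_FILTER_INVALID,
	DEMO_FILTER_NEW,
	DEMO_FILTER_ACTIVE,
	DEMO_FILTER_FAILED,
	DEMO_FILTER_REMOVE,
};

struct demo_mac_filter {
	enum demo_filter_state state;
};

/* Count only filters that are, or are about to be, in use; entries that
 * failed or are pending removal no longer count against the limit. */
static int demo_count_filters(const struct demo_mac_filter *f, size_t n)
{
	int cnt = 0;

	for (size_t i = 0; i < n; i++)
		if (f[i].state == DEMO_FILTER_NEW ||
		    f[i].state == DEMO_FILTER_ACTIVE)
			cnt++;
	return cnt;
}
```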
      
      Fixes: 621650ca ("i40e: Refactoring VF MAC filters counting to make more reliable")
      Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
      Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
      Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
      Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • octeontx2-af: Add array index check · 76c39cf8
      Aleksandr Mishin authored
      commit ef15ddee upstream.
      
      In rvu_map_cgx_lmac_pf() the 'iter', which is used as an array index, can reach
      a value (up to 14) that exceeds the size (MAX_LMAC_COUNT = 8) of the array.
      Fix this bug by adding a check on the 'iter' value.
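
A small sketch of the bounds check described above. The bitmap walk and the table are hypothetical stand-ins; only the limit of 8 comes from the commit message:

```c
#include <stdint.h>

#define DEMO_MAX_LMAC_COUNT 8	/* array size quoted in the commit message */

/* Walk a 16-bit bitmap and mark the matching table slots. Bits at or
 * above the table size are skipped instead of indexing out of bounds.
 * Returns the number of slots written. */
static int demo_map_lmacs(uint16_t lmac_bmap,
			  uint8_t table[DEMO_MAX_LMAC_COUNT])
{
	int stored = 0;

	for (unsigned int iter = 0; iter < 16; iter++) {
		if (!(lmac_bmap & (1u << iter)))
			continue;
		if (iter >= DEMO_MAX_LMAC_COUNT)
			continue;	/* would run past the end of table[] */
		table[iter] = 1;
		stored++;
	}
	return stored;
}
```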
      
      Found by Linux Verification Center (linuxtesting.org) with SVACE.
      
      Fixes: 91c6945e ("octeontx2-af: cn10k: Add RPM MAC support")
      Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • octeontx2-pf: check negative error code in otx2_open() · 43b69da2
      Su Hui authored
      commit e709acbd upstream.
      
      otx2_rxtx_enable() returns a negative error code such as -EIO.
      Check for -EIO rather than EIO to fix this problem.
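
A minimal illustration of why the sign matters, using hypothetical demo_ functions in place of the driver's own:

```c
#include <errno.h>

/* Hypothetical stand-in: returns 0 on success, a negative errno on failure. */
static int demo_rxtx_enable(int hw_ok)
{
	return hw_ok ? 0 : -EIO;
}

/* Returns 1 when the -EIO recovery path is taken, 0 otherwise. */
static int demo_open(int hw_ok)
{
	int err = demo_rxtx_enable(hw_ok);

	/* Comparing against positive EIO would never match, because the
	 * callee reports failures as negative values. */
	if (err == -EIO)
		return 1;
	return 0;
}
```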
      
      Fixes: c9262522 ("octeontx2-pf: Disable packet I/O for graceful exit")
      Signed-off-by: Su Hui <suhui@nfschina.com>
      Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
      Link: https://lore.kernel.org/r/20240328020620.4054692-1-suhui@nfschina.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • octeontx2-af: Fix issue with loading coalesced KPU profiles · b08b0c7a
      Hariprasad Kelam authored
      commit 0ba80d96 upstream.
      
      The current implementation for loading coalesced KPU profiles has
      a limitation.  The "offset" field, which is used to locate profiles
      within the profile is restricted to a u16.
      
      This restricts the number of profiles that can be loaded. This patch
      addresses this limitation by increasing the size of the "offset" field.
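
To illustrate the limitation, a short sketch of what happens when an offset larger than 65535 is stored in a 16-bit field. The function names are hypothetical, and the wider type shown is only an example, since the message does not state the new field size:

```c
#include <stdint.h>

/* Offset kept in a 16-bit field: values above 65535 wrap around. */
static uint16_t demo_offset_u16(uint32_t byte_offset)
{
	return (uint16_t)byte_offset;
}

/* Offset kept in a wider field (32 bits here, as an example only). */
static uint32_t demo_offset_u32(uint32_t byte_offset)
{
	return byte_offset;
}
```

With the 16-bit field, a profile located 70000 bytes in would be looked up at 70000 - 65536 = 4464, that is, at the wrong place.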
      
      Fixes: 11c730bf ("octeontx2-af: support for coalescing KPU profiles")
      Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
      Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • udp: prevent local UDP tunnel packets from being GROed · 03b6f369
      Antoine Tenart authored
      commit 64235eab upstream.
      
      GRO has a fundamental issue with UDP tunnel packets as it can't detect
      those in a foolproof way and GRO could happen before they reach the
      tunnel endpoint. Previous commits have fixed issues when UDP tunnel
      packets come from a remote host, but if those packets are issued locally
      they could run into checksum issues.
      
      If the inner packet has a partial checksum the information will be lost
      in the GRO logic, either in udp4/6_gro_complete or in
      udp_gro_complete_segment and packets will have an invalid checksum when
      leaving the host.
      
      Prevent local UDP tunnel packets from ever being GROed at the outer UDP
      level.
      
      Due to skb->encapsulation being wrongly used in some drivers this is
      actually only preventing UDP tunnel packets with a partial checksum to
      be GROed (see iptunnel_handle_offloads) but those were also the packets
      triggering issues so in practice this should be sufficient.
      
      Fixes: 9fd1ff5d ("udp: Support UDP fraglist GRO/GSO.")
      Fixes: 36707061 ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets")
      Suggested-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • udp: do not transition UDP GRO fraglist partial checksums to unnecessary · 2a1b61d0
      Antoine Tenart authored
      commit f0b8c303 upstream.
      
      UDP GRO validates checksums and in udp4/6_gro_complete fraglist packets
      are converted to CHECKSUM_UNNECESSARY to avoid later checks. However
      this is an issue for CHECKSUM_PARTIAL packets as they can be looped in
      an egress path and then their partial checksums are not fixed.
      
      Different issues can be observed, from invalid checksum on packets to
      traces like:
      
        gen01: hw csum failure
        skb len=3008 headroom=160 headlen=1376 tailroom=0
        mac=(106,14) net=(120,40) trans=160
        shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
        csum(0xffff232e ip_summed=2 complete_sw=0 valid=0 level=0)
        hash(0x77e3d716 sw=1 l4=1) proto=0x86dd pkttype=0 iif=12
        ...
      
      Fix this by only converting CHECKSUM_NONE packets to
      CHECKSUM_UNNECESSARY by reusing __skb_incr_checksum_unnecessary. All
      other checksum types are kept as-is, including CHECKSUM_COMPLETE as
      fraglist packets being segmented back would have their skb->csum valid.
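
A simplified model of the transition rule described above. The demo_ types are hypothetical, and the function is only modelled on the behaviour the message attributes to __skb_incr_checksum_unnecessary; it is not the kernel helper itself:

```c
/* Hypothetical stand-ins for the checksum states named in the message. */
enum demo_csum {
	DEMO_CSUM_NONE,
	DEMO_CSUM_UNNECESSARY,
	DEMO_CSUM_COMPLETE,
	DEMO_CSUM_PARTIAL,
};

#define DEMO_MAX_CSUM_LEVEL 3	/* assumed cap, for illustration */

struct demo_csum_skb {
	enum demo_csum ip_summed;
	unsigned int csum_level;
};

/* Only NONE is promoted to UNNECESSARY; an already UNNECESSARY packet
 * gains one validated level; COMPLETE and PARTIAL are left as they are. */
static void demo_incr_checksum_unnecessary(struct demo_csum_skb *skb)
{
	if (skb->ip_summed == DEMO_CSUM_UNNECESSARY) {
		if (skb->csum_level < DEMO_MAX_CSUM_LEVEL)
			skb->csum_level++;
	} else if (skb->ip_summed == DEMO_CSUM_NONE) {
		skb->ip_summed = DEMO_CSUM_UNNECESSARY;
		skb->csum_level = 0;
	}
}
```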
      
      Fixes: 9fd1ff5d ("udp: Support UDP fraglist GRO/GSO.")
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • udp: do not accept non-tunnel GSO skbs landing in a tunnel · 3001e7aa
      Antoine Tenart authored
      commit 3d010c80 upstream.
      
      When rx-udp-gro-forwarding is enabled UDP packets might be GROed when
      being forwarded. If such packets might land in a tunnel this can cause
      various issues and udp_gro_receive makes sure this isn't the case by
      looking for a matching socket. This is performed in
      udp4/6_gro_lookup_skb but only in the current netns. This is an issue
      with tunneled packets when the endpoint is in another netns. In such
      cases the packets will be GROed at the UDP level, which leads to various
      issues later on. The same thing can happen with rx-gro-list.
      
      We saw this with geneve packets being GROed at the UDP level. In such
      a case gso_size is set; later the packet goes through the geneve rx path,
      the geneve header is pulled, the offsets are adjusted and frag_list skbs
      are not adjusted with regard to geneve. When those skbs hit
      skb_fragment, it will misbehave. Different outcomes are possible
      depending on what the GROed skbs look like, from corrupted packets to
      kernel crashes.
      
      One example is a BUG_ON[1] triggered in skb_segment while processing the
      frag_list. Because gso_size is wrong (geneve header was pulled)
      skb_segment thinks there is "geneve header size" of data in frag_list,
      although it's in fact the next packet. The BUG_ON itself has nothing to
      do with the issue. This is only one of the potential issues.
      
      Looking up a matching socket in udp_gro_receive is fragile: the
      lookup could be extended to all netns (leaving performance aside)
      but nothing prevents those packets from being modified in between and we
      could still not find a matching socket. It's OK to keep the current
      logic there as it should cover most cases but we also need to make sure
      we handle tunnel packets being GROed too early.
      
      This is done by extending the checks in udp_unexpected_gso: GSO packets
      lacking the SKB_GSO_UDP_TUNNEL/_CSUM bits and landing in a tunnel must
      be segmented.
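
The extended check can be pictured roughly as below. The flag values, the demo_gso_skb structure and the sk_is_tunnel parameter are all hypothetical; the sketch models only the added condition, not the whole of udp_unexpected_gso:

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for the SKB_GSO_UDP_TUNNEL and
 * SKB_GSO_UDP_TUNNEL_CSUM bits named in the message. */
#define DEMO_GSO_UDP_TUNNEL		(1u << 0)
#define DEMO_GSO_UDP_TUNNEL_CSUM	(1u << 1)

struct demo_gso_skb {
	bool is_gso;
	unsigned int gso_type;
};

/* A GSO packet that reaches a tunnel socket without either tunnel bit
 * was aggregated too early and must be segmented before tunnel rx. */
static bool demo_must_segment(const struct demo_gso_skb *skb,
			      bool sk_is_tunnel)
{
	if (!skb->is_gso)
		return false;
	if (sk_is_tunnel &&
	    !(skb->gso_type &
	      (DEMO_GSO_UDP_TUNNEL | DEMO_GSO_UDP_TUNNEL_CSUM)))
		return true;
	return false;
}
```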
      
      [1] kernel BUG at net/core/skbuff.c:4408!
          RIP: 0010:skb_segment+0xd2a/0xf70
          __udp_gso_segment+0xaa/0x560
      
      Fixes: 9fd1ff5d ("udp: Support UDP fraglist GRO/GSO.")
      Fixes: 36707061 ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets")
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • r8169: skip DASH fw status checks when DASH is disabled · a5eae74f
      Atlas Yu authored
      commit 5e864d90 upstream.
      
      On devices that support DASH, the current code in the "rtl_loop_wait" function
      raises false alarms when DASH is disabled. This occurs because the function
      attempts to wait for the DASH firmware to be ready, even though it's not
      relevant in this case.
      
      r8169 0000:0c:00.0 eth0: RTL8168ep/8111ep, 38:7c:76:49:08:d9, XID 502, IRQ 86
      r8169 0000:0c:00.0 eth0: jumbo features [frames: 9194 bytes, tx checksumming: ko]
      r8169 0000:0c:00.0 eth0: DASH disabled
      ...
      r8169 0000:0c:00.0 eth0: rtl_ep_ocp_read_cond == 0 (loop: 30, delay: 10000).
      
      This patch modifies the driver start/stop functions to skip checking the DASH
      firmware status when DASH is explicitly disabled. This prevents unnecessary
      delays and false alarms.
      
      The patch has been tested on several ThinkStation P8/PX workstations.
      
      Fixes: 0ab0c45d ("r8169: add handling DASH when DASH is disabled")
      Signed-off-by: Atlas Yu <atlas.yu@canonical.com>
      Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
      Link: https://lore.kernel.org/r/20240328055152.18443-1-atlas.yu@canonical.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a5eae74f
    • David Thompson's avatar
      mlxbf_gige: stop interface during shutdown · 36a1cb03
      David Thompson authored
      commit 09ba28e1 upstream.
      
      The mlxbf_gige driver intermittently encounters a NULL pointer
      exception while the system is shutting down via the "reboot" command.
      The mlxbf_gige driver will experience an exception right after executing
      its shutdown() method.  One example of this exception is:
      
      Unable to handle kernel NULL pointer dereference at virtual address 0000000000000070
      Mem abort info:
        ESR = 0x0000000096000004
        EC = 0x25: DABT (current EL), IL = 32 bits
        SET = 0, FnV = 0
        EA = 0, S1PTW = 0
        FSC = 0x04: level 0 translation fault
      Data abort info:
        ISV = 0, ISS = 0x00000004
        CM = 0, WnR = 0
      user pgtable: 4k pages, 48-bit VAs, pgdp=000000011d373000
      [0000000000000070] pgd=0000000000000000, p4d=0000000000000000
      Internal error: Oops: 96000004 [#1] SMP
      CPU: 0 PID: 13 Comm: ksoftirqd/0 Tainted: G S         OE     5.15.0-bf.6.gef6992a #1
      Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS 4.0.2.12669 Apr 21 2023
      pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      pc : mlxbf_gige_handle_tx_complete+0xc8/0x170 [mlxbf_gige]
      lr : mlxbf_gige_poll+0x54/0x160 [mlxbf_gige]
      sp : ffff8000080d3c10
      x29: ffff8000080d3c10 x28: ffffcce72cbb7000 x27: ffff8000080d3d58
      x26: ffff0000814e7340 x25: ffff331cd1a05000 x24: ffffcce72c4ea008
      x23: ffff0000814e4b40 x22: ffff0000814e4d10 x21: ffff0000814e4128
      x20: 0000000000000000 x19: ffff0000814e4a80 x18: ffffffffffffffff
      x17: 000000000000001c x16: ffffcce72b4553f4 x15: ffff80008805b8a7
      x14: 0000000000000000 x13: 0000000000000030 x12: 0101010101010101
      x11: 7f7f7f7f7f7f7f7f x10: c2ac898b17576267 x9 : ffffcce720fa5404
      x8 : ffff000080812138 x7 : 0000000000002e9a x6 : 0000000000000080
      x5 : ffff00008de3b000 x4 : 0000000000000000 x3 : 0000000000000001
      x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
      Call trace:
       mlxbf_gige_handle_tx_complete+0xc8/0x170 [mlxbf_gige]
       mlxbf_gige_poll+0x54/0x160 [mlxbf_gige]
       __napi_poll+0x40/0x1c8
       net_rx_action+0x314/0x3a0
       __do_softirq+0x128/0x334
       run_ksoftirqd+0x54/0x6c
       smpboot_thread_fn+0x14c/0x190
       kthread+0x10c/0x110
       ret_from_fork+0x10/0x20
      Code: 8b070000 f9000ea0 f95056c0 f86178a1 (b9407002)
      ---[ end trace 7cc3941aa0d8e6a4 ]---
      Kernel panic - not syncing: Oops: Fatal exception in interrupt
      Kernel Offset: 0x4ce722520000 from 0xffff800008000000
      PHYS_OFFSET: 0x80000000
      CPU features: 0x000005c1,a3330e5a
      Memory Limit: none
      ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
      
      During system shutdown, the mlxbf_gige driver's shutdown() is always executed.
      However, the driver's stop() method will only execute if networking interface
      configuration logic within the Linux distribution has been set up to do so.
      
      If shutdown() executes but stop() does not execute, NAPI remains enabled
      and this can lead to an exception if NAPI is scheduled while the hardware
      interface has only been partially deinitialized.
      
      The networking interface managed by the mlxbf_gige driver must be properly
      stopped during system shutdown so that IFF_UP is cleared, the hardware
      interface is put into a clean state, and NAPI is fully deinitialized.
      
      Fixes: f92e1869 ("Add Mellanox BlueField Gigabit Ethernet driver")
      Signed-off-by: David Thompson <davthompson@nvidia.com>
      Link: https://lore.kernel.org/r/20240325210929.25362-1-davthompson@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      36a1cb03
    • Kuniyuki Iwashima's avatar
      ipv6: Fix infinite recursion in fib6_dump_done(). · f2dd75e5
      Kuniyuki Iwashima authored
      commit d21d4060 upstream.
      
      syzkaller reported infinite recursive calls of fib6_dump_done() during
      netlink socket destruction.  [1]
      
      From the log, syzkaller sent an AF_UNSPEC RTM_GETROUTE message, and then
      the response was generated.  The following recvmmsg() resumed the dump
      for IPv6, but the first call of inet6_dump_fib() failed at kzalloc() due
      to the fault injection.  [0]
      
        12:01:34 executing program 3:
        r0 = socket$nl_route(0x10, 0x3, 0x0)
        sendmsg$nl_route(r0, ... snip ...)
        recvmmsg(r0, ... snip ...) (fail_nth: 8)
      
      Here, fib6_dump_done() was set to nlk_sk(sk)->cb.done, and the next call
      of inet6_dump_fib() set it to nlk_sk(sk)->cb.args[3].  syzkaller stopped
      receiving the response halfway through, and finally netlink_sock_destruct()
      called nlk_sk(sk)->cb.done().
      
      fib6_dump_done() calls fib6_dump_end() and then nlk_sk(sk)->cb.done() if
      it is still not NULL.  fib6_dump_end() overwrites nlk_sk(sk)->cb.done
      with nlk_sk(sk)->cb.args[3], but that slot holds fib6_dump_done() itself
      rather than NULL, so the function calls itself recursively until it
      hits the stack guard page.
      
      To avoid the issue, let's set the destructor after kzalloc().
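
      The sequence can be reproduced with a small user-space model. This is a
      sketch under stated assumptions: struct cb_model, dump_start() and
      dump_done() are hypothetical stand-ins for the netlink callback,
      inet6_dump_fib() and fib6_dump_done(); saved_done models cb.args[3],
      and MAX_DEPTH caps the recursion so the model terminates.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 16 /* cap so the model terminates instead of overflowing */

struct cb_model;
typedef int (*done_fn)(struct cb_model *cb);

struct cb_model {
	done_fn done;       /* models nlk_sk(sk)->cb.done */
	done_fn saved_done; /* models nlk_sk(sk)->cb.args[3] */
	bool has_walker;    /* models a successfully allocated walker */
	int depth;          /* number of dump_done() invocations */
};

/* Models fib6_dump_done(): restore the saved destructor, then chain to it. */
int dump_done(struct cb_model *cb)
{
	if (++cb->depth >= MAX_DEPTH)
		return -1; /* unbounded recursion in the real code */

	cb->has_walker = false;
	cb->done = cb->saved_done;
	return cb->done ? cb->done(cb) : 0;
}

/*
 * Models one dump attempt. hook_before_alloc selects the old ordering
 * (destructor hooked before the allocation) or the fixed ordering
 * (destructor hooked only after the allocation succeeded).
 */
int dump_start(struct cb_model *cb, bool alloc_ok, bool hook_before_alloc)
{
	if (cb->has_walker)
		return 0;

	if (hook_before_alloc) {
		cb->saved_done = cb->done;
		cb->done = dump_done;
	}
	if (!alloc_ok)
		return -1; /* models the kzalloc() failure */
	if (!hook_before_alloc) {
		cb->saved_done = cb->done;
		cb->done = dump_done;
	}
	cb->has_walker = true;
	return 0;
}
```

      With the old ordering, a failed attempt followed by a second attempt
      leaves both done and saved_done pointing at dump_done(), so the
      destructor keeps chaining to itself. With the fixed ordering, the failed
      attempt leaves the callback untouched and the destructor runs once.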
      
      [0]:
      FAULT_INJECTION: forcing a failure.
      name failslab, interval 1, probability 0, space 0, times 0
      CPU: 1 PID: 432110 Comm: syz-executor.3 Not tainted 6.8.0-12821-g537c2e91d354-dirty #11
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      Call Trace:
       <TASK>
       dump_stack_lvl (lib/dump_stack.c:117)
       should_fail_ex (lib/fault-inject.c:52 lib/fault-inject.c:153)
       should_failslab (mm/slub.c:3733)
       kmalloc_trace (mm/slub.c:3748 mm/slub.c:3827 mm/slub.c:3992)
       inet6_dump_fib (./include/linux/slab.h:628 ./include/linux/slab.h:749 net/ipv6/ip6_fib.c:662)
       rtnl_dump_all (net/core/rtnetlink.c:4029)
       netlink_dump (net/netlink/af_netlink.c:2269)
       netlink_recvmsg (net/netlink/af_netlink.c:1988)
       ____sys_recvmsg (net/socket.c:1046 net/socket.c:2801)
       ___sys_recvmsg (net/socket.c:2846)
       do_recvmmsg (net/socket.c:2943)
       __x64_sys_recvmmsg (net/socket.c:3041 net/socket.c:3034 net/socket.c:3034)
      
      [1]:
      BUG: TASK stack guard page was hit at 00000000f2fa9af1 (stack is 00000000b7912430..000000009a436beb)
      stack guard page: 0000 [#1] PREEMPT SMP KASAN
      CPU: 1 PID: 223719 Comm: kworker/1:3 Not tainted 6.8.0-12821-g537c2e91d354-dirty #11
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      Workqueue: events netlink_sock_destruct_work
      RIP: 0010:fib6_dump_done (net/ipv6/ip6_fib.c:570)
      Code: 3c 24 e8 f3 e9 51 fd e9 28 fd ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 41 57 41 56 41 55 41 54 55 48 89 fd <53> 48 8d 5d 60 e8 b6 4d 07 fd 48 89 da 48 b8 00 00 00 00 00 fc ff
      RSP: 0018:ffffc9000d980000 EFLAGS: 00010293
      RAX: 0000000000000000 RBX: ffffffff84405990 RCX: ffffffff844059d3
      RDX: ffff8881028e0000 RSI: ffffffff84405ac2 RDI: ffff88810c02f358
      RBP: ffff88810c02f358 R08: 0000000000000007 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000224 R12: 0000000000000000
      R13: ffff888007c82c78 R14: ffff888007c82c68 R15: ffff888007c82c68
      FS:  0000000000000000(0000) GS:ffff88811b100000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffc9000d97fff8 CR3: 0000000102309002 CR4: 0000000000770ef0
      PKRU: 55555554
      Call Trace:
       <#DF>
       </#DF>
       <TASK>
       fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
       fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
       ...
       fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
       fib6_dump_done (net/ipv6/ip6_fib.c:572 (discriminator 1))
       netlink_sock_destruct (net/netlink/af_netlink.c:401)
       __sk_destruct (net/core/sock.c:2177 (discriminator 2))
       sk_destruct (net/core/sock.c:2224)
       __sk_free (net/core/sock.c:2235)
       sk_free (net/core/sock.c:2246)
       process_one_work (kernel/workqueue.c:3259)
       worker_thread (kernel/workqueue.c:3329 kernel/workqueue.c:3416)
       kthread (kernel/kthread.c:388)
       ret_from_fork (arch/x86/kernel/process.c:153)
       ret_from_fork_asm (arch/x86/entry/entry_64.S:256)
      Modules linked in:
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzkaller <syzkaller@googlegroups.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20240401211003.25274-1-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f2dd75e5
    • Duoming Zhou's avatar
      ax25: fix use-after-free bugs caused by ax25_ds_del_timer · 74204bf9
      Duoming Zhou authored
      commit fd819ad3 upstream.
      
      When the ax25 device is detaching, ax25_dev_device_down() calls
      ax25_ds_del_timer() to clean up the slave_timer. If the timer
      handler is running at that point, the del_timer() call inside
      ax25_ds_del_timer() returns immediately without waiting for the
      handler to finish. As a result, use-after-free bugs can happen.
      One scenario is shown below:
      
            (Thread 1)          |      (Thread 2)
                                | ax25_ds_timeout()
      ax25_dev_device_down()    |
        ax25_ds_del_timer()     |
          del_timer()           |
        ax25_dev_put() //FREE   |
                                |  ax25_dev-> //USE
      
      To mitigate these bugs, use timer_shutdown_sync() to stop the
      timer when the device is detaching.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240329015023.9223-1-duoming@zju.edu.cn
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      74204bf9
    • Kuniyuki Iwashima's avatar
      tcp: Fix bind() regression for v6-only wildcard and v4(-mapped-v6) non-wildcard addresses. · 8b88752d
      Kuniyuki Iwashima authored
      commit d91ef1e1 upstream.
      
      Jianguo Wu reported another bind() regression introduced by bhash2.
      
      When bind() is called for the following 3 addresses on the same port,
      the 3rd call should fail but now succeeds.
      
        1. 0.0.0.0 or ::ffff:0.0.0.0
        2. [::] w/ IPV6_V6ONLY
        3. IPv4 non-wildcard address or v4-mapped-v6 non-wildcard address
      
      The first two bind() create tb2 like this:
      
        bhash2 -> tb2(:: w/ IPV6_V6ONLY) -> tb2(0.0.0.0)
      
      The 3rd bind() will match the IPv6-only wildcard address bucket
      in inet_bind2_bucket_match_addr_any(); however, no conflicting socket
      exists in that bucket.  So inet_bhash2_conflict() returns false, and
      consequently inet_bhash2_addr_any_conflict() returns false as well.
      
      As a result, the 3rd bind() bypasses conflict check, which should be
      done against the IPv4 wildcard address bucket.
      
      So, in inet_bhash2_addr_any_conflict(), we must iterate over all buckets.
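
      The difference between stopping at the first matching bucket and
      iterating over all of them can be sketched as follows. This is a heavily
      simplified model, not the kernel data structure: struct bucket_model and
      both functions are hypothetical, and has_conflicting_sock stands in for
      the result of inet_bhash2_conflict() on that bucket.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a tb2 bucket on the bound port. */
struct bucket_model {
	bool is_addr_any;          /* bucket holds a wildcard address */
	bool has_conflicting_sock; /* bucket holds a socket that conflicts */
};

/* Old behavior: only the first wildcard bucket is checked. */
bool addr_any_conflict_first(const struct bucket_model *b, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (b[i].is_addr_any)
			return b[i].has_conflicting_sock;
	}
	return false;
}

/* Fixed behavior: every wildcard bucket is checked. */
bool addr_any_conflict_all(const struct bucket_model *b, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (b[i].is_addr_any && b[i].has_conflicting_sock)
			return true;
	}
	return false;
}
```

      For the bucket chain shown above, the first entry models the [::]
      bucket with IPV6_V6ONLY (no conflicting socket) and the second models
      the 0.0.0.0 bucket (conflicting socket). Only the second function
      reports the conflict.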
      
      Note that we cannot add an ipv6_only flag to inet_bind2_bucket, as it
      would confuse the following pattern.
      
        1. [::] w/ SO_REUSE{ADDR,PORT} and IPV6_V6ONLY
        2. [::] w/ SO_REUSE{ADDR,PORT}
        3. IPv4 non-wildcard address or v4-mapped-v6 non-wildcard address
      
      The first bind() would create a bucket with ipv6_only flag true,
      the second bind() would add the [::] socket into the same bucket,
      and the third bind() could succeed based on the wrong assumption
      that ipv6_only bucket would not conflict with v4(-mapped-v6) address.
      
      Fixes: 28044fc1 ("net: Add a bhash2 table hashed by port and address")
      Diagnosed-by: Jianguo Wu <wujianguo106@163.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20240326204251.51301-3-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8b88752d
    • Jakub Kicinski's avatar
      selftests: reuseaddr_conflict: add missing new line at the end of the output · 690e877c
      Jakub Kicinski authored
      commit 31974122 upstream.
      
      The netdev CI runs in a VM and captures serial, so stdout and
      stderr get combined. Because there's a missing new line in
      stderr the test ends up corrupting KTAP:
      
        # Successok 1 selftests: net: reuseaddr_conflict
      
      which should have been:
      
        # Success
        ok 1 selftests: net: reuseaddr_conflict
      
      Fixes: 422d8dc6 ("selftest: add a reuseaddr test")
      Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
      Link: https://lore.kernel.org/r/20240329160559.249476-1-kuba@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      690e877c
    • Eric Dumazet's avatar
      erspan: make sure erspan_base_hdr is present in skb->head · 4e3fdeec
      Eric Dumazet authored
      commit 17af4205 upstream.
      
      syzbot reported a problem in ip6erspan_rcv() [1]
      
      The issue is that ip6erspan_rcv() (and erspan_rcv()) no longer make
      sure erspan_base_hdr is present in the skb linear part (skb->head)
      before getting the @ver field from it.
      
      Add the missing pskb_may_pull() calls.
      
      v2: Reload iph pointer in erspan_rcv() after pskb_may_pull()
          because skb->head might have changed.
      
      [1]
      
       BUG: KMSAN: uninit-value in pskb_may_pull_reason include/linux/skbuff.h:2742 [inline]
       BUG: KMSAN: uninit-value in pskb_may_pull include/linux/skbuff.h:2756 [inline]
       BUG: KMSAN: uninit-value in ip6erspan_rcv net/ipv6/ip6_gre.c:541 [inline]
       BUG: KMSAN: uninit-value in gre_rcv+0x11f8/0x1930 net/ipv6/ip6_gre.c:610
        pskb_may_pull_reason include/linux/skbuff.h:2742 [inline]
        pskb_may_pull include/linux/skbuff.h:2756 [inline]
        ip6erspan_rcv net/ipv6/ip6_gre.c:541 [inline]
        gre_rcv+0x11f8/0x1930 net/ipv6/ip6_gre.c:610
        ip6_protocol_deliver_rcu+0x1d4c/0x2ca0 net/ipv6/ip6_input.c:438
        ip6_input_finish net/ipv6/ip6_input.c:483 [inline]
        NF_HOOK include/linux/netfilter.h:314 [inline]
        ip6_input+0x15d/0x430 net/ipv6/ip6_input.c:492
        ip6_mc_input+0xa7e/0xc80 net/ipv6/ip6_input.c:586
        dst_input include/net/dst.h:460 [inline]
        ip6_rcv_finish+0x955/0x970 net/ipv6/ip6_input.c:79
        NF_HOOK include/linux/netfilter.h:314 [inline]
        ipv6_rcv+0xde/0x390 net/ipv6/ip6_input.c:310
        __netif_receive_skb_one_core net/core/dev.c:5538 [inline]
        __netif_receive_skb+0x1da/0xa00 net/core/dev.c:5652
        netif_receive_skb_internal net/core/dev.c:5738 [inline]
        netif_receive_skb+0x58/0x660 net/core/dev.c:5798
        tun_rx_batched+0x3ee/0x980 drivers/net/tun.c:1549
        tun_get_user+0x5566/0x69e0 drivers/net/tun.c:2002
        tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048
        call_write_iter include/linux/fs.h:2108 [inline]
        new_sync_write fs/read_write.c:497 [inline]
        vfs_write+0xb63/0x1520 fs/read_write.c:590
        ksys_write+0x20f/0x4c0 fs/read_write.c:643
        __do_sys_write fs/read_write.c:655 [inline]
        __se_sys_write fs/read_write.c:652 [inline]
        __x64_sys_write+0x93/0xe0 fs/read_write.c:652
       do_syscall_64+0xd5/0x1f0
       entry_SYSCALL_64_after_hwframe+0x6d/0x75
      
      Uninit was created at:
        slab_post_alloc_hook mm/slub.c:3804 [inline]
        slab_alloc_node mm/slub.c:3845 [inline]
        kmem_cache_alloc_node+0x613/0xc50 mm/slub.c:3888
        kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:577
        __alloc_skb+0x35b/0x7a0 net/core/skbuff.c:668
        alloc_skb include/linux/skbuff.h:1318 [inline]
        alloc_skb_with_frags+0xc8/0xbf0 net/core/skbuff.c:6504
        sock_alloc_send_pskb+0xa81/0xbf0 net/core/sock.c:2795
        tun_alloc_skb drivers/net/tun.c:1525 [inline]
        tun_get_user+0x209a/0x69e0 drivers/net/tun.c:1846
        tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048
        call_write_iter include/linux/fs.h:2108 [inline]
        new_sync_write fs/read_write.c:497 [inline]
        vfs_write+0xb63/0x1520 fs/read_write.c:590
        ksys_write+0x20f/0x4c0 fs/read_write.c:643
        __do_sys_write fs/read_write.c:655 [inline]
        __se_sys_write fs/read_write.c:652 [inline]
        __x64_sys_write+0x93/0xe0 fs/read_write.c:652
       do_syscall_64+0xd5/0x1f0
       entry_SYSCALL_64_after_hwframe+0x6d/0x75
      
      CPU: 1 PID: 5045 Comm: syz-executor114 Not tainted 6.9.0-rc1-syzkaller-00021-g962490525cff #0
      
      Fixes: cb73ee40 ("net: ip_gre: use erspan key field for tunnel lookup")
      Reported-by: <syzbot+1c1cf138518bf0c53d68@syzkaller.appspotmail.com>
      Closes: https://lore.kernel.org/netdev/000000000000772f2c0614b66ef7@google.com/
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Lorenzo Bianconi <lorenzo@kernel.org>
      Link: https://lore.kernel.org/r/20240328112248.1101491-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4e3fdeec
    • Ivan Vecera's avatar
      i40e: Fix VF MAC filter removal · a03e138d
      Ivan Vecera authored
      commit ea2a1cfc upstream.
      
      Commit 73d9629e ("i40e: Do not allow untrusted VF to remove
      administratively set MAC") fixed an issue where an untrusted VF was
      allowed to remove its own MAC address even though it was assigned
      administratively from the PF. Unfortunately the introduced check
      is wrong because it prevents MAC filters for other MAC addresses,
      including multicast ones, from being removed.
      
      <snip>
      	if (ether_addr_equal(addr, vf->default_lan_addr.addr) &&
      	    i40e_can_vf_change_mac(vf))
      		was_unimac_deleted = true;
      	else
      		continue;
      
      	if (i40e_del_mac_filter(vsi, al->list[i].addr)) {
      	...
      </snip>
      
      The else path with `continue` effectively skips any MAC filter
      removal except one for primary MAC addr when VF is allowed to do so.
      Fix the check condition so the `continue` is only done for primary
      MAC address.
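
      The effect of the two conditions can be compared in a small user-space
      model. This is a sketch of one reading of the fix, not the driver code:
      struct vf_model and both helpers are hypothetical, can_change_mac stands
      in for the result of i40e_can_vf_change_mac(), and a return value of
      true corresponds to taking the `continue` path.

```c
#include <stdbool.h>
#include <string.h>

#define ETH_ALEN 6

/* Simplified stand-in for the VF state used by the check. */
struct vf_model {
	unsigned char default_mac[ETH_ALEN]; /* primary MAC address */
	bool can_change_mac;                 /* VF may change its MAC */
};

/* Old check: everything except a removable primary MAC is skipped. */
bool skip_removal_old(const struct vf_model *vf, const unsigned char *addr)
{
	bool is_primary = memcmp(addr, vf->default_mac, ETH_ALEN) == 0;

	return !(is_primary && vf->can_change_mac);
}

/* Fixed check: only a primary MAC the VF may not change is skipped. */
bool skip_removal_fixed(const struct vf_model *vf, const unsigned char *addr)
{
	bool is_primary = memcmp(addr, vf->default_mac, ETH_ALEN) == 0;

	return is_primary && !vf->can_change_mac;
}
```

      With the old check a multicast filter is always skipped, so it is never
      removed. With the fixed check it is removed, and only the
      administratively set primary MAC address stays protected.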
      
      Fixes: 73d9629e ("i40e: Do not allow untrusted VF to remove administratively set MAC")
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Reviewed-by: Michal Schmidt <mschmidt@redhat.com>
      Reviewed-by: Brett Creeley <brett.creeley@amd.com>
      Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Link: https://lore.kernel.org/r/20240329180638.211412-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a03e138d
    • Petr Oros's avatar
      ice: fix enabling RX VLAN filtering · b9bd1498
      Petr Oros authored
      commit 8edfc7a4 upstream.
      
      ice_port_vlan_on/off() was introduced in commit 2946204b ("ice:
      implement bridge port vlan"). But ice_port_vlan_on() incorrectly assigns
      ena_rx_filtering to inner_vlan_ops in DVM mode.
      This causes an error when rx_filtering cannot be enabled in legacy mode.
      
      Reproducer:
       echo 1 > /sys/class/net/$PF/device/sriov_numvfs
       ip link set $PF vf 0 spoofchk off trust on vlan 3
      dmesg:
       ice 0000:41:00.0: failed to enable Rx VLAN filtering for VF 0 VSI 9 during VF rebuild, error -95
      
      Fixes: 2946204b ("ice: implement bridge port vlan")
      Signed-off-by: Petr Oros <poros@redhat.com>
      Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
      Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b9bd1498
    • Antoine Tenart's avatar
      gro: fix ownership transfer · fc126c1d
      Antoine Tenart authored
      commit ed4cccef upstream.
      
      If packets are GROed with fraglist they might be segmented later on and
      continue their journey in the stack. In skb_segment_list those skbs can
      be reused as-is. This is an issue as their destructor was removed in
      skb_gro_receive_list but not the reference to their socket, and then
      they can't be orphaned. Fix this by also removing the reference to the
      socket.
      
      For example this could be observed,
      
        kernel BUG at include/linux/skbuff.h:3131!  (skb_orphan)
        RIP: 0010:ip6_rcv_core+0x11bc/0x19a0
        Call Trace:
         ipv6_list_rcv+0x250/0x3f0
         __netif_receive_skb_list_core+0x49d/0x8f0
         netif_receive_skb_list_internal+0x634/0xd40
         napi_complete_done+0x1d2/0x7d0
         gro_cell_poll+0x118/0x1f0
      
      A similar construction is found in skb_gro_receive, apply the same
      change there.
      
      Fixes: 5e10da53 ("skbuff: allow 'slow_gro' for skb carring sock reference")
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fc126c1d
    • Antoine Tenart's avatar
      selftests: net: gro fwd: update vxlan GRO test expectations · 39864092
      Antoine Tenart authored
      commit 0fb101be upstream.
      
      UDP tunnel packets can't be GROed in between their endpoints, as this
      causes various issues. The UDP GRO fwd vxlan tests were relying on
      this behavior, so their expectations have to be fixed.
      
      We keep both vxlan tests and now expect no GRO to happen. The vxlan
      UDP GRO bench test was removed as it no longer provides any valuable
      information.
      
      Fixes: a062260a ("selftests: net: add UDP GRO forwarding self-tests")
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      39864092
    • Michael Krummsdorf's avatar
      net: dsa: mv88e6xxx: fix usable ports on 88e6020 · 23e1c686
      Michael Krummsdorf authored
      commit 625aefac upstream.
      
      The switch has 4 ports with 2 internal PHYs, but ports are numbered up
      to 6, with ports 0, 1, 5 and 6 being usable.
      
      Fixes: 71d94a43 ("net: dsa: mv88e6xxx: add support for MV88E6020 switch")
      Signed-off-by: Michael Krummsdorf <michael.krummsdorf@tq-group.com>
      Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240326123655.40666-1-matthias.schiffer@ew.tq-group.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      23e1c686
    • Aleksandr Mishin's avatar
      net: phy: micrel: Fix potential null pointer dereference · 95c1016a
      Aleksandr Mishin authored
      commit 96c15594 upstream.
      
      In lan8814_get_sig_rx() and lan8814_get_sig_tx() ptp_parse_header() may
      return NULL as ptp_header due to abnormal packet type or corrupted packet.
      Fix this bug by adding ptp_header check.
      
      Found by Linux Verification Center (linuxtesting.org) with SVACE.
      
      Fixes: ece19502 ("net: phy: micrel: 1588 support for LAN8814 phy")
      Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20240329061631.33199-1-amishin@t-argos.ru
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      95c1016a
    • Wei Fang's avatar
      net: fec: Set mac_managed_pm during probe · f996e5ec
      Wei Fang authored
      commit cbc17e78 upstream.
      
      Setting mac_managed_pm during interface up is too late.
      
      In situations where the link is not brought up yet and the system suspends,
      the regular PHY power management will run. Since the FEC ETHEREN control
      bit is cleared (automatically) on suspend, the controller is off on resume.
      When the regular PHY power management resume path runs in this context, it
      will write to the MII_DATA register, but nothing will be transmitted on the
      MDIO bus.
      
      This can be observed by the following log:
      
          fec 5b040000.ethernet eth0: MDIO read timeout
          Microchip LAN87xx T1 5b040000.ethernet-1:04: PM: dpm_run_callback(): mdio_bus_phy_resume+0x0/0xc8 returns -110
          Microchip LAN87xx T1 5b040000.ethernet-1:04: PM: failed to resume: error -110
      
      The data written will however remain in the MII_DATA register.
      
      When the link later is set to administrative up it will trigger a call to
      fec_restart() which will restore the MII_SPEED register. This triggers the
      quirk explained in f166f890 ("net: ethernet: fec: Replace interrupt
      driven MDIO with polled IO") causing an extra MII_EVENT.
      
      This extra event desynchronizes all the MDIO register reads, causing them
      to complete too early. All reads then return 0 because
      fec_enet_mdio_wait() returns too early.
      
      When a Microchip LAN8700R PHY is connected to the FEC, the 0 reads cause
      the PHY to be initialized incorrectly, and the PHY will not transmit any
      ethernet signal in this state. It cannot be brought out of this state
      without a power cycle of the PHY.
      
      Fixes: 557d5dc8 ("net: fec: use mac-managed PHY PM")
      Closes: https://lore.kernel.org/netdev/1f45bdbe-eab1-4e59-8f24-add177590d27@actia.se/
      Signed-off-by: Wei Fang <wei.fang@nxp.com>
      [jernberg: commit message]
      Signed-off-by: John Ernberg <john.ernberg@actia.se>
      Link: https://lore.kernel.org/r/20240328155909.59613-2-john.ernberg@actia.se
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f996e5ec
    • Duanqiang Wen's avatar
      net: txgbe: fix i2c dev name cannot match clkdev · 22a44eee
      Duanqiang Wen authored
      commit c644920c upstream.
      
      The txgbe clkdev code shortened clk_name, so the i2c_dev info_name
      also needs to be shortened. Otherwise, i2c_dev cannot initialize
      the clock.
      
      Fixes: e30cef00 ("net: txgbe: fix clk_name exceed MAX_DEV_ID limits")
      Signed-off-by: Duanqiang Wen <duanqiangwen@net-swift.com>
      Link: https://lore.kernel.org/r/20240402021843.126192-1-duanqiangwen@net-swift.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      22a44eee
    • Horatiu Vultur's avatar
      net: phy: micrel: lan8814: Fix when enabling/disabling 1-step timestamping · 1e304328
      Horatiu Vultur authored
      commit de99e1ea upstream.
      
      There are 2 issues with the blamed commit.
      1. When the PHY is initialized, it sets the flag that disables UDPv4
         checksums. The UDPv6 checksum is already enabled by default. When
         1-step is then configured, these flags are cleared.
      2. If 2-step is configured after 1-step, 1-step remains configured
         because the flag is not cleared, so the sync frames will still
         have origin timestamps set.
      
      Fix this by first reading the value of the register and then changing
      only bit 12, which determines whether the timestamp needs to be
      inserted in the frame, without changing any other bits.
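
      The read-modify-write pattern can be sketched as follows. This is a
      minimal model on a plain 16-bit value, assuming nothing about the PHY
      beyond what is stated above: TS_INSERT_BIT and set_one_step() are
      hypothetical names, and the real driver reads and writes the register
      through its own accessors.

```c
#include <stdbool.h>
#include <stdint.h>

#define TS_INSERT_BIT (1u << 12) /* bit 12: insert timestamp in the frame */

/*
 * Takes the current register value and returns the value to write back.
 * Only bit 12 is changed; all other bits keep their previous state.
 */
uint16_t set_one_step(uint16_t reg, bool enable)
{
	if (enable)
		reg |= TS_INSERT_BIT;
	else
		reg &= (uint16_t)~TS_INSERT_BIT;

	return reg;
}
```

      Switching to 1-step and then back to 2-step returns the register to its
      original value, so the checksum-related flags set at initialization
      survive and the timestamp insertion flag is cleared again.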
      
      Fixes: ece19502 ("net: phy: micrel: 1588 support for LAN8814 phy")
      Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
      Reviewed-by: Divya Koppera <divya.koppera@microchip.com>
      Link: https://lore.kernel.org/r/20240402071634.2483524-1-horatiu.vultur@microchip.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1e304328
    • Piotr Wejman's avatar
      net: stmmac: fix rx queue priority assignment · 784a6566
      Piotr Wejman authored
      commit b3da86d4 upstream.
      
      The driver should ensure that the same priority is not mapped to multiple
      rx queues. From DesignWare Cores Ethernet Quality-of-Service
      Databook, section 17.1.29 MAC_RxQ_Ctrl2:
      "[...]The software must ensure that the content of this field is
      mutually exclusive to the PSRQ fields for other queues, that is,
      the same priority is not mapped to multiple Rx queues[...]"
      
      Previously rx_queue_priority() function was:
      - clearing all priorities from a queue
      - adding new priorities to that queue
      After this patch it will:
      - first assign new priorities to a queue
      - then remove those priorities from all other queues
      - keep other priorities previously assigned to that queue
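
      The new assignment order can be modeled on an array of priority
      bitmaps. This is a sketch under stated assumptions: NUM_RX_QUEUES, the
      psrq array and this user-space rx_queue_priority() are hypothetical, and
      each array element stands in for the PSRQ field of one Rx queue.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_RX_QUEUES 4

/*
 * psrq[q] models the priority bitmap of Rx queue q. The new priorities
 * are added to the target queue and removed from every other queue, so
 * no priority ends up mapped to more than one queue.
 */
void rx_queue_priority(uint8_t psrq[NUM_RX_QUEUES], uint8_t prio,
		       unsigned int queue)
{
	assert(queue < NUM_RX_QUEUES);

	/* first assign the new priorities to the queue */
	psrq[queue] |= prio;

	/* then remove those priorities from all other queues */
	for (unsigned int q = 0; q < NUM_RX_QUEUES; q++) {
		if (q != queue)
			psrq[q] &= (uint8_t)~prio;
	}
}
```

      Priorities already assigned to the target queue are kept because the
      new bits are ORed in rather than replacing the old value.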
      
      Fixes: a8f5102a ("net: stmmac: TX and RX queue priority configuration")
      Fixes: 2142754f ("net: stmmac: Add MAC related callbacks for XGMAC2")
      Signed-off-by: Piotr Wejman <piotrwejman90@gmail.com>
      Link: https://lore.kernel.org/r/20240401192239.33942-1-piotrwejman90@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net/sched: fix lockdep splat in qdisc_tree_reduce_backlog() · c040b994
      Eric Dumazet authored
      commit 7eb32236 upstream.
      
      qdisc_tree_reduce_backlog() is called with the qdisc lock held,
      not RTNL.
      
      We must use qdisc_lookup_rcu() instead of qdisc_lookup().
      
      syzbot reported:
      
      WARNING: suspicious RCU usage
      6.1.74-syzkaller #0 Not tainted
      -----------------------------
      net/sched/sch_api.c:305 suspicious rcu_dereference_protected() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      3 locks held by udevd/1142:
        #0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:306 [inline]
        #0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:747 [inline]
        #0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: net_tx_action+0x64a/0x970 net/core/dev.c:5282
        #1: ffff888171861108 (&sch->q.lock){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:350 [inline]
        #1: ffff888171861108 (&sch->q.lock){+.-.}-{2:2}, at: net_tx_action+0x754/0x970 net/core/dev.c:5297
        #2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:306 [inline]
        #2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:747 [inline]
        #2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: qdisc_tree_reduce_backlog+0x84/0x580 net/sched/sch_api.c:792
      
      stack backtrace:
      CPU: 1 PID: 1142 Comm: udevd Not tainted 6.1.74-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
      Call Trace:
       <TASK>
        [<ffffffff85b85f14>] __dump_stack lib/dump_stack.c:88 [inline]
        [<ffffffff85b85f14>] dump_stack_lvl+0x1b1/0x28f lib/dump_stack.c:106
        [<ffffffff85b86007>] dump_stack+0x15/0x1e lib/dump_stack.c:113
        [<ffffffff81802299>] lockdep_rcu_suspicious+0x1b9/0x260 kernel/locking/lockdep.c:6592
        [<ffffffff84f0054c>] qdisc_lookup+0xac/0x6f0 net/sched/sch_api.c:305
        [<ffffffff84f037c3>] qdisc_tree_reduce_backlog+0x243/0x580 net/sched/sch_api.c:811
        [<ffffffff84f5b78c>] pfifo_tail_enqueue+0x32c/0x4b0 net/sched/sch_fifo.c:51
        [<ffffffff84fbcf63>] qdisc_enqueue include/net/sch_generic.h:833 [inline]
        [<ffffffff84fbcf63>] netem_dequeue+0xeb3/0x15d0 net/sched/sch_netem.c:723
        [<ffffffff84eecab9>] dequeue_skb net/sched/sch_generic.c:292 [inline]
        [<ffffffff84eecab9>] qdisc_restart net/sched/sch_generic.c:397 [inline]
        [<ffffffff84eecab9>] __qdisc_run+0x249/0x1e60 net/sched/sch_generic.c:415
        [<ffffffff84d7aa96>] qdisc_run+0xd6/0x260 include/net/pkt_sched.h:125
        [<ffffffff84d85d29>] net_tx_action+0x7c9/0x970 net/core/dev.c:5313
        [<ffffffff85e002bd>] __do_softirq+0x2bd/0x9bd kernel/softirq.c:616
        [<ffffffff81568bca>] invoke_softirq kernel/softirq.c:447 [inline]
        [<ffffffff81568bca>] __irq_exit_rcu+0xca/0x230 kernel/softirq.c:700
        [<ffffffff81568ae9>] irq_exit_rcu+0x9/0x20 kernel/softirq.c:712
        [<ffffffff85b89f52>] sysvec_apic_timer_interrupt+0x42/0x90 arch/x86/kernel/apic/apic.c:1107
        [<ffffffff85c00ccb>] asm_sysvec_apic_timer_interrupt+0x1b/0x20 arch/x86/include/asm/idtentry.h:656
      
      Fixes: d636fc5d ("net: sched: add rcu annotations around qdisc->qdisc_sleeping")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Link: https://lore.kernel.org/r/20240402134133.2352776-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: dsa: sja1105: Fix parameters order in sja1110_pcs_mdio_write_c45() · f4d1fa51
      Christophe JAILLET authored
      commit c120209b upstream.
      
      The definition and declaration of sja1110_pcs_mdio_write_c45() don't have
      parameters in the same order.
      
      Knowing that sja1110_pcs_mdio_write_c45() is used as a function pointer
      in 'sja1105_info' structure with .pcs_mdio_write_c45, and that we have:
      
         int (*pcs_mdio_write_c45)(struct mii_bus *bus, int phy, int mmd,
      				  int reg, u16 val);
      
      it is likely that the definition is the one to change.
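
      Why such a mismatch compiles silently can be shown with a small stand-alone sketch. Everything here is hypothetical (struct mii_bus_stub, demo_write_c45(), write_c45_fn), and the choice of which two parameters are swapped is an assumption made for illustration. The hook type follows the parameter order quoted above.

```c
#include <stdint.h>

/* Hypothetical stand-in for the bus structure. */
struct mii_bus_stub {
    int id;
};

/* Hook type using the order quoted above: phy, mmd, reg, val. */
typedef int (*write_c45_fn)(struct mii_bus_stub *bus, int phy, int mmd,
                            int reg, uint16_t val);

/* Declaration: parameter names follow the hook's order. */
int demo_write_c45(struct mii_bus_stub *bus, int phy, int mmd, int reg,
                   uint16_t val);

/*
 * Definition with 'reg' and 'mmd' swapped. Both are plain int, so the
 * types still match the declaration and the compiler accepts it.
 */
int demo_write_c45(struct mii_bus_stub *bus, int phy, int reg, int mmd,
                   uint16_t val)
{
    (void)bus;
    (void)phy;
    (void)val;

    /* Pack what this function believes to be mmd and reg. */
    return (mmd << 16) | reg;
}
```

      A caller going through the hook with mmd = 3 and reg = 0x20 gets the two values interpreted the other way round inside the function, because C matches arguments by position and ignores parameter names in declarations.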
      
      Found with cppcheck, funcArgOrderDifferent.
      
      Fixes: ae271547 ("net: dsa: sja1105: C45 only transactions for PCS")
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Reviewed-by: Michael Walle <mwalle@kernel.org>
      Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/ff2a5af67361988b3581831f7bd1eddebfb4c48f.1712082763.git.christophe.jaillet@wanadoo.fr
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net/sched: act_skbmod: prevent kernel-infoleak · 729ad2ac
      Eric Dumazet authored
      commit d313eb8b upstream.
      
      syzbot found that tcf_skbmod_dump() was copying four bytes
      from kernel stack to user space [1].
      
      The issue here is that 'struct tc_skbmod' has a four-byte hole.
      
      We need to clear the structure before filling fields.
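
      The clear-then-fill pattern can be sketched in user-space C. struct demo_opt below is a hypothetical structure, not the real 'struct tc_skbmod'; its layout is merely chosen so that common 64-bit ABIs place padding before the 64-bit member. demo_fill() and demo_dirty_padding() are likewise made-up helper names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical structure: five 32-bit fields followed by a 64-bit one. */
struct demo_opt {
    uint32_t index;
    uint32_t capab;
    int32_t  action;
    int32_t  refcnt;
    int32_t  bindcnt;
    /* On common 64-bit ABIs the compiler inserts 4 padding bytes here. */
    uint64_t flags;
};

/* Zero the whole object, padding included, before setting any field. */
void demo_fill(struct demo_opt *opt, uint32_t index, uint64_t flags)
{
    memset(opt, 0, sizeof(*opt));
    opt->index = index;
    opt->flags = flags;
}

/* Count non-zero bytes between the end of 'bindcnt' and the start of 'flags'. */
size_t demo_dirty_padding(const struct demo_opt *opt)
{
    const unsigned char *bytes = (const unsigned char *)opt;
    size_t start = offsetof(struct demo_opt, bindcnt) + sizeof(opt->bindcnt);
    size_t end = offsetof(struct demo_opt, flags);
    size_t dirty = 0;

    for (size_t i = start; i < end; i++) {
        if (bytes[i] != 0)
            dirty++;
    }
    return dirty;
}
```

      Without the memset(), the padding bytes would keep whatever the stack held before, and copying the whole object out would carry them along.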
      
      [1]
      BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline]
       BUG: KMSAN: kernel-infoleak in copy_to_user_iter lib/iov_iter.c:24 [inline]
       BUG: KMSAN: kernel-infoleak in iterate_ubuf include/linux/iov_iter.h:29 [inline]
       BUG: KMSAN: kernel-infoleak in iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
       BUG: KMSAN: kernel-infoleak in iterate_and_advance include/linux/iov_iter.h:271 [inline]
       BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x366/0x2520 lib/iov_iter.c:185
        instrument_copy_to_user include/linux/instrumented.h:114 [inline]
        copy_to_user_iter lib/iov_iter.c:24 [inline]
        iterate_ubuf include/linux/iov_iter.h:29 [inline]
        iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
        iterate_and_advance include/linux/iov_iter.h:271 [inline]
        _copy_to_iter+0x366/0x2520 lib/iov_iter.c:185
        copy_to_iter include/linux/uio.h:196 [inline]
        simple_copy_to_iter net/core/datagram.c:532 [inline]
        __skb_datagram_iter+0x185/0x1000 net/core/datagram.c:420
        skb_copy_datagram_iter+0x5c/0x200 net/core/datagram.c:546
        skb_copy_datagram_msg include/linux/skbuff.h:4050 [inline]
        netlink_recvmsg+0x432/0x1610 net/netlink/af_netlink.c:1962
        sock_recvmsg_nosec net/socket.c:1046 [inline]
        sock_recvmsg+0x2c4/0x340 net/socket.c:1068
        __sys_recvfrom+0x35a/0x5f0 net/socket.c:2242
        __do_sys_recvfrom net/socket.c:2260 [inline]
        __se_sys_recvfrom net/socket.c:2256 [inline]
        __x64_sys_recvfrom+0x126/0x1d0 net/socket.c:2256
       do_syscall_64+0xd5/0x1f0
       entry_SYSCALL_64_after_hwframe+0x6d/0x75
      
      Uninit was stored to memory at:
        pskb_expand_head+0x30f/0x19d0 net/core/skbuff.c:2253
        netlink_trim+0x2c2/0x330 net/netlink/af_netlink.c:1317
        netlink_unicast+0x9f/0x1260 net/netlink/af_netlink.c:1351
        nlmsg_unicast include/net/netlink.h:1144 [inline]
        nlmsg_notify+0x21d/0x2f0 net/netlink/af_netlink.c:2610
        rtnetlink_send+0x73/0x90 net/core/rtnetlink.c:741
        rtnetlink_maybe_send include/linux/rtnetlink.h:17 [inline]
        tcf_add_notify net/sched/act_api.c:2048 [inline]
        tcf_action_add net/sched/act_api.c:2071 [inline]
        tc_ctl_action+0x146e/0x19d0 net/sched/act_api.c:2119
        rtnetlink_rcv_msg+0x1737/0x1900 net/core/rtnetlink.c:6595
        netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2559
        rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6613
        netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
        netlink_unicast+0xf4c/0x1260 net/netlink/af_netlink.c:1361
        netlink_sendmsg+0x10df/0x11f0 net/netlink/af_netlink.c:1905
        sock_sendmsg_nosec net/socket.c:730 [inline]
        __sock_sendmsg+0x30f/0x380 net/socket.c:745
        ____sys_sendmsg+0x877/0xb60 net/socket.c:2584
        ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638
        __sys_sendmsg net/socket.c:2667 [inline]
        __do_sys_sendmsg net/socket.c:2676 [inline]
        __se_sys_sendmsg net/socket.c:2674 [inline]
        __x64_sys_sendmsg+0x307/0x4a0 net/socket.c:2674
       do_syscall_64+0xd5/0x1f0
       entry_SYSCALL_64_after_hwframe+0x6d/0x75
      
      Uninit was stored to memory at:
        __nla_put lib/nlattr.c:1041 [inline]
        nla_put+0x1c6/0x230 lib/nlattr.c:1099
        tcf_skbmod_dump+0x23f/0xc20 net/sched/act_skbmod.c:256
        tcf_action_dump_old net/sched/act_api.c:1191 [inline]
        tcf_action_dump_1+0x85e/0x970 net/sched/act_api.c:1227
        tcf_action_dump+0x1fd/0x460 net/sched/act_api.c:1251
        tca_get_fill+0x519/0x7a0 net/sched/act_api.c:1628
        tcf_add_notify_msg net/sched/act_api.c:2023 [inline]
        tcf_add_notify net/sched/act_api.c:2042 [inline]
        tcf_action_add net/sched/act_api.c:2071 [inline]
        tc_ctl_action+0x1365/0x19d0 net/sched/act_api.c:2119
        rtnetlink_rcv_msg+0x1737/0x1900 net/core/rtnetlink.c:6595
        netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2559
        rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6613
        netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
        netlink_unicast+0xf4c/0x1260 net/netlink/af_netlink.c:1361
        netlink_sendmsg+0x10df/0x11f0 net/netlink/af_netlink.c:1905
        sock_sendmsg_nosec net/socket.c:730 [inline]
        __sock_sendmsg+0x30f/0x380 net/socket.c:745
        ____sys_sendmsg+0x877/0xb60 net/socket.c:2584
        ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638
        __sys_sendmsg net/socket.c:2667 [inline]
        __do_sys_sendmsg net/socket.c:2676 [inline]
        __se_sys_sendmsg net/socket.c:2674 [inline]
        __x64_sys_sendmsg+0x307/0x4a0 net/socket.c:2674
       do_syscall_64+0xd5/0x1f0
       entry_SYSCALL_64_after_hwframe+0x6d/0x75
      
      Local variable opt created at:
        tcf_skbmod_dump+0x9d/0xc20 net/sched/act_skbmod.c:244
        tcf_action_dump_old net/sched/act_api.c:1191 [inline]
        tcf_action_dump_1+0x85e/0x970 net/sched/act_api.c:1227
      
      Bytes 188-191 of 248 are uninitialized
      Memory access of size 248 starts at ffff888117697680
      Data copied to user address 00007ffe56d855f0
      
      Fixes: 86da71b5 ("net_sched: Introduce skbmod action")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Link: https://lore.kernel.org/r/20240403130908.93421-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KVM: arm64: Ensure target address is granule-aligned for range TLBI · 3dcaf259
      Will Deacon authored
      commit 4c36a156 upstream.
      
      When zapping a table entry in stage2_try_break_pte(), we issue range
      TLB invalidation for the region that was mapped by the table. However,
      we neglect to align the base address down to the granule size and so
      if we ended up reaching the table entry via a misaligned address then
      we will accidentally skip invalidation for some prefix of the affected
      address range.
      
      Align 'ctx->addr' down to the granule size when performing TLB
      invalidation for an unmapped table in stage2_try_break_pte().
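
      Aligning an address down to a power-of-two granule is a single mask operation. The helper below is a generic sketch with a hypothetical name (align_down()), not the kernel macro used by the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round addr down to the nearest multiple of granule.
 * granule must be a non-zero power of two.
 */
uint64_t align_down(uint64_t addr, uint64_t granule)
{
    assert(granule != 0 && (granule & (granule - 1)) == 0);
    return addr & ~(granule - 1);
}
```

      With a 2 MiB granule (0x200000), an address of 0x40201234 rounds down to 0x40200000, so a range starting there covers the bytes below the original address that an unaligned start would miss.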
      
      Cc: Raghavendra Rao Ananta <rananta@google.com>
      Cc: Gavin Shan <gshan@redhat.com>
      Cc: Shaoqin Huang <shahuang@redhat.com>
      Cc: Quentin Perret <qperret@google.com>
      Fixes: defc8cc7 ("KVM: arm64: Invalidate the table entries upon a range")
      Signed-off-by: Will Deacon <will@kernel.org>
      Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
      Reviewed-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20240327124853.11206-5-will@kernel.org
      Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>