  1. Jan 11, 2024
    • mmc: core: Cancel delayed work before releasing host · 2813a434
      Geert Uytterhoeven authored
      commit 1036f69e upstream.
      
      On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
      deferral of the vqmmc-supply regulator:
      
          ------------[ cut here ]------------
          WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
          Modules linked in:
          CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 #101
          Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
          epc : __run_timers.part.0+0x1d0/0x1e8
           ra : __run_timers.part.0+0x134/0x1e8
          epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
           gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
           t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
           s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
           a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
           a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
           s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
           s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
           s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
           s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
           t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
          status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
          [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
          [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
          [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
          [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
          [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
          [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
          ---[ end trace 0000000000000000 ]---
      
      What happens?
      
          renesas_sdhi_probe()
          {
              tmio_mmc_host_alloc()
                  mmc_alloc_host()
                      INIT_DELAYED_WORK(&host->detect, mmc_rescan);

              devm_request_irq(tmio_mmc_irq);

              /*
               * After this, the interrupt handler may be invoked at any time
               *
               *  tmio_mmc_irq()
               *  {
               *      __tmio_mmc_card_detect_irq()
               *          mmc_detect_change()
               *              _mmc_detect_change()
               *                  mmc_schedule_delayed_work(&host->detect, delay);
               *  }
               */

              tmio_mmc_host_probe()
                  tmio_mmc_init_ocr()
                      -EPROBE_DEFER

              tmio_mmc_host_free()
                  mmc_free_host()
          }
      
      When expire_timers() runs later, it warns because the MMC host structure
      containing the delayed work was freed, and now contains an invalid work
      function pointer.
      
      Fix this by cancelling any pending delayed work before releasing the
      MMC host structure.
      
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mmc: rpmb: fixes pause retune on all RPMB partitions. · 575e1270
      Jorge Ramirez-Ortiz authored
      commit e7794c14 upstream.
      
      When RPMB was converted to a character device, support was added for
      multiple RPMB partitions (commit 97548575, "mmc: block: Convert RPMB
      to a character device").
      
      One of the changes in this commit was transforming the variable target_part
      defined in __mmc_blk_ioctl_cmd into a bitmask. This inadvertently regressed
      the validation check done in mmc_blk_part_switch_pre() and
      mmc_blk_part_switch_post(), so let's fix it.
      
      Fixes: 97548575 ("mmc: block: Convert RPMB to a character device")
      Signed-off-by: Jorge Ramirez-Ortiz <jorge@foundries.io>
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20231201153143.1449753-1-jorge@foundries.io
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mmc: meson-mx-sdhc: Fix initialization frozen issue · 9c5efaa0
      Ziyang Huang authored
      commit 8c124d99 upstream.
      
      Commit 4bc31ede ("mmc: core: Set HS clock speed before sending
      HS CMD13") sets the HS clock (52MHz) before switching to HS mode. For
      this frequency, FCLK_DIV5 is selected and the div value is 10 (reg
      value 9). We then set rx_clk_phase to 11 or 15, which is out of range
      and freezes the hardware. After a command request is sent, no
      interrupt is raised and the mmc driver keeps waiting for the request
      to finish, even during reboot.

      So let's set it to Phase 90, which should work in most cases, and let
      meson_mx_sdhc_execute_tuning() find the accurate value for data
      transfer.

      If this doesn't work, a phase factor may need to be defined in the
      devicetree.
      
      Fixes: e4bf1b09 ("mmc: host: meson-mx-sdhc: new driver for the Amlogic Meson SDHC host")
      Signed-off-by: Ziyang Huang <hzyitc@outlook.com>
      Tested-by: Anand Moon <linux.amoon@gmail.com>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/TYZPR01MB5556A3E71554A2EC08597EA4C9CDA@TYZPR01MB5556.apcprd01.prod.exchangelabs.com
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drm/amd/display: add nv12 bounding box · 48e1d426
      Alex Deucher authored
      commit 7e725c20 upstream.
      
      This was included in the gpu_info firmware; move it into the driver
      for consistency with the other nv1x parts.
      
      Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2318
      
      
      Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drm/amdgpu: skip gpu_info fw loading on navi12 · 11c3510d
      Alex Deucher authored
      commit 21f6137c upstream.
      
      It's no longer required.
      
      Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2318
      
      
      Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm: fix unmap_mapping_range high bits shift bug · dafdeb7b
      Jiajun Xie authored
      commit 9eab0421 upstream.
      
      The bug happens when the highest bit of holebegin is 1. Suppose
      holebegin is 0x8000000111111000; after the shift, hba would be
      0xfff8000000111111, and vma_interval_tree_foreach would then fail to
      look it up or produce the wrong result.

      Example of the erroneous call sequence:
      - mmap(..., offset=0x8000000111111000)
        |- syscall(mmap, ... unsigned long, off):
           |- ksys_mmap_pgoff( ... , off >> PAGE_SHIFT);
      
        here pgoff is correctly shifted to 0x8000000111111,
        but pass 0x8000000111111000 as holebegin to unmap
        would then cause terrible result, as shown below:
      
      - unmap_mapping_range(..., loff_t const holebegin)
        |- pgoff_t hba = holebegin >> PAGE_SHIFT;
                /* hba = 0xfff8000000111111 unexpectedly */
      
      The issue happens in heterogeneous computing, where the device (e.g.
      a GPU) and the host share the same virtual address space.
      
      A simple workflow pattern which hit the issue is:
              /* host */
          1. userspace first mmap a file backed VA range with specified offset.
                              e.g. (offset=0x800..., mmap return: va_a)
          2. write some data to the corresponding sys page
                               e.g. (va_a = 0xAABB)
              /* device */
          3. gpu workload touches VA, triggers gpu fault and notify the host.
              /* host */
    4. received the gpu fault notification, then it will:
                  4.1 unmap host pages and also takes care of cpu tlb
                        (use unmap_mapping_range with offset=0x800...)
                  4.2 migrate sys page to device
                  4.3 setup device page table and resolve device fault.
              /* device */
          5. gpu workload continued, it accessed va_a and got 0xAABB.
          6. gpu workload continued, it wrote 0xBBCC to va_a.
              /* host */
          7. userspace access va_a, as expected, it will:
                  7.1 trigger cpu vm fault.
                  7.2 driver handling fault to migrate gpu local page to host.
          8. userspace then could correctly get 0xBBCC from va_a
          9. done
      
      But in step 4.1, if we hit the bug this patch mentioned, then userspace
      would never trigger cpu fault, and still get the old value: 0xAABB.
      
      Making holebegin unsigned first fixes the bug.
      
      Link: https://lkml.kernel.org/r/20231220052839.26970-1-jiajun.xie.sh@gmail.com
      
      
      Signed-off-by: Jiajun Xie <jiajun.xie.sh@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • i2c: core: Fix atomic xfer check for non-preempt config · 08038069
      Benjamin Bara authored
      commit a3368e11 upstream.
      
      Since commit aa49c908 ("i2c: core: Run atomic i2c xfer when
      !preemptible"), the whole reboot/power-off sequence on non-preempt
      kernels uses atomic i2c xfers, as !preemptible() always evaluates to
      true there.

      During device_shutdown(), the i2c bus may be used heavily, and not
      all busses implement an atomic xfer handler. This results in a lot of
      avoidable noise, like:
      
      [   12.687169] No atomic I2C transfer handler for 'i2c-0'
      [   12.692313] WARNING: CPU: 6 PID: 275 at drivers/i2c/i2c-core.h:40 i2c_smbus_xfer+0x100/0x118
      ...
      
      Fix this by allowing non-atomic xfers when interrupts are enabled, as
      it was before.
      
      Link: https://lore.kernel.org/r/20231222230106.73f030a5@yea
      Link: https://lore.kernel.org/r/20240102150350.3180741-1-mwalle@kernel.org
      Link: https://lore.kernel.org/linux-i2c/13271b9b-4132-46ef-abf8-2c311967bb46@mailbox.org/
      Fixes: aa49c908 ("i2c: core: Run atomic i2c xfer when !preemptible")
      Cc: stable@vger.kernel.org # v5.2+
      Signed-off-by: Benjamin Bara <benjamin.bara@skidata.com>
      Tested-by: Michael Walle <mwalle@kernel.org>
      Tested-by: Tor Vic <torvic9@mailbox.org>
      [wsa: removed a comment which needs more work, code is ok]
      Signed-off-by: Wolfram Sang <wsa@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/kprobes: fix incorrect return address calculation in kprobe_emulate_call_indirect · 53b42cb3
      Jinghao Jia authored
      commit f5d03da4 upstream.
      
      kprobe_emulate_call_indirect currently uses int3_emulate_call to
      emulate indirect calls. However, int3_emulate_call always assumes the
      size of the call to be 5 bytes when calculating the return address.
      This is incorrect for register-based indirect calls on x86, which can
      be either 2 or 3 bytes depending on whether a REX prefix is used. At
      kprobe runtime, the incorrect return address causes control flow to
      land at the wrong place after the return -- possibly not even a valid
      instruction boundary. This can lead to a panic like the following:
      
      [    7.308204][    C1] BUG: unable to handle page fault for address: 000000000002b4d8
      [    7.308883][    C1] #PF: supervisor read access in kernel mode
      [    7.309168][    C1] #PF: error_code(0x0000) - not-present page
      [    7.309461][    C1] PGD 0 P4D 0
      [    7.309652][    C1] Oops: 0000 [#1] SMP
      [    7.309929][    C1] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.7.0-rc5-trace-for-next #6
      [    7.310397][    C1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-20220807_005459-localhost 04/01/2014
      [    7.311068][    C1] RIP: 0010:__common_interrupt+0x52/0xc0
      [    7.311349][    C1] Code: 01 00 4d 85 f6 74 39 49 81 fe 00 f0 ff ff 77 30 4c 89 f7 4d 8b 5e 68 41 ba 91 76 d8 42 45 03 53 fc 74 02 0f 0b cc ff d3 65 48 <8b> 05 30 c7 ff 7e 65 4c 89 3d 28 c7 ff 7e 5b 41 5c 41 5e 41 5f c3
      [    7.312512][    C1] RSP: 0018:ffffc900000e0fd0 EFLAGS: 00010046
      [    7.312899][    C1] RAX: 0000000000000001 RBX: 0000000000000023 RCX: 0000000000000001
      [    7.313334][    C1] RDX: 00000000000003cd RSI: 0000000000000001 RDI: ffff888100d302a4
      [    7.313702][    C1] RBP: 0000000000000001 R08: 0ef439818636191f R09: b1621ff338a3b482
      [    7.314146][    C1] R10: ffffffff81e5127b R11: ffffffff81059810 R12: 0000000000000023
      [    7.314509][    C1] R13: 0000000000000000 R14: ffff888100d30200 R15: 0000000000000000
      [    7.314951][    C1] FS:  0000000000000000(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
      [    7.315396][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [    7.315691][    C1] CR2: 000000000002b4d8 CR3: 0000000003028003 CR4: 0000000000370ef0
      [    7.316153][    C1] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [    7.316508][    C1] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [    7.316948][    C1] Call Trace:
      [    7.317123][    C1]  <IRQ>
      [    7.317279][    C1]  ? __die_body+0x64/0xb0
      [    7.317482][    C1]  ? page_fault_oops+0x248/0x370
      [    7.317712][    C1]  ? __wake_up+0x96/0xb0
      [    7.317964][    C1]  ? exc_page_fault+0x62/0x130
      [    7.318211][    C1]  ? asm_exc_page_fault+0x22/0x30
      [    7.318444][    C1]  ? __cfi_native_send_call_func_single_ipi+0x10/0x10
      [    7.318860][    C1]  ? default_idle+0xb/0x10
      [    7.319063][    C1]  ? __common_interrupt+0x52/0xc0
      [    7.319330][    C1]  common_interrupt+0x78/0x90
      [    7.319546][    C1]  </IRQ>
      [    7.319679][    C1]  <TASK>
      [    7.319854][    C1]  asm_common_interrupt+0x22/0x40
      [    7.320082][    C1] RIP: 0010:default_idle+0xb/0x10
      [    7.320309][    C1] Code: 4c 01 c7 4c 29 c2 e9 72 ff ff ff cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 b8 0c 67 40 a5 66 90 0f 00 2d 09 b9 3b 00 fb f4 <fa> c3 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 b8 0c 67 40 a5 e9
      [    7.321449][    C1] RSP: 0018:ffffc9000009bee8 EFLAGS: 00000256
      [    7.321808][    C1] RAX: ffff88813bca8b68 RBX: 0000000000000001 RCX: 000000000001ef0c
      [    7.322227][    C1] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 000000000001ef0c
      [    7.322656][    C1] RBP: ffffc9000009bef8 R08: 8000000000000000 R09: 00000000000008c2
      [    7.323083][    C1] R10: 0000000000000000 R11: ffffffff81058e70 R12: 0000000000000000
      [    7.323530][    C1] R13: ffff8881002b30c0 R14: 0000000000000000 R15: 0000000000000000
      [    7.323948][    C1]  ? __cfi_lapic_next_deadline+0x10/0x10
      [    7.324239][    C1]  default_idle_call+0x31/0x50
      [    7.324464][    C1]  do_idle+0xd3/0x240
      [    7.324690][    C1]  cpu_startup_entry+0x25/0x30
      [    7.324983][    C1]  start_secondary+0xb4/0xc0
      [    7.325217][    C1]  secondary_startup_64_no_verify+0x179/0x17b
      [    7.325498][    C1]  </TASK>
      [    7.325641][    C1] Modules linked in:
      [    7.325906][    C1] CR2: 000000000002b4d8
      [    7.326104][    C1] ---[ end trace 0000000000000000 ]---
      [    7.326354][    C1] RIP: 0010:__common_interrupt+0x52/0xc0
      [    7.326614][    C1] Code: 01 00 4d 85 f6 74 39 49 81 fe 00 f0 ff ff 77 30 4c 89 f7 4d 8b 5e 68 41 ba 91 76 d8 42 45 03 53 fc 74 02 0f 0b cc ff d3 65 48 <8b> 05 30 c7 ff 7e 65 4c 89 3d 28 c7 ff 7e 5b 41 5c 41 5e 41 5f c3
      [    7.327570][    C1] RSP: 0018:ffffc900000e0fd0 EFLAGS: 00010046
      [    7.327910][    C1] RAX: 0000000000000001 RBX: 0000000000000023 RCX: 0000000000000001
      [    7.328273][    C1] RDX: 00000000000003cd RSI: 0000000000000001 RDI: ffff888100d302a4
      [    7.328632][    C1] RBP: 0000000000000001 R08: 0ef439818636191f R09: b1621ff338a3b482
      [    7.329223][    C1] R10: ffffffff81e5127b R11: ffffffff81059810 R12: 0000000000000023
      [    7.329780][    C1] R13: 0000000000000000 R14: ffff888100d30200 R15: 0000000000000000
      [    7.330193][    C1] FS:  0000000000000000(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
      [    7.330632][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [    7.331050][    C1] CR2: 000000000002b4d8 CR3: 0000000003028003 CR4: 0000000000370ef0
      [    7.331454][    C1] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [    7.331854][    C1] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [    7.332236][    C1] Kernel panic - not syncing: Fatal exception in interrupt
      [    7.332730][    C1] Kernel Offset: disabled
      [    7.333044][    C1] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
      
      The relevant assembly code is (from objdump, faulting address
      highlighted):
      
      ffffffff8102ed9d:       41 ff d3                  call   *%r11
      ffffffff8102eda0:       65 48 <8b> 05 30 c7 ff    mov    %gs:0x7effc730(%rip),%rax
      
      The emulation incorrectly sets the return address to be ffffffff8102ed9d
      + 0x5 = ffffffff8102eda2, which is the 8b byte in the middle of the next
      mov. This in turn causes incorrect subsequent instruction decoding and
      eventually triggers the page fault above.
      
      Instead of invoking int3_emulate_call, perform push and jmp emulation
      directly in kprobe_emulate_call_indirect. At this point we can obtain
      the instruction size from p->ainsn.size so that we can calculate the
      correct return address.
      
      Link: https://lore.kernel.org/all/20240102233345.385475-1-jinghao7@illinois.edu/
      
      Fixes: 6256e668 ("x86/kprobes: Use int3 instead of debug trap for single-step")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>
      Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • firewire: ohci: suppress unexpected system reboot in AMD Ryzen machines and ASM108x/VT630x PCIe cards · d1db1ef5
      Takashi Sakamoto authored
      
      commit ac9184fb upstream.
      
      VIA VT6306/6307/6308 provides a PCI interface compliant with 1394
      OHCI. When the hardware is combined with an Asmedia ASM1083/1085
      PCIe-to-PCI bus bridge, accesses to its 'Isochronous Cycle Timer'
      register (offset 0xf0 in PCI memory space) often cause an unexpected
      system reboot on any type of AMD Ryzen machine (both the 0x17 and
      0x19 families). The issue does not appear on other machines (at least
      pre-Ryzen AMD and Intel machines), or with other OHCI 1394 hardware
      (e.g. Texas Instruments).
      
      The issue became apparent with commit dcadfd7f ("firewire: core:
      use union for callback of transaction completion"), added to the v6.5
      kernel, which changed the 1394 OHCI driver to access the register
      every time a local asynchronous transaction is dispatched. However,
      the issue exists in older kernel versions as well, as long as they
      run on an AMD Ryzen machine, since access to the register is required
      to maintain bus time. It is not hard to imagine users experiencing
      the unexpected system reboot when generating a bus reset by plugging
      a device in, or when a time-aware application program reads the
      register; e.g. audio sample processing.
      
      This commit suppresses the unexpected system reboot for this
      combination of hardware by avoiding the register access altogether.
      As a result, the software stack can no longer provide the hardware
      time to unit drivers, userspace applications, or nodes on the same
      IEEE 1394 bus. This is an apparent disadvantage, since time-aware
      application programs require it, but time-unaware applications work
      again; e.g. sbp2.
      
      Cc: stable@vger.kernel.org
      Reported-by: Jiri Slaby <jirislaby@kernel.org>
      Closes: https://bugzilla.suse.com/show_bug.cgi?id=1215436
      Reported-by: Mario Limonciello <mario.limonciello@amd.com>
      Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217994
      Reported-by: Tobias Gruetzmacher <tobias-lists@23.gs>
      Closes: https://sourceforge.net/p/linux1394/mailman/message/58711901/
      Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2240973
      Closes: https://bugs.launchpad.net/linux/+bug/2043905
      Link: https://lore.kernel.org/r/20240102110150.244475-1-o-takashi@sakamocchi.jp
      Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ring-buffer: Fix 32-bit rb_time_read() race with rb_time_cmpxchg() · 09a44d99
      Mathieu Desnoyers authored
      [ Upstream commit dec89008 ]
      
      The following race can cause rb_time_read() to observe a corrupted time
      stamp:
      
      rb_time_cmpxchg()
      [...]
              if (!rb_time_read_cmpxchg(&t->msb, msb, msb2))
                      return false;
              if (!rb_time_read_cmpxchg(&t->top, top, top2))
                      return false;
      <interrupted before updating bottom>
      __rb_time_read()
      [...]
              do {
                      c = local_read(&t->cnt);
                      top = local_read(&t->top);
                      bottom = local_read(&t->bottom);
                      msb = local_read(&t->msb);
              } while (c != local_read(&t->cnt));
      
              *cnt = rb_time_cnt(top);
      
              /* If top and msb counts don't match, this interrupted a write */
              if (*cnt != rb_time_cnt(msb))
                      return false;
                ^ this check fails to catch that "bottom" is still not updated.
      
      So the old "bottom" value is returned, which is wrong.
      
      Fix this by checking that all three of msb, top, and bottom 2-bit cnt
      values match.
      
      The reason to favor checking all three fields over requiring a specific
      update order for both rb_time_set() and rb_time_cmpxchg() is because
      checking all three fields is more robust to handle partial failures of
      rb_time_cmpxchg() when interrupted by nested rb_time_set().
      
      Link: https://lore.kernel.org/lkml/20231211201324.652870-1-mathieu.desnoyers@efficios.com/
      Link: https://lore.kernel.org/linux-trace-kernel/20231212193049.680122-1-mathieu.desnoyers@efficios.com
      
      Fixes: f458a145 ("ring-buffer: Test last update in 32bit version of __rb_time_read()")
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • btrfs: mark the len field in struct btrfs_ordered_sum as unsigned · 820a7802
      Christoph Hellwig authored
      [ Upstream commit 6e4b2479 ]
      
      len can't ever be negative, so mark it as a u32 instead of an int.
      
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Stable-dep-of: 9e65bfca ("btrfs: fix qgroup_free_reserved_data int overflow")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • btrfs: fix qgroup_free_reserved_data int overflow · ab220f4f
      Boris Burkov authored
      [ Upstream commit 9e65bfca ]
      
      The reserved data counter and input parameter is a u64, but we
      inadvertently accumulate it in an int. Overflowing that int results in
      freeing the wrong amount of data and breaking reserve accounting.
      
      Unfortunately, this overflow rot spreads from there, as the qgroup
      release/free functions rely on returning an int to take advantage of
      negative values for error codes.
      
      Therefore, the full fix is to return the "released" or "freed" amount by
      a u64 argument and to return 0 or negative error code via the return
      value.
      
      Most of the call sites simply ignore the return value, though some
      of them handle the error and count the returned bytes. Change all of
      them accordingly.
      
      CC: stable@vger.kernel.org # 6.1+
      Reviewed-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: Boris Burkov <boris@bur.io>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • octeontx2-af: Support variable number of lmacs · 0f74dde5
      Rakesh Babu Saladi authored
      [ Upstream commit f2e664ad ]
      
      Most of the code in the CGX/RPM driver assumes that the max number of
      lmacs per MAC block is always 4, and that the number of MAC blocks is
      also 4. With this assumption, the max number of supported interfaces
      is hardcoded to 16. This creates a problem, as the next-gen CN10KB
      silicon MAC supports 8 lmacs per MAC block.

      This patch solves the problem by reading the "max lmacs per MAC
      block" value from the constant CSRs and by using the cgx_cnt_max
      value, which is populated based on the number of MAC blocks supported
      by the silicon.
      
      Signed-off-by: Rakesh Babu Saladi <rsaladi2@marvell.com>
      Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
      Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Stable-dep-of: e307b5a8 ("octeontx2-af: Fix pause frame configuration")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • octeontx2-af: Fix pause frame configuration · 7d391261
      Hariprasad Kelam authored
      [ Upstream commit e307b5a8 ]
      
      The current implementation's default Pause Forward setting is causing
      unnecessary network traffic. This patch disables Pause Forward to
      address this issue.
      
      Fixes: 1121f6b0 ("octeontx2-af: Priority flow control configuration support")
      Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
      Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table · a29b15cc
      Vlad Buslov authored
      [ Upstream commit 125f1c7f ]
      
      The referenced change added custom cleanup code to act_ct to delete
      any callbacks registered on the parent block when deleting the
      tcf_ct_flow_table instance. However, the underlying issue is that the
      drivers don't obtain a reference to the tcf_ct_flow_table instance
      when registering callbacks. This means that not only may driver
      callbacks still be on the table when it is deleted, but the driver
      can also still hold pointers to its internal nf_flowtable and use it
      concurrently, which results in either a warning in netfilter[0] or a
      use-after-free.
      
      Fix the issue by taking a reference to the underlying struct
      tcf_ct_flow_table instance when registering the callback and release the
      reference when unregistering. Expose new API required for such reference
      counting by adding two new callbacks to nf_flowtable_type and implementing
      them for act_ct flowtable_ct type. This fixes the issue by extending the
      lifetime of nf_flowtable until all users have unregistered.
      
      [0]:
      [106170.938634] ------------[ cut here ]------------
      [106170.939111] WARNING: CPU: 21 PID: 3688 at include/net/netfilter/nf_flow_table.h:262 mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core]
      [106170.940108] Modules linked in: act_ct nf_flow_table act_mirred act_skbedit act_tunnel_key vxlan cls_matchall nfnetlink_cttimeout act_gact cls_flower sch_ingress mlx5_vdpa vringh vhost_iotlb vdpa bonding openvswitch nsh rpcrdma rdma_ucm
      ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_regis
      try overlay mlx5_core
      [106170.943496] CPU: 21 PID: 3688 Comm: kworker/u48:0 Not tainted 6.6.0-rc7_for_upstream_min_debug_2023_11_01_13_02 #1
      [106170.944361] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
      [106170.945292] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core]
      [106170.945846] RIP: 0010:mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core]
      [106170.946413] Code: 89 ef 48 83 05 71 a4 14 00 01 e8 f4 06 04 e1 48 83 05 6c a4 14 00 01 48 83 c4 28 5b 5d 41 5c 41 5d c3 48 83 05 d1 8b 14 00 01 <0f> 0b 48 83 05 d7 8b 14 00 01 e9 96 fe ff ff 48 83 05 a2 90 14 00
      [106170.947924] RSP: 0018:ffff88813ff0fcb8 EFLAGS: 00010202
      [106170.948397] RAX: 0000000000000000 RBX: ffff88811eabac40 RCX: ffff88811eabad48
      [106170.949040] RDX: ffff88811eab8000 RSI: ffffffffa02cd560 RDI: 0000000000000000
      [106170.949679] RBP: ffff88811eab8000 R08: 0000000000000001 R09: ffffffffa0229700
      [106170.950317] R10: ffff888103538fc0 R11: 0000000000000001 R12: ffff88811eabad58
      [106170.950969] R13: ffff888110c01c00 R14: ffff888106b40000 R15: 0000000000000000
      [106170.951616] FS:  0000000000000000(0000) GS:ffff88885fd40000(0000) knlGS:0000000000000000
      [106170.952329] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [106170.952834] CR2: 00007f1cefd28cb0 CR3: 000000012181b006 CR4: 0000000000370ea0
      [106170.953482] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [106170.954121] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [106170.954766] Call Trace:
      [106170.955057]  <TASK>
      [106170.955315]  ? __warn+0x79/0x120
      [106170.955648]  ? mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core]
      [106170.956172]  ? report_bug+0x17c/0x190
      [106170.956537]  ? handle_bug+0x3c/0x60
      [106170.956891]  ? exc_invalid_op+0x14/0x70
      [106170.957264]  ? asm_exc_invalid_op+0x16/0x20
      [106170.957666]  ? mlx5_del_flow_rules+0x10/0x310 [mlx5_core]
      [106170.958172]  ? mlx5_tc_ct_block_flow_offload_add+0x1240/0x1240 [mlx5_core]
      [106170.958788]  ? mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core]
      [106170.959339]  ? mlx5_tc_ct_del_ft_cb+0xc6/0x2b0 [mlx5_core]
      [106170.959854]  ? mapping_remove+0x154/0x1d0 [mlx5_core]
      [106170.960342]  ? mlx5e_tc_action_miss_mapping_put+0x4f/0x80 [mlx5_core]
      [106170.960927]  mlx5_tc_ct_delete_flow+0x76/0xc0 [mlx5_core]
      [106170.961441]  mlx5_free_flow_attr_actions+0x13b/0x220 [mlx5_core]
      [106170.962001]  mlx5e_tc_del_fdb_flow+0x22c/0x3b0 [mlx5_core]
      [106170.962524]  mlx5e_tc_del_flow+0x95/0x3c0 [mlx5_core]
      [106170.963034]  mlx5e_flow_put+0x73/0xe0 [mlx5_core]
      [106170.963506]  mlx5e_put_flow_list+0x38/0x70 [mlx5_core]
      [106170.964002]  mlx5e_rep_update_flows+0xec/0x290 [mlx5_core]
      [106170.964525]  mlx5e_rep_neigh_update+0x1da/0x310 [mlx5_core]
      [106170.965056]  process_one_work+0x13a/0x2c0
      [106170.965443]  worker_thread+0x2e5/0x3f0
      [106170.965808]  ? rescuer_thread+0x410/0x410
      [106170.966192]  kthread+0xc6/0xf0
      [106170.966515]  ? kthread_complete_and_exit+0x20/0x20
      [106170.966970]  ret_from_fork+0x2d/0x50
      [106170.967332]  ? kthread_complete_and_exit+0x20/0x20
      [106170.967774]  ret_from_fork_asm+0x11/0x20
      [106170.970466]  </TASK>
      [106170.970726] ---[ end trace 0000000000000000 ]---
      
Fixes: 77ac5e40 ("net/sched: act_ct: remove and free nf_table callbacks")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
      a29b15cc
    • Pablo Neira Ayuso's avatar
      netfilter: flowtable: GC pushes back packets to classic path · 2bb4ecb3
      Pablo Neira Ayuso authored
      [ Upstream commit 735795f6 ]
      
Since 41f2c7c3 ("net/sched: act_ct: Fix promotion of offloaded
unreplied tuple"), flowtable GC pushes flows with IPS_SEEN_REPLY back
to the classic path on every run, i.e. every second. This is caused by
a new check for NF_FLOW_HW_ESTABLISHED, which is specific to sched/act_ct.

In Netfilter's flowtable case, NF_FLOW_HW_ESTABLISHED is never set, and
IPS_SEEN_REPLY is unreliable: users decide when to offload the flow, so
that bit might only be set at a later stage.
      
      Fix it by adding a custom .gc handler that sched/act_ct can use to
      deal with its NF_FLOW_HW_ESTABLISHED bit.
      
Fixes: 41f2c7c3 ("net/sched: act_ct: Fix promotion of offloaded unreplied tuple")
Reported-by: Vladimir Smelhaus <vl.sm@email.cz>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Stable-dep-of: 125f1c7f ("net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table")
Signed-off-by: Sasha Levin <sashal@kernel.org>
      2bb4ecb3
    • Paul Blakey's avatar
      net/sched: act_ct: Fix promotion of offloaded unreplied tuple · df01de08
      Paul Blakey authored
      [ Upstream commit 41f2c7c3 ]
      
Currently UNREPLIED and UNASSURED connections are added to the nf flow
table. This causes subsequent connection packets to be processed by the
flow table, which then skips conntrack_in(), so such connections remain
UNREPLIED and UNASSURED even if reply traffic is then seen. Worse, the
unoffloaded reply packets are the ones that trigger the hardware update
from new to established state; if there aren't any to trigger an update,
or a previous update was missed, hardware can get out of sync with
software and still mark packets as new.

Fix the above by:
1) Not skipping conntrack_in() for UNASSURED packets, but still
   refreshing for hardware, as before the cited patch.
2) Trying to force a refresh by reply-direction packets that update
   the hardware rules from new to established state.
3) Removing any bidirectional flows that failed to update in
   hardware, so they can be re-inserted as bidirectional once a new
   packet arrives.
      
Fixes: 6a9bad00 ("net/sched: act_ct: offload UDP NEW connections")
Co-developed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/r/1686313379-117663-1-git-send-email-paulb@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Stable-dep-of: 125f1c7f ("net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table")
Signed-off-by: Sasha Levin <sashal@kernel.org>
      df01de08
    • Vlad Buslov's avatar
      net/sched: act_ct: offload UDP NEW connections · 87466a37
      Vlad Buslov authored
[ Upstream commit 6a9bad00 ]
      
      Modify the offload algorithm of UDP connections to the following:
      
      - Offload NEW connection as unidirectional.
      
- When the connection state changes to ESTABLISHED, also update the hardware
flow. However, to prevent act_ct from spamming the offload add workqueue for
every packet coming in the reply direction in this state, verify whether the
connection has already been updated to ESTABLISHED in the drivers. If that
is the case, skip the flow_table and let conntrack handle such packets,
which will also allow conntrack to potentially promote the connection to
ASSURED.
      
- When the connection state changes to ASSURED, set the flow_table flow's
NF_FLOW_HW_BIDIRECTIONAL flag, which will cause the refresh mechanism to
offload the reply direction.
      
      All other protocols have their offload algorithm preserved and are always
      offloaded as bidirectional.
      
Note that this change tries to minimize the load on the flow_table add
workqueue. First, it tracks the last ctinfo that was offloaded by using the
new flow 'NF_FLOW_HW_ESTABLISHED' flag and doesn't schedule the refresh for
reply-direction packets when the offloads have already been updated with the
current ctinfo. Second, when the 'add' task executes on the workqueue it
always updates the offload with the current flow state (by checking the
'bidirectional' flow flag and obtaining the actual ctinfo/cookie through the
meta action instead of caching either of these at the moment of scheduling
the 'add' work), preventing the need to schedule more updates if the state
changed concurrently while the 'add' work was pending on the workqueue.
      
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 125f1c7f ("net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table")
Signed-off-by: Sasha Levin <sashal@kernel.org>
      87466a37
    • Vlad Buslov's avatar
      netfilter: flowtable: cache info of last offload · 8b160f2f
      Vlad Buslov authored
[ Upstream commit 1a441a9b ]
      
      Modify flow table offload to cache the last ct info status that was passed
      to the driver offload callbacks by extending enum nf_flow_flags with new
      "NF_FLOW_HW_ESTABLISHED" flag. Set the flag if ctinfo was 'established'
      during last act_ct meta actions fill call. This infrastructure change is
      necessary to optimize promoting of UDP connections from 'new' to
      'established' in following patches in this series.
      
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 125f1c7f ("net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table")
Signed-off-by: Sasha Levin <sashal@kernel.org>
      8b160f2f
    • Vlad Buslov's avatar
      netfilter: flowtable: allow unidirectional rules · c29a7656
      Vlad Buslov authored
[ Upstream commit 8f84780b ]
      
      Modify flow table offload to support unidirectional connections by
      extending enum nf_flow_flags with new "NF_FLOW_HW_BIDIRECTIONAL" flag. Only
      offload reply direction when the flag is set. This infrastructure change is
      necessary to support offloading UDP NEW connections in original direction
      in following patches in series.
      
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 125f1c7f ("net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table")
Signed-off-by: Sasha Levin <sashal@kernel.org>
      c29a7656
    • Xin Long's avatar
      net: sched: call tcf_ct_params_free to free params in tcf_ct_init · e681f711
      Xin Long authored
[ Upstream commit 19138941 ]
      
This patch simplifies the error path by calling tcf_ct_params_free(), so
that it won't cause problems when more members that need freeing on the
error path are added to params.
      
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Stable-dep-of: 125f1c7f ("net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table")
Signed-off-by: Sasha Levin <sashal@kernel.org>
      e681f711
    • Sumanth Korikkar's avatar
      mm/memory_hotplug: fix error handling in add_memory_resource() · d49bf9c1
      Sumanth Korikkar authored
      [ Upstream commit f42ce5f0 ]
      
      In add_memory_resource(), creation of memory block devices occurs after
      successful call to arch_add_memory().  However, creation of memory block
      devices could fail.  In that case, arch_remove_memory() is called to
      perform necessary cleanup.
      
Currently, with or without altmap support, arch_remove_memory() is always
called with altmap set to NULL during error handling.  This leads to the
struct pages being freed using free_pages(), even though the allocation
might have been performed with altmap support via
altmap_alloc_block_buf().
      
      Fix the error handling by passing altmap in arch_remove_memory(). This
      ensures the following:
      * When altmap is disabled, deallocation of the struct pages array occurs
        via free_pages().
      * When altmap is enabled, deallocation occurs via vmem_altmap_free().
      
      Link: https://lkml.kernel.org/r/20231120145354.308999-3-sumanthk@linux.ibm.com
Fixes: a08a2ae3 ("mm,memory_hotplug: allocate memmap from the added memory range")
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: kernel test robot <lkp@intel.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: <stable@vger.kernel.org>	[5.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
      d49bf9c1
    • Sumanth Korikkar's avatar
      mm/memory_hotplug: add missing mem_hotplug_lock · 4666f003
      Sumanth Korikkar authored
      [ Upstream commit 001002e7 ]
      
      From Documentation/core-api/memory-hotplug.rst:
      When adding/removing/onlining/offlining memory or adding/removing
      heterogeneous/device memory, we should always hold the mem_hotplug_lock
      in write mode to serialise memory hotplug (e.g. access to global/zone
      variables).
      
The mhp_(de)init_memmap_on_memory() functions can change zone stats and
struct page content, but they are currently called without the
mem_hotplug_lock.
      
When a memory block is being offlined while kmemleak goes through each
populated zone, the following theoretical race conditions could occur:
      CPU 0:					     | CPU 1:
      memory_offline()			     |
      -> offline_pages()			     |
      	-> mem_hotplug_begin()		     |
      	   ...				     |
      	-> mem_hotplug_done()		     |
      					     | kmemleak_scan()
      					     | -> get_online_mems()
      					     |    ...
      -> mhp_deinit_memmap_on_memory()	     |
        [not protected by mem_hotplug_begin/done()]|
        Marks memory section as offline,	     |   Retrieves zone_start_pfn
        poisons vmemmap struct pages and updates   |   and struct page members.
        the zone related data			     |
         					     |    ...
         					     | -> put_online_mems()
      
      Fix this by ensuring mem_hotplug_lock is taken before performing
      mhp_init_memmap_on_memory().  Also ensure that
      mhp_deinit_memmap_on_memory() holds the lock.
      
      online/offline_pages() are currently only called from
      memory_block_online/offline(), so it is safe to move the locking there.
      
      Link: https://lkml.kernel.org/r/20231120145354.308999-2-sumanthk@linux.ibm.com
Fixes: a08a2ae3 ("mm,memory_hotplug: allocate memmap from the added memory range")
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: kernel test robot <lkp@intel.com>
      Cc: <stable@vger.kernel.org>	[5.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
      4666f003
    • Ming Lei's avatar
      lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly · a576780a
      Ming Lei authored
      [ Upstream commit 0263f92f ]
      
group_cpus_evenly() can be part of a storage driver's error handler (the
nvme driver's, for example), which may run during CPU hotplug, when a
storage queue has to drain its pending IOs because all CPUs associated with
the queue are offline and the queue is becoming inactive.  Handling that IO
needs the error handler to provide forward progress.
      
      Then deadlock is caused:
      
      1) inside CPU hotplug handler, CPU hotplug lock is held, and blk-mq's
         handler is waiting for inflight IO
      
      2) error handler is waiting for CPU hotplug lock
      
      3) inflight IO can't be completed in blk-mq's CPU hotplug handler
         because error handling can't provide forward progress.
      
Solve the deadlock by not holding the CPU hotplug lock in
group_cpus_evenly(), which performs a two-stage spread: 1) the first stage
is over all present CPUs; 2) the second stage is over all other CPUs.

It turns out the two-stage spread just needs a consistent
'cpu_present_mask', so remove the CPU hotplug lock by storing the mask in a
local copy.  This doesn't change correctness, because all CPUs are still
covered.
      
Link: https://lkml.kernel.org/r/20231120083559.285174-1-ming.lei@redhat.com
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Reported-by: Guangwu Zhang <guazhang@redhat.com>
Tested-by: Guangwu Zhang <guazhang@redhat.com>
Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Cc: Keith Busch <kbusch@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
      a576780a
    • Ming Lei's avatar
      genirq/affinity: Move group_cpus_evenly() into lib/ · f33b27f5
      Ming Lei authored
[ Upstream commit f7b3ea8c ]
      
      group_cpus_evenly() has become a generic function which can be used for
      other subsystems than the interrupt subsystem, so move it into lib/.
      
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20221227022905.352674-6-ming.lei@redhat.com
Stable-dep-of: 0263f92f ("lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly")
Signed-off-by: Sasha Levin <sashal@kernel.org>
      f33b27f5
    • Ming Lei's avatar
      genirq/affinity: Rename irq_build_affinity_masks as group_cpus_evenly · 617ba373
      Ming Lei authored
[ Upstream commit 523f1ea7 ]
      
Map irq vectors into groups, which allows abstracting the algorithm for a
generic use case outside of the interrupt core.

Rename irq_build_affinity_masks() to group_cpus_evenly(), so the API can be
reused by blk-mq to make the default queue mapping even though irq vectors
aren't involved.

No functional change, just renaming 'vector' to 'group'.
      
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20221227022905.352674-5-ming.lei@redhat.com
Stable-dep-of: 0263f92f ("lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly")
Signed-off-by: Sasha Levin <sashal@kernel.org>
      617ba373
    • Ming Lei's avatar
      genirq/affinity: Don't pass irq_affinity_desc array to irq_build_affinity_masks · aeeb4e4e
      Ming Lei authored
[ Upstream commit e7bdd7f0 ]
      
Prepare for abstracting irq_build_affinity_masks() into a public function
for assigning all CPUs evenly into several groups.

Don't pass the irq_affinity_desc array to irq_build_affinity_masks();
instead, return a cpumask array, storing each assigned group in one element
of the array.

This allows providing a generic interface for grouping all CPUs evenly from
a NUMA and CPU locality viewpoint, and the cost is one extra allocation in
irq_build_affinity_masks(), which should be fine since it is done via
GFP_KERNEL and irq_build_affinity_masks() is a slow path anyway.
      
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20221227022905.352674-4-ming.lei@redhat.com
Stable-dep-of: 0263f92f ("lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly")
Signed-off-by: Sasha Levin <sashal@kernel.org>
      aeeb4e4e
    • Ming Lei's avatar
      genirq/affinity: Pass affinity managed mask array to irq_build_affinity_masks · 9e84d7bb
      Ming Lei authored
[ Upstream commit 1f962d91 ]
      
Pass the affinity managed mask array to irq_build_affinity_masks() so that
the index of the first affinity managed vector is always zero.

This allows simplifying the implementation a bit.
      
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20221227022905.352674-3-ming.lei@redhat.com
Stable-dep-of: 0263f92f ("lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly")
Signed-off-by: Sasha Levin <sashal@kernel.org>
      9e84d7bb
    • Ming Lei's avatar
      genirq/affinity: Remove the 'firstvec' parameter from irq_build_affinity_masks · a1dcd179
      Ming Lei authored
[ Upstream commit cdf07f0e ]
      
The 'firstvec' parameter is always the same as 'startvec', so use
'startvec' directly inside irq_build_affinity_masks().
      
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20221227022905.352674-2-ming.lei@redhat.com
Stable-dep-of: 0263f92f ("lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly")
Signed-off-by: Sasha Levin <sashal@kernel.org>
      a1dcd179
    • Takashi Iwai's avatar
      ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 · f4fe7646
      Takashi Iwai authored
      [ Upstream commit 634e5e1e ]
      
Lenovo Yoga Pro 7 14APH8 (PCI SSID 17aa:3882) seems to require a workaround
for the bass speaker similar to the one for the Yoga 9 model.
      
      Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/CAGGk=CRRQ1L9p771HsXTN_ebZP41Qj+3gw35Gezurn+nokRewg@mail.gmail.com
Link: https://lore.kernel.org/r/20231207182035.30248-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
      f4fe7646
    • Sudeep Holla's avatar
      firmware: arm_scmi: Fix frequency truncation by promoting multiplier type · aee60930
      Sudeep Holla authored
      [ Upstream commit 8e3c98d9 ]
      
Fix the possible frequency truncation for all values greater than or equal
to 4GHz on 64-bit machines by changing the multiplier 'mult_factor' to
'unsigned long'. It is also possible that the multiplier itself is greater
than or equal to 2^32, so the equation computing the multiplier's value
needs to be fixed as well.
      
Fixes: a9e3fbfa ("firmware: arm_scmi: add initial support for performance protocol")
Reported-by: Sibi Sankar <quic_sibis@quicinc.com>
      Closes: https://lore.kernel.org/all/20231129065748.19871-3-quic_sibis@quicinc.com/
      Cc: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20231130204343.503076-1-sudeep.holla@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
      aee60930
    • John Fastabend's avatar
      bpf, sockmap: af_unix stream sockets need to hold ref for pair sock · 90d1f74c
      John Fastabend authored
      [ Upstream commit 8866730a ]
      
AF_UNIX stream sockets are paired sockets, so sending on one of the pair
looks up the paired socket as part of the send operation. It is possible,
however, to put just one of the pair in a BPF map. This currently increments
the refcnt on the sock in the sockmap to ensure it is not freed by the stack
before sockmap cleans up its state and stops any skbs being sent/received to
that socket.

But we missed a case: if the peer socket is closed, it will be freed by the
stack. The paired socket, however, can still be referenced from the BPF
sockmap side, because we hold a reference there. If we then send traffic
through BPF sockmap to that socket, its send logic will dereference the
freed pair, creating a use-after-free with the following splat:
      
         [59.900375] BUG: KASAN: slab-use-after-free in sk_wake_async+0x31/0x1b0
         [59.901211] Read of size 8 at addr ffff88811acbf060 by task kworker/1:2/954
         [...]
         [59.905468] Call Trace:
         [59.905787]  <TASK>
         [59.906066]  dump_stack_lvl+0x130/0x1d0
         [59.908877]  print_report+0x16f/0x740
         [59.910629]  kasan_report+0x118/0x160
         [59.912576]  sk_wake_async+0x31/0x1b0
         [59.913554]  sock_def_readable+0x156/0x2a0
         [59.914060]  unix_stream_sendmsg+0x3f9/0x12a0
         [59.916398]  sock_sendmsg+0x20e/0x250
         [59.916854]  skb_send_sock+0x236/0xac0
         [59.920527]  sk_psock_backlog+0x287/0xaa0
      
To fix this, let BPF sockmap hold a refcnt on both the socket in the sockmap
and its paired socket. It wasn't obvious how to contain the fix to bpf_unix
logic. The primary problem with keeping this logic in bpf_unix was: in the
sock close() we could handle the deref by having a close handler. But when
we are destroying the psock through a map delete operation, we wouldn't have
gotten any signal through the proto struct other than it being replaced. If
we do the deref from the proto replace, it's too early, because we need to
deref the sk_pair after the backlog worker has been stopped.

Given all this, it seems best to just cache it at the end of the psock and
eat 8B for the af_unix and vsock users. Note that dgram sockets are OK
because they already handle locking.
      
Fixes: 94531cfc ("af_unix: Add unix_stream_proto for sockmap")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20231129012557.95371-2-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
      90d1f74c
    • Jakub Kicinski's avatar
      ethtool: don't propagate EOPNOTSUPP from dumps · 5ff1682f
      Jakub Kicinski authored
      [ Upstream commit cbeb989e ]
      
The default dump handler needs to clear ret before returning.
Otherwise, if the last interface returns an inconsequential error,
that error will propagate to user space.

This may confuse user space (the ethtool CLI seems to ignore it,
but YNL doesn't). It will also terminate the dump early for
multi-skb dumps, because the netlink core treats EOPNOTSUPP
as a real error.
      
Fixes: 728480f1 ("ethtool: default handlers for GET requests")
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231126225806.2143528-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
      5ff1682f
    • Ioana Ciornei's avatar
      dpaa2-eth: recycle the RX buffer only after all processing done · e570b150
      Ioana Ciornei authored
      [ Upstream commit beb1930f ]
      
      The blamed commit added support for Rx copybreak. This meant that for
      certain frame sizes, a new skb was allocated and the initial data buffer
      was recycled. Instead of waiting to recycle the Rx buffer only after all
      processing was done on it (like accessing the parse results or timestamp
      information), the code path just went ahead and re-used the buffer right
      away.
      
This sometimes led to corrupted HW and SW annotation areas.
Fix this by delaying the moment when the buffer is recycled.
      
Fixes: 50f82699 ("dpaa2-eth: add rx copybreak support")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
      e570b150
    • Ioana Ciornei's avatar
      net: dpaa2-eth: rearrange variable in dpaa2_eth_get_ethtool_stats · 5b8938fc
      Ioana Ciornei authored
[ Upstream commit 33132068 ]
      
      Rearrange the variables in the dpaa2_eth_get_ethtool_stats() function so
      that we adhere to the reverse Christmas tree rule.
      Also, in the next patch we are adding more variables and I didn't know
      where to place them with the current ordering.
      
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: beb1930f ("dpaa2-eth: recycle the RX buffer only after all processing done")
Signed-off-by: Sasha Levin <sashal@kernel.org>
      5b8938fc
    • Paulo Alcantara's avatar
      smb: client: fix missing mode bits for SMB symlinks · e88275ce
      Paulo Alcantara authored
[ Upstream commit ef22bb80 ]
      
      When instantiating inodes for SMB symlinks, add the mode bits from
      @cifs_sb->ctx->file_mode as we already do for the other special files.
      
      Cc: stable@vger.kernel.org
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
      e88275ce
    • Christoph Hellwig's avatar
      block: update the stable_writes flag in bdev_add · bf223fd4
      Christoph Hellwig authored
      [ Upstream commit 1898efcd ]
      
Propagate the per-queue stable_writes flag into each bdev inode in
bdev_add(). This makes sure devices that require stable writes have it set
for I/O on the block device node as well.
      
      Note that this doesn't cover the case of a flag changing on a live device
      yet.  We should handle that as well, but I plan to cover it as part of a
      more general rework of how changing runtime parameters on block devices
      works.
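      The propagation this patch describes can be modeled in a few lines of
      user-space C. This is an illustrative sketch only: the structures,
      `bdev_add_model()`, and the flag helpers are stand-ins invented here,
      not the kernel's actual API.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdio.h>

      /* Illustrative stand-ins for the kernel structures involved. */
      struct request_queue { bool stable_writes; };   /* per-queue flag */
      struct address_space { unsigned long flags; };  /* bdev inode mapping */

      #define AS_STABLE_WRITES_BIT 0  /* modeled flag bit, not the real value */

      static void mapping_set_stable_writes(struct address_space *m)
      {
          m->flags |= 1UL << AS_STABLE_WRITES_BIT;
      }

      static bool mapping_stable_writes(const struct address_space *m)
      {
          return m->flags & (1UL << AS_STABLE_WRITES_BIT);
      }

      /* Model of the fix: copy the queue flag into the bdev inode's
       * mapping at device-add time, so I/O on the block device node
       * also sees stable-write behavior. */
      static void bdev_add_model(const struct request_queue *q,
                                 struct address_space *bdev_mapping)
      {
          if (q->stable_writes)
              mapping_set_stable_writes(bdev_mapping);
      }

      int main(void)
      {
          struct request_queue q = { .stable_writes = true };
          struct address_space bdev_mapping = { 0 };

          bdev_add_model(&q, &bdev_mapping);
          assert(mapping_stable_writes(&bdev_mapping));
          printf("stable_writes propagated to bdev mapping\n");
          return 0;
      }
      ```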
      
      Fixes: 1cb039f3 ("bdi: replace BDI_CAP_STABLE_WRITES with a queue and a sb flag")
      Reported-by: Ilya Dryomov <idryomov@gmail.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20231025141020.192413-3-hch@lst.de
      Tested-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • Christoph Hellwig's avatar
      filemap: add a per-mapping stable writes flag · a8e4300a
      Christoph Hellwig authored
      [ Upstream commit 762321da ]
      
      folio_wait_stable waits for writeback to finish before modifying the
      contents of a folio again, e.g. to support checksumming of the data
      in the block integrity code.
      
      Currently this behavior is controlled by the SB_I_STABLE_WRITES flag
      on the super_block, which means it is uniform for the entire file system.
      This is wrong for the block device pseudofs, which is shared by all
      block devices, or for file systems that can use multiple devices, like
      XFS with the RT subvolume or btrfs (although btrfs currently
      reimplements folio_wait_stable anyway).
      
      Add a per-address_space AS_STABLE_WRITES flag to control the behavior
      in a more fine grained way.  The existing SB_I_STABLE_WRITES is kept
      to initialize AS_STABLE_WRITES to the existing default which covers
      most cases.
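      The split between the filesystem-wide default and the per-mapping flag
      can be sketched in user-space C. Everything here is a toy model: the
      flag values, `init_mapping()`, and the structure layouts are invented
      for illustration, not the kernel's real definitions.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      /* Illustrative flag values; not the kernel's real constants. */
      #define SB_I_STABLE_WRITES (1U << 0)   /* per-superblock default  */
      #define AS_STABLE_WRITES   (1U << 0)   /* per-address_space flag  */

      struct super_block   { unsigned s_iflags; };
      struct address_space { unsigned flags; };

      /* On inode setup, seed the per-mapping flag from the
       * filesystem-wide default, as described above. */
      static void init_mapping(struct address_space *m,
                               const struct super_block *sb)
      {
          m->flags = (sb->s_iflags & SB_I_STABLE_WRITES) ? AS_STABLE_WRITES : 0;
      }

      static bool mapping_stable_writes(const struct address_space *m)
      {
          return m->flags & AS_STABLE_WRITES;
      }

      int main(void)
      {
          /* A filesystem that requires stable writes everywhere: */
          struct super_block sb = { .s_iflags = SB_I_STABLE_WRITES };
          struct address_space a;
          init_mapping(&a, &sb);
          assert(mapping_stable_writes(&a));

          /* The shared block-device pseudofs starts without the flag,
           * so each bdev mapping can now opt in individually: */
          struct super_block bdev_sb = { .s_iflags = 0 };
          struct address_space b;
          init_mapping(&b, &bdev_sb);
          assert(!mapping_stable_writes(&b));
          return 0;
      }
      ```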
      
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20231025141020.192413-2-hch@lst.de
      Tested-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
      Stable-dep-of: 1898efcd ("block: update the stable_writes flag in bdev_add")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • David Howells's avatar
      mm, netfs, fscache: stop read optimisation when folio removed from pagecache · d0eafc76
      David Howells authored
      [ Upstream commit b4fa966f ]
      
      Fscache has an optimisation by which reads from the cache are skipped
      until we know that (a) there's data there to be read and (b) that data
      isn't entirely covered by pages resident in the netfs pagecache.  This is
      done with two flags manipulated by fscache_note_page_release():
      
      	if (...
      	    test_bit(FSCACHE_COOKIE_HAVE_DATA, &cookie->flags) &&
      	    test_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags))
      		clear_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);
      
      where the NO_DATA_TO_READ flag causes cachefiles_prepare_read() to
      indicate that netfslib should download from the server or clear the page
      instead.
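      The two-flag dance from the quoted snippet can be modeled in plain C.
      This is a rough user-space sketch: the bit positions, the bit helpers,
      and `note_page_release()` itself are invented here to mirror the
      snippet, not taken from the kernel.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdio.h>

      /* Illustrative bit positions; not the kernel's actual layout. */
      #define FSCACHE_COOKIE_HAVE_DATA       0
      #define FSCACHE_COOKIE_NO_DATA_TO_READ 1

      struct fscache_cookie { unsigned long flags; };

      static bool test_bit(int bit, const unsigned long *flags)
      {
          return *flags & (1UL << bit);
      }

      static void set_bit(int bit, unsigned long *flags)
      {
          *flags |= 1UL << bit;
      }

      static void clear_bit(int bit, unsigned long *flags)
      {
          *flags &= ~(1UL << bit);
      }

      /* Model of fscache_note_page_release(): once we know the cache holds
       * data and reads were being skipped, re-enable reads from the cache. */
      static void note_page_release(struct fscache_cookie *cookie)
      {
          if (test_bit(FSCACHE_COOKIE_HAVE_DATA, &cookie->flags) &&
              test_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags))
              clear_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);
      }

      int main(void)
      {
          struct fscache_cookie cookie = { 0 };
          set_bit(FSCACHE_COOKIE_HAVE_DATA, &cookie.flags);
          set_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie.flags);

          note_page_release(&cookie);
          assert(!test_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie.flags));
          printf("reads from the cache re-enabled\n");
          return 0;
      }
      ```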
      
      The fscache_note_page_release() function is intended to be called from
      ->releasepage() - but that only gets called if PG_private or PG_private_2
      is set - and currently the former is at the discretion of the network
      filesystem and the latter is only set whils...
    • David Howells's avatar
      mm: merge folio_has_private()/filemap_release_folio() call pairs · bceff380
      David Howells authored
      [ Upstream commit 0201ebf2 ]
      
      Patch series "mm, netfs, fscache: Stop read optimisation when folio
      removed from pagecache", v7.
      
      This fixes an optimisation in fscache whereby we don't read from the cache
      for a particular file until we know that there's data there that we don't
      have in the pagecache.  The problem is that I'm no longer using PG_fscache
      (aka PG_private_2) to indicate that the page is cached and so I don't get
      a notification when a cached page is dropped from the pagecache.
      
      The first patch merges some folio_has_private() and
      filemap_release_folio() pairs and introduces a helper,
      folio_needs_release(), to indicate if a release is required.
      
      The second patch is the actual fix.  Following Willy's suggestions[1], it
      adds an AS_RELEASE_ALWAYS flag to an address_space that will make
      filemap_release_folio() always call ->release_folio(), even if
      PG_private/PG_private_2 aren't set.  folio_needs_release() is altered to
      add a check for this.
      
      This patch (of 2):
      
      Make filemap_release_folio() check folio_has_private().  Then, in most
      cases, where a call to folio_has_private() is immediately followed by a
      call to filemap_release_folio(), we can get rid of the test in the pair.
      
      There are a couple of sites in mm/vmscan.c where this can't so easily
      be done.  In shrink_folio_list(), there are actually three cases (something
      different is done for incompletely invalidated buffers), but
      filemap_release_folio() elides two of them.
      
      In shrink_active_list(), we don't have the folio lock yet, so the
      check allows us to avoid locking the page unnecessarily.
      
      A wrapper function to check if a folio needs release is provided for those
      places that still need to do it in the mm/ directory.  This will acquire
      additional parts to the condition in a future patch.
      
      After this, the only remaining caller of folio_has_private() outside of
      mm/ is a check in fuse.
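      The wrapper introduced by this patch amounts to a thin alias, which the
      second patch then extends with an extra condition. A minimal user-space
      model (the `struct folio` layout and the private-pointer check are
      illustrative stand-ins, not the kernel's real page-flag test):

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>

      /* Toy model: a folio with private data attached. */
      struct folio { void *private; };

      static bool folio_has_private(const struct folio *folio)
      {
          return folio->private != NULL;
      }

      /* The wrapper from this patch; the follow-up patch adds a further
       * AS_RELEASE_ALWAYS check to this condition. */
      static bool folio_needs_release(const struct folio *folio)
      {
          return folio_has_private(folio);
      }

      int main(void)
      {
          struct folio with_private    = { .private = (void *)1 };
          struct folio without_private = { .private = NULL };

          assert(folio_needs_release(&with_private));
          assert(!folio_needs_release(&without_private));
          return 0;
      }
      ```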
      
      Link: https://lkml.kernel.org/r/20230628104852.3391651-1-dhowells@redhat.com
      Link: https://lkml.kernel.org/r/20230628104852.3391651-2-dhowells@redhat.com
      Reported-by: Rohith Surabattula <rohiths.msft@gmail.com>
      Suggested-by: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Steve French <sfrench@samba.org>
      Cc: Shyam Prasad N <nspmangalore@gmail.com>
      Cc: Rohith Surabattula <rohiths.msft@gmail.com>
      Cc: Dave Wysochanski <dwysocha@redhat.com>
      Cc: Dominique Martinet <asmadeus@codewreck.org>
      Cc: Ilya Dryomov <idryomov@gmail.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Xiubo Li <xiubli@redhat.com>
      Cc: Jingbo Xu <jefflexu@linux.alibaba.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Stable-dep-of: 1898efcd ("block: update the stable_writes flag in bdev_add")
      Signed-off-by: Sasha Levin <sashal@kernel.org>