  1. Jul 29, 2023
  2. Jul 24, 2023
    • Linux 6.5-rc3 · 6eaae198
      Linus Torvalds authored
    • Merge tag 'trace-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 3b4e48b8
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
      
       - Swapping the ring buffer for snapshotting (for things like irqsoff)
         can crash if the ring buffer is being resized. Disable swapping when
         this happens. The missed swap will be reported to the tracer
      
       - Report error if the histogram fails to be created due to an error in
         adding a histogram variable, in event_hist_trigger_parse()
      
       - Remove unused declaration of tracing_map_set_field_descr()
      
      * tag 'trace-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        tracing/histograms: Return an error if we fail to add histogram to hist_vars list
        ring-buffer: Do not swap cpu_buffer during resize process
        tracing: Remove unused extern declaration tracing_map_set_field_descr()
    • Merge tag 'kbuild-fixes-v6.5' of... · 12a5336c
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Fix stale help text in gconfig
      
       - Support *.S files in compile_commands.json
      
       - Flatten KBUILD_CFLAGS
      
       - Fix external module builds with Rust so that temporary files are
         created in the modules directories instead of the kernel tree
      
      * tag 'kbuild-fixes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kbuild: rust: avoid creating temporary files
        kbuild: flatten KBUILD_CFLAGS
        gen_compile_commands: add assembly files to compilation database
        kconfig: gconfig: correct program name in help text
        kconfig: gconfig: drop the Show Debug Info help text
    • kbuild: rust: avoid creating temporary files · df01b7cf
      Miguel Ojeda authored
      
      
      By default, `rustc` writes temporary files (i.e. the ones saved
      by `-Csave-temps`, such as `*.rcgu*` files) to the current working
      directory when neither `-o` nor `--out-dir` is given (even if
      `--emit=x=path` is given, i.e. it does not use those paths for temporaries).

      Since out-of-tree modules are compiled from the `linux` tree,
      `rustc` then tries to create them there, and that tree may not be accessible.
      
      Thus pass `--out-dir` explicitly, even if it is just for the temporary
      files.
      
      Similarly, do so for Rust host programs too.
      
      Reported-by: Raphael Nestler <raphael.nestler@gmail.com>
      Closes: https://github.com/Rust-for-Linux/linux/issues/1015
      Reported-by: Andrea Righi <andrea.righi@canonical.com>
      Tested-by: Raphael Nestler <raphael.nestler@gmail.com> # non-hostprogs
      Tested-by: Andrea Righi <andrea.righi@canonical.com> # non-hostprogs
      Fixes: 295d8398 ("kbuild: specify output names separately for each emission type from rustc")
      Cc: stable@vger.kernel.org
      Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
      Tested-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 269f4a4b
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "ARM:
      
         - Avoid pKVM finalization if KVM initialization fails
      
         - Add missing BTI instructions in the hypervisor, fixing an early
           boot failure on BTI systems
      
         - Handle MMU notifiers correctly for non hugepage-aligned memslots
      
         - Work around a bug in the architecture where hypervisor timer
           controls have UNKNOWN behavior under nested virt
      
         - Disable preemption in kvm_arch_hardware_enable(), fixing a kernel
           BUG in cpu hotplug resulting from per-CPU accessor sanity checking
      
         - Make WFI emulation on GICv4 systems robust w.r.t. preemption,
           consistently requesting a doorbell interrupt on vcpu_put()
      
         - Uphold RES0 sysreg behavior when emulating older PMU versions
      
         - Avoid macro expansion when initializing PMU register names,
           ensuring the tracepoints pretty-print the sysreg
      
        s390:
      
         - Two fixes for asynchronous destroy
      
        x86 fixes will come early next week"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: s390: pv: fix index value of replaced ASCE
        KVM: s390: pv: simplify shutdown and fix race
        KVM: arm64: Fix the name of sys_reg_desc related to PMU
        KVM: arm64: Correctly handle RES0 bits PMEVTYPER<n>_EL0.evtCount
        KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemption
        KVM: arm64: Add missing BTI instructions
        KVM: arm64: Correctly handle page aging notifiers for unaligned memslot
        KVM: arm64: Disable preemption in kvm_arch_hardware_enable()
        KVM: arm64: Handle kvm_arm_init failure correctly in finalize_pkvm
        KVM: arm64: timers: Use CNTHCTL_EL2 when setting non-CNTKCTL_EL1 bits
    • Merge tag 'ext4_for_linus-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 15b593ba
      Linus Torvalds authored
      Pull ext4 fixes from Ted Ts'o:
       "Bug and regression fixes for 6.5-rc3 for ext4's mballoc and jbd2's
        checkpoint code"
      
      * tag 'ext4_for_linus-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: fix rbtree traversal bug in ext4_mb_use_preallocated
        ext4: fix off by one issue in ext4_mb_choose_next_group_best_avail()
        ext4: correct inline offset when handling xattrs in inode body
        jbd2: remove __journal_try_to_free_buffer()
        jbd2: fix a race when checking checkpoint buffer busy
        jbd2: Fix wrongly judgement for buffer head removing while doing checkpoint
        jbd2: remove journal_clean_one_cp_list()
        jbd2: remove t_checkpoint_io_list
        jbd2: recheck chechpointing non-dirty buffer
    • Merge tag '6.5-rc2-smb3-client-fixes-ver2' of git://git.samba.org/sfrench/cifs-2.6 · 8266f53b
      Linus Torvalds authored
      Pull smb client fix from Steve French:
       "Add minor debugging improvement.
      
        The change makes it easier to read a network trace (e.g. using
        wireshark or tcpdump) when debugging problems on encrypted
        connections, which are very common.

        That works today with tools like 'smbinfo keys /mnt/file', but it
        requires passing in a filename on the mount (see e.g. [1]), and it
        often makes more sense to just pass in the mount point path (i.e. a
        directory, not a filename).

        So this fix was needed to debug some types of problems (an obvious
        example is operations failing on an encrypted connection to an empty
        share, or to one with no files in the root of the directory). Now you
        can simply run 'smbinfo keys <mntpoint>' and get the information
        that wireshark needs"
      
      Link: https://wiki.samba.org/index.php/Wireshark_Decryption [1]
      
      * tag '6.5-rc2-smb3-client-fixes-ver2' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: update internal module version number for cifs.ko
        cifs: allow dumping keys for directories too
    • Merge tag 'kvm-s390-master-6.5-1' of... · 0c189708
      Paolo Bonzini authored
      Merge tag 'kvm-s390-master-6.5-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
      
      Two fixes for asynchronous destroy
    • Merge tag 'kvmarm-fixes-6.5-1' of... · 675a15f4
      Paolo Bonzini authored
      Merge tag 'kvmarm-fixes-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
      
      KVM/arm64 fixes for 6.5, part #1
      
       - Avoid pKVM finalization if KVM initialization fails
      
       - Add missing BTI instructions in the hypervisor, fixing an early boot
         failure on BTI systems
      
       - Handle MMU notifiers correctly for non hugepage-aligned memslots
      
       - Work around a bug in the architecture where hypervisor timer controls
         have UNKNOWN behavior under nested virt.
      
       - Disable preemption in kvm_arch_hardware_enable(), fixing a kernel BUG
         in cpu hotplug resulting from per-CPU accessor sanity checking.
      
       - Make WFI emulation on GICv4 systems robust w.r.t. preemption,
         consistently requesting a doorbell interrupt on vcpu_put()
      
       - Uphold RES0 sysreg behavior when emulating older PMU versions
      
       - Avoid macro expansion when initializing PMU register names, ensuring
         the tracepoints pretty-print the sysreg.
  3. Jul 23, 2023
    • tracing/histograms: Return an error if we fail to add histogram to hist_vars list · 4b8b3905
      Mohamed Khalfella authored
      Commit 6018b585 ("tracing/histograms: Add histograms to hist_vars if
      they have referenced variables") added a check that fails histogram
      creation if save_hist_vars() cannot add the histogram to the hist_vars
      list. However, that commit did not set ret to a failure code before
      jumping to the path that unregisters the histogram. Fix it.
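
The bug shape described above can be shown in isolation. This is a minimal user-space sketch, not the kernel code: `register_item()`, `save_vars()`, `unregister_item()`, and both `create_*()` functions are hypothetical stand-ins, and only the structure of the error path follows the description.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the register/save/unregister steps. */
static bool registered;

static int register_item(void) { registered = true; return 0; }
static void unregister_item(void) { registered = false; }
static int save_vars(bool fail) { return fail ? -ENOMEM : 0; }

/* Buggy shape: the failure is detected, but ret still holds 0. */
int create_buggy(bool fail)
{
	int ret = register_item();

	if (ret < 0)
		return ret;
	if (save_vars(fail))
		goto out_unreg;	/* ret is still 0 here */
	return 0;
out_unreg:
	unregister_item();
	return ret;		/* reports success after undoing the work */
}

/* Fixed shape: capture the error code before jumping to cleanup. */
int create_fixed(bool fail)
{
	int ret = register_item();

	if (ret < 0)
		return ret;
	ret = save_vars(fail);
	if (ret)
		goto out_unreg;
	return 0;
out_unreg:
	unregister_item();
	return ret;
}
```

With a forced failure, `create_buggy(true)` undoes the registration yet returns 0, while `create_fixed(true)` returns the negative error code to the caller.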
      
      Link: https://lore.kernel.org/linux-trace-kernel/20230714203341.51396-1-mkhalfella@purestorage.com
      
      Cc: stable@vger.kernel.org
      Fixes: 6018b585 ("tracing/histograms: Add histograms to hist_vars if they have referenced variables")
      Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • ring-buffer: Do not swap cpu_buffer during resize process · 8a96c028
      Chen Lin authored
      When ring_buffer_swap_cpu() is called while a resize is in progress,
      the cpu buffer is swapped in the middle of the resize, leaving it in an
      incorrect state. Continuing to run in that state results in an oops.
      
      This issue can be easily reproduced using the following two scripts:
      /tmp # cat test1.sh
      #!/bin/sh
      for i in `seq 0 100000`
      do
               echo 2000 > /sys/kernel/debug/tracing/buffer_size_kb
               sleep 0.5
               echo 5000 > /sys/kernel/debug/tracing/buffer_size_kb
               sleep 0.5
      done
      /tmp # cat test2.sh
      #!/bin/sh
      for i in `seq 0 100000`
      do
              echo irqsoff > /sys/kernel/debug/tracing/current_tracer
              sleep 1
              echo nop > /sys/kernel/debug/tracing/current_tracer
              sleep 1
      done
      /tmp # ./test1.sh &
      /tmp # ./test2.sh &
      
      A typical oops log follows; other, different oops logs are sometimes seen as well.
      
      [  231.711293] WARNING: CPU: 0 PID: 9 at kernel/trace/ring_buffer.c:2026 rb_update_pages+0x378/0x3f8
      [  231.713375] Modules linked in:
      [  231.714735] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G        W          6.5.0-rc1-00276-g20edcec23f92 #15
      [  231.716750] Hardware name: linux,dummy-virt (DT)
      [  231.718152] Workqueue: events update_pages_handler
      [  231.719714] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [  231.721171] pc : rb_update_pages+0x378/0x3f8
      [  231.722212] lr : rb_update_pages+0x25c/0x3f8
      [  231.723248] sp : ffff800082b9bd50
      [  231.724169] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000
      [  231.726102] x26: 0000000000000001 x25: fffffffffffff010 x24: 0000000000000ff0
      [  231.728122] x23: ffff0000c3a0b600 x22: ffff0000c3a0b5c0 x21: fffffffffffffe0a
      [  231.730203] x20: ffff0000c3a0b600 x19: ffff0000c0102400 x18: 0000000000000000
      [  231.732329] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffe7aa8510
      [  231.734212] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000002
      [  231.736291] x11: ffff8000826998a8 x10: ffff800082b9baf0 x9 : ffff800081137558
      [  231.738195] x8 : fffffc00030e82c8 x7 : 0000000000000000 x6 : 0000000000000001
      [  231.740192] x5 : ffff0000ffbafe00 x4 : 0000000000000000 x3 : 0000000000000000
      [  231.742118] x2 : 00000000000006aa x1 : 0000000000000001 x0 : ffff0000c0007208
      [  231.744196] Call trace:
      [  231.744892]  rb_update_pages+0x378/0x3f8
      [  231.745893]  update_pages_handler+0x1c/0x38
      [  231.746893]  process_one_work+0x1f0/0x468
      [  231.747852]  worker_thread+0x54/0x410
      [  231.748737]  kthread+0x124/0x138
      [  231.749549]  ret_from_fork+0x10/0x20
      [  231.750434] ---[ end trace 0000000000000000 ]---
      [  233.720486] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
      [  233.721696] Mem abort info:
      [  233.721935]   ESR = 0x0000000096000004
      [  233.722283]   EC = 0x25: DABT (current EL), IL = 32 bits
      [  233.722596]   SET = 0, FnV = 0
      [  233.722805]   EA = 0, S1PTW = 0
      [  233.723026]   FSC = 0x04: level 0 translation fault
      [  233.723458] Data abort info:
      [  233.723734]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
      [  233.724176]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
      [  233.724589]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
      [  233.725075] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000104943000
      [  233.725592] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
      [  233.726231] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
      [  233.726720] Modules linked in:
      [  233.727007] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G        W          6.5.0-rc1-00276-g20edcec23f92 #15
      [  233.727777] Hardware name: linux,dummy-virt (DT)
      [  233.728225] Workqueue: events update_pages_handler
      [  233.728655] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [  233.729054] pc : rb_update_pages+0x1a8/0x3f8
      [  233.729334] lr : rb_update_pages+0x154/0x3f8
      [  233.729592] sp : ffff800082b9bd50
      [  233.729792] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000
      [  233.730220] x26: 0000000000000000 x25: ffff800082a8b840 x24: ffff0000c0102418
      [  233.730653] x23: 0000000000000000 x22: fffffc000304c880 x21: 0000000000000003
      [  233.731105] x20: 00000000000001f4 x19: ffff0000c0102400 x18: ffff800082fcbc58
      [  233.731727] x17: 0000000000000000 x16: 0000000000000001 x15: 0000000000000001
      [  233.732282] x14: ffff8000825fe0c8 x13: 0000000000000001 x12: 0000000000000000
      [  233.732709] x11: ffff8000826998a8 x10: 0000000000000ae0 x9 : ffff8000801b760c
      [  233.733148] x8 : fefefefefefefeff x7 : 0000000000000018 x6 : ffff0000c03298c0
      [  233.733553] x5 : 0000000000000002 x4 : 0000000000000000 x3 : 0000000000000000
      [  233.733972] x2 : ffff0000c3a0b600 x1 : 0000000000000000 x0 : 0000000000000000
      [  233.734418] Call trace:
      [  233.734593]  rb_update_pages+0x1a8/0x3f8
      [  233.734853]  update_pages_handler+0x1c/0x38
      [  233.735148]  process_one_work+0x1f0/0x468
      [  233.735525]  worker_thread+0x54/0x410
      [  233.735852]  kthread+0x124/0x138
      [  233.736064]  ret_from_fork+0x10/0x20
      [  233.736387] Code: 92400000 910006b5 aa000021 aa0303f7 (f9400060)
      [  233.736959] ---[ end trace 0000000000000000 ]---
      
      After analysis, the sequence leading to the error is as follows (steps 1-5):
      
      int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
      			int cpu_id)
      {
      	for_each_buffer_cpu(buffer, cpu) {
      		cpu_buffer = buffer->buffers[cpu];
      		//1. get cpu_buffer, aka cpu_buffer(A)
      		...
      		...
      		schedule_work_on(cpu,
      		 &cpu_buffer->update_pages_work);
      		//2. 'update_pages_work' is queued on 'cpu'; cpu_buffer(A) is passed to
      		// update_pages_handler, which performs the update and then calls
      		// complete(&cpu_buffer->update_done) to wake up the resize process.
      	//---->
      		//3. Just at this moment, ring_buffer_swap_cpu is triggered and
      		//cpu_buffer(A) is swapped with cpu_buffer(B), the max_buffer.
      		//ring_buffer_swap_cpu is reached via the 'Call trace' below.
      
      		Call trace:
      		 dump_backtrace+0x0/0x2f8
      		 show_stack+0x18/0x28
      		 dump_stack+0x12c/0x188
      		 ring_buffer_swap_cpu+0x2f8/0x328
      		 update_max_tr_single+0x180/0x210
      		 check_critical_timing+0x2b4/0x2c8
      		 tracer_hardirqs_on+0x1c0/0x200
      		 trace_hardirqs_on+0xec/0x378
      		 el0_svc_common+0x64/0x260
      		 do_el0_svc+0x90/0xf8
      		 el0_svc+0x20/0x30
      		 el0_sync_handler+0xb0/0xb8
      		 el0_sync+0x180/0x1c0
      	//<----
      
      	/* wait for all the updates to complete */
      	for_each_buffer_cpu(buffer, cpu) {
      		cpu_buffer = buffer->buffers[cpu];
      		//4. get cpu_buffer; cpu_buffer(B) is used from here on, so the
      		//state of both cpu_buffer(A) and cpu_buffer(B) is now wrong.
      		//For example, cpu_buffer(A)->update_done is left set to 1, so the
      		//next resize round will not block in 'wait_for_completion'.
      		if (!cpu_buffer->nr_pages_to_update)
      			continue;
      
      		if (cpu_online(cpu))
      			wait_for_completion(&cpu_buffer->update_done);
      		cpu_buffer->nr_pages_to_update = 0;
      	}
      	...
      }
      	//5. With cpu_buffer(A) and cpu_buffer(B) in the wrong state,
      	//execution continues and an oops eventually occurs.
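
One way to picture "do not swap during resize" is a guard that the swap path checks before touching the buffers. The sketch below is a user-space illustration under stated assumptions, not the actual patch: `struct demo_buffer`, the `resizing` counter, the `demo_*()` helpers, and the `-EBUSY` return value are all hypothetical names chosen for this example.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical, heavily simplified buffer with a single payload pointer. */
struct demo_buffer {
	atomic_int resizing;	/* non-zero while a resize is in flight */
	int *cpu_buf;
};

void demo_resize_begin(struct demo_buffer *b)
{
	atomic_fetch_add(&b->resizing, 1);
}

void demo_resize_end(struct demo_buffer *b)
{
	atomic_fetch_sub(&b->resizing, 1);
}

/* Swap payloads unless either side is mid-resize; otherwise refuse. */
int demo_swap(struct demo_buffer *a, struct demo_buffer *b)
{
	int *tmp;

	if (atomic_load(&a->resizing) || atomic_load(&b->resizing))
		return -EBUSY;	/* caller can record that the swap was missed */

	tmp = a->cpu_buf;
	a->cpu_buf = b->cpu_buf;
	b->cpu_buf = tmp;
	return 0;
}
```

The sketch shows only the guard. It makes no attempt to be safe against a resize that starts between the check and the swap, which real code would have to handle with its own locking or ordering rules.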
      
      Link: https://lore.kernel.org/linux-trace-kernel/202307191558478409990@zte.com.cn
      
      
      
      Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • tracing: Remove unused extern declaration tracing_map_set_field_descr() · 1faf7e4a
      YueHaibing authored
      Since commit 08d43a5f ("tracing: Add lock-free tracing_map"),
      this is never used, so can be removed.
      
      Link: https://lore.kernel.org/linux-trace-kernel/20230722032123.24664-1-yuehaibing@huawei.com
      
      
      
      Cc: <mhiramat@kernel.org>
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • kbuild: flatten KBUILD_CFLAGS · 0817d259
      Alexey Dobriyan authored
      
      
      Make it slightly easier to see which compiler options are added and
      removed (and not worry about column limit too!).
      
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Reviewed-by: Nicolas Schier <n.schier@avm.de>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    • gen_compile_commands: add assembly files to compilation database · 1c679214
      Benjamin Gray authored
      
      
      Like C source files, tooling can find it useful to have the assembly
      source file compilation recorded.
      
      The .S extension appears to be used across all architectures.
      
      Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
      Reviewed-by: Fangrui Song <maskray@google.com>
      Reviewed-by: Nathan Chancellor <nathan@kernel.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    • ext4: fix rbtree traversal bug in ext4_mb_use_preallocated · 9d3de7ee
      Ojaswin Mujoo authored
      During allocations, while looking for preallocations (PAs) in the per
      inode rbtree, we can't do a direct traversal of the tree because
      ext4_mb_discard_group_preallocation() can concurrently mark a PA deleted,
      and that can cause direct traversal to skip some entries. This was
      leading to a BUG_ON() being hit [1] when we missed a PA that could satisfy
      our request and ultimately tried to create a new PA that would overlap
      with the missed one.

      To make sure we handle that case while still keeping the performance of
      the rbtree, we make use of the fact that the only PA that could possibly
      overlap the original goal start is the one that satisfies the below
      conditions:

        1. It must have its logical start immediately to the left of
        (i.e. less than) the original logical start.

        2. It must not be deleted.

      To find this PA we use the following traversal method:
      
      1. Descend into the rbtree normally to find the immediate neighboring
      PA. Here we keep descending irrespective of if the PA is deleted or if
      it overlaps with our request etc. The goal is to find an immediately
      adjacent PA.
      
      2. If the found PA is to the right of the original goal, use rb_prev() to
      find the left adjacent PA.

      3. Check if this PA is deleted and keep moving left with rb_prev() until
      a non-deleted PA is found.
      
      4. This is the PA we are looking for. Now we can check if it can satisfy
      the original request and proceed accordingly.
      
      This approach also takes care of having deleted PAs in the tree.
      
      (While we are at it, also fix a possible overflow bug in calculating the
      end of a PA)
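
The four traversal steps above can be sketched on a simpler structure. The example below deliberately swaps the rbtree for a sorted array with binary search, so it illustrates the idea rather than the ext4 implementation. `struct demo_pa` and both functions are hypothetical, and the sketch assumes a PA that starts exactly at the goal also counts as a candidate.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified preallocation record; the array is sorted by lstart. */
struct demo_pa {
	uint32_t lstart;	/* logical start block */
	uint32_t len;		/* length in blocks */
	bool deleted;
};

/*
 * Return the index of the closest non-deleted PA whose lstart is <= goal,
 * or -1 if there is none.
 */
long demo_find_left_pa(const struct demo_pa *pa, size_t n, uint32_t goal)
{
	size_t lo = 0, hi = n;
	long i;

	/* Steps 1 and 2: find the last entry with lstart <= goal,
	 * ignoring the deleted flag while searching. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (pa[mid].lstart <= goal)
			lo = mid + 1;
		else
			hi = mid;
	}
	i = (long)lo - 1;

	/* Step 3: keep moving left past deleted entries. */
	while (i >= 0 && pa[i].deleted)
		i--;
	return i;
}

/* Step 4: check whether the PA covers the goal block. The end is computed
 * in 64 bits so that lstart + len cannot wrap around. */
bool demo_pa_covers(const struct demo_pa *pa, uint32_t goal)
{
	uint64_t end = (uint64_t)pa->lstart + pa->len;

	return goal >= pa->lstart && goal < end;
}
```

Searching without looking at the deleted flag and filtering afterwards is what keeps a concurrently deleted entry from hiding its left neighbor.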
      
      [1] https://lore.kernel.org/linux-ext4/CA+G9fYv2FRpLqBZf34ZinR8bU2_ZRAUOjKAD3+tKRFaEQHtt8Q@mail.gmail.com/
      
      Cc: stable@kernel.org # 6.4
      Fixes: 38727786 ("ext4: Use rbtrees to manage PAs instead of inode i_prealloc_list")
      Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
      Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
      Tested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
      Link: https://lore.kernel.org/r/edd2efda6a83e6343c5ace9deea44813e71dbe20.1690045963.git.ojaswin@linux.ibm.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: fix off by one issue in ext4_mb_choose_next_group_best_avail() · 5d5460fa
      Ojaswin Mujoo authored
      
      
      In ext4_mb_choose_next_group_best_avail(), we want the start order to be
      1 less than goal length and the min_order to be, at max, 1 more than the
      original length. This commit fixes an off by one issue that arose due to
      the fact that 1 << fls(n) > (n).
      
      After all the processing:
      
      order = 1 order below goal len
      min_order = maximum of the three:-
                   - order - trim_order
                   - 1 order below B2C(s_stripe)
                   - 1 order above original len
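
The inequality 1 << fls(n) > n is easy to check by hand. The kernel's fls() returns the 1-based index of the most significant set bit, and 0 for an input of 0. The function below is a portable user-space stand-in with the hypothetical name `demo_fls()`, written only to make the arithmetic concrete.

```c
/* Stand-in for fls(): 1-based index of the highest set bit, 0 for x == 0. */
int demo_fls(unsigned int x)
{
	int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}
```

For n = 8, `demo_fls(8)` is 4, so 1 << 4 = 16 is larger than 8, while 1 << (4 - 1) = 8 is not. For n = 5, `demo_fls(5)` is 3, giving 8 > 5 and 1 << 2 = 4 <= 5. In general, fls(n) - 1 is the largest order whose size 1 << order does not exceed n, which is why using fls(n) directly lands one order too high.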
      
      Cc: stable@kernel.org
      Fixes: 33122aa930 ("ext4: Add allocation criteria 1.5 (CR1_5)")
      Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
      Link: https://lore.kernel.org/r/20230609103403.112807-1-ojaswin@linux.ibm.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: correct inline offset when handling xattrs in inode body · 6909cf5c
      Eric Whitney authored
      
      
      When run on a file system where the inline_data feature has been
      enabled, xfstests generic/269, generic/270, and generic/476 cause ext4
      to emit error messages indicating that inline directory entries are
      corrupted.  This occurs because the inline offset used to locate
      inline directory entries in the inode body is not updated when an
      xattr in that shared region is deleted and the region is shifted in
      memory to recover the space it occupied.  If the deleted xattr precedes
      the system.data attribute, which points to the inline directory entries,
      that attribute will be moved further up in the region.  The inline
      offset continues to point to whatever is located in system.data's former
      location, with unfortunate effects when used to access directory entries
      or (presumably) inline data in the inode body.
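
The stale-offset problem described above is generic to any packed region with a cached offset into it. The following user-space sketch is an illustration only: `struct demo_region`, `tracked_off`, and `demo_remove()` are hypothetical, and the actual ext4 fix may refresh the offset in a different way (for example by looking the attribute up again).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flat region holding back-to-back entries. */
struct demo_region {
	unsigned char data[64];
	size_t used;		/* bytes currently in use */
	size_t tracked_off;	/* cached offset of the entry we care about */
};

/*
 * Remove len bytes at off and shift the tail down to reclaim the space.
 * If the tracked entry sat after the hole, its cached offset must move too.
 */
void demo_remove(struct demo_region *r, size_t off, size_t len)
{
	memmove(r->data + off, r->data + off + len, r->used - off - len);
	r->used -= len;
	if (r->tracked_off > off)
		r->tracked_off -= len;	/* without this, the offset goes stale */
}
```

If the adjustment in the last `if` is skipped, `tracked_off` keeps pointing at the entry's former location, which is the failure mode the commit message describes.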
      
      Cc: stable@kernel.org
      Signed-off-by: Eric Whitney <enwlinux@gmail.com>
      Link: https://lore.kernel.org/r/20230522181520.1570360-1-enwlinux@gmail.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • Merge tag 'powerpc-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · c2782531
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Reinstate support for little endian ELFv1 binaries, which it turns
         out still exist in the wild.
      
       - Revert a change which used asm goto for WARN_ON/__WARN_FLAGS, as it
         led to dead code generation and seemed to trigger compiler bugs in
         some edge cases.
      
       - Fix a deadlock in the pseries VAS code, between live migration and
         the driver's mmap handler.
      
       - Disable KCOV instrumentation in the powerpc KASAN code.
      
      Thanks to Andrew Donnellan, Benjamin Gray, Christophe Leroy, Haren
      Myneni, Russell Currey, and Uwe Kleine-König.
      
      * tag 'powerpc-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        Revert "powerpc/64s: Remove support for ELFv1 little endian userspace"
        powerpc/kasan: Disable KCOV in KASAN code
        powerpc/512x: lpbfifo: Convert to platform remove callback returning void
        powerpc/crypto: Add gitignore for generated P10 AES/GCM .S files
        Revert "powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto"
        powerpc/pseries/vas: Hold mmap_mutex after mmap lock during window close
    • cifs: update internal module version number for cifs.ko · ba61a03a
      Steve French authored
      
      
      From 2.43 to 2.44
      
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • cifs: allow dumping keys for directories too · b3edef6b
      Shyam Prasad N authored
      
      
      Dumping the enc/dec keys is a session-wide operation,
      and it should not matter whether the ioctl was run on
      a regular file or a directory.
      
      Currently, we obtain the tcon pointer from the
      cifs file handle. But since there's no dir open call
      in cifs, this is not populated for dirs.
      
      This change allows dumping of session keys using ioctl
      even for directories. To do this, we'll now get the
      tcon pointer from the superblock, and not from the file
      handle.
      
      Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • Merge tag 's390-6.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 295e1388
      Linus Torvalds authored
      Pull s390 fixes from Heiko Carstens:
      
       - Fix per vma lock fault handling: add missing !(fault & VM_FAULT_ERROR)
         check to fault handler to prevent error handling for return values
         that don't indicate an error
      
       - Use kfree_sensitive() instead of kfree() in paes crypto code to clear
         memory that may contain keys before freeing it
      
       - Fix reply buffer size calculation for CCA replies in zcrypt device
         driver
      
      * tag 's390-6.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/zcrypt: fix reply buffer calculations for CCA replies
        s390/crypto: use kfree_sensitive() instead of kfree()
        s390/mm: fix per vma lock fault handling
    • Merge tag 'block-6.5-2023-07-21' of git://git.kernel.dk/linux · f036d67c
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - Fix for loop regressions (Mauricio)
      
       - Fix a potential stall with batched wakeups in sbitmap (David)
      
       - Fix for stall with recursive plug flushes (Ross)
      
       - Skip accounting of empty requests for blk-iocost (Chengming)
      
       - Remove a dead field in struct blk_mq_hw_ctx (Chengming)
      
      * tag 'block-6.5-2023-07-21' of git://git.kernel.dk/linux:
        loop: do not enforce max_loop hard limit by (new) default
        loop: deprecate autoloading callback loop_probe()
        sbitmap: fix batching wakeup
        blk-iocost: skip empty flush bio in iocost
        blk-mq: delete dead struct blk_mq_hw_ctx->queued field
        blk-mq: Fix stall due to recursive flush plug
    • Merge tag 'io_uring-6.5-2023-07-21' of git://git.kernel.dk/linux · bdd1d82e
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
      
       - Fix for io-wq not always honoring REQ_F_NOWAIT, if it was set and
         punted directly (eg via DRAIN) (me)
      
       - Capability check fix (Ondrej)
      
       - Regression fix for the mmap changes that went into 6.4, which
         apparently broke IA64 (Helge)
      
      * tag 'io_uring-6.5-2023-07-21' of git://git.kernel.dk/linux:
        ia64: mmap: Consider pgoff when searching for free mapping
        io_uring: Fix io_uring mmap() by using architecture-provided get_unmapped_area()
        io_uring: treat -EAGAIN for REQ_F_NOWAIT as final for io-wq
        io_uring: don't audit the capability check in io_uring_create()
    • Merge tag 'devicetree-fixes-for-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · 725d444d
      Linus Torvalds authored
      Pull devicetree fixes from Rob Herring:
      
       - Fix moortec,mr75203 schema usage of 'multipleOf' keyword
      
       - Fix regression in systems depending on "of-display" device name
      
       - Build fix for s390 with CONFIG_PCI=n and OF_EARLY_FLATTREE=y
      
       - Drop two obsolete serial .txt bindings
      
      * tag 'devicetree-fixes-for-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        dt-bindings: serial: Remove obsolete nxp,lpc1850-uart.txt
        dt-bindings: serial: Remove obsolete cavium-uart.txt
        dt-bindings: hwmon: moortec,mr75203: fix multipleOf for coefficients
        of: Preserve "of-display" device name for compatibility
        of: make OF_EARLY_FLATTREE depend on HAS_IOMEM
    • Merge tag 'regmap-fix-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap · 39b14286
      Linus Torvalds authored
      Pull regmap fixes from Mark Brown:
       "Three fixes here:
      
         - The issues with accounting for register and padding length on raw
           buses turn out to be quite widespread in custom buses.
      
           In order to avoid disturbing anything drop the initial fixes and
           fall back to a point fix in the SMBus code where the issue was
           originally noticed, a more substantial refactoring of the API which
           ensures that all buses make the same assumptions will follow.
      
         - The generic regcache code had been forcing on async I/O which did
           not work with the new maple tree sync code when used with SPI.
      
           Since that was mainly for the rbtree cache and the assumptions
           about hardware that drove the choice are probably not true any more
           fix this by pushing the enablement of async down into the rbtree
           code.
      
           This probably also makes cache syncs for systems faster though it's
           not the point.
      
         - The test code was triggering use of the rbtree and maple tree
           caches with dynamic allocation of nodes since all the testing is
           with RAM backed caches with no I/O performance issues.
      
           Just disable the locking in the tests to avoid triggering warnings
           when allocation debugging is turned on, it's not really what's
           being tested"
      
      * tag 'regmap-fix-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
        regmap: Disable locking for RBTREE and MAPLE unit tests
        regcache: Push async I/O request down into the rbtree cache
        regmap: Account for register length in SMBus I/O limits
        regmap: Drop initial version of maximum transfer length fixes
      39b14286
    • Merge tag 'gpio-fixes-for-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · c0842db5
      Linus Torvalds authored
      Pull gpio fixes from Bartosz Golaszewski:
      
       - fix initial value handling for output-only pins in gpio-tps68470
      
       - fix two resource leaks in gpio-mvebu
      
      * tag 'gpio-fixes-for-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
        gpio: mvebu: fix irq domain leak
        gpio: mvebu: Make use of devm_pwmchip_add
        gpio: tps68470: Make tps68470_gpio_output() always set the initial value
      c0842db5
  4. Jul 22, 2023
    • dt-bindings: serial: Remove obsolete nxp,lpc1850-uart.txt · ffc59c64
      Rob Herring authored
      
      
      nxp,lpc1850-uart.txt binding is already covered by 8250.yaml, so remove
      it.
      
      Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
      Link: https://lore.kernel.org/r/20230707221607.1064888-1-robh@kernel.org
      
      
      Signed-off-by: Rob Herring <robh@kernel.org>
      ffc59c64
    • dt-bindings: serial: Remove obsolete cavium-uart.txt · 5921181c
      Rob Herring authored
      
      
      cavium-uart.txt binding is already covered by 8250.yaml, so remove it.
      
      Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
      Link: https://lore.kernel.org/r/20230707221602.1063972-1-robh@kernel.org
      
      
      Signed-off-by: Rob Herring <robh@kernel.org>
      5921181c
    • loop: do not enforce max_loop hard limit by (new) default · bb5faa99
      Mauricio Faria de Oliveira authored
      Problem:
      
      The max_loop parameter is used for 2 different purposes:
      
      1) initial number of loop devices to pre-create on init
      2) maximum number of loop devices to add on access/open()
      
      Historically, its default value (zero) caused 1) to create non-zero
      number of devices (CONFIG_BLK_DEV_LOOP_MIN_COUNT), and no hard limit on
      2) to add devices with autoloading.
      
      However, the default value changed in commit 85c50197 ("loop: Fix
      the max_loop commandline argument treatment when it is set to 0") to
      CONFIG_BLK_DEV_LOOP_MIN_COUNT, so that max_loop=0 does not pre-create devices.
      
      That does improve 1), but unfortunately it breaks 2), as the default
      behavior changed from no-limit to hard-limit.
      
      Example:
      
      This userspace code broke for N >= CONFIG if the user relied on the
      default value 0 for max_loop:
      
          mknod("/dev/loopN");
          open("/dev/loopN");  // now fails with ENXIO
      
      Though affected users may "fix" it with (loop.)max_loop=0, this would
      require a kernel parameter change on a stable kernel update (that commit
      has a Fixes: tag for an old commit in stable).
      
      Solution:
      
      The original semantics for the default value in 2) can be applied if the
      parameter is not set (ie, default behavior).
      
      This still keeps the intended function in 1) and 2) if set, and that
      commit's intended improvement in 1) if max_loop=0.
      
      Before 85c50197:
        - default:     1) CONFIG devices   2) no limit
        - max_loop=0:  1) CONFIG devices   2) no limit
        - max_loop=X:  1) X devices        2) X limit
      
      After 85c50197:
        - default:     1) CONFIG devices   2) CONFIG limit (*)
        - max_loop=0:  1) 0 devices (*)    2) no limit
        - max_loop=X:  1) X devices        2) X limit
      
      This commit:
        - default:     1) CONFIG devices   2) no limit (*)
        - max_loop=0:  1) 0 devices        2) no limit
        - max_loop=X:  1) X devices        2) X limit
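The three tables above can be read as a small decision function. The following is a minimal sketch of that reading, not the driver code: all names (`loop_policy`, `LOOP_MIN_COUNT`, `NO_LIMIT`, `param_set`) are hypothetical, `LOOP_MIN_COUNT` stands in for CONFIG_BLK_DEV_LOOP_MIN_COUNT, and CONFIG_BLOCK_LEGACY_AUTOLOAD=y is assumed.

```c
#include <assert.h>
#include <stdbool.h>

#define LOOP_MIN_COUNT 8   /* stand-in for CONFIG_BLK_DEV_LOOP_MIN_COUNT */
#define NO_LIMIT (-1)      /* hypothetical marker for "no hard limit" */

struct loop_policy {
    int precreated;  /* purpose 1): devices created on init */
    int hard_limit;  /* purpose 2): max devices on open(), or NO_LIMIT */
};

/* Behaviour after 85c50197: the default gained a hard limit (regression). */
static struct loop_policy loop_policy_regressed(bool param_set, int max_loop)
{
    struct loop_policy p;

    assert(max_loop >= 0);
    if (!param_set)
        max_loop = LOOP_MIN_COUNT;  /* default became CONFIG, limit included */
    p.precreated = max_loop;
    p.hard_limit = max_loop ? max_loop : NO_LIMIT;
    return p;
}

/* Behaviour with this commit: the limit applies only if max_loop was set. */
static struct loop_policy loop_policy_patched(bool param_set, int max_loop)
{
    struct loop_policy p;

    assert(max_loop >= 0);
    if (!param_set) {
        p.precreated = LOOP_MIN_COUNT;
        p.hard_limit = NO_LIMIT;
        return p;
    }
    p.precreated = max_loop;
    p.hard_limit = max_loop ? max_loop : NO_LIMIT;
    return p;
}

/* Would open("/dev/loopN") be allowed to add device n on access? */
static bool loop_autoload_allowed(struct loop_policy p, int n)
{
    return p.hard_limit == NO_LIMIT || n < p.hard_limit;
}
```

With these definitions, `loop_autoload_allowed(loop_policy_patched(false, 0), 8)` is true while the same query against `loop_policy_regressed` is false, which corresponds to the rows marked (*) in the tables.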
      
      Future:
      
      The issue/regression from that commit only affects code under the
      CONFIG_BLOCK_LEGACY_AUTOLOAD deprecation guard, thus the fix too is
      contained under it.
      
      Once that deprecated functionality/code is removed, the purpose 2) of
      max_loop (hard limit) is no longer in use, so the module parameter
      description can be changed then.
      
      Tests:
      
      Linux 6.4-rc7
      CONFIG_BLK_DEV_LOOP_MIN_COUNT=8
      CONFIG_BLOCK_LEGACY_AUTOLOAD=y
      
      - default (original)
      
      	# ls -1 /dev/loop*
      	/dev/loop-control
      	/dev/loop0
      	...
      	/dev/loop7
      
      	# ./test-loop
      	open: /dev/loop8: No such device or address
      
      - default (patched)
      
      	# ls -1 /dev/loop*
      	/dev/loop-control
      	/dev/loop0
      	...
      	/dev/loop7
      
      	# ./test-loop
      	#
      
      - max_loop=0 (original & patched):
      
      	# ls -1 /dev/loop*
      	/dev/loop-control
      
      	# ./test-loop
      	#
      
      - max_loop=8 (original & patched):
      
      	# ls -1 /dev/loop*
      	/dev/loop-control
      	/dev/loop0
      	...
      	/dev/loop7
      
      	# ./test-loop
      	open: /dev/loop8: No such device or address
      
      - max_loop=0 (patched; CONFIG_BLOCK_LEGACY_AUTOLOAD is not set)
      
      	# ls -1 /dev/loop*
      	/dev/loop-control
      
      	# ./test-loop
      	open: /dev/loop8: No such device or address
      
      Fixes: 85c50197 ("loop: Fix the max_loop commandline argument treatment when it is set to 0")
      Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20230720143033.841001-3-mfo@canonical.com
      
      
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      bb5faa99
    • loop: deprecate autoloading callback loop_probe() · 23881aec
      Mauricio Faria de Oliveira authored
      The 'probe' callback in __register_blkdev() is only used under the
      CONFIG_BLOCK_LEGACY_AUTOLOAD deprecation guard.
      
      The loop_probe() function is only used for that callback, so guard it
      too, accordingly.
      
      See commit fbdee71b ("block: deprecate autoloading based on dev_t").
      
      Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20230720143033.841001-2-mfo@canonical.com
      
      
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      23881aec
    • sbitmap: fix batching wakeup · 10639737
      David Jeffery authored
      
      
      The current code assumes that waking up just one wait queue after each
      completion batch is enough to guarantee forward progress.

      Unfortunately that is not enough, because a waiter can be added to a
      wait queue just after that queue has been woken up.

      Here is one example (depth 64, wake_batch 8):
      
      1) all 64 tags are active
      
      2) in each wait queue, there is only one single waiter
      
      3) each time one completion batch (8 completions) wakes up just one
         waiter in each wait queue, then immediately one new sleeper is added
         to this wait queue

      4) after 64 completions, 8 waiters are woken up, and there are still 8
         waiters in each wait queue

      5) after another 8 active tags are completed, only one waiter can be
         woken up, and the other 7 can never be woken up.

      It turns out this problem isn't easy to fix, so simply wake up enough
      waiters for a single batch.
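The difference between "wake one wait queue per batch" and "wake enough waiters for a batch" can be illustrated with a userspace toy model. This is a sketch under stated assumptions, not the kernel implementation: `toy_queue_set`, `toy_wake_up_nr`, `wake_batch_old` and `wake_batch_fixed` are hypothetical names, and a wait queue is reduced to a counter of sleeping waiters.

```c
#include <assert.h>

#define NR_WAIT_QUEUES 8   /* assumed number of wait queues in this model */

/* Toy model: each wait queue only tracks how many waiters sleep on it. */
struct toy_queue_set {
    int waiters[NR_WAIT_QUEUES];
    int wake_index;            /* next queue to try, round-robin */
};

/* Wake up to nr waiters on one queue; return how many were woken. */
static int toy_wake_up_nr(int *waiters, int nr)
{
    int woken = *waiters < nr ? *waiters : nr;

    *waiters -= woken;
    return woken;
}

/* Old idea: stop at the first non-empty queue, however few it wakes. */
static int wake_batch_old(struct toy_queue_set *qs, int nr)
{
    assert(nr > 0);
    for (int i = 0; i < NR_WAIT_QUEUES; i++) {
        int *w = &qs->waiters[qs->wake_index];

        qs->wake_index = (qs->wake_index + 1) % NR_WAIT_QUEUES;
        if (*w > 0)
            return toy_wake_up_nr(w, nr);
    }
    return 0;
}

/* Fixed idea: keep visiting queues until nr waiters have been woken. */
static int wake_batch_fixed(struct toy_queue_set *qs, int nr)
{
    int total = 0;

    assert(nr > 0);
    for (int i = 0; i < NR_WAIT_QUEUES && total < nr; i++) {
        int *w = &qs->waiters[qs->wake_index];

        qs->wake_index = (qs->wake_index + 1) % NR_WAIT_QUEUES;
        total += toy_wake_up_nr(w, nr - total);
    }
    return total;
}
```

With one waiter per queue and a batch of 8, the old variant wakes a single waiter for the whole batch, while the fixed variant walks all queues and wakes all 8.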
      
      Cc: Kemeng Shi <shikemeng@huaweicloud.com>
      Cc: Chengming Zhou <zhouchengming@bytedance.com>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: David Jeffery <djeffery@redhat.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Link: https://lore.kernel.org/r/20230721095715.232728-1-ming.lei@redhat.com
      
      
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      10639737
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · d192f538
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "I've picked up a handful of arm64 fixes while Catalin's been away, so
        here they are. Below is the usual summary, but we basically have
        two cleanups, a fix for an SME crash and a fix for hibernation:
      
         - Fix saving of SME state after SVE vector length is changed
      
         - Fix sparse warnings for missing vDSO function prototypes
      
         - Fix hibernation resume path when kfence is enabled
      
         - Fix field names for the HFGxTR_EL2 register"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64/fpsimd: Ensure SME storage is allocated after SVE VL changes
        arm64: vdso: Clear common make C=2 warnings
        arm64: mm: Make hibernation aware of KFENCE
        arm64: Fix HFGxTR_EL2 field naming
      d192f538
    • Merge tag 'pm-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 892d7c1b
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "Revert three recent intel_idle commits that introduced a functional
        issue, included a coding mistake and have been questioned at the
        design level"
      
      * tag 'pm-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        Revert "intel_idle: Add support for using intel_idle in a VM guest using just hlt"
        Revert "intel_idle: Add a "Long HLT" C1 state for the VM guest mode"
        Revert "intel_idle: Add __init annotation to matchup_vm_state_with_baremetal()"
      892d7c1b
    • Merge tag 'sound-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 3c05547a
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A pile of fixes that have been gathered since the previous pull. Most
        of changes are device-specific, and nothing looks too scary.
      
         - A memory leak fix in ALSA sequencer code in 6.5-rc
      
         - Many fixes for ASoC Qualcomm CODEC drivers, covering SoundWire
           probe problems
      
         - A series of ASoC AMD fixes
      
         - A few fixes and cleanups of selftest stuff
      
         - HD-audio codec fixes and quirks for Clevo, HP, Lenovo, Dell"
      
      * tag 'sound-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (52 commits)
        ALSA: hda/realtek: Add support for DELL Oasis 13/14/16 laptops
        ALSA: hda/realtek: Fix generic fixup definition for cs35l41 amp
        ALSA: hda/realtek: Enable Mute LED on HP Laptop 15s-eq2xxx
        selftests: ALSA: Add test-pcmtest-driver to .gitignore
        ALSA: hda/realtek: Add quirk for Clevo NS70AU
        ASoC: fsl_sai: Disable bit clock with transmitter
        ALSA: seq: Fix memory leak at error path in snd_seq_create_port()
        ASoC: SOF: ipc3-dtrace: uninitialized data in dfsentry_trace_filter_write()
        ASoC: cs42l51: fix driver to properly autoload with automatic module loading
        MAINTAINERS: Redo addition of ssm3515 to APPLE SOUND
        ASoC: rt5640: Fix the issue of speaker noise
        ALSA: hda/realtek - remove 3k pull low procedure
        selftests: ALSA: Fix fclose on an already fclosed file pointer
        ALSA: pcmtest: Don't use static storage to track per device data
        ALSA: pcmtest: Convert to platform remove callback returning void
        ASoC: dt-bindings: audio-graph-card2: Drop incomplete example
        ASoC: dt-bindings: Update maintainer email id
        ASoC: amd: ps: Fix extraneous error messages
        ASoC: fsl_sai: Revert "ASoC: fsl_sai: Enable MCTL_MCLK_EN bit for master mode"
        ASoC: codecs: SND_SOC_WCD934X should select REGMAP_IRQ
        ...
      3c05547a
    • Merge tag 'fbdev-for-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev · 55c225fb
      Linus Torvalds authored
      Pull fbdev fixes and cleanups from Helge Deller:
       "Just the usual bunch of code cleanups in various drivers, this time
        mostly in vgacon and imxfb:
      
         - Code cleanup in vgacon (Jiri Slaby)
      
         - Explicitly include correct DT includes (Rob Herring)
      
         - imxfb code cleanup (Yangtao Li, Martin Kaiser)
      
         - kyrofb: make arrays const and smaller (Colin Ian King)
      
         - ep93xx-fb: return value check fix (Yuanjun Gong)
      
         - au1200fb: add missing IRQ check (Zhang Shurong)"
      
      * tag 'fbdev-for-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
        fbdev: Explicitly include correct DT includes
        fbdev: ep93xx-fb: fix return value check in ep93xxfb_probe
        fbdev: au1200fb: Fix missing IRQ check in au1200fb_drv_probe
        fbdev: kyro: make some const read-only arrays static and reduce type size
        fbcon: remove unused display (p) from fbcon_redraw()
        sticon: make sticon_set_def_font() void and remove op parameter
        vgacon: cache vc_cell_height in vgacon_cursor()
        vgacon: let vgacon_doresize() return void
        vgacon: remove unused xpos from vgacon_set_cursor_size()
        vgacon: remove unneeded forward declarations
        vgacon: switch vgacon_scrolldelta() and vgacon_restore_screen()
        fbdev: imxfb: remove unneeded labels
        fbdev: imxfb: Convert to devm_platform_ioremap_resource()
        fbdev: imxfb: Convert to devm_kmalloc_array()
        fbdev: imxfb: Removed unneeded release_mem_region
        fbdev: imxfb: switch to DEFINE_SIMPLE_DEV_PM_OPS
        fbdev: imxfb: warn about invalid left/right margin
      55c225fb
    • drm/atomic: Fix potential use-after-free in nonblocking commits · 4e076c73
      Daniel Vetter authored
      
      
      This requires a bit of background.  Done properly, a modeset driver's
      unload/remove sequence should be
      
      	drm_dev_unplug();
      	drm_atomic_helper_shutdown();
      	drm_dev_put();
      
      The trouble is that the drm_dev_unplugged() checks are racy by design:
      they do not synchronize against all outstanding ioctls.  This is because
      those ioctls could block forever (both for modeset and for driver
      specific ioctls), leading to deadlocks in hotunplug.  Instead the code
      sections that touch the hardware need to be annotated with
      drm_dev_enter/exit, to avoid accessing hardware resources after the
      unload/remove has finished.
      
      To avoid use-after-free issues all the involved userspace visible
      objects are supposed to hold a reference on the underlying drm_device,
      like drm_file does.
      
      The issue now is that we missed one, the atomic modeset ioctl can be run
      in a nonblocking fashion, and in that case it cannot rely on the implied
      drm_device reference provided by the ioctl calling context.  This can
      result in a use-after-free if a nonblocking atomic commit is carefully
      raced against a driver unload.
      
      Fix this by unconditionally grabbing a drm_device reference for any
      drm_atomic_state structures.  Strictly speaking this isn't required for
      blocking commits and TEST_ONLY calls, but it's the simpler approach.
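The idea of having every state object pin its device can be sketched with a plain reference count in userspace. This is only an illustrative model, not the DRM code: the `toy_*` types and functions are hypothetical, and the `freed` flag exists purely so the final release can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a refcounted device. */
struct toy_device {
    int refcount;
    bool *freed;   /* set to true when the last reference is dropped */
};

/* Toy stand-in for a state object that may outlive the ioctl call. */
struct toy_state {
    struct toy_device *dev;
};

static struct toy_device *toy_dev_alloc(bool *freed_flag)
{
    struct toy_device *dev = malloc(sizeof(*dev));

    if (!dev)
        return NULL;
    dev->refcount = 1;         /* reference held by the driver */
    dev->freed = freed_flag;
    *freed_flag = false;
    return dev;
}

static void toy_dev_get(struct toy_device *dev)
{
    assert(dev->refcount > 0);
    dev->refcount++;
}

static void toy_dev_put(struct toy_device *dev)
{
    assert(dev->refcount > 0);
    if (--dev->refcount == 0) {
        *dev->freed = true;
        free(dev);
    }
}

/* Every state unconditionally takes its own device reference. */
static struct toy_state *toy_state_alloc(struct toy_device *dev)
{
    struct toy_state *state = malloc(sizeof(*state));

    if (!state)
        return NULL;
    toy_dev_get(dev);
    state->dev = dev;
    return state;
}

/* Drop the device reference only after the state itself is gone. */
static void toy_state_free(struct toy_state *state)
{
    struct toy_device *dev = state->dev;

    free(state);
    toy_dev_put(dev);
}
```

In this model, a driver unload that drops its own reference while a state is still in flight leaves the device alive until `toy_state_free()` runs, which is the property the commit relies on for nonblocking commits.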
      
      Thanks to shanzhulig for the initial idea of grabbing an unconditional
      reference, I just added comments, a condensed commit message and fixed a
      minor potential issue in where exactly we drop the final reference.
      
      Reported-by: shanzhulig <shanzhulig@gmail.com>
      Suggested-by: shanzhulig <shanzhulig@gmail.com>
      Reviewed-by: Maxime Ripard <mripard@kernel.org>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Thomas Zimmermann <tzimmermann@suse.de>
      Cc: David Airlie <airlied@gmail.com>
      Cc: stable@kernel.org
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4e076c73
  5. Jul 21, 2023