Skip to content
  1. Jan 22, 2024
    • Nathan Chancellor's avatar
      platform/x86: intel-uncore-freq: Fix types in sysfs callbacks · 416de024
      Nathan Chancellor authored
      When booting a kernel with CONFIG_CFI_CLANG, there is a CFI failure when
      accessing any of the values under
      /sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00:
      
        $ cat /sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00/max_freq_khz
        fish: Job 1, 'cat /sys/devices/system/cpu/int…' terminated by signal SIGSEGV (Address boundary error)
      
        $ sudo dmesg &| grep 'CFI failure'
        [  170.953925] CFI failure at kobj_attr_show+0x19/0x30 (target: show_max_freq_khz+0x0/0xc0 [intel_uncore_frequency_common]; expected type: 0xd34078c5
      
      The sysfs callback functions such as show_domain_id() are written as if
      they are going to be called by dev_attr_show() but as the above message
      shows, they are instead called by kobj_attr_show(). kCFI checks that the
      destination of an indirect jump has the exact same type as the prototype
      of the function pointer it is called through and fails when they do not.
      
      These callbacks are called through kobj_attr_show() because
      uncore_root_kobj was initialized with kobject_create_and_add(), which
      means uncore_root_kobj has a ->sysfs_ops of kobj_sysfs_ops from
      kobject_create(), which uses kobj_attr_show() as its ->show() value.
      
      The only reason there has not been a more noticeable problem until this
      point is that 'struct kobj_attribute' and 'struct device_attribute' have
      the same layout, so getting the callback from container_of() works the
      same with either value.
      
      Change all the callbacks and their uses to be compatible with
      kobj_attr_show() and kobj_attr_store(), which resolves the kCFI failure
      and allows the sysfs files to work properly.
      
      Closes: https://github.com/ClangBuiltLinux/linux/issues/1974
      Fixes: ae7b2ce5
      
       ("platform/x86/intel/uncore-freq: Use sysfs API to create attributes")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarNathan Chancellor <nathan@kernel.org>
      Reviewed-by: default avatarSami Tolvanen <samitolvanen@google.com>
      Acked-by: default avatarSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Link: https://lore.kernel.org/r/20240104-intel-uncore-freq-kcfi-fix-v1-1-bf1e8939af40@kernel.org
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      416de024
    • Dan Carpenter's avatar
      platform/x86: wmi: Fix wmi_dev_probe() · 8446f9d1
      Dan Carpenter authored
      This has a reversed if statement so it accidentally disables the wmi
      method before returning.
      
      Fixes: 704af3a4
      
       ("platform/x86: wmi: Remove chardev interface")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@linaro.org>
      Reviewed-by: default avatarArmin Wolf <W_Armin@gmx.de>
      Link: https://lore.kernel.org/r/9c81251b-bc87-4ca3-bb86-843dc85e5145@moroto.mountain
      Reviewed-by: default avatarHans de Goede <hdegoede@redhat.com>
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      8446f9d1
    • Armin Wolf's avatar
      platform/x86: wmi: Fix notify callback locking · 29e473f4
      Armin Wolf authored
      When an legacy WMI event handler is removed, an WMI event could
      have called the handler just before it was removed, meaning the
      handler could still be running after wmi_remove_notify_handler()
      returns.
      Something similar could also happens when using the WMI bus, as
      the WMI core might still call the notify() callback from an WMI
      driver even if its remove() callback was just called.
      
      Fix this by introducing a rw semaphore which ensures that the
      event state of a WMI device does not change while the WMI core
      is handling an event for it.
      
      Tested on a Dell Inspiron 3505 and a Acer Aspire E1-731.
      
      Fixes: 1686f544
      
       ("platform/x86: wmi: Incorporate acpi_install_notify_handler")
      Signed-off-by: default avatarArmin Wolf <W_Armin@gmx.de>
      Reviewed-by: default avatarIlpo Järvinen <ilpo.jarvinen@linux.intel.com>
      Reviewed-by: default avatarHans de Goede <hdegoede@redhat.com>
      Link: https://lore.kernel.org/r/20240103192707.115512-5-W_Armin@gmx.de
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      29e473f4
    • Armin Wolf's avatar
      platform/x86: wmi: Decouple legacy WMI notify handlers from wmi_block_list · 3ea7f59a
      Armin Wolf authored
      
      
      Until now, legacy WMI notify handler functions where using the
      wmi_block_list, which did no refcounting on the returned WMI device.
      This meant that the WMI device could disappear at any moment,
      potentially leading to various errors.
      Fix this by using bus_find_device() which returns an actual
      reference to the found WMI device.
      
      Tested on a Dell Inspiron 3505 and a Acer Aspire E1-731.
      
      Signed-off-by: default avatarArmin Wolf <W_Armin@gmx.de>
      Reviewed-by: default avatarIlpo Järvinen <ilpo.jarvinen@linux.intel.com>
      Reviewed-by: default avatarHans de Goede <hdegoede@redhat.com>
      Link: https://lore.kernel.org/r/20240103192707.115512-4-W_Armin@gmx.de
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      3ea7f59a
    • Armin Wolf's avatar
      platform/x86: wmi: Return immediately if an suitable WMI event is found · 3d8a29fe
      Armin Wolf authored
      Commit 58f6425e
      
       ("WMI: Cater for multiple events with same GUID")
      allowed legacy WMI notify handlers to be installed for multiple WMI
      devices with the same GUID.
      However this is useless since the legacy GUID-based interface is
      blacklisted from seeing WMI devices with duplicated GUIDs.
      
      Return immediately if a suitable WMI event is found in
      wmi_install/remove_notify_handler() since searching for other suitable
      events is pointless.
      
      Tested on a Dell Inspiron 3505 and a Acer Aspire E1-731.
      
      Signed-off-by: default avatarArmin Wolf <W_Armin@gmx.de>
      Reviewed-by: default avatarIlpo Järvinen <ilpo.jarvinen@linux.intel.com>
      Reviewed-by: default avatarHans de Goede <hdegoede@redhat.com>
      Link: https://lore.kernel.org/r/20240103192707.115512-3-W_Armin@gmx.de
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      3d8a29fe
    • Armin Wolf's avatar
      platform/x86: wmi: Fix error handling in legacy WMI notify handler functions · 6ba7843b
      Armin Wolf authored
      When wmi_install_notify_handler()/wmi_remove_notify_handler() are
      unable to enable/disable the WMI device, they unconditionally return
      an error to the caller.
      When registering legacy WMI notify handlers, this means that the
      callback remains registered despite wmi_install_notify_handler()
      having returned an error.
      When removing legacy WMI notify handlers, this means that the
      callback is removed despite wmi_remove_notify_handler() having
      returned an error.
      
      Fix this by only warning when the WMI device could not be enabled.
      This behaviour matches the bus-based WMI interface.
      
      Tested on a Dell Inspiron 3505 and a Acer Aspire E1-731.
      
      Fixes: 58f6425e
      
       ("WMI: Cater for multiple events with same GUID")
      Signed-off-by: default avatarArmin Wolf <W_Armin@gmx.de>
      Reviewed-by: default avatarIlpo Järvinen <ilpo.jarvinen@linux.intel.com>
      Reviewed-by: default avatarHans de Goede <hdegoede@redhat.com>
      Link: https://lore.kernel.org/r/20240103192707.115512-2-W_Armin@gmx.de
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      6ba7843b
    • Linus Torvalds's avatar
      Linux 6.8-rc1 · 6613476e
      Linus Torvalds authored
      v6.8-rc1
      6613476e
    • Linus Torvalds's avatar
      Merge tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs · 35a4474b
      Linus Torvalds authored
      Pull more bcachefs updates from Kent Overstreet:
       "Some fixes, Some refactoring, some minor features:
      
         - Assorted prep work for disk space accounting rewrite
      
         - BTREE_TRIGGER_ATOMIC: after combining our trigger callbacks, this
           makes our trigger context more explicit
      
         - A few fixes to avoid excessive transaction restarts on
           multithreaded workloads: fstests (in addition to ktest tests) are
           now checking slowpath counters, and that's shaking out a few bugs
      
         - Assorted tracepoint improvements
      
         - Starting to break up bcachefs_format.h and move on disk types so
           they're with the code they belong to; this will make room to start
           documenting the on disk format better.
      
         - A few minor fixes"
      
      * tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs: (46 commits)
        bcachefs: Improve inode_to_text()
        bcachefs: logged_ops_format.h
        bcachefs: reflink_format.h
        bcachefs; extents_format.h
        bcachefs: ec_format.h
        bcachefs: subvolume_format.h
        bcachefs: snapshot_format.h
        bcachefs: alloc_background_format.h
        bcachefs: xattr_format.h
        bcachefs: dirent_format.h
        bcachefs: inode_format.h
        bcachefs; quota_format.h
        bcachefs: sb-counters_format.h
        bcachefs: counters.c -> sb-counters.c
        bcachefs: comment bch_subvolume
        bcachefs: bch_snapshot::btime
        bcachefs: add missing __GFP_NOWARN
        bcachefs: opts->compression can now also be applied in the background
        bcachefs: Prep work for variable size btree node buffers
        bcachefs: grab s_umount only if snapshotting
        ...
      35a4474b
    • Linus Torvalds's avatar
      Merge tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4fbbed78
      Linus Torvalds authored
      Pull timer updates from Thomas Gleixner:
       "Updates for time and clocksources:
      
         - A fix for the idle and iowait time accounting vs CPU hotplug.
      
           The time is reset on CPU hotplug which makes the accumulated
           systemwide time jump backwards.
      
         - Assorted fixes and improvements for clocksource/event drivers"
      
      * tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug
        clocksource/drivers/ep93xx: Fix error handling during probe
        clocksource/drivers/cadence-ttc: Fix some kernel-doc warnings
        clocksource/drivers/timer-ti-dm: Fix make W=n kerneldoc warnings
        clocksource/timer-riscv: Add riscv_clock_shutdown callback
        dt-bindings: timer: Add StarFive JH8100 clint
        dt-bindings: timer: thead,c900-aclint-mtimer: separate mtime and mtimecmp regs
      4fbbed78
    • Linus Torvalds's avatar
      Merge tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 7b297a5c
      Linus Torvalds authored
      Pull powerpc fixes from Aneesh Kumar:
      
       - Increase default stack size to 32KB for Book3S
      
      Thanks to Michael Ellerman.
      
      * tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/64s: Increase default stack size to 32KB
      7b297a5c
    • Kent Overstreet's avatar
      bcachefs: Improve inode_to_text() · 249f441f
      Kent Overstreet authored
      
      
      Add line breaks - inode_to_text() is now much easier to read.
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      249f441f
    • Kent Overstreet's avatar
      bcachefs: logged_ops_format.h · d826cc57
      Kent Overstreet authored
      
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      d826cc57
    • Kent Overstreet's avatar
      bcachefs: reflink_format.h · 8d52ba60
      Kent Overstreet authored
      
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      8d52ba60
    • Kent Overstreet's avatar
      bcachefs; extents_format.h · b2fa1b63
      Kent Overstreet authored
      
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      b2fa1b63
    • Kent Overstreet's avatar
      bcachefs: ec_format.h · 0560eb9a
      Kent Overstreet authored
      
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      0560eb9a
    • Kent Overstreet's avatar
      bcachefs: subvolume_format.h · c6c4ff65
      Kent Overstreet authored
      
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      c6c4ff65
    • Kent Overstreet's avatar
      bcachefs: snapshot_format.h · 8fed323b
      Kent Overstreet authored
      
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      8fed323b
    • Kent Overstreet's avatar
      d455179f
    • Kent Overstreet's avatar
      bcachefs: xattr_format.h · 72e08010
      Kent Overstreet authored
      
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      72e08010
    • Kent Overstreet's avatar
      bcachefs: dirent_format.h · 7ffc4daa
      Kent Overstreet authored
      
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      7ffc4daa
    • Kent Overstreet's avatar
      bcachefs: inode_format.h · b36425da
      Kent Overstreet authored
      
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      b36425da
    • Kent Overstreet's avatar
      bcachefs; quota_format.h · 82de6207
      Kent Overstreet authored
      
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      82de6207
    • Kent Overstreet's avatar
      bcachefs: sb-counters_format.h · 43314801
      Kent Overstreet authored
      
      
      bcachefs_format.h has gotten too big; let's do some organizing.
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      43314801
    • Kent Overstreet's avatar
      3a58dfbc
    • Kent Overstreet's avatar
      12207f49
    • Kent Overstreet's avatar
      bcachefs: bch_snapshot::btime · d32088f2
      Kent Overstreet authored
      
      
      Add a field to bch_snapshot for creation time; this will be important
      when we start exposing the snapshot tree to userspace.
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      d32088f2
    • Kent Overstreet's avatar
      7be0208f
    • Kent Overstreet's avatar
      bcachefs: opts->compression can now also be applied in the background · d7e77f53
      Kent Overstreet authored
      
      
      The "apply this compression method in the background" paths now use the
      compression option if background_compression is not set; this means that
      setting or changing the compression option will cause existing data to
      be compressed accordingly in the background.
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      d7e77f53
    • Kent Overstreet's avatar
      bcachefs: Prep work for variable size btree node buffers · ec4edd7b
      Kent Overstreet authored
      
      
      bcachefs btree nodes are big - typically 256k - and btree roots are
      pinned in memory. As we're now up to 18 btrees, we now have significant
      memory overhead in mostly empty btree roots.
      
      And in the future we're going to start enforcing that certain btree node
      boundaries exist, to solve lock contention issues - analagous to XFS's
      AGIs.
      
      Thus, we need to start allocating smaller btree node buffers when we
      can. This patch changes code that refers to the filesystem constant
      c->opts.btree_node_size to refer to the btree node buffer size -
      btree_buf_bytes() - where appropriate.
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      ec4edd7b
    • Su Yue's avatar
      bcachefs: grab s_umount only if snapshotting · 2acc59dd
      Su Yue authored
      When I was testing mongodb over bcachefs with compression,
      there is a lockdep warning when snapshotting mongodb data volume.
      
      $ cat test.sh
      prog=bcachefs
      
      $prog subvolume create /mnt/data
      $prog subvolume create /mnt/data/snapshots
      
      while true;do
          $prog subvolume snapshot /mnt/data /mnt/data/snapshots/$(date +%s)
          sleep 1s
      done
      
      $ cat /etc/mongodb.conf
      systemLog:
        destination: file
        logAppend: true
        path: /mnt/data/mongod.log
      
      storage:
        dbPath: /mnt/data/
      
      lockdep reports:
      [ 3437.452330] ======================================================
      [ 3437.452750] WARNING: possible circular locking dependency detected
      [ 3437.453168] 6.7.0-rc7-custom+ #85 Tainted: G            E
      [ 3437.453562] ------------------------------------------------------
      [ 3437.453981] bcachefs/35533 is trying to acquire lock:
      [ 3437.454325] ffffa0a02b2b1418 (sb_writers#10){.+.+}-{0:0}, at: filename_create+0x62/0x190
      [ 3437.454875]
                     but task is already holding lock:
      [ 3437.455268] ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
      [ 3437.456009]
                     which lock already depends on the new lock.
      
      [ 3437.456553]
                     the existing dependency chain (in reverse order) is:
      [ 3437.457054]
                     -> #3 (&type->s_umount_key#48){.+.+}-{3:3}:
      [ 3437.457507]        down_read+0x3e/0x170
      [ 3437.457772]        bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
      [ 3437.458206]        __x64_sys_ioctl+0x93/0xd0
      [ 3437.458498]        do_syscall_64+0x42/0xf0
      [ 3437.458779]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
      [ 3437.459155]
                     -> #2 (&c->snapshot_create_lock){++++}-{3:3}:
      [ 3437.459615]        down_read+0x3e/0x170
      [ 3437.459878]        bch2_truncate+0x82/0x110 [bcachefs]
      [ 3437.460276]        bchfs_truncate+0x254/0x3c0 [bcachefs]
      [ 3437.460686]        notify_change+0x1f1/0x4a0
      [ 3437.461283]        do_truncate+0x7f/0xd0
      [ 3437.461555]        path_openat+0xa57/0xce0
      [ 3437.461836]        do_filp_open+0xb4/0x160
      [ 3437.462116]        do_sys_openat2+0x91/0xc0
      [ 3437.462402]        __x64_sys_openat+0x53/0xa0
      [ 3437.462701]        do_syscall_64+0x42/0xf0
      [ 3437.462982]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
      [ 3437.463359]
                     -> #1 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}:
      [ 3437.463843]        down_write+0x3b/0xc0
      [ 3437.464223]        bch2_write_iter+0x5b/0xcc0 [bcachefs]
      [ 3437.464493]        vfs_write+0x21b/0x4c0
      [ 3437.464653]        ksys_write+0x69/0xf0
      [ 3437.464839]        do_syscall_64+0x42/0xf0
      [ 3437.465009]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
      [ 3437.465231]
                     -> #0 (sb_writers#10){.+.+}-{0:0}:
      [ 3437.465471]        __lock_acquire+0x1455/0x21b0
      [ 3437.465656]        lock_acquire+0xc6/0x2b0
      [ 3437.465822]        mnt_want_write+0x46/0x1a0
      [ 3437.465996]        filename_create+0x62/0x190
      [ 3437.466175]        user_path_create+0x2d/0x50
      [ 3437.466352]        bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs]
      [ 3437.466617]        __x64_sys_ioctl+0x93/0xd0
      [ 3437.466791]        do_syscall_64+0x42/0xf0
      [ 3437.466957]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
      [ 3437.467180]
                     other info that might help us debug this:
      
      [ 3437.469670] 2 locks held by bcachefs/35533:
                     other info that might help us debug this:
      
      [ 3437.467507] Chain exists of:
                       sb_writers#10 --> &c->snapshot_create_lock --> &type->s_umount_key#48
      
      [ 3437.467979]  Possible unsafe locking scenario:
      
      [ 3437.468223]        CPU0                    CPU1
      [ 3437.468405]        ----                    ----
      [ 3437.468585]   rlock(&type->s_umount_key#48);
      [ 3437.468758]                                lock(&c->snapshot_create_lock);
      [ 3437.469030]                                lock(&type->s_umount_key#48);
      [ 3437.469291]   rlock(sb_writers#10);
      [ 3437.469434]
                      *** DEADLOCK ***
      
      [ 3437.469670] 2 locks held by bcachefs/35533:
      [ 3437.469838]  #0: ffffa0a02ce00a88 (&c->snapshot_create_lock){++++}-{3:3}, at: bch2_fs_file_ioctl+0x1e3/0xc90 [bcachefs]
      [ 3437.470294]  #1: ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
      [ 3437.470744]
                     stack backtrace:
      [ 3437.470922] CPU: 7 PID: 35533 Comm: bcachefs Kdump: loaded Tainted: G            E      6.7.0-rc7-custom+ #85
      [ 3437.471313] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
      [ 3437.471694] Call Trace:
      [ 3437.471795]  <TASK>
      [ 3437.471884]  dump_stack_lvl+0x57/0x90
      [ 3437.472035]  check_noncircular+0x132/0x150
      [ 3437.472202]  __lock_acquire+0x1455/0x21b0
      [ 3437.472369]  lock_acquire+0xc6/0x2b0
      [ 3437.472518]  ? filename_create+0x62/0x190
      [ 3437.472683]  ? lock_is_held_type+0x97/0x110
      [ 3437.472856]  mnt_want_write+0x46/0x1a0
      [ 3437.473025]  ? filename_create+0x62/0x190
      [ 3437.473204]  filename_create+0x62/0x190
      [ 3437.473380]  user_path_create+0x2d/0x50
      [ 3437.473555]  bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs]
      [ 3437.473819]  ? lock_acquire+0xc6/0x2b0
      [ 3437.474002]  ? __fget_files+0x2a/0x190
      [ 3437.474195]  ? __fget_files+0xbc/0x190
      [ 3437.474380]  ? lock_release+0xc5/0x270
      [ 3437.474567]  ? __x64_sys_ioctl+0x93/0xd0
      [ 3437.474764]  ? __pfx_bch2_fs_file_ioctl+0x10/0x10 [bcachefs]
      [ 3437.475090]  __x64_sys_ioctl+0x93/0xd0
      [ 3437.475277]  do_syscall_64+0x42/0xf0
      [ 3437.475454]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
      [ 3437.475691] RIP: 0033:0x7f2743c313af
      ======================================================
      
      In __bch2_ioctl_subvolume_create(), we grab s_umount unconditionally
      and unlock it at the end of the function. There is a comment
      "why do we need this lock?" about the lock coming from
      commit 42d23732
      
       ("bcachefs: Snapshot creation, deletion")
      The reason is that __bch2_ioctl_subvolume_create() calls
      sync_inodes_sb() which enforce locked s_umount to writeback all dirty
      nodes before doing snapshot works.
      
      Fix it by read locking s_umount for snapshotting only and unlocking
      s_umount after sync_inodes_sb().
      
      Signed-off-by: default avatarSu Yue <glass.su@suse.com>
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      2acc59dd
    • Su Yue's avatar
      bcachefs: kvfree bch_fs::snapshots in bch2_fs_snapshots_exit · 369acf97
      Su Yue authored
      
      
      bch_fs::snapshots is allocated by kvzalloc in __snapshot_t_mut.
      It should be freed by kvfree not kfree.
      Or umount will triger:
      
      [  406.829178 ] BUG: unable to handle page fault for address: ffffe7b487148008
      [  406.830676 ] #PF: supervisor read access in kernel mode
      [  406.831643 ] #PF: error_code(0x0000) - not-present page
      [  406.832487 ] PGD 0 P4D 0
      [  406.832898 ] Oops: 0000 [#1] PREEMPT SMP PTI
      [  406.833512 ] CPU: 2 PID: 1754 Comm: umount Kdump: loaded Tainted: G           OE      6.7.0-rc7-custom+ #90
      [  406.834746 ] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
      [  406.835796 ] RIP: 0010:kfree+0x62/0x140
      [  406.836197 ] Code: 80 48 01 d8 0f 82 e9 00 00 00 48 c7 c2 00 00 00 80 48 2b 15 78 9f 1f 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 03 05 56 9f 1f 01 <48> 8b 50 08 48 89 c7 f6 c2 01 0f 85 b0 00 00 00 66 90 48 8b 07 f6
      [  406.837810 ] RSP: 0018:ffffb9d641607e48 EFLAGS: 00010286
      [  406.838213 ] RAX: ffffe7b487148000 RBX: ffffb9d645200000 RCX: ffffb9d641607dc4
      [  406.838738 ] RDX: 000065bb00000000 RSI: ffffffffc0d88b84 RDI: ffffb9d645200000
      [  406.839217 ] RBP: ffff9a4625d00068 R08: 0000000000000001 R09: 0000000000000001
      [  406.839650 ] R10: 0000000000000001 R11: 000000000000001f R12: ffff9a4625d4da80
      [  406.840055 ] R13: ffff9a4625d00000 R14: ffffffffc0e2eb20 R15: 0000000000000000
      [  406.840451 ] FS:  00007f0a264ffb80(0000) GS:ffff9a4e2d500000(0000) knlGS:0000000000000000
      [  406.840851 ] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  406.841125 ] CR2: ffffe7b487148008 CR3: 000000018c4d2000 CR4: 00000000000006f0
      [  406.841464 ] Call Trace:
      [  406.841583 ]  <TASK>
      [  406.841682 ]  ? __die+0x1f/0x70
      [  406.841828 ]  ? page_fault_oops+0x159/0x470
      [  406.842014 ]  ? fixup_exception+0x22/0x310
      [  406.842198 ]  ? exc_page_fault+0x1ed/0x200
      [  406.842382 ]  ? asm_exc_page_fault+0x22/0x30
      [  406.842574 ]  ? bch2_fs_release+0x54/0x280 [bcachefs]
      [  406.842842 ]  ? kfree+0x62/0x140
      [  406.842988 ]  ? kfree+0x104/0x140
      [  406.843138 ]  bch2_fs_release+0x54/0x280 [bcachefs]
      [  406.843390 ]  kobject_put+0xb7/0x170
      [  406.843552 ]  deactivate_locked_super+0x2f/0xa0
      [  406.843756 ]  cleanup_mnt+0xba/0x150
      [  406.843917 ]  task_work_run+0x59/0xa0
      [  406.844083 ]  exit_to_user_mode_prepare+0x197/0x1a0
      [  406.844302 ]  syscall_exit_to_user_mode+0x16/0x40
      [  406.844510 ]  do_syscall_64+0x4e/0xf0
      [  406.844675 ]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
      [  406.844907 ] RIP: 0033:0x7f0a2664e4fb
      
      Signed-off-by: default avatarSu Yue <glass.su@suse.com>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      369acf97
    • Kent Overstreet's avatar
      bcachefs: bios must be 512 byte algined · 00fff4dd
      Kent Overstreet authored
      Fixes: 023f9ac9
      
       bcachefs: Delete dio read alignment check
      Reported-by: default avatarBrian Foster <bfoster@redhat.com>
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      00fff4dd
    • Colin Ian King's avatar
      bcachefs: remove redundant variable tmp · aead3428
      Colin Ian King authored
      
      
      The variable tmp is being assigned a value but it isn't being
      read afterwards. The assignment is redundant and so tmp can be
      removed.
      
      Cleans up clang scan build warning:
      warning: Although the value stored to 'ret' is used in the enclosing
      expression, the value is never actually read from 'ret'
      [deadcode.DeadStores]
      
      Signed-off-by: default avatarColin Ian King <colin.i.king@gmail.com>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      aead3428
    • Kent Overstreet's avatar
    • Kent Overstreet's avatar
      bcachefs: Fix excess transaction restarts in __bchfs_fallocate() · 46bf2e9c
      Kent Overstreet authored
      
      
      drop_locks_do() should not be used in a fastpath without first trying
      the do in nonblocking mode - the unlock and relock will cause excessive
      transaction restarts and potentially livelocking with other threads that
      are contending for the same locks.
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      46bf2e9c
    • Kent Overstreet's avatar
      bcachefs: extents_to_bp_state · 1a503904
      Kent Overstreet authored
      
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      1a503904
    • Kent Overstreet's avatar
      bcachefs: bkey_and_val_eq() · ba96d36c
      Kent Overstreet authored
      
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      ba96d36c
    • Kent Overstreet's avatar
      bcachefs: Better journal tracepoints · e6a2566f
      Kent Overstreet authored
      
      
      Factor out bch2_journal_bufs_to_text(), and use it in the
      journal_entry_full() tracepoint; when we can't get a journal reservation
      we need to know the outstanding journal entry sizes to know if the
      problem is due to excessive flushing.
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      e6a2566f
    • Kent Overstreet's avatar
    • Kent Overstreet's avatar
      bcachefs: Avoid flushing the journal in the discard path · a6548c8b
      Kent Overstreet authored
      
      
      When issuing discards, we may need to flush the journal if there's too
      many buckets that can't be discarded until a journal flush.
      
      But the heuristic was bad; we should be comparing the number of buckets
      that need to flushes against the number of free buckets, not the number
      of buckets we saw.
      
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@linux.dev>
      a6548c8b