Commits · b5cbb42dc59f519fa3cf49b9afbd5ee4805be01b · Mirrors / git.yoctoproject.org / linux-yocto

Jun 30, 2024

bcachefs: Repair fragmentation_lru in alloc_write_key() · b5cbb42d

Kent Overstreet authored Jun 29, 2024



fragmentation_lru derives from dirty_sectors, and wasn't being checked.

Co-developed-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

b5cbb42d

bcachefs: add check for missing fragmentation in check_alloc_to_lru_ref() · d39881d2

Kent Overstreet authored Jun 29, 2024



We need to make sure we're not missing any fragmentation entries in the
LRU btree after repairing the alloc btree.

Also, use the new bch2_btree_write_buffer_maybe_flush() helper; this was
only working without it before since bucket invalidation (usually)
wasn't happening while fsck was running.

Co-developed-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

d39881d2

bcachefs: bch2_btree_write_buffer_maybe_flush() · 92e1c29a

Kent Overstreet authored Jun 29, 2024

Add a new helper for checking references to write buffer btrees, where
we need a flush before we definitively know we have an inconsistency.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

92e1c29a

bcachefs: Add missing printbuf_tabstops_reset() calls · ef05bdf5

Kent Overstreet authored Jun 29, 2024



Fixes warnings from bch2_print_allocator_stuck()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

ef05bdf5

Jun 29, 2024

bcachefs: Fix loop restart in bch2_btree_transactions_read() · 67c56411

Kent Overstreet authored Jun 28, 2024



Accidental infinite loop; also fix btree_deadlock_to_text()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

67c56411

bcachefs: Fix bch2_read_retry_nodecode() · 1539bdf5

Kent Overstreet authored Jun 28, 2024

BCH_READ_NODECODE mode - used by the move paths - really wants to use
only the original rbio, but the retry path really wants to clone - oof.

Make sure to copy the crc of the pointer we read from back to the
original rbio, or we'll see spurious checksum errors later.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

1539bdf5

bcachefs: Don't use the new_fs() bucket alloc path on an initialized fs · 44ec5990

Kent Overstreet authored Jun 28, 2024



On a new filesystem or device we have to allocate the journal with a
bump allocator, because allocation info isn't ready yet - but when
hot-adding a device that doesn't have a journal, we don't want to use
that path.

Reported-by: <syzbot+24a867cb90d8315cccff@syzkaller.appspotmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

44ec5990

bcachefs: Fix shift greater than integer size · a0bd30e4

Kent Overstreet authored Jun 28, 2024



Reported-by: <syzbot+e5292b50f1957164a4b6@syzkaller.appspotmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

a0bd30e4

bcachefs: Change bch2_fs_journal_stop() BUG_ON() to warning · 600b8be5

Kent Overstreet authored Jun 28, 2024

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

600b8be5

bcachefs: Delete old faulty bch2_trans_unlock() call · 84db6000

Kent Overstreet authored Jun 28, 2024



The unlock is now in read_extent; this fixes an assertion pop in
read_from_stale_dirty_pointer().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

84db6000

Jun 28, 2024

bcachefs: Switch online_reserved shutdown assert to WARN() · 759b2e80

Kent Overstreet authored Jun 28, 2024

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

759b2e80

Jun 26, 2024

bcachefs: Fix kmalloc bug in __snapshot_t_mut · 64cd7de9

Pei Li authored Jun 25, 2024



When asked to allocate an excessively large snapshot table, we should
fail gracefully in __snapshot_t_mut() instead of failing in kmalloc().

Reported-by: <syzbot+770e99b65e26fa023ab1@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=770e99b65e26fa023ab1
Tested-by: <syzbot+770e99b65e26fa023ab1@syzkaller.appspotmail.com>
Signed-off-by: Pei Li <peili.dev@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

64cd7de9
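
The "fail gracefully" idea in 64cd7de9 can be illustrated with a small userspace sketch. It is not the kernel change itself: `example_table_alloc()`, its `limit` parameter and the use of `calloc()` are assumptions made for illustration. The point is only that the size is validated before it ever reaches the allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate a zeroed table large enough to hold
 * index 'idx'. Returns NULL when the request would overflow or exceed
 * 'limit' bytes, instead of handing a nonsensical size to the allocator.
 */
static void *example_table_alloc(size_t idx, size_t elem_size, size_t limit)
{
    size_t nr = idx + 1;

    assert(elem_size != 0);

    /* idx + 1 wrapped around, or nr * elem_size would exceed the limit */
    if (nr == 0 || nr > limit / elem_size)
        return NULL;

    return calloc(nr, elem_size);
}
```

A caller treats a NULL return as an ordinary allocation failure and unwinds, so an oversized index supplied by a corrupted or fuzzed image does not reach the allocator at all.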

bcachefs: Discard, invalidate workers are now per device · 64ee1431

Kent Overstreet authored Jun 23, 2024



There's no reason for discards to be single threaded across all devices;
this will improve performance on multi device setups.

Additionally, making them per-device simplifies the refcounting on
bch_dev->io_ref; we now hold it for the duration that the discard path
is running, which fixes a race between the discard path and device
removal.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

64ee1431

bcachefs: Fix shift-out-of-bounds in bch2_blacklist_entries_gc · 472237b6

Pei Li authored Jun 25, 2024



This fixes the shift-out-of-bounds issue in
bch2_blacklist_entries_gc().

Instead of passing 0 to eytzinger0_first() when iterating the entries,
we explicitly check for 0 and initialize i to 0.

syzbot has tested the proposed patch and the reproducer did not trigger
any issue.

Reported-and-tested-by: <syzbot+835d255ad6bc7f29ee12@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=835d255ad6bc7f29ee12


Signed-off-by: Pei Li <peili.dev@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

472237b6

bcachefs: slab-use-after-free Read in bch2_sb_errors_from_cpu · 211c581d

Pei Li authored Jun 25, 2024



Acquire fsck_error_counts_lock before accessing the critical section
protected by this lock.

syzbot has tested the proposed patch and the reproducer did not trigger
any issue.

Reported-by: <syzbot+a2bc0e838efd7663f4d9@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=a2bc0e838efd7663f4d9


Signed-off-by: Pei Li <peili.dev@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

211c581d

Jun 24, 2024

bcachefs: Add missing bch2_journal_do_writes() call · 89d21b69

Kent Overstreet authored Jun 23, 2024



This fixes a rare deadlock when we're doing an emergency shutdown due to
failure to do a journal write.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

89d21b69

bcachefs: Fix null ptr deref in journal_pins_to_text() · d6b52f68

Kent Overstreet authored Jun 23, 2024

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

d6b52f68

Jun 23, 2024

bcachefs: Add missing recalc_capacity() call · 36da8e38

Kent Overstreet authored Jun 23, 2024



This fixes filesystem size not changing on device removal.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

36da8e38

bcachefs: Fix btree_trans list ordering · 1aaf5cb4

Kent Overstreet authored Jun 22, 2024



The debug code relies on btree_trans_list being ordered so that it can
resume on subsequent calls or lock restarts.

However, it was using trans->locking_wait.task.pid, which is incorrect
since btree_trans objects are cached and reused - typically by different
tasks.

Fix this by switching to pointer order, and also sort them lazily when
required - speeding up the btree_trans_get() fastpath.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

1aaf5cb4
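
The approach described in 1aaf5cb4 (order by address, sort lazily) can be sketched in userspace as follows. This is a simplified illustration, not the kernel code: it uses an array rather than a linked list, and every `example_*` name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct example_trans { int in_use; };

struct example_trans_list {
    struct example_trans **v;   /* caller-provided slots */
    size_t nr, cap;
    int sorted;
};

/* Order by object address: stable even when objects are reused by other tasks. */
static int example_cmp_by_address(const void *a, const void *b)
{
    uintptr_t l = (uintptr_t) *(struct example_trans * const *) a;
    uintptr_t r = (uintptr_t) *(struct example_trans * const *) b;

    return (l > r) - (l < r);
}

/* Fast path: append and remember that the list may be out of order. */
static void example_list_add(struct example_trans_list *l, struct example_trans *t)
{
    assert(l->nr < l->cap);
    l->v[l->nr++] = t;
    l->sorted = 0;
}

/* Slow path: sort only when a reader actually needs a stable order. */
static void example_list_sort(struct example_trans_list *l)
{
    if (!l->sorted) {
        qsort(l->v, l->nr, sizeof(l->v[0]), example_cmp_by_address);
        l->sorted = 1;
    }
}
```

Because the key is the address rather than a field that changes when an object is reused, a reader that remembers the last address it printed can resume from the next larger one after a restart.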

bcachefs: Fix race between trans_put() and btree_transactions_read() · de611ab6

Kent Overstreet authored Jun 22, 2024



debug.c was using closure_get() on a different thread's closure where
we don't know if the object being refcounted is alive.

We keep btree_trans objects on a list so they can be printed by debug
code, and because it is cost prohibitive to touch the btree_trans list
every time we allocate and free btree_trans objects, cached objects are
also on this list.

However, we do not want the debug code to see cached but not in use
btree_trans objects - critically because the btree_paths array will have
been freed (if it was reallocated).

closure_get() is also incorrect to use when that get may race with it
hitting zero, i.e. we must already have a ref on the object or know the
ref can't currently hit 0 for other reasons (as used in the cycle
detector).

To fix this, use the previously introduced closure_get_not_zero(),
closure_return_sync(), and closure_init_stack_release(); the debug code
now can only take a ref on a trans object if it's alive and in use.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

de611ab6
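
The distinction drawn in de611ab6 between an unconditional get and a "get unless zero" can be shown with a generic C11 atomics sketch. This is not the closure implementation; `struct example_ref` and both functions are hypothetical names used only to illustrate the pattern.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct example_ref {
    atomic_uint count;   /* 0 means the object is not in use */
};

/* Take a reference only if the count is currently nonzero. */
static bool example_ref_get_not_zero(struct example_ref *r)
{
    unsigned old = atomic_load(&r->count);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; returns true when the last reference went away. */
static bool example_ref_put(struct example_ref *r)
{
    unsigned old = atomic_fetch_sub(&r->count, 1);

    assert(old != 0);   /* a put without a matching get is a bug */
    return old == 1;
}
```

An unconditional increment would happily move the count from 0 to 1 and hand out a reference to an object whose owner has already finished with it; the compare-and-swap loop refuses in that case.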

bcachefs: Make btree_deadlock_to_text() clearer · 18e92841

Kent Overstreet authored Jun 22, 2024



btree_deadlock_to_text() searches the list of btree transactions to find
a deadlock - when it finds one it's done; unlike other *_read()
functions, it does not print each object.

Factor out btree_deadlock_to_text() to make this clearer.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

18e92841

bcachefs: fix seqmutex_relock() · f44cc269

Kent Overstreet authored Jun 22, 2024

We were grabbing the sequence number before unlock incremented it - fix
this by moving the increment to seqmutex_lock() (so the seqmutex_relock()
failure path skips the mutex_trylock()), and returning the sequence
number from unlock(), to make the API simpler and safer.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

f44cc269
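
The API shape described in f44cc269 (increment in lock, sequence number returned from unlock, early exit in relock) can be sketched as a userspace analogue built on pthreads. The `example_seqmutex_*` names are hypothetical and the kernel primitives are replaced with `pthread_mutex_t` and a C11 atomic counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct example_seqmutex {
    pthread_mutex_t  lock;
    _Atomic uint32_t seq;   /* bumped on every example_seqmutex_lock() */
};

static void example_seqmutex_init(struct example_seqmutex *m)
{
    int ret = pthread_mutex_init(&m->lock, NULL);

    assert(ret == 0);
    (void) ret;
    atomic_init(&m->seq, 0);
}

static void example_seqmutex_lock(struct example_seqmutex *m)
{
    pthread_mutex_lock(&m->lock);
    atomic_fetch_add(&m->seq, 1);
}

/* Returns the sequence number the caller later passes to relock(). */
static uint32_t example_seqmutex_unlock(struct example_seqmutex *m)
{
    uint32_t seq = atomic_load(&m->seq);

    pthread_mutex_unlock(&m->lock);
    return seq;
}

/* Succeeds only if nobody called lock() since our unlock(). */
static bool example_seqmutex_relock(struct example_seqmutex *m, uint32_t seq)
{
    if (atomic_load(&m->seq) != seq)
        return false;                    /* changed: skip the trylock */
    if (pthread_mutex_trylock(&m->lock) != 0)
        return false;
    if (atomic_load(&m->seq) != seq) {   /* raced with another locker */
        pthread_mutex_unlock(&m->lock);
        return false;
    }
    return true;
}
```

Since the value handed back by unlock is read while the mutex is still held, the caller cannot capture a sequence number that is already stale, which is the failure mode the commit message describes.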

bcachefs: Fix freeing of error pointers · 9bd01500

Kent Overstreet authored Jun 22, 2024



This fixes incorrect/missing checking of strndup_user() returns.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

9bd01500
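
The class of bug fixed in 9bd01500 comes from functions that report failure through an error-encoded pointer rather than NULL. The following userspace sketch re-creates that convention under stated assumptions: the `example_*` helpers are hypothetical stand-ins and `example_dup_string()` only imitates the failure behaviour of a function like strndup_user().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define EXAMPLE_MAX_ERRNO 4095

/* Encode a negative errno in the top page of the address space. */
static void *example_err_ptr(long err)
{
    assert(err < 0 && err >= -EXAMPLE_MAX_ERRNO);
    return (void *) (intptr_t) err;
}

static bool example_is_err(const void *p)
{
    return (uintptr_t) p >= (uintptr_t) -EXAMPLE_MAX_ERRNO;
}

static long example_ptr_err(const void *p)
{
    return (long) (intptr_t) p;
}

/* Hypothetical stand-in: returns an error pointer, never NULL, on failure. */
static char *example_dup_string(const char *src, size_t max)
{
    if (!src)
        return example_err_ptr(-EFAULT);
    if (strlen(src) >= max)
        return example_err_ptr(-EINVAL);

    char *copy = malloc(strlen(src) + 1);
    if (!copy)
        return example_err_ptr(-ENOMEM);
    strcpy(copy, src);
    return copy;
}

/* Returns 0 or a negative errno; never frees an error pointer. */
static int example_use_string(const char *src)
{
    char *s = example_dup_string(src, 16);

    if (example_is_err(s))
        return (int) example_ptr_err(s);   /* nothing to free */

    /* ... use s ... */
    free(s);
    return 0;
}
```

A NULL check alone would let the error pointer through, and passing it to the free routine afterwards is exactly the mistake the commit title names.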

Jun 21, 2024

bcachefs: Move the ei_flags setting to after initialization · bd4da046

Youling Tang authored Jun 04, 2024



`inode->ei_flags` setting and clearing should be done after initialization;
otherwise the operation is invalid.

Fixes: 9ca4853b ("bcachefs: Fix quota support for snapshots")
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bd4da046

bcachefs: Fix a UAF after write_super() · 2fe79ce7

Kent Overstreet authored Jun 20, 2024



write_super() may reallocate the superblock buffer - but
bch_sb_field_ext was referencing it; don't use it after the write_super
call.

Reported-by: <syzbot+8992fc10a192067b8d8a@syzkaller.appspotmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

2fe79ce7
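
The hazard in 2fe79ce7 is the general one of holding a pointer into a buffer that a later call may reallocate. A minimal userspace sketch, with hypothetical `example_*` names and `realloc()` standing in for the superblock buffer resize:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct example_sb {
    char   *buf;
    size_t  size;
};

/* Hypothetical stand-in for a write that may grow (and move) the buffer. */
static int example_write_super(struct example_sb *sb, size_t new_size)
{
    if (new_size > sb->size) {
        char *n = realloc(sb->buf, new_size);

        if (!n)
            return -1;
        memset(n + sb->size, 0, new_size - sb->size);
        sb->buf  = n;
        sb->size = new_size;
    }
    return 0;
}

/* Look up a field by offset; valid only until the next write. */
static char *example_sb_field(struct example_sb *sb, size_t offset)
{
    assert(sb && sb->buf);
    return offset < sb->size ? sb->buf + offset : NULL;
}

static int example_update(struct example_sb *sb)
{
    char *field = example_sb_field(sb, 8);

    if (!field)
        return -1;
    *field = 1;   /* every use of 'field' happens before the write */

    if (example_write_super(sb, sb->size * 2))
        return -1;

    /* The old pointer may dangle now; look the field up again if needed. */
    field = example_sb_field(sb, 8);
    return field && *field == 1 ? 0 : -1;
}
```

The commit itself simply avoids touching the stale pointer after write_super(); re-deriving it, as the sketch does, is one alternative when the value is still needed.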

bcachefs: Use bch2_print_string_as_lines for long err · e6b3a655

Kent Overstreet authored Jun 20, 2024



printk strings get truncated to 1024 bytes; if we have a long error
message (journal debug info) we need to use a helper.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

e6b3a655
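
The idea behind e6b3a655, emitting a long message one line at a time so that no single print call exceeds the truncation limit, can be sketched in userspace as below. `example_print_as_lines()` is a hypothetical illustration and does not reproduce bch2_print_string_as_lines().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Print each newline-separated piece of 's' with its own call.
 * Returns the number of lines emitted.
 */
static unsigned example_print_as_lines(FILE *out, const char *s)
{
    unsigned nr = 0;

    assert(out && s);

    while (*s) {
        size_t len = strcspn(s, "\n");   /* length up to the next newline */

        fprintf(out, "%.*s\n", (int) len, s);
        nr++;

        s += len;
        if (*s == '\n')
            s++;
    }
    return nr;
}
```

Each call then carries only one line of the message, so a multi-kilobyte journal dump stays intact as long as no individual line exceeds the limit.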

bcachefs: Fix I_NEW warning in race path in bch2_inode_insert() · dd908648

Kent Overstreet authored Jun 20, 2024

discard_new_inode() is the correct interface for tearing down an inode
that was fully created but not made visible to other threads, but it
expects I_NEW to be set, which we don't use.

Reported-by: https://github.com/koverstreet/bcachefs/issues/690


Fixes: bcachefs: Fix race path in bch2_inode_insert()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

dd908648

bcachefs: Replace bare EEXIST with private error codes · 50479406

Kent Overstreet authored May 26, 2024

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

50479406

bcachefs: Fix missing alloc_data_type_set() · f648b6c1

Kent Overstreet authored Jun 20, 2024



Incorrect bucket state transition in the discard path; when incrementing
a bucket's generation number that had already been discarded, we were
forgetting to check if it should be need_gc_gens, not free.

This was caught by the .invalid checks in the transaction commit path,
causing us to go emergency read only.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

f648b6c1

Jun 20, 2024

bcachefs: fix alignment of VMA for memory mapped files on THP · c6cab97c

Youling Tang authored Jun 20, 2024



With CONFIG_READ_ONLY_THP_FOR_FS, the Linux kernel supports using THPs
for read-only mmapped files, such as shared libraries. However, the
kernel makes no attempt to actually align those mappings on 2MB
boundaries, which makes it impossible to use those THPs most of the
time. This issue applies to general file mapping THP as well as
existing setups using CONFIG_READ_ONLY_THP_FOR_FS. This is easily
fixed by using thp_get_unmapped_area for the unmapped_area function
in bcachefs, which is what ext2, ext4, fuse, xfs and btrfs all use.

Similar to commit b0c58223 ("btrfs: fix alignment of VMA for
memory mapped files on THP").

Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

c6cab97c

bcachefs: Fix safe errors by default · 33dfafa9

Kent Overstreet authored Jun 19, 2024



i.e. the start of automatic self healing:

If errors=continue or fix_safe, we now automatically fix simple errors
without user intervention.

New error action option: fix_safe

This replaces the existing errors=ro option, which gets a new slot, i.e.
existing errors=ro users now get errors=fix_safe.

This is currently only enabled for a limited set of errors - initially
just disk accounting; these are errors we would always want to fix, and
for which we don't want to require user intervention (i.e. to make sure
a bug report gets filed).

Errors will still be counted in the superblock, so we (developers) will
still know they've been occurring if a bug report gets filed (as bug
reports typically include the errors superblock section).

Eventually we'll be enabling this for a much wider set of errors, after
we've done thorough error injection testing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

33dfafa9

bcachefs: Fix bch2_trans_put() · a56da697

Kent Overstreet authored Jun 19, 2024

reference: https://github.com/koverstreet/bcachefs/issues/692



trans->ref is the reference used by the cycle detector, which walks
btree_trans objects of other threads to walk the graph of held locks and
issue wakeups when an abort is required.

We have to wait for the ref to go to 1 before freeing trans->paths or
clearing trans->locking_wait.task.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

a56da697

bcachefs: set_worker_desc() for delete_dead_snapshots · 0a2a507d

Kent Overstreet authored Jun 19, 2024



this is long running - help users see what's going on

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

0a2a507d

bcachefs: Fix bch2_sb_downgrade_update() · ddd118ab

Kent Overstreet authored Jun 17, 2024

Missing enum conversion

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

ddd118ab

bcachefs: Handle cached data LRU wraparound · 2e9940d4

Kent Overstreet authored Jun 17, 2024



We only have 48 bits for the LRU time field, which is insufficient to
prevent wraparound.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

2e9940d4

bcachefs: Guard against overflowing LRU_TIME_BITS · cff07e27

Kent Overstreet authored Jun 17, 2024



LRUs only have 48 bits for the time field (i.e. LRU order); thus we need
overflow checks and guards.

Reported-by: <syzbot+df3bf3f088dcaa728857@syzkaller.appspotmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

cff07e27

bcachefs: delete_dead_snapshots() doesn't need to go RW · 1ba44217

Kent Overstreet authored Jun 17, 2024



We've been moving away from going RW lazily; if we want to go RW we do
that in set_may_go_rw(), and if we didn't go RW we don't need to delete
dead snapshots.

Reported-by: <syzbot+4366624c0b5aac4906cf@syzkaller.appspotmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

1ba44217

bcachefs: Fix early init error path in journal code · dbf4d79b

Kent Overstreet authored Jun 17, 2024



We shouldn't be running the journal shutdown sequence if we never fully
initialized the journal.

Reported-by:  <syzbot+ffd2270f0bca3322ee00@syzkaller.appspotmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

dbf4d79b

bcachefs: Check for invalid btree IDs · 9e7cfb35

Kent Overstreet authored Jun 17, 2024



We can only handle btree IDs up to 62, since the btree id (plus the type
for interior btree nodes) has to fit into a 64 bit bitmask - check for
invalid ones to avoid invalid shifts later.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

9e7cfb35

bcachefs: Fix btree ID bitmasks · e3fd3faa

Kent Overstreet authored Jun 17, 2024



these should be 64 bit bitmasks, not 32 bit.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

e3fd3faa
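
Commits 9e7cfb35 and e3fd3faa both concern shifting by a btree ID. A small sketch of the two rules involved, validating the ID first and shifting a 64-bit one: the `EXAMPLE_BTREE_ID_MAX` constant and the function names are hypothetical, and the exact bound used in the kernel may be expressed differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit for illustration, taken from "up to 62" in 9e7cfb35. */
#define EXAMPLE_BTREE_ID_MAX 62

/* Validate an ID read from disk before it is ever used as a shift count. */
static bool example_btree_id_valid(unsigned id)
{
    return id <= EXAMPLE_BTREE_ID_MAX;
}

/*
 * Build a single-bit mask. 1ULL keeps the shift in 64-bit arithmetic;
 * with a 32-bit 1U the shift would be undefined for id >= 32.
 */
static uint64_t example_btree_id_bit(unsigned id)
{
    assert(example_btree_id_valid(id));   /* callers validate first */
    return 1ULL << id;
}

static bool example_mask_has(uint64_t mask, unsigned id)
{
    return (mask & example_btree_id_bit(id)) != 0;
}
```

With a 32-bit mask, IDs 32 and above could not be represented at all, which is why the mask type and the validity check belong together.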