Unverified commit 013aa462, authored by openeuler-ci-bot, committed by Gitee

!14289 v2 erofs: add ondemand mode support

Merge Pull Request from: @ci-robot 
 
PR sync from: Baokun Li <libaokun1@huawei.com>
https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/message/AORGXXDYNWOIA2MALCTD3D6UYGLCOVMY/ 
Changes since v1:
 * Corrected patch header format.

This series adds erofs ondemand mode support, including readahead, domain
sharing, failover, etc.

In addition to patches from the open source community, we have made some
compatibility modifications and quality hardening for this feature:
 * Added the dynamic switches erofs_enabled and
   cachefiles_ondemand_enabled.
 * Changed the cache file directory structure to match mainline, so that
   an existing cache can still be found after a kernel upgrade or
   downgrade.
 * Used the same xattr as mainline, so that the cache does not fail after
   a kernel upgrade or downgrade.
 * Various bugfixes...

Al Viro (1):
  erofs: fix handling kern_mount() failure

Baokun Li (29):
  openeuler_defconfig: enable erofs ondemand for x86 and arm64
  fscache: fix reference count leakage during abort init
  fscache: fix assertion failure in fscache_put_object()
  erofs: fix lockdep false positives on initializing erofs_pseudo_mnt
  erofs: remove erofs_fscache_netfs
  fscache: rename new_location to new_version
  cachefiles: remove err_put_fd tag in cachefiles_ondemand_daemon_read()
  cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd()
  cachefiles: fix slab-use-after-free in
    cachefiles_ondemand_daemon_read()
  cachefiles: add output string to cachefiles_obj_[get|put]_ondemand_fd
  cachefiles: add consistency check for copen/cread
  cachefiles: stop sending new request when dropping object
  cachefiles: flush all requests for the object that is being dropped
  fscache: limit fscache_object_max_active to avoid blocking
  cachefiles: add spin_lock for cachefiles_ondemand_info
  cachefiles: never get a new anon fd if ondemand_id is valid
  cachefiles: make on-demand read killable
  fscache: set the default value of object_max_active to 256
  cachefiles: flush all requests after setting CACHEFILES_DEAD
  cachefiles: call cachefiles_ondemand_init_object() out of dir inode
    lock
  cachefiles: defer exposing anon_fd until after copy_to_user() succeeds
  cachefiles: disallow to complete open requests with uninitialised
    ondemand_id
  cachefiles: clear FSCACHE_COOKIE_NO_DATA_YET for the new ondemand
    object
  erofs: correct the blknr/blkoff calculation in erofs_read_raw_page()
  fscache: fix op leak due to abort init after parent ready
  cachefiles: cyclic allocation of msg_id to avoid reuse
  fscache: fix assertion failure in cachefiles_put_object()
  cachefiles: prefault in user pages to avoid ABBA deadlock

David Howells (1):
  cachefiles, erofs: Fix NULL deref in when cachefiles is not doing

Dawei Li (1):
  erofs: protect s_inodes with s_inode_list_lock for fscache

Gao Xiang (26):
  erofs: clean up file headers & footers
  erofs: introduce chunk-based file on-disk format
  erofs: support reading chunk-based uncompressed files
  erofs: fix double free of 'copied'
  erofs: introduce erofs_sb_has_xxx() helpers
  erofs: decouple basic mount options from fs_context
  erofs: add multiple device support
  erofs: clean up erofs_map_blocks tracepoints
  erofs: introduce meta buffer operations
  erofs: use meta buffers for inode operations
  erofs: use meta buffers for super operations
  erofs: use meta buffers for xattr operations
  erofs: use meta buffers for zmap operations
  erofs: fix misbehavior of unsupported chunk format check
  erofs: register fscache volume
  erofs: add fscache context helper functions
  erofs: add anonymous inode caching metadata for data blobs
  erofs: register fscache context for primary data blob
  erofs: register fscache context for extra data blobs
  erofs: implement fscache-based metadata read
  erofs: implement fscache-based data read for non-inline layout
  erofs: implement fscache-based data read for inline layout
  erofs: add 'fsid' mount option
  erofs: scan devices from device table
  erofs: fix order >= MAX_ORDER warning due to crafted negative i_size

Hou Tao (2):
  erofs: check the uniqueness of fsid in shared domain in advance
  cachefiles: flush ondemand_object_worker during clean object

Jeffle Xu (9):
  erofs: use meta buffers for erofs_read_superblock()
  cachefiles: notify the user daemon when looking up cookie
  cachefiles: notify the user daemon when withdrawing cookie
  cachefiles: implement on-demand read
  erofs: make erofs_map_blocks() generally available

Jia Zhu (12):
  cachefiles: narrow the scope of flushed requests when releasing fd
  erofs: code clean up for fscache
  erofs: introduce fscache-based domain
  erofs: introduce a pseudo mnt to manage shared cookies
  erofs: Support sharing cookies in the same domain
  erofs: introduce 'domain_id' mount option
  anolis: cachefiles: introduce object ondemand state
  anolis: cachefiles: extract ondemand info field from cachefiles_object
  anolis: cachefiles: resend an open request if the read request's
    object is closed
  anolis: cachefiles: narrow the scope of triggering EPOLLIN events in
  anolis: cachefiles: add restore command to recover inflight ondemand
    read requests

Jingbo Xu (18):
  anolis: cachefiles: replace BUG_ON() with WARN_ON()
  anolis: cachefiles: fix volume key setup for cachefiles_open
  anolis: cachefiles: refactor cachefiles_ondemand_daemon_read()
  anolis: erofs: fix the name of erofs_fscache_super_index_def
  anolis: cachefiles: maintain a file descriptor to the backing file
  anolis: fscache,cachefiles: add fscache_prepare_read() helper
  anolis: erofs: implement fscache-based data readahead
  erofs: fix use-after-free of fsid and domain_id string
  erofs: remove unused EROFS_GET_BLOCKS_RAW flag
  anolis: cachefiles: optimize on-demand IO path with buffer IO
  anolis: fscache: export fscache_object_wq
  anolis: cachefiles: reset object->private to NULL when it's freed
  anolis: cachefiles: add missing lock protection when polling
  anolis: cachefiles: fix potential NULL in error path
  erofs: relinquish volume with mutex held
  erofs: maintain cookies of share domain in self-contained list
  erofs: remove unused device mapping in meta routine
  erofs: unify anonymous inodes for blob

Mikulas Patocka (1):
  wait_on_bit: add an acquire memory barrier

Sun Ke (1):
  cachefiles: fix error return code in cachefiles_ondemand_copen()

Xin Yin (1):
  cachefiles: make on-demand request distribution fairer

Yu Kuai (10):
  fscache: add new helper to determine cachefile location
  fscache: generate new key_hash for new location
  cachefiles: factor out helper to generate acc from
    cachefiles_cook_key()
  cachefiles: factor out helper to generate csum from
    cachefiles_cook_key()
  cachefiles: use volume key directly in cachefiles_cook_key() for new
    location
  cachefiles: skip acc in cachefiles_cook_key() for new location
  cachefiles: skip volum csum cachefiles_cook_key() for new location
  cachefiles: use key_hash as csum for new location
  cachefiles: handle the unprintable case for new location in
    cachefiles_cook_key()
  erofs: switch to use new location

Yue Hu (1):
  erofs: don't use erofs_map_blocks() any more

Zizhi Wo (20):
  erofs: fix page unlock advance during readahead
  erofs: add erofs switch to better control it
  erofs: add erofs_ondemand switch
  fscache: fix kernel BUG at __fscache_read_or_alloc_page
  fscache: add a waiting mechanism when duplicate cookies are detected
  fscache: Fix trace UAF in fscache_cookie_put()
  cachefiles: Fix NULL pointer dereference in object->file
  cachefiles: Restrict monitor calls to read_page
  cachefiles: Fix cookie reference count leakage error
  fscache: Add the unhash_cookie mechanism to fscache_drop_object()
  cachefiles: Add restrictions to cachefiles_daemon_cull()
  cachefiles: Set object to close if ondemand_id < 0 in copen
  s390: provide arch_test_bit_acquire() for architecture s390
  wait_on_bit: Add wait_on_bit_acquire() to provide memory barrier
  fscache: add a memory barrier for FSCACHE_COOKIE_LOOKING_UP
  fscache/cachefiles: add a memory barrier for waking and waiting
  fscache/cachefiles: add a memory barrier for page_write
  fscache: modify fscache_hash_cookie() to enhance security
  cachefiles: modify inappropriate error return value in
    cachefiles_daemon_secctx


-- 
2.46.1
 
https://gitee.com/openeuler/kernel/issues/IB5UKT 
 
Link:https://gitee.com/openeuler/kernel/pulls/14289

 

Reviewed-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Li Nan <linan122@huawei.com>
Signed-off-by: Li Nan <linan122@huawei.com>
parents 345ca570 093d63c1
+4 −6
@@ -58,13 +58,11 @@ Like with atomic_t, the rule of thumb is:

  - RMW operations that have a return value are fully ordered.

- - RMW operations that are conditional are unordered on FAILURE,
-   otherwise the above rules apply. In the case of test_and_{}_bit() operations,
-   if the bit in memory is unchanged by the operation then it is deemed to have
-   failed.
+ - RMW operations that are conditional are fully ordered.

-Except for a successful test_and_set_bit_lock() which has ACQUIRE semantics and
-clear_bit_unlock() which has RELEASE semantics.
+Except for a successful test_and_set_bit_lock() which has ACQUIRE semantics,
+clear_bit_unlock() which has RELEASE semantics and test_bit_acquire which has
+ACQUIRE semantics.

 Since a platform only has a single means of achieving atomic operations
 the same barriers as for atomic_t are used, see atomic_t.txt.
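
The ACQUIRE semantics documented for test_bit_acquire in the hunk above can be
illustrated in user space. The following is a minimal sketch that uses C11
atomics instead of the kernel bitops; set_bit_release(),
test_bit_acquire_sketch(), publish() and consume() are hypothetical names made
up for this illustration and are not kernel APIs. The writer stores a payload
and then sets a bit with RELEASE ordering; a reader that sees the bit through
an ACQUIRE load is then guaranteed to see the payload as well.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define READY_BIT 0UL

static atomic_ulong flags;	/* word holding the "ready" bit */
static int payload;		/* plain data written before the bit is set */

/* Writer side: RELEASE keeps the payload store ordered before the bit. */
static void set_bit_release(unsigned long nr, atomic_ulong *word)
{
	atomic_fetch_or_explicit(word, 1UL << nr, memory_order_release);
}

/* Reader side: ACQUIRE keeps later loads ordered after the bit test. */
static bool test_bit_acquire_sketch(unsigned long nr, atomic_ulong *word)
{
	return (atomic_load_explicit(word, memory_order_acquire) >> nr) & 1UL;
}

/* Store the payload, then make it visible by setting the ready bit. */
static void publish(int value)
{
	payload = value;
	set_bit_release(READY_BIT, &flags);
}

/* Returns true and fills *out only if the ready bit was observed set. */
static bool consume(int *out)
{
	if (!test_bit_acquire_sketch(READY_BIT, &flags))
		return false;
	*out = payload;
	return true;
}
```

In this sketch one thread would call publish() while another polls consume().
With a plain, unordered bit test the reader could in principle see the bit set
and still read a stale payload, which is the kind of problem an acquire
barrier on the test side is meant to prevent.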
+22 −6
@@ -19,9 +19,10 @@ It is designed as a better filesystem solution for the following scenarios:
    immutable and bit-for-bit identical to the official golden image for
    their releases due to security and other considerations and

-  - hope to save some extra storage space with guaranteed end-to-end performance
-    by using reduced metadata and transparent file compression, especially
-    for those embedded devices with limited memory (ex, smartphone);
+  - hope to minimize extra storage space with guaranteed end-to-end performance
+    by using compact layout, transparent file compression and direct access,
+    especially for those embedded devices with limited memory and high-density
+    hosts with numerous containers;

 Here is the main features of EROFS:

@@ -51,7 +52,9 @@ Here is the main features of EROFS:
  - Support POSIX.1e ACLs by using xattrs;

  - Support transparent file compression as an option:
-   LZ4 algorithm with 4 KB fixed-sized output compression for high performance.
+   LZ4 algorithm with 4 KB fixed-sized output compression for high performance;
+
+ - Multiple device support for multi-layer container images.

 The following git tree provides the file system user-space tools under
 development (ex, formatting tool mkfs.erofs):
@@ -84,6 +87,7 @@ cache_strategy=%s Select a strategy for cached decompression from now on:
                                    It still does in-place I/O decompression
                                    for the rest compressed physical clusters.
 		       ==========  =============================================
+device=%s              Specify a path to an extra device to be used together.
 ===================    =========================================================

 On-disk details
@@ -153,13 +157,14 @@ may not. All metadatas can be now observed in two different spaces (views):

     Xattrs, extents, data inline are followed by the corresponding inode with
     proper alignment, and they could be optional for different data mappings.
-    _currently_ total 4 valid data mappings are supported:
+    _currently_ total 5 data layouts are supported:

     ==  ====================================================================
      0  flat file data without data inline (no extent);
      1  fixed-sized output data compression (with non-compacted indexes);
      2  flat file data with tail packing data inline (no extent);
-     3  fixed-sized output data compression (with compacted indexes, v5.3+).
+     3  fixed-sized output data compression (with compacted indexes, v5.3+);
+     4  chunk-based file (v5.15+).
     ==  ====================================================================

     The size of the optional xattrs is indicated by i_xattr_count in inode
@@ -211,6 +216,17 @@ Note that apart from the offset of the first filename, nameoff0 also indicates
 the total number of directory entries in this block since it is no need to
 introduce another on-disk field at all.

+Chunk-based file
+----------------
+In order to support chunk-based data deduplication, a new inode data layout has
+been supported since Linux v5.15: Files are split in equal-sized data chunks
+with ``extents`` area of the inode metadata indicating how to get the chunk
+data: these can be simply as a 4-byte block address array or in the 8-byte
+chunk index form (see struct erofs_inode_chunk_index in erofs_fs.h for more
+details.)
+
+By the way, chunk-based files are all uncompressed for now.
+
 Compression
 -----------
 Currently, EROFS supports 4KB fixed-sized output transparent file compression,
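
The chunk-based layout described in the hunk above (equal-sized chunks, with
the inode's extents area telling where each chunk's data lives) can be pictured
as an array lookup. The following is an illustrative sketch of the 4-byte block
address array form only. struct chunk_map, map_offset(), the 4 KiB block size
and the chunkbits field are assumptions made for this example; they are not the
on-disk structures or helpers from erofs_fs.h.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BLKBITS 12u	/* assumption: 4 KiB blocks */

/* Hypothetical in-memory view of one chunk-based file. */
struct chunk_map {
	unsigned int chunkbits;	 /* log2 of the chunk size in bytes */
	size_t nchunks;		 /* number of entries in blkaddr[] */
	const uint32_t *blkaddr; /* start block of each chunk */
};

/*
 * Translate a file offset into a byte address on the device.
 * Returns 0 on success, -1 if the offset lies beyond the last chunk.
 */
static int map_offset(const struct chunk_map *m, uint64_t off, uint64_t *phys)
{
	uint64_t idx = off >> m->chunkbits;
	uint64_t in_chunk = off & ((UINT64_C(1) << m->chunkbits) - 1);

	if (idx >= m->nchunks)
		return -1;

	/* Chunks are contiguous on disk, so add the offset inside the chunk. */
	*phys = ((uint64_t)m->blkaddr[idx] << SKETCH_BLKBITS) + in_chunk;
	return 0;
}
```

Because every chunk has its own entry, two entries may hold the same start
block, which is how identical chunks can share storage in this sketch.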
+8 −1
@@ -6408,6 +6408,7 @@ CONFIG_FSCACHE_STATS=y
 CONFIG_CACHEFILES=m
 # CONFIG_CACHEFILES_DEBUG is not set
 # CONFIG_CACHEFILES_HISTOGRAM is not set
+CONFIG_CACHEFILES_ONDEMAND=y
 # end of Caches

 #
@@ -6521,7 +6522,13 @@ CONFIG_PSTORE_COMPRESS_DEFAULT="deflate"
 CONFIG_PSTORE_RAM=m
 # CONFIG_SYSV_FS is not set
 # CONFIG_UFS_FS is not set
-# CONFIG_EROFS_FS is not set
+CONFIG_EROFS_FS=m
+# CONFIG_EROFS_FS_DEBUG is not set
+CONFIG_EROFS_FS_XATTR=y
+CONFIG_EROFS_FS_POSIX_ACL=y
+CONFIG_EROFS_FS_SECURITY=y
+# CONFIG_EROFS_FS_ZIP is not set
+CONFIG_EROFS_FS_ONDEMAND=y
 CONFIG_NETWORK_FILESYSTEMS=y
 CONFIG_NFS_FS=m
 CONFIG_NFS_V2=m
+7 −0
@@ -241,6 +241,13 @@ static inline void arch___clear_bit_unlock(unsigned long nr,
 	arch___clear_bit(nr, ptr);
 }

+static __always_inline bool
+arch_test_bit_acquire(unsigned long nr, const volatile unsigned long *addr)
+{
+	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
+	return 1UL & (smp_load_acquire(p) >> (nr & (BITS_PER_LONG-1)));
+}
+
 #include <asm-generic/bitops/instrumented-atomic.h>
 #include <asm-generic/bitops/instrumented-non-atomic.h>
 #include <asm-generic/bitops/instrumented-lock.h>
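
The indexing arithmetic in arch_test_bit_acquire() above (pick the word with
BIT_WORD(nr), then shift by the bit's position inside that word) can be tried
out in user space. The following is a minimal sketch of that arithmetic only,
without the acquire barrier. SKETCH_BITS_PER_LONG, sketch_bit_word(),
sketch_test_bit() and sketch_set_bit() are hypothetical stand-ins written for
this example; they are not the kernel macros or functions.

```c
#include <limits.h>
#include <stdbool.h>

/* Stand-in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Index of the unsigned long that contains bit number nr. */
static unsigned long sketch_bit_word(unsigned long nr)
{
	return nr / SKETCH_BITS_PER_LONG;
}

/* Plain (unordered) bit test using the same word/shift arithmetic. */
static bool sketch_test_bit(unsigned long nr, const unsigned long *addr)
{
	const unsigned long *p = addr + sketch_bit_word(nr);

	return 1UL & (*p >> (nr & (SKETCH_BITS_PER_LONG - 1)));
}

/* Non-atomic helper so that the test above has something to observe. */
static void sketch_set_bit(unsigned long nr, unsigned long *addr)
{
	addr[sketch_bit_word(nr)] |= 1UL << (nr & (SKETCH_BITS_PER_LONG - 1));
}
```

For example, with nr equal to SKETCH_BITS_PER_LONG + 6 the helpers select the
second word of the bitmap and bit 6 within it. The kernel version differs in
that the word is read with smp_load_acquire(), which is what provides the
ordering.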
+8 −1
@@ -7463,6 +7463,7 @@ CONFIG_FSCACHE_STATS=y
 CONFIG_CACHEFILES=m
 # CONFIG_CACHEFILES_DEBUG is not set
 # CONFIG_CACHEFILES_HISTOGRAM is not set
+CONFIG_CACHEFILES_ONDEMAND=y
 # end of Caches

 #
@@ -7578,7 +7579,13 @@ CONFIG_PSTORE_COMPRESS_DEFAULT="deflate"
 CONFIG_PSTORE_RAM=m
 # CONFIG_SYSV_FS is not set
 # CONFIG_UFS_FS is not set
-# CONFIG_EROFS_FS is not set
+CONFIG_EROFS_FS=m
+# CONFIG_EROFS_FS_DEBUG is not set
+CONFIG_EROFS_FS_XATTR=y
+CONFIG_EROFS_FS_POSIX_ACL=y
+CONFIG_EROFS_FS_SECURITY=y
+# CONFIG_EROFS_FS_ZIP is not set
+CONFIG_EROFS_FS_ONDEMAND=y
 CONFIG_NETWORK_FILESYSTEMS=y
 CONFIG_NFS_FS=m
 # CONFIG_NFS_V2 is not set