  1. Jan 04, 2021
  2. Jan 03, 2021
    • Merge tag 's390-5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 3516bd72
      Linus Torvalds authored
      Pull s390 cleanups from Vasily Gorbik:
       "Update defconfigs and sort config select list"
      
      * tag 's390-5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/Kconfig: sort config S390 select list once again
        s390: update defconfigs
      3516bd72
    • Merge tag 'pm-5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · d9296a7b
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These fix a crash in intel_pstate during resume from suspend-to-RAM
        that may occur after recent changes and two resource leaks in error
        paths in the operating performance points (OPP) framework, add a new
        C-states table to intel_idle and update the cpuidle MAINTAINERS entry
        to cover the governors too.
      
        Specifics:
      
         - Fix recently introduced crash in the intel_pstate driver that
           occurs if scale-invariance is disabled during resume from
           suspend-to-RAM due to inconsistent changes of APERF or MPERF MSR
           values made by the platform firmware (Rafael Wysocki).
      
         - Fix a memory leak and add a missing clk_put() in error paths in the
           OPP framework (Quanyang Wang, Viresh Kumar).
      
         - Add new C-states table for SnowRidge processors to the intel_idle
           driver (Artem Bityutskiy).
      
         - Update the MAINTAINERS entry for cpuidle to make it clear that the
           governors are covered by it too (Lukas Bulwahn)"
      
      * tag 'pm-5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        intel_idle: add SnowRidge C-state table
        cpufreq: intel_pstate: Fix fast-switch fallback path
        opp: Call the missing clk_put() on error
        opp: fix memory leak in _allocate_opp_table
        MAINTAINERS: include governors into CPU IDLE TIME MANAGEMENT FRAMEWORK
      d9296a7b
  3. Jan 02, 2021
    • Merge branches 'pm-cpufreq' and 'pm-cpuidle' · 89ecf09e
      Rafael J. Wysocki authored
      * pm-cpufreq:
        cpufreq: intel_pstate: Fix fast-switch fallback path
      
      * pm-cpuidle:
        intel_idle: add SnowRidge C-state table
        MAINTAINERS: include governors into CPU IDLE TIME MANAGEMENT FRAMEWORK
      89ecf09e
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · eda809ae
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "This is a load of driver fixes (12 ufs, 1 mpt3sas, 1 cxgbi).
      
        The two big core fixes are for power management ("block: Do not accept
        any requests while suspended" and "block: Fix a race in the runtime
        power management code"), which finally sort out the resume problems
        we've occasionally been having.
      
        To make the resume fix, there are seven necessary precursors which
        effectively rename REQ_PREEMPT to REQ_PM, so every "special" request
        in block is automatically a power management exempt one.
      
        All of the non-PM preempt cases are removed except for the one in the
        SCSI Parallel Interface (spi) domain validation which is a genuine
        case where we have to run requests at high priority to validate the
        bus so this becomes an autopm get/put protected request"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (22 commits)
        scsi: cxgb4i: Fix TLS dependency
        scsi: ufs: Un-inline ufshcd_vops_device_reset function
        scsi: ufs: Re-enable WriteBooster after device reset
        scsi: ufs-mediatek: Use correct path to fix compile error
        scsi: mpt3sas: Signedness bug in _base_get_diag_triggers()
        scsi: block: Do not accept any requests while suspended
        scsi: block: Remove RQF_PREEMPT and BLK_MQ_REQ_PREEMPT
        scsi: core: Only process PM requests if rpm_status != RPM_ACTIVE
        scsi: scsi_transport_spi: Set RQF_PM for domain validation commands
        scsi: ide: Mark power management requests with RQF_PM instead of RQF_PREEMPT
        scsi: ide: Do not set the RQF_PREEMPT flag for sense requests
        scsi: block: Introduce BLK_MQ_REQ_PM
        scsi: block: Fix a race in the runtime power management code
        scsi: ufs-pci: Enable UFSHCD_CAP_RPM_AUTOSUSPEND for Intel controllers
        scsi: ufs-pci: Fix recovery from hibernate exit errors for Intel controllers
        scsi: ufs-pci: Ensure UFS device is in PowerDown mode for suspend-to-disk ->poweroff()
        scsi: ufs-pci: Fix restore from S4 for Intel controllers
        scsi: ufs-mediatek: Keep VCC always-on for specific devices
        scsi: ufs: Allow regulators being always-on
        scsi: ufs: Clear UAC for RPMB after ufshcd resets
        ...
      eda809ae
    • Merge tag 'block-5.11-2021-01-01' of git://git.kernel.dk/linux-block · 8b4805c6
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Two minor block fixes from this last week that should go into 5.11:
      
         - Add missing NOWAIT debugfs definition (Andres)
      
         - Fix kerneldoc warning introduced this merge window (Randy)"
      
      * tag 'block-5.11-2021-01-01' of git://git.kernel.dk/linux-block:
        block: add debugfs stanza for QUEUE_FLAG_NOWAIT
        fs: block_dev.c: fix kernel-doc warnings from struct block_device changes
      8b4805c6
    • Merge tag 'io_uring-5.11-2021-01-01' of git://git.kernel.dk/linux-block · dc3e24b2
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "A few fixes that should go into 5.11, all marked for stable as well:
      
         - Fix issue around identity COW'ing and users that share a ring
           across processes
      
         - Fix a hang associated with unregistering fixed files (Pavel)
      
         - Move the 'process is exiting' cancelation a bit earlier, so
           task_works aren't affected by it (Pavel)"
      
      * tag 'io_uring-5.11-2021-01-01' of git://git.kernel.dk/linux-block:
        kernel/io_uring: cancel io_uring before task works
        io_uring: fix io_sqe_files_unregister() hangs
        io_uring: add a helper for setting a ref node
        io_uring: don't assume mm is constant across submits
      dc3e24b2
    • depmod: handle the case of /sbin/depmod without /sbin in PATH · cedd1862
      Linus Torvalds authored
      Commit 436e980e ("kbuild: don't hardcode depmod path") stopped
      hard-coding the path of depmod, but in the process caused trouble for
      distributions that had that /sbin location, but didn't have it in the
      PATH (generally because /sbin is limited to the super-user path).
      
      Work around it for now by just adding /sbin to the end of PATH in the
      depmod.sh script.
      
      Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cedd1862
  4. Dec 31, 2020
  5. Dec 30, 2020
    • block: add debugfs stanza for QUEUE_FLAG_NOWAIT · dc304326
      Andres Freund authored
      This was missed in 021a2446. Leads to the numeric value of
      QUEUE_FLAG_NOWAIT (i.e. 29) showing up in
      /sys/kernel/debug/block/*/state.
      
      Fixes: 021a2446
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andres Freund <andres@anarazel.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      dc304326
    • fs: block_dev.c: fix kernel-doc warnings from struct block_device changes · 875b2376
      Randy Dunlap authored
      Fix new kernel-doc warnings in fs/block_dev.c:
      
      ../fs/block_dev.c:1066: warning: Excess function parameter 'whole' description in 'bd_abort_claiming'
      ../fs/block_dev.c:1837: warning: Function parameter or member 'dev' not described in 'lookup_bdev'
      
      Fixes: 4e7b5671 ("block: remove i_bdev")
      Fixes: 37c3fc9a ("block: simplify the block device claiming interface")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: linux-fsdevel@vger.kernel.org
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      875b2376
    • Merge branch 'akpm' (patches from Andrew) · 139711f0
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "16 patches
      
        Subsystems affected by this patch series: mm (selftests, hugetlb,
        pagecache, mremap, kasan, and slub), kbuild, checkpatch, misc, and
        lib"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm: slub: call account_slab_page() after slab page initialization
        zlib: move EXPORT_SYMBOL() and MODULE_LICENSE() out of dfltcc_syms.c
        lib/zlib: fix inflating zlib streams on s390
        lib/genalloc: fix the overflow when size is too big
        kdev_t: always inline major/minor helper functions
        sizes.h: add SZ_8G/SZ_16G/SZ_32G macros
        local64.h: make <asm/local64.h> mandatory
        kasan: fix null pointer dereference in kasan_record_aux_stack
        mm: generalise COW SMC TLB flushing race comment
        mm/mremap.c: fix extent calculation
        mm: memmap defer init doesn't work as expected
        mm: add prototype for __add_to_page_cache_locked()
        checkpatch: prefer strscpy to strlcpy
        Revert "kbuild: avoid static_assert for genksyms"
        mm/hugetlb: fix deadlock in hugetlb_cow error path
        selftests/vm: fix building protection keys test
      139711f0
    • mm: slub: call account_slab_page() after slab page initialization · 1f3147b4
      Roman Gushchin authored
      
      
      It's convenient to have page->objects initialized before calling into
      account_slab_page().  In particular, this information can be used to
      pre-alloc the obj_cgroup vector.
      
      Let's call account_slab_page() a bit later, after the initialization of
      page->objects.
      
      This commit doesn't bring any functional change, but is required for
      further optimizations.
      
      [akpm@linux-foundation.org: undo changes needed by forthcoming mm-memcg-slab-pre-allocate-obj_cgroups-for-slab-caches-with-slab_account.patch]
      
      Link: https://lkml.kernel.org/r/20201110195753.530157-1-guro@fb.com
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Shakeel Butt <shakeelb@google.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1f3147b4
    • zlib: move EXPORT_SYMBOL() and MODULE_LICENSE() out of dfltcc_syms.c · 605cc30d
      Randy Dunlap authored
      In commit 11fb479f ("zlib: export S390 symbols for zlib modules"), I
      added EXPORT_SYMBOL()s to dfltcc_inflate.c but then Mikhail said that
      these should probably be in dfltcc_syms.c with the other
      EXPORT_SYMBOL()s.
      
      However, that is contrary to the current kernel style, which places
      EXPORT_SYMBOL() immediately after the function that it applies to, so
      move all EXPORT_SYMBOL()s to their respective function locations and
      drop the dfltcc_syms.c file.  Also move MODULE_LICENSE() from the
      deleted file to dfltcc.c.
      
      [rdunlap@infradead.org: remove dfltcc_syms.o from Makefile]
        Link: https://lkml.kernel.org/r/20201227171837.15492-1-rdunlap@infradead.org
      
      Link: https://lkml.kernel.org/r/20201219052530.28461-1-rdunlap@infradead.org
      Fixes: 11fb479f ("zlib: export S390 symbols for zlib modules")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Zaslonko Mikhail <zaslonko@linux.ibm.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      605cc30d
    • lib/zlib: fix inflating zlib streams on s390 · f0bb29e8
      Ilya Leoshkevich authored
      Decompressing zlib streams on s390 fails with "incorrect data check"
      error.
      
      Userspace zlib checks inflate_state.flags in order to byteswap checksums
      only for zlib streams, and s390 hardware inflate code, which was ported
      from there, tries to match this behavior.  At the same time, kernel zlib
      does not use inflate_state.flags, so it contains essentially random
      values.  For many use cases either zlib stream is zeroed out or checksum
      is not used, so this problem is masked, but at least SquashFS is still
      affected.
      
      Fix by always passing a checksum to and from the hardware as is, which
      matches zlib_inflate()'s expectations.
      
      Link: https://lkml.kernel.org/r/20201215155551.894884-1-iii@linux.ibm.com
      Fixes: 12619610 ("lib/zlib: add s390 hardware support for kernel zlib_inflate")
      Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Acked-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Mikhail Zaslonko <zaslonko@linux.ibm.com>
      Cc: <stable@vger.kernel.org>	[5.6+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f0bb29e8
    • lib/genalloc: fix the overflow when size is too big · 36845663
      Huang Shijie authored
      
      
      Some graphics cards have very large on-chip memory, such as 32 GB.
      
      In the following case, it will cause overflow:
      
          pool = gen_pool_create(PAGE_SHIFT, NUMA_NO_NODE);
          ret = gen_pool_add(pool, 0x1000000, SZ_32G, NUMA_NO_NODE);
      
          va = gen_pool_alloc(pool, SZ_4G);
      
      The overflow occurs in gen_pool_alloc_algo_owner():
      
      		....
      		size = nbits << order;
      		....
      
      @nbits has type "int", so the shift overflows, and gen_pool_avail()
      then returns the wrong value.
      
      This patch converts some "int" variables to "unsigned long" and
      changes the comparison in the while loop.
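The overflow described above can be reproduced outside the kernel. The following is a minimal sketch, not the genalloc code itself: the helper names are hypothetical, and a 32-bit unsigned type stands in for the kernel's "int" so that the wrap-around is well defined in portable C.

```c
#include <stdint.h>

/*
 * Sketch of the narrow computation: the bit count is kept in 32 bits,
 * so shifting it back into a byte count wraps once the result reaches
 * 4 GiB. uint32_t is used instead of int (assumption) to keep the
 * demonstration free of undefined behaviour.
 */
static uint64_t demo_bytes_narrow(uint64_t size, unsigned int order)
{
    uint32_t nbits = (uint32_t)(size >> order);
    uint32_t bytes = (uint32_t)(nbits << order); /* wraps modulo 2^32 */
    return bytes;
}

/*
 * Sketch of the widened computation: with a 64-bit count the shift
 * reproduces the original size.
 */
static uint64_t demo_bytes_wide(uint64_t size, unsigned int order)
{
    uint64_t nbits = size >> order;
    return nbits << order;
}
```

With a 4 GiB request and an order of 12, the narrow helper yields 0 while the wide helper returns the full 4 GiB, which is the kind of discrepancy the patch removes by widening the variables.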
      
      Link: https://lkml.kernel.org/r/20201229060657.3389-1-sjhuang@iluvatar.ai
      Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai>
      Reported-by: Shi Jiasheng <jiasheng.shi@iluvatar.ai>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      36845663
    • kdev_t: always inline major/minor helper functions · aa8c7db4
      Josh Poimboeuf authored
      
      
      Silly GCC doesn't always inline these trivial functions.
      
      Fixes the following warning:
      
        arch/x86/kernel/sys_ia32.o: warning: objtool: cp_stat64()+0xd8: call to new_encode_dev() with UACCESS enabled
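As a rough illustration of forcing such helpers inline, here is a sketch that assumes GCC or Clang. The DEMO_ names, the 20-bit minor field and the packing are assumptions made for the example, not a copy of the kernel header.

```c
#include <stdint.h>

/* Assumption: GCC or Clang, which both accept this attribute spelling. */
#define DEMO_ALWAYS_INLINE inline __attribute__((__always_inline__))

/* Illustrative layout: low 20 bits hold the minor, the rest the major. */
#define DEMO_MINORBITS 20
#define DEMO_MINORMASK ((1U << DEMO_MINORBITS) - 1)

static DEMO_ALWAYS_INLINE uint32_t demo_major(uint32_t dev)
{
    return dev >> DEMO_MINORBITS;
}

static DEMO_ALWAYS_INLINE uint32_t demo_minor(uint32_t dev)
{
    return dev & DEMO_MINORMASK;
}

/* Pack major and minor into one 32-bit value; callers get no real call. */
static DEMO_ALWAYS_INLINE uint32_t demo_encode_dev(uint32_t dev)
{
    uint32_t major = demo_major(dev);
    uint32_t minor = demo_minor(dev);

    return (minor & 0xff) | (major << 8) | ((minor & ~0xffU) << 12);
}
```

Plain "inline" is only a hint, so the compiler may still emit an out-of-line call; the attribute removes that freedom, which is what a checker that forbids calls in certain regions needs.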
      
      Link: https://lkml.kernel.org/r/984353b44a4484d86ba9f73884b7306232e25e30.1608737428.git.jpoimboe@redhat.com
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Reported-by: Randy Dunlap <rdunlap@infradead.org>
      Acked-by: Randy Dunlap <rdunlap@infradead.org>	[build-tested]
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      aa8c7db4
    • sizes.h: add SZ_8G/SZ_16G/SZ_32G macros · 8b0fac44
      Huang Shijie authored
      
      
      Add these macros, since we can use them in drivers.
      
      Link: https://lkml.kernel.org/r/20201229072819.11183-1-sjhuang@iluvatar.ai
      Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8b0fac44
    • local64.h: make <asm/local64.h> mandatory · 87dbc209
      Randy Dunlap authored
      
      
      Make <asm-generic/local64.h> mandatory in include/asm-generic/Kbuild and
      remove all arch/*/include/asm/local64.h arch-specific files since they
      only #include <asm-generic/local64.h>.
      
      This fixes build errors on arch/c6x/ and arch/nios2/ for
      block/blk-iocost.c.
      
      Build-tested on 21 of 25 arch-es.  (tools problems on the others)
      
      Yes, we could even rename <asm-generic/local64.h> to
      <linux/local64.h> and change all #includes to use
      <linux/local64.h> instead.
      
      Link: https://lkml.kernel.org/r/20201227024446.17018-1-rdunlap@infradead.org
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Suggested-by: Christoph Hellwig <hch@infradead.org>
      Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Aurelien Jacquiot <jacquiot.aurelien@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrew Morton <akpm@linux-foun...>
      87dbc209
    • kasan: fix null pointer dereference in kasan_record_aux_stack · 13384f61
      Walter Wu authored
      
      
      Syzbot reported the following [1]:
      
        BUG: kernel NULL pointer dereference, address: 0000000000000008
        #PF: supervisor read access in kernel mode
        #PF: error_code(0x0000) - not-present page
        PGD 2d993067 P4D 2d993067 PUD 19a3c067 PMD 0
        Oops: 0000 [#1] PREEMPT SMP KASAN
        CPU: 1 PID: 3852 Comm: kworker/1:2 Not tainted 5.10.0-syzkaller #0
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
        Workqueue: events free_ipc
        RIP: 0010:kasan_record_aux_stack+0x77/0xb0
      
      Add a null check on the result of kasan_get_alloc_meta() in order to
      avoid the null pointer dereference.
      
      [1] https://syzkaller.appspot.com/x/log.txt?x=10a82a50d00000
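The guard pattern can be sketched as below. The demo_ types and functions are hypothetical stand-ins for illustration only; they are not the KASAN data structures or API.

```c
#include <stddef.h>

/* Hypothetical per-object metadata holding the two latest aux stacks. */
struct demo_alloc_meta {
    unsigned long aux_stack[2];
};

/* Hypothetical lookup: returns NULL when an object carries no metadata. */
static struct demo_alloc_meta *demo_get_alloc_meta(struct demo_alloc_meta *store,
                                                   int has_meta)
{
    return has_meta ? store : NULL;
}

/* Record a stack handle; bail out early instead of dereferencing NULL. */
static int demo_record_aux_stack(struct demo_alloc_meta *meta,
                                 unsigned long stack)
{
    if (!meta)
        return 0;               /* nothing to record into */

    meta->aux_stack[1] = meta->aux_stack[0];
    meta->aux_stack[0] = stack;
    return 1;
}
```

Returning early keeps the recording path best-effort: an object without metadata is skipped rather than crashing the caller.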
      
      Link: https://lkml.kernel.org/r/20201228080018.23041-1-walter-zh.wu@mediatek.com
      Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com>
      Suggested-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      13384f61
    • mm: generalise COW SMC TLB flushing race comment · 111fe718
      Nicholas Piggin authored
      I'm not sure if I'm completely missing something here, but AFAIKS the
      reference to the mysterious "COW SMC race" confuses the issue.  The
      original changelog and mailing list thread didn't help me either.
      
      This SMC race is where the problem was detected, but isn't the general
      problem bigger and more obvious: that the new PTE could be picked up at
      any time by any TLB while entries for the old PTE exist in other TLBs
      before the TLB flush takes effect?
      
      The case where the iTLB and dTLB of a CPU are pointing at different pages
      is an interesting one but follows from the general problem.
      
      The other (minor) thing with the comment: I think it is a bit clearer
      to say what the old code was doing (i.e., it avoids the race as opposed
      to what?).
      
      References: 4ce072f1 ("mm: fix a race condition under SMC + COW")
      Link: https://lkml.kernel.org/r/20201215121119.351650-1-npiggin@gmail.com
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      111fe718
    • mm/mremap.c: fix extent calculation · e05986ee
      Kalesh Singh authored
      When `next < old_addr`, `next - old_addr` arithmetic underflows causing
      `extent` to be incorrect.
      
      Make `extent` the smaller of `next - old_addr` or `old_end - old_addr`.
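The effect of taking the smaller of the two distances can be sketched as follows. Both helpers are hypothetical; the "naive" variant exists only for contrast and is not claimed to be the earlier kernel code.

```c
#include <stdint.h>

/*
 * Contrast case (hypothetical): decide by comparing next with old_end.
 * If next has wrapped below old_addr, the comparison picks the
 * subtraction that wraps around.
 */
static uint64_t demo_extent_naive(uint64_t old_addr, uint64_t old_end,
                                  uint64_t next)
{
    return next > old_end ? old_end - old_addr : next - old_addr;
}

/*
 * Sketch of the described fix: compute both distances and keep the
 * smaller one, so a wrapped next can never exceed the remaining range.
 */
static uint64_t demo_extent_min(uint64_t old_addr, uint64_t old_end,
                                uint64_t next)
{
    uint64_t extent = next - old_addr;   /* may wrap modulo 2^64 */
    uint64_t limit = old_end - old_addr;

    return extent < limit ? extent : limit;
}
```

For example, with 1 MiB left before old_end and a next that has wrapped to 0, the naive helper reports 2 MiB while the min-based helper stays at 1 MiB.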
      
      Link: https://lkml.kernel.org/r/20201219170433.2418867-1-kaleshsingh@google.com
      Fixes: c49dd340 ("mm: speedup mremap on 1GB or larger regions")
      Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Lokesh Gidra <lokeshgidra@google.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Kalesh Singh <kaleshsingh@google.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e05986ee
    • mm: memmap defer init doesn't work as expected · dc2da7b4
      Baoquan He authored
      VMware observed a performance regression during memmap init on their
      platform, and bisected to commit 73a6e474 ("mm: memmap_init:
      iterate over memblock regions rather that check each PFN") causing it.
      
      Before the commit:
      
        [0.033176] Normal zone: 1445888 pages used for memmap
        [0.033176] Normal zone: 89391104 pages, LIFO batch:63
        [0.035851] ACPI: PM-Timer IO Port: 0x448
      
      With the commit:
      
        [0.026874] Normal zone: 1445888 pages used for memmap
        [0.026875] Normal zone: 89391104 pages, LIFO batch:63
        [2.028450] ACPI: PM-Timer IO Port: 0x448
      
      The root cause is that the current memmap defer init doesn't work as
      expected.
      
      Before, memmap_init_zone() was used to do memmap init of one whole zone,
      to initialize all low zones of one numa node, but defer memmap init of
      the last zone in that numa node.  However, since commit 73a6e474,
      function memmap_init() is adapted to iterate over memblock regions
      inside one zone, then call memmap_init_zone() to do memmap init for each
      region.
      
      E.g., on VMware's system, the memory layout is as below: there are two
      memory regions in node 2.  The current code will mistakenly initialize the
      whole 1st region [mem 0xab00000000-0xfcffffffff], then do memmap defer to
      initialize only one memory section on the 2nd region [mem
      0x10000000000-0x1033fffffff].  In fact, we expect only one memory
      section's memmap to be initialized.  That's why the init takes more
      time.
      
      [    0.008842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff]
      [    0.008842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00100000-0xbfffffff]
      [    0.008843] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x55ffffffff]
      [    0.008844] ACPI: SRAT: Node 1 PXM 1 [mem 0x5600000000-0xaaffffffff]
      [    0.008844] ACPI: SRAT: Node 2 PXM 2 [mem 0xab00000000-0xfcffffffff]
      [    0.008845] ACPI: SRAT: Node 2 PXM 2 [mem 0x10000000000-0x1033fffffff]
      
      Now, let's add a parameter 'zone_end_pfn' to memmap_init_zone() to pass
      down the real zone end pfn so that defer_init() can use it to judge
      whether the deferral needs to be applied zone-wide.
      
      Link: https://lkml.kernel.org/r/20201223080811.16211-1-bhe@redhat.com
      Link: https://lkml.kernel.org/r/20201223080811.16211-2-bhe@redhat.com
      Fixes: commit 73a6e474 ("mm: memmap_init: iterate over memblock regions rather that check each PFN")
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Reported-by: Rahul Gopakumar <gopakumarr@vmware.com>
      Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      dc2da7b4
    • mm: add prototype for __add_to_page_cache_locked() · 6d87d0ec
      Souptick Joarder authored
      
      
      Otherwise it causes a gcc warning:
      
        mm/filemap.c:830:14: warning: no previous prototype for `__add_to_page_cache_locked' [-Wmissing-prototypes]
      
      A previous attempt to make this function static led to compilation
      errors when CONFIG_DEBUG_INFO_BTF is enabled because
      __add_to_page_cache_locked() is referred to by BPF code.
      
      Adding a prototype will silence the warning.
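A minimal sketch of the pattern, using a hypothetical function name: GCC's -Wmissing-prototypes warns when a non-static function is defined without an earlier prototype, so declaring the function first (normally in a header) satisfies the check while keeping external linkage.

```c
/*
 * Prototype, as it would normally appear in a shared header.
 * demo_add_locked() is a made-up name used only for this example.
 */
int demo_add_locked(int a, int b);

/*
 * Definition with external linkage. Because the prototype above is
 * visible first, -Wmissing-prototypes has nothing to report, and other
 * translation units can still reference the symbol.
 */
int demo_add_locked(int a, int b)
{
    return a + b;
}
```

Marking the function static would also silence the warning, but only when nothing outside the file refers to it, which the message above says is not the case here.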
      
      Link: https://lkml.kernel.org/r/1608693702-4665-1-git-send-email-jrdr.linux@gmail.com
      Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
      Cc: Alex Shi <alex.shi@linux.alibaba.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6d87d0ec
    • checkpatch: prefer strscpy to strlcpy · 5dbdb2d8
      Joe Perches authored
      
      
      Prefer strscpy over the deprecated strlcpy function.
      
      Link: https://lkml.kernel.org/r/19fe91084890e2c16fe56f960de6c570a93fa99b.camel@perches.com
      Signed-off-by: Joe Perches <joe@perches.com>
      Requested-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5dbdb2d8
    • Revert "kbuild: avoid static_assert for genksyms" · 3a176b94
      Masahiro Yamada authored
      This reverts commit 14dc3983.
      
      Marco Elver had sent a proper fix earlier, and also pointed out
      corner cases:
       "I guess what you propose is simpler, but might still have corner cases
        where we still get warnings. In particular, if some file (for whatever
        reason) does not include build_bug.h and uses a raw _Static_assert(),
        then we still get warnings. E.g. I see 1 user of raw _Static_assert()
        (drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h )."
      
      I believe the raw use of _Static_assert() should be allowed, so this
      should be fixed in genksyms.
      
      Even after commit 14dc3983 ("kbuild: avoid static_assert for
      genksyms"), I confirmed the following test code emits the warning.
      
        ---------------->8----------------
        #include <linux/export.h>
      
        _Static_assert((1 ?: 0), "");
      
        void foo(void) { }
        EXPORT_SYMBOL(foo);
        ---------------->8----------------
      
        WARNING: modpost: EXPORT symbol "foo" [vmlinux] version generation failed, symbol will not be versioned.
      
      Now that commit 869b91992bce ("genksyms: Ignore module scoped
      _Static_assert()") fixed this issue properly, the workaround should
      be reverted.
      
      Link: https://lkml.org/lkml/2020/12/10/845
      Link: https://lkml.kernel.org/r/20201219183911.181442-1-masahiroy@kernel.org
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3a176b94
    • mm/hugetlb: fix deadlock in hugetlb_cow error path · e7dd91c4
      Mike Kravetz authored
      syzbot reported the deadlock here [1].  The issue is in hugetlb cow
      error handling when there are not enough huge pages for the faulting
      task which took the original reservation.  It is possible that other
      (child) tasks could have consumed pages associated with the reservation.
      In this case, we want the task which took the original reservation to
      succeed.  So, we unmap any associated pages in children so that they can
      be used by the faulting task that owns the reservation.
      
      The unmapping code needs to hold i_mmap_rwsem in write mode.  However,
      due to commit c0d0381a ("hugetlbfs: use i_mmap_rwsem for more pmd
      sharing synchronization") we are already holding i_mmap_rwsem in read
      mode when hugetlb_cow is called.
      
      Technically, i_mmap_rwsem does not need to be held in read mode for COW
      mappings as they can not share pmd's.  Modifying the fault code to not
      take i_mmap_rwsem in read mode for COW (and other non-sharable) mappings
      is too involved for a stable fix.
      
      Instead, we simply drop the hugetlb_fault_mutex and i_mmap_rwsem before
      unmapping.  This is OK as it is technically not needed.  They are
      reacquired after unmapping as expected by calling code.  Since this is
      done in an uncommon error path, the overhead of dropping and reacquiring
      mutexes is acceptable.
      
      While making changes, remove redundant BUG_ON after unmap_ref_private.
      
      [1] https://lkml.kernel.org/r/000000000000b73ccc05b5cf8558@google.com
      
      Link: https://lkml.kernel.org/r/4c5781b8-3b00-761e-c0c7-c5edebb6ec1a@oracle.com
      Fixes: c0d0381a ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Reported-by: <syzbot+5eee4145df3c15e96625@syzkaller.appspotmail.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e7dd91c4
    • selftests/vm: fix building protection keys test · 7cf22a1c
      Harish authored
      Commit d8cbe8bf ("tools/testing/selftests/vm: fix build error") tried
      to include an ARCH check for powerpc; however, ARCH is not defined in
      the Makefile before including lib.mk.  This causes the test build to be
      skipped on both x86 and powerpc.
      
      Fix the arch check by using the machine type instead, as it is already
      defined and used in the test.
      
      Link: https://lkml.kernel.org/r/20201215100402.257376-1-harish@linux.ibm.com
      Fixes: d8cbe8bf ("tools/testing/selftests/vm: fix build error")
      Signed-off-by: Harish <harish@linux.ibm.com>
      Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Sandipan Das <sandipan@linux.ibm.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7cf22a1c
    • io_uring: don't assume mm is constant across submits · 77788775
      Jens Axboe authored
      If we COW the identity, we assume that ->mm never changes. But this
      isn't true if multiple processes end up sharing the ring. Hence treat
      id->mm like any other process component when it comes to the
      identity mapping. This is pretty trivial, just moving the existing grab
      into io_grab_identity(), and including a check for the match.
      
      Cc: stable@vger.kernel.org # 5.10
      Fixes: 1e6fa521 ("io_uring: COW io_identity on mismatch")
      Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
      Tested-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      77788775
  6. Dec 29, 2020