  1. Feb 10, 2021
  2. Feb 09, 2021
    • Merge tag 'trace-v5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · e0756cfc
      Linus Torvalds authored
      Pull tracing fix from Steven Rostedt:
       "Fix output of top level event tracing 'enable' file.
      
        When writing a tool for enabling events in the tracing system, an
        anomaly was discovered. The top level event 'enable' file would never
        show '1' when all events were enabled.
      
        The system and event 'enable' files worked as expected.
      
        The reason was that the top level event 'enable' file included the
        'ftrace' tracer events, which are not controlled by the 'enable' file
        and would cause the output to be wrong. This appears to have been a
        bug since it was created"
      
      * tag 'trace-v5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Do not count ftrace events in top level enable output
      e0756cfc
  3. Feb 08, 2021
    • Linux 5.11-rc7 · 92bf2261
      Linus Torvalds authored
      92bf2261
    • Merge tag 'libnvdimm-fixes-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · b75dba7f
      Linus Torvalds authored
      Pull libnvdimm fixes from Dan Williams:
       "A fix for a crash scenario that has been present since the initial
        merge, a minor regression in sysfs attribute visibility, and a fix for
        some flexible array warnings.
      
        The bulk of this pull is an update to the libnvdimm unit test
        infrastructure to test non-ACPI platforms. Given there is zero
        regression risk for test updates, and the tests enable validation of
        bits headed towards the next merge window, I saw no reason to hold the
        new tests back. Santosh originally submitted this before the v5.11
        window opened.
      
        Summary:
      
         - Fix a crash when sysfs accesses race 'dimm' driver probe/remove.
      
         - Fix a regression in 'resource' attribute visibility necessary for
           mapping badblocks and other physical address interrogations.
      
         - Fix some flexible array warnings
      
         - Expand the unit test infrastructure for non-ACPI platforms"
      
      * tag 'libnvdimm-fixes-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        libnvdimm/dimm: Avoid race between probe and available_slots_show()
        ndtest: Add papr health related flags
        ndtest: Add nvdimm control functions
        ndtest: Add regions and mappings to the test buses
        ndtest: Add dimm attributes
        ndtest: Add dimms to the two buses
        ndtest: Add compatability string to treat it as PAPR family
        testing/nvdimm: Add test module for non-nfit platforms
        libnvdimm/namespace: Fix visibility of namespace resource attribute
        libnvdimm/pmem: Remove unused header
        ACPI: NFIT: Fix flexible_array.cocci warnings
      b75dba7f
    • Merge tag 'dma-mapping-5.11-2' of git://git.infradead.org/users/hch/dma-mapping · ff92acb2
      Linus Torvalds authored
      Pull dma-mapping fix from Christoph Hellwig:
       "Fix a 32 vs 64-bit padding issue in the new benchmark code (Barry
        Song)"
      
      * tag 'dma-mapping-5.11-2' of git://git.infradead.org/users/hch/dma-mapping:
        dma-mapping: benchmark: use u8 for reserved field in uAPI structure
      ff92acb2
    • Merge tag 'irq_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · fc6c0ae5
      Linus Torvalds authored
      Pull irq fixes from Borislav Petkov:
      
       - Prevent device managed IRQ allocation helpers from returning IRQ 0
      
       - A fix for MSI activation of PCI endpoints with multiple MSIs
      
      * tag 'irq_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq: Prevent [devm_]irq_alloc_desc from returning irq 0
        genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set
      fc6c0ae5
    • Merge tag 'core_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c6792d44
      Linus Torvalds authored
      Pull syscall entry fixes from Borislav Petkov:
      
       - For syscall user dispatch, separate prctl operation from syscall
         redirection range specification before the API has been made official
         in 5.11.
      
       - Ensure tasks using the generic syscall code do trap after returning
         from a syscall when single-stepping is requested.
      
      * tag 'core_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        entry: Use different define for selector variable in SUD
        entry: Ensure trap after single-step on system call return
      c6792d44
    • Merge tag 'sched_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6fed85df
      Linus Torvalds authored
      Pull scheduler fix from Borislav Petkov:
       "Revert an attempt to not spread IRQ threads on isolated CPUs which has
        a bunch of problems"
      
      * tag 'sched_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        Revert "lib: Restrict cpumask_local_spread to houskeeping CPUs"
      6fed85df
    • Merge tag 'timers_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 814daadb
      Linus Torvalds authored
      Pull timer fixes from Borislav Petkov:
       "Two more timers-related fixes for v5.11:
      
         - Use a freezable workqueue for RTC sync because the sync can happen
           at any time and trigger suspend assertion checks in the i2c
           subsystem.
      
         - Correct a previous RTC validation change to check only bit 6 in
           register D because some Intel machines use bits 0-5"
      
      * tag 'timers_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        ntp: Use freezable workqueue for RTC synchronization
        rtc: mc146818: Dont test for bit 0-5 in Register D
      814daadb
    • Merge tag 'x86_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e24f9c5f
      Linus Torvalds authored
      Pull x86 fixes from Borislav Petkov:
       "I hope this is the last batch of x86/urgent updates for this round:
      
         - Remove superfluous EFI PGD range checks which lead to those
           assertions failing with certain kernel configs and LLVM.
      
         - Disable setting breakpoints on facilities involved in #DB exception
           handling to avoid infinite loops.
      
         - Add extra serialization to non-serializing MSRs (IA32_TSC_DEADLINE
           and x2 APIC MSRs) to adhere to SDM's recommendation and avoid any
           theoretical issues.
      
         - Re-add the EPB MSR reading on turbostat so that it works on older
           kernels which don't have the corresponding EPB sysfs file.
      
         - Add Alder Lake to the list of CPUs which support split lock.
      
         - Fix %dr6 register handling in order to be able to set watchpoints
           with gdb again.
      
         - Disable CET instrumentation in the kernel so that gcc doesn't add
           ENDBR64 to kernel code and thus confuse tracing"
      
      * tag 'x86_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/efi: Remove EFI PGD build time checks
        x86/debug: Prevent data breakpoints on cpu_dr7
        x86/debug: Prevent data breakpoints on __per_cpu_offset
        x86/apic: Add extra serialization for non-serializing MSRs
        tools/power/turbostat: Fallback to an MSR read for EPB
        x86/split_lock: Enable the split lock feature on another Alder Lake CPU
        x86/debug: Fix DR6 handling
        x86/build: Disable CET instrumentation in the kernel
      e24f9c5f
    • Merge tag 'kbuild-fixes-v5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild · 2db138bb
      Linus Torvalds authored
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Use the 'python3' command to invoke python scripts because some
         distributions do not provide the 'python' command any more.
      
       - Clean-up and update documents
      
       - Use pkg-config to search libcrypto
      
       - Fix duplicated debug flags
      
       - Ignore some more stubs in scripts/kallsyms.c
      
      * tag 'kbuild-fixes-v5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kallsyms: fix nonconverging kallsyms table with lld
        kbuild: fix duplicated flags in DEBUG_CFLAGS
        scripts/clang-tools: switch explicitly to Python 3
        kbuild: remove PYTHON variable
        Documentation/llvm: Add a section about supported architectures
        Revert "checkpatch: add check for keyword 'boolean' in Kconfig definitions"
        scripts: use pkg-config to locate libcrypto
        kconfig: mconf: fix HOSTCC call
        doc: gcc-plugins: update gcc-plugins.rst
        kbuild: simplify GCC_PLUGINS enablement in dummy-tools/gcc
        Documentation/Kbuild: Remove references to gcc-plugin.sh
        scripts: switch explicitly to Python 3
      2db138bb
  4. Feb 07, 2021
    • Merge tag '5.11-rc6-smb3' of git://git.samba.org/sfrench/cifs-2.6 · 825b5991
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "Three small smb3 fixes for stable"
      
      * tag '5.11-rc6-smb3' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: report error instead of invalid when revalidating a dentry fails
        smb3: fix crediting for compounding when only one request in flight
        smb3: Fix out-of-bounds bug in SMB2_negotiate()
      825b5991
    • Merge tag 'riscv-for-linus-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · f7455e5d
      Linus Torvalds authored
      Pull RISC-V fixes from Palmer Dabbelt:
       "A handful of fixes for this week:
      
         - A fix to avoid evaluating the VA twice in virt_addr_valid, which
           fixes some WARNs under DEBUG_VIRTUAL.
      
         - Two fixes related to STRICT_KERNEL_RWX: one that fixes some
           permissions when strict is disabled, and one to fix some alignment
           issues when strict is enabled.
      
         - A fix to disallow the selection of MAXPHYSMEM_2GB on RV32, which
           isn't valid any more but may still show up in some oldconfigs.
      
        We still have the HiFive Unleashed ethernet phy reset regression, so
        there will likely be something coming next week"
      
      * tag 'riscv-for-linus-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        RISC-V: Define MAXPHYSMEM_1GB only for RV32
        riscv: Align on L1_CACHE_BYTES when STRICT_KERNEL_RWX
        RISC-V: Fix .init section permission update
        riscv: virt_addr_valid must check the address belongs to linear mapping
      f7455e5d
    • Merge tag 'powerpc-5.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · f06279ea
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - A fix for a change we made to __kernel_sigtramp_rt64() which confused
         glibc's backtrace logic, and also changed the semantics of that
         symbol, which was arguably an ABI break.
      
       - A fix for a stack overwrite in our VSX instruction emulation.
      
       - A couple of fixes for the Makefile logic in the new C VDSO.
      
      Thanks to Masahiro Yamada, Naveen N. Rao, Raoni Fassina Firmino, and
      Ravi Bangoria.
      
      * tag 'powerpc-5.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/64/signal: Fix regression in __kernel_sigtramp_rt64() semantics
        powerpc/vdso64: remove meaningless vgettimeofday.o build rule
        powerpc/vdso: fix unnecessary rebuilds of vgettimeofday.o
        powerpc/sstep: Fix array out of bound warning
      f06279ea
    • Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm · 4a7859ea
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
      
       - Fix latent bug with DC21285 (Footbridge PCI bridge) configuration
         accessors that affects GCC >= 4.9.2
      
       - Fix misplaced tegra_uart_config in decompressor
      
       - Ensure signal page contents are initialised
      
       - Fix kexec oops
      
      * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: kexec: fix oops after TLB are invalidated
        ARM: ensure the signal page contains defined contents
        ARM: 9043/1: tegra: Fix misplaced tegra_uart_config in decompressor
        ARM: footbridge: fix dc21285 PCI configuration accessors
      4a7859ea
    • Merge tag 'usb-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 368afecb
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are some small, last-minute, USB driver fixes for 5.11-rc7
      
        They all resolve issues reported, or are a few new device ids for some
        drivers. They include:
      
         - new device ids for some usb-serial drivers
      
         - xhci fixes for a variety of reported problems
      
         - dwc3 driver bugfixes
      
         - dwc2 driver bugfixes
      
         - usblp driver bugfix
      
         - thunderbolt bugfix
      
         - few other tiny fixes
      
        All have been in linux-next with no reported issues"
      
      * tag 'usb-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: dwc2: Fix endpoint direction check in ep_from_windex
        usb: dwc3: fix clock issue during resume in OTG mode
        xhci: fix bounce buffer usage for non-sg list case
        usb: host: xhci: mvebu: make USB 3.0 PHY optional for Armada 3720
        usb: xhci-mtk: break loop when find the endpoint to drop
        usb: xhci-mtk: skip dropping bandwidth of unchecked endpoints
        usb: renesas_usbhs: Clear pipe running flag in usbhs_pkt_pop()
        USB: gadget: legacy: fix an error code in eth_bind()
        thunderbolt: Fix possible NULL pointer dereference in tb_acpi_add_link()
        USB: serial: option: Adding support for Cinterion MV31
        usb: xhci-mtk: fix unreleased bandwidth data
        usb: gadget: aspeed: add missing of_node_put
        USB: usblp: don't call usb_set_interface if there's a single alt
        USB: serial: cp210x: add pid/vid for WSDA-200-USB
        USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000
      368afecb
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 7c2d1835
      Linus Torvalds authored
      Pull input fixes from Dmitry Torokhov:
       "Nothing terribly interesting, just a few fixups"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: xpad - sync supported devices with fork on GitHub
        Input: ariel-pwrbutton - remove unused variable ariel_pwrbutton_id_table
        Input: goodix - add support for Goodix GT9286 chip
        dt-bindings: input: touchscreen: goodix: Add binding for GT9286 IC
        dt-bindings: input: adc-keys: clarify description
        Input: ili210x - implement pressure reporting for ILI251x
        Input: i8042 - unbreak Pegatron C15B
        Input: st1232 - wait until device is ready before reading resolution
        Input: st1232 - do not read more bytes than needed
        Input: st1232 - fix off-by-one error in resolution handling
      7c2d1835
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 964d069f
      Linus Torvalds authored
      Pull SCSI fix from James Bottomley:
       "One fix in drivers (lpfc) that stops an oops on resource exhaustion"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: lpfc: Fix EEH encountering oops with NVMe traffic
      964d069f
    • Merge tag 'block-5.11-2021-02-05' of git://git.kernel.dk/linux-block · eec79181
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "A few small regression fixes:
      
         - NVMe pull request from Christoph:
             - more quirks for buggy devices (Thorsten Leemhuis, Claus Stovgaard)
             - update the email address for Keith (Keith Busch)
             - fix an out of bounds access in nvmet-tcp (Sagi Grimberg)
      
         - Regression fix for BFQ shallow depth calculations introduced in
           this merge window (Lin)"
      
      * tag 'block-5.11-2021-02-05' of git://git.kernel.dk/linux-block:
        nvmet-tcp: fix out-of-bounds access when receiving multiple h2cdata PDUs
        bfq-iosched: Revert "bfq: Fix computation of shallow depth"
        update the email address for Keith Bush
        nvme-pci: ignore the subsysem NQN on Phison E16
        nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs
      eec79181
    • Merge tag 'io_uring-5.11-2021-02-05' of git://git.kernel.dk/linux-block · 860b45da
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "Two small fixes that should go into 5.11:
      
         - task_work resource drop fix (Pavel)
      
         - identity COW fix (Xiaoguang)"
      
      * tag 'io_uring-5.11-2021-02-05' of git://git.kernel.dk/linux-block:
        io_uring: drop mm/files between task_work_submit
        io_uring: don't modify identity's files uncess identity is cowed
      860b45da
  5. Feb 06, 2021
    • x86/efi: Remove EFI PGD build time checks · 816ef8d7
      Borislav Petkov authored
      
      
      With CONFIG_X86_5LEVEL, CONFIG_UBSAN and CONFIG_UBSAN_UNSIGNED_OVERFLOW
      enabled, clang fails the build with
      
        x86_64-linux-ld: arch/x86/platform/efi/efi_64.o: in function `efi_sync_low_kernel_mappings':
        efi_64.c:(.text+0x22c): undefined reference to `__compiletime_assert_354'
      
      which happens due to -fsanitize=unsigned-integer-overflow being enabled:
      
        -fsanitize=unsigned-integer-overflow: Unsigned integer overflow, where
        the result of an unsigned integer computation cannot be represented
        in its type. Unlike signed integer overflow, this is not undefined
        behavior, but it is often unintentional. This sanitizer does not check
        for lossy implicit conversions performed before such a computation
        (see -fsanitize=implicit-conversion).
      
      and that fires when the (intentional) EFI_VA_START/END defines overflow
      an unsigned long, leading to the assertion expressions not getting
      optimized away (on GCC they do)...
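
      The wraparound that the sanitizer instruments can be illustrated with a
      small sketch. It assumes a 64-bit unsigned long and uses a constant of
      the form "-N GiB", chosen here purely for illustration; the helper names
      are hypothetical and not part of the kernel.

```python
MASK64 = (1 << 64) - 1  # largest value of a 64-bit unsigned long
GIB = 1 << 30


def as_unsigned_long(value):
    """Reduce an integer modulo 2**64, as a 64-bit unsigned long would."""
    return value & MASK64


def wraps(value):
    """True if the mathematical result does not fit in 64 unsigned bits."""
    return not 0 <= value <= MASK64


# Illustrative constant: "-4 GiB" expressed as an unsigned long.
# The computation wraps, which is well defined but is exactly what
# -fsanitize=unsigned-integer-overflow reports.
example_va = as_unsigned_long(-4 * GIB)
```

      The wrapped value is still a perfectly usable address near the top of
      the address space, which is why the overflow is described as intentional.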
      
      However, those checks are superfluous: the runtime services mapping
      code already makes sure the ranges don't overshoot EFI_VA_END as the
      EFI mapping range is hardcoded. On each runtime services call, it is
      switched to the EFI-specific PGD and even if mappings manage to escape
      that last PGD, this won't remain unnoticed for long.
      
      So rip them out.
      
      See https://github.com/ClangBuiltLinux/linux/issues/256 for more info.
      
      Reported-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Nathan Chancellor <nathan@kernel.org>
      Acked-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Nick Desaulniers <ndesaulniers@google.com>
      Tested-by: Nathan Chancellor <nathan@kernel.org>
      Link: http://lkml.kernel.org/r/20210107223424.4135538-1-arnd@kernel.org
      816ef8d7
    • entry: Use different define for selector variable in SUD · 36a6c843
      Gabriel Krisman Bertazi authored
      
      
      Michael Kerrisk suggested that, from an API perspective, it is a bad
      idea to share the PR_SYS_DISPATCH_ defines between the prctl operation
      and the selector variable.
      
      Therefore, define two new constants to be used by SUD's selector variable
      and update the corresponding documentation and test cases.
      
      While this changes the API, syscall user dispatch has never been part of a
      Linux release; it will show up for the first time in 5.11.
      
      Suggested-by: Michael Kerrisk (man-pages) <mtk.manpages@gmail.com>
      Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lore.kernel.org/r/20210205184321.2062251-1-krisman@collabora.com
      
      36a6c843
    • entry: Ensure trap after single-step on system call return · 6342adca
      Gabriel Krisman Bertazi authored
      Commit 29915524 ("entry: Drop usage of TIF flags in the generic syscall
      code") introduced a bug on architectures using the generic syscall entry
      code, in which processes stopped by PTRACE_SYSCALL do not trap on syscall
      return after receiving a TIF_SINGLESTEP.
      
      The reason is that the meaning of TIF_SINGLESTEP flag is overloaded to
      cause the trap after a system call is executed, but since the above commit,
      the syscall handler only checks for the SYSCALL_WORK flags on the exit
      work.
      
      Split the meaning of TIF_SINGLESTEP such that it only means single-step
      mode, and create a new type of SYSCALL_WORK to request a trap immediately
      after a syscall in single-step mode.  In the current implementation, the
      SYSCALL_WORK flag shadows the TIF_SINGLESTEP flag for simplicity.
      
      Update x86 to flip this bit when a tracer enables single stepping.
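
      The split described above can be modelled roughly as follows. This is a
      toy sketch, not kernel code: the flag names, bit values, and the Task
      class are hypothetical stand-ins, and the only behaviour taken from the
      description is that syscall exit work looks at the SYSCALL_WORK flags
      and never at TIF_SINGLESTEP.

```python
from dataclasses import dataclass

# Hypothetical bit values, for illustration only.
SYSCALL_WORK_TRACE = 1 << 0      # stand-in for an existing exit-work flag
SYSCALL_WORK_EXIT_TRAP = 1 << 1  # stand-in for the new "trap after syscall" flag


@dataclass
class Task:
    tif_singlestep: bool = False  # now means "single-step mode" only
    syscall_work: int = 0         # flags consulted on syscall exit


def user_enable_single_step(task):
    """Tracer enables single stepping: set both flags together."""
    task.tif_singlestep = True
    task.syscall_work |= SYSCALL_WORK_EXIT_TRAP


def user_disable_single_step(task):
    """Tracer disables single stepping: clear both flags together."""
    task.tif_singlestep = False
    task.syscall_work &= ~SYSCALL_WORK_EXIT_TRAP


def traps_on_syscall_exit(task):
    """Exit work checks syscall_work only, never tif_singlestep."""
    return bool(task.syscall_work & (SYSCALL_WORK_TRACE | SYSCALL_WORK_EXIT_TRAP))
```

      Setting only tif_singlestep reproduces the reported bug in this model:
      the exit path sees no work and does not trap.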
      
      Fixes: 29915524 ("entry: Drop usage of TIF flags in the generic syscall code")
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Kyle Huey <me@kylehuey.com>
      Link: https://lore.kernel.org/r/87h7mtc9pr.fsf_-_@collabora.com
      6342adca
    • Revert "lib: Restrict cpumask_local_spread to houskeeping CPUs" · 2452483d
      Thomas Gleixner authored
      This reverts commit 1abdfe70.
      
      This change is broken and does not solve any problem it claims to solve.
      
      Robin reported that cpumask_local_spread() now returns any CPU out of
      cpu_possible_mask if NOHZ_FULL is disabled (at runtime or compile
      time). It can also return any offline or not-present CPU in the
      housekeeping mask. Before that it was returning a CPU out of
      cpu_online_mask.
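
      As a rough model of the behaviour the revert restores, the sketch below
      picks the i-th CPU from the online set only. Listing CPUs local to the
      requested node first and wrapping the index are assumptions of this
      sketch, and the function signature is simplified; it is not the
      kernel's implementation.

```python
def cpumask_local_spread(i, node_cpus, online_cpus):
    """Pick the i-th online CPU, listing CPUs local to the node first.

    node_cpus and online_cpus are sets of CPU numbers. Only online CPUs
    can ever be returned, whatever the node mask contains.
    """
    online = sorted(online_cpus)
    if not online:
        raise ValueError("no online CPUs")
    i %= len(online)  # wrap the index instead of running off the end
    local = [cpu for cpu in online if cpu in node_cpus]
    remote = [cpu for cpu in online if cpu not in node_cpus]
    return (local + remote)[i]
```

      With CPUs 0-3 online and CPUs 2-3 local to the node, successive indices
      yield 2, 3, 0, 1; an offline CPU listed in the node mask is never chosen.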
      
      While the function is racy against CPU hotplug if the caller does not
      protect against it, the actual use cases do not care much about it, as
      they use it mostly as a hint for:
      
       - the user space affinity hint which is unused by the kernel
       - memory node selection which is just suboptimal
       - network queue affinity which might fail but is handled gracefully
      
      But the occasional failure vs. hotplug is very different from returning
      anything from cpu_possible_mask, which can obviously contain a large number
      of offline CPUs.
      
      The changelog of the commit claims:
      
       "The current implementation of cpumask_local_spread() does not respect
        the isolated CPUs, i.e., even if a CPU has been isolated for Real-Time
        task, it will return it to the caller for pinning of its IRQ
        threads. Having these unwanted IRQ threads on an isolated CPU adds up
        to a latency overhead."
      
      The only correct part of this changelog is:
      
       "The current implementation of cpumask_local_spread() does not respect
        the isolated CPUs."
      
      Everything else is just disjunct from reality.
      
      Reported-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Nitesh Narayan Lal <nitesh@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: abelits@marvell.com
      Cc: davem@davemloft.net
      Link: https://lore.kernel.org/r/87y2g26tnt.fsf@nanos.tec.linutronix.de
      2452483d
    • Merge branch 'akpm' (patches from Andrew) · 1e0d27fc
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "18 patches.
      
        Subsystems affected by this patch series: mm (hugetlb, compaction,
        vmalloc, shmem, memblock, pagecache, and kasan), mailmap, gcov,
        ubsan, and MAINTAINERS"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        MAINTAINERS/.mailmap: use my @kernel.org address
        mm: hugetlb: fix missing put_page in gather_surplus_pages()
        ubsan: implement __ubsan_handle_alignment_assumption
        kasan: make addr_has_metadata() return true for valid addresses
        kasan: add explicit preconditions to kasan_report()
        mm/filemap: add missing mem_cgroup_uncharge() to __add_to_page_cache_locked()
        mailmap: add entries for Manivannan Sadhasivam
        mailmap: fix name/email for Viresh Kumar
        memblock: do not start bottom-up allocations with kernel_end
        mm: thp: fix MADV_REMOVE deadlock on shmem THP
        init/gcov: allow CONFIG_CONSTRUCTORS on UML to fix module gcov
        mm/vmalloc: separate put pages and flush VM flags
        mm, compaction: move high_pfn to the for loop scope
        mm: migrate: do not migrate HugeTLB page whose refcount is one
        mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
        mm: hugetlb: fix a race between isolating and freeing page
        mm: hugetlb: fix a race between freeing and dissolving the page
        mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
      1e0d27fc
    • tracing: Do not count ftrace events in top level enable output · 256cfdd6
      Steven Rostedt (VMware) authored
      The file /sys/kernel/tracing/events/enable is used to enable all events by
      echoing in "1", or to disable all events by echoing in "0". To know if all
      events are enabled, disabled, or some are enabled but not all of them,
      catting the file should show either "1" (all enabled), "0" (all disabled), or
      "X" (some enabled but not all of them). This works the same as the "enable"
      files in the individual system directories (like tracing/events/sched/enable).
      
      But when all events are enabled, the top level "enable" file shows "X". The
      reason is that it is checking the "ftrace" events, which are special events
      that only exist for their format files. These include the format for the
      function tracer events, that are enabled when the function tracer is
      enabled, but not by the "enable" file. The check includes these events,
      which will always be disabled, and even though all true events are enabled,
      the top level "enable" file will show "X" instead of "1".
      
      To fix this, have the check test the event's flags to see if it has the
      "IGNORE_ENABLE" flag set, and if so, not test it.
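
      The fixed check can be sketched as a small model. The Event class, the
      bit value of IGNORE_ENABLE, and the function name are hypothetical; only
      the rule "skip events that carry the IGNORE_ENABLE flag, then report
      1, 0, or X" comes from the description above.

```python
from dataclasses import dataclass

IGNORE_ENABLE = 1 << 0  # hypothetical bit value standing in for the kernel flag


@dataclass
class Event:
    name: str
    enabled: bool
    flags: int = 0


def top_level_enable(events):
    """Return "1", "0", or "X" for the top level 'enable' file."""
    seen = 0  # bit 0: saw a disabled event, bit 1: saw an enabled event
    for ev in events:
        if ev.flags & IGNORE_ENABLE:
            continue  # e.g. 'ftrace' events, which 'enable' does not control
        seen |= 2 if ev.enabled else 1
    # Sketch choice: with no controllable events at all, report "0".
    return {0: "0", 1: "0", 2: "1", 3: "X"}[seen]
```

      Dropping the IGNORE_ENABLE test from the loop reproduces the bug: an
      always-disabled 'ftrace' event turns an all-enabled result into "X".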
      
      Cc: stable@vger.kernel.org
      Fixes: 553552ce ("tracing: Combine event filter_active and enable into single flags field")
      Reported-by: "Yordan Karadzhov (VMware)" <y.karadz@gmail.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      256cfdd6
    • genirq: Prevent [devm_]irq_alloc_desc from returning irq 0 · 4c7bcb51
      Hans de Goede authored
      Since commit a85a6c86 ("driver core: platform: Clarify that IRQ 0
      is invalid"), having a linux-irq with number 0 will trigger a WARN()
      when calling platform_get_irq*() to retrieve that linux-irq.
      
      Since [devm_]irq_alloc_desc allocs a single irq and since irq 0 is not used
      on some systems, it can return 0, triggering that WARN(). This happens
      e.g. on Intel Bay Trail and Cherry Trail devices using the LPE audio engine
      for HDMI audio:
      
       0 is an invalid IRQ number
       WARNING: CPU: 3 PID: 472 at drivers/base/platform.c:238 platform_get_irq_optional+0x108/0x180
       Modules linked in: snd_hdmi_lpe_audio(+) ...
      
       Call Trace:
        platform_get_irq+0x17/0x30
        hdmi_lpe_audio_probe+0x4a/0x6c0 [snd_hdmi_lpe_audio]
      
       ---[ end trace ceece38854223a0b ]---
      
      Change the 'from' parameter passed to __[devm_]irq_alloc_descs() by the
      [devm_]irq_alloc_desc macros from 0 to 1, so that these macros will no
      longer return 0.
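
      The effect of changing 'from' can be shown with a toy allocator. This is
      a sketch under simple assumptions (a set of used descriptor numbers, a
      small fixed descriptor space, -1 as a stand-in error value); the function
      names are hypothetical and do not mirror the kernel macros exactly.

```python
def alloc_desc_from(allocated, start, nr_irqs=16):
    """Return the lowest free descriptor number >= start and mark it used."""
    for irq in range(start, nr_irqs):
        if irq not in allocated:
            allocated.add(irq)
            return irq
    return -1  # stand-in for an error code: no free descriptor


def irq_alloc_desc_old(allocated):
    # 'from' = 0: if IRQ 0 is unused on this system, it gets handed out.
    return alloc_desc_from(allocated, 0)


def irq_alloc_desc_new(allocated):
    # 'from' = 1: the search starts above 0, so 0 is never returned.
    return alloc_desc_from(allocated, 1)
```

      On a system where descriptor 0 is free, the old variant returns 0 and
      the new one returns 1, which avoids the WARN() shown above.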
      
      Fixes: a85a6c86 ("driver core: platform: Clarify that IRQ 0 is invalid")
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20201221185647.226146-1-hdegoede@redhat.com
      4c7bcb51
    • cifs: report error instead of invalid when revalidating a dentry fails · 21b200d0
      Aurelien Aptel authored
      
      
      Assuming
      - //HOST/a is mounted on /mnt
      - //HOST/b is mounted on /mnt/b
      
      On a slow connection, running 'df' and killing it while it's
      processing /mnt/b can make cifs_get_inode_info() return -ERESTARTSYS.
      
      This triggers the following chain of events:
      => the dentry revalidation fails
      => dentry is put and released
      => superblock associated with the dentry is put
      => /mnt/b is unmounted
      
      This patch makes cifs_d_revalidate() return the error instead of 0
      (invalid) when cifs_revalidate_dentry() fails, except for ENOENT (file
      deleted) and ESTALE (file recreated).
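
      The new return-value rule can be summarised in a few lines. This is a
      simplified model, not the cifs code: the function name is hypothetical,
      and returning 1 for "still valid" follows the usual d_revalidate
      convention rather than anything stated in this message.

```python
import errno

ERESTARTSYS = 512  # kernel-internal errno value; Python's errno module lacks it


def d_revalidate_result(rc):
    """Map the revalidation return code to what d_revalidate reports.

    1 means the dentry is still valid, 0 means invalid (the dentry is
    dropped), and a negative value is an error passed back to the caller.
    """
    if rc == 0:
        return 1
    if rc in (-errno.ENOENT, -errno.ESTALE):
        return 0  # file deleted or recreated: the dentry really is invalid
    return rc  # e.g. -ERESTARTSYS: report the error, keep the dentry
```

      Under the old behaviour every failure mapped to 0, so an interrupted
      'df' invalidated the dentry and set off the unmount chain listed above.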
      
      Signed-off-by: Aurelien Aptel <aaptel@suse.com>
      Suggested-by: Shyam Prasad N <nspmangalore@gmail.com>
      Reviewed-by: Shyam Prasad N <nspmangalore@gmail.com>
      CC: stable@vger.kernel.org
      Signed-off-by: Steve French <stfrench@microsoft.com>
      21b200d0
    • x86/debug: Prevent data breakpoints on cpu_dr7 · 3943abf2
      Lai Jiangshan authored
      local_db_save() is called at the start of exc_debug_kernel(), reads DR7 and
      disables breakpoints to prevent recursion.
      
      When running in a guest (X86_FEATURE_HYPERVISOR), local_db_save() reads the
      per-cpu variable cpu_dr7 to check whether a breakpoint is active or not
      before it accesses DR7.
      
      A data breakpoint on cpu_dr7 therefore results in infinite #DB recursion.
      
      Disallow data breakpoints on cpu_dr7 to prevent that.
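
      One way to picture such a restriction is an address-range overlap test
      applied when a data breakpoint is requested. The sketch below is an
      illustrative model with hypothetical helper names and made-up addresses;
      it is not the kernel's breakpoint validation code.

```python
def within_area(addr, end, base, size):
    """True if the inclusive range [addr, end] overlaps [base, base + size)."""
    return end >= base and addr < base + size


def breakpoint_allowed(bp_addr, bp_len, protected):
    """Reject a data breakpoint that touches any protected (base, size) region."""
    bp_end = bp_addr + bp_len - 1
    return not any(
        within_area(bp_addr, bp_end, base, size) for base, size in protected
    )


# Made-up example: an 8-byte protected variable at address 0x1000.
protected_regions = [(0x1000, 8)]
```

      A breakpoint that covers even one byte of a protected variable is
      refused, while one starting right after it is accepted.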
      
      Fixes: 84b6a349 ("x86/entry: Optimize local_db_save() for virt")
      Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20210204152708.21308-2-jiangshanlai@gmail.com
      3943abf2
    • Lai Jiangshan's avatar
      x86/debug: Prevent data breakpoints on __per_cpu_offset · c4bed4b9
      Lai Jiangshan authored
      When FSGSBASE is enabled, paranoid_entry() fetches the per-CPU GSBASE value
      via __per_cpu_offset or pcpu_unit_offsets.
      
      When a data breakpoint is set on __per_cpu_offset[cpu] (read-write
      operation), the specific CPU will be stuck in an infinite #DB loop.
      
      RCU will try to send an NMI to that CPU, but the NMI does not work
      either because it also relies on paranoid_entry(), which makes the
      hang undebuggable.
      
      Fixes: eaad9812 ("x86/entry/64: Introduce the FIND_PERCPU_BASE macro")
      Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20210204152708.21308-1-jiangshanlai@gmail.com
      c4bed4b9
    • Nathan Chancellor's avatar
      MAINTAINERS/.mailmap: use my @kernel.org address · 654eb3f2
      Nathan Chancellor authored
      
      
      Use my @kernel.org for all points of contact so that I am always
      accessible.
      
      Link: https://lkml.kernel.org/r/20210126212730.2097108-1-nathan@kernel.org
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Acked-by: Nick Desaulniers <ndesaulniers@google.com>
      Acked-by: Miguel Ojeda <ojeda@kernel.org>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      654eb3f2
    • Muchun Song's avatar
      mm: hugetlb: fix missing put_page in gather_surplus_pages() · e558464b
      Muchun Song authored
      When CONFIG_DEBUG_VM is not set, VM_BUG_ON_PAGE() generates no code, so
      an expression passed to it is never evaluated, even if it has side effects.
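
      The pitfall can be reproduced in a few lines of userspace C. This is
      a sketch only: MODEL_DEBUG_VM, MODEL_BUG_ON() and the helper
      functions are hypothetical stand-ins, and the disabled macro is
      assumed to type-check its argument without evaluating it.

```c
#include <assert.h>

/* Hypothetical debug switch standing in for CONFIG_DEBUG_VM. */
#ifdef MODEL_DEBUG_VM
#define MODEL_BUG_ON(cond) assert(!(cond))
#else
/* sizeof does not evaluate its operand, so no code is generated. */
#define MODEL_BUG_ON(cond) ((void)sizeof((cond) ? 1 : 0))
#endif

/* Drops one reference and reports whether the count reached zero. */
int model_put_testzero(int *refcount)
{
	return --(*refcount) == 0;
}

/* Buggy pattern: the decrement lives inside the debug-only macro. */
int buggy_release(int refcount)
{
	MODEL_BUG_ON(!model_put_testzero(&refcount));
	return refcount;
}

/* Fixed pattern: perform the side effect first, then check the result. */
int fixed_release(int refcount)
{
	int zero = model_put_testzero(&refcount);

	MODEL_BUG_ON(!zero);
	return refcount;
}
```

      Built without MODEL_DEBUG_VM, buggy_release() never drops the
      reference, while fixed_release() always does, which is the kind of
      missing put the commit title refers to.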
      
      Link: https://lkml.kernel.org/r/20210126031009.96266-1-songmuchun@bytedance.com
      Fixes: e5dfaceb ("mm/hugetlb.c: just use put_page_testzero() instead of page_count()")
      Signed-off-by: Muchun Song <songmuchun@bytedance.com>
      Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
      Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e558464b
    • Nathan Chancellor's avatar
      ubsan: implement __ubsan_handle_alignment_assumption · 28abcc96
      Nathan Chancellor authored
      When building ARCH=mips 32r2el_defconfig with CONFIG_UBSAN_ALIGNMENT:
      
        ld.lld: error: undefined symbol: __ubsan_handle_alignment_assumption
           referenced by slab.h:557 (include/linux/slab.h:557)
                         main.o:(do_initcalls) in archive init/built-in.a
           referenced by slab.h:448 (include/linux/slab.h:448)
                         do_mounts_rd.o:(rd_load_image) in archive init/built-in.a
           referenced by slab.h:448 (include/linux/slab.h:448)
                         do_mounts_rd.o:(identify_ramdisk_image) in archive init/built-in.a
           referenced 1579 more times
      
      Implement this for the kernel based on LLVM's
      handleAlignmentAssumptionImpl because the kernel is not linked against
      the compiler runtime.
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/1245
      Link: https://github.com/llvm/llvm-project/blob/llvmorg-11.0.1/compiler-rt/lib/ubsan/ubsan_handlers.cpp#L151-L190
      Link: https://lkml.kernel.org/r/20210127224451.2587372-1-nathan@kernel.org
      Si...
      28abcc96
    • Vincenzo Frascino's avatar
      kasan: make addr_has_metadata() return true for valid addresses · b99acdcb
      Vincenzo Frascino authored
      Currently, addr_has_metadata() returns true for every address.  An
      invalid address (e.g.  NULL) passed to the function when KASAN_HW_TAGS
      is enabled leads to a kernel panic.
      
      Make addr_has_metadata() return true for valid addresses only.
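
      The change in behavior can be sketched with a userspace model. The
      names below are hypothetical, and the fixed array is only an
      assumption standing in for whatever the kernel considers a valid
      address; it is not how the kernel performs the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region standing in for memory that has tag metadata. */
unsigned char model_region[64];

/* Model validity check: the address must lie inside model_region. */
bool model_addr_is_valid(const void *addr)
{
	uintptr_t a = (uintptr_t)addr;
	uintptr_t base = (uintptr_t)model_region;

	return a >= base && a < base + sizeof(model_region);
}

/* Before: metadata is claimed for every address, including NULL. */
bool model_addr_has_metadata_old(const void *addr)
{
	(void)addr;
	return true;
}

/* After: only valid addresses are reported as having metadata. */
bool model_addr_has_metadata(const void *addr)
{
	return model_addr_is_valid(addr);
}

/* Usage: a caller can now skip the metadata access for NULL. */
void model_metadata_usage(void)
{
	assert(model_addr_has_metadata_old(NULL));
	assert(!model_addr_has_metadata(NULL));
}
```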
      
      Note: KASAN_HW_TAGS support for vmalloc will be added with a future
      patch.
      
      Link: https://lkml.kernel.org/r/20210126134409.47894-3-vincenzo.frascino@arm.com
      Fixes: 2e903b91 ("kasan, arm64: implement HW_TAGS runtime")
      Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
      Cc: "Paul E . McKenney" <paulmck@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b99acdcb
    • Vincenzo Frascino's avatar
      kasan: add explicit preconditions to kasan_report() · 49c6631d
      Vincenzo Frascino authored
      
      
      Patch series "kasan: Fix metadata detection for KASAN_HW_TAGS", v5.
      
      With the introduction of KASAN_HW_TAGS, kasan_report() currently assumes
      that every location in memory has valid metadata associated with it.
      This is because addr_has_metadata() always returns true.
      
      As a consequence of this, an invalid address (e.g.  NULL pointer
      address) passed to kasan_report() when KASAN_HW_TAGS is enabled, leads
      to a kernel panic.
      
      Example below, based on arm64:
      
         BUG: KASAN: invalid-access in 0x0
         Read at addr 0000000000000000 by task swapper/0/1
         Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
         Mem abort info:
           ESR = 0x96000004
           EC = 0x25: DABT (current EL), IL = 32 bits
           SET = 0, FnV = 0
           EA = 0, S1PTW = 0
         Data abort info:
           ISV = 0, ISS = 0x00000004
           CM = 0, WnR = 0
      
        ...
      
         Call trace:
          mte_get_mem_tag+0x24/0x40
          kasan_report+0x1a4/0x410
          alsa_sound_last_init+0x8c/0xa4
          do_one_initcall+0x50/0x1b0
          kernel_init_freeable+0x1d4/0x23c
          kernel_init+0x14/0x118
          ret_from_fork+0x10/0x34
         Code: d65f03c0 9000f021 f9428021 b6cfff61 (d9600000)
         ---[ end trace 377c8bb45bdd3a1a ]---
         hrtimer: interrupt took 48694256 ns
         note: swapper/0[1] exited with preempt_count 1
         Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
         SMP: stopping secondary CPUs
         Kernel Offset: 0x35abaf140000 from 0xffff800010000000
         PHYS_OFFSET: 0x40000000
         CPU features: 0x0a7e0152,61c0a030
         Memory Limit: none
         ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
      
      This series fixes the behavior of addr_has_metadata() so that it now
      returns true only when the address is valid.
      
      This patch (of 2):
      
      With the introduction of KASAN_HW_TAGS, kasan_report() accesses the
      metadata only when addr_has_metadata() succeeds.
      
      Add a comment to make sure that the preconditions to the function are
      explicitly clarified.
      
      Link: https://lkml.kernel.org/r/20210126134409.47894-1-vincenzo.frascino@arm.com
      Link: https://lkml.kernel.org/r/20210126134409.47894-2-vincenzo.frascino@arm.com
      Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: "Paul E . McKenney" <paulmck@kernel.org>
      Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      49c6631d
    • Waiman Long's avatar
      mm/filemap: add missing mem_cgroup_uncharge() to __add_to_page_cache_locked() · da74240e
      Waiman Long authored
      Commit 3fea5a49 ("mm: memcontrol: convert page cache to a new
      mem_cgroup_charge() API") introduced a bug in __add_to_page_cache_locked()
      causing the following splat:
      
        page dumped because: VM_BUG_ON_PAGE(page_memcg(page))
        pages's memcg:ffff8889a4116000
        ------------[ cut here ]------------
        kernel BUG at mm/memcontrol.c:2924!
        invalid opcode: 0000 [#1] SMP KASAN PTI
        CPU: 35 PID: 12345 Comm: cat Tainted: G S      W I       5.11.0-rc4-debug+ #1
        Hardware name: HP HP Z8 G4 Workstation/81C7, BIOS P60 v01.25 12/06/2017
        RIP: commit_charge+0xf4/0x130
        Call Trace:
          mem_cgroup_charge+0x175/0x770
          __add_to_page_cache_locked+0x712/0xad0
          add_to_page_cache_lru+0xc5/0x1f0
          cachefiles_read_or_alloc_pages+0x895/0x2e10 [cachefiles]
          __fscache_read_or_alloc_pages+0x6c0/0xa00 [fscache]
          __nfs_readpages_from_fscache+0x16d/0x630 [nfs]
          nfs_readpages+0x24e/0x540 [nfs]
          read_pages+0x5b1/0xc40
          page_cache_ra_unbounded+0x460/0x750
          generic_file_buffered_read_get_pages+0x290/0x1710
          generic_file_buffered_read+0x2a9/0xc30
          nfs_file_read+0x13f/0x230 [nfs]
          new_sync_read+0x3af/0x610
          vfs_read+0x339/0x4b0
          ksys_read+0xf1/0x1c0
          do_syscall_64+0x33/0x40
          entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Before that commit, there was a try_charge() and commit_charge() in
      __add_to_page_cache_locked().  These two separate charge functions were
      replaced by a single mem_cgroup_charge().  However, the conversion did
      not add a matching mem_cgroup_uncharge() for the case where the xarray
      insertion fails and the page is released back to the pool.
      
      Fix this by adding a mem_cgroup_uncharge() call when the insertion
      fails.
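
      The shape of the fix is the usual "undo on the error path" pattern,
      sketched here as a userspace model. All names are hypothetical
      stand-ins and the insertion result is passed in as a parameter
      purely for illustration; this is not the mm/filemap.c code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical page with a flag modelling "charged to a memcg". */
struct model_page {
	int charged;
};

/* Model charge: always succeeds and marks the page as charged. */
int model_charge(struct model_page *p)
{
	p->charged = 1;
	return 0;
}

/* Model uncharge: clears the charge again. */
void model_uncharge(struct model_page *p)
{
	p->charged = 0;
}

/* Model insertion; insert_rc != 0 simulates a failed insertion. */
int model_add_to_cache(struct model_page *p, int insert_rc)
{
	int err = model_charge(p);

	if (err)
		return err;
	if (insert_rc) {
		/* Insertion failed: undo the charge before bailing out. */
		model_uncharge(p);
		return insert_rc;
	}
	return 0;
}

/* Usage: a failed insertion must leave the page uncharged. */
void model_add_to_cache_usage(void)
{
	struct model_page p = { 0 };

	assert(model_add_to_cache(&p, -ENOMEM) == -ENOMEM);
	assert(p.charged == 0);
}
```

      Without the model_uncharge() call, a page that failed insertion
      would go back to the pool still marked as charged, which is the
      state the VM_BUG_ON_PAGE(page_memcg(page)) splat above complains
      about.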
      
      Link: https://lkml.kernel.org/r/20210125042441.20030-1-longman@redhat.com
      Fixes: 3fea5a49 ("mm: memcontrol: convert page cache to a new mem_cgroup_charge() API")
      Signed-off-by: Waiman Long <longman@redhat.com>
      Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Miaohe Lin <linmiaohe@huawei.com>
      Cc: Muchun Song <smuchun@gmail.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      da74240e
    • Manivannan Sadhasivam's avatar
      mailmap: add entries for Manivannan Sadhasivam · 9c41e526
      Manivannan Sadhasivam authored
      
      
      Map my personal and work addresses to korg mail address.
      
      Link: https://lkml.kernel.org/r/20210201104640.108556-1-manivannan.sadhasivam@linaro.org
      Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9c41e526
    • Viresh Kumar's avatar
      mailmap: fix name/email for Viresh Kumar · 4c415b9a
      Viresh Kumar authored
      
      
      For some of the patches the email address was misspelled as linaro.com
      instead of linaro.org, and for others Viresh Kumar was written as "viresh
      kumar" (all lowercase).  Fix both with the help of mailmap entries.
      
      Link: https://lkml.kernel.org/r/d6b80b210d7fe0ddc1d4d0b22eff9708c72ef8b3.1612178938.git.viresh.kumar@linaro.org
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4c415b9a
    • Roman Gushchin's avatar
      memblock: do not start bottom-up allocations with kernel_end · 2dcb3964
      Roman Gushchin authored
      With KASLR the kernel image is placed at a random location, so starting
      the bottom-up allocation at kernel_end can result in an allocation
      failure and a warning like this one:
      
        hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
        ------------[ cut here ]------------
        memblock: bottom-up allocation failed, memory hotremove may be affected
        WARNING: CPU: 0 PID: 0 at mm/memblock.c:332 memblock_find_in_range_node+0x178/0x25a
        Modules linked in:
        CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0+ #1169
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
        RIP: 0010:memblock_find_in_range_node+0x178/0x25a
        Code: e9 6d ff ff ff 48 85 c0 0f 85 da 00 00 00 80 3d 9b 35 df 00 00 75 15 48 c7 c7 c0 75 59 88 c6 05 8b 35 df 00 01 e8 25 8a fa ff <0f> 0b 48 c7 44 24 20 ff ff ff ff 44 89 e6 44 89 ea 48 c7 c1 70 5c
        RSP: 0000:ffffffff88803d18 EFLAGS: 00010086 ORIG_RAX: 0000000000000000
        RAX: 0000000000000000 RBX: 0000000240000000 RCX: 00000000ffffdfff
        RDX: 00000000ffffdfff RSI: 00000000ffffffea RDI: 0000000000000046
        RBP: 0000000100000000 R08: ffffffff88922788 R09: 0000000000009ffb
        R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000000000
        R13: 0000000000000000 R14: 0000000080000000 R15: 00000001fb42c000
        FS:  0000000000000000(0000) GS:ffffffff88f71000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: ffffa080fb401000 CR3: 00000001fa80a000 CR4: 00000000000406b0
        Call Trace:
          memblock_alloc_range_nid+0x8d/0x11e
          cma_declare_contiguous_nid+0x2c4/0x38c
          hugetlb_cma_reserve+0xdc/0x128
          flush_tlb_one_kernel+0xc/0x20
          native_set_fixmap+0x82/0xd0
          flat_get_apic_id+0x5/0x10
          register_lapic_address+0x8e/0x97
          setup_arch+0x8a5/0xc3f
          start_kernel+0x66/0x547
          load_ucode_bsp+0x4c/0xcd
          secondary_startup_64_no_verify+0xb0/0xbb
        random: get_random_bytes called from __warn+0xab/0x110 with crng_init=0
        ---[ end trace f151227d0b39be70 ]---
      
      At the same time, the kernel image is protected with memblock_reserve(),
      so we can just start searching at PAGE_SIZE.  In this case the bottom-up
      allocation has the same chance of success as a top-down allocation, so
      there is no reason to fall back in the case of a failure.  Altogether,
      this simplifies the logic.
      
      Link: https://lkml.kernel.org/r/20201217201214.3414100-2-guro@fb.com
      Fixes: 8fabc623 ("powerpc: Ensure that swiotlb buffer is allocated from low memory")
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Wonhyuk Yang <vvghjk1234@gmail.com>
      Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2dcb3964
    • Hugh Dickins's avatar
      mm: thp: fix MADV_REMOVE deadlock on shmem THP · 1c2f6730
      Hugh Dickins authored
      Sergey reported a deadlock between kswapd correctly doing its usual
      lock_page(page) followed by down_read(page->mapping->i_mmap_rwsem), and
      madvise(MADV_REMOVE) on an madvise(MADV_HUGEPAGE) area doing
      down_write(page->mapping->i_mmap_rwsem) followed by lock_page(page).
      
      This happened when shmem_fallocate(punch hole)'s unmap_mapping_range()
      reaches zap_pmd_range()'s call to __split_huge_pmd().  The same deadlock
      could occur when partially truncating a mapped huge tmpfs file, or using
      fallocate(FALLOC_FL_PUNCH_HOLE) on it.
      
      __split_huge_pmd()'s page lock was added in 5.8, to make sure that any
      concurrent use of reuse_swap_page() (holding page lock) could not catch
      the anon THP's mapcounts and swapcounts while they were being split.
      
      Fortunately, reuse_swap_page() is never applied to a shmem or file THP
      (not even by khugepaged, which checks PageSwapCache before calling), and
      anonymous THPs are never created in shmem or file areas: so that
      __split_huge_pmd()'s page lock can only be necessary for anonymous THPs,
      on which there is no risk of deadlock with i_mmap_rwsem.
      
      Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2101161409470.2022@eggly.anvils
      Fixes: c444eb56 ("mm: thp: make the THP mapcount atomic against __split_huge_pmd_locked()")
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Reported-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1c2f6730