Skip to content
  1. Nov 16, 2020
  2. Nov 15, 2020
    • Paolo Bonzini's avatar
      kvm: mmu: fix is_tdp_mmu_check when the TDP MMU is not in use · c887c9b9
      Paolo Bonzini authored
      
      
      In some cases where shadow paging is in use, the root page will
      be either mmu->pae_root or vcpu->arch.mmu->lm_root.  Then it will
      not have an associated struct kvm_mmu_page, because it is allocated
      with alloc_page instead of kvm_mmu_alloc_page.
      
      Just return false quickly from is_tdp_mmu_root if the TDP MMU is
      not in use, which also includes the case where shadow paging is
      enabled.
      
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      c887c9b9
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · e28c0d7c
      Linus Torvalds authored
      Merge fixes from Andrew Morton:
       "14 patches.
      
        Subsystems affected by this patch series: mm (migration, vmscan, slub,
        gup, memcg, hugetlbfs), mailmap, kbuild, reboot, watchdog, panic, and
        ocfs2"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        ocfs2: initialize ip_next_orphan
        panic: don't dump stack twice on warn
        hugetlbfs: fix anon huge page migration race
        mm: memcontrol: fix missing wakeup polling thread
        kernel/watchdog: fix watchdog_allowed_mask not used warning
        reboot: fix overflow parsing reboot cpu number
        Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
        compiler.h: fix barrier_data() on clang
        mm/gup: use unpin_user_pages() in __gup_longterm_locked()
        mm/slub: fix panic in slab_alloc_node()
        mailmap: fix entry for Dmitry Baryshkov/Eremin-Solenikov
        mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit
        mm/compaction: stop isolation if too many pages are isolated and we have pages to migrate
        mm/compaction: count pages and stop correctly during page isolation
      e28c0d7c
    • Linus Torvalds's avatar
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 31908a60
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "Two small clk driver fixes:
      
         - Make to_clk_regmap() inline to avoid compiler annoyance
      
         - Fix critical clks on i.MX imx8m SoCs"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: imx8m: fix bus critical clk registration
        clk: define to_clk_regmap() as inline function
      31908a60
    • Linus Torvalds's avatar
      Merge tag 'hwmon-for-v5.10-rc4' of... · 7e908b74
      Linus Torvalds authored
      Merge tag 'hwmon-for-v5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
      
      Pull hwmon fixes from Guenter Roeck:
      
       - Fix potential bufer overflow in pmbus/max20730 driver
      
       - Fix locking issue in pmbus core
      
       - Fix regression causing timeouts in applesmc driver
      
       - Fix RPM calculation in pwm-fan driver
      
       - Restrict counter visibility in amd_energy driver
      
      * tag 'hwmon-for-v5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (amd_energy) modify the visibility of the counters
        hwmon: (applesmc) Re-work SMC comms
        hwmon: (pwm-fan) Fix RPM calculation
        hwmon: (pmbus) Add mutex locking for sysfs reads
        hwmon: (pmbus/max20730) use scnprintf() instead of snprintf()
      7e908b74
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 0c045111
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Three small fixes, all in the embedded ufs driver subsystem"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: ufshcd: Fix missing destroy_workqueue()
        scsi: ufs: Try to save power mode change and UIC cmd completion timeout
        scsi: ufs: Fix unbalanced scsi_block_reqs_cnt caused by ufshcd_hold()
      0c045111
    • Linus Torvalds's avatar
      Merge tag 'selinux-pr-20201113' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux · 30636a59
      Linus Torvalds authored
      Pull selinux fix from Paul Moore:
       "One small SELinux patch to make sure we return an error code when an
        allocation fails. It passes all of our tests, but given the nature of
        the patch that isn't surprising"
      
      * tag 'selinux-pr-20201113' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
        selinux: Fix error return code in sel_ib_pkey_sid_slow()
      30636a59
    • Linus Torvalds's avatar
      Merge tag 'for-linus-5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml · 4aea779d
      Linus Torvalds authored
      Pull uml fix from Richard Weinberger:
       "Call PMD destructor in __pmd_free_tlb()"
      
      * tag 'for-linus-5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
        um: Call pgtable_pmd_page_dtor() in __pmd_free_tlb()
      4aea779d
    • David Howells's avatar
      afs: Fix afs_write_end() when called with copied == 0 [ver #3] · 3ad216ee
      David Howells authored
      When afs_write_end() is called with copied == 0, it tries to set the
      dirty region, but there's no way to actually encode a 0-length region in
      the encoding in page->private.
      
      "0,0", for example, indicates a 1-byte region at offset 0.  The maths
      miscalculates this and sets it incorrectly.
      
      Fix it to just do nothing but unlock and put the page in this case.  We
      don't actually need to mark the page dirty as nothing presumably
      changed.
      
      Fixes: 65dd2d60
      
       ("afs: Alter dirty range encoding in page->private")
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3ad216ee
    • Wengang Wang's avatar
      ocfs2: initialize ip_next_orphan · f5785283
      Wengang Wang authored
      
      
      Though problem if found on a lower 4.1.12 kernel, I think upstream has
      same issue.
      
      In one node in the cluster, there is the following callback trace:
      
         # cat /proc/21473/stack
         __ocfs2_cluster_lock.isra.36+0x336/0x9e0 [ocfs2]
         ocfs2_inode_lock_full_nested+0x121/0x520 [ocfs2]
         ocfs2_evict_inode+0x152/0x820 [ocfs2]
         evict+0xae/0x1a0
         iput+0x1c6/0x230
         ocfs2_orphan_filldir+0x5d/0x100 [ocfs2]
         ocfs2_dir_foreach_blk+0x490/0x4f0 [ocfs2]
         ocfs2_dir_foreach+0x29/0x30 [ocfs2]
         ocfs2_recover_orphans+0x1b6/0x9a0 [ocfs2]
         ocfs2_complete_recovery+0x1de/0x5c0 [ocfs2]
         process_one_work+0x169/0x4a0
         worker_thread+0x5b/0x560
         kthread+0xcb/0xf0
         ret_from_fork+0x61/0x90
      
      The above stack is not reasonable, the final iput shouldn't happen in
      ocfs2_orphan_filldir() function.  Looking at the code,
      
        2067         /* Skip inodes which are already added to recover list, since dio may
        2068          * happen concurrently with unlink/rename */
        2069         if (OCFS2_I(iter)->ip_next_orphan) {
        2070                 iput(iter);
        2071                 return 0;
        2072         }
        2073
      
      The logic thinks the inode is already in recover list on seeing
      ip_next_orphan is non-NULL, so it skip this inode after dropping a
      reference which incremented in ocfs2_iget().
      
      While, if the inode is already in recover list, it should have another
      reference and the iput() at line 2070 should not be the final iput
      (dropping the last reference).  So I don't think the inode is really in
      the recover list (no vmcore to confirm).
      
      Note that ocfs2_queue_orphans(), though not shown up in the call back
      trace, is holding cluster lock on the orphan directory when looking up
      for unlinked inodes.  The on disk inode eviction could involve a lot of
      IOs which may need long time to finish.  That means this node could hold
      the cluster lock for very long time, that can lead to the lock requests
      (from other nodes) to the orhpan directory hang for long time.
      
      Looking at more on ip_next_orphan, I found it's not initialized when
      allocating a new ocfs2_inode_info structure.
      
      This causes te reflink operations from some nodes hang for very long
      time waiting for the cluster lock on the orphan directory.
      
      Fix: initialize ip_next_orphan as NULL.
      
      Signed-off-by: default avatarWengang Wang <wen.gang.wang@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: default avatarJoseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20201109171746.27884-1-wen.gang.wang@oracle.com
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f5785283
    • Christophe Leroy's avatar
      panic: don't dump stack twice on warn · 2f31ad64
      Christophe Leroy authored
      Before commit 3f388f28 ("panic: dump registers on panic_on_warn"),
      __warn() was calling show_regs() when regs was not NULL, and show_stack()
      otherwise.
      
      After that commit, show_stack() is called regardless of whether
      show_regs() has been called or not, leading to duplicated Call Trace:
      
        ------------[ cut here ]------------
        WARNING: CPU: 0 PID: 1 at arch/powerpc/mm/nohash/8xx.c:186 mmu_mark_initmem_nx+0x24/0x94
        CPU: 0 PID: 1 Comm: swapper Not tainted 5.10.0-rc2-s3k-dev-01375-gf46ec0d3ecbd-dirty #4092
        NIP:  c00128b4 LR: c0010228 CTR: 00000000
        REGS: c9023e40 TRAP: 0700   Not tainted  (5.10.0-rc2-s3k-dev-01375-gf46ec0d3ecbd-dirty)
        MSR:  00029032 <EE,ME,IR,DR,RI>  CR: 24000424  XER: 00000000
      
        GPR00: c0010228 c9023ef8 c2100000 0074c000 ffffffff 00000000 c2151000 c07b3880
        GPR08: ff000900 0074c000 c8000000 c33b53a8 24000822 00000000 c0003a20 00000000
        GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
        GPR24: 00000000 00000000...
      2f31ad64
    • Mike Kravetz's avatar
      hugetlbfs: fix anon huge page migration race · 336bf30e
      Mike Kravetz authored
      Qian Cai reported the following BUG in [1]
      
        LTP: starting move_pages12
        BUG: unable to handle page fault for address: ffffffffffffffe0
        ...
        RIP: 0010:anon_vma_interval_tree_iter_first+0xa2/0x170 avc_start_pgoff at mm/interval_tree.c:63
        Call Trace:
          rmap_walk_anon+0x141/0xa30 rmap_walk_anon at mm/rmap.c:1864
          try_to_unmap+0x209/0x2d0 try_to_unmap at mm/rmap.c:1763
          migrate_pages+0x1005/0x1fb0
          move_pages_and_store_status.isra.47+0xd7/0x1a0
          __x64_sys_move_pages+0xa5c/0x1100
          do_syscall_64+0x5f/0x310
          entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Hugh Dickins diagnosed this as a migration bug caused by code introduced
      to use i_mmap_rwsem for pmd sharing synchronization.  Specifically, the
      routine unmap_and_move_huge_page() is always passing the TTU_RMAP_LOCKED
      flag to try_to_unmap() while holding i_mmap_rwsem.  This is wrong for
      anon pages as the anon_vma_lock should be held in this case.  Further
      analysis suggested that i_mmap_rwsem was not required to he held at all
      when calling try_to_unmap for anon pages as an anon page could never be
      part of a shared pmd mapping.
      
      Discussion also revealed that the hack in hugetlb_page_mapping_lock_write
      to drop page lock and acquire i_mmap_rwsem is wrong.  There is no way to
      keep mapping valid while dropping page lock.
      
      This patch does the following:
      
       - Do not take i_mmap_rwsem and set TTU_RMAP_LOCKED for anon pages when
         calling try_to_unmap.
      
       - Remove the hacky code in hugetlb_page_mapping_lock_write. The routine
         will now simply do a 'trylock' while still holding the page lock. If
         the trylock fails, it will return NULL. This could impact the
         callers:
      
          - migration calling code will receive -EAGAIN and retry up to the
            hard coded limit (10).
      
          - memory error code will treat the page as BUSY. This will force
            killing (SIGKILL) instead of SIGBUS any mapping tasks.
      
         Do note that this change in behavior only happens when there is a
         race. None of the standard kernel testing suites actually hit this
         race, but it is possible.
      
      [1] https://lore.kernel.org/lkml/20200708012044.GC992@lca.pw/
      [2] https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2010071833100.2214@eggly.anvils/
      
      Fixes: c0d0381a
      
       ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
      Reported-by: default avatarQian Cai <cai@lca.pw>
      Suggested-by: default avatarHugh Dickins <hughd@google.com>
      Signed-off-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Acked-by: default avatarNaoya Horiguchi <naoya.horiguchi@nec.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20201105195058.78401-1-mike.kravetz@oracle.com
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      336bf30e
    • Muchun Song's avatar
      mm: memcontrol: fix missing wakeup polling thread · 8b21ca02
      Muchun Song authored
      When we poll the swap.events, we can miss being woken up when the swap
      event occurs.  Because we didn't notify.
      
      Fixes: f3a53a3a
      
       ("mm, memcontrol: implement memory.swap.events")
      Signed-off-by: default avatarMuchun Song <songmuchun@bytedance.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: default avatarShakeel Butt <shakeelb@google.com>
      Acked-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Yafang Shao <laoar.shao@gmail.com>
      Cc: Chris Down <chris@chrisdown.name>
      Cc: Tejun Heo <tj@kernel.org>
      Link: https://lkml.kernel.org/r/20201105161936.98312-1-songmuchun@bytedance.com
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8b21ca02