  1. Feb 01, 2020
    • mm/migrate: clean up some minor coding style · c23a0c99
      Ralph Campbell authored
      
      
      Fix some comment typos and coding style clean up in preparation for the
      next patch.  No functional changes.
      
      Link: http://lkml.kernel.org/r/20200107211208.24595-3-rcampbell@nvidia.com
      Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
      Acked-by: Chris Down <chris@chrisdown.name>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Bharata B Rao <bharata@linux.ibm.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/migrate: remove useless mask of start address · 872ea707
      Ralph Campbell authored
      
      
      Addresses passed to walk_page_range() callback functions are already
      page aligned and don't need to be masked with PAGE_MASK.
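
      A minimal user-space sketch of why the mask is redundant, assuming 4 KiB
      pages.  The DEMO_* macros and demo_* functions are hypothetical stand-ins
      for PAGE_MASK and a walk_page_range() callback; they are not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for PAGE_SIZE and PAGE_MASK, assuming 4 KiB pages. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  ((uintptr_t)1 << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK  (~(DEMO_PAGE_SIZE - 1))

/* Round an address down to the start of its page. */
static uintptr_t demo_page_align_down(uintptr_t addr)
{
	return addr & DEMO_PAGE_MASK;
}

/* Returns 1 when masking changes nothing, i.e. addr is already page aligned. */
static int demo_mask_is_noop(uintptr_t addr)
{
	return demo_page_align_down(addr) == addr;
}

/* Sketch of a callback that relies on the walker passing an aligned start. */
static uintptr_t demo_callback_start(uintptr_t start)
{
	/* No "start & DEMO_PAGE_MASK" needed: the low bits are already zero. */
	assert((start & ~DEMO_PAGE_MASK) == 0);
	return start;
}
```

      For an aligned start such as 0x3000 the masked and unmasked values are
      identical, so the callback can use start directly.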
      
      Link: http://lkml.kernel.org/r/20200107211208.24595-2-rcampbell@nvidia.com
      Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Bharata B Rao <bharata@linux.ibm.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Chris Down <chris@chrisdown.name>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/huge_memory.c: reduce critical section protected by split_queue_lock · afb97172
      Wei Yang authored
      split_queue_lock protects data in struct deferred_split.  We can release
      the lock after deleting the page from deferred_split_queue.

      This patch moves the THP accounting, which was introduced in commit
      65c45377 ("mm, rmap: account shmem thp pages"), out of the lock
      protection.
      
      Link: http://lkml.kernel.org/r/20200110025516.23996-1-richardw.yang@linux.intel.com
      Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/huge_memory.c: use head to emphasize the purpose of page · a8803e6c
      Wei Yang authored
      
      
      When splitting a huge page, the code checks properties of the page.
      Currently the checks are done on both page and head, without making it
      clear that they apply to the compound page.  If the page passed to
      split_huge_page_to_list() is a tail page, readers need some time to work
      out whether a check is on the compound page or on the tail page itself.

      To make it explicit, use head instead of page for those checks.  After
      this, it is clearer that the checks are on the compound page, while page
      is used to do the split and to dump an error message on failure.
      
      Link: http://lkml.kernel.org/r/20200110032610.26499-2-richardw.yang@linux.intel.com
      Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/huge_memory.c: use head to check huge zero page · cb829624
      Wei Yang authored
      The page could be a tail page.  If this is the case, this BUG_ON will
      never be triggered.
      
      Link: http://lkml.kernel.org/r/20200110032610.26499-1-richardw.yang@linux.intel.com
      Fixes: e9b61f19 ("thp: reintroduce split_huge_page()")
      
      Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, oom: dump stack of victim when reaping failed · 8a7ff02a
      David Rientjes authored
      
      
      When a process cannot be oom reaped, for whatever reason, the list of
      locks that are held is currently dumped to the kernel log.
      
      Much more interesting is the stack trace of the victim that cannot be
      reaped.  If the stack trace is dumped, we have the ability to find
      related occurrences in the same kernel code and hopefully solve the
      issue that is making it wedged.
      
      Dump the stack trace when a process fails to be oom reaped.
      
      Link: http://lkml.kernel.org/r/alpine.DEB.2.21.2001141519280.200484@chino.kir.corp.google.com
      Signed-off-by: David Rientjes <rientjes@google.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memblock: Use __func__ in remaining memblock_dbg() call sites · a090d711
      Anshuman Khandual authored
      
      
      Replace open function name strings with %s (__func__) in all remaining
      memblock_dbg() call sites.
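
      A small user-space sketch of the pattern, assuming snprintf() in place
      of memblock_dbg().  The demo_* names and the message layout are
      hypothetical; only the "%s" plus __func__ idiom comes from the commit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats "<function name>: [start-end]" into buf. */
static int demo_reserve_msg(char *buf, size_t len,
			    unsigned long base, unsigned long size)
{
	assert(buf != NULL && len > 0);
	/* __func__ tracks the real function name, so a rename cannot leave a
	 * stale hard-coded string behind in the debug output. */
	return snprintf(buf, len, "%s: [%#lx-%#lx]", __func__,
			base, base + size - 1);
}

/* Returns 1 if msg starts with "<name>: ", as a log search would expect. */
static int demo_has_func_prefix(const char *msg, const char *name)
{
	size_t n = strlen(name);

	return strncmp(msg, name, n) == 0 && strncmp(msg + n, ": ", 2) == 0;
}
```

      With an open-coded string the prefix has to be updated by hand whenever
      the function is renamed; with __func__ the compiler supplies it.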
      
      Link: http://lkml.kernel.org/r/1578285510-28261-1-git-send-email-anshuman.khandual@arm.com
      Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memblock: define memblock_physmem_add() · 02634a44
      Anshuman Khandual authored
      
      
      On the s390 platform, the memblock.physmem array is built by directly
      calling memblock_add_range(), which is a low-level function not
      intended to be used outside of memblock.  Hence let's conditionally add
      helper functions for the physmem array when HAVE_MEMBLOCK_PHYS_MAP is
      enabled.  Also use MAX_NUMNODES instead of 0 as the node ID, similar to
      memblock_add() and memblock_reserve().  Make memblock_add_range() a
      static function as it is no longer used outside of memblock.
      
      Link: http://lkml.kernel.org/r/1578283835-21969-1-git-send-email-anshuman.khandual@arm.com
      Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Collin Walling <walling@linux.ibm.com>
      Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Cc: Philipp Rudo <prudo@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tools/vm/slabinfo: fix sanity checks enabling · e25974aa
      Daniel Wagner authored
      
      
      The sysfs file name for enabling sanity checking is called
      'sanity_checks' and not 'sanity'.
      
      The name of the file has never changed since the introduction of the
      slub allocator.  Obviously, most people turn the checks on via the
      command line option and not during runtime using slabinfo.
      
      Link: http://lkml.kernel.org/r/20200116131642.642-1-dwagner@suse.de
      Signed-off-by: Daniel Wagner <dwagner@suse.de>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: "Tobin C. Harding" <tobin@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/vmscan: remove unused RECLAIM_OFF/RECLAIM_ZONE · 648b5cf3
      Alex Shi authored
      Commit 1b2ffb78 ("[PATCH] Zone reclaim: Allow modification of zone
      reclaim behavior") defined RECLAIM_OFF/RECLAIM_ZONE, but they were never
      used, so it is better to remove them.
      
      [dwagner@suse.de: fix sanity checks enabling]
        Link: http://lkml.kernel.org/r/20200116131642.642-1-dwagner@suse.de
      [akpm@linux-foundation.org: renumber the bits for neatness]
      Link: http://lkml.kernel.org/r/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com
      Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
      Signed-off-by: Daniel Wagner <dwagner@suse.de>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: "Tobin C. Harding" <tobin@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/vmscan: remove prefetch_prev_lru_page · fffbacc1
      Alex Shi authored
      
      
      This macro was never used in git history.  So better to remove.
      
      Link: http://lkml.kernel.org/r/1579006500-127143-1-git-send-email-alex.shi@linux.alibaba.com
      Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Qian Cai <cai@lca.pw>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/vmscan.c: remove unused return value of shrink_node · 6c9e0907
      Liu Song authored
      
      
      The return value of shrink_node is not used, so remove unnecessary
      operations.
      
      Link: http://lkml.kernel.org/r/20191128143524.3223-1-fishland@aliyun.com
      Signed-off-by: Liu Song <liu.song11@zte.com.cn>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: remove "count" parameter from has_unmovable_pages() · fe4c86c9
      David Hildenbrand authored
      
      
      Now that the memory isolate notifier is gone, the parameter is always 0.
      Drop it and clean up has_unmovable_pages().
      
      Link: http://lkml.kernel.org/r/20191114131911.11783-3-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Pingfan Liu <kernelfans@gmail.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Wei Yang <richardw.yang@linux.intel.com>
      Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Arun KS <arunks@codeaurora.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: remove the memory isolate notifier · 3f9903b9
      David Hildenbrand authored
      
      
      Luckily, we have no users left, so we can get rid of it.  Clean up
      set_migratetype_isolate() a little bit.
      
      Link: http://lkml.kernel.org/r/20191114131911.11783-2-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Pingfan Liu <kernelfans@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/page_alloc: skip non present sections on zone initialization · 3f135355
      Kirill A. Shutemov authored
      
      
      memmap_init_zone() can be called on the ranges with holes during the
      boot.  It will skip any non-valid PFNs one-by-one.  It works fine as
      long as holes are not too big.
      
      But huge holes in the memory map cause a problem.  It takes over 20
      seconds to walk a 32TiB hole.  x86-64 with 5-level paging allows for much
      larger holes in the memory map, which would practically hang the system.
      
      Deferred struct page init doesn't help here.  It only works on the
      present ranges.
      
      Skipping non-present sections would fix the issue.
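
      A toy user-space model of the idea, assuming a tiny section size and a
      fixed presence table.  All demo_* names and sizes are hypothetical; the
      point is only that a hole costs one loop step per section rather than
      one per PFN.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_PAGES_PER_SECTION 8UL	/* hypothetical; real sections are far larger */
#define DEMO_NR_SECTIONS       4UL

/* Sections 1 and 2 form a hole in this toy memory map. */
static const bool demo_section_present[DEMO_NR_SECTIONS] = {
	true, false, false, true
};

static bool demo_pfn_present(unsigned long pfn)
{
	return demo_section_present[pfn / DEMO_PAGES_PER_SECTION];
}

/* First PFN of the section following the one that contains pfn. */
static unsigned long demo_next_section_pfn(unsigned long pfn)
{
	return (pfn / DEMO_PAGES_PER_SECTION + 1) * DEMO_PAGES_PER_SECTION;
}

/* Walks [start, end); returns pages "initialised", stores loop steps in *steps. */
static unsigned long demo_init_range(unsigned long start, unsigned long end,
				     unsigned long *steps)
{
	unsigned long pfn = start, initialised = 0;

	assert(end <= DEMO_NR_SECTIONS * DEMO_PAGES_PER_SECTION);
	*steps = 0;
	while (pfn < end) {
		(*steps)++;
		if (!demo_pfn_present(pfn)) {
			/* Jump over the whole missing section in one step. */
			pfn = demo_next_section_pfn(pfn);
			continue;
		}
		initialised++;
		pfn++;
	}
	return initialised;
}
```

      Walking all 32 PFNs of this map takes 18 steps: 16 for the present
      pages and one for each of the two missing sections.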
      
      Link: http://lkml.kernel.org/r/20191230093828.24613-1-kirill.shutemov@linux.intel.com
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reviewed-by: Baoquan He <bhe@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: "Jin, Zhi" <zhi.jin@intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/early_ioremap.c: use %pa to print resource_size_t variables · 7b69d79f
      Andy Shevchenko authored
      
      
      %pa takes into consideration the special types such as resource_size_t.
      Use this specifier instead of explicit casting.
      
      Link: http://lkml.kernel.org/r/20191209165413.56263-1-andriy.shevchenko@linux.intel.com
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lib/test_kasan.c: fix memory leak in kmalloc_oob_krealloc_more() · 3e21d9a5
      Gustavo A. R. Silva authored
      In case memory resources for _ptr2_ were allocated, release them before
      return.
      
      Notice that in case _ptr1_ happens to be NULL, krealloc() behaves
      exactly like kmalloc().
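
      A user-space analogue of the leak, assuming malloc()/realloc()/free()
      in place of kmalloc()/krealloc()/kfree().  demo_grow_buffer() is a
      hypothetical function, not the test from lib/test_kasan.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 0 on success, -1 on allocation failure.  Never leaks either way. */
static int demo_grow_buffer(size_t size1, size_t size2)
{
	char *ptr1, *ptr2;

	/* Zero-sized requests are implementation-defined; keep them out. */
	assert(size1 > 0 && size2 > 0);

	ptr1 = malloc(size1);
	/* If ptr1 is NULL, realloc(NULL, size2) behaves like malloc(size2). */
	ptr2 = realloc(ptr1, size2);

	if (!ptr1 || !ptr2) {
		/*
		 * ptr2 may own memory even though ptr1 was NULL, so it must be
		 * released too.  If realloc() failed, ptr1 is still valid.
		 */
		free(ptr2 ? ptr2 : ptr1);
		return -1;
	}

	ptr2[size2 - 1] = 'x';	/* touch the last valid byte */
	free(ptr2);		/* ptr1 was consumed by realloc() */
	return 0;
}
```

      The error path is the interesting part: bailing out without releasing
      ptr2 would leak whenever the first allocation failed but the second
      succeeded.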
      
      Addresses-Coverity-ID: 1490594 ("Resource leak")
      Link: http://lkml.kernel.org/r/20200123160115.GA4202@embeddedor
      Fixes: 3f15801c ("lib: add kasan test module")
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, tracing: print symbol name for kmem_alloc_node call_site events · 7e168b9b
      Junyong Sun authored
      
      
      Print the call_site ip of kmem_alloc_node using '%pS' to improve the
      readability of raw slab trace points.
      
      Link: http://lkml.kernel.org/r/1577949568-4518-1-git-send-email-sunjunyong@xiaomi.com
      Signed-off-by: Junyong Sun <sunjunyong@xiaomi.com>
      Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
      Cc: Changbin Du <changbin.du@intel.com>
      Cc: Tim Murray <timmurray@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/page_vma_mapped.c: explicitly compare pfn for normal, hugetlbfs and THP page · 5b8d6e37
      Li Xinhai authored
      
      
      In check_pte(), the pfn of a normal, hugetlbfs or THP page needs to be
      compared.  The current implementation applies the comparison as
      
      - normal 4K page: page_pfn <= pfn < page_pfn + 1
      - hugetlbfs page:  page_pfn <= pfn < page_pfn + HPAGE_PMD_NR
      - THP page: page_pfn <= pfn < page_pfn + HPAGE_PMD_NR
      
      in pfn_in_hpage.  For a hugetlbfs page, it should be page_pfn == pfn.
      
      Now, change pfn_in_hpage to pfn_is_match to highlight that comparison is
      not only for THP and explicitly compare for these cases.
      
      No impact upon current behavior, just make the code clear.  I think it
      is important to make the code clear - comparing hugetlbfs page in range
      page_pfn <= pfn < page_pfn + HPAGE_PMD_NR is confusing.
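
      The explicit comparison can be sketched in user space as follows.  The
      enum, the DEMO_HPAGE_PMD_NR value (512, i.e. 2 MiB THP with 4 KiB base
      pages) and the demo_* names are assumptions for illustration; the real
      helper works on struct page.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_HPAGE_PMD_NR 512UL	/* assumption: 2 MiB THP, 4 KiB base pages */

enum demo_page_kind { DEMO_NORMAL, DEMO_HUGETLB, DEMO_THP };

static bool demo_pfn_is_match(enum demo_page_kind kind,
			      unsigned long page_pfn, unsigned long pfn)
{
	/* Normal and hugetlbfs pages: the pfn must be exactly the page's pfn. */
	if (kind != DEMO_THP)
		return page_pfn == pfn;

	/* THP: any subpage of the compound page counts as a match. */
	return pfn >= page_pfn && pfn - page_pfn < DEMO_HPAGE_PMD_NR;
}

/* Usage examples for the three cases listed in the commit message. */
static void demo_check_examples(void)
{
	assert(demo_pfn_is_match(DEMO_NORMAL, 100, 100));
	assert(!demo_pfn_is_match(DEMO_NORMAL, 100, 101));
	assert(!demo_pfn_is_match(DEMO_HUGETLB, 1024, 1025));
	assert(demo_pfn_is_match(DEMO_THP, 1024, 1535));
	assert(!demo_pfn_is_match(DEMO_THP, 1024, 1536));
}
```

      Spelling out the hugetlbfs case as an equality removes the need to
      reason about why a range check happens to give the same answer there.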
      
      Link: http://lkml.kernel.org/r/1578737885-8890-1-git-send-email-lixinhai.lxh@gmail.com
      Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memcontrol.c: cleanup some useless code · 92855270
      Kaitao Cheng authored
      
      
      Compound pages handling in mem_cgroup_migrate is more convoluted than
      necessary.  The state is duplicated in compound variable and the same
      could be achieved by PageTransHuge check which is trivial and
      hpage_nr_pages is already PageTransHuge aware.
      
      It is much simpler to just use hpage_nr_pages for nr_pages and replace
      the local variable with a direct PageTransHuge check.
      
      Link: http://lkml.kernel.org/r/20191210160450.3395-1-pilgrimtao@gmail.com
      Signed-off-by: Kaitao Cheng <pilgrimtao@gmail.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/swapfile.c: swap_next should increase position index · 10c8d69f
      Vasily Averin authored
      If a seq_file .next function does not change the position index, a read
      after some lseek can generate unexpected output.

      In Aug 2018 NeilBrown noted in commit 1f4aace6 ("fs/seq_file.c: simplify
      seq_file iteration code and interface"): "Some ->next functions
      do not increment *pos when they return NULL...  Note that such ->next
      functions are buggy and should be fixed.  A simple demonstration is
      
        dd if=/proc/swaps bs=1000 skip=1
      
      Choose any block size larger than the size of /proc/swaps.  This will
      always show the whole last line of /proc/swaps"
      
      The described problem still exists.  If you lseek into the middle of the
      last output line, the following read will output the end of the last
      line and then the whole last line once again.
      
        $ dd if=/proc/swaps bs=1  # usual output
        Filename				Type		Size	Used	Priority
        /dev/dm-0                               partition	4194812	97536	-2
        104+0 records in
        104+0 records out
        104 bytes copied
      
        $ dd if=/proc/swaps bs=40 skip=1    # last line was generated twice
        dd: /proc/swaps: cannot skip to specified offset
        v/dm-0                               partition	4194812	97536	-2
        /dev/dm-0                               partition	4194812	97536	-2
        3+1 records in
        3+1 records out
        131 bytes copied
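
      A toy iterator showing the rule that a ->next style function should
      advance the position index even when it returns NULL.  This is a
      user-space sketch with hypothetical demo_* names and a fixed table; it
      is not the swap_next() code.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_ITEMS 3

static const int demo_items[DEMO_NR_ITEMS] = { 10, 20, 30 };

/*
 * Returns the item that follows *pos, or NULL at the end of the table.
 * *pos is advanced in both cases, so the caller can tell from the index
 * alone that the last item has already been consumed.
 */
static const int *demo_next(long *pos)
{
	assert(pos != NULL);

	++*pos;		/* advance first, so the NULL path moves as well */
	if (*pos < 0 || *pos >= DEMO_NR_ITEMS)
		return NULL;
	return &demo_items[*pos];
}
```

      If the increment were done only on the non-NULL path, the index would
      stay on the last item after the end was reached, which is the kind of
      state that lets a later read emit that item a second time.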
      
      https://bugzilla.kernel.org/show_bug.cgi?id=206283
      
      Link: http://lkml.kernel.org/r/bd8cfd7b-ac95-9b91-f9e7-e8438bd5047d@virtuozzo.com
      Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, tree-wide: rename put_user_page*() to unpin_user_page*() · f1f6a7dd
      John Hubbard authored
      
      
      Rename put_user_page*() to unpin_user_page*() in order to provide a
      clearer, more symmetric API for pinning and unpinning DMA pages.  This
      way, pin_user_pages*() calls match up with unpin_user_pages*() calls,
      and the API is a lot closer to being self-explanatory.
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-23-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/gup_benchmark: use proper FOLL_WRITE flags instead of hard-coding "1" · bdffe23e
      John Hubbard authored
      
      
      Fix the gup benchmark flags to use the symbolic FOLL_WRITE, instead of a
      hard-coded "1" value.
      
      Also, clean up the filtering of gup flags a little, by just doing it
      once before issuing any of the get_user_pages*() calls.  This makes it
      harder to overlook, instead of having little "gup_flags & 1" phrases in
      the function calls.
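
      A minimal sketch of filtering the flags once with a symbolic name.
      DEMO_FOLL_WRITE and DEMO_FOLL_OTHER are hypothetical stand-ins defined
      here for illustration; the benchmark itself uses the kernel's own
      FOLL_WRITE definition.

```c
#include <assert.h>

#define DEMO_FOLL_WRITE 0x01u	/* hypothetical stand-in for FOLL_WRITE */
#define DEMO_FOLL_OTHER 0x10u	/* hypothetical unrelated flag bit */

/* Keep only the write flag from whatever the caller requested. */
static unsigned int demo_filter_gup_flags(unsigned int requested)
{
	return requested & DEMO_FOLL_WRITE;
}

/* Usage examples: filter once, then pass the result to every call. */
static void demo_check_examples(void)
{
	assert(demo_filter_gup_flags(DEMO_FOLL_WRITE | DEMO_FOLL_OTHER)
	       == DEMO_FOLL_WRITE);
	assert(demo_filter_gup_flags(DEMO_FOLL_OTHER) == 0);
}
```

      Filtering in one place keeps the intent visible, instead of repeating
      a bare "& 1" at each call site.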
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-22-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • powerpc: book3s64: convert to pin_user_pages() and put_user_page() · aa4b87fe
      John Hubbard authored
      
      
      1. Convert from get_user_pages() to pin_user_pages().
      
      2. As required by pin_user_pages(), release these pages via
         put_user_page().
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-21-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • vfio, mm: pin_user_pages (FOLL_PIN) and put_user_page() conversion · 19fed0da
      John Hubbard authored
      
      
      1. Change vfio from get_user_pages_remote(), to
         pin_user_pages_remote().
      
      2. Because all FOLL_PIN-acquired pages must be released via
         put_user_page(), also convert the put_page() call over to
         put_user_pages_dirty_lock().
      
      Note that this effectively changes the code's behavior in
      vfio_iommu_type1.c: put_pfn(): it now ultimately calls
      set_page_dirty_lock(), instead of set_page_dirty().  This is probably
      more accurate.
      
      As Christoph Hellwig put it, "set_page_dirty() is only safe if we are
      dealing with a file backed page where we have reference on the inode it
      hangs off." [1]
      
      [1] https://lore.kernel.org/r/20190723153640.GB720@lst.de
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-20-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Tested-by: Alex Williamson <alex.williamson@redhat.com>
      Acked-by: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • media/v4l2-core: pin_user_pages (FOLL_PIN) and put_user_page() conversion · 1f815afc
      John Hubbard authored
      
      
      1. Change v4l2 from get_user_pages() to pin_user_pages().
      
      2. Because all FOLL_PIN-acquired pages must be released via
         put_user_page(), also convert the put_page() call over to
         put_user_pages_dirty_lock().
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-19-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • net/xdp: set FOLL_PIN via pin_user_pages() · fb48b474
      John Hubbard authored
      
      
      Convert net/xdp to use the new pin_user_pages() call, which sets
      FOLL_PIN.  Setting FOLL_PIN is now required for code that requires
      tracking of pinned pages.
      
      In partial anticipation of this work, the net/xdp code was already calling
      put_user_page() instead of put_page().  Therefore, in order to convert
      from the get_user_pages()/put_page() model, to the
      pin_user_pages()/put_user_page() model, the only change required here is
      to change get_user_pages() to pin_user_pages().
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-18-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Acked-by: Björn Töpel <bjorn.topel@intel.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fb48b474
    • John Hubbard's avatar
      fs/io_uring: set FOLL_PIN via pin_user_pages() · 2113b05d
      John Hubbard authored
      
      
      Convert fs/io_uring to use the new pin_user_pages() call, which sets
      FOLL_PIN.  Setting FOLL_PIN is now required for code that requires
      tracking of pinned pages, and therefore for any code that calls
      put_user_page().
      
      In partial anticipation of this work, the io_uring code was already
      calling put_user_page() instead of put_page().  Therefore, in order to
      convert from the get_user_pages()/put_page() model, to the
      pin_user_pages()/put_user_page() model, the only change required here is
      to change get_user_pages() to pin_user_pages().
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-17-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: Jens Axboe <axboe@kernel.dk>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2113b05d
    • John Hubbard's avatar
      drm/via: set FOLL_PIN via pin_user_pages_fast() · a5adf0a0
      John Hubbard authored
      
      
      Convert drm/via to use the new pin_user_pages_fast() call, which sets
      FOLL_PIN.  Setting FOLL_PIN is now required for code that requires
      tracking of pinned pages, and therefore for any code that calls
      put_user_page().
      
      In partial anticipation of this work, the drm/via driver was already
      calling put_user_page() instead of put_page().  Therefore, in order to
      convert from the get_user_pages()/put_page() model, to the
      pin_user_pages()/put_user_page() model, the only change required is to
      change get_user_pages_fast() to pin_user_pages_fast().
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-16-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a5adf0a0
    • John Hubbard's avatar
      mm/process_vm_access: set FOLL_PIN via pin_user_pages_remote() · 803e4572
      John Hubbard authored
      
      
      Convert process_vm_access to use the new pin_user_pages_remote() call,
      which sets FOLL_PIN.  Setting FOLL_PIN is now required for code that
      requires tracking of pinned pages.
      
      Also, release the pages via put_user_page*().
      
      Also, rename "pages" to "pinned_pages", as this makes for easier reading
      of process_vm_rw_single_vec().
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-15-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      803e4572
    • John Hubbard's avatar
      IB/{core,hw,umem}: set FOLL_PIN via pin_user_pages*(), fix up ODP · dfa0a4ff
      John Hubbard authored
      
      
      Convert infiniband to use the new pin_user_pages*() calls.
      
      Also, revert earlier changes to Infiniband ODP that had it using
      put_user_page().  ODP is "Case 3" in
      Documentation/core-api/pin_user_pages.rst, which is to say, normal
      get_user_pages() and put_page() is the API to use there.
      
      The new pin_user_pages*() calls replace corresponding get_user_pages*()
      calls, and set the FOLL_PIN flag.  The FOLL_PIN flag requires that the
      caller must return the pages via put_user_page*() calls, but infiniband
      was already doing that as part of an earlier commit.
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-14-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      dfa0a4ff
    • John Hubbard's avatar
      goldfish_pipe: convert to pin_user_pages() and put_user_page() · 57459435
      John Hubbard authored
      
      
      1. Call the new global pin_user_pages_fast(), from
         goldfish_pin_pages().
      
      2. As required by pin_user_pages(), release these pages via
         put_user_page().  In this case, do so via put_user_pages_dirty_lock().
      
      That has the side effect of calling set_page_dirty_lock(), instead of
      set_page_dirty().  This is probably more accurate.
      
      As Christoph Hellwig put it, "set_page_dirty() is only safe if we are
      dealing with a file backed page where we have reference on the inode it
      hangs off." [1]
      
      Another side effect is that the release code is simplified because the
      page[] loop is now in gup.c instead of here, so just delete the local
      release_user_pages() entirely, and call put_user_pages_dirty_lock()
      directly, instead.
      
      [1] https://lore.kernel.org/r/20190723153640.GB720@lst.de
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-13-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      57459435
    • John Hubbard's avatar
      mm/gup: introduce pin_user_pages*() and FOLL_PIN · eddb1c22
      John Hubbard authored
      
      
      Introduce pin_user_pages*() variations of get_user_pages*() calls, and
      also pin_longterm_pages*() variations.
      
      For now, these are placeholder calls, until the various call sites are
      converted to use the correct get_user_pages*() or pin_user_pages*() API.
      
      These variants will eventually all set FOLL_PIN, which is also
      introduced, and thoroughly documented.
      
          pin_user_pages()
          pin_user_pages_remote()
          pin_user_pages_fast()
      
      All pages that are pinned via the above calls, must be unpinned via
      put_user_page().
      
      The underlying rules are:
      
      * FOLL_PIN is a gup-internal flag, so the call sites should not directly
        set it.  That behavior is enforced with assertions.
      
      * Call sites that want to indicate that they are going to do DirectIO
        ("DIO") or something with similar characteristics, should call a
        get_user_pages()-like wrapper call that sets FOLL_PIN.  These wrappers
        will:
      
          * Start with "pin_user_pages" instead of "get_user_pages".  That
            makes it easy to find and audit the call sites.
      
          * Set FOLL_PIN
      
      * For pages that are received via FOLL_PIN, those pages must be returned
        via put_user_page().
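
      The rules above can be modelled in a few lines of userspace C.  This is
      only a sketch under stated assumptions: the model_*() names, the
      simplified signatures, the flag values, and the separate "pins" counter
      are all hypothetical and do not match the kernel's definitions.  The
      commit says the "callers must not set FOLL_PIN" rule is enforced with
      assertions; the model simply returns -1 instead.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values; NOT the kernel's actual definitions. */
#define FOLL_WRITE 0x01u
#define FOLL_PIN   0x02u

/* Simplified stand-in for struct page; "pins" is an assumption of this model. */
struct page {
    int refcount;
    int pins;
};

/* Hypothetical internal worker shared by both wrapper families. */
static inline long model_gup_internal(struct page **pages, size_t n,
                                      unsigned int flags)
{
    for (size_t i = 0; i < n; i++) {
        pages[i]->refcount++;
        if (flags & FOLL_PIN)
            pages[i]->pins++;
    }
    return (long)n;
}

/* get_user_pages()-style wrapper: callers may not pass FOLL_PIN themselves. */
static inline long model_get_user_pages(struct page **pages, size_t n,
                                        unsigned int flags)
{
    if (flags & FOLL_PIN)
        return -1;
    return model_gup_internal(pages, n, flags);
}

/* pin_user_pages()-style wrapper: sets FOLL_PIN on the caller's behalf. */
static inline long model_pin_user_pages(struct page **pages, size_t n,
                                        unsigned int flags)
{
    if (flags & FOLL_PIN)
        return -1;
    return model_gup_internal(pages, n, flags | FOLL_PIN);
}

/* Pages obtained through the pin wrapper are returned through this call. */
static inline void model_put_user_page(struct page *page)
{
    assert(page->pins > 0);
    page->pins--;
    page->refcount--;
}
```

      Keeping FOLL_PIN internal means the only way to obtain a pinned page
      in this model is through a function whose name starts with "pin", which
      is what makes the call sites easy to find and audit.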
      
      Thanks to Jan Kara and Vlastimil Babka for explaining the 4 cases in
      this documentation.  (I've reworded it and expanded upon it.)
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-12-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> [Documentation]
      Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      eddb1c22
    • John Hubbard's avatar
      media/v4l2-core: set pages dirty upon releasing DMA buffers · 3c7470b6
      John Hubbard authored
      
      
      After DMA is complete, and the device and CPU caches are synchronized,
      it's still required to mark the CPU pages as dirty, if the data was
      coming from the device.  However, this driver was just issuing a bare
      put_page() call, without any set_page_dirty*() call.
      
      Fix the problem, by calling set_page_dirty_lock() if the CPU pages were
      potentially receiving data from the device.
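
      A minimal userspace sketch of the release pattern described above.  The
      struct layout, the enum, and release_dma_pages() are hypothetical
      stand-ins; treating a bidirectional transfer as "potentially written by
      the device" is also an assumption of this sketch, not something the
      commit text states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in for the kernel's DMA direction enum. */
enum dma_dir { DIR_TO_DEVICE, DIR_FROM_DEVICE, DIR_BIDIRECTIONAL };

/* Simplified stand-in for struct page. */
struct page {
    int refcount;
    bool dirty;
};

/* Hypothetical release helper: dirty first, then drop the reference. */
static inline void release_dma_pages(struct page **pages, size_t n,
                                     enum dma_dir dir)
{
    /* Assumption: bidirectional transfers may also have written to memory. */
    bool device_wrote = (dir == DIR_FROM_DEVICE || dir == DIR_BIDIRECTIONAL);

    for (size_t i = 0; i < n; i++) {
        if (device_wrote)
            pages[i]->dirty = true;   /* stands in for set_page_dirty_lock() */
        assert(pages[i]->refcount > 0);
        pages[i]->refcount--;         /* stands in for put_page() */
    }
}
```

      The order matters: the page is marked dirty while the caller still
      holds its reference, and only then is the reference dropped.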
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-11-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: <stable@vger.kernel.org>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3c7470b6
    • John Hubbard's avatar
      IB/umem: use get_user_pages_fast() to pin DMA pages · 4789fcdd
      John Hubbard authored
      
      
      And get rid of the mmap_sem calls, as part of that.  Note that
      get_user_pages_fast() will, if necessary, fall back to
      __gup_longterm_unlocked(), which takes the mmap_sem as needed.
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-10-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4789fcdd
    • John Hubbard's avatar
      mm/gup: allow FOLL_FORCE for get_user_pages_fast() · f4000fdf
      John Hubbard authored
      Commit 817be129 ("mm: validate get_user_pages_fast flags") allowed
      only FOLL_WRITE and FOLL_LONGTERM to be passed to get_user_pages_fast().
      This, combined with the fact that get_user_pages_fast() falls back to
      "slow gup", which *does* accept FOLL_FORCE, leads to an odd situation:
      if you need FOLL_FORCE, you cannot call get_user_pages_fast().
      
      There does not appear to be any reason for filtering out FOLL_FORCE.
      There is nothing in the _fast() implementation that requires that we
      avoid writing to the pages.  So it appears to have been an oversight.
      
      Fix by allowing FOLL_FORCE to be set for get_user_pages_fast().
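
      The effect of the fix can be illustrated with a small userspace sketch
      of a flag-mask check.  The flag values, the two mask macros, and
      gup_fast_flags_ok() are illustrative assumptions rather than the
      kernel's definitions; only the set of accepted flags is taken from the
      text above.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values; NOT the kernel's actual definitions. */
#define FOLL_WRITE    0x01u
#define FOLL_FORCE    0x02u
#define FOLL_LONGTERM 0x04u
#define FOLL_OTHER    0x08u  /* hypothetical stand-in for an unsupported flag */

/* Flags accepted before the fix, per the commit text. */
#define GUP_FAST_ALLOWED_OLD (FOLL_WRITE | FOLL_LONGTERM)
/* Flags accepted after the fix: FOLL_FORCE is added. */
#define GUP_FAST_ALLOWED_NEW (GUP_FAST_ALLOWED_OLD | FOLL_FORCE)

/* Hypothetical helper: true when every requested flag is in the mask. */
static inline bool gup_fast_flags_ok(unsigned int gup_flags,
                                     unsigned int allowed)
{
    return (gup_flags & ~allowed) == 0;
}

/* The same request is rejected by the old mask and accepted by the new one. */
static inline void gup_fast_flags_demo(void)
{
    assert(!gup_fast_flags_ok(FOLL_WRITE | FOLL_FORCE, GUP_FAST_ALLOWED_OLD));
    assert(gup_fast_flags_ok(FOLL_WRITE | FOLL_FORCE, GUP_FAST_ALLOWED_NEW));
}
```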
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-9-jhubbard@nvidia.com
      Fixes: 817be129 ("mm: validate get_user_pages_fast flags")
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f4000fdf
    • John Hubbard's avatar
      vfio: fix FOLL_LONGTERM use, simplify get_user_pages_remote() call · 3567813e
      John Hubbard authored
      
      
      Update VFIO to take advantage of the recently loosened restriction on
      FOLL_LONGTERM with get_user_pages_remote().  Also, now it is possible to
      fix a bug: the VFIO caller is logically a FOLL_LONGTERM user, but it
      wasn't setting FOLL_LONGTERM.
      
      Also, remove an unnecessary pair of calls that were releasing and
      reacquiring the mmap_sem.  There is no need to avoid holding mmap_sem
      just in order to call page_to_pfn().
      
      Also, now that the DAX check ("if a VMA is DAX, don't allow long
      term pinning") is in the internals of get_user_pages_remote() and
      __gup_longterm_locked(), there's no need for it at the VFIO call site.  So
      remove it.
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-8-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Tested-by: Alex Williamson <alex.williamson@redhat.com>
      Acked-by: Alex Williamson <alex.williamson@redhat.com>
      Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3567813e
    • John Hubbard's avatar
      mm: fix get_user_pages_remote()'s handling of FOLL_LONGTERM · c4237f8b
      John Hubbard authored
      
      
      As it says in the updated comment in gup.c: current FOLL_LONGTERM
      behavior is incompatible with FAULT_FLAG_ALLOW_RETRY because of the FS
      DAX check requirement on vmas.
      
      However, the corresponding restriction in get_user_pages_remote() was
      slightly stricter than is actually required: it forbade all
      FOLL_LONGTERM callers, but we can actually allow FOLL_LONGTERM callers
      that do not set the "locked" arg.
      
      Update the code and comments to loosen the restriction, allowing
      FOLL_LONGTERM in some cases.
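
      The loosened restriction can be sketched as a pair of predicates.  This
      is a userspace model only: the flag value and both function names are
      hypothetical, and "does not set the locked arg" is modelled as passing
      a NULL pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag value; NOT the kernel's actual definition. */
#define FOLL_LONGTERM 0x01u

/* Old rule: every FOLL_LONGTERM caller is refused. */
static inline bool remote_longterm_ok_old(unsigned int gup_flags,
                                          const int *locked)
{
    (void)locked;  /* the old rule ignores "locked" entirely */
    return !(gup_flags & FOLL_LONGTERM);
}

/* New rule: refuse only FOLL_LONGTERM combined with a non-NULL "locked". */
static inline bool remote_longterm_ok_new(unsigned int gup_flags,
                                          const int *locked)
{
    if ((gup_flags & FOLL_LONGTERM) && locked != NULL)
        return false;
    return true;
}

/* A long-term caller without "locked" is the case that becomes legal. */
static inline void remote_longterm_demo(void)
{
    int locked = 1;

    assert(!remote_longterm_ok_old(FOLL_LONGTERM, NULL));
    assert(remote_longterm_ok_new(FOLL_LONGTERM, NULL));
    assert(!remote_longterm_ok_new(FOLL_LONGTERM, &locked));
}
```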
      
      Also, copy the DAX check ("if a VMA is DAX, don't allow long term
      pinning") from the VFIO call site, all the way into the internals of
      get_user_pages_remote() and __gup_longterm_locked().  That is:
      get_user_pages_remote() calls __gup_longterm_locked(), which in turn
      calls check_dax_vmas().  This check will then be removed from the VFIO
      call site in a subsequent patch.
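
      As a rough userspace model of the call chain just described: a
      long-term request fails if any VMA it touches is DAX.  The struct
      layout, the model_*() names, the flag value, and the choice of
      -EOPNOTSUPP as the error are assumptions made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag value; NOT the kernel's actual definition. */
#define FOLL_LONGTERM 0x01u

/* Simplified stand-in for a VMA; only the DAX property is modelled. */
struct vma {
    bool is_fsdax;
};

/* Models check_dax_vmas(): true if any VMA in the range is DAX. */
static inline bool model_check_dax_vmas(const struct vma *vmas, size_t n)
{
    assert(vmas != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (vmas[i].is_fsdax)
            return true;
    return false;
}

/* Models the long-term path: returns n on success, a negative errno on refusal. */
static inline long model_gup_longterm(const struct vma *vmas, size_t n,
                                      unsigned int gup_flags)
{
    if ((gup_flags & FOLL_LONGTERM) && model_check_dax_vmas(vmas, n))
        return -EOPNOTSUPP;
    return (long)n;
}
```

      With the check inside the shared path, a caller no longer needs its
      own copy of the DAX test.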
      
      Thanks to Jason Gunthorpe for pointing out a clean way to fix this, and
      to Dan Williams for helping clarify the DAX refactoring.
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-7-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Tested-by: Alex Williamson <alex.williamson@redhat.com>
      Acked-by: Alex Williamson <alex.williamson@redhat.com>
      Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c4237f8b
    • John Hubbard's avatar
      goldfish_pipe: rename local pin_user_pages() routine · 1023369c
      John Hubbard authored
      
      
      Avoid naming conflicts: rename local static function from
      "pin_user_pages()" to "goldfish_pin_pages()".
      
      An upcoming patch will introduce a global pin_user_pages() function.
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-6-jhubbard@nvidia.com
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1023369c
    • John Hubbard's avatar
      mm: devmap: refactor 1-based refcounting for ZONE_DEVICE pages · 07d80269
      John Hubbard authored
      
      
      An upcoming patch changes and complicates the refcounting and especially
      the "put page" aspects of it.  In order to keep everything clean,
      refactor the devmap page release routines:
      
      * Rename put_devmap_managed_page() to page_is_devmap_managed(), and
        limit the functionality to "read only": return a bool, with no side
        effects.
      
      * Add a new routine, put_devmap_managed_page(), to handle decrementing
        the refcount for ZONE_DEVICE pages.
      
      * Change callers (just release_pages() and put_page()) to check
        page_is_devmap_managed() before calling the new
        put_devmap_managed_page() routine.  This is a performance point:
        put_page() is a hot path, so we need to avoid non-inline function calls
        where possible.
      
      * Rename __put_devmap_managed_page() to free_devmap_managed_page(), and
        limit the functionality to unconditionally freeing a devmap page.
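
      The split described in the list above might look roughly like this in a
      userspace model.  The struct fields (including the "freed" marker) are
      assumptions made for the sketch; only the function names and the
      1-based refcounting idea come from the commit text.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for struct page; all fields are assumptions. */
struct page {
    int refcount;
    bool devmap_managed;  /* models "is a managed ZONE_DEVICE page" */
    bool freed;           /* set when the page has been released */
};

/* Read-only predicate with no side effects. */
static inline bool page_is_devmap_managed(const struct page *page)
{
    return page->devmap_managed;
}

/* Unconditionally frees a devmap page. */
static inline void free_devmap_managed_page(struct page *page)
{
    page->freed = true;
}

/*
 * Drops one reference on a devmap page.  Refcounting is 1-based here:
 * reaching a count of 1 means the page is idle and can be freed.
 */
static inline void put_devmap_managed_page(struct page *page)
{
    int count = --page->refcount;

    if (count == 1)
        free_devmap_managed_page(page);
}

/* Hot path: cheap inline check first, devmap work only when needed. */
static inline void put_page(struct page *page)
{
    assert(page->refcount > 0);

    if (page_is_devmap_managed(page)) {
        put_devmap_managed_page(page);
        return;
    }
    if (--page->refcount == 0)
        page->freed = true;
}
```

      Because the predicate has no side effects, the common non-devmap case
      pays only for one inline test before taking the ordinary release path.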
      
      This is originally based on a separate patch by Ira Weiny, which applied
      to an early version of the put_user_page() experiments.  Since then,
      Jérôme Glisse suggested the refactoring described above.
      
      Link: http://lkml.kernel.org/r/20200107224558.2362728-5-jhubbard@nvidia.com
      Signed-off-by: Ira Weiny <ira.weiny@intel.com>
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Suggested-by: Jérôme Glisse <jglisse@redhat.com>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Leon Romanovsky <leonro@mellanox.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      07d80269