  1. Jul 13, 2019
    • mm, swap: fix race between swapoff and some swap operations · eb085574
      Huang Ying authored
When swapin is performed, after getting the swap entry information from
the page table, the system will swap in the swap entry without any lock
held to prevent the swap device from being swapped off.  This may cause a
race like the one below,
      
      CPU 1				CPU 2
      -----				-----
      				do_swap_page
      				  swapin_readahead
      				    __read_swap_cache_async
      swapoff				      swapcache_prepare
        p->swap_map = NULL		        __swap_duplicate
      					  p->swap_map[?] /* !!! NULL pointer access */
      
Because swapoff is usually done only at system shutdown, the race may
not hit many people in practice.  But it is still a race that needs to be
fixed.
      
To fix the race, get_swap_device() is added to check whether the specified
swap entry is valid in its swap device.  If so, it will keep the swap
entry valid by preventing the swap device from being swapped off, until
put_swap_device() is called.
      
Because swapoff() is a very rare code path, rcu_read_lock/unlock() and
synchronize_rcu() are used instead of a reference count to implement
get/put_swap_device(), so that the normal path runs as fast as possible.
From get_swap_device() to put_swap_device(), the RCU read side is locked,
so synchronize_rcu() in swapoff() will wait until put_swap_device() is
called.
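The control flow described above can be sketched in a compilable userspace mock.  Everything kernel-specific is stubbed: rcu_read_lock()/rcu_read_unlock() are no-ops here, and SWP_VALID, struct swap_info_struct, and the swap_info[] table are simplified stand-ins for the real definitions, so this shows only the shape of get/put_swap_device(), not the actual kernel code.

```c
#include <stddef.h>

/* Stubbed RCU primitives: in the kernel these delimit a read-side
 * critical section that swapoff's synchronize_rcu() must wait out. */
#define rcu_read_lock()    do { } while (0)
#define rcu_read_unlock()  do { } while (0)

enum { SWP_VALID = 1 << 0 };            /* cleared by swapoff */

struct swap_info_struct {
    unsigned long flags;
};

#define MAX_SWAPFILES 32
static struct swap_info_struct *swap_info[MAX_SWAPFILES];

/* Pin the swap device at index @type, or return NULL if it is being
 * (or has been) swapped off.  The caller must later call
 * put_swap_device() to leave the read-side section. */
static struct swap_info_struct *get_swap_device(unsigned int type)
{
    struct swap_info_struct *si;

    if (type >= MAX_SWAPFILES)
        return NULL;

    rcu_read_lock();
    si = swap_info[type];
    if (!si || !(si->flags & SWP_VALID)) {
        rcu_read_unlock();
        return NULL;
    }
    /* Still inside the RCU read-side section: swapoff's
     * synchronize_rcu() now blocks until put_swap_device(). */
    return si;
}

static void put_swap_device(struct swap_info_struct *si)
{
    (void)si;
    rcu_read_unlock();
}
```

With this shape, the NULL-pointer race in the diagram above cannot happen: a caller either gets NULL (and bails out) or holds the device valid until it calls put_swap_device().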
      
In addition to the swap_map, cluster_info, etc. data structures in struct
swap_info_struct, the swap cache radix tree will be freed after swapoff,
so this patch fixes the race between swap cache lookup and swapoff too.
      
Races between some other swap cache usages and swapoff are fixed too by
calling synchronize_rcu() between clearing PageSwapCache() and freeing
the swap cache data structure.
      
Another possible method to fix this is to use preempt_off() +
stop_machine() to prevent the swap device from being swapped off while its
data structures are being accessed.  The overhead in the hot path of both
methods is similar.  The advantages of the RCU-based method are:
      
      1. stop_machine() may disturb the normal execution code path on other
         CPUs.
      
2. File cache uses RCU to protect its radix tree.  If a similar
   mechanism is used for the swap cache too, it is easier to share code
   between them.
      
      3. RCU is used to protect swap cache in total_swapcache_pages() and
         exit_swap_address_space() already.  The two mechanisms can be
         merged to simplify the logic.
      
      Link: http://lkml.kernel.org/r/20190522015423.14418-1-ying.huang@intel.com
Fixes: 235b6217 ("mm/swap: add cluster lock")
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Not-nacked-by: Hugh Dickins <hughd@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Yang Shi <yang.shi@linux.alibaba.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/filemap.c: correct the comment about VM_FAULT_RETRY · a4985833
      Yang Shi authored
Commit 6b4c9f44 ("filemap: drop the mmap_sem for all blocking
operations") changed when mmap_sem is dropped during a filemap page fault
and when VM_FAULT_RETRY is returned.
      
      Correct the comment to reflect the change.
      
      Link: http://lkml.kernel.org/r/1556234531-108228-1-git-send-email-yang.shi@linux.alibaba.com
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • 9p: pass the correct prototype to read_cache_page · f053cbd4
      Christoph Hellwig authored
      
      
      Fix the callback 9p passes to read_cache_page to actually have the
      proper type expected.  Casting around function pointers can easily
      hide typing bugs, and defeats control flow protection.
      
      Link: http://lkml.kernel.org/r/20190520055731.24538-5-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
      Cc: Sami Tolvanen <samitolvanen@google.com>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • jffs2: pass the correct prototype to read_cache_page · 265de8ce
      Christoph Hellwig authored
      
      
      Fix the callback jffs2 passes to read_cache_page to actually have the
      proper type expected.  Casting around function pointers can easily hide
      typing bugs, and defeats control flow protection.
      
      Link: http://lkml.kernel.org/r/20190520055731.24538-4-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/filemap: don't cast ->readpage to filler_t for do_read_cache_page · 6c45b454
      Christoph Hellwig authored
      
      
      We can just pass a NULL filler and do the right thing inside of
      do_read_cache_page based on the NULL parameter.
      
      Link: http://lkml.kernel.org/r/20190520055731.24538-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/filemap.c: fix an overly long line in read_cache_page · d322a8e5
      Christoph Hellwig authored
      
      
      Patch series "fix filler_t callback type mismatches", v2.
      
Casting mapping->a_ops->readpage to filler_t causes an indirect call
type mismatch with Control-Flow Integrity checking.  This change fixes
the mismatch in read_cache_page_gfp and read_mapping_page by using a
NULL filler argument as an indication to call ->readpage directly, and
by passing the right callbacks in nfs and jffs2.
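As a rough illustration of the NULL-filler dispatch, here is a userspace model with struct page, struct file, and the operation tables pared down to minimal mock types (these are not the real kernel definitions; mock_readpage and mock_filler are hypothetical):

```c
#include <stddef.h>

struct page { int filled_by; };         /* 1 = ->readpage, 2 = filler */
struct file;

/* Both callbacks share one signature, which is the point of the fix:
 * no function-pointer cast is needed, so an indirect-call type check
 * (CFI) always sees matching types. */
typedef int (*filler_t)(struct file *file, struct page *page);

struct address_space_operations {
    int (*readpage)(struct file *file, struct page *page);
};

struct address_space {
    const struct address_space_operations *a_ops;
};

static int mock_readpage(struct file *file, struct page *page)
{
    (void)file;
    page->filled_by = 1;
    return 0;
}

static int mock_filler(struct file *file, struct page *page)
{
    (void)file;
    page->filled_by = 2;
    return 0;
}

/* NULL filler means "use the mapping's own ->readpage" -- callers that
 * previously cast ->readpage to filler_t now just pass NULL. */
static int do_read_cache_page(struct address_space *mapping,
                              filler_t filler, struct file *file,
                              struct page *page)
{
    if (filler)
        return filler(file, page);
    return mapping->a_ops->readpage(file, page);
}
```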
      
      This patch (of 4):
      
      Code cleanup.
      
      Link: http://lkml.kernel.org/r/20190520055731.24538-2-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, debug_pagealloc: use a page type instead of page_ext flag · 3972f6bb
      Vlastimil Babka authored
      
      
      When debug_pagealloc is enabled, we currently allocate the page_ext
      array to mark guard pages with the PAGE_EXT_DEBUG_GUARD flag.  Now that
      we have the page_type field in struct page, we can use that instead, as
      guard pages are neither PageSlab nor mapped to userspace.  This reduces
      memory overhead when debug_pagealloc is enabled and there are no other
      features requiring the page_ext array.
      
      Link: http://lkml.kernel.org/r/20190603143451.27353-4-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, page_alloc: more extensive free page checking with debug_pagealloc · 4462b32c
      Vlastimil Babka authored
      The page allocator checks struct pages for expected state (mapcount,
      flags etc) as pages are being allocated (check_new_page()) and freed
      (free_pages_check()) to provide some defense against errors in page
      allocator users.
      
Prior to commits 479f854a ("mm, page_alloc: defer debugging checks of
pages allocated from the PCP") and 4db7548c ("mm, page_alloc: defer
debugging checks of freed pages until a PCP drain"), this happened
for order-0 pages as they were allocated from or freed to the per-cpu
caches (pcplists).  Since those are fast paths, the checks are now
performed only when pages are moved between pcplists and global free
lists.  This however lowers the chances of catching errors soon enough.
      
In order to increase the chances of the checks catching errors, the
kernel has to be rebuilt with CONFIG_DEBUG_VM, which also enables
multiple other internal debug checks (VM_BUG_ON() etc.), which is
suboptimal when the goal is to catch errors in mm users, not in mm code
itself.
      
To catch some wrong users of the page allocator we have
CONFIG_DEBUG_PAGEALLOC, which is designed to have virtually no overhead
unless enabled at boot time.  Memory corruptions from writing to freed
pages often have the same underlying errors (use-after-free, double free)
as corruptions of the corresponding struct pages, so this existing
debugging functionality is a good fit to extend by also performing struct
page checks at least as often as if CONFIG_DEBUG_VM were enabled.
      
      Specifically, after this patch, when debug_pagealloc is enabled on boot,
      and CONFIG_DEBUG_VM disabled, pages are checked when allocated from or
      freed to the pcplists *in addition* to being moved between pcplists and
      free lists.  When both debug_pagealloc and CONFIG_DEBUG_VM are enabled,
      pages are checked when being moved between pcplists and free lists *in
      addition* to when allocated from or freed to the pcplists.
      
When debug_pagealloc is not enabled on boot, the overhead in fast paths
should be virtually zero thanks to the use of a static key.
      
      Link: http://lkml.kernel.org/r/20190603143451.27353-3-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, debug_pagelloc: use static keys to enable debugging · 96a2b03f
      Vlastimil Babka authored
      Patch series "debug_pagealloc improvements".
      
      I have been recently debugging some pcplist corruptions, where it would be
      useful to perform struct page checks immediately as pages are allocated
      from and freed to pcplists, which is now only possible by rebuilding the
      kernel with CONFIG_DEBUG_VM (details in Patch 2 changelog).
      
      To make this kind of debugging simpler in future on a distro kernel, I
      have improved CONFIG_DEBUG_PAGEALLOC so that it has even smaller overhead
      when not enabled at boot time (Patch 1) and also when enabled (Patch 3),
      and extended it to perform the struct page checks more often when enabled
      (Patch 2).  Now it can be configured in when building a distro kernel
      without extra overhead, and debugging page use after free or double free
      can be enabled simply by rebooting with debug_pagealloc=on.
      
      This patch (of 3):
      
CONFIG_DEBUG_PAGEALLOC has been redesigned by commit 031bc574
("mm/debug-pagealloc: make debug-pagealloc boottime configurable") to
allow being always enabled in a distro kernel, but only perform its
expensive functionality when booted with debug_pagealloc=on.  We can
      further reduce the overhead when not boot-enabled (including page
      allocator fast paths) using static keys.  This patch introduces one for
      debug_pagealloc core functionality, and another for the optional guard
      page functionality (enabled by booting with debug_guardpage_minorder=X).
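A minimal userspace stand-in for the static-key pattern the patch introduces.  In the kernel, DEFINE_STATIC_KEY_FALSE/static_branch_unlikely() patch branch instructions at runtime so a disabled key costs only a never-taken branch; here a plain bool and trivial macros merely mimic the interface, and free_page_fastpath() is a hypothetical hook, not a kernel function:

```c
#include <stdbool.h>

static bool _debug_pagealloc_enabled;   /* stand-in for the static key */

/* Mocked kernel macros: the real ones do runtime code patching. */
#define static_branch_unlikely(key) (*(key))
#define static_branch_enable(key)   (*(key) = true)

static inline bool debug_pagealloc_enabled(void)
{
    return static_branch_unlikely(&_debug_pagealloc_enabled);
}

static int checks_performed;

/* Fast-path hook: the expensive struct-page checks run only when the
 * key was enabled (in the kernel, via debug_pagealloc=on at boot). */
static void free_page_fastpath(void)
{
    if (debug_pagealloc_enabled())
        checks_performed++;             /* expensive checks would go here */
}
```

When the key stays disabled, the guarded work is skipped entirely, which is what keeps the always-built-in config nearly free.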
      
      Link: http://lkml.kernel.org/r/20190603143451.27353-2-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/failslab.c: by default, do not fail allocations with direct reclaim only · a9659476
      Nicolas Boichat authored
      When failslab was originally written, the intention of the
      "ignore-gfp-wait" flag default value ("N") was to fail GFP_ATOMIC
      allocations.  Those were defined as (__GFP_HIGH), and the code would test
      for __GFP_WAIT (0x10u).
      
      However, since then, __GFP_WAIT was replaced by __GFP_RECLAIM
      (___GFP_DIRECT_RECLAIM|___GFP_KSWAPD_RECLAIM), and GFP_ATOMIC is now
      defined as (__GFP_HIGH|__GFP_ATOMIC|__GFP_KSWAPD_RECLAIM).
      
      This means that when the flag is false, almost no allocation ever fails
      (as even GFP_ATOMIC allocations contain ___GFP_KSWAPD_RECLAIM).
      
Restore the original intent of the code by only ignoring calls that can
directly reclaim (__GFP_DIRECT_RECLAIM), thus failing GFP_ATOMIC calls
again by default.
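The fixed predicate can be sketched as follows.  The flag bit values below are illustrative only (the real ___GFP_* constants live in include/linux/gfp.h and vary by kernel version), and should_fail_alloc() is a simplified stand-in for the failslab decision logic:

```c
#include <stdbool.h>

/* Illustrative bit values -- NOT the real kernel constants. */
#define __GFP_HIGH            0x01u
#define __GFP_ATOMIC          0x02u
#define ___GFP_DIRECT_RECLAIM 0x04u
#define ___GFP_KSWAPD_RECLAIM 0x08u
#define __GFP_RECLAIM  (___GFP_DIRECT_RECLAIM | ___GFP_KSWAPD_RECLAIM)
#define GFP_ATOMIC     (__GFP_HIGH | __GFP_ATOMIC | ___GFP_KSWAPD_RECLAIM)
#define GFP_KERNEL     __GFP_RECLAIM

static bool ignore_gfp_reclaim = true;  /* "ignore-gfp-wait" default */

/* With the flag set (the default), only allocations that can directly
 * reclaim are exempt from fault injection.  GFP_ATOMIC carries only
 * ___GFP_KSWAPD_RECLAIM, so it is eligible to fail again.  The broken
 * code effectively tested __GFP_RECLAIM here, which exempted almost
 * everything, GFP_ATOMIC included. */
static bool should_fail_alloc(unsigned int gfpflags)
{
    if (ignore_gfp_reclaim && (gfpflags & ___GFP_DIRECT_RECLAIM))
        return false;                   /* exempt from injection */
    return true;                        /* fault injection may fail it */
}
```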
      
      Link: http://lkml.kernel.org/r/20190520214514.81360-1-drinkcat@chromium.org
Fixes: 71baba4b ("mm, page_alloc: rename __GFP_WAIT to __GFP_RECLAIM")
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Reviewed-by: Akino...
    • include/linux/pagemap.h: document trylock_page() return value · f4458845
      Andrew Morton authored
      
      
      Cc: Henry Burns <henryburns@google.com>
      Cc: Jonathan Adams <jwadams@google.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Xidong Wang <wangxidong_97@163.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: remove the exporting of totalram_pages · 98ef2046
      Denis Efremov authored
Previously, totalram_pages was a global variable.  Currently,
totalram_pages is a static inline function defined in include/linux/mm.h.
However, the function is also marked EXPORT_SYMBOL, which is at best an
odd combination.  Because there is no point in exporting a static inline
function from a public header, this commit removes the EXPORT_SYMBOL()
marking.  It will still be possible to use the function in modules
because all the symbols it depends on are exported.
      
      Link: http://lkml.kernel.org/r/20190710141031.15642-1-efremov@linux.com
Fixes: ca79b0c2 ("mm: convert totalram_pages and totalhigh_pages variables to atomic")
Signed-off-by: Denis Efremov <efremov@linux.com>
Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • include/linux/vmpressure.h: use spinlock_t instead of struct spinlock · 51b17629
      Sebastian Andrzej Siewior authored
      
      
For spinlocks, the type spinlock_t should be used instead of "struct
spinlock".

Use spinlock_t for the spinlock definition.
      
      Link: http://lkml.kernel.org/r/20190704153803.12739-3-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/page_isolation.c: change the prototype of undo_isolate_page_range() · 1fcf0a56
      Pingfan Liu authored
      
      
undo_isolate_page_range() never fails, so there is no need to return a value.
      
      Link: http://lkml.kernel.org/r/1562075604-8979-1-git-send-email-kernelfans@gmail.com
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: remove the account_page_dirtied export · ac1c3e49
      Christoph Hellwig authored
      
      
      account_page_dirtied() is only used by our set_page_dirty() helpers and
      should not be used anywhere else.
      
      Link: http://lkml.kernel.org/r/20190605183702.30572-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • include/linux/mm_types.h: ifdef struct vm_area_struct::swap_readahead_info · 219f8a2e
      Alexey Dobriyan authored
      
      
      The field is only used in swap code.
      
      Link: http://lkml.kernel.org/r/20190503190500.GA30589@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: make !CONFIG_HUGE_PAGE wrappers into static inlines · 442a5a9a
      Jason Gunthorpe authored
      
      
Instead of using defines, which lose type safety and provoke
unused-variable warnings from gcc, put the constants into static inlines.
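As a sketch of the conversion.  A define-style stub such as `#define follow_huge_addr(mm, addr, write) ERR_PTR(-EINVAL)` silently discards its arguments without type-checking them and leaves the caller's variables looking unused to gcc; a static inline keeps the same behavior with full type checking.  Types here are pared down for a standalone build, follow_huge_addr() is used as a representative wrapper name, and the -22L cast stands in for ERR_PTR(-EINVAL):

```c
#include <stddef.h>

struct mm_struct;
struct page;

/* Static-inline stub: the arguments are now type-checked and count as
 * "used" at the call site, unlike with the #define version. */
static inline struct page *follow_huge_addr(struct mm_struct *mm,
                                            unsigned long address,
                                            int write)
{
    (void)mm; (void)address; (void)write;
    return (struct page *)-22L;     /* stand-in for ERR_PTR(-EINVAL) */
}
```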
      
      Link: http://lkml.kernel.org/r/20190522235102.GA15370@mellanox.com
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memory.c: trivial clean up in insert_page() · 465fc3a9
      Miklos Szeredi authored
      
      
      Make the success case use the same cleanup path as the failure case.
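The cleanup style can be illustrated with a generic (non-kernel) example: success and failure both leave through one exit label, so the unlock/cleanup code exists exactly once.  insert_record(), lock_table(), and unlock_table() below are hypothetical names for illustration, not the kernel's:

```c
#include <stddef.h>
#include <string.h>

static int table_locked;

static void lock_table(void)   { table_locked = 1; }
static void unlock_table(void) { table_locked = 0; }

/* Both the success path and the error path fall through the single
 * "out" label, so the unlock is never duplicated or forgotten. */
static int insert_record(const char *key, int *out)
{
    int retval;

    lock_table();
    if (key == NULL || key[0] == '\0') {
        retval = -22;               /* stand-in for -EINVAL */
        goto out;
    }
    *out = (int)strlen(key);
    retval = 0;
out:                                /* one cleanup path for both cases */
    unlock_table();
    return retval;
}
```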
      
      Link: http://lkml.kernel.org/r/20190523134024.GC24093@localhost.localdomain
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/gup.c: make follow_page_mask() static · a7030aea
      Bharath Vedartham authored
      
      
follow_page_mask() is only used in gup.c; make it static.
      
      Link: http://lkml.kernel.org/r/20190510190831.GA4061@bharath12345-Inspiron-5559
Signed-off-by: Bharath Vedartham <linux.bhar@gmail.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • sparc: remove ARCH_SELECT_MEMORY_MODEL · 44567607
      Mike Rapoport authored
      
      
The ARCH_SELECT_MEMORY_MODEL option is enabled only for 64-bit.  However,
the 64-bit configuration also enables ARCH_SPARSEMEM_DEFAULT, and there is
no ARCH_FLATMEM_ENABLE in arch/sparc/Kconfig.
      
      With such settings, the dependencies in mm/Kconfig are always evaluated to
      SPARSEMEM=y for 64-bit and to FLATMEM=y for 32-bit.
      
      The ARCH_SELECT_MEMORY_MODEL option in arch/sparc/Kconfig does not affect
      anything and can be removed.
      
      Link: http://lkml.kernel.org/r/1556740577-4140-4-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • s390: remove ARCH_SELECT_MEMORY_MODEL · a9d8777e
      Mike Rapoport authored
The only reason s390 has the ARCH_SELECT_MEMORY_MODEL option in
arch/s390/Kconfig is an ancient compile error with allnoconfig, which was
fixed by commit 97195d6b ("[S390] fix sparsemem related compile error
with allnoconfig on s390") by adding the ARCH_SELECT_MEMORY_MODEL option.
      
Since then a lot has changed, and now allnoconfig builds just fine without
ARCH_SELECT_MEMORY_MODEL, so it can be removed.
      
      Link: http://lkml.kernel.org/r/1556740577-4140-3-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • arm: remove ARCH_SELECT_MEMORY_MODEL · 03069bb0
      Mike Rapoport authored
      
      
      Patch series "remove ARCH_SELECT_MEMORY_MODEL where it has no effect".
      
      For several architectures the ARCH_SELECT_MEMORY_MODEL has no real effect
      because the dependencies for the memory model are always evaluated to a
      single value.
      
      Remove the ARCH_SELECT_MEMORY_MODEL from the Kconfigs for these
      architectures.
      
      This patch (of 3):
      
      The ARCH_SELECT_MEMORY_MODEL in arch/arm/Kconfig is enabled only when
      ARCH_SPARSEMEM_ENABLE=y.  But in this case, ARCH_SPARSEMEM_DEFAULT is also
      enabled and this in turn enables SPARSEMEM_MANUAL.
      
      Since there is no definition of ARCH_FLATMEM_ENABLE in arch/arm/Kconfig,
      SPARSEMEM_MANUAL is the only enabled memory model, hence the final
      selection will evaluate to SPARSEMEM=y.
      
Since ARCH_SPARSEMEM_ENABLE is set to 'y' by only several sub-arch
configurations, the default for most sub-arches would be the fallback to
FLATMEM regardless of ARCH_SELECT_MEMORY_MODEL.
      
      Link: http://lkml.kernel.org/r/1556740577-4140-2-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • include/linux/pfn_t.h: remove pfn_t_to_virt() · 2236b99d
      Andrew Morton authored
      
      
      It has no callers and there is no virt_to_pfn_t().
      
Reported-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/kasan: add object validation in ksize() · 0d4ca4c9
      Marco Elver authored
      
      
      ksize() has been unconditionally unpoisoning the whole shadow memory
      region associated with an allocation.  This can lead to various undetected
      bugs, for example, double-kzfree().
      
      Specifically, kzfree() uses ksize() to determine the actual allocation
      size, and subsequently zeroes the memory.  Since ksize() used to just
      unpoison the whole shadow memory region, no invalid free was detected.
      
      This patch addresses this as follows:
      
      1. Add a check in ksize(), and only then unpoison the memory region.
      
      2. Preserve kasan_unpoison_slab() semantics by explicitly unpoisoning
         the shadow memory region using the size obtained from __ksize().
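A userspace mock of the reordered ksize(): the object is validated *before* its shadow region is unpoisoned, so querying a freed (poisoned) object now returns 0 and would be reported, instead of silently re-validating the memory.  KASAN state is reduced to a single bool and __ksize() to a constant here, so only the ordering of the checks reflects the patch:

```c
#include <stdbool.h>
#include <stddef.h>

static bool object_poisoned;     /* stand-in for the object's shadow state */
static size_t object_size = 64;  /* stand-in for the allocator's __ksize() */

/* Mocked KASAN check: a freed (poisoned) object is an invalid access. */
static bool __kasan_check_read(const void *objp, size_t len)
{
    (void)objp; (void)len;
    return !object_poisoned;
}

static void kasan_unpoison_shadow(const void *objp, size_t size)
{
    (void)objp; (void)size;
    object_poisoned = false;
}

/* Check first, unpoison second: an invalid object is reported (here:
 * size 0 returned) rather than having its shadow memory re-validated,
 * which is what previously hid double-kzfree(). */
static size_t ksize(const void *objp)
{
    if (objp == NULL || !__kasan_check_read(objp, 1))
        return 0;
    kasan_unpoison_shadow(objp, object_size);
    return object_size;
}
```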
      
      Tested:
      1. With SLAB allocator: a) normal boot without warnings; b) verified the
         added double-kzfree() is detected.
      2. With SLUB allocator: a) normal boot without warnings; b) verified the
         added double-kzfree() is detected.
      
      [elver@google.com: s/BUG_ON/WARN_ON_ONCE/, per Kees]
        Link: http://lkml.kernel.org/r/20190627094445.216365-6-elver@google.com
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199359
      Link: http://lkml.kernel.org/r/20190626142014.141844-6-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/slab: refactor common ksize KASAN logic into slab_common.c · 10d1f8cb
      Marco Elver authored
      
      
This refactors the common ksize() code of the various allocators into
slab_common.c: __ksize() is the allocator-specific implementation without
instrumentation, whereas ksize() includes the required KASAN logic.
      
      Link: http://lkml.kernel.org/r/20190626142014.141844-5-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lib/test_kasan: Add test for double-kzfree detection · bb104ed7
      Marco Elver authored
      
      
      Add a simple test that checks if double-kzfree is being detected
      correctly.
      
      Link: http://lkml.kernel.org/r/20190626142014.141844-4-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bb104ed7
    • Marco Elver's avatar
      mm/kasan: change kasan_check_{read,write} to return boolean · b5f6e0fc
      Marco Elver authored
      
      
      This changes {,__}kasan_check_{read,write} functions to return a boolean
      denoting if the access was valid or not.
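
      As a rough userspace sketch (hypothetical mock_ names and a toy stand-in for
      the shadow-memory lookup), the boolean return lets a caller such as ksize(),
      the target of this series, bail out instead of proceeding after an invalid
      access:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool shadow_valid = true;  /* toy stand-in for KASAN shadow memory */

/* After this change, the check reports whether the access was valid
 * instead of returning void. */
static bool mock_kasan_check_read(const volatile void *p, unsigned int size)
{
    (void)size;
    return p != NULL && shadow_valid;
}

/* A caller can now abort early on an invalid object. */
static size_t mock_ksize(const void *object)
{
    if (!mock_kasan_check_read(object, 1))
        return 0;
    return 64;  /* pretend size of a valid object */
}
```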
      
      [sfr@canb.auug.org.au: include types.h for "bool"]
        Link: http://lkml.kernel.org/r/20190705184949.13cdd021@canb.auug.org.au
      Link: http://lkml.kernel.org/r/20190626142014.141844-3-elver@google.com
      Signed-off-by: Marco Elver <elver@google.com>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b5f6e0fc
    • Marco Elver's avatar
      mm/kasan: introduce __kasan_check_{read,write} · 7d8ad890
      Marco Elver authored
      
      
      Patch series "mm/kasan: Add object validation in ksize()", v3.
      
      This patch (of 5):
      
      This introduces __kasan_check_{read,write}.  __kasan_check functions may
      be used from anywhere, even compilation units that disable instrumentation
      selectively.
      
      This change eliminates the need for the __KASAN_INTERNAL definition.
      
      [elver@google.com: v5]
        Link: http://lkml.kernel.org/r/20190708170706.174189-2-elver@google.com
      Link: http://lkml.kernel.org/r/20190626142014.141844-2-elver@google.com
      Signed-off-by: Marco Elver <elver@google.com>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7d8ad890
    • Marco Elver's avatar
      asm-generic, x86: add bitops instrumentation for KASAN · 751ad98d
      Marco Elver authored
      
      
      This adds a new header to asm-generic to allow optionally instrumenting
      architecture-specific asm implementations of bitops.
      
      This change includes the required change for x86 as reference and
      changes the kernel API doc to point to bitops-instrumented.h instead.
      Rationale: the functions in x86's bitops.h are no longer the kernel API
      functions, but instead the arch_ prefixed functions, which are then
      instrumented via bitops-instrumented.h.
      
      Other architectures can similarly add support for asm implementations of
      bitops.
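
      The wrapper pattern can be sketched in userspace (mock_kasan_check_write is
      a hypothetical stand-in for the real KASAN check; the kernel's arch
      implementation would be inline asm):

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(long) * CHAR_BIT)
#define BIT_WORD(nr)  ((nr) / BITS_PER_LONG)

/* Uninstrumented, arch_ prefixed implementation. */
static void arch_set_bit(long nr, volatile unsigned long *addr)
{
    addr[BIT_WORD(nr)] |= 1UL << (nr % BITS_PER_LONG);
}

static unsigned int checked_bytes;  /* records what KASAN was asked to check */

static void mock_kasan_check_write(const volatile void *addr, unsigned int size)
{
    (void)addr;
    checked_bytes += size;
}

/* bitops-instrumented.h pattern: the kernel-API name first checks the
 * accessed word, then calls the arch_ prefixed implementation. */
static void set_bit(long nr, volatile unsigned long *addr)
{
    mock_kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
    arch_set_bit(nr, addr);
}
```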
      
      The documentation text was derived from x86 and existing bitops
      asm-generic versions: 1) references to x86 have been removed; 2) as a
      result, some of the text had to be reworded for clarity and consistency.
      
      Tested using lib/test_kasan with bitops tests (pre-requisite patch).
      Bugzilla ref: https://bugzilla.kernel.org/show_bug.cgi?id=198439
      
      Link: http://lkml.kernel.org/r/20190613125950.197667-4-elver@google.com
      Signed-off-by: Marco Elver <elver@google.com>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      751ad98d
    • Marco Elver's avatar
      x86: use static_cpu_has in uaccess region to avoid instrumentation · ff661350
      Marco Elver authored
      
      
      This patch is a pre-requisite for enabling KASAN bitops instrumentation;
      using static_cpu_has instead of boot_cpu_has avoids instrumentation of
      test_bit inside the uaccess region.  With instrumentation, the KASAN
      check would otherwise be flagged by objtool.
      
      For consistency, kernel/signal.c was changed to mirror this change;
      however, it is never instrumented with KASAN (which is currently
      unsupported on 32-bit x86).
      
      Link: http://lkml.kernel.org/r/20190613125950.197667-3-elver@google.com
      Signed-off-by: Marco Elver <elver@google.com>
      Suggested-by: H. Peter Anvin <hpa@zytor.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ff661350
    • Marco Elver's avatar
      lib/test_kasan: add bitops tests · 19a33ca6
      Marco Elver authored
      
      
      Patch series "Bitops instrumentation for KASAN", v5.
      
      This patch (of 3):
      
      This adds bitops tests to the test_kasan module.  In a follow-up patch,
      support for bitops instrumentation will be added.
      
      Link: http://lkml.kernel.org/r/20190613125950.197667-2-elver@google.com
      Signed-off-by: Marco Elver <elver@google.com>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      19a33ca6
    • Marco Elver's avatar
      mm/kasan: print frame description for stack bugs · e8969219
      Marco Elver authored
      
      
      This adds support for printing stack frame description on invalid stack
      accesses.  The frame description is embedded by the compiler, which is
      parsed and then pretty-printed.
      
      Currently, we can only print the stack frame info for accesses to the
      task's own stack, but not accesses to other tasks' stacks.
      
      Example of what it looks like:
      
        page dumped because: kasan: bad access detected
      
        addr ffff8880673ef98a is located in stack of task insmod/2008 at offset 106 in frame:
         kasan_stack_oob+0x0/0xf5 [test_kasan]
      
        this frame has 2 objects:
         [32, 36) 'i'
         [96, 106) 'stack_array'
      
        Memory state around the buggy address:
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=198435
      Link: http://lkml.kernel.org/r/20190522100048.146841-1-elver@google.com
      Signed-off-by: Marco Elver <elver@google.com>
      Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e8969219
    • André Almeida's avatar
      docs: kmemleak: add more documentation details · b7c3613e
      André Almeida authored
      
      
      Wikipedia now has a dedicated article on the "tracing garbage collector"
      topic.  Change the URL, use the reStructuredText syntax for hyperlinks,
      and add more details about the use of the tool.  Add a section on how to
      use the kmemleak-test module to test the memory leak scanning.
      
      Link: http://lkml.kernel.org/r/20190612155231.19448-2-andrealmeid@collabora.com
      Signed-off-by: André Almeida <andrealmeid@collabora.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b7c3613e
    • André Almeida's avatar
      mm/kmemleak.c: change error at _write when kmemleak is disabled · 4e4dfce2
      André Almeida authored
      
      
      According to POSIX, EBUSY means that the "device or resource is busy", and
      this can lead people to think that the file
      `/sys/kernel/debug/kmemleak/` is somehow locked or in use by another
      process.  Change this error code to a more appropriate one.
      
      Link: http://lkml.kernel.org/r/20190612155231.19448-1-andrealmeid@collabora.com
      Signed-off-by: André Almeida <andrealmeid@collabora.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4e4dfce2
    • Dmitry Vyukov's avatar
      mm/kmemleak.c: fix check for softirq context · 6ef90569
      Dmitry Vyukov authored
      
      
      in_softirq() is the wrong predicate to check whether we are in softirq
      context.  It also returns true when BH is disabled, so objects are
      falsely stamped with the "softirq" comm.  The correct predicate is
      in_serving_softirq().
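
      The difference can be sketched in userspace using the preempt_count bit
      layout from include/linux/preempt.h (softirq count in bits 8-15, with
      local_bh_disable() adding SOFTIRQ_DISABLE_OFFSET); a simplified sketch,
      not the kernel's exact code:

```c
#include <stdbool.h>

/* Layout as in include/linux/preempt.h: softirq count occupies bits 8-15. */
#define SOFTIRQ_SHIFT          8
#define SOFTIRQ_OFFSET         (1UL << SOFTIRQ_SHIFT)
#define SOFTIRQ_MASK           (0xffUL << SOFTIRQ_SHIFT)
#define SOFTIRQ_DISABLE_OFFSET (2 * SOFTIRQ_OFFSET)

static unsigned long preempt_count;

/* True while serving a softirq OR merely having BH disabled. */
static bool in_softirq(void)
{
    return preempt_count & SOFTIRQ_MASK;
}

/* True only while actually serving a softirq. */
static bool in_serving_softirq(void)
{
    return preempt_count & SOFTIRQ_OFFSET;
}
```

      With BH merely disabled (as during a system call holding a BH-disabling
      lock), in_softirq() is true but in_serving_softirq() is false, which is
      exactly the case the old predicate misclassified.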
      
      Previously, if a user ran cat on /sys/kernel/debug/kmemleak, they would
      see the report below, which is clearly wrong because this is system call
      context (see the comm):
      
      unreferenced object 0xffff88805bd661c0 (size 64):
        comm "softirq", pid 0, jiffies 4294942959 (age 12.400s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 ff ff ff ff 00 00 00 00  ................
          00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
        backtrace:
          [<0000000007dcb30c>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
          [<0000000007dcb30c>] slab_post_alloc_hook mm/slab.h:439 [inline]
          [<0000000007dcb30c>] slab_alloc mm/slab.c:3326 [inline]
          [<0000000007dcb30c>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
          [<00000000969722b7>] kmalloc include/linux/slab.h:547 [inline]
          [<00000000969722b7>] kzalloc include/linux/slab.h:742 [inline]
          [<00000000969722b7>] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline]
          [<00000000969722b7>] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085
          [<00000000a4134b5f>] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475
          [<00000000d20248ad>] do_ip_setsockopt.isra.0+0x19fe/0x1c00 net/ipv4/ip_sockglue.c:957
          [<000000003d367be7>] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246
          [<000000003c7c76af>] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616
          [<000000000c1aeb23>] sock_common_setsockopt+0x3e/0x50 net/core/sock.c:3130
          [<000000000157b92b>] __sys_setsockopt+0x9e/0x120 net/socket.c:2078
          [<00000000a9f3d058>] __do_sys_setsockopt net/socket.c:2089 [inline]
          [<00000000a9f3d058>] __se_sys_setsockopt net/socket.c:2086 [inline]
          [<00000000a9f3d058>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
          [<000000001b8da885>] do_syscall_64+0x7c/0x1a0 arch/x86/entry/common.c:301
          [<00000000ba770c62>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Now they will see this:
      
      unreferenced object 0xffff88805413c800 (size 64):
        comm "syz-executor.4", pid 8960, jiffies 4294994003 (age 14.350s)
        hex dump (first 32 bytes):
          00 7a 8a 57 80 88 ff ff e0 00 00 01 00 00 00 00  .z.W............
          00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
        backtrace:
          [<00000000c5d3be64>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
          [<00000000c5d3be64>] slab_post_alloc_hook mm/slab.h:439 [inline]
          [<00000000c5d3be64>] slab_alloc mm/slab.c:3326 [inline]
          [<00000000c5d3be64>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
          [<0000000023865be2>] kmalloc include/linux/slab.h:547 [inline]
          [<0000000023865be2>] kzalloc include/linux/slab.h:742 [inline]
          [<0000000023865be2>] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline]
          [<0000000023865be2>] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085
          [<000000003029a9d4>] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475
          [<00000000ccd0a87c>] do_ip_setsockopt.isra.0+0x19fe/0x1c00 net/ipv4/ip_sockglue.c:957
          [<00000000a85a3785>] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246
          [<00000000ec13c18d>] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616
          [<0000000052d748e3>] sock_common_setsockopt+0x3e/0x50 net/core/sock.c:3130
          [<00000000512f1014>] __sys_setsockopt+0x9e/0x120 net/socket.c:2078
          [<00000000181758bc>] __do_sys_setsockopt net/socket.c:2089 [inline]
          [<00000000181758bc>] __se_sys_setsockopt net/socket.c:2086 [inline]
          [<00000000181758bc>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
          [<00000000d4b73623>] do_syscall_64+0x7c/0x1a0 arch/x86/entry/common.c:301
          [<00000000c1098bec>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Link: http://lkml.kernel.org/r/20190517171507.96046-1-dvyukov@gmail.com
      Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6ef90569
    • Shakeel Butt's avatar
      slub: don't panic for memcg kmem cache creation failure · cb097cd4
      Shakeel Butt authored
      
      
      Currently for CONFIG_SLUB, if creating a memcg kmem cache fails and
      the corresponding root kmem cache has the SLAB_PANIC flag, the kernel
      will crash.  This is unnecessary, as the kernel can handle creation
      failures of memcg kmem caches.  Additionally, CONFIG_SLAB does not
      implement this behavior.  So, to keep the behavior consistent between
      SLAB and SLUB, remove the panic for memcg kmem cache creation
      failures.  A root kmem cache creation failure under SLAB_PANIC still
      correctly panics for both SLAB and SLUB.
      
      Link: http://lkml.kernel.org/r/20190619232514.58994-1-shakeelb@google.com
      Reported-by: Dave Hansen <dave.hansen@intel.com>
      Signed-off-by: Shakeel Butt <shakeelb@google.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cb097cd4
    • Yury Norov's avatar
      mm/slub.c: avoid double string traverse in kmem_cache_flags() · 9cf3a8d8
      Yury Norov authored
      
      
      If ',' is not found, kmem_cache_flags() calls strlen() to find the end of
      the string.  We can do it in a single pass using strchrnul().
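
      A userspace illustration of the single-pass idiom (with a hand-rolled
      strchrnul() for portability; glibc provides the real one as a GNU
      extension):

```c
#include <stddef.h>

/* Like GNU strchrnul(): return a pointer to the first occurrence of c,
 * or to the terminating NUL if c does not occur.  Either way the string
 * is traversed exactly once -- no separate strlen() pass is needed. */
static const char *my_strchrnul(const char *s, int c)
{
    while (*s && *s != (char)c)
        s++;
    return s;
}
```

      When parsing a slub_debug-style option string, the flags portion thus
      ends at the first ',' or at the end of the string, found in one pass.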
      
      Link: http://lkml.kernel.org/r/20190501053111.7950-1-ynorov@marvell.com
      Signed-off-by: Yury Norov <ynorov@marvell.com>
      Acked-by: Aaron Tomlin <atomlin@redhat.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9cf3a8d8
    • Kees Cook's avatar
      lkdtm/heap: add tests for freelist hardening · 966fede8
      Kees Cook authored
      
      
      This adds tests for double free and cross-cache freeing, which should both
      be caught by CONFIG_SLAB_FREELIST_HARDENED.
      
      Link: http://lkml.kernel.org/r/20190530045017.15252-4-keescook@chromium.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: Alexander Popov <alex.popov@linux.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      966fede8
    • Kees Cook's avatar
      mm/slab: sanity-check page type when looking up cache · a64b5378
      Kees Cook authored
      
      
      This avoids any possible type confusion when looking up an object.  For
      example, if a non-slab were to be passed to kfree(), the invalid
      slab_cache pointer (i.e.  overlapped with some other value from the
      struct page union) would be used for subsequent slab manipulations that
      could lead to further memory corruption.
      
      Since the page is already in cache, adding the PageSlab() check will
      have nearly zero cost, so add a check and WARN() to virt_to_cache().
      Additionally, this replaces an open-coded virt_to_cache().  To support
      the failure mode, this also updates all callers of virt_to_cache() and
      cache_from_obj() to handle a NULL cache pointer return value (though
      note that several already handle this case gracefully).
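
      The hardened lookup can be sketched in userspace (the mock_ types are
      hypothetical; the real check tests PageSlab() on the struct page and
      WARN()s before returning NULL):

```c
#include <stddef.h>
#include <stdio.h>

struct mock_cache { const char *name; };

/* Toy struct page: in the kernel, slab_cache overlaps other union members,
 * so its value is garbage unless the page really is a slab page. */
struct mock_page {
    int is_slab;                      /* stands in for PageSlab() */
    struct mock_cache *slab_cache;
};

/* Sketch of the hardened virt_to_cache(): sanity-check the page type and
 * return NULL (with a warning) instead of a garbage cache pointer. */
static struct mock_cache *mock_virt_to_cache(struct mock_page *page)
{
    if (!page->is_slab) {
        fprintf(stderr, "WARN: object is not a slab page\n");
        return NULL;                  /* callers must tolerate NULL now */
    }
    return page->slab_cache;
}
```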
      
      [dan.carpenter@oracle.com: restore IRQs in kfree()]
        Link: http://lkml.kernel.org/r/20190613065637.GE16334@mwanda
      Link: http://lkml.kernel.org/r/20190530045017.15252-3-keescook@chromium.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Alexander Popov <alex.popov@linux.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a64b5378
    • Kees Cook's avatar
      mm/slab: validate cache membership under freelist hardening · 598a0717
      Kees Cook authored
      
      
      Patch series "mm/slab: Improved sanity checking".
      
      This adds defenses against slab cache confusion (as seen in real-world
      exploits[1]) and gracefully handles type confusion when trying to look
      up slab caches from an arbitrary page.  (Patch 3 also adds new LKDTM
      tests for these defenses, as well as for the existing double-free
      detection.)
      
      This patch (of 3):
      
      When building under CONFIG_SLAB_FREELIST_HARDENING, it makes sense to
      perform sanity-checking on the assumed slab cache during
      kmem_cache_free() to make sure the kernel doesn't mix freelists across
      slab caches and corrupt memory (as seen in the exploitation of flaws
      like CVE-2018-9568[1]).  Note that the prior code might WARN() but still
      corrupt memory (i.e.  return the assumed cache instead of the owned
      cache).
      
      There is no noticeable performance impact (changes are within noise).
      Measuring parallel kernel builds, I saw the following with
      CONFIG_SLAB_FREELIST_HARDENED, before and after this patch:
      
      before:
      
      	Run times: 288.85 286.53 287.09 287.07 287.21
      	Min: 286.53 Max: 288.85 Mean: 287.35 Std Dev: 0.79
      
      after:
      
      	Run times: 289.58 287.40 286.97 287.20 287.01
      	Min: 286.97 Max: 289.58 Mean: 287.63 Std Dev: 0.99
      
      Delta: 0.1% which is well below the standard deviation
      
      [1] https://github.com/ThomasKing2014/slides/raw/master/Building%20universal%20Android%20rooting%20with%20a%20type%20confusion%20vulnerability.pdf
      
      Link: http://lkml.kernel.org/r/20190530045017.15252-2-keescook@chromium.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Alexander Popov <alex.popov@linux.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      598a0717