  1. Jan 17, 2019
    • afs: Fix race in async call refcounting · 34fa4761
      David Howells authored
      There's a race between afs_make_call() and afs_wake_up_async_call() in the
      case that an error is returned from rxrpc_kernel_send_data() after it has
      queued the final packet.
      
      afs_make_call() will try to clean up the mess, but the call state may have
      been moved on, thereby causing afs_process_async_call() to also try to
      delete the call.
      
      Fix this by:
      
       (1) Getting an extra ref for an asynchronous call for the call itself to
           hold.  This makes sure the call doesn't evaporate on us accidentally
           and will allow the call to be retained by the caller in a future
           patch.  The ref is released on leaving afs_make_call() or
           afs_wait_for_call_to_complete().
      
       (2) In the event of an error from rxrpc_kernel_send_data():
      
           (a) Don't set the call state to AFS_CALL_COMPLETE until *after* the
           	 call has been aborted and ended.  This prevents
           	 afs_deliver_to_call() from doing anything with any notifications
           	 it gets.
      
           (b) Explicitly end the call immediately to prevent further callbacks.
      
           (c) Cancel any queued async_work and wait for the work if it's
           	 executing.  This allows us to be sure the race won't recur when we
           	 change the state.  We put the work queue's ref on the call if we
           	 managed to cancel it.
      
           (d) Put the call's ref that we got in (1).  This belongs to us as long
           	 as the call is in state AFS_CALL_CL_REQUESTING.
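
      The ref ownership rules in (1) and (2)(c)-(d) can be illustrated with a
      minimal userspace sketch.  This is not the kafs code: struct call_model,
      call_init(), call_get() and call_put() are hypothetical names, and C11
      atomics stand in for the kernel's refcounting primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted call object; not struct afs_call. */
struct call_model {
    atomic_int usage;   /* number of outstanding references */
    bool freed;         /* set when the last reference is dropped */
};

static void call_init(struct call_model *c)
{
    atomic_init(&c->usage, 1);   /* the caller's initial reference */
    c->freed = false;
}

/* Take an extra reference, e.g. the one an async call holds on itself. */
static void call_get(struct call_model *c)
{
    int old = atomic_fetch_add(&c->usage, 1);

    assert(old > 0);             /* must not resurrect a dead object */
}

/* Drop a reference; the object is released only when the count hits zero. */
static void call_put(struct call_model *c)
{
    int old = atomic_fetch_sub(&c->usage, 1);

    assert(old > 0);
    if (old == 1)
        c->freed = true;         /* a real implementation would free here */
}
```

      In this model the object survives as long as any holder (the caller, the
      call itself or a queued work item) still has its ref, which is the
      property the fix relies on.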
      
      Fixes: 341f741f ("afs: Refcount the afs_call struct")
      Signed-off-by: David Howells <dhowells@redhat.com>
      34fa4761
    • afs: Provide a function to get a ref on a call · 7a75b007
      David Howells authored
      
      
      Provide a function to get a reference on an afs_call struct.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      7a75b007
    • afs: Fix key refcounting in file locking code · 59d49076
      David Howells authored
      Fix the refcounting of the authentication keys in the file locking code.
      The vnode->lock_key member points to a key on which it expects to be
      holding a ref, but it isn't always given an extra ref.
      
      Fixes: 0fafdc9f ("afs: Fix file locking")
      Signed-off-by: David Howells <dhowells@redhat.com>
      59d49076
    • afs: Don't set vnode->cb_s_break in afs_validate() · 4882a27c
      Marc Dionne authored
      A cb_interest record is not necessarily attached to the vnode on entry to
      afs_validate(), which can cause an oops when we try to bring the vnode's
      cb_s_break up to date in the default case (i.e. no current callback promise
      and the vnode has not been deleted).
      
      Fix this by simply removing the line, as vnode->cb_s_break will be set when
      needed by afs_register_server_cb_interest() when we next get a callback
      promise from an RPC call.
      
      The oops looks something like:
      
          BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
          ...
          RIP: 0010:afs_validate+0x66/0x250 [kafs]
          ...
          Call Trace:
           afs_d_revalidate+0x8d/0x340 [kafs]
           ? __d_lookup+0x61/0x150
           lookup_dcache+0x44/0x70
           ? lookup_dcache+0x44/0x70
           __lookup_hash+0x24/0xa0
           do_unlinkat+0x11d/0x2c0
           __x64_sys_unlink+0x23/0x30
           do_syscall_64+0x4d/0xf0
           entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: ae3b7361 ("afs: Fix validation/callback interaction")
      Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      4882a27c
  2. Jan 11, 2019
    • afs: Set correct lock type for the yfs CreateFile · 5edc22cc
      Marc Dionne authored
      A lock type of 0 is "LockRead", which makes the fileserver record an
      unintentional read lock on the new file.  This will cause problems
      later on if the file is the subject of locking operations.
      
      The correct default value should be -1 ("LockNone").
      
      Fix the operation marshalling code to set the value and provide an enum to
      symbolise the values whilst we're at it.
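
      As an illustration only, an enum of this kind might look like the sketch
      below.  The identifiers are hypothetical; only the values 0 ("LockRead")
      and -1 ("LockNone") are taken from the description above.

```c
#include <assert.h>

/* Illustrative names; only the two numeric values come from the text. */
enum lock_type_model {
    LOCK_NONE_MODEL = -1,   /* "LockNone": no lock requested */
    LOCK_READ_MODEL = 0,    /* "LockRead": what a bare 0 means on the wire */
};

/* A bare zero silently selects a read lock, so make the values explicit. */
static_assert(LOCK_READ_MODEL == 0, "0 must mean LockRead");
static_assert(LOCK_NONE_MODEL == -1, "-1 must mean LockNone");

/* Lock type to marshal for a newly created file: no lock requested. */
static enum lock_type_model create_file_lock_type(void)
{
    return LOCK_NONE_MODEL;
}
```

      Naming the values makes it harder for a literal 0 to slip into the
      marshalling code and be read by the server as a lock request.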
      
      Fixes: 30062bd1 ("afs: Implement YFS support in the fs client")
      Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      5edc22cc
    • afs: Use struct_size() in kzalloc() · c2b8bd49
      Gustavo A. R. Silva authored
      
      
      One of the more common cases of allocation size calculations is finding the
      size of a structure that has a zero-sized array at the end, along with
      memory for some number of elements for that array. For example:
      
      struct foo {
          int stuff;
          void *entry[];
      };
      
      instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);
      
      Instead of leaving these open-coded and prone to type mistakes, we can now
      use the new struct_size() helper:
      
      instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
      
      This code was detected with the help of Coccinelle.
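
      For readers outside the kernel, the idea can be sketched in userspace.
      foo_size() and foo_alloc() below are hypothetical helpers, not kernel
      APIs; the sketch assumes the useful property of such a helper is that
      the size computation saturates instead of wrapping on overflow.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct foo {
    int stuff;
    void *entry[];   /* flexible array member */
};

/*
 * Hypothetical userspace analogue of a struct_size()-style helper: header
 * size plus count elements, saturating to SIZE_MAX if it would overflow.
 */
static size_t foo_size(size_t count)
{
    size_t elem = sizeof(void *);

    if (count > (SIZE_MAX - sizeof(struct foo)) / elem)
        return SIZE_MAX;
    return sizeof(struct foo) + count * elem;
}

/* Allocate a zeroed struct foo with room for count entries, or NULL. */
static struct foo *foo_alloc(size_t count)
{
    size_t bytes = foo_size(count);

    if (bytes == SIZE_MAX)
        return NULL;          /* refuse a wrapped (too small) allocation */
    return calloc(1, bytes);
}
```

      With the open-coded form, a huge count can wrap the multiplication and
      yield a small allocation; here the oversized request fails instead.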
      
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      c2b8bd49
  3. Jan 09, 2019
    • Merge branch 'akpm' (patches from Andrew) · a88cc8da
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "14 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm, page_alloc: do not wake kswapd with zone lock held
        hugetlbfs: revert "use i_mmap_rwsem for more pmd sharing synchronization"
        hugetlbfs: revert "Use i_mmap_rwsem to fix page fault/truncate race"
        mm: page_mapped: don't assume compound page is huge or THP
        mm/memory.c: initialise mmu_notifier_range correctly
        tools/vm/page_owner: use page_owner_sort in the use example
        kasan: fix krealloc handling for tag-based mode
        kasan: make tag based mode work with CONFIG_HARDENED_USERCOPY
        kasan, arm64: use ARCH_SLAB_MINALIGN instead of manual aligning
        mm, memcg: fix reclaim deadlock with writeback
        mm/usercopy.c: no check page span for stack objects
        slab: alien caches must not be initialized if the allocation of the alien cache failed
        fork, memcg: fix cached_stacks case
        zram: idle writeback fixes and cleanup
      a88cc8da
    • arch/openrisc: Fix issues with access_ok() · 9cb2feb4
      Stafford Horne authored
      Commit 594cc251 ("make 'user_access_begin()' do 'access_ok()'") exposed
      incorrect implementations of the access_ok() macro in several
      architectures.  This change fixes two issues found in OpenRISC.

      OpenRISC was not properly parenthesising macro arguments and was also
      evaluating arguments twice.  This patch fixes both issues.
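
      Both pitfalls can be reproduced with a small generic sketch.  It is not
      the OpenRISC code: RANGE_OK_BAD(), range_ok() and next_size() are
      hypothetical, and an inline function is just one way to get
      single evaluation.

```c
#include <assert.h>

/* Broken: arguments are not parenthesised and size is expanded twice. */
#define RANGE_OK_BAD(addr, size, limit) \
    (size <= limit && addr <= limit - size)

/* Safer: a function evaluates each argument exactly once, as a value. */
static inline int range_ok(unsigned long addr, unsigned long size,
                           unsigned long limit)
{
    return size <= limit && addr <= limit - size;
}

/* Helper with a side effect, used to count how often an argument runs. */
static unsigned long calls;

static unsigned long next_size(void)
{
    calls++;
    return 4;
}
```

      Passing next_size() to the macro runs it twice, and passing an
      expression such as "1 ? 200 : 0" lets operator precedence rewrite the
      check; the function form avoids both.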
      
      I test booted this patch with v5.0-rc1 on qemu and it's working fine.
      
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Stafford Horne <shorne@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9cb2feb4
    • mm, page_alloc: do not wake kswapd with zone lock held · 73444bc4
      Mel Gorman authored
      syzbot reported the following regression in the latest merge window and
      it was confirmed by Qian Cai that a similar bug was visible from a
      different context.
      
        ======================================================
        WARNING: possible circular locking dependency detected
        4.20.0+ #297 Not tainted
        ------------------------------------------------------
        syz-executor0/8529 is trying to acquire lock:
        000000005e7fb829 (&pgdat->kswapd_wait){....}, at:
        __wake_up_common_lock+0x19e/0x330 kernel/sched/wait.c:120
      
        but task is already holding lock:
        000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: spin_lock
        include/linux/spinlock.h:329 [inline]
        000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_bulk
        mm/page_alloc.c:2548 [inline]
        000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: __rmqueue_pcplist
        mm/page_alloc.c:3021 [inline]
        000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_pcplist
        mm/page_alloc.c:3050 [inline]
        000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue
        mm/page_alloc.c:3072 [inline]
        000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at:
        get_page_from_freelist+0x1bae/0x52a0 mm/page_alloc.c:3491
      
      It appears to be a false positive in that the only way the lock ordering
      should be inverted is if kswapd is waking itself and the wakeup
      allocates debugging objects which should already be allocated if it's
      kswapd doing the waking.  Nevertheless, the possibility exists and so
      it's best to avoid the problem.
      
      This patch flags a zone as needing a kswapd wakeup using the, surprisingly,
      unused zone flags field.  The flag is read without the lock held to do
      the wakeup.  It's possible that the flag setting context is not the same
      as the flag clearing context or for small races to occur.  However, each
      race possibility is harmless and there is no visible degradation in
      fragmentation treatment.
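
      The pattern can be modelled in a few lines of single-threaded C.  This
      is only a sketch: struct zone_model, ZONE_WAKE_KSWAPD and the function
      names are hypothetical, and a boolean stands in for the spinlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a zone; field and flag names are illustrative. */
struct zone_model {
    bool lock_held;        /* stands in for the zone lock */
    unsigned long flags;   /* stands in for the zone flags field */
    int wakeups;           /* how many times kswapd was woken */
};

#define ZONE_WAKE_KSWAPD (1UL << 0)   /* hypothetical flag bit */

static void wake_kswapd(struct zone_model *z)
{
    assert(!z->lock_held);   /* the rule being enforced: no lock held */
    z->wakeups++;
}

/* Runs with the lock held, so it only records that a wakeup is wanted. */
static void alloc_under_lock(struct zone_model *z, bool fragmentation_event)
{
    z->lock_held = true;
    if (fragmentation_event)
        z->flags |= ZONE_WAKE_KSWAPD;
    z->lock_held = false;
}

/* After the lock is dropped: test the flag locklessly and do the wakeup. */
static void maybe_wake_kswapd(struct zone_model *z)
{
    if (z->flags & ZONE_WAKE_KSWAPD) {
        z->flags &= ~ZONE_WAKE_KSWAPD;
        wake_kswapd(z);
    }
}
```

      Deferring the wakeup this way means the wait-queue lock is never taken
      while the zone lock is held, so the reported ordering cannot occur.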
      
      While zone->flags could have continued to be unused, there is potential
      for moving some existing fields into the flags field instead,
      particularly read-mostly ones like zone->initialized and
      zone->contiguous.
      
      Link: http://lkml.kernel.org/r/20190103225712.GJ31517@techsingularity.net
      Fixes: 1c30844d ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
      Reported-by: <syzbot+93d94a001cfbce9e60e1@syzkaller.appspotmail.com>
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Tested-by: Qian Cai <cai@lca.pw>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      73444bc4
    • hugetlbfs: revert "use i_mmap_rwsem for more pmd sharing synchronization" · ddeaab32
      Mike Kravetz authored
      This reverts b43a9990
      
      
      
      The reverted commit caused issues with migration and poisoning of anon
      huge pages.  The LTP move_pages12 test will cause an "unable to handle
      kernel NULL pointer" BUG with a stack similar to:
      
        RIP: 0010:down_write+0x1b/0x40
        Call Trace:
          migrate_pages+0x81f/0xb90
          __ia32_compat_sys_migrate_pages+0x190/0x190
          do_move_pages_to_node.isra.53.part.54+0x2a/0x50
          kernel_move_pages+0x566/0x7b0
          __x64_sys_move_pages+0x24/0x30
          do_syscall_64+0x5b/0x180
          entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The purpose of the reverted patch was to fix some long existing races
      with huge pmd sharing.  It used i_mmap_rwsem for this purpose with the
      idea that this could also be used to address truncate/page fault races
      with another patch.  Further analysis has determined that i_mmap_rwsem
      cannot be used to address all these hugetlbfs synchronization issues.
      Therefore, revert this patch while working on another approach to the
      underlying issues.
      
      Link: http://lkml.kernel.org/r/20190103235452.29335-2-mike.kravetz@oracle.com
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Reported-by: Jan Stancek <jstancek@redhat.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Prakash Sangappa <prakash.sangappa@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ddeaab32
    • hugetlbfs: revert "Use i_mmap_rwsem to fix page fault/truncate race" · e7c58097
      Mike Kravetz authored
      This reverts c86aa7bb
      
      
      
      The reverted commit caused ABBA deadlocks when file migration raced with
      file eviction for specific hugetlbfs files.  This was discovered with a
      modified version of the LTP move_pages12 test.
      
      The purpose of the reverted patch was to close a long existing race
      between hugetlbfs file truncation and page faults.  After more analysis
      of the patch and impacted code, it was determined that i_mmap_rwsem
      cannot be used for all required synchronization.  Therefore, revert this
      patch while working on another approach to the underlying issue.
      
      Link: http://lkml.kernel.org/r/20190103235452.29335-1-mike.kravetz@oracle.com
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Reported-by: Jan Stancek <jstancek@redhat.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Prakash Sangappa <prakash.sangappa@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e7c58097
    • mm: page_mapped: don't assume compound page is huge or THP · 8ab88c71
      Jan Stancek authored
      LTP proc01 testcase has been observed to rarely trigger crashes
      on arm64:
          page_mapped+0x78/0xb4
          stable_page_flags+0x27c/0x338
          kpageflags_read+0xfc/0x164
          proc_reg_read+0x7c/0xb8
          __vfs_read+0x58/0x178
          vfs_read+0x90/0x14c
          SyS_read+0x60/0xc0
      
      The issue is that page_mapped() assumes that if a compound page is not
      huge, then it must be a THP.  But if this is a 'normal' compound page
      (COMPOUND_PAGE_DTOR), then the following loop can keep running (for
      HPAGE_PMD_NR iterations) until it tries to read from memory that isn't
      mapped and triggers a panic:

              for (i = 0; i < hpage_nr_pages(page); i++) {
                      if (atomic_read(&page[i]._mapcount) >= 0)
                              return true;
              }
      
      I could replicate this on x86 (v4.20-rc4-98-g60b548237fed) only
      with a custom kernel module [1] which:
       - allocates compound page (PAGEC) of order 1
       - allocates 2 normal pages (COPY), which are initialized to 0xff (to
         satisfy _mapcount >= 0)
       - 2 PAGEC page structs are copied to address of first COPY page
       - second page of COPY is marked as not present
       - call to page_mapped(COPY) now triggers fault on access to 2nd COPY
         page at offset 0x30 (_mapcount)
      
      [1] https://github.com/jstancek/reproducers/blob/master/kernel/page_mapped_crash/repro.c
      
      Fix the loop to iterate for "1 << compound_order" pages.
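
      The corrected bound can be shown with a small userspace model.  The
      struct and function names are hypothetical stand-ins, not the mm code;
      the point is only that the loop length comes from the page's own order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for struct page. */
struct page_model {
    int mapcount;         /* -1 means "not mapped", like _mapcount */
    unsigned int order;   /* compound order, valid on the head page */
};

/* Return true if any subpage of the compound page is mapped. */
static bool compound_mapped(const struct page_model *head)
{
    size_t nr = (size_t)1 << head->order;   /* actual size of this page */

    for (size_t i = 0; i < nr; i++) {
        if (head[i].mapcount >= 0)
            return true;
    }
    return false;
}
```

      For an order-1 page the loop inspects exactly two entries and never
      touches whatever happens to follow them in memory.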
      
      Kirill said "IIRC, sound subsystem can produce custom mapped compound
      pages".
      
      Link: http://lkml.kernel.org/r/c440d69879e34209feba21e12d236d06bc0a25db.1543577156.git.jstancek@redhat.com
      Fixes: e1534ae9 ("mm: differentiate page_mapped() from page_mapcount() for compound pages")
      Signed-off-by: Jan Stancek <jstancek@redhat.com>
      Debugged-by: Laszlo Ersek <lersek@redhat.com>
      Suggested-by: "Kirill A. Shutemov" <kirill@shutemov.name>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8ab88c71
    • mm/memory.c: initialise mmu_notifier_range correctly · 1ed7293a
      Matthew Wilcox authored
      One of the paths in follow_pte_pmd() initialised the mmu_notifier_range
      incorrectly.
      
      Link: http://lkml.kernel.org/r/20190103002126.GM6310@bombadil.infradead.org
      Fixes: ac46d4f3 ("mm/mmu_notifier: use structure for invalidate_range_start/end calls v2")
      Signed-off-by: Matthew Wilcox <willy@infradead.org>
      Tested-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1ed7293a
    • tools/vm/page_owner: use page_owner_sort in the use example · aff876dc
      Miles Chen authored
      
      
      The example in the comment is not usable because the output binary is
      named "page_owner_sort", not "sort".
      
      Also add a reference to Documentation/vm/page_owner.rst
      
      Link: http://lkml.kernel.org/r/1546515361-8317-1-git-send-email-miles.chen@mediatek.com
      Signed-off-by: Miles Chen <miles.chen@mediatek.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      aff876dc
    • kasan: fix krealloc handling for tag-based mode · a3fe7cdf
      Andrey Konovalov authored
      
      
      Right now tag-based KASAN can retag the memory that is reallocated via
      krealloc and return a differently tagged pointer even if the same slab
      object gets used and no reallocation technically happens.

      There are a few issues with this approach.  One is that krealloc callers
      can't rely on comparing the return value with the passed argument to
      check whether reallocation happened.  Another is that a caller that knows
      no reallocation happened may access object memory through the old
      pointer, which leads to false positives.  Look at nf_ct_ext_add() to see
      an example.

      Fix this by keeping the same tag if the memory doesn't actually get
      reallocated during krealloc.
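
      The intended behaviour can be sketched with plain integers standing in
      for tagged pointers.  This is a model, not the KASAN code: it assumes
      the tag lives in the top byte of a 64-bit address, and get_tag(),
      set_tag() and tag_after_realloc() are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_SHIFT 56
#define TAG_MASK  ((uint64_t)0xff << TAG_SHIFT)

/* Extract the tag from the (assumed) top byte of a pointer value. */
static uint8_t get_tag(uint64_t ptr)
{
    return (uint8_t)(ptr >> TAG_SHIFT);
}

/* Return ptr with its top byte replaced by tag. */
static uint64_t set_tag(uint64_t ptr, uint8_t tag)
{
    return (ptr & ~TAG_MASK) | ((uint64_t)tag << TAG_SHIFT);
}

/*
 * Tag the address returned by a krealloc-like call.  If the object stayed
 * where it was, reuse the old tag so old and new pointers compare equal;
 * otherwise apply the fresh tag chosen for the new object.
 */
static uint64_t tag_after_realloc(uint64_t old_ptr, uint64_t new_addr,
                                  uint8_t fresh_tag)
{
    if ((old_ptr & ~TAG_MASK) == (new_addr & ~TAG_MASK))
        return set_tag(new_addr, get_tag(old_ptr));
    return set_tag(new_addr, fresh_tag);
}
```

      With this rule a caller comparing the returned pointer against the one
      it passed in gets the answer it expects in both cases.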
      
      Link: http://lkml.kernel.org/r/bb2a71d17ed072bcc528cbee46fcbd71a6da3be4.1546540962.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a3fe7cdf
    • kasan: make tag based mode work with CONFIG_HARDENED_USERCOPY · 96fedce2
      Andrey Konovalov authored
      
      
      With CONFIG_HARDENED_USERCOPY enabled __check_heap_object() compares and
      then subtracts a potentially tagged pointer with a non-tagged address of
      the page that this pointer belongs to, which leads to unexpected
      behavior.
      
      Untag the pointer in __check_heap_object() before doing any of these
      operations.
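
      Why the untagging matters can be seen in a small integer model.  It
      assumes the tag occupies the top byte and that an untagged kernel
      address carries 0xff there; untag() and object_offset() are
      hypothetical helpers, not the kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_SHIFT 56
#define TAG_MASK  ((uint64_t)0xff << TAG_SHIFT)

/* Force the top byte to 0xff, modelled here as the untagged kernel value. */
static uint64_t untag(uint64_t ptr)
{
    return ptr | TAG_MASK;
}

/* Offset of ptr within the object that starts at untagged address base. */
static uint64_t object_offset(uint64_t ptr, uint64_t base)
{
    return untag(ptr) - base;   /* untag first, then do the arithmetic */
}
```

      Subtracting an untagged base from a tagged pointer directly yields a
      meaningless offset, whereas untagging first recovers the real distance
      into the object.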
      
      Link: http://lkml.kernel.org/r/7e756a298d514c4482f52aea6151db34818d395d.1546540962.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      96fedce2
    • kasan, arm64: use ARCH_SLAB_MINALIGN instead of manual aligning · eb214f2d
      Andrey Konovalov authored
      
      
      Instead of changing cache->align to be aligned to KASAN_SHADOW_SCALE_SIZE
      in kasan_cache_create() we can reuse the ARCH_SLAB_MINALIGN macro.
      
      Link: http://lkml.kernel.org/r/52ddd881916bcc153a9924c154daacde78522227.1546540962.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Suggested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      eb214f2d
    • mm, memcg: fix reclaim deadlock with writeback · 63f3655f
      Michal Hocko authored
      Liu Bo has experienced a deadlock between memcg (legacy) reclaim and the
      ext4 writeback
      
        task1:
          wait_on_page_bit+0x82/0xa0
          shrink_page_list+0x907/0x960
          shrink_inactive_list+0x2c7/0x680
          shrink_node_memcg+0x404/0x830
          shrink_node+0xd8/0x300
          do_try_to_free_pages+0x10d/0x330
          try_to_free_mem_cgroup_pages+0xd5/0x1b0
          try_charge+0x14d/0x720
          memcg_kmem_charge_memcg+0x3c/0xa0
          memcg_kmem_charge+0x7e/0xd0
          __alloc_pages_nodemask+0x178/0x260
          alloc_pages_current+0x95/0x140
          pte_alloc_one+0x17/0x40
          __pte_alloc+0x1e/0x110
          alloc_set_pte+0x5fe/0xc20
          do_fault+0x103/0x970
          handle_mm_fault+0x61e/0xd10
          __do_page_fault+0x252/0x4d0
          do_page_fault+0x30/0x80
          page_fault+0x28/0x30
      
        task2:
          __lock_page+0x86/0xa0
          mpage_prepare_extent_to_map+0x2e7/0x310 [ext4]
          ext4_writepages+0x479/0xd60
          do_writepages+0x1e/0x30
          __writeback_single_inode+0x45/0x320
          writeback_sb_inodes+0x272/0x600
          __writeback_inodes_wb+0x92/0xc0
          wb_writeback+0x268/0x300
          wb_workfn+0xb4/0x390
          process_one_work+0x189/0x420
          worker_thread+0x4e/0x4b0
          kthread+0xe6/0x100
          ret_from_fork+0x41/0x50
      
      He adds
       "task1 is waiting for the PageWriteback bit of the page that task2 has
        collected in mpd->io_submit->io_bio, and task2 is waiting for the
        LOCKED bit of the page which task1 has locked"
      
      More precisely task1 is handling a page fault and it has a page locked
      while it charges a new page table to a memcg.  That in turn hits a
      memory limit reclaim and the memcg reclaim for legacy controller is
      waiting on the writeback but that is never going to finish because the
      writeback itself is waiting for the page locked in the #PF path.  So
      this is essentially an ABBA deadlock:
      
                                              lock_page(A)
                                              SetPageWriteback(A)
                                              unlock_page(A)
        lock_page(B)
                                              lock_page(B)
        pte_alloc_one
          shrink_page_list
            wait_on_page_writeback(A)
                                              SetPageWriteback(B)
                                              unlock_page(B)
      
                                              # flush A, B to clear the writeback
      
      This accumulating of more pages to flush is used by several filesystems
      to generate more optimal IO patterns.

      Waiting for the writeback in the legacy memcg controller is a workaround
      for premature OOM killer invocations because there is no dirty IO
      throttling available for the controller.  There is no easy way around
      that unfortunately.  Therefore fix this specific issue by pre-allocating
      the page table outside of the page lock.  We already have handy
      infrastructure for that, so simply reuse the fault-around pattern
      which already does this.
      
      There are probably other hidden __GFP_ACCOUNT | GFP_KERNEL allocations
      from under a fs page locked but they should be really rare.  I am not
      aware of a better solution unfortunately.
      
      [akpm@linux-foundation.org: fix mm/memory.c:__do_fault()]
      [akpm@linux-foundation.org: coding-style fixes]
      [mhocko@kernel.org: enhance comment, per Johannes]
        Link: http://lkml.kernel.org/r/20181214084948.GA5624@dhcp22.suse.cz
      Link: http://lkml.kernel.org/r/20181213092221.27270-1-mhocko@kernel.org
      Fixes: c3b94f44 ("memcg: further prevent OOM with too many dirty pages")
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Reported-by: Liu Bo <bo.liu@linux.alibaba.com>
      Debugged-by: Liu Bo <bo.liu@linux.alibaba.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      63f3655f
    • mm/usercopy.c: no check page span for stack objects · 7bff3c06
      Qian Cai authored
      
      
      It is easy to trigger this with CONFIG_HARDENED_USERCOPY_PAGESPAN=y,
      
        usercopy: Kernel memory overwrite attempt detected to spans multiple pages (offset 0, size 23)!
        kernel BUG at mm/usercopy.c:102!
      
      For example,
      
      print_worker_info
      char name[WQ_NAME_LEN] = { };
      char desc[WORKER_DESC_LEN] = { };
        probe_kernel_read(name, wq->name, sizeof(name) - 1);
        probe_kernel_read(desc, worker->desc, sizeof(desc) - 1);
          __copy_from_user_inatomic
            check_object_size
              check_heap_object
                check_page_span
      
      This is because on-stack variables can cross a PAGE_SIZE boundary and
      then fail this check:
      
      if (likely(((unsigned long)ptr & (unsigned long)PAGE_MASK) ==
      	   ((unsigned long)end & (unsigned long)PAGE_MASK)))
      
      ptr = FFFF889007D7EFF8
      end = FFFF889007D7F00E
      
      Hence, fix it by checking if it is a stack object first.
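
      The numbers above can be checked with a standalone version of the same
      comparison.  This sketch assumes 4 KiB pages; within_one_page() and the
      *_MODEL macros are hypothetical names, not the usercopy code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_MODEL 4096ULL                   /* assumes 4 KiB pages */
#define PAGE_MASK_MODEL (~(PAGE_SIZE_MODEL - 1))

/* True if the n-byte object at ptr (n > 0) lies entirely within one page. */
static bool within_one_page(uint64_t ptr, uint64_t n)
{
    uint64_t end = ptr + n - 1;   /* address of the last byte */

    return (ptr & PAGE_MASK_MODEL) == (end & PAGE_MASK_MODEL);
}
```

      With ptr = FFFF889007D7EFF8 and a size of 23, the last byte is at
      FFFF889007D7F00E, which is on the next page, so the comparison fails
      even though the buffer is an ordinary on-stack array.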
      
      [keescook@chromium.org: improve comments after reorder]
        Link: http://lkml.kernel.org/r/20190103165151.GA32845@beast
      Link: http://lkml.kernel.org/r/20181231030254.99441-1-cai@lca.pw
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7bff3c06
    • slab: alien caches must not be initialized if the allocation of the alien cache failed · 09c2e76e
      Christoph Lameter authored
      Callers of __alloc_alien() check for NULL.  We must do the same check in
      __alloc_alien_cache to avoid NULL pointer dereferences on allocation
      failures.
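
      The shape of the fix is the usual "initialise only on success" pattern,
      sketched here in userspace.  struct cache_model, cache_alloc() and the
      fail_next_alloc test hook are hypothetical and do not mirror the slab
      code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the cache object; fields are illustrative. */
struct cache_model {
    int limit;
    int avail;
};

/* Test hook: when nonzero, the next allocation is made to fail. */
static int fail_next_alloc;

static struct cache_model *cache_alloc(int limit)
{
    struct cache_model *c = NULL;

    if (!fail_next_alloc)
        c = malloc(sizeof(*c));
    fail_next_alloc = 0;

    if (c) {                 /* initialise only if the allocation succeeded */
        c->limit = limit;
        c->avail = 0;
    }
    return c;                /* callers still check for NULL */
}
```

      Without the inner check, a failed allocation would be dereferenced
      during initialisation before the caller ever sees the NULL.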
      
      Link: http://lkml.kernel.org/r/010001680f42f192-82b4e12e-1565-4ee0-ae1f-1e98974906aa-000000@email.amazonses.com
      Fixes: 49dfc304 ("slab: use the lock on alien_cache, instead of the lock on array_cache")
      Fixes: c8522a3a ("Slab: introduce alloc_alien")
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Reported-by: <syzbot+d6ed4ec679652b4fd4e4@syzkaller.appspotmail.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      09c2e76e
    • fork, memcg: fix cached_stacks case · ba4a4574
      Shakeel Butt authored
      Commit 5eed6f1d ("fork,memcg: fix crash in free_thread_stack on
      memcg charge fail") fixes a crash caused by a failed memcg charge of
      the kernel stack.  However, the fix misses the cached_stacks case, which
      this patch fixes.  Without it, the same crash can happen if the memcg
      charge of a cached stack fails.
      
      Link: http://lkml.kernel.org/r/20190102180145.57406-1-shakeelb@google.com
      Fixes: 5eed6f1d
      
       ("fork,memcg: fix crash in free_thread_stack on memcg charge fail")
      Signed-off-by: default avatarShakeel Butt <shakeelb@google.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarRik van Riel <riel@surriel.com>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ba4a4574
    • Minchan Kim's avatar
      zram: idle writeback fixes and cleanup · 1d69a3f8
      Minchan Kim authored
      This patch includes some fixes and cleanup for idle-page writeback.
      
      1. writeback_limit interface
      
      Now the writeback_limit interface is rather confusing.  For example,
      once the writeback limit budget is exhausted, admin can see 0 from
      /sys/block/zramX/writeback_limit, which has the same semantics as a
      disabled writeback_limit.  IOW, admin cannot tell whether the zero
      came from a disabled writeback limit or an exhausted one.
      
      To make the interface clear, let's separate enabling of the writeback
      limit into another knob - /sys/block/zram0/writeback_limit_enable
      
      * before:
        while true :
          # to re-enable writeback limit once previous one is used up
          echo 0 > /sys/block/zram0/writeback_limit
          echo $((200<<20)) > /sys/block/zram0/writeback_limit
          ..
          .. # used up the writeback limit budget
      
      * new
        # To enable writeback limit, from the beginning, admin should
        # enable it.
        echo $((200<<20)) > /sys/block/zram0/writeback_limit
        echo 1 > /sys/block/zram/0/writeback_limit...
      1d69a3f8
    • David Herrmann's avatar
      fork: record start_time late · 7b558513
      David Herrmann authored
      
      
      This changes the fork(2) syscall to record the process start_time after
      initializing the basic task structure but still before making the new
      process visible to user-space.
      
      Technically, we could record the start_time anytime during fork(2).  But
      this might lead to scenarios where a start_time is recorded long before
      a process becomes visible to user-space.  For instance, with
      userfaultfd(2) and TLS, user-space can delay the execution of fork(2)
      for an indefinite amount of time (and will, if this causes network
      access, or similar).
      
      By recording the start_time late, it more closely reflects the point
      in time where the process becomes live and can be observed by other
      processes.
      
      Lastly, this makes it much harder for user-space to predict and control
      the start_time they get assigned.  Previously, user-space could fork a
      process and stall it in copy_thread_tls() before its pid is allocated,
      but after its start_time is recorded.  This can be misused to later-on
      cycle through PIDs and resume the stalled fork(2) yielding a process
      that has the same pid and start_time as a process that existed before.
      This can be used to circumvent security systems that identify processes
      by their pid+start_time combination.
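
      As an illustration of the pid+start_time identification mentioned
      above, here is a minimal user-space sketch that extracts the
      start time from a line in the /proc/<pid>/stat format, where it is
      field 22.  The function name parse_starttime() is hypothetical, and
      the sketch assumes the usual layout in which the command name in
      field 2 is wrapped in parentheses and may itself contain spaces or
      parentheses.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Extract field 22 (start time in clock ticks since boot) from a
 * /proc/<pid>/stat style line.  Returns 0 on success, -1 on a
 * malformed line.
 */
int parse_starttime(const char *stat_line, unsigned long long *out)
{
    /* Field 2 ends at the last ')' on the line. */
    const char *p = strrchr(stat_line, ')');
    char *end;
    int field;

    if (!p)
        return -1;
    p++;

    /* Skip fields 3 through 21, which precede the start time. */
    for (field = 3; field < 22; field++) {
        while (*p == ' ')
            p++;
        if (*p == '\0')
            return -1;
        while (*p != ' ' && *p != '\0')
            p++;
    }

    while (*p == ' ')
        p++;
    if (*p == '\0')
        return -1;

    *out = strtoull(p, &end, 10);
    return end == p ? -1 : 0;
}
```

      A monitoring tool that pairs this value with the pid is relying on
      exactly the combination that the scenario above describes as
      forgeable when start_time is recorded early.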
      
      Even though user-space was always aware that start_time recording is
      flaky (but several projects are known to still rely on start_time-based
      identification), changing the start_time to be recorded late will help
      mitigate existing attacks and make it much harder for user-space to
      control the start_time a process gets assigned.
      
      Reported-by: Jann Horn <jannh@google.com>
      Signed-off-by: Tom Gundersen <teg@jklm.no>
      Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7b558513
  4. Jan 07, 2019
    • Masahiro Yamada's avatar
      arch: restore generic-y += shmparam.h for some architectures · 3bd6e94b
      Masahiro Yamada authored
      For some reason, I accidentally got rid of "generic-y += shmparam.h"
      from some architectures.
      
      Restore them to fix building c6x, h8300, hexagon, m68k, microblaze,
      openrisc, and unicore32.
      
      Fixes: d6e4b3e3 ("arch: remove redundant UAPI generic-y defines")
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3bd6e94b
    • Linus Torvalds's avatar
      Linux 5.0-rc1 · bfeffd15
      Linus Torvalds authored
      bfeffd15
    • Linus Torvalds's avatar
      Merge tag 'kbuild-v4.21-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild · 85e1ffbd
      Linus Torvalds authored
      Pull more Kbuild updates from Masahiro Yamada:
      
       - improve boolinit.cocci and use_after_iter.cocci semantic patches
      
       - fix alignment for kallsyms
      
       - move 'asm goto' compiler test to Kconfig and clean up jump_label
         CONFIG option
      
       - generate asm-generic wrappers automatically if arch does not
         implement mandatory UAPI headers
      
       - remove redundant generic-y defines
      
       - misc cleanups
      
      * tag 'kbuild-v4.21-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kconfig: rename generated .*conf-cfg to *conf-cfg
        kbuild: remove unnecessary stubs for archheader and archscripts
        kbuild: use assignment instead of define ... endef for filechk_* rules
        arch: remove redundant UAPI generic-y defines
        kbuild: generate asm-generic wrappers if mandatory headers are missing
        arch: remove stale comments "UAPI Header export list"
        riscv: remove redundant kernel-space generic-y
        kbuild: change filechk to surround the given command with { }
        kbuild: remove redundant target cleaning on failure
        kbuild: clean up rule_dtc_dt_yaml
        kbuild: remove UIMAGE_IN and UIMAGE_OUT
        jump_label: move 'asm goto' support test to Kconfig
        kallsyms: lower alignment on ARM
        scripts: coccinelle: boolinit: drop warnings on named constants
        scripts: coccinelle: check for redeclaration
        kconfig: remove unused "file" field of yylval union
        nds32: remove redundant kernel-space generic-y
        nios2: remove unneeded HAS_DMA define
      85e1ffbd
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ac5eed2b
      Linus Torvalds authored
      Pull perf tooling updates from Ingo Molnar:
       "A final batch of perf tooling changes: mostly fixes and small
        improvements"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
        perf session: Add comment for perf_session__register_idle_thread()
        perf thread-stack: Fix thread stack processing for the idle task
        perf thread-stack: Allocate an array of thread stacks
        perf thread-stack: Factor out thread_stack__init()
        perf thread-stack: Allow for a thread stack array
        perf thread-stack: Avoid direct reference to the thread's stack
        perf thread-stack: Tidy thread_stack__bottom() usage
        perf thread-stack: Simplify some code in thread_stack__process()
        tools gpio: Allow overriding CFLAGS
        tools power turbostat: Override CFLAGS assignments and add LDFLAGS to build command
        tools thermal tmon: Allow overriding CFLAGS assignments
        tools power x86_energy_perf_policy: Override CFLAGS assignments and add LDFLAGS to build command
        perf c2c: Increase the HITM ratio limit for displayed cachelines
        perf c2c: Change the default coalesce setup
        perf trace beauty ioctl: Beautify USBDEVFS_ commands
        perf trace beauty: Export function to get the files for a thread
        perf trace: Wire up ioctl's USBDEBFS_ cmd table generator
        perf beauty ioctl: Add generator for USBDEVFS_ ioctl commands
        tools headers uapi: Grab a copy of usbdevice_fs.h
        perf trace: Store the major number for a file when storing its pathname
        ...
      ac5eed2b
    • Linus Torvalds's avatar
      Change mincore() to count "mapped" pages rather than "cached" pages · 574823bf
      Linus Torvalds authored
      
      
      The semantics of what "in core" means for the mincore() system call are
      somewhat unclear, but Linux has always (since 2.3.52, which is when
      mincore() was initially done) treated it as "page is available in page
      cache" rather than "page is mapped in the mapping".
      
      The problem with that traditional semantic is that it exposes a lot of
      system cache state that it really probably shouldn't, and that users
      shouldn't really even care about.
      
      So let's try to avoid that information leak by simply changing the
      semantics to be that mincore() counts actual mapped pages, not pages
      that might be cheaply mapped if they were faulted (note the "might be"
      part of the old semantics: being in the cache doesn't actually guarantee
      that you can access them without IO anyway, since things like network
      filesystems may have to revalidate the cache before use).
      
      In many ways the old semantics were somewhat insane even aside from the
      information leak issue.  From the very beginning (and that beginning is
      a long time ago: 2.3.52 was released in March 2000, I think), the code
      had a comment saying
      
        Later we can get more picky about what "in core" means precisely.
      
      and this is that "later".  Admittedly it is much later than is really
      comfortable.
      
      NOTE! This is a real semantic change, and it is for example known to
      change the output of "fincore", since that program literally does an
      mmap without populating it, and then does "mincore()" on that mapping
      that doesn't actually have any pages in it.
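
      For readers unfamiliar with the call, the following is a rough
      user-space sketch of how a tool might count the pages that mincore()
      reports for a mapping.  The helper names count_in_core() and
      pages_in_core() are hypothetical, this is not the fincore source, and
      the sketch assumes the Linux prototype of mincore() and a
      page-aligned start address.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count vector entries whose least significant bit is set. */
size_t count_in_core(const unsigned char *vec, size_t npages)
{
    size_t n = 0;
    size_t i;

    for (i = 0; i < npages; i++)
        n += vec[i] & 1;
    return n;
}

/*
 * Ask mincore() about [addr, addr + len) and return how many pages it
 * reports as in core, or -1 on error.  addr must be page aligned.
 */
long pages_in_core(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t npages;
    unsigned char *vec;
    long result = -1;

    if (page <= 0)
        return -1;

    npages = (len + (size_t)page - 1) / (size_t)page;
    vec = malloc(npages ? npages : 1);
    if (!vec)
        return -1;

    if (mincore(addr, len, vec) == 0)
        result = (long)count_in_core(vec, npages);

    free(vec);
    return result;
}
```

      Calling pages_in_core() on a file mapping that has been created with
      mmap() but never touched is the situation the note above describes:
      the number it returns depends on whether the kernel counts cached
      pages or only mapped ones.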
      
      I'm hoping that nobody actually has any workflow that cares, and the
      info leak is real.
      
      We may have to do something different if it turns out that people have
      valid reasons to want the old semantics, and if we can limit the
      information leak sanely.
      
      Cc: Kevin Easton <kevin@guarana.org>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Masatake YAMATO <yamato@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      574823bf
    • Linus Torvalds's avatar
      Fix 'acccess_ok()' on alpha and SH · 94bd8a05
      Linus Torvalds authored
      Commit 594cc251 ("make 'user_access_begin()' do 'access_ok()'")
      broke both alpha and SH booting in qemu, as noticed by Guenter Roeck.
      
      It turns out that the bug wasn't actually in that commit itself (which
      would have been surprising: it was mostly a no-op), but in how the
      addition of access_ok() to the strncpy_from_user() and strnlen_user()
      functions now triggered the case where those functions would test the
      access of the very last byte of the user address space.
      
      The string functions actually did that user range test before too, but
      they did it manually by just comparing against user_addr_max().  But
      with user_access_begin() doing the check (using "access_ok()"), it now
      exposed problems in the architecture implementations of that function.
      
      For example, on alpha, the access_ok() helper macro looked like this:
      
        #define __access_ok(addr, size) \
              ((get_fs().seg & (addr | size | (addr+size))) == 0)
      
      and what it basically tests is whether any of the high bits get set
      (the USER_DS masking value is 0xfffffc0000000000).
      
      And that's completely wrong for the "addr+size" check.  Because it's
      off-by-one for the case where we check to the very end of the user
      address space, which is exactly what the strn*_user() functions do.
      
      Why? Because "addr+size" will be exactly the size of the address space,
      so trying to access the last byte of the user address space will fail
      the __access_ok() check, even though it shouldn't.  As a result, the
      user string accessor functions failed consistently - because they
      literally don't know how long the string is going to be, and the max
      access is going to be that last byte of the user address space.
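
      The off-by-one can be reproduced in user space with a small sketch.
      The mask value is the one quoted above; the names
      USER_DS_MASK_SKETCH and old_alpha_access_ok() are hypothetical, and
      the function only mirrors the arithmetic of the old macro, not the
      kernel's get_fs() machinery.

```c
#include <stdint.h>

/* USER_DS masking value quoted in the commit message. */
#define USER_DS_MASK_SKETCH UINT64_C(0xfffffc0000000000)

/*
 * Same arithmetic as the old alpha macro: the access is rejected if
 * any masked high bit is set in addr, size, or addr + size.
 */
int old_alpha_access_ok(uint64_t addr, uint64_t size)
{
    return (USER_DS_MASK_SKETCH & (addr | size | (addr + size))) == 0;
}
```

      With this mask the user address space is 2^42 bytes, so a one-byte
      access at address 2^42 - 1 yields addr + size == 2^42, which has a
      masked bit set and is rejected, while the same access one byte
      earlier is accepted.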
      
      Side note: that alpha macro is buggy for another reason too - it re-uses
      the arguments twice.
      
      And SH has another version of almost the exact same bug:
      
        #define __addr_ok(addr) \
              ((unsigned long __force)(addr) < current_thread_info()->addr_limit.seg)
      
      so far so good: yes, a user address must be below the limit.  But then:
      
        #define __access_ok(addr, size)         \
              (__addr_ok((addr) + (size)))
      
      is wrong with the exact same off-by-one case: the case when "addr+size"
      is exactly _equal_ to the limit is actually perfectly fine (think "one
      byte access at the last address of the user address space")
      
      The SH version is actually seriously buggy in another way: it doesn't
      actually check for overflow, even though it did copy the _comment_ that
      talks about overflow.
      
      So it turns out that both SH and alpha actually have completely buggy
      implementations of access_ok(), but they happened to work in practice
      (although the SH overflow one is a serious serious security bug, not
      that anybody likely cares about SH security).
      
      This fixes the problems by using a similar macro on both alpha and SH.
      It isn't trying to be clever, the end address is based on this logic:
      
              unsigned long __ao_end = __ao_a + __ao_b - !!__ao_b;
      
      which basically says "add start and length, and then subtract one unless
      the length was zero".  We can't subtract one for a zero length, or we'd
      just hit an underflow instead.
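
      A user-space sketch of that end-address logic, next to the kind of
      check it replaces, might look like this.  The function names and the
      explicit limit parameter are assumptions made for illustration; the
      kernel versions are macros that read the limit from per-thread
      state.

```c
#include <stdint.h>

/*
 * Old SH-style check: compares the one-past-the-end address with the
 * limit and does not notice wrap-around.
 */
int old_range_ok(uint64_t addr, uint64_t size, uint64_t limit)
{
    return addr + size < limit;
}

/*
 * Check based on the last byte actually accessed: add start and
 * length, then subtract one unless the length is zero.  The
 * "end >= addr" test rejects ranges that wrap around.
 */
int range_ok(uint64_t addr, uint64_t size, uint64_t limit)
{
    uint64_t end = addr + size - !!size;

    return end >= addr && end < limit;
}
```

      With a limit of 0x1000, range_ok(0xfff, 1, 0x1000) accepts the
      one-byte access at the last valid address, which old_range_ok()
      rejects, and range_ok() rejects a range that wraps past the top of
      the address space, which old_range_ok() accepts.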
      
      For a lot of access_ok() users the length is a constant, so this isn't
      actually as expensive as it initially looks.
      
      Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      94bd8a05
    • Linus Torvalds's avatar
      Merge tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt · baa67073
      Linus Torvalds authored
      Pull fscrypt updates from Ted Ts'o:
       "Add Adiantum support for fscrypt"
      
      * tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt:
        fscrypt: add Adiantum support
      baa67073
    • Linus Torvalds's avatar
      Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 21524046
      Linus Torvalds authored
      Pull ext4 bug fixes from Ted Ts'o:
       "Fix a number of ext4 bugs"
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: fix special inode number checks in __ext4_iget()
        ext4: track writeback errors using the generic tracking infrastructure
        ext4: use ext4_write_inode() when fsyncing w/o a journal
        ext4: avoid kernel warning when writing the superblock to a dead device
        ext4: fix a potential fiemap/page fault deadlock w/ inline_data
        ext4: make sure enough credits are reserved for dioread_nolock writes
      21524046
    • Linus Torvalds's avatar
      Merge tag 'dma-mapping-4.21-1' of git://git.infradead.org/users/hch/dma-mapping · e2b745f4
      Linus Torvalds authored
      Pull dma-mapping fixes from Christoph Hellwig:
       "Fix various regressions introduced in this cycle:
      
         - fix dma-debug tracking for the map_page / map_single
           consolidation
      
         - properly stub out DMA mapping symbols for !HAS_DMA builds to avoid
           link failures
      
         - fix AMD Gart direct mappings
      
         - setup the dma address for no kernel mappings using the remap
           allocator"
      
      * tag 'dma-mapping-4.21-1' of git://git.infradead.org/users/hch/dma-mapping:
        dma-direct: fix DMA_ATTR_NO_KERNEL_MAPPING for remapped allocations
        x86/amd_gart: fix unmapping of non-GART mappings
        dma-mapping: remove a few unused exports
        dma-mapping: properly stub out the DMA API for !CONFIG_HAS_DMA
        dma-mapping: remove dmam_{declare,release}_coherent_memory
        dma-mapping: implement dmam_alloc_coherent using dmam_alloc_attrs
        dma-mapping: implement dma_map_single_attrs using dma_map_page_attrs
      e2b745f4
    • Linus Torvalds's avatar
      Merge tag 'tag-chrome-platform-for-v4.21' of... · 12133258
      Linus Torvalds authored
      Merge tag 'tag-chrome-platform-for-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform
      
      Pull chrome platform updates from Benson Leung:
      
       - Changes for EC_MKBP_EVENT_SENSOR_FIFO handling.
      
       - Also, maintainership changes. Olofj out, Enric balletbo in.
      
      * tag 'tag-chrome-platform-for-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform:
        MAINTAINERS: add maintainers for ChromeOS EC sub-drivers
        MAINTAINERS: platform/chrome: Add Enric as a maintainer
        MAINTAINERS: platform/chrome: remove myself as maintainer
        platform/chrome: don't report EC_MKBP_EVENT_SENSOR_FIFO as wakeup
        platform/chrome: straighten out cros_ec_get_{next,host}_event() error codes
      12133258
    • Linus Torvalds's avatar
      Merge tag 'hwlock-v4.21' of git://github.com/andersson/remoteproc · 66e012f6
      Linus Torvalds authored
      Pull hwspinlock updates from Bjorn Andersson:
       "This adds support for the hardware semaphores found in STM32MP1"
      
      * tag 'hwlock-v4.21' of git://github.com/andersson/remoteproc:
        hwspinlock: fix return value check in stm32_hwspinlock_probe()
        hwspinlock: add STM32 hwspinlock device
        dt-bindings: hwlock: Document STM32 hwspinlock bindings
      66e012f6
  5. Jan 06, 2019
    • Eric Biggers's avatar
      fscrypt: add Adiantum support · 8094c3ce
      Eric Biggers authored
      Add support for the Adiantum encryption mode to fscrypt.  Adiantum is a
      tweakable, length-preserving encryption mode with security provably
      reducible to that of XChaCha12 and AES-256, subject to a security bound.
      It's also a true wide-block mode, unlike XTS.  See the paper
      "Adiantum: length-preserving encryption for entry-level processors"
      (https://eprint.iacr.org/2018/720.pdf) for more details.  Also see
      commit 059c2a4d ("crypto: adiantum - add Adiantum support").
      
      On sufficiently long messages, Adiantum's bottlenecks are XChaCha12 and
      the NH hash function.  These algorithms are fast even on processors
      without dedicated crypto instructions.  Adiantum makes it feasible to
      enable storage encryption on low-end mobile devices that lack AES
      instructions; currently such devices are unencrypted.  On ARM Cortex-A7,
      on 4096-byte messages Adiantum encryption is about 4 times faster than
      AES-256-XTS encryption; decryption is about 5 times faster.
      
      In fscrypt, Adiantum is suitable for encrypting both file contents and
      names.  With filenames, it fixes a known weakness: when two filenames in
      a directory share a common prefix of >= 16 bytes, with CTS-CBC their
      encrypted filenames share a common prefix too, leaking information.
      Adiantum does not have this problem.
      
      Since Adiantum also accepts long tweaks (IVs), it's also safe to use the
      master key directly for Adiantum encryption rather than deriving
      per-file keys, provided that the per-file nonce is included in the IVs
      and the master key isn't used for any other encryption mode.  This
      configuration saves memory and improves performance.  A new fscrypt
      policy flag is added to allow users to opt-in to this configuration.
      
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      8094c3ce
    • Linus Torvalds's avatar
      Merge tag 'docs-5.0-fixes' of git://git.lwn.net/linux · b5aef86e
      Linus Torvalds authored
      Pull documentation fixes from Jonathan Corbet:
       "A handful of late-arriving documentation fixes"
      
      * tag 'docs-5.0-fixes' of git://git.lwn.net/linux:
        doc: filesystems: fix bad references to nonexistent ext4.rst file
        Documentation/admin-guide: update URL of LKML information link
        Docs/kernel-api.rst: Remove blk-tag.c reference
      b5aef86e
    • Linus Torvalds's avatar
      Merge tag 'firewire-update' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 · 15b215e5
      Linus Torvalds authored
      Pull firewire fixlet from Stefan Richter:
       "Remove an explicit dependency in Kconfig which is implied by another
        dependency"
      
      * tag 'firewire-update' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
        firewire: Remove depends on HAS_DMA in case of platform dependency
      15b215e5
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20190104' of git://git.kernel.dk/linux-block · d7252d0d
      Linus Torvalds authored
      Pull block updates and fixes from Jens Axboe:
      
       - Pulled in MD changes that Shaohua had queued up for 4.21.
      
         Unfortunately we lost Shaohua late 2018, I'm sending these in on his
         behalf.
      
       - In conjunction with the above, I added a CREDITS entry for Shaohua.
      
       - sunvdc queue restart fix (Ming)
      
      * tag 'for-linus-20190104' of git://git.kernel.dk/linux-block:
        Add CREDITS entry for Shaohua Li
        block: sunvdc: don't run hw queue synchronously from irq context
        md: fix raid10 hang issue caused by barrier
        raid10: refactor common wait code from regular read/write request
        md: remvoe redundant condition check
        lib/raid6: add option to skip algo benchmarking
        lib/raid6: sort algos in rough performance order
        lib/raid6: check for assembler SSSE3 support
        lib/raid6: avoid __attribute_const__ redefinition
        lib/raid6: add missing include for raid6test
        md: remove set but not used variable 'bi_rdev'
      d7252d0d
    • Linus Torvalds's avatar
      Merge tag 'drm-next-2019-01-05' of git://anongit.freedesktop.org/drm/drm · 0fe4e2d5
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Happy New Year, just decloaking from leave to get some stuff from the
        last week in before rc1:
      
        core:
         - two regression fixes for damage blob and atomic
      
        i915 gvt:
         - Some missed GVT fixes from the original pull
      
        amdgpu:
         - new PCI IDs
         - SR-IOV fixes
         - DC fixes
         - Vega20 fixes"
      
      * tag 'drm-next-2019-01-05' of git://anongit.freedesktop.org/drm/drm: (53 commits)
        drm: Put damage blob when destroy plane state
        drm: fix null pointer dereference on null state pointer
        drm/amdgpu: Add new VegaM pci id
        drm/ttm: Use drm_debug_printer for all ttm_bo_mem_space_debug output
        drm/amdgpu: add Vega20 PSP ASD firmware loading
        drm/amd/display: Fix MST dp_blank REG_WAIT timeout
        drm/amd/display: validate extended dongle caps
        drm/amd/display: Use div_u64 for flip timestamp ns to ms
        drm/amdgpu/uvd:Change uvd ring name convention
        drm/amd/powerplay: add Vega20 LCLK DPM level setting support
        drm/amdgpu: print process info when job timeout
        drm/amdgpu/nbio7.4: add hw bug workaround for vega20
        drm/amdgpu/nbio6.1: add hw bug workaround for vega10/12
        drm/amd/display: Optimize passive update planes.
        drm/amd/display: verify lane status before exiting verify link cap
        drm/amd/display: Fix bug with not updating VSP infoframe
        drm/amd/display: Add retry to read ddc_clock pin
        drm/amd/display: Don't skip link training for empty dongle
        drm/amd/display: Wait edp HPD to high in detect_sink
        drm/amd/display: fix surface update sequence
        ...
      0fe4e2d5
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 3954e1d0
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "Over the break a few defects were found, so this is a -rc style pull
        request of various small things that have been posted.
      
         - An attempt to shorten RCU grace period driven delays showed crashes
           during heavier testing, and has been entirely reverted
      
         - A missed merge/rebase error between the advise_mr and ib_device_ops
           series
      
         - Some small static analysis driven fixes from Julia and Aditya
      
         - Missed ability to create a XRC_INI in the devx verbs interop
           series"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        infiniband/qedr: Potential null ptr dereference of qp
        infiniband: bnxt_re: qplib: Check the return value of send_message
        IB/ipoib: drop useless LIST_HEAD
        IB/core: Add advise_mr to the list of known ops
        Revert "IB/mlx5: Fix long EEH recover time with NVMe offloads"
        IB/mlx5: Allow XRC INI usage via verbs in DEVX context
      3954e1d0