  1. Nov 04, 2018
    • Linus Torvalds's avatar
      Merge tag '4.20-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · 16944728
      Linus Torvalds authored
      Pull cifs fixes and updates from Steve French:
       "Three small fixes (one Kerberos related, one for stable, and another
        fixes an oops in xfstest 377), two helpful debugging improvements,
        three patches for cifs directio and some minor cleanup"
      
      * tag '4.20-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: fix signed/unsigned mismatch on aio_read patch
        cifs: don't dereference smb_file_target before null check
        CIFS: Add direct I/O functions to file_operations
        CIFS: Add support for direct I/O write
        CIFS: Add support for direct I/O read
        smb3: missing defines and structs for reparse point handling
        smb3: allow more detailed protocol info on open files for debugging
        smb3: on kerberos mount if server doesn't specify auth type use krb5
        smb3: add trace point for tree connection
        cifs: fix spelling mistake, EACCESS -> EACCES
        cifs: fix return value for cifs_listxattr
      16944728
    • Linus Torvalds's avatar
      Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · ed61a132
      Linus Torvalds authored
      Pull 9p fix from Al Viro:
       "Regression fix for net/9p handling of iov_iter; broken by braino when
        switching to iov_iter_is_kvec() et.al., spotted and fixed by Marc"
      
      * 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        iov_iter: Fix 9p virtio breakage
      ed61a132
    • Linus Torvalds's avatar
      Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · af102b33
      Linus Torvalds authored
      Pull more SCSI updates from James Bottomley:
       "This is a set of minor small (and safe changes) that didn't make the
        initial pull request plus some bug fixes"
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: mvsas: Remove set but not used variable 'id'
        scsi: qla2xxx: Remove two arguments from qlafx00_error_entry()
        scsi: qla2xxx: Make sure that qlafx00_ioctl_iosb_entry() initializes 'res'
        scsi: qla2xxx: Remove a set-but-not-used variable
        scsi: qla2xxx: Make qla2x00_sysfs_write_nvram() easier to analyze
        scsi: qla2xxx: Declare local functions 'static'
        scsi: qla2xxx: Improve several kernel-doc headers
        scsi: qla2xxx: Modify fall-through annotations
        scsi: 3w-sas: 3w-9xxx: Use unsigned char for cdb
        scsi: mvsas: Use dma_pool_zalloc
        scsi: target: Don't request modules that aren't even built
        scsi: target: Set response length for REPORT TARGET PORT GROUPS
      af102b33
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · cddfa11a
      Linus Torvalds authored
      Merge more updates from Andrew Morton:
      
       - more ocfs2 work
      
       - various leftovers
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        memory_hotplug: cond_resched in __remove_pages
        bfs: add sanity check at bfs_fill_super()
        kernel/sysctl.c: remove duplicated include
        kernel/kexec_file.c: remove some duplicated includes
        mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask
        ocfs2: fix clusters leak in ocfs2_defrag_extent()
        ocfs2: dlmglue: clean up timestamp handling
        ocfs2: don't put and assigning null to bh allocated outside
        ocfs2: fix a misuse a of brelse after failing ocfs2_check_dir_entry
        ocfs2: don't use iocb when EIOCBQUEUED returns
        ocfs2: without quota support, avoid calling quota recovery
        ocfs2: remove ocfs2_is_o2cb_active()
        mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings
        include/linux/notifier.h: SRCU: fix ctags
        mm: handle no memcg case in memcg_kmem_charge() properly
      cddfa11a
    • Michal Hocko's avatar
      memory_hotplug: cond_resched in __remove_pages · dd33ad7b
      Michal Hocko authored
      We have received a bug report that unbinding a large pmem (>1TB) can
      result in a soft lockup:
      
        NMI watchdog: BUG: soft lockup - CPU#9 stuck for 23s! [ndctl:4365]
        [...]
        Supported: Yes
        CPU: 9 PID: 4365 Comm: ndctl Not tainted 4.12.14-94.40-default #1 SLE12-SP4
        Hardware name: Intel Corporation S2600WFD/S2600WFD, BIOS SE5C620.86B.01.00.0833.051120182255 05/11/2018
        task: ffff9cce7d4410c0 task.stack: ffffbe9eb1bc4000
        RIP: 0010:__put_page+0x62/0x80
        Call Trace:
         devm_memremap_pages_release+0x152/0x260
         release_nodes+0x18d/0x1d0
         device_release_driver_internal+0x160/0x210
         unbind_store+0xb3/0xe0
         kernfs_fop_write+0x102/0x180
         __vfs_write+0x26/0x150
         vfs_write+0xad/0x1a0
         SyS_write+0x42/0x90
         do_syscall_64+0x74/0x150
         entry_SYSCALL_64_after_hwframe+0x3d/0xa2
        RIP: 0033:0x7fd13166b3d0
      
      It has been reported on an older (4.12) kernel but the current upstream
      code doesn't cond_resched in the hot remove code at all and the given
      range to remove might be really large.  Fix the issue by calling
      cond_resched once per memory section.
      
      Link: http://lkml.kernel.org/r/20181031125840.23982-1-mhocko@kernel.org
      
      
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Johannes Thumshirn <jthumshirn@suse.de>
      Cc: Dan Williams <dan.j.williams@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      dd33ad7b
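      The "yield once per memory section" idea from the commit above can
      be sketched in userspace C.  This is only an illustration under stated
      assumptions: sched_yield() stands in for the kernel's cond_resched(),
      and PAGES_PER_SECTION_SKETCH, section_fn and remove_range_sketch() are
      hypothetical names that do not appear in the patch.

```c
#include <sched.h>
#include <stddef.h>

/* Hypothetical section size; the kernel's real value is arch-specific. */
#define PAGES_PER_SECTION_SKETCH 32768UL

/* Hypothetical per-section teardown hook supplied by the caller. */
typedef void (*section_fn)(unsigned long start_pfn, unsigned long nr_pages);

/*
 * Walk [start_pfn, start_pfn + nr_pages) one section at a time and give
 * up the CPU after each section, so a very large range cannot monopolize
 * the processor. Returns the number of yields performed.
 */
static unsigned long remove_range_sketch(unsigned long start_pfn,
                                         unsigned long nr_pages,
                                         section_fn remove_section)
{
    unsigned long yields = 0;
    unsigned long done = 0;

    while (done < nr_pages) {
        unsigned long chunk = nr_pages - done;

        if (chunk > PAGES_PER_SECTION_SKETCH)
            chunk = PAGES_PER_SECTION_SKETCH;

        if (remove_section)
            remove_section(start_pfn + done, chunk);
        done += chunk;

        /* Userspace stand-in for cond_resched(). */
        sched_yield();
        yields++;
    }
    return yields;
}
```

      Yielding per section rather than per page keeps the overhead small
      while still bounding the time spent without a scheduling point.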
    • Tetsuo Handa's avatar
      bfs: add sanity check at bfs_fill_super() · 9f2df09a
      Tetsuo Handa authored
      syzbot is reporting an excessively large memory allocation in
      bfs_fill_super() [1].  Because the file system image is corrupted such
      that bfs_sb->s_start == 0, bfs_fill_super() tries to allocate 8MB of
      contiguous memory.  Fix this by adding a sanity check on
      bfs_sb->s_start, __GFP_NOWARN and printf().
      
      [1] https://syzkaller.appspot.com/bug?id=16a87c236b951351374a84c8a32f40edbc034e96
      
      Link: http://lkml.kernel.org/r/1525862104-3407-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
      
      
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-by: syzbot <syzbot+71c6b5d68e91149fc8a4@syzkaller.appspotmail.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Tigran Aivazian <aivazian.tigran@gmail.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9f2df09a
    • Michael Schupikov's avatar
    • zhong jiang's avatar
      kernel/kexec_file.c: remove some duplicated includes · 3383b360
      zhong jiang authored
      kexec.h and slab.h are each included twice in kexec_file.c.  The
      duplicates are unnecessary, so remove them.
      
      Link: http://lkml.kernel.org/r/1537498098-19171-1-git-send-email-zhongjiang@huawei.com
      
      
      Signed-off-by: zhong jiang <zhongjiang@huawei.com>
      Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Baoquan He <bhe@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3383b360
    • Michal Hocko's avatar
      mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask · 89c83fb5
      Michal Hocko authored
      THP allocation mode is quite complex and depends on the defrag mode.
      Most of this complexity is currently hidden in
      alloc_hugepage_direct_gfpmask.  The NUMA special casing (namely
      __GFP_THISNODE) is, however, independent and currently lives in
      alloc_pages_vma.  This adds an unnecessary branch to all vma based
      page allocation requests and makes the code needlessly complex.  Not
      to mention that e.g. shmem THP used to do the node reclaiming
      unconditionally, regardless of the defrag mode, until recently.  This
      was not only unexpected behavior but also hardly a good default, and
      I strongly suspect it was a side effect of the code sharing rather
      than a deliberate decision, which suggests that such layering is
      wrong.
      
      Get rid of the thp special casing from alloc_pages_vma and move the
      logic to alloc_hugepage_direct_gfpmask. __GFP_THISNODE is applied to the
      resulting gfp mask only when the direct reclaim is not requested and
      when there is no explicit numa binding to preserve the current logic.
      
      Please note that there's also a slight difference wrt MPOL_BIND now. The
      previous code would avoid using __GFP_THISNODE if the local node was
      outside of policy_nodemask(). After this patch __GFP_THISNODE is avoided
      for all MPOL_BIND policies. So there's a difference that if local node
      is actually allowed by the bind policy's nodemask, previously
      __GFP_THISNODE would be added, but now it won't be. From the behavior
      POV this is still correct because the policy nodemask is used.
      
      Link: http://lkml.kernel.org/r/20180925120326.24392-3-mhocko@kernel.org
      
      
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
      Cc: Zi Yan <zi.yan@cs.rutgers.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      89c83fb5
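      The rule stated in the commit above (apply __GFP_THISNODE only when
      direct reclaim is not requested and there is no explicit NUMA binding)
      can be illustrated with a small standalone sketch.  The flag values
      and the names SK_GFP_DIRECT_RECLAIM, SK_GFP_THISNODE and
      thp_gfpmask_sketch() are hypothetical; they are not the kernel's real
      gfp bits or functions.

```c
#include <stdbool.h>

/* Hypothetical bit values for illustration; not the kernel's gfp bits. */
#define SK_GFP_DIRECT_RECLAIM 0x1u
#define SK_GFP_THISNODE       0x2u

/*
 * Return the mask to use for a huge page allocation. The node-local
 * restriction is added only when the request would not enter direct
 * reclaim and the caller has no explicit NUMA binding.
 */
static unsigned int thp_gfpmask_sketch(unsigned int base,
                                       bool explicit_numa_binding)
{
    unsigned int gfp = base;

    if (!(gfp & SK_GFP_DIRECT_RECLAIM) && !explicit_numa_binding)
        gfp |= SK_GFP_THISNODE;

    return gfp;
}
```

      Keeping the decision in one helper is the point of the consolidation:
      callers no longer need their own THP-specific branch.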
    • Larry Chen's avatar
      ocfs2: fix clusters leak in ocfs2_defrag_extent() · 6194ae42
      Larry Chen authored
      ocfs2_defrag_extent() might leak allocated clusters.  When the file
      system has insufficient space, the number of claimed clusters might be
      less than the caller wants.  If that happens, the original code might
      directly commit the transaction without returning clusters.
      
      This patch is based on code in ocfs2_add_clusters_in_btree().
      
      [akpm@linux-foundation.org: include localalloc.h, reduce scope of data_ac]
      Link: http://lkml.kernel.org/r/20180904041621.16874-3-lchen@suse.com
      
      
      Signed-off-by: Larry Chen <lchen@suse.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Cc: Changwei Ge <ge.changwei@h3c.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6194ae42
    • Arnd Bergmann's avatar
      ocfs2: dlmglue: clean up timestamp handling · 3a3d1e51
      Arnd Bergmann authored
      The handling of timestamps outside of the 1970..2038 range in the dlm
      glue is rather inconsistent: on 32-bit architectures, this has always
      wrapped around to negative timestamps in the 1902..1969 range, while on
      64-bit kernels all timestamps are interpreted as positive 34 bit numbers
      in the 1970..2514 year range.
      
      Now that the VFS code handles 64-bit timestamps on all architectures, we
      can make the behavior more consistent here, and return the same result
      that we had on 64-bit already, making the file system y2038 safe in the
      process.  Outside of dlmglue, it already uses 64-bit on-disk timestamps
      anyway, so that part is fine.
      
      For consistency, I'm changing ocfs2_pack_timespec() to clamp anything
      outside of the supported range to the minimum and maximum values.  This
      avoids a possible ambiguity of values before 1970 in particular, which
      used to be interpreted as times at the end of the 2514 range previously.
      
      Link: http://lkml.kernel.org/r/20180619155826.4106487-1-arnd@arndb.de
      
      
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Cc: Changwei Ge <ge.changwei@h3c.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3a3d1e51
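      The clamping described in the commit above can be sketched as
      follows.  The 34-bit seconds width comes from the commit message; the
      bit layout (seconds in the upper bits, nanoseconds in the remaining
      low bits) and all SK_* and *_sketch names are assumptions made for
      this illustration, not the actual ocfs2 definitions.

```c
#include <stdint.h>

#define SK_SEC_BITS  34
#define SK_SEC_SHIFT (64 - SK_SEC_BITS)
#define SK_SEC_MAX   ((INT64_C(1) << SK_SEC_BITS) - 1)
#define SK_NSEC_MASK ((UINT64_C(1) << SK_SEC_SHIFT) - 1)

/*
 * Pack a timestamp into 64 bits. Seconds outside [0, SK_SEC_MAX] are
 * clamped to the nearest bound instead of wrapping around, so a
 * pre-1970 value can no longer be misread as a far-future one.
 */
static uint64_t pack_timespec_sketch(int64_t sec, uint32_t nsec)
{
    if (sec < 0)
        sec = 0;
    if (sec > SK_SEC_MAX)
        sec = SK_SEC_MAX;

    return ((uint64_t)sec << SK_SEC_SHIFT) | ((uint64_t)nsec & SK_NSEC_MASK);
}

/* Recover the (always non-negative) seconds field. */
static int64_t unpack_seconds_sketch(uint64_t packed)
{
    return (int64_t)(packed >> SK_SEC_SHIFT);
}
```

      Clamping trades exactness for unambiguity: out-of-range inputs lose
      their original value but decode to the same result on every
      architecture.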
    • Changwei Ge's avatar
      ocfs2: don't put and assigning null to bh allocated outside · cf76c785
      Changwei Ge authored
      ocfs2_read_blocks() and ocfs2_read_blocks_sync() are both used to read
      several blocks from disk.  Currently, the input argument *bhs* may or
      may not be NULL, depending on the caller.  If the function fails to
      read blocks from disk, the corresponding bh is put and set to NULL.
      
      Obviously, this is not appropriate for a non-NULL input bh, because
      the caller does not know that its bhs have been put and re-assigned.
      
      If the buffer head is managed by the caller, ocfs2_read_blocks() and
      ocfs2_read_blocks_sync() should not set it to NULL.  Doing so causes
      the caller to access illegal memory and crash.
      
      Link: http://lkml.kernel.org/r/HK2PR06MB045285E0F4FBB561F9F2F9B3D5680@HK2PR06MB0452.apcprd06.prod.outlook.com
      
      
      Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
      Reviewed-by: Guozhonghua <guozhonghua@h3c.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Cc: Changwei Ge <ge.changwei@h3c.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cf76c785
    • Changwei Ge's avatar
      ocfs2: fix a misuse a of brelse after failing ocfs2_check_dir_entry · 29aa3016
      Changwei Ge authored
      Somehow, file system metadata was corrupted, which causes
      ocfs2_check_dir_entry() to fail in function ocfs2_dir_foreach_blk_el().
      
      According to the original design intention, if the above happens we
      should skip the problematic block and continue to retrieve dir
      entries.  But there is an obvious misuse of brelse around the related
      code.
      
      After ocfs2_check_dir_entry() fails, the current code just moves to
      the next position and uses the problematic buffer head again and
      again, releasing it multiple times in the process.  I suppose this is
      a serious, long-lived issue in ocfs2.  It may also drive other file
      systems in use on the same host insane.
      
      So we should also consider backporting this patch to linux -stable.
      
      Link: http://lkml.kernel.org/r/HK2PR06MB045211675B43EED794E597B6D56E0@HK2PR06MB0452.apcprd06.prod.outlook.com
      
      
      Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
      Suggested-by: Changkuo Shi <shi.changkuo@h3c.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      29aa3016
    • Changwei Ge's avatar
      ocfs2: don't use iocb when EIOCBQUEUED returns · 9e985787
      Changwei Ge authored
      When -EIOCBQUEUED returns, it means that aio_complete() will be called
      from dio_complete(), which runs asynchronously with respect to
      write_iter.  Generally, IO is much slower than executing
      instructions, but we still can't take the risk of accessing a freed
      iocb.
      
      And we do face a BUG crash issue.  Using the crash tool, iocb is
      obviously freed already.
      
        crash> struct -x kiocb ffff881a350f5900
        struct kiocb {
          ki_filp = 0xffff881a350f5a80,
          ki_pos = 0x0,
          ki_complete = 0x0,
          private = 0x0,
          ki_flags = 0x0
        }
      
      And the backtrace shows:
        ocfs2_file_write_iter+0xcaa/0xd00 [ocfs2]
        aio_run_iocb+0x229/0x2f0
        do_io_submit+0x291/0x540
        SyS_io_submit+0x10/0x20
        system_call_fastpath+0x16/0x75
      
      Link: http://lkml.kernel.org/r/1523361653-14439-1-git-send-email-ge.changwei@h3c.com
      
      
      Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9e985787
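      The hazard described in the commit above (touching a request after it
      has been queued for asynchronous completion) can be shown with a
      userspace sketch.  Everything here is hypothetical: req_sketch,
      submit_sketch() and write_sketch() are illustrative names, free()
      stands in for a completion path that may run first, and the numeric
      value of SK_EIOCBQUEUED is an assumption about the kernel-internal
      constant.

```c
#include <stdlib.h>

/* Assumed to mirror the kernel-internal EIOCBQUEUED value. */
#define SK_EIOCBQUEUED 529

struct req_sketch {
    long pos;
    int flags;
};

/*
 * Hypothetical submit. In the async case, ownership of req passes to the
 * completion side, modelled here by freeing it immediately.
 */
static long submit_sketch(struct req_sketch *req, int async, long nbytes)
{
    if (async) {
        free(req); /* completion may already have released the request */
        return -SK_EIOCBQUEUED;
    }
    req->pos += nbytes;
    return nbytes;
}

/* Snapshot what is needed before submitting; never read req once queued. */
static long write_sketch(struct req_sketch *req, int async, long nbytes,
                         long *end_pos)
{
    long start = req->pos; /* taken while req is still owned by us */
    long ret = submit_sketch(req, async, nbytes);

    if (ret == -SK_EIOCBQUEUED) {
        *end_pos = start; /* only the snapshot is safe to use here */
        return ret;
    }

    *end_pos = req->pos;
    return ret;
}
```

      The design choice is to treat the queued return value as a transfer
      of ownership: any state needed afterwards is copied out beforehand.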
    • Guozhonghua's avatar
      ocfs2: without quota support, avoid calling quota recovery · 21158ca8
      Guozhonghua authored
      During recovery of a dead node by another node, quota recovery work
      will be queued.  We should avoid calling quota recovery when quota is
      not supported, so check the quota flags.
      
      Link: http://lkml.kernel.org/r/71604351584F6A4EBAE558C676F37CA401071AC9FB@H3CMLB12-EX.srv.huawei-3com.com
      
      
      Signed-off-by: guozhonghua <guozhonghua@h3c.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Cc: Changwei Ge <ge.changwei@h3c.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      21158ca8
    • Gang He's avatar
      ocfs2: remove ocfs2_is_o2cb_active() · a6346447
      Gang He authored
      Remove ocfs2_is_o2cb_active().  We have similar functions to identify
      which cluster stack is being used via osb->osb_cluster_stack.
      
      Secondly, the current implementation of ocfs2_is_o2cb_active() is not
      totally safe.  Based on the design of stackglue, we need to get
      ocfs2_stack_lock before using ocfs2_stack related data structures, and
      that active_stack pointer can be NULL in the case of mount failure.
      
      Link: http://lkml.kernel.org/r/1495441079-11708-1-git-send-email-ghe@suse.com
      
      
      Signed-off-by: Gang He <ghe@suse.com>
      Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
      Reviewed-by: Eric Ren <zren@suse.com>
      Acked-by: Changwei Ge <ge.changwei@h3c.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a6346447
    • Andrea Arcangeli's avatar
      mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings · ac5b2c18
      Andrea Arcangeli authored
      THP allocation might be really disruptive when allocated on a NUMA
      system with the local node full or hard to reclaim.  Stefan has posted
      an allocation stall report on a 4.12-based SLES kernel which suggests
      the same issue:
      
        kvm: page allocation stalls for 194572ms, order:9, mode:0x4740ca(__GFP_HIGHMEM|__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE|__GFP_MOVABLE|__GFP_DIRECT_RECLAIM), nodemask=(null)
        kvm cpuset=/ mems_allowed=0-1
        CPU: 10 PID: 84752 Comm: kvm Tainted: G        W 4.12.0+98-ph 0000001 SLE15 (unreleased)
        Hardware name: Supermicro SYS-1029P-WTRT/X11DDW-NT, BIOS 2.0 12/05/2017
        Call Trace:
         dump_stack+0x5c/0x84
         warn_alloc+0xe0/0x180
         __alloc_pages_slowpath+0x820/0xc90
         __alloc_pages_nodemask+0x1cc/0x210
         alloc_pages_vma+0x1e5/0x280
         do_huge_pmd_wp_page+0x83f/0xf00
         __handle_mm_fault+0x93d/0x1060
         handle_mm_fault+0xc6/0x1b0
         __do_page_fault+0x230/0x430
         do_page_fault+0x2a/0x70
         page_fault+0x7b/0x80
         [...]
        Mem-Info:
        active_anon:126315487 inactive_anon:1612476 isolated_anon:5
         active_file:60183 inactive_file:245285 isolated_file:0
         unevictable:15657 dirty:286 writeback:1 unstable:0
         slab_reclaimable:75543 slab_unreclaimable:2509111
         mapped:81814 shmem:31764 pagetables:370616 bounce:0
         free:32294031 free_pcp:6233 free_cma:0
        Node 0 active_anon:254680388kB inactive_anon:1112760kB active_file:240648kB inactive_file:981168kB unevictable:13368kB isolated(anon):0kB isolated(file):0kB mapped:280240kB dirty:1144kB writeback:0kB shmem:95832kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 81225728kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
        Node 1 active_anon:250583072kB inactive_anon:5337144kB active_file:84kB inactive_file:0kB unevictable:49260kB isolated(anon):20kB isolated(file):0kB mapped:47016kB dirty:0kB writeback:4kB shmem:31224kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 31897600kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
      
      The defrag mode is "madvise" and from the above report it is clear that
      the THP has been allocated for a MADV_HUGEPAGE vma.
      
      Andrea has identified that the main source of the problem is
      __GFP_THISNODE usage:
      
      : The problem is that direct compaction combined with the NUMA
      : __GFP_THISNODE logic in mempolicy.c is telling reclaim to swap very
      : hard the local node, instead of failing the allocation if there's no
      : THP available in the local node.
      :
      : Such logic was ok until __GFP_THISNODE was added to the THP allocation
      : path even with MPOL_DEFAULT.
      :
      : The idea behind the __GFP_THISNODE addition, is that it is better to
      : provide local memory in PAGE_SIZE units than to use remote NUMA THP
      : backed memory. That largely depends on the remote latency though, on
      : threadrippers for example the overhead is relatively low in my
      : experience.
      :
      : The combination of __GFP_THISNODE and __GFP_DIRECT_RECLAIM results in
      : extremely slow qemu startup with vfio, if the VM is larger than the
      : size of one host NUMA node. This is because it will try very hard to
      : unsuccessfully swapout get_user_pages pinned pages as result of the
      : __GFP_THISNODE being set, instead of falling back to PAGE_SIZE
      : allocations and instead of trying to allocate THP on other nodes (it
      : would be even worse without vfio type1 GUP pins of course, except it'd
      : be swapping heavily instead).
      
      Fix this by removing __GFP_THISNODE for THP requests which are
      requesting the direct reclaim.  This effectively reverts 5265047a
      on the grounds that the zone/node reclaim was known to be disruptive due
      to premature reclaim when there was memory free.  While it made sense at
      the time for HPC workloads without NUMA awareness on rare machines, it
      was ultimately harmful in the majority of cases.  The existing behaviour
      is similar, if not as widespread as it applies to a corner case but
      crucially, it cannot be tuned around like zone_reclaim_mode can.  The
      default behaviour should always be to cause the least harm for the
      common case.
      
      If there are specialised use cases out there that want zone_reclaim_mode
      in specific cases, then it can be built on top.  Longterm we should
      consider a memory policy which allows for the node reclaim like behavior
      for the specific memory ranges which would allow a
      
      [1] http://lkml.kernel.org/r/20180820032204.9591-1-aarcange@redhat.com
      
      Mel said:
      
      : Both patches look correct to me but I'm responding to this one because
      : it's the fix.  The change makes sense and moves further away from the
      : severe stalling behaviour we used to see with both THP and zone reclaim
      : mode.
      :
      : I put together a basic experiment with usemem configured to reference a
      : buffer multiple times that is 80% the size of main memory on a 2-socket
      : box with symmetric node sizes and defrag set to "always".  The defrag
      : setting is not the default but it would be functionally similar to
      : accessing a buffer with madvise(MADV_HUGEPAGE).  Usemem is configured to
      : reference the buffer multiple times and while it's not an interesting
      : workload, it would be expected to complete reasonably quickly as it fits
      : within memory.  The results were;
      :
      : usemem
      :                                   vanilla           noreclaim-v1
      : Amean     Elapsd-1       42.78 (   0.00%)       26.87 (  37.18%)
      : Amean     Elapsd-3       27.55 (   0.00%)        7.44 (  73.00%)
      : Amean     Elapsd-4        5.72 (   0.00%)        5.69 (   0.45%)
      :
      : This shows the elapsed time in seconds for 1 thread, 3 threads and 4
      : threads referencing buffers 80% the size of memory.  With the patches
      : applied, it's 37.18% faster for the single thread and 73% faster with 3
      : threads.  Note that 4 threads showing little difference does not indicate
      : the problem is related to thread counts.  It's simply the case that 4
      : threads gets spread so their workload mostly fits in one node.
      :
      : The overall view from /proc/vmstat is more startling
      :
      :                          4.19.0-rc1  4.19.0-rc1
      :                             vanilla noreclaim-v1r1
      : Minor Faults               35593425      708164
      : Major Faults                 484088          36
      : Swap Ins                    3772837           0
      : Swap Outs                   3932295           0
      :
      : Massive amounts of swap in/out without the patch
      :
      : Direct pages scanned        6013214           0
      : Kswapd pages scanned              0           0
      : Kswapd pages reclaimed            0           0
      : Direct pages reclaimed      4033009           0
      :
      : Lots of reclaim activity without the patch
      :
      : Kswapd efficiency              100%        100%
      : Kswapd velocity               0.000       0.000
      : Direct efficiency               67%        100%
      : Direct velocity           11191.956       0.000
      :
      : Mostly from direct reclaim context as you'd expect without the patch.
      :
      : Page writes by reclaim  3932314.000       0.000
      : Page writes file                 19           0
      : Page writes anon            3932295           0
      : Page reclaim immediate        42336           0
      :
      : Writes from reclaim context is never good but the patch eliminates it.
      :
      : We should never have default behaviour to thrash the system for such a
      : basic workload.  If zone reclaim mode behaviour is ever desired but on a
      : single task instead of a global basis then the sensible option is to build
      : a mempolicy that enforces that behaviour.
      
      This was a severe regression compared to previous kernels that made
      important workloads unusable, and it started when __GFP_THISNODE was
      added to THP allocations under MADV_HUGEPAGE.  It is not a significant
      risk to go back to the behavior before __GFP_THISNODE was added; it
      worked like that for years.
      
      This was simply an optimization to some lucky workloads that can fit in
      a single node, but it ended up breaking the VM for others that can't
      possibly fit in a single node, so going back is safe.
      
      [mhocko@suse.com: rewrote the changelog based on the one from Andrea]
      Link: http://lkml.kernel.org/r/20180925120326.24392-2-mhocko@kernel.org
      Fixes: 5265047a ("mm, thp: really limit transparent hugepage allocation to local node")
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Reported-by: Stefan Priebe <s.priebe@profihost.ag>
      Debugged-by: Andrea Arcangeli <aarcange@redhat.com>
      Reported-by: Alex Williamson <alex.williamson@redhat.com>
      Reviewed-by: Mel Gorman <mgorman@techsingularity.net>
      Tested-by: Mel Gorman <mgorman@techsingularity.net>
      Cc: Zi Yan <zi.yan@cs.rutgers.edu>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: David Rientjes <rientjes@google.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: <stable@vger.kernel.org>	[4.1+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ac5b2c18
    • Sam Protsenko's avatar
      include/linux/notifier.h: SRCU: fix ctags · 94e297c5
      Sam Protsenko authored
      ctags indexing ("make tags" command) throws this warning:
      
          ctags: Warning: include/linux/notifier.h:125:
          null expansion of name pattern "\1"
      
      This is the result of DEFINE_PER_CPU() macro expansion.  Fix that by
      getting rid of the line break.
      
      A similar fix was already done in commit 25528213 ("tags: Fix
      DEFINE_PER_CPU expansions"), but this one probably wasn't noticed.
      
      Link: http://lkml.kernel.org/r/20181030202808.28027-1-semen.protsenko@linaro.org
      Fixes: 9c80172b ("kernel/SRCU: provide a static initializer")
      Signed-off-by: Sam Protsenko <semen.protsenko@linaro.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      94e297c5
    • Roman Gushchin's avatar
      mm: handle no memcg case in memcg_kmem_charge() properly · e68599a3
      Roman Gushchin authored
      Mike Galbraith reported a regression caused by the commit 9b6f7e16
      ("mm: rework memcg kernel stack accounting") on a system with
      "cgroup_disable=memory" boot option: the system panics with the following
      stack trace:
      
        BUG: unable to handle kernel NULL pointer dereference at 00000000000000f8
        PGD 0 P4D 0
        Oops: 0002 [#1] PREEMPT SMP PTI
        CPU: 0 PID: 1 Comm: systemd Not tainted 4.19.0-preempt+ #410
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180531_142017-buildhw-08.phx2.fed4
        RIP: 0010:page_counter_try_charge+0x22/0xc0
        Code: 41 5d c3 c3 0f 1f 40 00 0f 1f 44 00 00 48 85 ff 0f 84 a7 00 00 00 41 56 48 89 f8 49 89 fe 49
        Call Trace:
         try_charge+0xcb/0x780
         memcg_kmem_charge_memcg+0x28/0x80
         memcg_kmem_charge+0x8b/0x1d0
         copy_process.part.41+0x1ca/0x2070
         _do_fork+0xd7/0x3d0
         do_syscall_64+0x5a/0x180
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The problem occurs because get_mem_cgroup_from_current() returns a NULL
      pointer if the memory controller is disabled.  Check whether this is the
      case at the beginning of memcg_kmem_charge() and just return 0 if
      mem_cgroup_disabled() returns true.  This is how we handle this case in
      many other places in the memory controller code.
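
The guard described above can be sketched in plain C. This is a user-space model, not the kernel code: every `sketch_` name is a hypothetical stand-in, and the global flag only imitates what `mem_cgroup_disabled()` reports under "cgroup_disable=memory".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory cgroup; not the kernel's struct. */
struct sketch_memcg {
    long charged_pages;
};

/* Models booting with "cgroup_disable=memory" (assumption for this sketch). */
bool sketch_memcg_disabled;
struct sketch_memcg sketch_root;

/* Models get_mem_cgroup_from_current(): NULL when the controller is off. */
struct sketch_memcg *sketch_get_memcg_from_current(void)
{
    return sketch_memcg_disabled ? NULL : &sketch_root;
}

/* Models memcg_kmem_charge(): the guard runs before any memcg lookup. */
int sketch_kmem_charge(unsigned int order)
{
    struct sketch_memcg *memcg;

    if (sketch_memcg_disabled)
        return 0;   /* nothing to account; report success */

    memcg = sketch_get_memcg_from_current();
    memcg->charged_pages += 1L << order;   /* memcg is non-NULL here */
    return 0;
}
```

Without the early return, the disabled case would reach the charge step with a NULL pointer, which is the failure mode shown in the stack trace.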
      
      Link: http://lkml.kernel.org/r/20181029215123.17830-1-guro@fb.com
      Fixes: 9b6f7e16 ("mm: rework memcg kernel stack accounting")
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Reported-by: Mike Galbraith <efault@gmx.de>
      Acked-by: Rik van Riel <riel@surriel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e68599a3
  2. Nov 03, 2018
    • Marc Zyngier's avatar
      iov_iter: Fix 9p virtio breakage · 2cbfdf4d
      Marc Zyngier authored
      When switching to the new iovec accessors, a negation got subtly
      dropped, leading to 9p being remarkably broken (here with kvmtool):
      
      [    7.430941] VFS: Mounted root (9p filesystem) on device 0:15.
      [    7.432080] devtmpfs: mounted
      [    7.432717] Freeing unused kernel memory: 1344K
      [    7.433658] Run /virt/init as init process
        Warning: unable to translate guest address 0x7e00902ff000 to host
        Warning: unable to translate guest address 0x7e00902fefc0 to host
        Warning: unable to translate guest address 0x7e00902ff000 to host
        Warning: unable to translate guest address 0x7e008febef80 to host
        Warning: unable to translate guest address 0x7e008febf000 to host
        Warning: unable to translate guest address 0x7e008febef00 to host
        Warning: unable to translate guest address 0x7e008febf000 to host
      [    7.436376] Kernel panic - not syncing: Requested init /virt/init failed (error -8).
      [    7.437554] CPU: 29 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc8-02267-g00e23707442a #291
      [    7.439006] Hardware name: linux,dummy-virt (DT)
      [    7.439902] Call trace:
      [    7.440387]  dump_backtrace+0x0/0x148
      [    7.441104]  show_stack+0x14/0x20
      [    7.441768]  dump_stack+0x90/0xb4
      [    7.442425]  panic+0x120/0x27c
      [    7.443036]  kernel_init+0xa4/0x100
      [    7.443725]  ret_from_fork+0x10/0x18
      [    7.444444] SMP: stopping secondary CPUs
      [    7.445391] Kernel Offset: disabled
      [    7.446169] CPU features: 0x0,23000438
      [    7.446974] Memory Limit: none
      [    7.447645] ---[ end Kernel panic - not syncing: Requested init /virt/init failed (error -8). ]---
      
      Restoring the missing "!" brings the guest back to life.
      
      Fixes: 00e23707 ("iov_iter: Use accessor function")
      Reported-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      2cbfdf4d
    • Steve French's avatar
      cifs: fix signed/unsigned mismatch on aio_read patch · b98e26df
      Steve French authored
      
      
      The patch "CIFS: Add support for direct I/O read" had
      a signed/unsigned mismatch (ssize_t vs. size_t) in the
      return from one function.  Make a similar trivial change
      in aio_write.
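
Why the ssize_t vs. size_t distinction matters can be shown with a small user-space sketch. The function names are hypothetical and do not correspond to the CIFS functions touched by the patch; the point is only that a negative errno stored in an unsigned type stops looking like an error.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

/* Hypothetical helper: returns bytes transferred, or a negative errno. */
ssize_t sketch_transfer(size_t len, int fail)
{
    if (fail)
        return -EIO;
    return (ssize_t)len;
}

/* Signed result: the usual "rc < 0" error check works. */
int sketch_detects_error_signed(size_t len, int fail)
{
    ssize_t rc = sketch_transfer(len, fail);

    return rc < 0;
}

/* Unsigned result: -EIO wraps around to a huge positive "byte count". */
size_t sketch_as_unsigned(size_t len, int fail)
{
    return (size_t)sketch_transfer(len, fail);
}
```

With the unsigned variant a caller sees a value far larger than the requested length instead of an error code, so the failure can go unnoticed.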
      
      Signed-off-by: Long Li <longli@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Reported-by: Julia Lawall <julia.lawall@lip6.fr>
      b98e26df
    • Colin Ian King's avatar
      cifs: don't dereference smb_file_target before null check · 8c6c9bed
      Colin Ian King authored
      There is a null check on dst_file->private_data which suggests
      it can potentially be null. However, before this check, pointer
      smb_file_target is derived from dst_file->private_data and dereferenced
      in the call to tlink_tcon, hence there is a potential null pointer
      dereference.
      
      Fix this by assigning smb_file_target and target_tcon after the
      null pointer sanity checks.
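
The ordering fix can be illustrated with a simplified model. The structs and the function below are hypothetical stand-ins (the real code works on `struct file` and the CIFS per-open data), and the `-EBADF` return value is an assumption made for the sketch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct file and the CIFS per-open data. */
struct sketch_cifs_file {
    int tcon_id;
};

struct sketch_file {
    struct sketch_cifs_file *private_data;
};

/* Returns the target tcon id, or -EBADF if either file lacks private data. */
int sketch_target_tcon(const struct sketch_file *src,
                       const struct sketch_file *dst)
{
    const struct sketch_cifs_file *target;

    /* Sanity checks first ... */
    if (!src->private_data || !dst->private_data)
        return -EBADF;

    /* ... and only then derive and dereference the target pointer. */
    target = dst->private_data;
    return target->tcon_id;
}
```

Assigning `target` before the check would compile just as well, which is why this class of bug is typically found by static analysis rather than by testing.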
      
      Detected by CoverityScan, CID#1475302 ("Dereference before null check")
      
      Fixes: 04b38d60 ("vfs: pull btrfs clone API to vfs layer")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      8c6c9bed
    • Long Li's avatar
      CIFS: Add direct I/O functions to file_operations · be4eb688
      Long Li authored
      
      
      With direct read/write functions implemented, add them to file_operations.
      
      Direct I/O is used under two conditions:
      1. When mounting with "cache=none", CIFS uses direct I/O for all user file
      data transfer.
      2. When opening a file with O_DIRECT, CIFS uses direct I/O for all data
      transfer on this file.
      
      Signed-off-by: Long Li <longli@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      be4eb688
    • Long Li's avatar
      CIFS: Add support for direct I/O write · 8c5f9c1a
      Long Li authored
      
      
      With direct I/O write, user-supplied buffers are pinned in memory and data
      is transferred directly from user buffers to the transport layer.
      
      Change in v3: add support for kernel AIO
      
      Change in v4:
      Refactor common write code to __cifs_writev for direct and non-direct I/O.
      Retry on direct I/O failure.
      
      Signed-off-by: Long Li <longli@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      8c5f9c1a
    • Long Li's avatar
      CIFS: Add support for direct I/O read · 6e6e2b86
      Long Li authored
      
      
      With direct I/O read, we transfer the data directly from transport layer to
      the user data buffer.
      
      Change in v3: add support for kernel AIO
      
      Change in v4:
      Refactor common read code to __cifs_readv for direct and non-direct I/O.
      Retry on direct I/O failure.
      
      Signed-off-by: Long Li <longli@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      6e6e2b86
    • Steve French's avatar
      smb3: missing defines and structs for reparse point handling · 0df444a0
      Steve French authored
      
      
      We were missing some structs from MS-FSCC relating to
      reparse point handling.  Add them to the protocol defines
      in smb2pdu.h.
      
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Reviewed-by: Aurelien Aptel <aaptel@suse.com>
      0df444a0
    • Steve French's avatar
      smb3: allow more detailed protocol info on open files for debugging · dfe33f9a
      Steve French authored
      
      
      In order to debug complex problems it is often helpful to
      have detailed information on the client and server view
      of the open file information.  Add the ability for root to
      view the list of smb3 open files and dump the persistent
      handle and other info so that it can be more easily
      correlated with server logs.
      
      Sample output from "cat /proc/fs/cifs/open_files"
      
       # Version:1
       # Format:
       # <tree id> <persistent fid> <flags> <count> <pid> <uid> <filename> <mid>
       0x5 0x800000378 0x8000 1 7704 0 some-file 0x14
       0xcb903c0c 0x84412e67 0x8000 1 7754 1001 rofile 0x1a6d
       0xcb903c0c 0x9526b767 0x8000 1 7720 1000 file 0x1a5b
       0xcb903c0c 0x9ce41a21 0x8000 1 7715 0 smallfile 0xd67
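
For correlating entries with server logs, one line of the sample output above can be parsed along these lines. This is a sketch, not a tool shipped with the kernel: the struct and function names are hypothetical, the field order is taken from the "# Format:" header, and it assumes filenames contain no whitespace.

```c
#include <stdio.h>

/* Field names follow the "# Format:" header; the struct is hypothetical. */
struct open_file_entry {
    unsigned int tree_id;
    unsigned long long persistent_fid;
    unsigned int flags;
    int count;
    int pid;
    int uid;
    char filename[256];
    unsigned long long mid;
};

/* Returns 0 on success, -1 if the line does not carry all eight fields. */
int parse_open_file_line(const char *line, struct open_file_entry *e)
{
    /* %x and %llx accept the "0x" prefix used in the output. */
    int n = sscanf(line, "%x %llx %x %d %d %d %255s %llx",
                   &e->tree_id, &e->persistent_fid, &e->flags,
                   &e->count, &e->pid, &e->uid, e->filename, &e->mid);

    return n == 8 ? 0 : -1;
}
```

Header lines beginning with "#" fail the first conversion and are rejected with -1, so a caller can feed every line of the file through the same function.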
      
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      dfe33f9a
    • Steve French's avatar
      smb3: on kerberos mount if server doesn't specify auth type use krb5 · 926674de
      Steve French authored
      
      
      Some servers (e.g. Azure) do not include a spnego blob in the SMB3
      negotiate protocol response, so on kerberos mounts ("sec=krb5")
      we can fail, as we expected the server to list its supported
      auth types (OIDs in the spnego blob in the negprot response).
      Change this so that on krb5 mounts we default to trying krb5 if the
      server doesn't list its supported protocol mechanisms.
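
The decision described above amounts to a small fallback rule. The sketch below models it with hypothetical types; it is not the CIFS negotiate code, and the two fields are assumptions about what a client would need to know from the negprot response.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the server's negotiate protocol response. */
struct sketch_negprot {
    size_t blob_len;    /* 0 when the server sent no SPNEGO blob */
    bool offers_krb5;   /* meaningful only when blob_len > 0 */
};

/* Returns true if a "sec=krb5" mount should go ahead and try Kerberos. */
bool sketch_try_krb5(const struct sketch_negprot *rsp)
{
    if (rsp->blob_len == 0)
        return true;            /* server listed nothing: default to krb5 */

    return rsp->offers_krb5;    /* server listed mechanisms: honor the list */
}
```

Before the change, the empty-blob case would have been treated like a server that does not offer Kerberos, which is what made such mounts fail.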
      
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      CC: Stable <stable@vger.kernel.org>
      926674de
    • Steve French's avatar
      smb3: add trace point for tree connection · f8af49dd
      Steve French authored
      
      
      In debugging certain scenarios, especially reconnect cases,
      it can be helpful to have a dynamic trace point for the
      result of tree connect.  See the sample output below
      from a reconnect event.  The new event is 'smb3_tcon'.
      
                  TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
                     | |       |   ||||       |         |
                 cifsd-6071  [001] ....  2659.897923: smb3_reconnect: server=localhost current_mid=0xa
           kworker/1:1-71    [001] ....  2666.026342: smb3_cmd_done: 	sid=0x0 tid=0x0 cmd=0 mid=0
           kworker/1:1-71    [001] ....  2666.026576: smb3_cmd_err: 	sid=0xc49e1787 tid=0x0 cmd=1 mid=1 status=0xc0000016 rc=-5
           kworker/1:1-71    [001] ....  2666.031677: smb3_cmd_done: 	sid=0xc49e1787 tid=0x0 cmd=1 mid=2
           kworker/1:1-71    [001] ....  2666.031921: smb3_cmd_done: 	sid=0xc49e1787 tid=0x6e78f05f cmd=3 mid=3
           kworker/1:1-71    [001] ....  2666.031923: smb3_tcon: xid=0 sid=0xc49e1787 tid=0x0 unc_name=\\localhost\test rc=0
           kworker/1:1-71    [001] ....  2666.032097: smb3_cmd_done: 	sid=0xc49e1787 tid=0x6e78f05f cmd=11 mid=4
           kworker/1:1-71    [001] ....  2666.032265: smb3_cmd_done: 	sid=0xc49e1787 tid=0x7912332f cmd=3 mid=5
           kworker/1:1-71    [001] ....  2666.032266: smb3_tcon: xid=0 sid=0xc49e1787 tid=0x0 unc_name=\\localhost\IPC$ rc=0
           kworker/1:1-71    [001] ....  2666.032386: smb3_cmd_done: 	sid=0xc49e1787 tid=0x7912332f cmd=11 mid=6
      
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      f8af49dd
    • Colin Ian King's avatar
      cifs: fix spelling mistake, EACCESS -> EACCES · 413d6100
      Colin Ian King authored
      
      
      Trivial fix to a misspelling of the error name EACCESS;
      rename it to EACCES.
      
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      413d6100
    • Ronnie Sahlberg's avatar
      cifs: fix return value for cifs_listxattr · 0c5d6cb6
      Ronnie Sahlberg authored
      
      
      If the application buffer was too small to fit all the names,
      we would still count the number of bytes and return this for
      listxattr.  This would then trigger a BUG in usercopy.c.
      
      Fix the computation of the size so that we return -ERANGE
      correctly when the buffer is too small.
      
      This fixes the kernel BUG for xfstest generic/377.
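
The return-value contract that listxattr callers rely on can be sketched as follows. This is a user-space model with a hypothetical function name, not the cifs_listxattr implementation: a zero-sized buffer is a size query, a buffer that is too small yields -ERANGE, and otherwise the names are copied back to back with their NUL terminators.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

/*
 * Copy NUL-terminated attribute names into buf, one after another.
 * Returns the number of bytes needed or used, or -ERANGE if buf is
 * non-empty but too small to hold every name.
 */
ssize_t sketch_listxattr(const char *const *names, size_t n_names,
                         char *buf, size_t buf_size)
{
    size_t needed = 0;
    size_t i;

    for (i = 0; i < n_names; i++)
        needed += strlen(names[i]) + 1;

    if (buf_size == 0)
        return (ssize_t)needed;   /* size query: nothing is copied */
    if (needed > buf_size)
        return -ERANGE;           /* never report more than was copied */

    for (i = 0; i < n_names; i++) {
        size_t len = strlen(names[i]) + 1;

        memcpy(buf, names[i], len);
        buf += len;
    }
    return (ssize_t)needed;
}
```

Returning the full byte count for a too-small buffer, as the old code did, tells the caller that more data was written than the buffer can hold.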
      
      Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Reviewed-by: Aurelien Aptel <aaptel@suse.com>
      0c5d6cb6
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20181102' of git://git.kernel.dk/linux-block · 5f215853
      Linus Torvalds authored
      Pull block layer fixes from Jens Axboe:
       "The biggest part of this pull request is the revert of the blkcg
        cleanup series. It had one fix earlier for a stacked device issue, but
        another one was reported. Rather than play whack-a-mole with this,
        revert the entire series and try again for the next kernel release.
      
        Apart from that, only small fixes/changes.
      
        Summary:
      
         - Indentation fixup for mtip32xx (Colin Ian King)
      
         - The blkcg cleanup series revert (Dennis Zhou)
      
         - Two NVMe fixes. One fixing a regression in the nvme request
           initialization in this merge window, causing nvme-fc to not work.
           The other is a suspend/resume p2p resource issue (James, Keith)
      
         - Fix sg discard merge, allowing us to merge in cases where we didn't
           before (Jianchao Wang)
      
         - Call rq_qos_exit() after the queue is frozen, preventing a hang
           (Ming)
      
         - Fix brd queue setup, fixing an oops if we fail setting up all
           devices (Ming)"
      
      * tag 'for-linus-20181102' of git://git.kernel.dk/linux-block:
        nvme-pci: fix conflicting p2p resource adds
        nvme-fc: fix request private initialization
        blkcg: revert blkcg cleanups series
        block: brd: associate with queue until adding disk
        block: call rq_qos_exit() after queue is frozen
        mtip32xx: clean an indentation issue, remove extraneous tabs
        block: fix the DISCARD request merge
      5f215853
    • Linus Torvalds's avatar
      Merge tag 'pwm/for-4.20-rc1' of... · fcc37f76
      Linus Torvalds authored
      Merge tag 'pwm/for-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
      
      Pull pwm updates from Thierry Reding:
       "This series contains a number of improvements to existing drivers,
        such as LPSS. Some drivers, such as renesas-tpu and rcar get support
        for more SoC generations. To round things off this fixes an issue with
        the sysfs interface"
      
      * tag 'pwm/for-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
        pwm: lpss: Only set update bit if we are actually changing the settings
        pwm: lpss: Force runtime-resume on suspend on Cherry Trail
        pwm: Enable TI ECAP driver for ARCH_K3
        dt-bindings: pwm: tiecap: Add TI AM654 SoC specific compatible
        dt-bindings: pwm: rcar: Add r8a774a1 support
        pwm: Send a uevent on the pwmchip device upon channel sysfs (un)export
        Revert "pwm: Set class for exported channels in sysfs"
        dt-bindings: pwm: renesas-tpu: Document r8a7744 support
        dt-bindings: pwm: rcar: Add r8a7744 support
        dt-bindings: pwm: renesas: tpu: Document R8A779{7|8}0 bindings
        dt-bindings: pwm: renesas: pwm-rcar: Document R8A779{7|8}0 bindings
        dt-bindings: pwm: renesas: tpu: Fix "compatible" prop description
        pwm: Use SPDX identifier for Renesas drivers
        pwm: lpss: Add get_state callback
        pwm: lpss: Release runtime-pm reference from the driver's remove callback
        pwm: lpss: Check PWM powerstate after resume on Cherry Trail devices
        pwm: lpss: Move struct pwm_lpss_chip definition to the header file
        pwm: lpss: Add ACPI HID for second PWM controller on Cherry Trail devices
        ACPI / PM: Export acpi_device_get_power() for use by modular build drivers
        pwm: tegra: Remove gratuituous blank line
      fcc37f76
    • Linus Torvalds's avatar
      Merge tag 'edac_for_4.20_2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp · 0b21f21a
      Linus Torvalds authored
      Pull more EDAC updates from Borislav Petkov:
       "The second part of the EDAC pile which contains the ADXL user and a
        build fix which addresses a not-so-sensical .config but fixes
        randconfig builds people do:
      
         - skx_edac: Address translation for NVDIMMs (Tony Luck and Qiuxu Zhuo)
      
         - ACPI_ADXL build fix"
      
      [ I don't think "sensical" is a word, particularly when used in the
        context of actually meaning "nonsensical", but I like it   - Linus ]
      
      * tag 'edac_for_4.20_2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
        EDAC, skx: Fix randconfig builds
        EDAC, skx_edac: Add address translation for non-volatile DIMMs
      0b21f21a
    • Linus Torvalds's avatar
      Merge tag 'sound-fix-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 54480aa7
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A few device-specific fixes: a fix for SPDIF on old Creative PCI
        board, and two additional fixes for the recent changes in FireWire
        audio stack"
      
      * tag 'sound-fix-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: firewire-lib: fix insufficient PCM rule for period/buffer size
        ALSA: ca0106: Disable IZD on SB0570 DAC to fix audio pops
        ALSA: dice: fix to wait for releases of all ALSA character devices
      54480aa7
    • Linus Torvalds's avatar
      Merge tag 'drm-next-2018-11-02' of git://anongit.freedesktop.org/drm/drm · bc6080ae
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Pretty much a normal fixes pull pre-rc1, mostly amdgpu fixes, one i915
        link training regression fix, and a couple of minor panel/bridge fixes
        and a panel quirk"
      
      * tag 'drm-next-2018-11-02' of git://anongit.freedesktop.org/drm/drm: (37 commits)
        drm/amdgpu: revert "enable gfxoff in non-sriov and stutter mode by default"
        drm/amd/pp: Print warning if od_sclk/mclk out of range
        drm/amd/pp: Fix pp_sclk/mclk_od not work on Vega10
        drm/amd/pp: Fix pp_sclk/mclk_od not work on smu7
        drm/amd/powerplay: no MGPU fan boost enablement on DPM disabled
        drm/amdgpu: Fix skipping hangged job reset during gpu recover.
        drm/amd/powerplay: revise Vega20 pptable version check
        drm/amd/display: set backlight level limit to 1
        drm/panel: simple: Innolux TV123WAM is actually P120ZDG-BF1
        dt-bindings: drm/panel: simple: Innolux TV123WAM is actually P120ZDG-BF1
        drm/bridge: ti-sn65dsi86: Remove the mystery delay
        drm/panel: simple: Add "no-hpd" delay for Innolux TV123WAM
        drm/panel: simple: Support panels with HPD where HPD isn't connected
        dt-bindings: drm/panel: simple: Add no-hpd property
        drm/edid: Add 6 bpc quirk for BOE panel.
        drm/amdgpu: fix reporting of failed msg sent to SMU (v2)
        drm/amdgpu: Fix compute ring 1.0.0 failure after reset
        drm/amdgpu: fix VM leaf walking
        drm/amdgpu: fix amdgpu_vm_fini
        drm/amd/powerplay: commonize the API for retrieving current clocks
        ...
      bc6080ae
    • Linus Torvalds's avatar
      Merge tag 'apparmor-pr-2018-11-01' of... · d81f50bd
      Linus Torvalds authored
      Merge tag 'apparmor-pr-2018-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
      
      Pull apparmor updates from John Johansen:
       "Features/Improvements:
         - replace spin_is_locked() with lockdep
         - add base support for secmark labeling and matching
      
        Cleanups:
         - clean an indentation issue, remove extraneous space
         - remove no-op permission check in policy_unpack
         - fix checkpatch missing spaces error in Parse secmark policy
         - fix network performance issue in aa_label_sk_perm
      
        Bug fixes:
         - add #ifdef checks for secmark filtering
         - fix an error code in __aa_create_ns()
         - don't try to replace stale label in ptrace checks
         - fix failure to audit context info in build_change_hat
         - check buffer bounds when mapping permissions mask
         - fully initialize aa_perms struct when answering userspace query
         - fix uninitialized value in aa_split_fqname"
      
      * tag 'apparmor-pr-2018-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
        apparmor: clean an indentation issue, remove extraneous space
        apparmor: fix checkpatch error in Parse secmark policy
        apparmor: add #ifdef checks for secmark filtering
        apparmor: Fix uninitialized value in aa_split_fqname
        apparmor: don't try to replace stale label in ptraceme check
        apparmor: Replace spin_is_locked() with lockdep
        apparmor: Allow filtering based on secmark policy
        apparmor: Parse secmark policy
        apparmor: Add a wildcard secid
        apparmor: don't try to replace stale label in ptrace access check
        apparmor: Fix network performance issue in aa_label_sk_perm
      d81f50bd
    • Linus Torvalds's avatar
      Merge tag 'xfs-4.20-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · c2aa1a44
      Linus Torvalds authored
      Pull vfs dedup fixes from Dave Chinner:
       "This reworks the vfs data cloning infrastructure.
      
        We discovered many issues with these interfaces late in the 4.19 cycle
        - the worst of them (data corruption, setuid stripping) were fixed for
        XFS in 4.19-rc8, but a larger rework of the infrastructure fixing all
        the problems was needed. That rework is the contents of this pull
        request.
      
        Rework the vfs_clone_file_range and vfs_dedupe_file_range
        infrastructure to use a common .remap_file_range method and supply
        generic bounds and sanity checking functions that are shared with the
        data write path. The current VFS infrastructure has problems with
        rlimit, LFS file sizes, file time stamps, maximum filesystem file
        sizes, stripping setuid bits, etc and so they are addressed in these
        commits.
      
        We also introduce the ability for the ->remap_file_range methods to
        return short clones so that clones for vfs_copy_file_range() don't get
        rejected if the entire range can't be cloned. It also allows
        filesystems to silently skip deduplication of partial EOF blocks if
        they are not capable of doing so without requiring errors to be thrown
        to userspace.
      
        Existing filesystems are converted to use the new remap_file_range
        method, and both XFS and ocfs2 are modified to make use of the new
        generic checking infrastructure"
      
      * tag 'xfs-4.20-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (28 commits)
        xfs: remove [cm]time update from reflink calls
        xfs: remove xfs_reflink_remap_range
        xfs: remove redundant remap partial EOF block checks
        xfs: support returning partial reflink results
        xfs: clean up xfs_reflink_remap_blocks call site
        xfs: fix pagecache truncation prior to reflink
        ocfs2: remove ocfs2_reflink_remap_range
        ocfs2: support partial clone range and dedupe range
        ocfs2: fix pagecache truncation prior to reflink
        ocfs2: truncate page cache for clone destination file before remapping
        vfs: clean up generic_remap_file_range_prep return value
        vfs: hide file range comparison function
        vfs: enable remap callers that can handle short operations
        vfs: plumb remap flags through the vfs dedupe functions
        vfs: plumb remap flags through the vfs clone functions
        vfs: make remap_file_range functions take and return bytes completed
        vfs: remap helper should update destination inode metadata
        vfs: pass remap flags to generic_remap_checks
        vfs: pass remap flags to generic_remap_file_range_prep
        vfs: combine the clone and dedupe into a single remap_file_range
        ...
      c2aa1a44
    • Linus Torvalds's avatar
      Merge tag 'powerpc-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · b69f9e17
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "Some things that I missed due to travel, or that came in late.
      
        Two fixes also going to stable:
      
         - A revert of a buggy change to the 8xx TLB miss handlers.
      
         - Our flushing of SPE (Signal Processing Engine) registers on fork
           was broken.
      
        Other changes:
      
         - A change to the KVM decrementer emulation to use proper APIs.
      
         - Some cleanups to the way we do code patching in the 8xx code.
      
         - Expose the maximum possible memory for the system in
           /proc/powerpc/lparcfg.
      
         - Merge some updates from Scott: "a couple device tree updates, and a
           fix for a missing prototype warning"
      
        A few other minor fixes and a handful of fixes for our selftests.
      
        Thanks to: Aravinda Prasad, Breno Leitao, Camelia Groza, Christophe
        Leroy, Felipe Rechia, Joel Stanley, Naveen N. Rao, Paul Mackerras,
        Scott Wood, Tyrel Datwyler"
      
      * tag 'powerpc-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (21 commits)
        selftests/powerpc: Fix compilation issue due to asm label
        selftests/powerpc/cache_shape: Fix out-of-tree build
        selftests/powerpc/switch_endian: Fix out-of-tree build
        selftests/powerpc/pmu: Link ebb tests with -no-pie
        selftests/powerpc/signal: Fix out-of-tree build
        selftests/powerpc/ptrace: Fix out-of-tree build
        powerpc/xmon: Relax frame size for clang
        selftests: powerpc: Fix warning for security subdir
        selftests/powerpc: Relax L1d miss targets for rfi_flush test
        powerpc/process: Fix flush_all_to_thread for SPE
        powerpc/pseries: add missing cpumask.h include file
        selftests/powerpc: Fix ptrace tm failure
        KVM: PPC: Use exported tb_to_ns() function in decrementer emulation
        powerpc/pseries: Export maximum memory value
        powerpc/8xx: Use patch_site for perf counters setup
        powerpc/8xx: Use patch_site for memory setup patching
        powerpc/code-patching: Add a helper to get the address of a patch_site
        Revert "powerpc/8xx: Use L1 entry APG to handle _PAGE_ACCESSED for CONFIG_SWAP"
        powerpc/8xx: add missing header in 8xx_mmu.c
        powerpc/8xx: Add DT node for using the SEC engine of the MPC885
        ...
      b69f9e17
    • Linus Torvalds's avatar
      Merge tag 'riscv-for-linus-4.20-mw3' of... · 63c6e188
      Linus Torvalds authored
      Merge tag 'riscv-for-linus-4.20-mw3' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
      
      Pull RISC-V defconfig update from Palmer Dabbelt:
       "Sorry for the last minute patches, but it was suggested we try to push
        this in before rc1 to make it easier for people to keep their branch
        rebases sane"
      
      * tag 'riscv-for-linus-4.20-mw3' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
        RISC-V: refresh defconfig
      63c6e188