  1. Nov 16, 2017
    • include/linux/slab.h: add kmalloc_array_node() and kcalloc_node() · 5799b255
      Johannes Thumshirn authored
      Patch series "Add kmalloc_array_node() and kcalloc_node()".
      
      Our current memory allocation routines suffer from an API imbalance:
      on the one hand we have kmalloc_array() and kcalloc(), which check for
      overflows in the size multiplication, and on the other hand we have
      kmalloc_node() and kzalloc_node(), which allow for memory allocation on
      a certain NUMA node but don't check for possible overflows.
      
      This patch (of 6):
      
      We have kmalloc_array() and kcalloc() wrappers on top of kmalloc() which
      ensure overflow-free multiplication for the size of a memory
      allocation, but these implementations are not NUMA-aware.
      
      Likewise we have kmalloc_node(), which is a NUMA-aware version of
      kmalloc(), but the implementation is not aware of any possible overflows
      in size calculations.
      
      Introduce a combination of the two above cases to have a NUMA-node aware
      version of kmalloc_array() and kcalloc().
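
As a rough userspace illustration of the overflow check that such wrappers combine with the allocation itself, the sketch below refuses any n * size product that would wrap around. The name alloc_array_checked() and the use of malloc() are assumptions made for this example; the in-kernel helpers additionally take gfp flags and a NUMA node, which have no direct userspace equivalent here.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace analogue of an overflow-checked array allocation.
 * Returns NULL instead of allocating when n * size would wrap around.
 */
static void *alloc_array_checked(size_t n, size_t size)
{
	/* For size != 0, n * size overflows exactly when n > SIZE_MAX / size. */
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;

	/* The multiplication is now known to be safe. */
	return malloc(n * size);
}
```

A caller treats a NULL return the same way whether the request overflowed or the allocator ran out of memory, which is what lets such a helper replace an open-coded multiplication.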
      
      Link: http://lkml.kernel.org/r/20170927082038.3782-2-jthumshirn@suse.de
      
      
      Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Damien Le Moal <damien.lemoal@wdc.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mike Marciniszyn <infinipath@intel.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Cc: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • slub: fix sysfs duplicate filename creation when slub_debug=O · 11066386
      Miles Chen authored
      When slub_debug=O is set, it is possible to clear debug flags for an
      "unmergeable" slab cache in kmem_cache_open().  This makes the
      "unmergeable" cache become "mergeable" in sysfs_slab_add().
      
      These caches will generate their "unique IDs" by create_unique_id(), but
      it is possible to create identical unique IDs.  In my experiment,
      sgpool-128, names_cache, biovec-256 generate the same ID ":Ft-0004096" and
      the kernel reports "sysfs: cannot create duplicate filename
      '/kernel/slab/:Ft-0004096'".
      
      To repeat my experiment, set disable_higher_order_debug=1,
      CONFIG_SLUB_DEBUG_ON=y in kernel-4.14.
      
      Fix this issue by setting unmergeable=1 if slub_debug=O and the
      default slub_debug contains any no-merge flags.
      
      call path:
      kmem_cache_create()
        __kmem_cache_alias()	-> we set SLAB_NEVER_MERGE flags here
        create_cache()
          __kmem_cache_create()
            kmem_cache_open()	-> clear DEBUG_METADATA_FLAGS
            sysfs_slab_add()	-> the slab cache is mergeable now
      
        sysfs: cannot create duplicate filename '/kernel/slab/:Ft-0004096'
        ------------[ cut here ]------------
        WARNING: CPU: 0 PID: 1 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x60/0x7c
        Modules linked in:
        CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W       4.14.0-rc7ajb-00131-gd4c2e9f-dirty #123
        Hardware name: linux,dummy-virt (DT)
        task: ffffffc07d4e0080 task.stack: ffffff8008008000
        PC is at sysfs_warn_dup+0x60/0x7c
        LR is at sysfs_warn_dup+0x60/0x7c
        pc :  lr :  pstate: 60000145
        Call trace:
         sysfs_warn_dup+0x60/0x7c
         sysfs_create_dir_ns+0x98/0xa0
         kobject_add_internal+0xa0/0x294
         kobject_init_and_add+0x90/0xb4
         sysfs_slab_add+0x90/0x200
         __kmem_cache_create+0x26c/0x438
         kmem_cache_create+0x164/0x1f4
         sg_pool_init+0x60/0x100
         do_one_initcall+0x38/0x12c
         kernel_init_freeable+0x138/0x1d4
         kernel_init+0x10/0xfc
         ret_from_fork+0x10/0x18
      
      Link: http://lkml.kernel.org/r/1510365805-5155-1-git-send-email-miles.chen@mediatek.com
      
      
      Signed-off-by: Miles Chen <miles.chen@mediatek.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • slab, slub, slob: convert slab_flags_t to 32-bit · 4fd0b46e
      Alexey Dobriyan authored
      struct kmem_cache::flags is "unsigned long" which is unnecessary on
      64-bit as no flags are defined in the higher bits.
      
      Switch the field to 32-bit and save some space on x86_64 until such
      flags appear:
      
      	add/remove: 0/0 grow/shrink: 0/107 up/down: 0/-657 (-657)
      	function                                     old     new   delta
      	sysfs_slab_add                               720     719      -1
      				...
      	check_object                                 699     676     -23
      
      [akpm@linux-foundation.org: fix printk warning]
      Link: http://lkml.kernel.org/r/20171021100635.GA8287@avx2
      
      
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Acked-by: Pekka Enberg <penberg@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • slab, slub, slob: add slab_flags_t · d50112ed
      Alexey Dobriyan authored
      Add sparse-checked slab_flags_t for struct kmem_cache::flags (SLAB_POISON,
      etc).
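
A minimal sketch of what a sparse-checked flags type looks like follows. The DEMO_* names and flag values are made up for illustration; the kernel spells the two annotations __bitwise and __force, and they only take effect when the code is run through sparse (which defines __CHECKER__). A normal compiler sees a plain integer type.

```c
/*
 * Sketch of a sparse-checked flags type. Under sparse, mixing a
 * demo_flags_t with a plain integer without a forced cast is reported.
 */
#ifdef __CHECKER__
#define DEMO_BITWISE __attribute__((bitwise))
#define DEMO_FORCE   __attribute__((force))
#else
#define DEMO_BITWISE
#define DEMO_FORCE
#endif

typedef unsigned long DEMO_BITWISE demo_flags_t;

/* Hypothetical flag bits; each constant is force-cast into the type. */
#define DEMO_RED_ZONE ((demo_flags_t DEMO_FORCE)0x0400UL)
#define DEMO_POISON   ((demo_flags_t DEMO_FORCE)0x0800UL)

/* Return non-zero when 'bit' is set in 'flags'. */
static int demo_flag_set(demo_flags_t flags, demo_flags_t bit)
{
	return (flags & bit) != 0;
}
```

Because the constants already carry the annotated type, ordinary code such as demo_flag_set(flags, DEMO_POISON) needs no casts, while passing a bare integer would be flagged by the checker.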
      
      SLAB is bloated temporarily by switching to "unsigned long", but only
      temporarily.
      
      Link: http://lkml.kernel.org/r/20171021100225.GA22428@avx2
      
      
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Acked-by: Pekka Enberg <penberg@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/slab.c: only set __GFP_RECLAIMABLE once · a3ba0744
      David Rientjes authored
      SLAB_RECLAIM_ACCOUNT is a permanent attribute of a slab cache.  Set
      __GFP_RECLAIMABLE as part of its ->allocflags rather than check the
      cachep flag on every page allocation.
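
The idea of deriving the allocation flag once from a permanent cache attribute can be sketched as below. This is a simplified userspace model, not the mm/slab.c code: the struct, function names and flag values are all hypothetical.

```c
/* Hypothetical flag values, for illustration only. */
#define DEMO_CACHE_RECLAIM_ACCOUNT 0x1u  /* permanent cache attribute */
#define DEMO_GFP_RECLAIMABLE       0x10u /* page allocation flag */

struct demo_cache {
	unsigned int flags;      /* fixed at creation time */
	unsigned int allocflags; /* OR-ed into every page allocation */
};

/* Derive the allocation flags once, when the cache is set up. */
static void demo_cache_init(struct demo_cache *c, unsigned int flags)
{
	c->flags = flags;
	c->allocflags = 0;
	if (flags & DEMO_CACHE_RECLAIM_ACCOUNT)
		c->allocflags |= DEMO_GFP_RECLAIMABLE;
}

/* The per-allocation path no longer tests the cache flag. */
static unsigned int demo_page_alloc_flags(const struct demo_cache *c,
					  unsigned int gfp)
{
	return gfp | c->allocflags;
}
```

The branch moves out of the hot path because the attribute cannot change after the cache is created.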
      
      Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1710171527560.140898@chino.kir.corp.google.com
      
      
      Signed-off-by: David Rientjes <rientjes@google.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/slob.c: remove an unnecessary check for __GFP_ZERO · 9f88faee
      Miles Chen authored
      Current flow guarantees a valid pointer when handling the __GFP_ZERO
      case.  So remove the unnecessary NULL pointer check.
      
      Link: http://lkml.kernel.org/r/1507203141-11959-1-git-send-email-miles.chen@mediatek.com
      
      
      Signed-off-by: Miles Chen <miles.chen@mediatek.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory · 852d8be0
      Yang Shi authored
      The kernel may panic when an OOM happens with no killable process;
      sometimes this is caused by huge unreclaimable slabs used by the kernel.
      
      Although kdump could help debug such a problem, kdump is not
      available on all architectures and it may sometimes malfunction.
      And since the kernel is already panicking, it is worth capturing such
      information in dmesg to aid troubleshooting.
      
      Print out unreclaimable slab info (used size and total size) for caches
      whose actual memory usage is not zero (num_objs * size != 0) when the
      amount of unreclaimable slabs is greater than total user memory (LRU
      pages).
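
The rule described above can be modelled in userspace roughly as follows. The struct layout, function names and column widths are assumptions for this sketch, as is computing "used" from the active object count and "total" from the overall object count; it is not the code added to the OOM path.

```c
#include <stdio.h>

/* Hypothetical per-cache summary used only by this sketch. */
struct demo_slab {
	const char *name;
	unsigned long num_objs;    /* total objects */
	unsigned long active_objs; /* objects in use */
	unsigned long obj_size;    /* bytes per object */
	int reclaimable;           /* non-zero for reclaimable caches */
};

/* Dump only when unreclaimable slab pages exceed user (LRU) pages. */
static int demo_should_dump(unsigned long unreclaimable_pages,
			    unsigned long lru_pages)
{
	return unreclaimable_pages > lru_pages;
}

/* Print unreclaimable caches with non-zero usage; return rows printed. */
static int demo_dump_unreclaimable(const struct demo_slab *s, int n, FILE *out)
{
	int i, rows = 0;

	fprintf(out, "Unreclaimable slab info:\n");
	fprintf(out, "%-17s %12s %12s\n", "Name", "Used", "Total");
	for (i = 0; i < n; i++) {
		/* Skip reclaimable caches and caches using no memory. */
		if (s[i].reclaimable || s[i].num_objs * s[i].obj_size == 0)
			continue;
		fprintf(out, "%-17s %10luKB %10luKB\n", s[i].name,
			s[i].active_objs * s[i].obj_size / 1024,
			s[i].num_objs * s[i].obj_size / 1024);
		rows++;
	}
	return rows;
}
```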
      
      The output looks like:
      
        Unreclaimable slab info:
        Name                      Used          Total
        rpc_buffers               31KB         31KB
        rpc_tasks                  7KB          7KB
        ebitmap_node            1964KB       1964KB
        avtab_node              5024KB       5024KB
        xfs_buf                 1402KB       1402KB
        xfs_ili                  134KB        134KB
        xfs_efi_item             115KB        115KB
        xfs_efd_item             115KB        115KB
        xfs_buf_item             134KB        134KB
        xfs_log_item_desc        342KB        342KB
        xfs_trans               1412KB       1412KB
        xfs_ifork                212KB        212KB
      
      [yang.s@alibaba-inc.com: v11]
        Link: http://lkml.kernel.org/r/1507656303-103845-4-git-send-email-yang.s@alibaba-inc.com
      Link: http://lkml.kernel.org/r/1507152550-46205-4-git-send-email-yang.s@alibaba-inc.com
      
      
      Signed-off-by: Yang Shi <yang.s@alibaba-inc.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: slabinfo: remove CONFIG_SLABINFO · 5b365771
      Yang Shi authored
      According to discussion with Christoph
      (https://marc.info/?l=linux-kernel&m=150695909709711&w=2), it sounds like
      it is pointless to keep CONFIG_SLABINFO around.
      
      This patch removes the CONFIG_SLABINFO config option, but /proc/slabinfo
      is still available.
      
      [yang.s@alibaba-inc.com: v11]
        Link: http://lkml.kernel.org/r/1507656303-103845-3-git-send-email-yang.s@alibaba-inc.com
      Link: http://lkml.kernel.org/r/1507152550-46205-3-git-send-email-yang.s@alibaba-inc.com
      
      
      Signed-off-by: Yang Shi <yang.s@alibaba-inc.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tools: slabinfo: add "-U" option to show unreclaimable slabs only · 7ad3f188
      Yang Shi authored
      Patch series "oom: capture unreclaimable slab info in oom message", v10.
      
      Recently we ran into an OOM issue: the kernel panicked because there was
      no killable process.  The dmesg shows huge unreclaimable slabs using
      almost 100% of memory, but kdump didn't capture a vmcore for some reason.
      
      So it seems better to capture unreclaimable slab info in the OOM
      message when the kernel panics, to aid troubleshooting and cover the
      corner case.  Since the kernel is already panicking, capturing more
      information is worthwhile and doesn't bother the normal OOM killer.
      
      With the patchset, tools/vm/slabinfo has a new option, "-U", to show
      unreclaimable slabs only.
      
      In addition, the OOM killer message will list all unreclaimable slabs
      with non-zero usage (num_objs * size != 0).
      
      This patch (of 3):
      
      Add "-U" option to show unreclaimable slabs only.
      
      "-U" and "-S" together can tell us which unreclaimable slabs use the
      most memory, which helps debug huge unreclaimable slab issues.
      
      Link: http://lkml.kernel.org/r/1507152550-46205-2-git-send-email-yang.s@alibaba-inc.com
      
      
      Signed-off-by: Yang Shi <yang.s@alibaba-inc.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: remove unneeded goto in ocfs2_reserve_cluster_bitmap_bits() · 47ee9d89
      Guozhonghua authored
      Link: http://lkml.kernel.org/r/71604351584F6A4EBAE558C676F37CA4F3CDE3A9@H3CMLB14-EX.srv.huawei-3com.com
      
      
      Signed-off-by: guozhonghua <guozhonghua@h3c.com>
      Cc: Mark Fasheh <mfasheh@versity.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2/dlm: get mle inuse only when it is initialized · 3db409fa
      Changwei Ge authored
      When dlm_add_migration_mle() returns -EEXIST, the previously input mle
      will not be initialized, so we can't use its associated dlm object.  We
      also don't need this mle for the already launched migration process,
      since oldmle has taken this role.
      
      Link: http://lkml.kernel.org/r/63ADC13FD55D6546B7DECE290D39E373CED7AA61@H3CMLB14-EX.srv.huawei-3com.com
      
      
      Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
      Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
      Cc: Mark Fasheh <mfasheh@versity.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: subsystem.su_mutex is required while accessing the item->ci_parent · 853bc26a
      alex chen authored
      The subsystem.su_mutex is required while accessing item->ci_parent;
      otherwise a NULL pointer dereference of item->ci_parent can be
      triggered in the following situation:
      
      add node                     delete node
      sys_write
       vfs_write
        configfs_write_file
         o2nm_node_store
          o2nm_node_local_write
                                   do_rmdir
                                    vfs_rmdir
                                     configfs_rmdir
                                      mutex_lock(&subsys->su_mutex);
                                      unlink_obj
                                       item->ci_group = NULL;
                                       item->ci_parent = NULL;
      	 to_o2nm_cluster_from_node
      	  node->nd_item.ci_parent->ci_parent
      	  BUG due to a NULL pointer dereference of nd_item.ci_parent
      
      Moreover, the o2nm_cluster also should be protected by the
      subsystem.su_mutex.
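
The locking rule can be illustrated with a small userspace model in which a pthread mutex stands in for su_mutex. All names below are hypothetical and the structure is reduced to a bare parent pointer; it only shows why the reader must take the same lock as the remove path and re-check the pointers under it.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical item with a parent link that the remove path clears. */
struct demo_item {
	struct demo_item *parent;
	int id;
};

static pthread_mutex_t demo_subsys_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Remove path: detach the item while holding the subsystem mutex. */
static void demo_unlink(struct demo_item *item)
{
	pthread_mutex_lock(&demo_subsys_mutex);
	item->parent = NULL;
	pthread_mutex_unlock(&demo_subsys_mutex);
}

/*
 * Reader path: walk to the grandparent under the same mutex and check each
 * step, so a concurrent demo_unlink() yields -1 instead of a NULL dereference.
 */
static int demo_grandparent_id(struct demo_item *item, int *id)
{
	int ret = -1;

	pthread_mutex_lock(&demo_subsys_mutex);
	if (item->parent && item->parent->parent) {
		*id = item->parent->parent->id;
		ret = 0;
	}
	pthread_mutex_unlock(&demo_subsys_mutex);
	return ret;
}
```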
      
      [alex.chen@huawei.com: v2]
        Link: http://lkml.kernel.org/r/59EEAA69.9080703@huawei.com
      Link: http://lkml.kernel.org/r/59E9B36A.10700@huawei.com
      
      
      Signed-off-by: Alex Chen <alex.chen@huawei.com>
      Reviewed-by: Jun Piao <piaojun@huawei.com>
      Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
      Cc: Mark Fasheh <mfasheh@versity.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: ip_alloc_sem should be taken in ocfs2_get_block() · 3e4c56d4
      alex chen authored
      ip_alloc_sem should be taken in ocfs2_get_block() when reading a file in
      DIRECT mode to prevent concurrent access to the extent tree with
      ocfs2_dio_end_io_write(), which may cause a BUG_ON in the following
      situation:
      
      read file 'A'                                  end_io of writing file 'A'
      vfs_read
       __vfs_read
        ocfs2_file_read_iter
         generic_file_read_iter
          ocfs2_direct_IO
           __blockdev_direct_IO
            do_blockdev_direct_IO
             do_direct_IO
              get_more_blocks
               ocfs2_get_block
                ocfs2_extent_map_get_blocks
                 ocfs2_get_clusters
                  ocfs2_get_clusters_nocache()
                   ocfs2_search_extent_list
                    return the index of record which
                    contains the v_cluster, that is
                    v_cluster > rec[i]->e_cpos.
                                                      ocfs2_dio_end_io
                                                       ocfs2_dio_end_io_write
                                                        down_write(&oi->ip_alloc_sem);
                                                        ocfs2_mark_extent_written
                                                         ocfs2_change_extent_flag
                                                          ocfs2_split_extent
                                                           ...
                                                       --> modify the rec[i]->e_cpos, resulting
                                                           in v_cluster < rec[i]->e_cpos.
                   BUG_ON(v_cluster < le32_to_cpu(rec->e_cpos))
      
      [alex.chen@huawei.com: v3]
        Link: http://lkml.kernel.org/r/59EF3614.6050008@huawei.com
      Link: http://lkml.kernel.org/r/59EF3614.6050008@huawei.com
      Fixes: c15471f7 ("ocfs2: fix sparse file & data ordering issue in direct io")
      Signed-off-by: Alex Chen <alex.chen@huawei.com>
      Reviewed-by: Jun Piao <piaojun@huawei.com>
      Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
      Reviewed-by: Gang He <ghe@suse.com>
      Acked-by: Changwei Ge <ge.changwei@h3c.com>
      Cc: Mark Fasheh <mfasheh@versity.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: should wait dio before inode lock in ocfs2_setattr() · 28f5a8a7
      alex chen authored
      We should wait for DIO requests to finish before taking the inode lock
      in ocfs2_setattr(); otherwise the following deadlock will happen:
      
      process 1                  process 2                    process 3
      truncate file 'A'          end_io of writing file 'A'   receiving the bast messages
      ocfs2_setattr
       ocfs2_inode_lock_tracker
        ocfs2_inode_lock_full
       inode_dio_wait
        __inode_dio_wait
        -->waiting for all dio
        requests finish
                                                              dlm_proxy_ast_handler
                                                               dlm_do_local_bast
                                                                ocfs2_blocking_ast
                                                                 ocfs2_generic_handle_bast
                                                                  set OCFS2_LOCK_BLOCKED flag
                              dio_end_io
                               dio_bio_end_aio
                                dio_complete
                                 ocfs2_dio_end_io
                                  ocfs2_dio_end_io_write
                                   ocfs2_inode_lock
                                    __ocfs2_cluster_lock
                                     ocfs2_wait_for_mask
                                     -->waiting for OCFS2_LOCK_BLOCKED
                                     flag to be cleared, that is waiting
                                     for 'process 1' unlocking the inode lock
                                 inode_dio_end
                                 -->here dec the i_dio_count, but will never
                                 be called, so a deadlock happened.
      
      Link: http://lkml.kernel.org/r/59F81636.70508@huawei.com
      
      
      Signed-off-by: Alex Chen <alex.chen@huawei.com>
      Reviewed-by: Jun Piao <piaojun@huawei.com>
      Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
      Acked-by: Changwei Ge <ge.changwei@h3c.com>
      Cc: Mark Fasheh <mfasheh@versity.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: clean up some unused function declarations · 67b1b8d1
      piaojun authored
      Link: http://lkml.kernel.org/r/59C5D7D6.9050106@huawei.com
      
      
      Signed-off-by: Jun Piao <piaojun@huawei.com>
      Reviewed-by: Alex Chen <alex.chen@huawei.com>
      Cc: Mark Fasheh <mfasheh@versity.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: fix cluster hang after a node dies · 1c019671
      Changwei Ge authored
      When a node dies, other live nodes have to choose a new master for an
      existing lock resource mastered by the dead node.
      
      In the ocfs2/dlm implementation, this is done by the function
      dlm_move_lockres_to_recovery_list(), which marks those lock resources as
      DLM_LOCK_RES_RECOVERING and manages them via a list from which DLM
      changes the lock resource's master later.
      
      So without invoking dlm_move_lockres_to_recovery_list(), no master will
      be chosen after dlm recovery completes, since no lock resource can be
      found through the ::resource list.
      
      What's worse, if DLM_LOCK_RES_RECOVERING is not set for lock resources
      mastered by a dead node, synchronization among nodes breaks down.
      
      So invoke dlm_move_lockres_to_recovery_list() again.
      
      Fixes: ee8f7fcb ("ocfs2/dlm: continue to purge recovery lockres when recovery master goes down")
      Link: http://lkml.kernel.org/r/63ADC13FD55D6546B7DECE290D39E373CED6E0F9@H3CMLB14-EX.srv.huawei-3com.com
      
      
      Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
      Reported-by: Vitaly Mayatskikh <v.mayatskih@gmail.com>
      Tested-by: Vitaly Mayatskikh <v.mayatskih@gmail.com>
      Cc: Mark Fasheh <mfasheh@versity.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: cleanup unused func declaration and assignment · 98d6c09e
      piaojun authored
      Link: http://lkml.kernel.org/r/59E064BB.8000005@huawei.com
      
      
      Signed-off-by: Jun Piao <piaojun@huawei.com>
      Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
      Cc: Mark Fasheh <mfasheh@versity.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: no need flush workqueue before destroying it · 23e0813a
      piaojun authored
      destroy_workqueue() will do flushing work for us.
      
      Link: http://lkml.kernel.org/r/59E06476.3090502@huawei.com
      
      
      Signed-off-by: Jun Piao <piaojun@huawei.com>
      Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
      Cc: Mark Fasheh <mfasheh@versity.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: remove unused declaration ocfs2_publish_get_mount_state() · a60874f8
      Guozhonghua authored
      Link: http://lkml.kernel.org/r/71604351584F6A4EBAE558C676F37CA4D0743232@H3CMLB12-EX.srv.huawei-3com.com
      
      
      Signed-off-by: guozhonghua <guozhonghua@h3c.com>
      Acked-by: Changwei Ge <ge.changwei@h3c.com>
      Cc: Mark Fasheh <mfasheh@versity.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • m32r: fix endianness constraints · c95f1211
      Geert Uytterhoeven authored
      The m32r Kconfig provides both CPU_BIG_ENDIAN and CPU_LITTLE_ENDIAN
      configuration options.  As they are user-selectable and independent,
      this allows invalid configurations:
      
        - All m32r defconfigs build a big endian kernel, but CPU_BIG_ENDIAN is
          not set, causing compiler warnings like:
      
      	include/linux/byteorder/big_endian.h:7:2: warning: #warning inconsistent configuration, needs CONFIG_CPU_BIG_ENDIAN [-Wcpp]
      	 #warning inconsistent configuration, needs CONFIG_CPU_BIG_ENDIAN
      	  ^
      
        - Since commit 5bdfca64 ("m32r: define CPU_BIG_ENDIAN"),
          building an allmodconfig or allyesconfig enables both
          CONFIG_CPU_BIG_ENDIAN and CONFIG_CPU_LITTLE_ENDIAN.
          While this did get rid of the warning above, both options are
          obviously mutually exclusive.
      
      Fix this by making only CPU_LITTLE_ENDIAN configurable by the user, as
      before, and by making sure exactly one of CPU_BIG_ENDIAN and
      CPU_LITTLE_ENDIAN is always enabled.
      
      Link: http://lkml.kernel.org/r/1509361505-18150-1-git-send-email-geert@linux-m68k.org
      Fixes: 5bdfca64 ("m32r: define CPU_BIG_ENDIAN")
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • bloat-o-meter: provide 3 different arguments for data, function and All · 192efb7a
      Maninder Singh authored
      This patch provides 3 new arguments for bloat-o-meter:
       1) -c -> all (showing functions and data separately)
       2) -d -> data
       3) -t -> functions
      
      output:
      
        ./scripts/bloat-o-meter  -c "file1" "file2"
        add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-152 (-152)
        Function                                     old     new   delta
        main                                         412     260    -152
        Total: Before=548, After=396, chg -27.74%
        ##########################################################
        add/remove: 1/0 grow/shrink: 1/0 up/down: 84/0 (84)
        Data                                         old     new   delta
        arr                                            -      64     +64
        backtrace                                     60      80     +20
        Total: Before=109, After=193, chg +77.06%
        ##########################################################
        add/remove: 0/1 grow/shrink: 0/0 up/down: 0/-64 (-64)
        RO Data                                      old     new   delta
        arr                                           64       -     -64
        Total: Before=68, After=4, chg -94.12%
      
      [maninder1.s@samsung.com: v1 -> v2]
        Link: http://lkml.kernel.org/r/1506569402-24787-1-git-send-email-maninder1.s@samsung.com
      Link: http://lkml.kernel.org/r/1506336313-27187-1-git-send-email-maninder1.s@samsung.com
      
      
      Signed-off-by: Vaneet Narang <v.narang@samsung.com>
      Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
      Cc: Amit Sahrawat <a.sahrawat@samsung.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: <pankaj.m@samsung.com>
      Cc: <a.sahrawat@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · c9b012e5
      Linus Torvalds authored
      Pull arm64 updates from Will Deacon:
       "The big highlight is support for the Scalable Vector Extension (SVE)
        which required extensive ABI work to ensure we don't break existing
        applications by blowing away their signal stack with the rather large
        new vector context (<= 2 kbit per vector register). There's further
        work to be done optimising things like exception return, but the ABI
        is solid now.
      
        Much of the line count comes from some new PMU drivers we have, but
        they're pretty self-contained and I suspect we'll have more of them in
        future.
      
        Plenty of acronym soup here:
      
         - initial support for the Scalable Vector Extension (SVE)
      
         - improved handling for SError interrupts (required to handle RAS
           events)
      
         - enable GCC support for 128-bit integer types
      
         - remove kernel text addresses from backtraces and register dumps
      
         - use of WFE to implement long delay()s
      
         - ACPI IORT updates from Lorenzo Pieralisi
      
         - perf PMU driver for the Statistical Profiling Extension (SPE)
      
         - perf PMU driver for Hisilicon's system PMUs
      
         - misc cleanups and non-critical fixes"
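
To give a sense of why the new vector context is "rather large", the following is a rough sketch of the architectural SVE register state as a function of vector length. It assumes 32 vector registers, 16 predicate registers and one first-fault register, with each predicate holding one bit per vector byte. It counts register contents only, not the kernel's actual signal frame layout, and the function name `sve_state_bytes` is hypothetical.

```python
Z_REGS = 32   # scalable vector registers Z0-Z31
P_REGS = 16   # predicate registers P0-P15


def sve_state_bytes(vl_bits: int) -> int:
    """Bytes of SVE register state for a vector length given in bits."""
    if vl_bits % 128 != 0 or not 128 <= vl_bits <= 2048:
        raise ValueError("vector length must be a multiple of 128 in 128..2048")
    vl_bytes = vl_bits // 8      # one vector register
    pl_bytes = vl_bytes // 8     # one predicate register: 1 bit per vector byte
    ffr_bytes = pl_bytes         # first-fault register has predicate size
    return Z_REGS * vl_bytes + P_REGS * pl_bytes + ffr_bytes


print(sve_state_bytes(128))    # 546 bytes at the minimum vector length
print(sve_state_bytes(2048))   # 8736 bytes at 2 kbit per vector register
```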
      
      * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (97 commits)
        arm64: Make ARMV8_DEPRECATED depend on SYSCTL
        arm64: Implement __lshrti3 library function
        arm64: support __int128 on gcc 5+
        arm64/sve: Add documentation
        arm64/sve: Detect SVE and activate runtime support
        arm64/sve: KVM: Hide SVE from CPU features exposed to guests
        arm64/sve: KVM: Treat guest SVE use as undefined instruction execution
        arm64/sve: KVM: Prevent guests from using SVE
        arm64/sve: Add sysctl to set the default vector length for new processes
        arm64/sve: Add prctl controls for userspace vector length management
        arm64/sve: ptrace and ELF coredump support
        arm64/sve: Preserve SVE registers around EFI runtime service calls
        arm64/sve: Preserve SVE registers around kernel-mode NEON use
        arm64/sve: Probe SVE capabilities and usable vector lengths
        arm64: cpufeature: Move sys_caps_initialised declarations
        arm64/sve: Backend logic for setting the vector length
        arm64/sve: Signal handling support
        arm64/sve: Support vector length resetting for new processes
        arm64/sve: Core task context handling
        arm64/sve: Low-level CPU setup
        ...
      c9b012e5
    • Merge tag 'riscv-for-linus-4.15-arch-v9-premerge' of... · b293fca4
      Linus Torvalds authored
      Merge tag 'riscv-for-linus-4.15-arch-v9-premerge' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux
      
      
      
      Pull RISC-V architecture support from Palmer Dabbelt:
       "This contains the core RISC-V Linux port, which has been through nine
        rounds of review on various mailing lists. The port is not complete:
        there's some cleanup patches moving through the review process, a
        whole bunch of drivers that need some work, and a lot of feature
        additions that will be needed.
      
        The patches contained in this tag have been through nine rounds of
        review on the various mailing lists. I have some outstanding cleanup
        patches, but since there's been so much review on these patches I
        thought it would be best to submit them as-is and then submit explicit
        cleanup patches so everyone can review them. This first patch set is
        big enough that it's a bit of a pain to constantly rewrite, and it's
        caused a few headaches with various contributors.
      
        The port is definitely a work in progress. While what's there builds
        and boots with 4.14, it's a bit hard to actually see anything happen
        because there are no device drivers yet. I maintain a staging branch
        that contains all the device drivers and cleanup that actually works,
        but those patches won't all be ready for a while. I'd like to get what
        we currently have into your tree so everyone can start working from a
        single base -- of particular importance is allowing the glibc
        upstreaming process to proceed so we can sort out any possibly
        lingering user-visible ABI problems we might have.
      
        Copied below is the ChangeLog that contains the history of this patch
        set:
      
         (v9) As per suggestions on our v8 patch set, I've split the core
              architecture code out from our drivers and would like to submit
              this patch set to be included into linux-next, with the goal
              being to be merged in during the next merge window. This patch
              set is based on 4.14-rc2, but if it's better to have it based on
              something else then I can change it around.
      
              This patch set contains just the core arch code for RISC-V, so
              while it builds and nominally boots, you can't print or take an
              interrupt so it's not that useful. If you're looking to actually
              boot a system it would probably be better to use the full patch
              set listed below.
      
              We've collected a handful of tags from reviewers, and the
              remainder of the patch set only got minimal feedback last time.
              Here's what changed:
      
               - We now use the device tree to initialize the timer driver so
                 it's less tightly coupled with the arch port.
      
               - I cleaned up the defconfigs -- there's actually now just one,
                 and it's empty. For now I think we're OK with what the kernel
                 sets as defaults, but I anticipate we'll begin to expand this
                 as people start to use the port more.
      
               - The VDSO symbols version is sane.
      
               - We WFI while spinning in the boot loop.
      
               - A handful of comments have been added.
      
              While there are still a handful of FIXMEs in this patch set,
              we've started to get enough interest from various users and
              contributors that maintaining an out of tree patch set is
              starting to become a big burden. Hopefully the patches are good
              enough to merge now, which will at least get everyone working in
              a more reasonable manner as we clean up the remaining issues.
      
         (v8) I know it may not be the ideal time to submit a patch set right
              now, as it's the middle of the merge window, but things have
              calmed down quite a bit in the last month so I thought it would
              be good to get everyone on the same page. There's been a handful
              of changes since the last patch set, but most of them are fairly
              minor:
      
               - We changed PAGE_OFFSET to allow mapping more physical
                 memory on 64-bit systems. This is user configurable, as it
                 triggers a different code model that generates slightly less
                 efficient code.
      
               - The device tree binding documentation is back, I'd managed to
                 lose it at some point.
      
               - We now pass the atomic64 test suite
      
               - The SBI timer driver has been refactored.
      
         (v7) It's been a while since my last patch set, but the changes have
              been fairly minimal:
      
               - The PCI cleanup patches have been dropped, we'll do them as a
                 separate patch set later.
      
               - We've renamed the Kconfig entries from CONFIG_ISA_* to
                 CONFIG_RISCV_ISA_*, to make grep easier.
      
               - There have been a handful of memory model related tweaks in
                 I/O land, particularly relating to PCI and the upcoming
                 platform specification. There are significant comments in the
                 relevant files. This is still a WIP, but I think we're close
                 to getting as good as we're going to get until we end up with
                 some more specifications.
      
         (v6) As it's been only a day since the v5 patch set, the changes are
              pretty minimal:
      
               - The patch set is now based on linux-next/master, which I
                 believe is a better base now that we're getting closer to
                 upstream.
      
               - EARLY_PRINTK is no longer an option. Since the SBI console is
                 reasonable, there's no penalty to enabling it (and thus no
                 benefit to disabling it).
      
               - The mmap syscalls were refactored a bit.
      
         (v5) Things have really started to calm down, so this is fairly
              similar to the v4 patch set. The most interesting changes
              include:
      
               - We've moved back to a single patch set.
      
               - SMP support has been fixed, I was accidentally running on a
                 non-SMP configuration. There were various mistakes all over
                 the tree as a result of this.
      
               - The cmpxchg syscalls have been removed, as they were deemed a
                 bad idea. As a result, RISC-V Linux systems mandate the A
                 extension. The corresponding Kconfig entry to enable builds
                 on non-A systems has been removed.
      
               - A few more atomic fixes: mostly fence changes, but those
                 resulted in a handful of additional macros that were no
                 longer necessary.
      
               - riscv_early_sie has been removed.
      
         (v4) There have only been a few changes since the v3 patch set:
      
               - The cmpxchg64 syscall is no longer enabled on 32-bit systems.
                 It's not possible to provide this on SMP systems, and it's
                 not necessary as glibc knows not to call it.
      
               - We provide an ELF_HWCAP so users can determine the ISA of the
                 machine the kernel is running on.
      
               - The multi-line comments are in a better form.
      
               - There were a handful of headers that could be replaced with
                 the asm-generic versions, and a few unnecessary definitions.
      
               - We no longer use printk, but instead use pr_*.
      
               - A few Kconfig and defconfig entries have been cleaned up.
      
         (v3) A highlight of the changes since the v2 patch set includes:
      
               - We've split out all our drivers into separate patch sets,
                 which I've already sent out to the relevant maintainers. I
                 haven't included those patches in this patch set, but some of
                 them are necessary to build our port.
      
               - The patch set is now split up differently: rather than being
                 split per directory it is split per topic. Hopefully this
                 will make it easier to review the port on the mailing list.
                 The split is a bit rough, so you probably still want to look
                 at the patch set as a whole.
      
               - atomic.h has been completely rewritten and is hopefully now
                 correct. I've attempted to sanitize the various other memory
                 model related code as well, and I think it should all be sane
                 now aside from a handful of FIXMEs commented in the code.
      
               - We've changed the cmpxchg syscall to always exist and to not
                 be multiplexed. There is also a VDSO entry for compare and
                 exchange, which allows kernels with the A extension to
                 execute user code without the A extension reasonably fast.
      
               - Our user-visible register state now contains enough space for
                 the Q extension for 128-bit floating point, as well as a few
                 words to allow extensibility to future ISA extensions like
                 the eventual V extension for vectors.
      
               - A handful of driver cleanups, but these have been split into
                 separate patch sets now so I won't duplicate them here.
      
         (v2) A highlight of the changes since the v1 patch set includes:
      
               - We've split out our drivers into the right places, which
                 means now there's a lot more patches. I'll be submitting
                 these patches to various subsystem maintainers and including
                 them in any future RISC-V patch sets until they've been
                 merged.
      
               - The SBI console driver has been completely rewritten to use
                 the HVC helpers and is now significantly smaller.
      
               - We've begun to use weaker barriers as opposed to just the big
                 "fence". There's still some work to do here, specifically:
                  - We need fences in the relaxed MMIO functions.
                  - The non-relaxed MMIO functions are missing R/W bits on their fences.
                  - Many AMOs need the aq and rl bits set.
      
               - We now have thread_info in task_struct. As a result, sscratch
                 now contains TP instead of SP. This was necessary because
                 thread_info is no longer on the stack.
      
               - A few shared routines have been added that we use instead of
                 creating another arch copy"
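
The v4 note above says ELF_HWCAP lets users determine the ISA of the running machine. As a hedged illustration, the sketch below decodes such a value, assuming each single-letter extension occupies the bit at position (letter - 'A'). Both helper names are hypothetical, and the mask is built locally rather than read from a live system (on Linux it would normally come from `getauxval(AT_HWCAP)`).

```python
def encode_isa_letters(letters: str) -> int:
    """Build a hwcap-style bitmask from single-letter ISA extensions."""
    mask = 0
    for letter in letters.upper():
        if not "A" <= letter <= "Z":
            raise ValueError(f"not a single-letter extension: {letter!r}")
        mask |= 1 << (ord(letter) - ord("A"))
    return mask


def decode_isa_letters(hwcap: int) -> str:
    """Return the extension letters whose bits are set, in bit order."""
    return "".join(
        chr(ord("A") + bit) for bit in range(26) if hwcap & (1 << bit)
    )


hwcap = encode_isa_letters("IMAFDC")
print(hex(hwcap))                  # 0x112d
print(decode_isa_letters(hwcap))   # ACDFIM
print("A" in decode_isa_letters(hwcap))  # True: atomics are present
```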
      
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      
      * tag 'riscv-for-linus-4.15-arch-v9-premerge' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux:
        RISC-V: Build Infrastructure
        RISC-V: User-facing API
        RISC-V: Paging and MMU
        RISC-V: Device, timer, IRQs, and the SBI
        RISC-V: Task implementation
        RISC-V: ELF and module implementation
        RISC-V: Generic library routines and assembly
        RISC-V: Atomic and Locking Code
        RISC-V: Init and Halt Code
        dt-bindings: RISC-V CPU Bindings
        lib: Add shared copies of some GCC library routines
        MAINTAINERS: Add RISC-V
      b293fca4
    • Merge branch 'for-linus' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching · 0ef76878
      Linus Torvalds authored
      Pull livepatching updates from Jiri Kosina:
      
       - shadow variables support, allowing livepatches to associate new
         "shadow" fields to existing data structures, from Joe Lawrence
      
       - pre/post patch callbacks API, allowing livepatch writers to register
         callbacks to be called before and after patch application, from Joe
         Lawrence
      
      * 'for-linus' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
        livepatch: __klp_disable_patch() should never be called for disabled patches
        livepatch: Correctly call klp_post_unpatch_callback() in error paths
        livepatch: add transition notices
        livepatch: move transition "complete" notice into klp_complete_transition()
        livepatch: add (un)patch callbacks
        livepatch: Small shadow variable documentation fixes
        livepatch: __klp_shadow_get_or_alloc() is local to shadow.c
        livepatch: introduce shadow variable API
      0ef76878
    • Merge branch 'for-linus' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jikos/trivial · 9682b3de
      Linus Torvalds authored
      Pull trivial tree updates from Jiri Kosina:
       "The usual rocket-science from trivial tree for 4.15"
      
      * 'for-linus' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
        MAINTAINERS: relinquish kconfig
        MAINTAINERS: Update my email address
        treewide: Fix typos in Kconfig
        kfifo: Fix comments
        init/Kconfig: Fix module signing document location
        misc: ibmasm: Return error on error path
        HID: logitech-hidpp: fix mistake in printk, "feeback" -> "feedback"
        MAINTAINERS: Correct path to uDraw PS3 driver
        tracing: Fix doc mistakes in trace sample
        tracing: Kconfig text fixes for CONFIG_HWLAT_TRACER
        MIPS: Alchemy: Remove reverted CONFIG_NETLINK_MMAP from db1xxx_defconfig
        mm/huge_memory.c: fixup grammar in comment
        lib/xz: Add fall-through comments to a switch statement
      9682b3de
    • Merge branch 'for-linus' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jikos/hid · 20df1578
      Linus Torvalds authored
      Pull HID updates from Jiri Kosina:
      
       - high resolution mode for Dell canvas support, from Benjamin Tissoires
      
       - pen handling fixes for the Wacom driver, from Jason Gerecke
      
       - i2c-hid: Apollo-Lake based laptops improvements, from Hans de Goede
      
       - Input/Core: eraser tool support, from Ping Cheng
      
       - new ALPS touchpad (T4, found currently on HP EliteBook 1000, ZBook
         Studio and HP EliteBook x360) support, from Masaki Ota
      
       - other smaller assorted fixes
      
      * 'for-linus' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (33 commits)
        HID: cp2112: fix broken gpio_direction_input callback
        HID: cp2112: fix interface specification URL
        HID: Wacom: switch Dell canvas into highres mode
        HID: wacom: generic: Send BTN_STYLUS3 when both barrel switches are set
        HID: sony: Fix SHANWAN pad rumbling on USB
        HID: i2c-hid: Add no-irq-after-reset quirk for 0911:5288 device
        HID: add backlight level quirk for Asus ROG laptops
        HID: cp2112: add HIDRAW dependency
        HID: Add ID 044f:b605 ThrustMaster, Inc. force feedback Racing Wheel
        HID: hid-logitech: remove redundant assignment to pointer value
        HID: wacom: generic: Recognize WACOM_HID_WD_PEN as a type of pen collection
        HID: rmi: Check that a device is a RMI device before calling RMI functions
        HID: add multi-input quirk for GamepadBlock
        HID: alps: add new U1 device ID
        HID: alps: add support for Alps T4 Touchpad device
        HID: alps: remove variables local to u1_init() from the device struct
        HID: alps: properly handle max_fingers and minimum on X and Y axis
        HID: alps: Separate U1 device code
        HID: alps: delete unnecessary struct u1_dev devInfo
        HID: usbhid: Convert timers to use timer_setup()
        ...
      20df1578
  2. Nov 15, 2017