  1. Jun 29, 2018
    • Linus Torvalds's avatar
      Merge tag 'pci-v4.18-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · cd993fc4
      Linus Torvalds authored
      Pull PCI fixes from Bjorn Helgaas:
      
       - Fix crash caused by endpoint library initialization order change
         (Alan Douglas)
      
       - Fix shpchp NULL pointer dereference regression on non-ACPI platforms
         (Bjorn Helgaas)
      
       - Move PCI_DOMAINS selection to fix build regression (Lorenzo
         Pieralisi)
      
      * tag 'pci-v4.18-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        PCI: controller: Move PCI_DOMAINS selection to arch Kconfig
        PCI: Initialize endpoint library before controllers
        PCI: shpchp: Manage SHPC unconditionally on non-ACPI systems
      cd993fc4
    • Linus Torvalds's avatar
      Merge tag 'pm-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 5e4e8c55
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These fix up recently added features (the Kryo cpufreq driver and
        performance states coverage in the generic power domains framework),
        add missing documentation for a recently added sysfs knob in the
        intel_pstate driver and fix an error in its documentation.
      
        Specifics:
      
         - Fix the initialization time error handling in the recently added
           Kryo cpufreq driver (Dan Carpenter).
      
         - Fix up the recently added coverage of performance states in the
           generic power domains (genpd) framework (Viresh Kumar).
      
         - Add missing documentation of the new hwp_dynamic_boost sysfs knob
           in the intel_pstate driver (Rafael Wysocki).
      
         - Fix incorrect sysfs path in the intel_pstate driver documentation
           (Rafael Wysocki)"
      
      * tag 'pm-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        Documentation: intel_pstate: Describe hwp_dynamic_boost sysfs knob
        Documentation: admin-guide: intel_pstate: Fix sysfs path
        PM / Domains: Rename opp_node to np
        PM / Domains: Fix return value of of_genpd_opp_to_performance_state()
        cpufreq: qcom-kryo: Fix error handling in probe()
      5e4e8c55
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2018-06-29' of git://anongit.freedesktop.org/drm/drm · 48a3c64b
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Nothing too major this round:
      
         - small set of mali-dp fixes
      
         - single meson fix
      
         - a bunch of amdgpu fixes (one makes non-4k page sizes not be a bad
           experience)"
      
      * tag 'drm-fixes-2018-06-29' of git://anongit.freedesktop.org/drm/drm:
        drm/amd/display: release spinlock before committing updates to stream
        drm/amdgpu:Support new VCN FW version naming convention
        drm/amdgpu: fix UBSAN: Undefined behaviour for amdgpu_fence.c
        drm/meson: Fix an un-handled error path in 'meson_drv_bind_master()'
        drm/amdgpu: GPU vs CPU page size fixes in amdgpu_vm_bo_split_mapping
        drm/amdgpu: Count disabled CRTCs in commit tail earlier
        drm/mali-dp: Rectify the width and height passed to rotmem_required()
        drm/arm/malidp: Preserve LAYER_FORMAT contents when setting format
        drm: mali-dp: Enable Global SE interrupts mask for DP500
        drm/arm/malidp: Ensure that the crtcs are shutdown before removing any encoder/connector
      48a3c64b
    • Linus Torvalds's avatar
      Merge tag 'for-4.18/dm-fixes' of... · ff23908b
      Linus Torvalds authored
      Merge tag 'for-4.18/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper fixes from Mike Snitzer:
      
       - Fix dm core to use more efficient bio_split() instead of
         bio_clone_bioset(). Also fixes splitting bio that has integrity
         payload.
      
       - Three fixes related to properly validating DAX capabilities of a
         stacked DM device that will advertise DAX support.
      
       - Update DM writecache target to use 2-factor allocator arguments. Kees
         says this is the last related change for 4.18.
      
       - Fix DM zoned target to use GFP_NOIO to avoid triggering reclaim
         during IO submission (caught by lockdep).
      
       - Fix DM thinp to gracefully recover from running out of data space
         while a previous async discard completes (thereby freeing space).
      
       - Fix DM thinp's metadata transaction commit to avoid needless work.
      
      * tag 'for-4.18/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm: prevent DAX mounts if not supported
        dax: check for QUEUE_FLAG_DAX in bdev_dax_supported()
        pmem: only set QUEUE_FLAG_DAX for fsdax mode
        dm thin: handle running out of data space vs concurrent discard
        dm raid: don't use 'const' in function return
        dm zoned: avoid triggering reclaim from inside dmz_map()
        dm writecache: use 2-factor allocator arguments
        dm thin metadata: remove needless work from __commit_transaction
        dm: use bio_split() when splitting out the already processed bio
      ff23908b
    • Avi Kivity's avatar
      aio: mark __aio_sigset::sigmask const · 2cd3ae21
      Avi Kivity authored
      
      
      io_pgetevents() will not change the signal mask.  Mark it const to make
      it clear and to reduce the need for casts in user code.
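
As a minimal userspace sketch of why the const qualifier helps callers: the types below (`demo_sigset_t`, `struct demo_sigset_arg`) are hypothetical stand-ins invented for illustration, not the kernel's actual `__aio_sigset` definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signal-mask type, used only for this illustration. */
typedef struct {
	unsigned long bits[2];
} demo_sigset_t;

/* Hypothetical argument struct; the pointer member is const-qualified. */
struct demo_sigset_arg {
	const demo_sigset_t *sigmask;	/* the callee promises not to modify it */
	size_t sigsetsize;
};

/* A caller holding only a const mask can fill the struct without a cast. */
struct demo_sigset_arg demo_make_arg(const demo_sigset_t *mask)
{
	struct demo_sigset_arg arg;

	assert(mask != NULL);
	arg.sigmask = mask;
	arg.sigsetsize = sizeof(*mask);
	return arg;
}
```

If the member were a plain `demo_sigset_t *`, the assignment in `demo_make_arg()` would need a cast to discard the qualifier.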
      
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Avi Kivity <avi@scylladb.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      [hch: reapply the patch that got incorrectly reverted]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2cd3ae21
    • Christoph Hellwig's avatar
      net: handle NULL ->poll gracefully · e88958e6
      Christoph Hellwig authored
      
      
      The big aio poll revert broke various network protocols that don't
      implement ->poll, because a patch in the aio poll series had removed
      sock_no_poll and made the common code handle this case.
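
The general pattern of tolerating a missing callback in common code can be sketched as follows. This is a simplified illustration, not the actual patch: `struct demo_proto_ops`, `demo_sock_poll()` and the choice of returning 0 for a missing callback are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified ops table; not the kernel's struct proto_ops. */
struct demo_proto_ops {
	unsigned int (*poll)(void *sock);	/* may be NULL for some protocols */
};

/* Example protocol callback that always reports "readable" (bit 0). */
unsigned int demo_always_readable(void *sock)
{
	(void)sock;
	return 0x1u;
}

/*
 * Common-code wrapper: check for a missing ->poll instead of calling
 * through a NULL pointer. Returning 0 (no events) is an assumption of
 * this sketch.
 */
unsigned int demo_sock_poll(const struct demo_proto_ops *ops, void *sock)
{
	assert(ops != NULL);
	if (!ops->poll)
		return 0;
	return ops->poll(sock);
}
```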
      
      Reported-by: <syzbot+57727883dbad76db2ef0@syzkaller.appspotmail.com>
      Reported-by: <syzbot+cdb0d3176b53d35ad454@syzkaller.appspotmail.com>
      Reported-by: <syzbot+2c7e8f74f8b2571c87e8@syzkaller.appspotmail.com>
      Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Fixes: a11e1d43 ("Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL")
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e88958e6
    • Rafael J. Wysocki's avatar
      Merge branch 'pm-domains' · e27b4d4a
      Rafael J. Wysocki authored
      Merge fixups for the recent extension of the generic power domains
      (genpd) framework covering performance states.
      
      * pm-domains:
        PM / Domains: Rename opp_node to np
        PM / Domains: Fix return value of of_genpd_opp_to_performance_state()
      e27b4d4a
    • Dave Airlie's avatar
      Merge tag 'drm-misc-fixes-2018-06-28' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes · 2d8aa4ef
      Dave Airlie authored
      
      
      drm-misc-fixes for v4.18-rc3:
      - A single fix in meson for an unhandled error path in meson_drv_bind_master().
      
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      Link: https://patchwork.freedesktop.org/patch/msgid/fa740f31-5a8d-ed45-5e8a-aecd3f6f11b7@linux.intel.com
      2d8aa4ef
    • Dave Airlie's avatar
      Merge branch 'drm-fixes-4.18' of git://people.freedesktop.org/~agd5f/linux into drm-fixes · d12bce95
      Dave Airlie authored
      
      
      A few fixes for 4.18:
      - fix a read past the end of an array due to vega20 changes
      - fix driver on systems with non-4K pages
      - fix locking with pageflipping in DC that could lead to a sleep while atomic
      - fix VCN firmware version reporting for upcoming firmware
      
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180628032641.2765-1-alexander.deucher@amd.com
      d12bce95
    • Ross Zwisler's avatar
      dm: prevent DAX mounts if not supported · dbc62659
      Ross Zwisler authored
      
      
      Currently device_supports_dax() just checks to see if the QUEUE_FLAG_DAX
      flag is set on the device's request queue to decide whether or not the
      device supports filesystem DAX.  Really we should be using
      bdev_dax_supported() like filesystems do at mount time.  This performs
      other tests like checking to make sure the dax_direct_access() path works.
      
      We also explicitly clear QUEUE_FLAG_DAX on the DM device's request queue if
      any of the underlying devices do not support DAX.  This makes the handling
      of QUEUE_FLAG_DAX consistent with the setting/clearing of most other flags
      in dm_table_set_restrictions().
      
      Now that bdev_dax_supported() explicitly checks for QUEUE_FLAG_DAX, this
      will ensure that filesystems built upon DM devices will only be able to
      mount with DAX if all underlying devices also support DAX.
      
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Fixes: commit 545ed20e ("dm: add infrastructure for DAX support")
      Cc: stable@vger.kernel.org
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      dbc62659
    • Ross Zwisler's avatar
      dax: check for QUEUE_FLAG_DAX in bdev_dax_supported() · 15256f6c
      Ross Zwisler authored
      
      
      Add an explicit check for QUEUE_FLAG_DAX to __bdev_dax_supported().  This
      is needed for DM configurations where the first element in the dm-linear or
      dm-stripe target supports DAX, but other elements do not.  Without this
      check __bdev_dax_supported() will pass for such devices, letting a
      filesystem on that device mount with the DAX option.
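
The difference between checking only the first element and checking every element of a stacked device can be sketched as below. The types and function names are hypothetical and model only the rule described in the text, not the kernel's DM or DAX code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one underlying device of a stacked target. */
struct demo_dev {
	bool dax_capable;
};

/* Flawed check: only the first element is inspected. */
bool demo_first_supports_dax(const struct demo_dev *devs, size_t n)
{
	return n > 0 && devs[0].dax_capable;
}

/* Intended rule: the stacked device supports DAX only if every element does. */
bool demo_all_support_dax(const struct demo_dev *devs, size_t n)
{
	assert(devs != NULL || n == 0);
	if (n == 0)
		return false;
	for (size_t i = 0; i < n; i++)
		if (!devs[i].dax_capable)
			return false;
	return true;
}
```

For a two-element target whose second element lacks DAX, the first function still reports support while the second does not.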
      
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Suggested-by: Mike Snitzer <snitzer@redhat.com>
      Fixes: commit 545ed20e ("dm: add infrastructure for DAX support")
      Cc: stable@vger.kernel.org
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      15256f6c
    • Ross Zwisler's avatar
      pmem: only set QUEUE_FLAG_DAX for fsdax mode · 4557641b
      Ross Zwisler authored
      QUEUE_FLAG_DAX is an indication that a given block device supports
      filesystem DAX and should not be set for PMEM namespaces which are in "raw"
      mode.  These namespaces lack struct page and are prevented from
      participating in filesystem DAX as of commit 569d0365 ("dax: require
      'struct page' by default for filesystem dax").
      
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Suggested-by: Mike Snitzer <snitzer@redhat.com>
      Fixes: 569d0365 ("dax: require 'struct page' by default for filesystem dax")
      Cc: stable@vger.kernel.org
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      4557641b
    • Linus Torvalds's avatar
      Merge tag 'printk-for-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk · 90368a37
      Linus Torvalds authored
      Pull printk fix from Petr Mladek:
       "Revert a commit that went in by mistake. I already have a better fix
        in the queue for 4.19"
      
      * tag 'printk-for-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
        Revert "lib/test_printf.c: call wait_for_random_bytes() before plain %p tests"
      90368a37
    • Linus Torvalds's avatar
      Merge tag 'sound-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · e26aac3c
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Over a dozen changes, but all small and clear fixes.
      
        Half of them are the regression fixes for CA0132 HD-audio codec, and
        the rest are, again, a few more fixups for HD-audio, two UBSAN fixes
        in the core ioctls, and a trivial fix in the error path handling in
        lx6464es driver"
      
      * tag 'sound-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: seq: Fix UBSAN warning at SNDRV_SEQ_IOCTL_QUERY_NEXT_CLIENT ioctl
        ALSA: timer: Fix UBSAN warning at SNDRV_TIMER_IOCTL_NEXT_DEVICE ioctl
        ALSA: hda/realtek - Fix the problem of two front mics on more machines
        ALSA: hda/realtek - Add a quirk for FSC ESPRIMO U9210
        ALSA: hda/ca0132: make array ca0132_alt_chmaps static
        ALSA: hda - Force to link down at runtime suspend on ATI/AMD HDMI
        ALSA: lx6464es: Missing error code in snd_lx6464es_create()
        ALSA: hda/ca0132: Fix DMic data rate for Alienware M17x R4
        ALSA: hda/ca0132: Restore PCM Analog Mic-In2
        ALSA: hda/ca0132: Don't test for QUIRK_NONE
        ALSA: hda/ca0132: Restore behavior of QUIRK_ALIENWARE
        ALSA: hda/ca0132: Delete redundant UNSOL event requests
        ALSA: hda/ca0132: Delete pointless assignments to struct auto_pin_cfg fields
        ALSA: hda/realtek - Fix pop noise on Lenovo P50 & co
      e26aac3c
    • Linus Torvalds's avatar
      Merge tag 'mtd/fixes-for-4.18-rc3' of git://git.infradead.org/linux-mtd · c7e1d692
      Linus Torvalds authored
      Pull mtd fixes from Boris Brezillon:
       "NAND fixes:
      
         - add a quirk for a bunch of broken Macronix chips
      
         - fix nand_block_bad() when chip->ecc.read_oob() returns a positive
           value encoding the number of bitflips
      
         - fix OOB handling in the MXC driver for V2.1 controllers
      
         - flag the ONFI_FEATURE_ON_DIE_ECC as supported in the Micron driver
      
         - hardcode clk rate in the denali_dt driver to address a bad DT
           representation (the proper fix will be queued for 4.19)
      
        SPI NOR fixes:
      
         - add an ULL constant to some ID definitions so that the ID is not
           truncated on 32-bit platforms
      
        MTD fixes:
      
         - fix the sector unlocking logic in the CFI driver"
      
      * tag 'mtd/fixes-for-4.18-rc3' of git://git.infradead.org/linux-mtd:
        mtd: rawnand: denali_dt: set clk_x_rate to 200 MHz unconditionally
        mtd: dataflash: Use ULL suffix for 64-bit constants
        mtd: cfi_cmdset_0002: Avoid walking all chips when unlocking.
        mtd: cfi_cmdset_0002: Fix unlocking requests crossing a chip boudary
        mtd: cfi_cmdset_0002: fix SEGV unlocking multiple chips
        mtd: cfi_cmdset_0002: Use right chip in do_ppb_xxlock()
        mtd: rawnand: All AC chips have a broken GET_FEATURES(TIMINGS).
        mtd: rawnand: fix return value check for bad block status
        mtd: rawnand: mxc: set spare area size register explicitly
        mtd: rawnand: micron: add ONFI_FEATURE_ON_DIE_ECC to supported features
      c7e1d692
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · ea5f39f2
      Linus Torvalds authored
      Merge fixes from Andrew Morton:
       "7 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        proc: add Alexey to MAINTAINERS
        kasan: depend on CONFIG_SLUB_DEBUG
        include/linux/dax.h: dax_iomap_fault() returns vm_fault_t
        x86/e820: put !E820_TYPE_RAM regions into memblock.reserved
        slub: fix failure when we delete and create a slab cache
        Revert mm/vmstat.c: fix vmstat_update() preemption BUG
        lib/percpu_ida.c: don't do alloc from per-CPU list if there is none
      ea5f39f2
    • Jason A. Donenfeld's avatar
      kasan: depend on CONFIG_SLUB_DEBUG · dd275caf
      Jason A. Donenfeld authored
      KASAN depends on having access to some of the accounting that SLUB_DEBUG
      does; without it, there are immediate crashes [1].  So, the natural
      thing to do is to make KASAN select SLUB_DEBUG.
      
      [1] http://lkml.kernel.org/r/CAHmME9rtoPwxUSnktxzKso14iuVCWT7BE_-_8PAC=pGw1iJnQg@mail.gmail.com
      
      Link: http://lkml.kernel.org/r/20180622154623.25388-1-Jason@zx2c4.com
      Fixes: f9e13c0a ("slab, slub: skip unnecessary kasan_cache_shutdown()")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Shakeel Butt <shakeelb@google.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      dd275caf
    • Souptick Joarder's avatar
      include/linux/dax.h: dax_iomap_fault() returns vm_fault_t · f77bc3a8
      Souptick Joarder authored
      Commit 1c8f4220 ("mm: change return type to vm_fault_t") missed a
      conversion.  It's not a big problem at present because mainline is still
      using
      
      	typedef int vm_fault_t;
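
A small sketch of why such a missed conversion goes unnoticed while the typedef is a plain alias for int. `demo_fault_handler`, `demo_fault_fn` and `demo_call` are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stddef.h>

typedef int vm_fault_t;	/* the alias quoted above */

/* Hypothetical handler still declared with the old 'int' return type. */
int demo_fault_handler(void)
{
	return 0;
}

/* Function pointer type written with the new name. */
typedef vm_fault_t (*demo_fault_fn)(void);

/*
 * Accepts the 'int' handler without any diagnostic, because vm_fault_t
 * and int are the same type as long as the typedef above is in effect.
 */
vm_fault_t demo_call(demo_fault_fn fn)
{
	assert(fn != NULL);
	return fn();
}
```

If the alias were later changed to a distinct type, passing `demo_fault_handler` to `demo_call()` would become a type mismatch.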
      
      Fixes: 1c8f4220 ("mm: change return type to vm_fault_t")
      Link: http://lkml.kernel.org/r/20180620172046.GA27894@jordon-HP-15-Notebook-PC
      
      
      Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f77bc3a8
    • Naoya Horiguchi's avatar
      x86/e820: put !E820_TYPE_RAM regions into memblock.reserved · 124049de
      Naoya Horiguchi authored
      There is a kernel panic that is triggered when reading /proc/kpageflags
      on the kernel booted with kernel parameter 'memmap=nn[KMG]!ss[KMG]':
      
        BUG: unable to handle kernel paging request at fffffffffffffffe
        PGD 9b20e067 P4D 9b20e067 PUD 9b210067 PMD 0
        Oops: 0000 [#1] SMP PTI
        CPU: 2 PID: 1728 Comm: page-types Not tainted 4.17.0-rc6-mm1-v4.17-rc6-180605-0816-00236-g2dfb086ef02c+ #160
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.fc28 04/01/2014
        RIP: 0010:stable_page_flags+0x27/0x3c0
        Code: 00 00 00 0f 1f 44 00 00 48 85 ff 0f 84 a0 03 00 00 41 54 55 49 89 fc 53 48 8b 57 08 48 8b 2f 48 8d 42 ff 83 e2 01 48 0f 44 c7 <48> 8b 00 f6 c4 01 0f 84 10 03 00 00 31 db 49 8b 54 24 08 4c 89 e7
        RSP: 0018:ffffbbd44111fde0 EFLAGS: 00010202
        RAX: fffffffffffffffe RBX: 00007fffffffeff9 RCX: 0000000000000000
        RDX: 0000000000000001 RSI: 0000000000000202 RDI: ffffed1182fff5c0
        RBP: ffffffffffffffff R08: 0000000000000001 R09: 0000000000000001
        R10: ffffbbd44111fed8 R11: 0000000000000000 R12: ffffed1182fff5c0
        R13: 00000000000bffd7 R14: 0000000002fff5c0 R15: ffffbbd44111ff10
        FS:  00007efc4335a500(0000) GS:ffff93a5bfc00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: fffffffffffffffe CR3: 00000000b2a58000 CR4: 00000000001406e0
        Call Trace:
         kpageflags_read+0xc7/0x120
         proc_reg_read+0x3c/0x60
         __vfs_read+0x36/0x170
         vfs_read+0x89/0x130
         ksys_pread64+0x71/0x90
         do_syscall_64+0x5b/0x160
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
        RIP: 0033:0x7efc42e75e23
        Code: 09 00 ba 9f 01 00 00 e8 ab 81 f4 ff 66 2e 0f 1f 84 00 00 00 00 00 90 83 3d 29 0a 2d 00 00 75 13 49 89 ca b8 11 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 db d3 01 00 48 89 04 24
      
      According to kernel bisection, this problem became visible due to commit
      f7f99100 ("mm: stop zeroing memory during allocation in vmemmap")
      which changes how struct pages are initialized.
      
      Memblock layout affects the pfn ranges covered by node/zone.  Consider
      that we have a VM with 2 NUMA nodes and each node has 4GB memory, and
      the default (no memmap= given) memblock layout is like below:
      
        MEMBLOCK configuration:
         memory size = 0x00000001fff75c00 reserved size = 0x000000000300c000
         memory.cnt  = 0x4
         memory[0x0]     [0x0000000000001000-0x000000000009efff], 0x000000000009e000 bytes on node 0 flags: 0x0
         memory[0x1]     [0x0000000000100000-0x00000000bffd6fff], 0x00000000bfed7000 bytes on node 0 flags: 0x0
         memory[0x2]     [0x0000000100000000-0x000000013fffffff], 0x0000000040000000 bytes on node 0 flags: 0x0
         memory[0x3]     [0x0000000140000000-0x000000023fffffff], 0x0000000100000000 bytes on node 1 flags: 0x0
         ...
      
      If you give memmap=1G!4G (so it just covers memory[0x2]),
      the range [0x100000000-0x13fffffff] is gone:
      
        MEMBLOCK configuration:
         memory size = 0x00000001bff75c00 reserved size = 0x000000000300c000
         memory.cnt  = 0x3
         memory[0x0]     [0x0000000000001000-0x000000000009efff], 0x000000000009e000 bytes on node 0 flags: 0x0
         memory[0x1]     [0x0000000000100000-0x00000000bffd6fff], 0x00000000bfed7000 bytes on node 0 flags: 0x0
         memory[0x2]     [0x0000000140000000-0x000000023fffffff], 0x0000000100000000 bytes on node 1 flags: 0x0
         ...
      
      This shrinks node 0's pfn range, because that range is calculated from
      the address range of memblock.memory.  As a result, some of the struct
      pages in the gap range are left uninitialized.
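
The effect on node 0's range can be reproduced with a little arithmetic over the two MEMBLOCK dumps. This is a simplified model, not the kernel's calculation: it assumes 4 KiB pages and that a node's end pfn is taken from the highest region assigned to it, and `struct demo_region` and `demo_node_end_pfn()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* assumes 4 KiB pages */

/* Hypothetical, simplified memblock.memory entry (inclusive end address). */
struct demo_region {
	uint64_t start;
	uint64_t end;
	int nid;
};

/* One past the highest pfn covered by any region on the given node. */
uint64_t demo_node_end_pfn(const struct demo_region *r, size_t n, int nid)
{
	uint64_t end_pfn = 0;

	assert(r != NULL);
	for (size_t i = 0; i < n; i++) {
		uint64_t pfn = (r[i].end + 1) >> DEMO_PAGE_SHIFT;

		if (r[i].nid == nid && pfn > end_pfn)
			end_pfn = pfn;
	}
	return end_pfn;
}

/* Default layout, copied from the first dump. */
const struct demo_region demo_default_layout[] = {
	{ 0x0000000000001000ULL, 0x000000000009efffULL, 0 },
	{ 0x0000000000100000ULL, 0x00000000bffd6fffULL, 0 },
	{ 0x0000000100000000ULL, 0x000000013fffffffULL, 0 },
	{ 0x0000000140000000ULL, 0x000000023fffffffULL, 1 },
};

/* Layout with memmap=1G!4G, copied from the second dump. */
const struct demo_region demo_memmap_layout[] = {
	{ 0x0000000000001000ULL, 0x000000000009efffULL, 0 },
	{ 0x0000000000100000ULL, 0x00000000bffd6fffULL, 0 },
	{ 0x0000000140000000ULL, 0x000000023fffffffULL, 1 },
};
```

Under these assumptions, node 0's end pfn is 0x140000 for the default layout and 0xbffd7 once the 1 GiB range at 4 GiB is removed.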
      
      We have a function zero_resv_unavail() which zeroes the struct pages
      within the reserved unavailable range (i.e.  memblock.memory &&
      !memblock.reserved).  This patch uses it to cover all unavailable
      ranges by putting them into memblock.reserved.
      
      Link: http://lkml.kernel.org/r/20180615072947.GB23273@hori1.linux.bs1.fc.nec.co.jp
      Fixes: f7f99100 ("mm: stop zeroing memory during allocation in vmemmap")
      Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Tested-by: Oscar Salvador <osalvador@suse.de>
      Tested-by: "Herton R. Krzesinski" <herton@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Cc: Steven Sistare <steven.sistare@oracle.com>
      Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      124049de
    • Mikulas Patocka's avatar
      slub: fix failure when we delete and create a slab cache · d50d82fa
      Mikulas Patocka authored
      In kernel 4.17 I removed some code from dm-bufio that did slab cache
      merging (commit 21bb1327: "dm bufio: remove code that merges slab
      caches") - both slab and slub support merging caches with identical
      attributes, so dm-bufio now just calls kmem_cache_create and relies on
      implicit merging.
      
      This uncovered a bug in the slub subsystem - if we delete a cache and
      immediately create another cache with the same attributes, it fails
      because of a duplicate filename in /sys/kernel/slab/.  The slub subsystem
      offloads freeing the cache to a workqueue - and if we create the new
      cache before the workqueue runs, it complains because of the duplicate
      filename in sysfs.
      
      This patch fixes the bug by moving the call of kobject_del from
      sysfs_slab_remove_workfn to shutdown_cache.  kobject_del must be called
      while we hold slab_mutex - so that the sysfs entry is deleted before a
      cache with the same attributes could be created.
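
The ordering problem can be modelled in a single-threaded sketch. Everything here is hypothetical: a small name table stands in for /sys/kernel/slab/, and a `sync` flag selects between deferred removal (the buggy ordering) and removal before destroy returns (the fixed ordering). It does not use or reproduce any kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_EEXIST 17
#define DEMO_MAX 8

/* Hypothetical name registry standing in for /sys/kernel/slab/. */
static const char *demo_names[DEMO_MAX];
static const char *demo_pending;	/* removal queued for deferred work */

static int demo_find(const char *name)
{
	for (int i = 0; i < DEMO_MAX; i++)
		if (demo_names[i] && strcmp(demo_names[i], name) == 0)
			return i;
	return -1;
}

/* Register a cache name; fails with -DEMO_EEXIST on a duplicate. */
int demo_cache_create(const char *name)
{
	assert(name != NULL);
	if (demo_find(name) >= 0)
		return -DEMO_EEXIST;
	for (int i = 0; i < DEMO_MAX; i++) {
		if (!demo_names[i]) {
			demo_names[i] = name;
			return 0;
		}
	}
	return -1;	/* table full */
}

/* Destroy a cache; 'sync' selects whether the name is removed right away. */
void demo_cache_destroy(const char *name, bool sync)
{
	int i = demo_find(name);

	if (i < 0)
		return;
	if (sync)
		demo_names[i] = NULL;	/* fixed ordering: gone before we return */
	else
		demo_pending = name;	/* buggy ordering: removed later */
}

/* Deferred work that finally drops the queued name. */
void demo_run_deferred(void)
{
	if (demo_pending) {
		int i = demo_find(demo_pending);

		if (i >= 0)
			demo_names[i] = NULL;
		demo_pending = NULL;
	}
}
```

With deferred removal, a create that runs before `demo_run_deferred()` fails with -17, the same error value as in the log excerpt that follows. With synchronous removal it succeeds.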
      
      Running device-mapper-test-suite with:
      
        dmtest run --suite thin-provisioning -n /commit_failure_causes_fallback/
      
      triggered:
      
        Buffer I/O error on dev dm-0, logical block 1572848, async page read
        device-mapper: thin: 253:1: metadata operation 'dm_pool_alloc_data_block' failed: error = -5
        device-mapper: thin: 253:1: aborting current metadata transaction
        sysfs: cannot create duplicate filename '/kernel/slab/:a-0000144'
        CPU: 2 PID: 1037 Comm: kworker/u48:1 Not tainted 4.17.0.snitm+ #25
        Hardware name: Supermicro SYS-1029P-WTR/X11DDW-L, BIOS 2.0a 12/06/2017
        Workqueue: dm-thin do_worker [dm_thin_pool]
        Call Trace:
         dump_stack+0x5a/0x73
         sysfs_warn_dup+0x58/0x70
         sysfs_create_dir_ns+0x77/0x80
         kobject_add_internal+0xba/0x2e0
         kobject_init_and_add+0x70/0xb0
         sysfs_slab_add+0xb1/0x250
         __kmem_cache_create+0x116/0x150
         create_cache+0xd9/0x1f0
         kmem_cache_create_usercopy+0x1c1/0x250
         kmem_cache_create+0x18/0x20
         dm_bufio_client_create+0x1ae/0x410 [dm_bufio]
         dm_block_manager_create+0x5e/0x90 [dm_persistent_data]
         __create_persistent_data_objects+0x38/0x940 [dm_thin_pool]
         dm_pool_abort_metadata+0x64/0x90 [dm_thin_pool]
         metadata_operation_failed+0x59/0x100 [dm_thin_pool]
         alloc_data_block.isra.53+0x86/0x180 [dm_thin_pool]
         process_cell+0x2a3/0x550 [dm_thin_pool]
         do_worker+0x28d/0x8f0 [dm_thin_pool]
         process_one_work+0x171/0x370
         worker_thread+0x49/0x3f0
         kthread+0xf8/0x130
         ret_from_fork+0x35/0x40
        kobject_add_internal failed for :a-0000144 with -EEXIST, don't try to register things with the same name in the same directory.
        kmem_cache_create(dm_bufio_buffer-16) failed with error -17
      
      Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1806151817130.6333@file01.intranet.prod.int.rdu2.redhat.com
      
      
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Reported-by: Mike Snitzer <snitzer@redhat.com>
      Tested-by: Mike Snitzer <snitzer@redhat.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d50d82fa
    • Sebastian Andrzej Siewior's avatar
      Revert mm/vmstat.c: fix vmstat_update() preemption BUG · 28557cc1
      Sebastian Andrzej Siewior authored
      Revert commit c7f26ccf ("mm/vmstat.c: fix vmstat_update() preemption
      BUG").  Steven saw a "using smp_processor_id() in preemptible" message
      and added a preempt_disable() section around it to keep it quiet.  This
      is not the right thing to do because it does not fix the real problem.
      
      vmstat_update() is invoked by a kworker on a specific CPU.  This worker
      is bound to this CPU.  The name of the worker was "kworker/1:1" so it
      should have been a worker which was bound to CPU1.  A worker which can
      run on any CPU would have a `u' before the first digit.
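
The naming rule just described can be expressed as a small check. `demo_kworker_is_unbound()` is a hypothetical helper that encodes only the rule stated in the text (a `u' before the first digit marks a worker that can run on any CPU).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns true if a worker name of the form "kworker/<id>:<n>" has a 'u'
 * directly after the slash, i.e. before the first digit.
 */
bool demo_kworker_is_unbound(const char *name)
{
	static const char prefix[] = "kworker/";

	assert(name != NULL);
	if (strncmp(name, prefix, sizeof(prefix) - 1) != 0)
		return false;
	return name[sizeof(prefix) - 1] == 'u';
}
```

For example, "kworker/1:1" is classified as bound and "kworker/u48:1" as unbound.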
      
      smp_processor_id() can be used in a preempt-enabled region as long as
      the task is bound to a single CPU, which is the case here.  If it could
      run on an arbitrary CPU, then that is the problem we have and should
      seek to resolve.
      
      It is not only this smp_processor_id() call that must not be migrated
      to another CPU, but also refresh_cpu_vm_stats(), which might otherwise
      access the wrong per-CPU variables.  Not to mention that other code
      relies on the fact that such a worker runs on one specific CPU only.
      
      Therefore revert that commit; we should instead look at what broke the
      affinity mask of the kworker.
      
      Link: http://lkml.kernel.org/r/20180504104451.20278-1-bigeasy@linutronix.de
      
      
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Steven J. Hill <steven.hill@cavium.com>
      Cc: Tejun Heo <htejun@gmail.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      28557cc1
    • Sebastian Andrzej Siewior's avatar
      lib/percpu_ida.c: don't do alloc from per-CPU list if there is none · 4bb6e96a
      Sebastian Andrzej Siewior authored
      In commit 804209d8 ("lib/percpu_ida.c: use _irqsave() instead of
      local_irq_save() + spin_lock") I inlined alloc_local_tag() and mixed up
      the >= check from percpu_ida_alloc() with the one in alloc_local_tag().
      
      Don't alloc from per-CPU freelist if ->nr_free is zero.
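
A minimal sketch of the guard, assuming a simple array-backed local freelist. `struct demo_cpu_tags` and `demo_alloc_local_tag()` are hypothetical and do not mirror the layout or locking of lib/percpu_ida.c.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_LOCAL_TAGS 4

/* Hypothetical per-CPU tag cache; not the kernel's data structure. */
struct demo_cpu_tags {
	unsigned int nr_free;
	unsigned int freelist[DEMO_LOCAL_TAGS];
};

/*
 * Pop a tag from the local freelist, or return -1 when it is empty.
 * Without the nr_free check, --nr_free would wrap around on an empty
 * list and index far outside freelist[].
 */
int demo_alloc_local_tag(struct demo_cpu_tags *tags)
{
	assert(tags != NULL);
	if (tags->nr_free == 0)
		return -1;
	return (int)tags->freelist[--tags->nr_free];
}
```

In this sketch a -1 result means the caller has to fall back to some other source of tags.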
      
      Link: http://lkml.kernel.org/r/20180613075830.c3zeva52fuj6fxxv@linutronix.de
      Fixes: 804209d8 ("lib/percpu_ida.c: use _irqsave() instead of local_irq_save() + spin_lock")
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Reported-by: David Disseldorp <ddiss@suse.de>
      Tested-by: David Disseldorp <ddiss@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Nicholas Bellinger <nab@linux-iscsi.org>
      Cc: Shaohua Li <shli@fb.com>
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4bb6e96a
    • Linus Torvalds's avatar
      Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL · a11e1d43
      Linus Torvalds authored
      
      
      The poll() changes were not well thought out, and completely
      unexplained.  They also caused a huge performance regression, because
      "->poll()" was no longer a trivial file operation that just called down
      to the underlying file operations, but instead did at least two indirect
      calls.
      
      Indirect calls are sadly slow now with the Spectre mitigation, but the
      performance problem could at least be largely mitigated by changing the
      "->get_poll_head()" operation to just have a per-file-descriptor pointer
      to the poll head instead.  That gets rid of one of the new indirections.
      
      But that doesn't fix the new complexity that is completely unwarranted
      for the regular case.  The (undocumented) reason for the poll() changes
      was some alleged AIO poll race fixing, but we don't make the common case
      slower and more complex for some uncommon special case, so this all
      really needs way more explanations and most likely a fundamental
      redesign.
      
      [ This revert is a revert of about 30 different commits, not reverted
        individually because that would just be unnecessarily messy  - Linus ]
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a11e1d43
  2. Jun 28, 2018
    • Shirish S's avatar
      drm/amd/display: release spinlock before committing updates to stream · 4de9f38b
      Shirish S authored
      
      
      Currently, amdgpu_do_flip() spinlocks crtc->dev->event_lock and
      releases it only after committing updates to the stream.
      
      dc_commit_updates_for_stream() should be moved out of
      spinlock for the below reasons:
      
      1. event_lock is supposed to protect access to acrtc->pflip_status _only_
      2. dc_commit_updates_for_stream() can potentially sleep, and it is
         also not appropriate to stay in atomic context for such long
         sequences of code.
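
      The resulting ordering can be sketched with a toy, single-threaded
      model. Everything here (toy_lock, toy_crtc, commit_updates(),
      do_flip()) is a hypothetical stand-in, not the amdgpu code; the
      "lock" is just a flag so that the sketch can assert that the
      long-running step never executes while the lock is held:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock: a flag that records whether we are in the critical section. */
struct toy_lock {
    bool held;
};

static void toy_lock_acquire(struct toy_lock *lock)
{
    assert(!lock->held);
    lock->held = true;
}

static void toy_lock_release(struct toy_lock *lock)
{
    assert(lock->held);
    lock->held = false;
}

enum toy_pflip_status { TOY_PFLIP_NONE, TOY_PFLIP_SUBMITTED };

struct toy_crtc {
    struct toy_lock event_lock;         /* protects pflip_status only */
    enum toy_pflip_status pflip_status;
    int commits;                        /* counts completed commits */
};

/* Stand-in for long-running work that may sleep: must not hold the lock. */
static void commit_updates(struct toy_crtc *crtc)
{
    assert(!crtc->event_lock.held);
    crtc->commits++;
}

/* Update the protected field under the lock, then commit after unlocking. */
void do_flip(struct toy_crtc *crtc)
{
    toy_lock_acquire(&crtc->event_lock);
    crtc->pflip_status = TOY_PFLIP_SUBMITTED;
    toy_lock_release(&crtc->event_lock);

    commit_updates(crtc);
}
```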
      
      Signed-off-by: Shirish S <shirish.s@amd.com>
      Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
      Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
      Reviewed-by: Harry Wentland <harry.wentland@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      4de9f38b
    • James Zhu's avatar
      drm/amdgpu:Support new VCN FW version naming convention · 62d5b8e3
      James Zhu authored
      
      
      Support new VCN FW version naming convention:
        [31, 28] for VEP interface major version if applicable
        [27, 24] for decode interface major version
        [23, 20] for encode interface major version
        [19, 12] for encode interface minor version
        [11, 0]  for firmware revision
      Bits 20-23 hold the encode major version and are non-zero in the new
      naming convention. In the old naming convention this field is part of
      the version minor and DRM_DISABLED_FLAG. Since the latest version minor
      is 0x5B and DRM_DISABLED_FLAG is zero in the old naming convention,
      this field has always been zero so far. These four bits are therefore
      used to tell which naming convention is present.
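
      The bit layout listed above can be decoded as in the following
      sketch. The struct and function names are hypothetical and are not
      the amdgpu driver's identifiers; only the bit ranges and the
      "bits 20-23 non-zero means new convention" rule come from the text:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the fields of the new naming convention. */
struct vcn_fw_version {
    uint32_t vep_major;  /* bits [31, 28] */
    uint32_t dec_major;  /* bits [27, 24] */
    uint32_t enc_major;  /* bits [23, 20] */
    uint32_t enc_minor;  /* bits [19, 12] */
    uint32_t revision;   /* bits [11, 0]  */
};

/* Extract bits [hi, lo] (inclusive) from a 32-bit word. */
static uint32_t extract_bits(uint32_t word, unsigned int hi, unsigned int lo)
{
    assert(hi < 32 && lo <= hi);

    unsigned int width = hi - lo + 1;
    uint32_t mask = (width == 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);

    return (word >> lo) & mask;
}

/* Non-zero encode major (bits 20-23) marks the new naming convention. */
bool vcn_fw_uses_new_naming(uint32_t word)
{
    return extract_bits(word, 23, 20) != 0;
}

/* Split a version word into its fields, assuming the new convention. */
struct vcn_fw_version vcn_fw_decode_new(uint32_t word)
{
    struct vcn_fw_version v = {
        .vep_major = extract_bits(word, 31, 28),
        .dec_major = extract_bits(word, 27, 24),
        .enc_major = extract_bits(word, 23, 20),
        .enc_minor = extract_bits(word, 19, 12),
        .revision  = extract_bits(word, 11, 0),
    };
    return v;
}
```

      A caller would check vcn_fw_uses_new_naming() first and only then
      interpret the word with vcn_fw_decode_new().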
      
      Signed-off-by: James Zhu <James.Zhu@amd.com>
      Reviewed-by: Fang, Peter <Peter.Fang@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      62d5b8e3
    • Leo Liu's avatar
      drm/amdgpu: fix UBSAN: Undefined behaviour for amdgpu_fence.c · d9e98ee2
      Leo Liu authored
      
      
      Here is the UBSAN dump:
      
      [    3.866656] index 2 is out of range for type 'amdgpu_uvd_inst [2]'
      [    3.866693] Workqueue: events work_for_cpu_fn
      [    3.866702] Call Trace:
      [    3.866710]  dump_stack+0x85/0xc5
      [    3.866719]  ubsan_epilogue+0x9/0x40
      [    3.866727]  __ubsan_handle_out_of_bounds+0x89/0x90
      [    3.866737]  ? rcu_read_lock_sched_held+0x58/0x60
      [    3.866746]  ? __kmalloc+0x26c/0x2d0
      [    3.866846]  amdgpu_fence_driver_start_ring+0x259/0x280 [amdgpu]
      [    3.866896]  amdgpu_ring_init+0x12c/0x710 [amdgpu]
      [    3.866906]  ? sprintf+0x42/0x50
      [    3.866956]  amdgpu_gfx_kiq_init_ring+0x1bc/0x3a0 [amdgpu]
      [    3.867009]  gfx_v8_0_sw_init+0x1ad3/0x2360 [amdgpu]
      [    3.867062]  ? smu7_init+0xec/0x160 [amdgpu]
      [    3.867109]  amdgpu_device_init+0x112c/0x1dc0 [amdgpu]
      
      'ring->me' might be set to 2 by 'amdgpu_gfx_kiq_init_ring', which would
      cause an out-of-range access on 'amdgpu_uvd_inst[2]'.
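
      One way to avoid this class of bug is to use the index only when the
      ring really belongs to the indexed array. The sketch below is a
      guess at the general shape, not the actual patch; all toy_* names
      and the two-entry array size are assumptions made for illustration:

```c
#include <stddef.h>

/* Hypothetical number of UVD instances, matching the [2] in the dump. */
#define TOY_MAX_UVD_INSTANCES 2

enum toy_ring_type { TOY_RING_GFX, TOY_RING_KIQ, TOY_RING_UVD };

struct toy_ring {
    enum toy_ring_type type;
    unsigned int me;        /* meaning depends on the ring type */
};

struct toy_uvd_inst {
    int irq_src;
};

struct toy_device {
    struct toy_uvd_inst uvd_inst[TOY_MAX_UVD_INSTANCES];
};

/*
 * Return the UVD instance for a ring, or NULL when the ring is not a
 * UVD ring or its index is out of range. Checking the ring type first
 * keeps a KIQ ring with me == 2 from ever indexing uvd_inst[].
 */
struct toy_uvd_inst *toy_uvd_inst_for_ring(struct toy_device *dev,
                                           const struct toy_ring *ring)
{
    if (ring->type != TOY_RING_UVD)
        return NULL;
    if (ring->me >= TOY_MAX_UVD_INSTANCES)
        return NULL;

    return &dev->uvd_inst[ring->me];
}
```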
      
      v2: simplified with ring type
      
      Signed-off-by: Leo Liu <leo.liu@amd.com>
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      d9e98ee2
    • Linus Torvalds's avatar
      Merge tag 'xfs-4.18-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · f5749432
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
       "Here are some patches for 4.18 to fix regressions, accounting
        problems, overflow problems, and to strengthen metadata validation to
        prevent corruption.
      
        This series has been run through a full xfstests run over the weekend
        and through a quick xfstests run against this morning's master, with
        no major failures reported.
      
        Changes since last update:
      
         - more metadata validation strengthening to prevent crashes.
      
         - fix extent offset overflow problem when insert_range on a 512b
           block fs
      
         - fix some off-by-one errors in the realtime fsmap code
      
         - fix some math errors in the default resblks calculation when free
           space is low
      
         - fix a problem where stale page contents are exposed via mmap read
           after a zero_range at eof
      
         - fix accounting problems with per-ag reservations causing statfs
           reports to vary incorrectly"
      
      * tag 'xfs-4.18-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: fix fdblocks accounting w/ RMAPBT per-AG reservation
        xfs: ensure post-EOF zeroing happens after zeroing part of a file
        xfs: fix off-by-one error in xfs_rtalloc_query_range
        xfs: fix uninitialized field in rtbitmap fsmap backend
        xfs: recheck reflink state after grabbing ILOCK_SHARED for a write
        xfs: don't allow insert-range to shift extents past the maximum offset
        xfs: don't trip over negative free space in xfs_reserve_blocks
        xfs: allow empty transactions while frozen
        xfs: xfs_iflush_abort() can be called twice on cluster writeback failure
        xfs: More robust inode extent count validation
        xfs: simplify xfs_bmap_punch_delalloc_range
      f5749432
    • Timur Tabi's avatar
      MAINTAINERS: Timur has a kernel.org address · 0e49740c
      Timur Tabi authored
      
      
      Timur Tabi no longer works for Qualcomm, and he now has a kernel.org
      email address, so update MAINTAINERS accordingly.
      
      Signed-off-by: Timur Tabi <timur@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0e49740c
    • Linus Torvalds's avatar
      Merge tag 'mips_fixes_4.18_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · 59ec39fe
      Linus Torvalds authored
      Pull MIPS build fix from Paul Burton:
       "A single build fix for 4.18:
      
        Adjust rseq_signal_deliver() & rseq_handle_notify_resume() calls to
        add the ksig argument introduced in v4.18-rc2, around the same time as
        the unadjusted MIPS rseq support"
      
      * tag 'mips_fixes_4.18_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
        MIPS: Add ksig argument to rseq_{signal_deliver,handle_notify_resume}
      59ec39fe
    • Linus Torvalds's avatar
      Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · f8a78bdb
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "A handful of fixes, nothing really concerning and most touching
        devicetree files for various platforms.
      
        I also regenerated the shared multiplatform defconfigs; they have
        drifted quite a bit due to Kconfig changes and reordering, and several
        platform maintainers tried doing the same which resulted in a lot of
        conflict pain -- this way we get everybody onto the same base for next
        merge window"
      
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (31 commits)
        arm64: dts: uniphier: fix widget name of headphone for LD11/LD20 boards
        ARM: dts: Fix SPI node for Arria10
        arm64: dts: stratix10: Fix SPI nodes for Stratix10
        qcom: cmd-db: enforce CONFIG_OF_RESERVED_MEM dependency
        ARM: Always build secure_cntvoff.S on ARM V7 to fix shmobile !SMP build
        ARM: multi_v7_defconfig: renormalize based on recent additions
        arm64: defconfig: renormalize based on recent additions
        arm64: dts: msm8916: fix Coresight ETF graph connections
        arm64: dts: apq8096-db820c: disable uart0 by default
        ARM: dts: imx6sx: fix irq for pcie bridge
        arm64: dts: Stingray: Fix I2C controller interrupt type
        arm64: dts: ns2: Fix PCIe controller interrupt type
        arm64: dts: ns2: Fix I2C controller interrupt type
        arm64: dts: specify 1.8V EMMC capabilities for bcm958742t
        arm64: dts: specify 1.8V EMMC capabilities for bcm958742k
        ARM: dts: Cygnus: Fix PCIe controller interrupt type
        ARM: dts: Cygnus: Fix I2C controller interrupt type
        ARM: dts: BCM5301x: Fix i2c controller interrupt type
        ARM: dts: HR2: Fix interrupt types for i2c and PCIe
        ARM: dts: NSP: Fix PCIe controllers interrupt types
        ...
      f8a78bdb
    • Linus Torvalds's avatar
      Merge tag 'microblaze-v4.18-rc3' of git://git.monstr.eu/linux-2.6-microblaze · 22c3b152
      Linus Torvalds authored
      Pull microblaze fixes from Michal Simek:
      
       - fix architecture gpio heart beat code
      
       - add new syscalls
      
       - remove unused xlnx,compound handling
      
       - remove platform.c
      
      * tag 'microblaze-v4.18-rc3' of git://git.monstr.eu/linux-2.6-microblaze:
        microblaze: consolidate GPIO reset handling
        microblaze: remove unecessary of_platform_bus_probe call
        microblaze: Add new syscalls io_pgetevents and rseq
        microblaze: Remove architecture heart beat code
        microblaze: heartbeat: fix missing prom.h include
      22c3b152
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · debd52a0
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Three small bug fixes (barrier elimination, memory leak on unload,
        spinlock recursion) and a technical enhancement left over from the
        merge window: the TCMU read length support is required for tape
        devices read when the length of the read is greater than the tape
        block size"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: scsi_debug: Fix memory leak on module unload
        scsi: qla2xxx: Spinlock recursion in qla_target
        scsi: ipr: Eliminate duplicate barriers
        scsi: target: tcmu: add read length support
      debd52a0
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · c92067ae
      Linus Torvalds authored
      Pull input updates from Dmitry Torokhov:
      
       - the main change is a fix for my brain-dead patch to PS/2 button
         reporting for some protocols that made it in 4.17
      
       - there is a new driver for Spreadtrum vibrator that I intended to send
         during merge window but ended up not sending the 2nd pull request.
         Given that this is a brand new driver we should not see regressions
         here
      
       - a fixup to Elantech PS/2 driver to avoid decoding errors on Thinkpad
         P52
      
       - addition of few more ACPI IDs for Silead and Elan drivers
      
       - RMI4 is switched to using IRQ domain code instead of rolling its own
         implementation
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: psmouse - fix button reporting for basic protocols
        Input: xpad - fix GPD Win 2 controller name
        Input: elan_i2c_smbus - fix more potential stack buffer overflows
        Input: elan_i2c - add ELAN0618 (Lenovo v330 15IKB) ACPI ID
        Input: elantech - fix V4 report decoding for module with middle key
        Input: elantech - enable middle button of touchpads on ThinkPad P52
        Input: do not assign new tracking ID when changing tool type
        Input: make input_report_slot_state() return boolean
        Input: synaptics-rmi4 - fix axis-swap behavior
        Input: synaptics-rmi4 - fix the error return code in rmi_probe_interrupts()
        Input: synaptics-rmi4 - convert irq distribution to irq_domain
        Input: silead - add MSSL0002 ACPI HID
        Input: goldfish_events - fix checkpatch warnings
        Input: add Spreadtrum vibrator driver
      c92067ae
    • Linus Torvalds's avatar
      Merge branch 'fixes-v4.18-rc3' of... · 896a3492
      Linus Torvalds authored
      Merge branch 'fixes-v4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
      
      Pull more security subsystem fixes from James Morris:
       "Two further fixes for the keys subsystem"
      
      * 'fixes-v4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        dh key: fix rounding up KDF output length
        certs/blacklist: fix const confusion
      896a3492
  3. Jun 27, 2018
    • Linus Torvalds's avatar
      checkpatch: remove warning for 'old' stable@kernel.org address · 3b41c3e2
      Linus Torvalds authored
      
      
      It may not be the actual real stable mailing list address, but the
      stable scripts do actually pick up on the traditional way to mark stable
      patches.
      
      There are also reasons to explicitly avoid using the actual mailing list
      address, since security patches with embargo dates generally do want the
      stable marking, but don't want tools etc to mistakenly send the patch
      out to the mailing list early.
      
      So don't warn for things that are still actively used and explicitly
      supported.
      
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3b41c3e2
    • Katsuhiro Suzuki's avatar
      arm64: dts: uniphier: fix widget name of headphone for LD11/LD20 boards · 86676c46
      Katsuhiro Suzuki authored
      This patch fixes wrong name of headphone widget for receiving events
      of insert/remove headphone plug from simple-card or audio-graph-card.
      
      If we use wrong widget name then we get warning messages such as
      "asoc-audio-graph-card sound: ASoC: DAPM unknown pin Headphones"
      when the plug is inserted or removed from headphone jack.
      
      Fixes: fb21a0ac ("arm64: dts: uniphier: add sound node")
      Signed-off-by: Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com>
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Olof Johansson <olof@lixom.net>
      86676c46
    • Mike Snitzer's avatar
      dm thin: handle running out of data space vs concurrent discard · a685557f
      Mike Snitzer authored
      Discards issued to a DM thin device can complete to userspace (via
      fstrim) _before_ the metadata changes associated with the discards are
      reflected in the thinp superblock (e.g. free blocks).  As such, if a
      user constructs a test that loops repeatedly over these steps, block
      allocation can fail due to discards not having completed yet:
      1) fill thin device via filesystem file
      2) remove file
      3) fstrim
      
      From initial report, here:
      https://www.redhat.com/archives/dm-devel/2018-April/msg00022.html
      
      
      
      "The root cause of this issue is that dm-thin will first remove
      mapping and increase corresponding blocks' reference count to prevent
      them from being reused before DISCARD bios get processed by the
      underlying layers. However, increasing blocks' reference count could
      also increase the nr_allocated_this_transaction in struct sm_disk
      which makes smd->old_ll.nr_allocated +
      smd->nr_allocated_this_transaction bigger than smd->old_ll.nr_blocks.
      In this case, alloc_data_block() will never commit metadata to reset
      the begin pointer of struct sm_disk, because sm_disk_get_nr_free()
      always return an underflow value."
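
      The underflow described in the report can be reproduced with plain
      unsigned arithmetic. The struct below is a simplified, hypothetical
      stand-in that only borrows the counter names from the quote; it is
      not struct sm_disk, and nr_free_checked() is an illustrative
      alternative rather than the fix that was merged:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified counters named after the fields mentioned in the report. */
struct toy_space_map {
    uint64_t nr_blocks;                      /* total blocks */
    uint64_t old_nr_allocated;               /* allocated at last commit */
    uint64_t nr_allocated_this_transaction;  /* allocated (or pinned) since */
};

/* Naive free count: wraps around when allocated exceeds nr_blocks. */
uint64_t nr_free_unchecked(const struct toy_space_map *sm)
{
    return sm->nr_blocks -
           (sm->old_nr_allocated + sm->nr_allocated_this_transaction);
}

/*
 * Checked variant: reports failure instead of returning a wrapped value.
 * On success the free count is stored in *nr_free.
 */
bool nr_free_checked(const struct toy_space_map *sm, uint64_t *nr_free)
{
    uint64_t allocated =
        sm->old_nr_allocated + sm->nr_allocated_this_transaction;

    if (allocated > sm->nr_blocks)
        return false;

    *nr_free = sm->nr_blocks - allocated;
    return true;
}
```

      With 100 blocks, 90 allocated at the last commit and 20 more counted
      in the current transaction, the unchecked version yields a huge
      value close to UINT64_MAX, so a caller would wrongly believe that
      free space is available.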
      
      While there is room for improvement to the space-map accounting that
      thinp is making use of, the reality is this test is inherently racy and
      will result in the previous iteration's fstrim's discard(s) completing
      vs concurrent block allocation, via dd, in the next iteration of the
      loop.
      
      No amount of space map accounting improvements will be able to allow
      users to use a block before a discard of that block has completed.
      
      So the best we can really do is allow DM thinp to gracefully handle such
      aggressive use of all the pool's data by degrading the pool into
      out-of-data-space (OODS) mode.  We _should_ get that behaviour already
      (if space map accounting didn't falsely cause alloc_data_block() to
      believe free space was available).. but short of that we handle the
      current reality that dm_pool_alloc_data_block() can return -ENOSPC.
      
      Reported-by: Dennis Yang <dennisyang@qnap.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      a685557f