  1. Jan 31, 2017
  2. Jan 27, 2017
    • drm/sti: Fix compilation failure for drm_framebuffer.pixel_format · a5b2b6eb
      Chris Wilson authored
      drivers/gpu/drm/sti/sti_plane.c:76:33: error: ‘struct drm_framebuffer’
      has no member named ‘pixel_format’; did you mean ‘format’?
      
      I didn't look too hard at the casting to a char * and just did a
      mechanical transformation of s/pixel_format/format->format/ as given in
      commit 438b74a5 ("drm: Nuke fb->pixel_format").
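
      The substitution can be illustrated with a small userspace model. This is
      only a sketch: the struct names, the FOURCC macro and fourcc_to_str() are
      hypothetical stand-ins, not the kernel definitions or the sti driver code.
      It also copies the four bytes into a terminated buffer rather than casting
      the integer to a char *.

      ```c
      #include <assert.h>
      #include <stdint.h>
      #include <string.h>

      /* Simplified stand-ins for the kernel structures, not the real definitions. */
      struct format_info_model { uint32_t format; };   /* holds the FourCC code */
      struct framebuffer_model { const struct format_info_model *format; };

      /* Hypothetical helper: pack four characters into a FourCC, lowest byte first. */
      #define FOURCC(a, b, c, d) \
          ((uint32_t)(a) | ((uint32_t)(b) << 8) | ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

      /* Write the FourCC as a NUL-terminated string, independent of host endianness. */
      static void fourcc_to_str(const struct framebuffer_model *fb, char out[5])
      {
          assert(fb != NULL && fb->format != NULL);
          uint32_t fourcc = fb->format->format;   /* previously fb->pixel_format */
          for (int i = 0; i < 4; i++)
              out[i] = (char)((fourcc >> (8 * i)) & 0xff);
          out[4] = '\0';
      }
      ```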
      
      Fixes: 438b74a5 ("drm: Nuke fb->pixel_format")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Benjamin Gaignard <benjamin.gaignard@linaro.org>
      Cc: Vincent Abriou <vincent.abriou@st.com>
      Acked-by: Vincent Abriou <vincent.abriou@st.com>
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • Merge tag 'drm-misc-next-2017-01-23' of git://anongit.freedesktop.org/git/drm-misc into drm-next · 3875623c
      Dave Airlie authored
      - cleanups&fixes for dw-hdmi bridge driver (Laurent)
      - updates for adv bridge driver (John Stultz) for nexus
      - drm_crtc_from_index helper rollout (Shawn Guo)
      - removing drm_framebuffer_unregister_private from drivers&core
      - target_vblank (Andrey Grodzovsky)
      - misc tiny stuff
      
      * tag 'drm-misc-next-2017-01-23' of git://anongit.freedesktop.org/git/drm-misc: (49 commits)
        drm: qxl: Open code teardown function for qxl
        drm: qxl: Open code probing sequence for qxl
        drm/bridge: adv7511: Re-write the i2c address before EDID probing
        drm/bridge: adv7511: Reuse __adv7511_power_on/off() when probing EDID
        drm/bridge: adv7511: Rework adv7511_power_on/off() so they can be reused internally
        drm/bridge: adv7511: Enable HPD interrupts to support hotplug and improve monitor detection
        drm/bridge: adv7511: Switch to using drm_kms_helper_hotplug_event()
        drm/bridge: adv7511: Use work_struct to defer hotplug handing to out of irq context
        drm: vc4: use crtc helper drm_crtc_from_index()
        drm: tegra: use crtc helper drm_crtc_from_index()
        drm: nouveau: use crtc helper drm_crtc_from_index()
        drm: mediatek: use crtc helper drm_crtc_from_index()
        drm: kirin: use crtc helper drm_crtc_from_index()
        drm: exynos: use crtc helper drm_crtc_from_index()
        dt-bindings: display: dw-hdmi: Clean up DT bindings documentation
        drm: bridge: dw-hdmi: Assert SVSRET before resetting the PHY
        drm: bridge: dw-hdmi: Fix the name of the PHY reset macros
        drm: bridge: dw-hdmi: Define and use macros for PHY register addresses
        drm: bridge: dw-hdmi: Detect PHY type at runtime
        drm: bridge: dw-hdmi: Handle overflow workaround based on device version
        ...
    • Merge tag 'drm-intel-next-2017-01-23' of git://anongit.freedesktop.org/git/drm-intel into drm-next · a7e2641a
      Dave Airlie authored
      Final block of feature work for 4.11:
      
      - gen8 pd cleanup from Matthew Auld
      - more cleanups for view/vma (Chris)
      - dmc support on glk (Anusha Srivatsa)
      - use core crc api (Tomeu)
      - track wedged requests using fence.error (Chris)
      - lots of psr fixes (Nagaraju, Vathsala)
      - dp mst support, acked for merging through drm-intel by Takashi
        (Libin)
      - huc loading support, including uapi for libva to use it (Anusha
        Srivatsa)
      
      * tag 'drm-intel-next-2017-01-23' of git://anongit.freedesktop.org/git/drm-intel: (111 commits)
        drm/i915: Update DRIVER_DATE to 20170123
        drm/i915: reinstate call to trace_i915_vma_bind
        drm/i915: Assert that created vma has a whole number of pages
        drm/i915: Assert the drm_mm_node is allocated when on the VM lists
        drm/i915: Treat an error from i915_vma_instance() as unlikely
        drm/i915: Reject vma creation larger than address space
        drm/i915: Use common LRU inactive vma bumping for unpin_from_display
        drm/i915: Do an unlocked wait before set-cache-level ioctl
        drm/i915/huc: Assert that HuC vma is placed in GuC accessible range
        drm/i915/huc: Avoid attempting to authenticate non-existent fw
        drm/i915: Set adjustment to zero on Up/Down interrupts if freq is already max/min
        drm/i915: Remove the double handling of 'flags from intel_mode_from_pipe_config()
        drm/i915: Remove crtc->config usage from intel_modeset_readout_hw_state()
        drm/i915: Release temporary load-detect state upon switching
        drm/i915: Remove i915_gem_object_to_ggtt()
        drm/i915: Remove i915_vma_create from VMA API
        drm/i915: Add a check that the VMA instance we lookup matches the request
        drm/i915: Rename some warts in the VMA API
        drm/i915: Track pinned vma in intel_plane_state
        drm/i915/get_params: Add HuC status to getparams
        ...
    • Reinstate "drm/probe-helpers: Drop locking from poll_enable"" · c4d79c22
      Dave Airlie authored
      This reverts commit 54a07c7b,
      and reinstates the original.
      
      [airlied: this might be a bad plan for git].
      
      commit 3846fd9b
      Author: Daniel Vetter <daniel.vetter@ffwll.ch>
      Date:   Wed Jan 11 10:01:17 2017 +0100
      
          drm/probe-helpers: Drop locking from poll_enable
      
          It was only needed to protect the connector_list walking, see
      
          commit 8c4ccc4a
          Author: Daniel Vetter <daniel.vetter@ffwll.ch>
          Date:   Thu Jul 9 23:44:26 2015 +0200
      
              drm/probe-helper: Grab mode_config.mutex in poll_init/enable
      
          Unfortunately the commit message of that patch fails to mention that
          the new locking check was for the connector_list.
      
          But that requirement disappeared in
      
          commit c36a3254
          Author: Daniel Vetter <daniel.vetter@ffwll.ch>
          Date:   Thu Dec 15 16:58:43 2016 +0100
      
              drm: Convert all helpers to drm_connector_list_iter
      
          and so we can drop this again.
      
          This fixes a locking inversion on nouveau, where the rpm code needs to
          re-enable. But in other places the rpm_get() calls are nested within
          the big modeset locks.
      
          While at it, also improve the kerneldoc for these two functions a
          notch.
      
          v2: Update the kerneldoc even more to explain that these functions
          can't be called concurrently, or bad things happen (Chris).
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next · b0df0b25
      Dave Airlie authored
      Backmerge Linus master to get the connector locking revert.
      
      * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux: (645 commits)
        sysctl: fix proc_doulongvec_ms_jiffies_minmax()
        Revert "drm/probe-helpers: Drop locking from poll_enable"
        MAINTAINERS: add Dan Streetman to zbud maintainers
        MAINTAINERS: add Dan Streetman to zswap maintainers
        mm: do not export ioremap_page_range symbol for external module
        mn10300: fix build error of missing fpu_save()
        romfs: use different way to generate fsid for BLOCK or MTD
        frv: add missing atomic64 operations
        mm, page_alloc: fix premature OOM when racing with cpuset mems update
        mm, page_alloc: move cpuset seqcount checking to slowpath
        mm, page_alloc: fix fast-path race with cpuset update or removal
        mm, page_alloc: fix check for NULL preferred_zone
        kernel/panic.c: add missing \n
        fbdev: color map copying bounds checking
        frv: add atomic64_add_unless()
        mm/mempolicy.c: do not put mempolicy before using its nodemask
        radix-tree: fix private list warnings
        Documentation/filesystems/proc.txt: add VmPin
        mm, memcg: do not retry precharge charges
        proc: add a schedule point in proc_pid_readdir()
        ...
    • sysctl: fix proc_doulongvec_ms_jiffies_minmax() · ff9f8a7c
      Eric Dumazet authored
      
      
      We perform the conversion between kernel jiffies and ms only when
      exporting kernel value to user space.
      
      We need to do the opposite operation when value is written by user.
      
      Only matters when HZ != 1000
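
      A minimal sketch of the two directions, assuming HZ = 250 purely for
      illustration. The helper names are hypothetical and are not the sysctl
      code itself; the point is that the write path has to apply the inverse of
      the scaling used on the read path.

      ```c
      #include <assert.h>

      #define HZ_MODEL 250UL   /* assumed tick rate; any value other than 1000 shows the effect */

      /* Read path: kernel value (jiffies) is exported to user space in milliseconds. */
      static unsigned long model_jiffies_to_ms(unsigned long j)
      {
          return j * 1000UL / HZ_MODEL;
      }

      /* Write path: a millisecond value from user space is stored as jiffies. */
      static unsigned long model_ms_to_jiffies(unsigned long ms)
      {
          assert(HZ_MODEL != 0);
          return ms * HZ_MODEL / 1000UL;
      }
      ```

      With HZ equal to 1000 both helpers are the identity, which is why the
      missing conversion only shows up on other configurations.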
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'pinctrl-v4.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 928d336a
      Linus Torvalds authored
      Pull pin control fixes from Linus Walleij:
       "A bunch of pin control fixes for v4.10 that didn't get sent off until
        now, sorry for the delay.
      
        It's only driver fixes:
      
         - A bunch of fixes to the Intel drivers: broxton, baytrail. Bugs
           related to register offsets, IRQ, debounce functionality.
      
         - Fix a conflict amongst UART settings on the meson.
      
         - Fix the ethernet setting on the Uniphier.
      
         - A compilation warning squelched"
      
      * tag 'pinctrl-v4.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: uniphier: fix Ethernet (RMII) pin-mux setting for LD20
        pinctrl: meson: fix uart_ao_b for GXBB and GXL/GXM
        pinctrl: amd: avoid maybe-uninitalized warning
        pinctrl: baytrail: Do not add all GPIOs to IRQ domain
        pinctrl: baytrail: Rectify debounce support
        pinctrl: intel: Set pin direction properly
        pinctrl: broxton: Use correct PADCFGLOCK offset
    • Merge tag 'drm-fixes-for-v4.10-rc6-revert-one' of git://people.freedesktop.org/~airlied/linux · bed7b016
      Linus Torvalds authored
      Pull drm revert from Dave Airlie:
       "Revert one patch missing some prereqs.
      
        One of the connector fixes was missing some prereqs, we have an
        alternate driver fix that should work that I'll send tomorrow.
      
        Today is a holiday here so quickly smashing this out"
      
      Daniel Vetter explains:
       "I pushed a locking change to fix a nouveau rpm issue to -fixes that
        needed the connector_list rework. And that's only in -next, but I
        missed that. Dave has the revert in a pull, and he'll follow-up with
        the hack nouveau patch for 4.10, and then we'll reapply the proper fix
        again for -next and revert the hacks. A bit of a mess, but should be
        sorted soon"
      
      * tag 'drm-fixes-for-v4.10-rc6-revert-one' of git://people.freedesktop.org/~airlied/linux:
        Revert "drm/probe-helpers: Drop locking from poll_enable"
  3. Jan 26, 2017
  4. Jan 25, 2017
    • Merge branch 'akpm' (patches from Andrew) · 883af14e
      Linus Torvalds authored
      Merge fixes from Andrew Morton:
       "26 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (26 commits)
        MAINTAINERS: add Dan Streetman to zbud maintainers
        MAINTAINERS: add Dan Streetman to zswap maintainers
        mm: do not export ioremap_page_range symbol for external module
        mn10300: fix build error of missing fpu_save()
        romfs: use different way to generate fsid for BLOCK or MTD
        frv: add missing atomic64 operations
        mm, page_alloc: fix premature OOM when racing with cpuset mems update
        mm, page_alloc: move cpuset seqcount checking to slowpath
        mm, page_alloc: fix fast-path race with cpuset update or removal
        mm, page_alloc: fix check for NULL preferred_zone
        kernel/panic.c: add missing \n
        fbdev: color map copying bounds checking
        frv: add atomic64_add_unless()
        mm/mempolicy.c: do not put mempolicy before using its nodemask
        radix-tree: fix private list warnings
        Documentation/filesystems/proc.txt: add VmPin
        mm, memcg: do not retry precharge charges
        proc: add a schedule point in proc_pid_readdir()
        mm: alloc_contig: re-allow CMA to compact FS pages
        mm/slub.c: trace free objects at KERN_INFO
        ...
    • MAINTAINERS: add Dan Streetman to zbud maintainers · aab45453
      Dan Streetman authored
      
      
      Add myself as zbud maintainer.
      
      Link: http://lkml.kernel.org/r/20170124221705.26523-1-ddstreet@ieee.org
      Signed-off-by: Dan Streetman <ddstreet@ieee.org>
      Cc: Seth Jennings <sjenning@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • MAINTAINERS: add Dan Streetman to zswap maintainers · 534c9dc9
      Dan Streetman authored
      
      
      Add myself as zswap maintainer.
      
      Link: http://lkml.kernel.org/r/20170124212200.19052-1-ddstreet@ieee.org
      Signed-off-by: Dan Streetman <ddstreet@ieee.org>
      Acked-by: Seth Jennings <sjenning@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: do not export ioremap_page_range symbol for external module · 3277953d
      zhong jiang authored
      
      
      Recently, I've found cases in which ioremap_page_range was used
      incorrectly, in external modules, leading to crashes.  This can be
      partly attributed to the fact that ioremap_page_range is lower-level,
      with fewer protections, as compared to the other functions that an
      external module would typically call.  Those include:
      
           ioremap_cache
           ioremap_nocache
           ioremap_prot
           ioremap_uc
           ioremap_wc
           ioremap_wt
      
      ...each of which wraps __ioremap_caller, which in turn provides a safer
      way to achieve the mapping.
      
      Therefore, stop EXPORT-ing ioremap_page_range.
      
      Link: http://lkml.kernel.org/r/1485173220-29010-1-git-send-email-zhongjiang@huawei.com
      Signed-off-by: zhong jiang <zhongjiang@huawei.com>
      Reviewed-by: John Hubbard <jhubbard@nvidia.com>
      Suggested-by: John Hubbard <jhubbard@nvidia.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mn10300: fix build error of missing fpu_save() · 3705ccfd
      Randy Dunlap authored
      
      
      When CONFIG_FPU is not enabled on arch/mn10300, <asm/switch_to.h> causes
      a build error with a call to fpu_save():
      
        kernel/built-in.o: In function `.L410':
        core.c:(.sched.text+0x28a): undefined reference to `fpu_save'
      
      Fix this by including <asm/fpu.h> in <asm/switch_to.h> so that an empty
      static inline fpu_save() is defined.
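
      The stub pattern the fix relies on looks roughly like the sketch below.
      The struct, the CONFIG_FPU_MODEL macro and the function names are
      hypothetical stand-ins, not the contents of the mn10300 headers.

      ```c
      #include <assert.h>
      #include <stddef.h>

      /* Hypothetical task model; the real headers use their own FPU state type. */
      struct task_model { int fpu_saved; };

      #ifdef CONFIG_FPU_MODEL
      /* With the feature enabled, the real implementation lives elsewhere. */
      void model_fpu_save(struct task_model *t);
      #else
      /* With the feature disabled, an empty inline keeps callers compiling and linking. */
      static inline void model_fpu_save(struct task_model *t)
      {
          (void)t;
      }
      #endif

      /* Caller that is written once and works in both configurations. */
      static void model_switch_away(struct task_model *prev)
      {
          assert(prev != NULL);
          model_fpu_save(prev);
      }
      ```

      The caller only sees a declaration if the header that provides the stub
      is included, which is the role <asm/fpu.h> plays in the fix described
      above.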
      
      Link: http://lkml.kernel.org/r/dc421c4f-4842-4429-1b99-92865c2f24b6@infradead.org
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Reviewed-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • romfs: use different way to generate fsid for BLOCK or MTD · f598f82e
      Coly Li authored
      Commit 8a59f5d2 ("fs/romfs: return f_fsid for statfs(2)") generates
      a 64bit id from sb->s_bdev->bd_dev.  This is only correct when romfs is
      defined with CONFIG_ROMFS_ON_BLOCK.  If romfs is only defined with
      CONFIG_ROMFS_ON_MTD, sb->s_bdev is NULL, and referencing sb->s_bdev->bd_dev
      will trigger an oops.
      
      Richard Weinberger points out that when CONFIG_ROMFS_BACKED_BY_BOTH=y,
      both CONFIG_ROMFS_ON_BLOCK and CONFIG_ROMFS_ON_MTD are defined.
      Therefore when calling huge_encode_dev() to generate a 64bit id, I use
      the following order to choose the parameter:

      - CONFIG_ROMFS_ON_BLOCK defined
        use sb->s_bdev->bd_dev
      - CONFIG_ROMFS_ON_BLOCK undefined and CONFIG_ROMFS_ON_MTD defined
        use sb->s_dev
      - both CONFIG_ROMFS_ON_BLOCK and CONFIG_ROMFS_ON_MTD undefined
        leave id as 0
      
      When CONFIG_ROMFS_ON_MTD is defined and sb->s_mtd is not NULL, sb->s_dev
      is set to a device ID generated by MTD_BLOCK_MAJOR and mtd index,
      otherwise sb->s_dev is 0.
      
      This is a best-effort attempt to generate a unique file system ID; if none
      of the above conditions are met, f_fsid of this romfs instance will be 0.
      Generally only one romfs can be built on a single MTD block device, so this
      method is enough to identify multiple romfs instances in a computer.
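
      The selection order described above can be modeled as follows. This is a
      sketch of the description only, assuming an MTD-only configuration; the
      struct and function names are hypothetical and the real patch may be
      structured differently.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdint.h>

      /* Assumed configuration for this sketch: MTD backing only. */
      #define CONFIG_ROMFS_ON_MTD 1

      /* Hypothetical flattened view of the superblock fields mentioned above. */
      struct sb_model {
          uint64_t bdev_dev;   /* models sb->s_bdev->bd_dev, valid only for block backing */
          uint64_t s_dev;      /* models sb->s_dev, 0 when no MTD device is attached */
      };

      /* Pick the device number that would be fed to huge_encode_dev(). */
      static uint64_t model_fsid_source(const struct sb_model *sb)
      {
          assert(sb != NULL);
      #if defined(CONFIG_ROMFS_ON_BLOCK)
          return sb->bdev_dev;
      #elif defined(CONFIG_ROMFS_ON_MTD)
          return sb->s_dev;
      #else
          return 0;
      #endif
      }
      ```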
      
      Link: http://lkml.kernel.org/r/1482928596-115155-1-git-send-email-colyli@suse.de
      Signed-off-by: Coly Li <colyli@suse.de>
      Reported-by: Nong Li <nongli1031@gmail.com>
      Tested-by: Nong Li <nongli1031@gmail.com>
      Cc: Richard Weinberger <richard.weinberger@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • frv: add missing atomic64 operations · 4180c4c1
      Sudip Mukherjee authored
      
      
      Some more atomic64 operations were missing and as a result frv
      allmodconfig was failing.  Add the missing operations.
      
      Link: http://lkml.kernel.org/r/1485193844-12850-1-git-send-email-sudip.mukherjee@codethink.co.uk
      Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, page_alloc: fix premature OOM when racing with cpuset mems update · e47483bc
      Vlastimil Babka authored
      Ganapatrao Kulkarni reported that the LTP test cpuset01 in stress mode
      triggers the OOM killer in a few seconds, despite lots of free memory.  The
      test attempts to repeatedly fault in memory in one process in a cpuset,
      while changing allowed nodes of the cpuset between 0 and 1 in another
      process.
      
      The problem comes from insufficient protection against cpuset changes,
      which can cause get_page_from_freelist() to consider all zones as
      non-eligible due to nodemask and/or current->mems_allowed.  This was
      masked in the past by sufficient retries, but since commit 682a3385
      ("mm, page_alloc: inline the fast path of the zonelist iterator") we fix
      the preferred_zoneref once, and don't iterate over the whole zonelist in
      further attempts, thus the only eligible zones might be placed in the
      zonelist before our starting point and we always miss them.
      
      A previous patch fixed this problem for current->mems_allowed.  However,
      cpuset changes also update the task's mempolicy nodemask.  The fix has
      two parts.  We have to repeat the preferred_zoneref search when we
      detect cpuset update by way of seqcount, and we have to check the
      seqcount before considering OOM.
      
      [akpm@linux-foundation.org: fix typo in comment]
      Link: http://lkml.kernel.org/r/20170120103843.24587-5-vbabka@suse.cz
      Fixes: c33d6c06 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Reported-by: Ganapatrao Kulkarni <gpkulkarni@gmail.com>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, page_alloc: move cpuset seqcount checking to slowpath · 5ce9bfef
      Vlastimil Babka authored
      
      
      This is a preparation for the following patch to make review simpler.
      While the primary motivation is a bug fix, this also simplifies the fast
      path, although the moved code is only enabled when cpusets are in use.
      
      Link: http://lkml.kernel.org/r/20170120103843.24587-4-vbabka@suse.cz
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
      Cc: Ganapatrao Kulkarni <gpkulkarni@gmail.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, page_alloc: fix fast-path race with cpuset update or removal · 16096c25
      Vlastimil Babka authored
      Ganapatrao Kulkarni reported that the LTP test cpuset01 in stress mode
      triggers the OOM killer in a few seconds, despite lots of free memory.  The
      test attempts to repeatedly fault in memory in one process in a cpuset,
      while changing allowed nodes of the cpuset between 0 and 1 in another
      process.
      
      One possible cause is that in the fast path we find the preferred
      zoneref according to current mems_allowed, so that it points to the
      middle of the zonelist, skipping e.g.  zones of node 1 completely.  If
      the mems_allowed is updated to contain only node 1, we never reach it in
      the zonelist, and trigger OOM before checking the cpuset_mems_cookie.
      
      This patch fixes the particular case by redoing the preferred zoneref
      search if we switch back to the original nodemask.  The condition is
      also slightly changed so that when the last non-root cpuset is removed,
      we don't miss it.
      
      Note that this is not a full fix, and more patches will follow.
      
      Link: http://lkml.kernel.org/r/20170120103843.24587-3-vbabka@suse.cz
      Fixes: 682a3385 ("mm, page_alloc: inline the fast path of the zonelist iterator")
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Reported-by: Ganapatrao Kulkarni <gpkulkarni@gmail.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, page_alloc: fix check for NULL preferred_zone · ea57485a
      Vlastimil Babka authored
      Patch series "fix premature OOM regression in 4.7+ due to cpuset races".
      
      This is v2 of my attempt to fix the recent report based on LTP cpuset
      stress test [1].  The intention is to go to stable 4.9 LTSS with this,
      as triggering repeated OOMs is not nice.  That's why the patches try to
      be not too intrusive.
      
      Unfortunately, while investigating I found that modifying the testcase to
      use per-VMA policies instead of per-task policies will bring the OOMs
      back, but that seems to be a much older and harder-to-fix problem.  I have
      posted an RFC [2] but I believe that fixing the recent regressions has a
      higher priority.
      
      Longer-term we might try to think how to fix the cpuset mess in a better
      and less error prone way.  I was for example very surprised to learn,
      that cpuset updates change not only task->mems_allowed, but also
      nodemask of mempolicies.  Until now I expected the parameter to
      alloc_pages_nodemask() to be stable.  I wonder why do we then treat
      cpusets specially in get_page_from_freelist() and distinguish HARDWALL
      etc, when there's unconditional intersection between mempolicy and
      cpuset.  I would expect the nodemask adjustment for saving overhead in
      g_p_f(), but that clearly doesn't happen in the current form.  So we
      have both crazy complexity and overhead, AFAICS.
      
      [1] https://lkml.kernel.org/r/CAFpQJXUq-JuEP=QPidy4p_=FN0rkH5Z-kfB4qBvsf6jMS87Edg@mail.gmail.com
      [2] https://lkml.kernel.org/r/7c459f26-13a6-a817-e508-b65b903a8378@suse.cz
      
      This patch (of 4):
      
      Since commit c33d6c06 ("mm, page_alloc: avoid looking up the first
      zone in a zonelist twice") we have a wrong check for NULL preferred_zone,
      which can theoretically happen due to concurrent cpuset modification.  We
      check the zoneref pointer which is never NULL and we should check the zone
      pointer.  Also document this in first_zones_zonelist() comment per Michal
      Hocko.
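
      The difference between the two checks can be shown with a toy zonelist.
      This is a hedged sketch: the types and helpers below are hypothetical
      models, not the kernel's struct zoneref or first_zones_zonelist().

      ```c
      #include <assert.h>
      #include <stddef.h>

      /* Toy models: a zonelist is an array of entries ended by one with zone == NULL. */
      struct zone_model { int node; };
      struct zoneref_model { struct zone_model *zone; };

      /* Return the first entry whose node is >= min_node. The result always points
       * into the array (possibly at the terminator), so it is never NULL itself. */
      static struct zoneref_model *model_first_zone(struct zoneref_model *list, int min_node)
      {
          assert(list != NULL);
          struct zoneref_model *z = list;
          while (z->zone != NULL && z->zone->node < min_node)
              z++;
          return z;
      }

      /* Correct emptiness test: look at the zone pointer inside the entry. */
      static int model_has_preferred_zone(const struct zoneref_model *preferred)
      {
          return preferred->zone != NULL;   /* testing "preferred != NULL" is always true */
      }
      ```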
      
      Fixes: c33d6c06 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")
      Link: http://lkml.kernel.org/r/20170120103843.24587-2-vbabka@suse.cz
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
      Cc: Ganapatrao Kulkarni <gpkulkarni@gmail.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel/panic.c: add missing \n · ff7a28a0
      Jiri Slaby authored
      
      
      When a system panics, the "Rebooting in X seconds.." message is never
      printed because it lacks a new line.  Fix it.
      
      Link: http://lkml.kernel.org/r/20170119114751.2724-1-jslaby@suse.cz
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fbdev: color map copying bounds checking · 2dc705a9
      Kees Cook authored
      Copying color maps to userspace doesn't check the value of to->start,
      which will cause kernel heap buffer OOB read due to signedness wraps.
      
      CVE-2016-8405
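
      One way to avoid this class of bug is to keep the offsets unsigned and
      validate them against the lengths before computing a copy size. The
      sketch below illustrates that idea under stated assumptions: the struct
      and function are hypothetical models and are not the fbdev code.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdint.h>

      /* Hypothetical model of a color map window: first index and entry count. */
      struct cmap_model { uint32_t start; uint32_t len; };

      /* Compute the overlapping range to copy from 'from' into 'to'.
       * Returns 0 and fills the outputs, or -1 if either offset is out of range. */
      static int model_cmap_range(const struct cmap_model *from, const struct cmap_model *to,
                                  uint32_t *fromoff, uint32_t *tooff, size_t *count)
      {
          assert(from != NULL && to != NULL);
          *fromoff = 0;
          *tooff = 0;
          if (to->start > from->start)
              *fromoff = to->start - from->start;
          else
              *tooff = from->start - to->start;

          /* Unsigned comparison: a huge start value cannot wrap to a negative offset. */
          if (*fromoff >= from->len || *tooff >= to->len)
              return -1;

          size_t to_room = to->len - *tooff;
          size_t from_room = from->len - *fromoff;
          *count = to_room < from_room ? to_room : from_room;
          return 0;
      }
      ```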
      
      Link: http://lkml.kernel.org/r/20170105224249.GA50925@beast
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reported-by: Peter Pi (@heisecode) of Trend Micro
      Cc: Min Chong <mchong@google.com>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • frv: add atomic64_add_unless() · 545d58f6
      Sudip Mukherjee authored
      
      
      The build of frv allmodconfig was failing with the error:

        lib/atomic64_test.c:209:9: error: implicit declaration of function 'atomic64_add_unless'

      All the other atomic64 operations were defined in frv, but
      atomic64_add_unless() was missing.
      
      Implement atomic64_add_unless() as done in other arches.
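
      The usual shape of such an operation is a compare-and-exchange loop. The
      following is a userspace sketch using C11 atomics, with a hypothetical
      function name; it is not the frv implementation and says nothing about
      how frv builds its atomics.

      ```c
      #include <assert.h>
      #include <stdatomic.h>
      #include <stddef.h>
      #include <stdint.h>

      /* Add 'a' to *v unless *v equals 'u'. Returns 1 if the add happened, else 0. */
      static int model_add_unless64(_Atomic int64_t *v, int64_t a, int64_t u)
      {
          assert(v != NULL);
          int64_t old = atomic_load(v);
          while (old != u) {
              /* On failure, 'old' is refreshed with the current value and we retry. */
              if (atomic_compare_exchange_weak(v, &old, old + a))
                  return 1;
          }
          return 0;
      }
      ```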
      
      Link: http://lkml.kernel.org/r/1484781236-6698-1-git-send-email-sudipm.mukherjee@gmail.com
      Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/mempolicy.c: do not put mempolicy before using its nodemask · d51e9894
      Vlastimil Babka authored
      Since commit be97a41b ("mm/mempolicy.c: merge alloc_hugepage_vma to
      alloc_pages_vma") alloc_pages_vma() can potentially free a mempolicy by
      mpol_cond_put() before accessing the embedded nodemask by
      __alloc_pages_nodemask().  The commit log says it's so "we can use a
      single exit path within the function" but that's clearly wrong.  We can
      still do that when doing mpol_cond_put() after the allocation attempt.
      
      Make sure the mempolicy is not freed prematurely, otherwise
      __alloc_pages_nodemask() can end up using a bogus nodemask, which could
      lead e.g.  to premature OOM.
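
      The ordering rule can be sketched with a toy reference-counted object.
      All names here are hypothetical; the model only shows that the embedded
      mask has to be read before the reference is dropped, while still allowing
      a single exit path.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdlib.h>

      /* Toy reference-counted policy with an embedded node mask. */
      struct policy_model { int refcnt; unsigned long nodemask; };

      /* Drop one reference and free the object when the last one goes away. */
      static void model_policy_put(struct policy_model *pol)
      {
          assert(pol != NULL && pol->refcnt > 0);
          if (--pol->refcnt == 0)
              free(pol);
      }

      /* Use the mask first, then drop the reference on the single exit path. */
      static unsigned long model_pick_nodes(struct policy_model *pol, unsigned long online)
      {
          unsigned long allowed = pol->nodemask & online;   /* read while the reference is held */
          model_policy_put(pol);                            /* only after the last use */
          return allowed;
      }
      ```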
      
      Fixes: be97a41b ("mm/mempolicy.c: merge alloc_hugepage_vma to alloc_pages_vma")
      Link: http://lkml.kernel.org/r/20170118141124.8345-1-vbabka@suse.cz
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: <stable@vger.kernel.org>	[4.0+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • radix-tree: fix private list warnings · dd040b6f
      Matthew Wilcox authored
      The newly introduced warning in radix_tree_free_nodes() was testing the
      wrong variable; it should have been 'old' instead of 'node'.
      
      Fixes: ea07b862 ("mm: workingset: fix use-after-free in shadow node shrinker")
      Link: http://lkml.kernel.org/r/20170118163746.GA32495@cmpxchg.org
      Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Documentation/filesystems/proc.txt: add VmPin · bbd88e1d
      Fabian Frederick authored
      Commit bc3e53f6 ("mm: distinguish between mlocked and pinned pages")
      added VmPin in /proc/<pid>/status.  Report that in
      Documentation/filesystems/proc.txt
      
      Also move Umask after Name to keep correct order.
      
      Link: http://lkml.kernel.org/r/20170114201219.30387-1-fabf@skynet.be
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bbd88e1d
    • David Rientjes's avatar
      mm, memcg: do not retry precharge charges · 3674534b
      David Rientjes authored
      When memory.move_charge_at_immigrate is enabled and precharges are
      depleted during move, mem_cgroup_move_charge_pte_range() will attempt to
      increase the size of the precharge.
      
      Prevent precharges from ever looping by setting __GFP_NORETRY.  This was
      probably the intention of the GFP_KERNEL & ~__GFP_NORETRY, which is
      pointless as written.
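
      To see why the original expression has no effect, here is a minimal
      userspace sketch.  The DEMO_* bit values are made up for illustration and
      are not the kernel's real GFP encodings; the only assumption carried over
      from the text is that the GFP_KERNEL mask does not already contain the
      __GFP_NORETRY bit.

```c
#include <stdbool.h>

/* Illustrative bit values only; NOT the kernel's real GFP encodings. */
#define DEMO_GFP_KERNEL   0x03u	/* stand-in for GFP_KERNEL */
#define DEMO_GFP_NORETRY  0x10u	/* stand-in for __GFP_NORETRY */

/* As written: clearing a bit that was never set leaves the mask unchanged. */
static unsigned int demo_mask_as_written(void)
{
	return DEMO_GFP_KERNEL & ~DEMO_GFP_NORETRY;
}

/* As intended: setting the bit is what requests "do not retry". */
static unsigned int demo_mask_as_intended(void)
{
	return DEMO_GFP_KERNEL | DEMO_GFP_NORETRY;
}

/* True if the mask carries the (stand-in) no-retry bit. */
static bool demo_has_noretry(unsigned int mask)
{
	return (mask & DEMO_GFP_NORETRY) != 0;
}
```

      In this model demo_mask_as_written() is identical to the plain
      DEMO_GFP_KERNEL mask, while demo_mask_as_intended() is the only variant
      for which demo_has_noretry() reports true.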
      
      Fixes: 0029e19e ("mm: memcontrol: remove explicit OOM parameter in charge path")
      Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1701130208510.69402@chino.kir.corp.google.com
      Signed-off-by: David Rientjes <rientjes@google.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3674534b
    • Eric Dumazet's avatar
      proc: add a schedule point in proc_pid_readdir() · 3ba4bcee
      Eric Dumazet authored
      
      
      We have seen proc_pid_readdir() invocations holding cpu for more than 50
      ms.  Add a cond_resched() to be gentle with other tasks.
      
      [akpm@linux-foundation.org: coding style fix]
      Link: http://lkml.kernel.org/r/1484238380.15816.42.camel@edumazet-glaptop3.roam.corp.google.com
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3ba4bcee
    • Lucas Stach's avatar
      mm: alloc_contig: re-allow CMA to compact FS pages · 424f6c48
      Lucas Stach authored
      Commit 73e64c51 ("mm, compaction: allow compaction for GFP_NOFS
      requests") changed compaction to skip FS pages if not explicitly allowed
      to touch them, but failed to update the CMA compact_control.
      
      This leads to a very high isolation failure rate, crippling performance
      of CMA even on a lightly loaded system.  Re-allow CMA to compact FS
      pages by setting the correct GFP flags, restoring CMA behavior and
      performance to the kernel 4.9 level.
      
      Fixes: 73e64c51 ("mm, compaction: allow compaction for GFP_NOFS requests")
      Link: http://lkml.kernel.org/r/20170113115155.24335-1-l.stach@pengutronix.de
      Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      424f6c48
    • Daniel Thompson's avatar
      mm/slub.c: trace free objects at KERN_INFO · aa2efd5e
      Daniel Thompson authored
      
      
      Currently when trace is enabled (e.g.  slub_debug=T,kmalloc-128 ) the
      trace messages are mostly output at KERN_INFO.  However the trace code
      also calls print_section() to hexdump the head of a free object.  This
      is hard coded to use KERN_ERR, meaning the console is deluged with trace
      messages even if we've asked for quiet.
      
      Fix this the obvious way by adding a level parameter to
      print_section(), allowing calls from the trace code to use the same
      trace level as other trace messages.
      
      Link: http://lkml.kernel.org/r/20170113154850.518-1-daniel.thompson@linaro.org
      Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
      Acked-by: Christoph Lameter <cl@linux.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      aa2efd5e
    • Andrea Arcangeli's avatar
      userfaultfd: fix SIGBUS resulting from false rwsem wakeups · 15a77c6f
      Andrea Arcangeli authored
      
      
      With >=32 CPUs the userfaultfd selftest triggered a graceful but
      unexpected SIGBUS because VM_FAULT_RETRY was returned by
      handle_userfault() even though the UFFDIO_COPY had not completed.
      
      This seems to be caused by rwsem waking the thread blocked in
      handle_userfault(), and we can't run up_read() before the wait_event
      sequence is complete.
      
      Keeping the wait_event sequence identical to the first one would require
      running userfaultfd_must_wait() again to know if the loop should be
      repeated, and it would also require retaking the rwsem and revalidating
      the whole vma status.
      
      It seems simpler to wait for the targeted wakeup, so that if false
      wakeups materialize we still wait for our specific wakeup event, unless
      of course there are signals or the uffd was released.
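
      The idea of waiting for the targeted wakeup can be modelled in
      userspace as a loop over a sequence of wakeups.  This is only a sketch:
      the enum and the function name are hypothetical, and the kernel code
      sleeps on a wait queue rather than walking an array.

```c
#include <stddef.h>

/* Hypothetical wakeup kinds for a userspace model of the wait loop. */
enum demo_wake {
	DEMO_WAKE_FALSE,	/* unrelated wakeup, e.g. caused by rwsem activity */
	DEMO_WAKE_TARGETED,	/* the wakeup meant for this specific fault */
	DEMO_WAKE_SIGNAL,	/* a signal is pending */
	DEMO_WAKE_RELEASED	/* the uffd was released */
};

/*
 * Consume wakeups until one of them is a legitimate reason to stop waiting.
 * Returns the index of that wakeup, or n if the sequence ran out first.
 */
static size_t demo_wait_targeted(const enum demo_wake *events, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (events[i] != DEMO_WAKE_FALSE)
			break;	/* targeted wakeup, signal, or release */
		/* false wakeup: go back to sleep and keep waiting */
	}
	return i;
}
```

      In this model any number of false wakeups is skipped, so returning
      early only because of rwsem activity is no longer possible.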
      
      Debug code collecting the stack trace of the wakeup showed this:
      
        $ ./userfaultfd 100 99999
        nr_pages: 25600, nr_pages_per_cpu: 800
        bounces: 99998, mode: racing ver poll, userfaults: 32 35 90 232 30 138 69 82 34 30 139 40 40 31 20 19 43 13 15 28 27 38 21 43 56 22 1 17 31 8 4 2
        bounces: 99997, mode: rnd ver poll, Bus error (core dumped)
      
          save_stack_trace+0x2b/0x50
          try_to_wake_up+0x2a6/0x580
          wake_up_q+0x32/0x70
          rwsem_wake+0xe0/0x120
          call_rwsem_wake+0x1b/0x30
          up_write+0x3b/0x40
          vm_mmap_pgoff+0x9c/0xc0
          SyS_mmap_pgoff+0x1a9/0x240
          SyS_mmap+0x22/0x30
          entry_SYSCALL_64_fastpath+0x1f/0xbd
          0xffffffffffffffff
          FAULT_FLAG_ALLOW_RETRY missing 70
        CPU: 24 PID: 1054 Comm: userfaultfd Tainted: G        W       4.8.0+ #30
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
        Call Trace:
          dump_stack+0xb8/0x112
          handle_userfault+0x572/0x650
          handle_mm_fault+0x12cb/0x1520
          __do_page_fault+0x175/0x500
          trace_do_page_fault+0x61/0x270
          do_async_page_fault+0x19/0x90
          async_page_fault+0x25/0x30
      
      This always happens when the main userfault selftest thread is running
      clone() while glibc runs either mprotect or mmap (both taking mmap_sem
      down_write()) to allocate the thread stack of the background threads,
      while locking/userfault threads already run at full throttle and are
      susceptible to false wakeups that may cause handle_userfault() to return
      earlier than expected (which results in a graceful SIGBUS at the next
      attempt).
      
      This was reproduced only with >=32 CPUs because the loop that starts
      the threads with clone() is too quick with fewer CPUs, while with 32 CPUs
      there's already significant activity on ~32 locking and userfault
      threads when the last background threads are started with clone().
      
      This >=32 CPUs SMP race condition is likely reproducible only with the
      selftest because of the much heavier userfault load it generates if
      compared to real apps.
      
      We'll have to allow "one more" VM_FAULT_RETRY for the WP support, and a
      patch floating around that provides it also hid this problem, but in
      reality it only succeeds at hiding the problem.
      
      False wakeups could still happen again the second time
      handle_userfault() is invoked, even if it is such a rare race condition
      that getting false wakeups twice in a row is impossible to reproduce.
      This full fix is needed for correctness; the only alternative would be
      to allow VM_FAULT_RETRY to be returned infinitely.  With this fix the WP
      support can stick to a strict "one more" VM_FAULT_RETRY logic (no need
      to return it infinite times to avoid the SIGBUS).
      
      Link: http://lkml.kernel.org/r/20170111005535.13832-2-aarcange@redhat.com
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Reported-by: Shubham Kumar Sharma <shubham.kumar.sharma@oracle.com>
      Tested-by: Mike Kravetz <mike.kravetz@oracle.com>
      Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
      Cc: Michael Rapoport <RAPOPORT@il.ibm.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      15a77c6f
    • Arnd Bergmann's avatar
      drivers/memstick/core/memstick.c: avoid -Wnonnull warning · de182cc8
      Arnd Bergmann authored
      
      
      gcc-7 produces a harmless false-positive warning about a possible NULL
      pointer access:
      
        drivers/memstick/core/memstick.c: In function 'h_memstick_read_dev_id':
        drivers/memstick/core/memstick.c:309:3: error: argument 2 null where non-null expected [-Werror=nonnull]
           memcpy(mrq->data, buf, mrq->data_len);
      
      This can't happen because the caller sets the command to 'MS_TPC_READ_REG',
      which causes the data direction to be 'READ', so the NULL pointer is never
      accessed.
      
      As a simple workaround for the warning, we can pass a pointer to the
      data that we actually want to read into.  This is not needed here, but
      also harmless, and lets the compiler know that the access is ok.
      
      Link: http://lkml.kernel.org/r/20170111144143.548867-1-arnd@arndb.de
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Alex Dubov <oakad@yahoo.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      de182cc8
    • Don Zickus's avatar
      kernel/watchdog: prevent false hardlockup on overloaded system · b94f5118
      Don Zickus authored
      
      
      On an overloaded system, it is possible that a change in the watchdog
      threshold can be delayed long enough to trigger a false positive.
      
      This can easily be achieved by having a cpu spinning indefinitely on a
      task, while another cpu updates watchdog threshold.
      
      What happens is while trying to park the watchdog threads, the hrtimers
      on the other cpus trigger and reprogram themselves with the new slower
      watchdog threshold.  Meanwhile, the nmi watchdog is still programmed
      with the old faster threshold.
      
      Because the one cpu is blocked, it prevents the thread parking on the
      other cpus from completing, which is needed to shut down the nmi watchdog
      and reprogram it correctly.  As a result, a false positive from the nmi
      watchdog is reported.
      
      Fix this by setting a park_in_progress flag to block all lockups until
      the parking is complete.
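
      The flag-based suppression can be sketched in userspace as follows.
      This is a hypothetical model, not the kernel implementation: the demo_*
      names are invented here, and a C11 atomic stands in for whatever
      synchronization the kernel code uses.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the park_in_progress flag described above. */
static atomic_bool demo_park_in_progress;

/* Called before the (possibly slow) parking of watchdog threads starts. */
static void demo_begin_park(void)
{
	atomic_store(&demo_park_in_progress, true);
}

/* Called once parking has completed and the watchdogs are reprogrammed. */
static void demo_end_park(void)
{
	atomic_store(&demo_park_in_progress, false);
}

/*
 * Decide whether a lockup should be reported.  'stalled' says whether the
 * CPU looks stuck according to the currently programmed threshold.
 */
static bool demo_should_report_lockup(bool stalled)
{
	if (atomic_load(&demo_park_in_progress))
		return false;	/* thresholds are in flux: suppress the report */
	return stalled;
}
```

      While the flag is set, a stall that would otherwise be reported is
      ignored, which is the window in which the old and new thresholds can
      disagree.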
      
      Fix provided by Ulrich Obergfell.
      
      [akpm@linux-foundation.org: s/park_in_progress/watchdog_park_in_progress/]
      Link: http://lkml.kernel.org/r/1481041033-192236-1-git-send-email-dzickus@redhat.com
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b94f5118
    • Ross Zwisler's avatar
      dax: fix build warnings with FS_DAX and !FS_IOMAP · 6affb9d7
      Ross Zwisler authored
      As reported by Arnd:
      
        https://lkml.org/lkml/2017/1/10/756
      
      Compiling with the following configuration:
      
        # CONFIG_EXT2_FS is not set
        # CONFIG_EXT4_FS is not set
        # CONFIG_XFS_FS is not set
        # CONFIG_FS_IOMAP depends on the above filesystems, and is not set
        CONFIG_FS_DAX=y
      
      generates build warnings about unused functions in fs/dax.c:
      
        fs/dax.c:878:12: warning: `dax_insert_mapping' defined but not used [-Wunused-function]
         static int dax_insert_mapping(struct address_space *mapping,
                    ^~~~~~~~~~~~~~~~~~
        fs/dax.c:572:12: warning: `copy_user_dax' defined but not used [-Wunused-function]
         static int copy_user_dax(struct block_device *bdev, sector_t sector, size_t size,
                    ^~~~~~~~~~~~~
        fs/dax.c:542:12: warning: `dax_load_hole' defined but not used [-Wunused-function]
         static int dax_load_hole(struct address_space *mapping, void **entry,
                    ^~~~~~~~~~~~~
        fs/dax.c:312:14: warning: `grab_mapping_entry' define...
      6affb9d7
    • Keno Fischer's avatar
      mm/huge_memory.c: respect FOLL_FORCE/FOLL_COW for thp · 8310d48b
      Keno Fischer authored
      In commit 19be0eaf ("mm: remove gup_flags FOLL_WRITE games from
      __get_user_pages()"), the mm code was changed from unsetting FOLL_WRITE
      after a COW was resolved to setting the (newly introduced) FOLL_COW
      instead.  Simultaneously, the check in gup.c was updated to still allow
      writes with FOLL_FORCE set if FOLL_COW had also been set.
      
      However, a similar check in huge_memory.c was forgotten.  As a result,
      remote memory writes to ro regions of memory backed by transparent huge
      pages cause an infinite loop in the kernel (handle_mm_fault sets
      FOLL_COW and returns 0 causing a retry, but follow_trans_huge_pmd bails
      out immediately because `(flags & FOLL_WRITE) && !pmd_write(*pmd)` is
      true).
      
      While in this state the process is still SIGKILLable, but little else
      works (e.g.  no ptrace attach, no other signals).  This is easily
      reproduced with the following code (assuming thp are set to always):
      
          #include <assert.h>
          #include <fcntl.h>
          #include <stdint.h>
          #include <stdio.h>
          #include <string.h>
          #include <sys/mman.h>
          #include <sys/stat.h>
          #include <sys/types.h>
          #include <sys/wait.h>
          #include <unistd.h>
      
          #define TEST_SIZE (5 * 1024 * 1024)
      
          int main(void) {
            int status;
            pid_t child;
            int fd = open("/proc/self/mem", O_RDWR);
            void *addr = mmap(NULL, TEST_SIZE, PROT_READ,
                              MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
            assert(addr != MAP_FAILED);
            pid_t parent_pid = getpid();
            if ((child = fork()) == 0) {
              void *addr2 = mmap(NULL, TEST_SIZE, PROT_READ | PROT_WRITE,
                                 MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
              assert(addr2 != MAP_FAILED);
              memset(addr2, 'a', TEST_SIZE);
              pwrite(fd, addr2, TEST_SIZE, (uintptr_t)addr);
              return 0;
            }
            assert(child == waitpid(child, &status, 0));
            assert(WIFEXITED(status) && WEXITSTATUS(status) == 0);
            return 0;
          }
      
      Fix this by updating follow_trans_huge_pmd in huge_memory.c analogously
      to the update in gup.c in the original commit.  The same pattern exists
      in follow_devmap_pmd.  However, we should not be able to reach that
      check with FOLL_COW set, so add WARN_ONCE to make sure we notice if we
      ever do.
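
      The difference between the forgotten check and the updated one can be
      modelled with plain flags.  This is a simplified sketch: the DEMO_FOLL_*
      values are illustrative rather than the kernel's real FOLL_* bits, the
      pmd is reduced to a single "writable" boolean, and the kernel's actual
      helper may test further conditions that are not modelled here.

```c
#include <stdbool.h>

/* Illustrative flag bits; NOT the kernel's real FOLL_* values. */
#define DEMO_FOLL_WRITE 0x1u
#define DEMO_FOLL_FORCE 0x2u
#define DEMO_FOLL_COW   0x4u

/* Forgotten check: every write to a read-only pmd is refused. */
static bool demo_write_refused_old(unsigned int flags, bool pmd_writable)
{
	return (flags & DEMO_FOLL_WRITE) && !pmd_writable;
}

/* A write may proceed if the pmd is writable or a forced COW was resolved. */
static bool demo_can_follow_write(unsigned int flags, bool pmd_writable)
{
	return pmd_writable ||
	       ((flags & DEMO_FOLL_FORCE) && (flags & DEMO_FOLL_COW));
}

/* Updated check: refuse only writes that demo_can_follow_write() rejects. */
static bool demo_write_refused_new(unsigned int flags, bool pmd_writable)
{
	return (flags & DEMO_FOLL_WRITE) &&
	       !demo_can_follow_write(flags, pmd_writable);
}
```

      With FOLL_WRITE, FOLL_FORCE and FOLL_COW all set on a read-only pmd, the
      old predicate keeps refusing (hence the endless retry), while the new one
      lets the access through.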
      
      [akpm@linux-foundation.org: coding-style fixes]
      Link: http://lkml.kernel.org/r/20170106015025.GA38411@juliacomputing.com
      Signed-off-by: Keno Fischer <keno@juliacomputing.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8310d48b
    • Yasuaki Ishimatsu's avatar
      memory_hotplug: make zone_can_shift() return a boolean value · 8a1f780e
      Yasuaki Ishimatsu authored
      online_{kernel|movable} is used to change the memory zone to
      ZONE_{NORMAL|MOVABLE} and online the memory.
      
      To check that the memory zone can be changed, zone_can_shift() is used.
      Currently the function returns a negative integer, a positive integer,
      or 0. When the function returns a negative or positive integer, it
      means that the memory zone can be changed to ZONE_{NORMAL|MOVABLE}.
      
      But when the function returns 0, there are two meanings.
      
      One of the meanings is that the memory zone does not need to be changed.
      For example, when memory is in ZONE_NORMAL and onlined by online_kernel
      the memory zone does not need to be changed.
      
      Another meaning is that the memory zone cannot be changed. When memory
      is in ZONE_NORMAL and onlined by online_movable, the memory zone may
      not be changed to ZONE_MOVABLE due to the memory online limitation (see
      Documentation/memory-hotplug.txt). In this case, memory must not be
      onlined.
      
      The patch changes the return type of zone_can_shift() so that memory
      online operation fails when memory zone cannot be changed as follows:
      
      Before applying patch:
         # grep -A 35 "Node 2" /proc/zoneinfo
         Node 2, zone   Normal
         <snip>
            node_scanned  0
                 spanned  8388608
                 present  7864320
                 managed  7864320
         # echo online_movable > memory4097/state
         # grep -A 35 "Node 2" /proc/zoneinfo
         Node 2, zone   Normal
         <snip>
            node_scanned  0
                 spanned  8388608
                 present  8388608
                 managed  8388608
      
         online_movable operation succeeded. But memory is onlined as
         ZONE_NORMAL, not ZONE_MOVABLE.
      
      After applying patch:
         # grep -A 35 "Node 2" /proc/zoneinfo
         Node 2, zone   Normal
         <snip>
            node_scanned  0
                 spanned  8388608
                 present  7864320
                 managed  7864320
         # echo online_movable > memory4097/state
         bash: echo: write error: Invalid argument
         # grep -A 35 "Node 2" /proc/zoneinfo
         Node 2, zone   Normal
         <snip>
            node_scanned  0
                 spanned  8388608
                 present  7864320
                 managed  7864320
      
         online_movable operation failed because the memory zone could not
         be changed from ZONE_NORMAL to ZONE_MOVABLE.
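
      The ambiguity of the old return value, and how a boolean result with a
      separate shift output removes it, can be sketched as below.  This is a
      hypothetical, heavily simplified model: the demo_* names, the two-zone
      enum and the 'shift_allowed' parameter are inventions for illustration
      and do not reflect the real zone_can_shift() signature.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical two-zone model; the kernel has more zones than this. */
enum demo_zone { DEMO_ZONE_NORMAL = 0, DEMO_ZONE_MOVABLE = 1 };

/* Old style: 0 means both "no shift needed" and "shift impossible". */
static int demo_zone_shift_old(enum demo_zone cur, enum demo_zone target,
			       bool shift_allowed)
{
	if (cur == target || !shift_allowed)
		return 0;
	return (int)target - (int)cur;
}

/* New style: the result says whether onlining may go on, *shift how far. */
static bool demo_zone_can_shift(enum demo_zone cur, enum demo_zone target,
				bool shift_allowed, int *shift)
{
	*shift = 0;
	if (cur == target)
		return true;	/* nothing to change, onlining is fine */
	if (!shift_allowed)
		return false;	/* zone cannot be changed, must fail */
	*shift = (int)target - (int)cur;
	return true;
}

/* Caller sketch: fail with -EINVAL when the zone cannot be changed. */
static int demo_online(enum demo_zone cur, enum demo_zone target,
		       bool shift_allowed)
{
	int shift;

	if (!demo_zone_can_shift(cur, target, shift_allowed, &shift))
		return -EINVAL;
	return 0;
}
```

      In the model, the old helper returns 0 both for a no-op and for an
      impossible shift, whereas demo_online() distinguishes the two and
      reports the "Invalid argument" error seen in the transcript above.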
      
      Fixes: df429ac0 ("memory-hotplug: more general validation of zone during online")
      Link: http://lkml.kernel.org/r/2f9c3837-33d7-b6e5-59c0-6ca4372b2d84@gmail.com
      Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Reviewed-by: Reza Arbab <arbab@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8a1f780e
    • Will Deacon's avatar
      vring: Force use of DMA API for ARM-based systems with legacy devices · c7070619
      Will Deacon authored
      Booting Linux on an ARM fastmodel containing an SMMU emulation results
      in an unexpected I/O page fault from the legacy virtio-blk PCI device:
      
      [    1.211721] arm-smmu-v3 2b400000.smmu: event 0x10 received:
      [    1.211800] arm-smmu-v3 2b400000.smmu:	0x00000000fffff010
      [    1.211880] arm-smmu-v3 2b400000.smmu:	0x0000020800000000
      [    1.211959] arm-smmu-v3 2b400000.smmu:	0x00000008fa081002
      [    1.212075] arm-smmu-v3 2b400000.smmu:	0x0000000000000000
      [    1.212155] arm-smmu-v3 2b400000.smmu: event 0x10 received:
      [    1.212234] arm-smmu-v3 2b400000.smmu:	0x00000000fffff010
      [    1.212314] arm-smmu-v3 2b400000.smmu:	0x0000020800000000
      [    1.212394] arm-smmu-v3 2b400000.smmu:	0x00000008fa081000
      [    1.212471] arm-smmu-v3 2b400000.smmu:	0x0000000000000000
      
      <system hangs failing to read partition table>
      
      This is because the legacy virtio-blk device is behind an SMMU, so we
      have consequently swizzled its DMA ops and configured the SMMU to
      translate accesses. This then requires the vring code to use the DMA API
      to establish translations, otherwise all transactions will result in
      fatal faults and termination.
      
      Given that ARM-based systems only see an SMMU if one is really present
      (the topology is all described by firmware tables such as device-tree or
      IORT), then we can safely use the DMA API for all legacy virtio devices.
      Modern devices can advertise the presence of an IOMMU using the
      VIRTIO_F_IOMMU_PLATFORM feature flag.
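
      The resulting policy can be summarised as a small decision function.
      This is a sketch under assumptions, not the vring code: the function and
      parameter names are hypothetical, and the behaviour for legacy devices on
      non-ARM systems (keep bypassing the DMA API) is assumed here rather than
      stated in this message.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the decision described above.
 *   legacy         - device predates the modern virtio interface
 *   iommu_platform - modern device advertises VIRTIO_F_IOMMU_PLATFORM
 *   arm_based      - kernel is built for an ARM-based system
 */
static bool demo_use_dma_api(bool legacy, bool iommu_platform, bool arm_based)
{
	/* Modern devices say for themselves whether an IOMMU is in the path. */
	if (!legacy)
		return iommu_platform;

	/*
	 * Legacy devices cannot advertise it.  On ARM the firmware tables
	 * describe the SMMU only if it is really present, so using the DMA
	 * API is safe.  Elsewhere this model assumes the old bypass is kept.
	 */
	return arm_based;
}
```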
      
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: <stable@vger.kernel.org>
      Fixes: 876945db ("arm64: Hook up IOMMU dma_ops")
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      c7070619