  1. Feb 11, 2021
  2. Feb 10, 2021
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · b8776f14
      David S. Miller authored
      
      
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2021-02-10
      
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 5 non-merge commits during the last 8 day(s) which contain
      a total of 3 files changed, 22 insertions(+), 21 deletions(-).
      
      The main changes are:
      
      1) Fix missed execution of kprobes BPF progs when kprobe is firing via
         int3, from Alexei Starovoitov.
      
      2) Fix potential integer overflow in map max_entries for stackmap on
         32 bit archs, from Bui Quang Minh.
      
      3) Fix a verifier pruning and an insn rewrite issue related to 32 bit ops,
         from Daniel Borkmann.
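The summary for item 2 does not say where the overflow occurs. As a rough illustration only, the sketch below assumes the bucket count is derived by rounding max_entries up to a power of two in 32-bit arithmetic; the helper names and the guard are hypothetical, not the kernel's code.

```python
MASK32 = 0xFFFFFFFF  # model a 32-bit unsigned long


def roundup_pow_of_two_32(n):
    """Round n up to a power of two using 32-bit wrapping arithmetic."""
    p = 1
    while p and p < n:
        p = (p << 1) & MASK32  # wraps to 0 once bit 31 is shifted out
    return p


def checked_buckets(max_entries):
    """Hypothetical guard: reject a max_entries whose rounded value wrapped to 0."""
    n = roundup_pow_of_two_32(max_entries)
    if n == 0:
        raise OverflowError("max_entries too large for a 32-bit bucket count")
    return n
```

In this model any max_entries above 0x80000000 rounds to 0, so a size computed from the bucket count would be far too small unless the wrap is detected first.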
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Revert "mm: memcontrol: avoid workload stalls when lowering memory.high" · e82553c1
      Johannes Weiner authored
      This reverts commit 536d3bf2, as it can
      cause writers to memory.high to get stuck in the kernel forever,
      performing page reclaim and consuming excessive amounts of CPU cycles.
      
      Before the patch, a write to memory.high would first put the new limit
      in place for the workload, and then reclaim the requested delta.  After
      the patch, the kernel tries to reclaim the delta before putting the new
      limit into place, in order to not overwhelm the workload with a sudden,
      large excess over the limit.  However, if reclaim is actively racing
      with new allocations from the uncurbed workload, it can keep the write()
      working inside the kernel indefinitely.
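The difference between the two orderings can be sketched with a toy model. This is not kernel code: the function names, the per-step rates, and the assumption that a curbed workload has zero net growth once the limit is in place are all illustrative simplifications.

```python
def steps_limit_first(usage, new_high, reclaim_rate, max_steps=1000):
    """Limit is set first: the workload is curbed, so only reclaim moves usage."""
    steps = 0
    while usage > new_high and steps < max_steps:
        usage -= reclaim_rate
        steps += 1
    return steps if usage <= new_high else None


def steps_reclaim_first(usage, new_high, reclaim_rate, alloc_rate, max_steps=1000):
    """Reclaim runs before the limit is set: the uncurbed workload keeps allocating."""
    steps = 0
    while usage > new_high and steps < max_steps:
        usage += alloc_rate - reclaim_rate
        steps += 1
    # None means the writer never converged within max_steps.
    return steps if usage <= new_high else None
```

With usage 1000, a target of 600, and a reclaim rate of 50 per step, the limit-first model always terminates, while the reclaim-first model returns None whenever the allocation rate matches or exceeds the reclaim rate, which corresponds to the write() staying in the kernel indefinitely.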
      
      This is causing problems in Facebook production.  A privileged
      system-level daemon that adjusts memory.high for various workloads
      running on a host can get unexpectedly stuck in the kernel and
      essentially turn into a sort of involuntary kswapd for one of the
      workloads.  We've observed that daemon busy-spin in a write() for
      minutes at a time, neglecting its other duties on the system, and
      expending privileged system resources on behalf of a workload.
      
      To remedy this, we have first considered changing the reclaim logic to
      break out after a couple of loops - whether the workload has converged
      to the new limit or not - and bound the write() call this way.  However,
      the root cause that inspired the sequence change in the first place has
      been fixed through other means, and so a revert back to the proven
      limit-setting sequence, also used by memory.max, is preferable.
      
      The sequence was changed to avoid extreme latencies in the workload when
      the limit was lowered: the sudden, large excess created by the limit
      lowering would erroneously trigger the penalty sleeping code that is
      meant to throttle excessive growth from below.  Allocating threads could
      end up sleeping long after the write() had already reclaimed the delta
      for which they were being punished.
      
      However, erroneous throttling also caused problems in other scenarios at
      around the same time.  This resulted in commit b3ff9291 ("mm, memcg:
      reclaim more aggressively before high allocator throttling"), included
      in the same release as the offending commit.  When allocating threads
      now encounter large excess caused by a racing write() to memory.high,
      instead of entering punitive sleeps, they will simply be tasked with
      helping reclaim down the excess, and will be held no longer than it
      takes to accomplish that.  This is in line with regular limit
      enforcement - i.e.  if the workload allocates up against or over an
      otherwise unchanged limit from below.
      
      With the patch breaking userspace, and the root cause addressed by other
      means already, revert it again.
      
      Link: https://lkml.kernel.org/r/20210122184341.292461-1-hannes@cmpxchg.org
      Fixes: 536d3bf2 ("mm: memcontrol: avoid workload stalls when lowering memory.high")
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Reported-by: Tejun Heo <tj@kernel.org>
      Acked-by: Chris Down <chris@chrisdown.name>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Michal Koutný <mkoutny@suse.com>
      Cc: <stable@vger.kernel.org>	[5.8+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • MAINTAINERS: update Andrey Ryabinin's email address · a0c2eb0a
      Andrey Ryabinin authored
      
      
      Update my email, @virtuozzo.com will stop working shortly.
      
      Link: https://lkml.kernel.org/r/20210204223904.3824-1-ryabinin.a.a@gmail.com
      Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • selftests/vm: rename file run_vmtests to run_vmtests.sh · d52db800
      Rong Chen authored
      Commit c2aa8afc has renamed run_vmtests in Makefile, but the file
      still uses the old name.
      
      The kernel test robot reported the following issue:
      
        # selftests: vm: run_vmtests.sh
        # Warning: file run_vmtests.sh is missing!
        not ok 1 selftests: vm: run_vmtests.sh
      
      Link: https://lkml.kernel.org/r/20210205085507.1479894-1-rong.a.chen@intel.com
      Fixes: c2aa8afc (selftests/vm: rename run_vmtests --> run_vmtests.sh)
      Signed-off-by: Rong Chen <rong.a.chen@intel.com>
      Reported-by: kernel test robot <lkp@intel.com>
      Reviewed-by: John Hubbard <jhubbard@nvidia.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha · ad69c389
      Seth Forshee authored
      As with s390, alpha is a 64-bit architecture with a 32-bit ino_t.  With
      CONFIG_TMPFS_INODE64=y tmpfs mounts will get 64-bit inode numbers and
      display "inode64" in the mount options, whereas passing "inode64" in the
      mount options will fail.  This leads to erroneous behaviours such as
      this:
      
        # mkdir mnt
        # mount -t tmpfs nodev mnt
        # mount -o remount,rw mnt
        mount: /home/ubuntu/mnt: mount point not mounted or bad option.
      
      Prevent CONFIG_TMPFS_INODE64 from being selected on alpha.
      
      Link: https://lkml.kernel.org/r/20210208215726.608197-1-seth.forshee@canonical.com
      Fixes: ea3271f7 ("tmpfs: support 64-bit inums per-sb")
      Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
      Acked-by: Hugh Dickins <hughd@google.com>
      Cc: Chris Down <chris@chrisdown.name>
      Cc: Amir Goldstein <amir73il@gmail.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: <stable@vger.kernel.org>	[5.9+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tmpfs: disallow CONFIG_TMPFS_INODE64 on s390 · b85a7a8b
      Seth Forshee authored
      Currently there is an assumption in tmpfs that 64-bit architectures also
      have a 64-bit ino_t.  This is not true on s390 which has a 32-bit ino_t.
      With CONFIG_TMPFS_INODE64=y tmpfs mounts will get 64-bit inode numbers
      and display "inode64" in the mount options, but passing the "inode64"
      mount option will fail.  This leads to the following behavior:
      
        # mkdir mnt
        # mount -t tmpfs nodev mnt
        # mount -o remount,rw mnt
        mount: /home/ubuntu/mnt: mount point not mounted or bad option.
      
      This happens because mount sees "inode64" in the mount options and
      thus passes it in the options for the remount.
      
      So prevent CONFIG_TMPFS_INODE64 from being selected on s390.
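The failure mode described above can be modelled in a few lines. This is a simplified sketch under stated assumptions: the function names are hypothetical, and the only behaviour taken from the text is that "inode64" is displayed when the config option is set, rejected when ino_t is 32 bits wide, and fed back in by mount on a remount.

```python
def show_options(config_inode64):
    """Options reported for a tmpfs mount; 'inode64' appears when the config is set."""
    return ["rw", "inode64"] if config_inode64 else ["rw"]


def parse_options(opts, ino_t_bytes):
    """Reject 'inode64' when ino_t cannot hold a 64-bit inode number."""
    for opt in opts:
        if opt == "inode64" and ino_t_bytes < 8:
            raise ValueError("bad option: inode64 needs a 64-bit ino_t")
    return True


def remount(config_inode64, ino_t_bytes):
    """mount reads back the displayed options and passes them in for the remount."""
    return parse_options(show_options(config_inode64), ino_t_bytes)
```

In this model, remount(True, 4) raises, matching the "bad option" error on a 64-bit architecture with a 32-bit ino_t, while remount(False, 4) succeeds, which is the effect of keeping the config option from being selected.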
      
      Link: https://lkml.kernel.org/r/20210205230620.518245-1-seth.forshee@canonical.com
      Fixes: ea3271f7 ("tmpfs: support 64-bit inums per-sb")
      Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
      Acked-by: Hugh Dickins <hughd@google.com>
      Cc: Chris Down <chris@chrisdown.name>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Amir Goldstein <amir73il@gmail.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: <stable@vger.kernel.org>	[5.9+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/mremap: fix BUILD_BUG_ON() error in get_extent · a30a2909
      Arnd Bergmann authored
      clang can't evaluate this function argument at compile time when the
      function is not inlined, which leads to a link time failure:
      
        ld.lld: error: undefined symbol: __compiletime_assert_414
        >>> referenced by mremap.c
        >>>               mremap.o:(get_extent) in archive mm/built-in.a
      
      Mark the function as __always_inline to avoid it.
      
      Link: https://lkml.kernel.org/r/20201230154104.522605-1-arnd@kernel.org
      Fixes: 9ad9718b ("mm/mremap: calculate extent in one place")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Tested-by: Nick Desaulniers <ndesaulniers@google.com>
      Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
      Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Dmitry Safonov <0x7f454c46@gmail.com>
      Cc: Brian Geffon <bgeffon@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>