  1. Feb 22, 2018
    • Mike Rapoport's avatar
      mm/swap.c: make functions and their kernel-doc agree (again) · cb6f0f34
      Mike Rapoport authored
      There was a conflict between the commit e02a9f04 ("mm/swap.c: make
      functions and their kernel-doc agree") and the commit f144c390 ("mm:
      docs: fix parameter names mismatch") that both tried to fix the mismatch
      between pagevec_lookup_entries() parameter names and their description.
      
      Since nr_entries is a better name for the parameter, fix the description
      again.
      
      Link: http://lkml.kernel.org/r/1518116946-20947-1-git-send-email-rppt@linux.vnet.ibm.com
      
      
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: Randy Dunlap <rdunlap@infradead.org>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cb6f0f34
    • Mike Rapoport's avatar
      14fec9eb
    • Rasmus Villemoes's avatar
      ida: do zeroing in ida_pre_get() · b1a8a7a7
      Rasmus Villemoes authored
      As far as I can tell, the only place the per-cpu ida_bitmap is populated
      is in ida_pre_get.  The pre-allocated element is stolen in two places in
      ida_get_new_above, in both cases immediately followed by a memset(0).
      
      Since ida_get_new_above is called with locks held, do the zeroing in
      ida_pre_get, or rather let kmalloc() do it.  Also, apparently gcc
      generates ~44 bytes of code to do a memset(, 0, 128):
      
        $ scripts/bloat-o-meter vmlinux.{0,1}
        add/remove: 0/0 grow/shrink: 2/1 up/down: 5/-88 (-83)
        Function                                     old     new   delta
        ida_pre_get                                  115     119      +4
        vermagic                                      27      28      +1
        ida_get_new_above                            715     627     -88
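
The idea of moving the zeroing out of the locked path can be sketched in user-space C. This is only an illustration under stated assumptions: `sketch_prealloc`, `sketch_pre_get()` and `sketch_steal()` are hypothetical names, `calloc()` stands in for a zeroing kmalloc(), and a single global slot stands in for the per-cpu pointer.

```c
#include <stdlib.h>

#define SKETCH_BITMAP_BYTES 128	/* size quoted in the memset(, 0, 128) remark */

/* Hypothetical single preallocation slot; the kernel uses a per-cpu pointer. */
static unsigned char *sketch_prealloc;

/* Runs without locks held: allocate memory that is already zeroed.
 * Returns 1 if a preallocated buffer is available, 0 on allocation failure. */
static int sketch_pre_get(void)
{
	if (!sketch_prealloc)
		sketch_prealloc = calloc(1, SKETCH_BITMAP_BYTES);
	return sketch_prealloc != NULL;
}

/* Runs with the caller's lock held: take the buffer, no memset() needed. */
static unsigned char *sketch_steal(void)
{
	unsigned char *bitmap = sketch_prealloc;

	sketch_prealloc = NULL;
	return bitmap;
}
```

The locked path is reduced to a pointer swap, which is the effect the size reduction of ida_get_new_above in the table above reflects.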
      
      Link: http://lkml.kernel.org/r/20180108225634.15340-1-linux@rasmusvillemoes.dk
      
      
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Acked-by: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b1a8a7a7
    • Huang Ying's avatar
      mm, swap, frontswap: fix THP swap if frontswap enabled · 7ba71669
      Huang Ying authored
      It was reported by Sergey Senozhatsky that if THP (Transparent Huge
      Page) and frontswap (via zswap) are both enabled, when memory goes low
      so that swap is triggered, segfault and memory corruption will occur in
      random user space applications as follows:
      
      kernel: urxvt[338]: segfault at 20 ip 00007fc08889ae0d sp 00007ffc73a7fc40 error 6 in libc-2.26.so[7fc08881a000+1ae000]
       #0  0x00007fc08889ae0d _int_malloc (libc.so.6)
       #1  0x00007fc08889c2f3 malloc (libc.so.6)
       #2  0x0000560e6004bff7 _Z14rxvt_wcstoutf8PKwi (urxvt)
       #3  0x0000560e6005e75c n/a (urxvt)
       #4  0x0000560e6007d9f1 _ZN16rxvt_perl_interp6invokeEP9rxvt_term9hook_typez (urxvt)
       #5  0x0000560e6003d988 _ZN9rxvt_term9cmd_parseEv (urxvt)
       #6  0x0000560e60042804 _ZN9rxvt_term6pty_cbERN2ev2ioEi (urxvt)
       #7  0x0000560e6005c10f _Z17ev_invoke_pendingv (urxvt)
       #8  0x0000560e6005cb55 ev_run (urxvt)
       #9  0x0000560e6003b9b9 main (urxvt)
       #10 0x00007fc08883af4a __libc_start_main (libc.so.6)
       #11 0x0000560e6003f9da _start (urxvt)
      
      After bisection, it was found the first bad commit is bd4c82c2 ("mm,
      THP, swap: delay splitting THP after swapped out").
      
      The root cause is as follows:
      
      When the pages are written to the swap device during swap-out in
      swap_writepage(), zswap (frontswap) first tries to compress the pages
      to improve performance.  But zswap (frontswap) treats a THP as a
      normal page, so only the head page is saved.  After swapping in, the
      tail pages are not restored to their original contents, causing memory
      corruption in the applications.
      
      This is fixed by refusing to save the page in the frontswap store
      functions if the page is a THP, so that the THP is swapped out to the
      swap device instead.
      
      Another choice is to split THP if frontswap is enabled.  But it is found
      that the frontswap enabling isn't flexible.  For example, if
      CONFIG_ZSWAP=y (cannot be module), frontswap will be enabled even if
      zswap itself isn't enabled.
      
      Frontswap has multiple backends.  To make it easy for one backend to
      enable THP support, the THP check is put in the backend frontswap store
      functions instead of the general interfaces.
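
A rough sketch of a backend store function that refuses huge pages follows. It is not the kernel code: `struct sketch_page` and `sketch_backend_store()` are hypothetical, and the return values are simplified to 0 (stored) and -1 (refused, so the caller falls back to the swap device).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified page descriptor; not the kernel's struct page. */
struct sketch_page {
	bool is_huge;		/* true for a transparent huge page */
	size_t nr_subpages;	/* 1 for a normal page */
};

/* Returns 0 if the backend kept the page, -1 if it refused.
 * A refusal tells the caller to write the page to the swap device. */
static int sketch_backend_store(const struct sketch_page *page)
{
	/* Storing a THP here would save only the head page, so refuse it. */
	if (page->is_huge)
		return -1;

	/* ... a real backend would compress and keep the single page ... */
	return 0;
}
```

Keeping the check inside the backend means a backend that later learns to handle huge pages can drop its own check without touching the general interface.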
      
      Link: http://lkml.kernel.org/r/20180209084947.22749-1-ying.huang@intel.com
      Fixes: bd4c82c2 ("mm, THP, swap: delay splitting THP after swapped out")
      Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
      Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Suggested-by: Minchan Kim <minchan@kernel.org>	[put THP checking in backend]
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: <stable@vger.kernel.org>	[4.14]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7ba71669
    • Andi Kleen's avatar
    • David Rientjes's avatar
      kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE · 88913bd8
      David Rientjes authored
      chan->n_subbufs is set by the user and relay_create_buf() does a kmalloc()
      of chan->n_subbufs * sizeof(size_t *).
      
      kmalloc_slab() will generate a warning when this fails if
      chan->n_subbufs * sizeof(size_t *) > KMALLOC_MAX_SIZE.
      
      Limit chan->n_subbufs to the maximum allowed kmalloc() size.
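
The bound can be illustrated in user-space C. This is a sketch under assumptions: `SKETCH_ALLOC_MAX` is a hypothetical stand-in for KMALLOC_MAX_SIZE (whose real value depends on the architecture and configuration), `calloc()` stands in for kmalloc(), and `sketch_alloc_padding()` is not a kernel function.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's KMALLOC_MAX_SIZE. */
#define SKETCH_ALLOC_MAX ((size_t)4 * 1024 * 1024)

/* Allocate a zeroed array of n_subbufs pointer-sized slots, or return NULL
 * if the request is empty or would exceed the allocator limit. Dividing the
 * limit by the element size avoids overflowing the multiplication. */
static size_t **sketch_alloc_padding(size_t n_subbufs)
{
	if (n_subbufs == 0 || n_subbufs > SKETCH_ALLOC_MAX / sizeof(size_t *))
		return NULL;

	return calloc(n_subbufs, sizeof(size_t *));
}
```

Checking the count before allocating turns an allocator warning into an ordinary failure that the caller can report.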
      
      Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1802061216100.122576@chino.kir.corp.google.com
      Fixes: f6302f1b ("relay: prevent integer overflow in relay_open()")
      Signed-off-by: David Rientjes <rientjes@google.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      88913bd8
    • Shakeel Butt's avatar
      mm, mlock, vmscan: no more skipping pagevecs · 9c4e6b1a
      Shakeel Butt authored
      When a thread mlocks an address space backed either by file pages which
      are currently not present in memory or swapped out anon pages (not in
      swapcache), a new page is allocated and added to the local pagevec
      (lru_add_pvec), I/O is triggered and the thread then sleeps on the page.
      On I/O completion, the thread can wake on a different CPU, the mlock
      syscall will then set the PageMlocked() bit of the page but will not be
      able to put that page in unevictable LRU as the page is on the pagevec
      of a different CPU.  Even on drain, that page will go to evictable LRU
      because the PageMlocked() bit is not checked on pagevec drain.
      
      The page will eventually go to right LRU on reclaim but the LRU stats
      will remain skewed for a long time.
      
      This patch puts all the pages, even unevictable ones, on the pagevecs;
      on drain, each page is added to the correct LRU after checking its
      evictability.  This resolves the issue of mlocked pages sitting on the
      pagevecs of other CPUs, because when those pagevecs are drained, the
      mlocked file pages go to the unevictable LRU.  It also makes the race
      with munlock easier to resolve because the pagevec drains happen under
      the LRU lock.
      
      However, there is still one place that makes a page evictable and does
      a PageLRU check on that page without the LRU lock, and it needs special
      attention: TestClearPageMlocked() and isolate_lru_page() in
      clear_page_mlock().
      
      	#0: __pagevec_lru_add_fn	#1: clear_page_mlock
      
      	SetPageLRU()			if (!TestClearPageMlocked())
      					  return
      	smp_mb() // <--required
      					// inside does PageLRU
      	if (!PageMlocked())		if (isolate_lru_page())
      	  move to evictable LRU		  putback_lru_page()
      	else
      	  move to unevictable LRU
      
      In '#1', TestClearPageMlocked() provides full memory barrier semantics
      and thus the PageLRU check (inside isolate_lru_page) can not be
      reordered before it.
      
      In '#0', without explicit memory barrier, the PageMlocked() check can be
      reordered before SetPageLRU().  If that happens, '#0' can put a page in
      unevictable LRU and '#1' might have just cleared the Mlocked bit of that
      page but fails to isolate as PageLRU fails as '#0' still hasn't set
      PageLRU bit of that page.  That page will be stranded on the unevictable
      LRU.
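
The ordering requirement between '#0' and '#1' can be modelled with C11 atomics. This is a sketch, not the kernel code: the two flags stand in for the PageLRU and PageMlocked bits, `atomic_thread_fence(memory_order_seq_cst)` plays the role of smp_mb(), and all `sketch_` names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the two page flags involved in the race. */
struct sketch_page {
	atomic_bool lru;	/* stands in for PageLRU */
	atomic_bool mlocked;	/* stands in for PageMlocked */
};

enum sketch_list { SKETCH_EVICTABLE, SKETCH_UNEVICTABLE };

/* '#0': publish the LRU flag, then decide which list the page goes on.
 * The fence keeps the mlocked load from being ordered before the store. */
static enum sketch_list sketch_lru_add(struct sketch_page *page)
{
	atomic_store_explicit(&page->lru, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	if (!atomic_load_explicit(&page->mlocked, memory_order_relaxed))
		return SKETCH_EVICTABLE;
	return SKETCH_UNEVICTABLE;
}

/* '#1': clear the mlocked flag with a fully ordered read-modify-write,
 * then try to isolate the page. Returns true only if the page was mlocked
 * and its LRU flag was already visible. */
static bool sketch_clear_mlock(struct sketch_page *page)
{
	if (!atomic_exchange(&page->mlocked, false))
		return false;
	return atomic_exchange(&page->lru, false);
}
```

With the fence in place, at least one side observes the other's write: either '#0' sees the cleared mlocked flag and picks the evictable list, or '#1' sees the LRU flag and can isolate the page.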
      
      There is one (good) side effect though.  Without this patch, the pages
      allocated for System V shared memory segment are added to evictable LRUs
      even after shmctl(SHM_LOCK) on that segment.  This patch will correctly
      put such pages to unevictable LRU.
      
      Link: http://lkml.kernel.org/r/20171121211241.18877-1-shakeelb@google.com
      
      
      Signed-off-by: Shakeel Butt <shakeelb@google.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Shaohua Li <shli@fb.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9c4e6b1a
    • Johannes Weiner's avatar
      mm: memcontrol: fix NR_WRITEBACK leak in memcg and system stats · c3cc3911
      Johannes Weiner authored
      After commit a983b5eb ("mm: memcontrol: fix excessive complexity in
      memory.stat reporting"), we observed slowly upward creeping NR_WRITEBACK
      counts over the course of several days, both the per-memcg stats as well
      as the system counter in e.g.  /proc/meminfo.
      
      The conversion from full per-cpu stat counts to per-cpu cached atomic
      stat counts introduced an irq-unsafe RMW operation into the updates.
      
      Most stat updates come from process context, but one notable exception
      is the NR_WRITEBACK counter.  While writebacks are issued from process
      context, they are retired from (soft)irq context.
      
      When writeback completions interrupt the RMW counter updates of new
      writebacks being issued, the decs from the completions are lost.
      
      Since the global updates are routed through the joint lruvec API, both
      the memcg counters as well as the system counters are affected.
      
      This patch makes the joint stat and event API irq safe.
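
The lost-decrement effect can be reproduced deterministically in plain C by calling the "interrupt handler" by hand in the middle of the read-modify-write. Everything here is a hypothetical model: a single global stands in for one CPU's cached counter, and the function-pointer argument stands in for a writeback completion arriving from (soft)irq context.

```c
/* Hypothetical stand-in for one CPU's cached counter. */
static long sketch_nr_writeback;

/* "Interrupt handler": a writeback completion decrements the counter. */
static void sketch_completion_irq(void)
{
	sketch_nr_writeback--;
}

/* irq-unsafe increment: the handler fires between the read and the write,
 * so writing back tmp + 1 discards the handler's decrement. */
static void sketch_inc_unsafe(void (*irq)(void))
{
	long tmp = sketch_nr_writeback;

	if (irq)
		irq();
	sketch_nr_writeback = tmp + 1;
}

/* irq-safe increment: the handler is held off until the update is complete,
 * modelling an update performed with interrupts disabled. */
static void sketch_inc_safe(void (*irq)(void))
{
	sketch_nr_writeback++;
	if (irq)
		irq();
}
```

Starting from the same value, the unsafe variant ends one higher than the safe variant after one issue and one completion, which matches the slowly growing counts described above.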
      
      Link: http://lkml.kernel.org/r/20180203082353.17284-1-hannes@cmpxchg.org
      Fixes: a983b5eb ("mm: memcontrol: fix excessive complexity in memory.stat reporting")
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Debugged-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Rik van Riel <riel@surriel.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c3cc3911
    • Arnd Bergmann's avatar
      Kbuild: always define endianess in kconfig.h · 101110f6
      Arnd Bergmann authored
      Build testing with LTO found a couple of files that get compiled
      differently depending on whether asm/byteorder.h gets included early
      enough or not.  In particular, include/asm-generic/qrwlock_types.h is
      affected by this, but there are probably others as well.
      
      The symptom is a series of LTO link time warnings, including these:
      
          net/netlabel/netlabel_unlabeled.h:223: error: type of 'netlbl_unlhsh_add' does not match original declaration [-Werror=lto-type-mismatch]
           int netlbl_unlhsh_add(struct net *net,
          net/netlabel/netlabel_unlabeled.c:377: note: 'netlbl_unlhsh_add' was previously declared here
      
          include/net/ipv6.h:360: error: type of 'ipv6_renew_options_kern' does not match original declaration [-Werror=lto-type-mismatch]
           ipv6_renew_options_kern(struct sock *sk,
          net/ipv6/exthdrs.c:1162: note: 'ipv6_renew_options_kern' was previously declared here
      
          net/core/dev.c:761: note: 'dev_get_by_name_rcu' was previously declared here
           struct net_device *dev_get_by_name_rcu(struct net *net, const char *name)
          net/core/dev.c:761: note: code may be misoptimized unless -fno-strict-aliasing is used
      
          drivers/gpu/drm/i915/i915_drv.h:3377: error: type of 'i915_gem_object_set_to_wc_domain' does not match original declaration [-Werror=lto-type-mismatch]
           i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write);
          drivers/gpu/drm/i915/i915_gem.c:3639: note: 'i915_gem_object_set_to_wc_domain' was previously declared here
      
          include/linux/debugfs.h:92:9: error: type of 'debugfs_attr_read' does not match original declaration [-Werror=lto-type-mismatch]
           ssize_t debugfs_attr_read(struct file *file, char __user *buf,
          fs/debugfs/file.c:318: note: 'debugfs_attr_read' was previously declared here
      
          include/linux/rwlock_api_smp.h:30: error: type of '_raw_read_unlock' does not match original declaration [-Werror=lto-type-mismatch]
           void __lockfunc _raw_read_unlock(rwlock_t *lock) __releases(lock);
          kernel/locking/spinlock.c:246:26: note: '_raw_read_unlock' was previously declared here
      
          include/linux/fs.h:3308:5: error: type of 'simple_attr_open' does not match original declaration [-Werror=lto-type-mismatch]
           int simple_attr_open(struct inode *inode, struct file *file,
          fs/libfs.c:795: note: 'simple_attr_open' was previously declared here
      
      All of the above are caused by include/asm-generic/qrwlock_types.h
      failing to include asm/byteorder.h after commit e0d02285
      ("locking/qrwlock: Use 'struct qrwlock' instead of 'struct __qrwlock'")
      in linux-4.15.
      
      Similar bugs may or may not exist in older kernels as well, but there is
      no easy way to test those with link-time optimizations, and kernels
      before 4.14 are harder to fix because they don't have Babu's patch
      series.
      
      We had similar issues with CONFIG_ symbols in the past and ended up
      always including the configuration headers through linux/kconfig.h.  This
      works around the issue through that same file, defining either
      __BIG_ENDIAN or __LITTLE_ENDIAN depending on CONFIG_CPU_BIG_ENDIAN,
      which is now always set on all architectures since commit 4c97a0c8
      ("arch: define CPU_BIG_ENDIAN for all fixed big endian archs").
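
The approach of deriving the byte-order macro from a configuration symbol in one always-included header can be sketched as follows. All names carry a hypothetical `SKETCH_` prefix so the block does not clash with a C library's own byte-order macros, and the struct is an illustrative layout, not the kernel's qrwlock type.

```c
#include <stddef.h>

/* A build system would define SKETCH_CONFIG_CPU_BIG_ENDIAN for big-endian
 * targets; it is left undefined here, so the little-endian branch is used. */
#ifdef SKETCH_CONFIG_CPU_BIG_ENDIAN
#define SKETCH_BIG_ENDIAN 4321
#else
#define SKETCH_LITTLE_ENDIAN 1234
#endif

/* A type whose field order depends on byte order. Because the macro above
 * is defined before any other header is seen, every translation unit gets
 * the same layout. */
struct sketch_lock_word {
#ifdef SKETCH_LITTLE_ENDIAN
	unsigned char wlocked;
	unsigned char pad[3];
#else
	unsigned char pad[3];
	unsigned char wlocked;
#endif
};

/* Report where the wlocked byte landed, for inspection. */
static size_t sketch_wlocked_offset(void)
{
	return offsetof(struct sketch_lock_word, wlocked);
}
```

If the macro were instead defined only by a header that some files include late or not at all, the two `#ifdef` branches could be chosen differently per file, which is the kind of mismatch the LTO warnings above report.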
      
      Link: http://lkml.kernel.org/r/20180202154104.1522809-2-arnd@arndb.de
      
      
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Babu Moger <babu.moger@amd.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Nicolas Pitre <nico@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      101110f6
    • Andrew Morton's avatar
      include/linux/sched/mm.h: re-inline mmdrop() · d34bc48f
      Andrew Morton authored
      As Peter points out, doing a CALL+RET for just the decrement is a bit silly.
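
The shape of an inlined fast path with an out-of-line slow path can be sketched with C11 atomics. This is an illustration, not the kernel source: `struct sketch_mm`, `sketch_mmdrop()` and `sketch_mmdrop_slow()` are hypothetical names, and the slow path only records that it ran.

```c
#include <stdatomic.h>

/* Hypothetical reference-counted object. */
struct sketch_mm {
	atomic_int count;
	int freed;	/* set by the slow path, for demonstration only */
};

/* Out-of-line slow path: reached only when the last reference is dropped. */
void sketch_mmdrop_slow(struct sketch_mm *mm)
{
	mm->freed = 1;
}

/* Inline fast path: the common case is a single atomic decrement with no
 * function call; the call happens only when the count reaches zero. */
static inline void sketch_mmdrop(struct sketch_mm *mm)
{
	if (atomic_fetch_sub(&mm->count, 1) == 1)
		sketch_mmdrop_slow(mm);
}
```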
      
      Fixes: d70f2a14 ("include/linux/sched/mm.h: uninline mmdrop_async(), etc")
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d34bc48f
    • Martin Kelly's avatar
      tools: fix cross-compile var clobbering · 7ed1c190
      Martin Kelly authored
      Currently a number of Makefiles break when used with toolchains that
      pass extra flags in CC and other cross-compile related variables (such
      as --sysroot).
      
      Thus we get this error when we use a toolchain that puts --sysroot in
      the CC var:
      
        ~/src/linux/tools$ make iio
        [snip]
        iio_event_monitor.c:18:10: fatal error: unistd.h: No such file or directory
          #include <unistd.h>
                   ^~~~~~~~~~
      
      This occurs because we clobber several env vars related to
      cross-compiling with lines like this:
      
        CC = $(CROSS_COMPILE)gcc
      
      Although this will point to a valid cross-compiler, we lose any extra
      flags that might exist in the CC variable, which can break toolchains
      that rely on them (for example, those that use --sysroot).
      
      This easily shows up using a Yocto SDK:
      
        $ . [snip]/sdk/environment-setup-cortexa8hf-neon-poky-linux-gnueabi
      
        $ echo $CC
        arm-poky-linux-gnueabi-gcc -march=armv7-a -mfpu=neon -mfloat-abi=hard
        -mcpu=cortex-a8
        --sysroot=[snip]/sdk/sysroots/cortexa8hf-neon-poky-linux-gnueabi
      
        $ echo $CROSS_COMPILE
        arm-poky-linux-gnueabi-
      
        $ echo ${CROSS_COMPILE}gcc
        arm-poky-linux-gnueabi-gcc
      
      Although arm-poky-linux-gnueabi-gcc is a cross-compiler, we've lost the
      --sysroot and other flags that enable us to find the right libraries to
      link against, so we can't find unistd.h and other libraries and headers.
      Normally with the --sysroot flag we would find unistd.h in the sdk
      directory in the sysroot:
      
        $ find [snip]/sdk/sysroots -path '*/usr/include/unistd.h'
        [snip]/sdk/sysroots/cortexa8hf-neon-poky-linux-gnueabi/usr/include/unistd.h
      
      The perf Makefile adds CC = $(CROSS_COMPILE)gcc if and only if CC is not
      already set, and it compiles correctly with the above toolchain.
      
      So, generalize the logic that perf uses in the common Makefile and
      remove the manual CC = $(CROSS_COMPILE)gcc lines from each Makefile.
      
      Note that this patch does not fix cross-compile for all the tools (some
      have other bugs), but it does fix it for all except usb and acpi, which
      still have other unrelated issues.
      
      I tested both with and without the patch on native and cross-build and
      there appear to be no regressions.
      
      Link: http://lkml.kernel.org/r/20180107214028.23771-1-martin@martingkelly.com
      
      
      Signed-off-by: Martin Kelly <martin@martingkelly.com>
      Acked-by: Mark Brown <broonie@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Jonathan Cameron <jic23@kernel.org>
      Cc: Pali Rohar <pali.rohar@gmail.com>
      Cc: Richard Purdie <rpurdie@rpsys.net>
      Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Robert Moore <robert.moore@intel.com>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Valentina Manea <valentina.manea.m@gmail.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Mario Limonciello <mario.limonciello@dell.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7ed1c190
  2. Feb 21, 2018
  3. Feb 20, 2018
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 79c0ef3e
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Prevent index integer overflow in ptr_ring, from Jason Wang.
      
       2) Program mvpp2 multicast filter properly, from Mikulas Patocka.
      
       3) The bridge brport attribute file is write only and doesn't have a
          ->show() method, don't blindly invoke it. From Xin Long.
      
       4) Inverted mask used in genphy_setup_forced(), from Ingo van Lil.
      
       5) Fix multiple definition issue with if_ether.h UAPI header, from
          Hauke Mehrtens.
      
       6) Fix GFP_KERNEL usage in atomic in RDS protocol code, from Sowmini
          Varadhan.
      
       7) Revert XDP redirect support from thunderx driver, it is not
          implemented properly. From Jesper Dangaard Brouer.
      
       8) Fix missing RTNL protection across some tipc operations, from Ying
          Xue.
      
       9) Return the correct IV bytes in the TLS getsockopt code, from Boris
          Pismenny.
      
      10) Take tclassid into consideration properly when doing FIB rule
          matching. From Stefano Brivio.
      
      11) cxgb4 device needs more PCI VPD quirks, from Casey Leedom.
      
      12) TUN driver doesn't align frags properly, and we can end up doing
          unaligned atomics on misaligned metadata. From Eric Dumazet.
      
      13) Fix various crashes found using DEBUG_PREEMPT in rmnet driver, from
          Subash Abhinov Kasiviswanathan.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (56 commits)
        tg3: APE heartbeat changes
        mlxsw: spectrum_router: Do not unconditionally clear route offload indication
        net: qualcomm: rmnet: Fix possible null dereference in command processing
        net: qualcomm: rmnet: Fix warning seen with 64 bit stats
        net: qualcomm: rmnet: Fix crash on real dev unregistration
        sctp: remove the left unnecessary check for chunk in sctp_renege_events
        rxrpc: Work around usercopy check
        tun: fix tun_napi_alloc_frags() frag allocator
        udplite: fix partial checksum initialization
        skbuff: Fix comment mis-spelling.
        dn_getsockoptdecnet: move nf_{get/set}sockopt outside sock lock
        PCI/cxgb4: Extend T3 PCI quirk to T4+ devices
        cxgb4: fix trailing zero in CIM LA dump
        cxgb4: free up resources of pf 0-3
        fib_semantics: Don't match route with mismatching tclassid
        NFC: llcp: Limit size of SDP URI
        tls: getsockopt return record sequence number
        tls: reset the crypto info if copy_from_user fails
        tls: retrun the correct IV in getsockopt
        docs: segmentation-offloads.txt: add SCTP info
        ...
      79c0ef3e
    • Jacek Anaszewski's avatar
      MAINTAINERS: Remove Richard Purdie from LED maintainers · a988681d
      Jacek Anaszewski authored
      
      
      Richard has been inactive on the linux-leds list for a long time.
      After email discussion we agreed on removing him from
      the LED maintainers, which will better reflect the actual status.
      
      Acked-by: Richard Purdie <rpurdie@rpsys.net>
      Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
      a988681d
    • Prashant Sreedharan's avatar
      tg3: APE heartbeat changes · 506b0a39
      Prashant Sreedharan authored
      
      
      In the case of an ungraceful host shutdown or a driver crash, BMC
      connectivity is lost because the APE firmware does not learn the driver
      state it needs to keep the BMC connectivity alive.
      This patch makes the following changes to address the issue.
      
      Add a heartbeat mechanism with the APE firmware. The heartbeat is
      needed to notify the APE firmware about the driver state.
      
      Increase the wait time for an APE event from 1ms to 20ms, as there can
      be some delay in getting a response.
      
      v2: Drop inline keyword as per David's suggestion.
      
      Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
      Signed-off-by: Satish Baddipadige <satish.baddipadige@broadcom.com>
      Signed-off-by: Siva Reddy Kallam <siva.kallam@broadcom.com>
      Acked-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      506b0a39
    • Ido Schimmel's avatar
      mlxsw: spectrum_router: Do not unconditionally clear route offload indication · d1c95af3
      Ido Schimmel authored
      When mlxsw replaces (or deletes) a route it removes the offload
      indication from the replaced route. This is problematic for IPv4 routes,
      as the offload indication is stored in the fib_info which is usually
      shared between multiple routes.
      
      Instead of unconditionally clearing the offload indication, only clear
      it if no other route is using the fib_info.
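
One way to picture "only clear it if no other route is using the fib_info" is a small use counter on the shared object. The following is a sketch under assumptions: `struct sketch_fib_info` and both helpers are hypothetical, and the driver's actual bookkeeping may differ.

```c
#include <stdbool.h>

/* Hypothetical shared object referenced by several routes. */
struct sketch_fib_info {
	unsigned int offload_users;	/* routes currently offloaded through it */
	bool offloaded;			/* indication visible to user space */
};

/* A route using this fib_info was offloaded. */
static void sketch_offload_set(struct sketch_fib_info *fi)
{
	fi->offload_users++;
	fi->offloaded = true;
}

/* A route using this fib_info was replaced or deleted. The indication is
 * cleared only when the last offloaded user goes away. */
static void sketch_offload_unset(struct sketch_fib_info *fi)
{
	if (fi->offload_users > 0 && --fi->offload_users == 0)
		fi->offloaded = false;
}
```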
      
      Fixes: 3984d1a8 ("mlxsw: spectrum_router: Provide offload indication using nexthop flags")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reported-by: Alexander Petrovskiy <alexpe@mellanox.com>
      Tested-by: Alexander Petrovskiy <alexpe@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d1c95af3
    • David S. Miller's avatar
      Merge branch 'qualcomm-rmnet-Fix-issues-with-CONFIG_DEBUG_PREEMPT-enabled' · cae69256
      David S. Miller authored
      
      
      Subash Abhinov Kasiviswanathan says:
      
      ====================
      net: qualcomm: rmnet: Fix issues with CONFIG_DEBUG_PREEMPT enabled
      
      Patches 1 and 2 fix issues identified when CONFIG_DEBUG_PREEMPT was
      enabled. These involve APIs which were called in invalid contexts.
      
      Patch 3 is a null dereference fix identified by code inspection.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cae69256
    • Subash Abhinov Kasiviswanathan's avatar
      net: qualcomm: rmnet: Fix possible null dereference in command processing · f57bbaae
      Subash Abhinov Kasiviswanathan authored
      If a command packet with invalid mux id is received, the packet would
      not have a valid endpoint. This invalid endpoint may be dereferenced
      leading to a crash. Identified by manual code inspection.
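
The kind of check involved can be sketched as a lookup that may return NULL, followed by an early return. The table, the types and the function names below are hypothetical and much simpler than the driver's hlist-based lookup.

```c
#include <stddef.h>

#define SKETCH_MAX_MUX 4	/* hypothetical table size */

/* Hypothetical endpoint state. */
struct sketch_endpoint {
	int flow_enabled;
};

static struct sketch_endpoint *sketch_table[SKETCH_MAX_MUX];

/* Returns the endpoint for mux_id, or NULL if none is configured. */
static struct sketch_endpoint *sketch_lookup(unsigned int mux_id)
{
	return mux_id < SKETCH_MAX_MUX ? sketch_table[mux_id] : NULL;
}

/* Handle a command packet. Returns 0 on success, -1 if the mux id has no
 * endpoint, in which case nothing is dereferenced. */
static int sketch_handle_command(unsigned int mux_id)
{
	struct sketch_endpoint *ep = sketch_lookup(mux_id);

	if (!ep)
		return -1;

	ep->flow_enabled = 1;
	return 0;
}
```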
      
      Fixes: 3352e6c4 ("net: qualcomm: rmnet: Convert the muxed endpoint to hlist")
      Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f57bbaae
    • Subash Abhinov Kasiviswanathan's avatar
      net: qualcomm: rmnet: Fix warning seen with 64 bit stats · 4dba8bbc
      Subash Abhinov Kasiviswanathan authored
      With CONFIG_DEBUG_PREEMPT enabled, a warning was seen on device
      creation. This occurs due to the incorrect cpu API usage in
      ndo_get_stats64 handler.
      
      BUG: using smp_processor_id() in preemptible [00000000] code: rmnetcli/5743
      caller is debug_smp_processor_id+0x1c/0x24
      Call trace:
      [<ffffff9d48c8967c>] dump_backtrace+0x0/0x2a8
      [<ffffff9d48c89bbc>] show_stack+0x20/0x28
      [<ffffff9d4901fff8>] dump_stack+0xa8/0xe0
      [<ffffff9d490421e0>] check_preemption_disabled+0x104/0x108
      [<ffffff9d49042200>] debug_smp_processor_id+0x1c/0x24
      [<ffffff9d494a36b0>] rmnet_get_stats64+0x64/0x13c
      [<ffffff9d49b014e0>] dev_get_stats+0x68/0xd8
      [<ffffff9d49d58df8>] rtnl_fill_stats+0x54/0x140
      [<ffffff9d49b1f0b8>] rtnl_fill_ifinfo+0x428/0x9cc
      [<ffffff9d49b23834>] rtmsg_ifinfo_build_skb+0x80/0xf4
      [<ffffff9d49b23930>] rtnetlink_event+0x88/0xb4
      [<ffffff9d48cd21b4>] raw_notifier_call_chain+0x58/0x78
      [<ffffff9d49b028a4>] call_netdevice_notifiers_info+0x48/0x78
      [<ffffff9d49b08bf8>] __netdev_upper_dev_link+0x290/0x5e8
      [<ffffff9d49b08fcc>] netdev_master_upper_dev_link+0x3c/0x48
      [<ffffff9d494a2e74>] rmnet_newlink+0xf0/0x1c8
      [<ffffff9d49b23360>] rtnl_newlink+0x57c/0x6c8
      [<ffffff9d49b2355c>] rtnetlink_rcv_msg+0xb0/0x244
      [<ffffff9d49b5230c>] netlink_rcv_skb+0xb4/0xdc
      [<ffffff9d49b204f4>] rtnetlink_rcv+0x34/0x44
      [<ffffff9d49b51af0>] netlink_unicast+0x1ec/0x294
      [<ffffff9d49b51fdc>] netlink_sendmsg+0x320/0x390
      [<ffffff9d49ae6858>] sock_sendmsg+0x54/0x60
      [<ffffff9d49ae91bc>] SyS_sendto+0x1a0/0x1e4
      [<ffffff9d48c83770>] el0_svc_naked+0x24/0x28
      
      Fixes: 192c4b5d ("net: qualcomm: rmnet: Add support for 64 bit stats")
      Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4dba8bbc
    • Subash Abhinov Kasiviswanathan's avatar
      net: qualcomm: rmnet: Fix crash on real dev unregistration · b37f78f2
      Subash Abhinov Kasiviswanathan authored
      With CONFIG_DEBUG_PREEMPT enabled, a crash with the following call
      stack was observed when removing a real dev which had rmnet devices
      attached to it.
      To fix this, remove the netdev_upper link APIs and instead use the
      existing information in rmnet_port and rmnet_priv to get the
      association between real and rmnet devs.
      
      BUG: sleeping function called from invalid context
      in_atomic(): 0, irqs_disabled(): 0, pid: 5762, name: ip
      Preemption disabled at:
      [<ffffff9d49043564>] debug_object_active_state+0xa4/0x16c
      Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
      Modules linked in:
      PC is at ___might_sleep+0x13c/0x180
      LR is at ___might_sleep+0x17c/0x180
      [<ffffff9d48ce0924>] ___might_sleep+0x13c/0x180
      [<ffffff9d48ce09c0>] __might_sleep+0x58/0x8c
      [<ffffff9d49d6253c>] mutex_lock+0x2c/0x48
      [<ffffff9d48ed4840>] kernfs_remove_by_name_ns+0x48/0xa8
      [<ffffff9d48ed6ec8>] sysfs_remove_link+0x30/0x58
      [<ffffff9d49b05840>] __netdev_adjacent_dev_remove+0x14c/0x1e0
      [<ffffff9d49b05914>] __netdev_adjacent_dev_unlink_lists+0x40/0x68
      [<ffffff9d49b08820>] netdev_upper_dev_unlink+0xb4/0x1fc
      [<ffffff9d494a29f0>] rmnet_dev_walk_unreg+0x6c/0xc8
      [<ffffff9d49b00b40>] netdev_walk_all_lower_dev_rcu+0x58/0xb4
      [<ffffff9d494a30fc>] rmnet_config_notify_cb+0xf4/0x134
      [<ffffff9d48cd21b4>] raw_notifier_call_chain+0x58/0x78
      [<ffffff9d49b028a4>] call_netdevice_notifiers_info+0x48/0x78
      [<ffffff9d49b0b568>] rollback_registered_many+0x230/0x3c8
      [<ffffff9d49b0b738>] unregister_netdevice_many+0x38/0x94
      [<ffffff9d49b1e110>] rtnl_delete_link+0x58/0x88
      [<ffffff9d49b201dc>] rtnl_dellink+0xbc/0x1cc
      [<ffffff9d49b2355c>] rtnetlink_rcv_msg+0xb0/0x244
      [<ffffff9d49b5230c>] netlink_rcv_skb+0xb4/0xdc
      [<ffffff9d49b204f4>] rtnetlink_rcv+0x34/0x44
      [<ffffff9d49b51af0>] netlink_unicast+0x1ec/0x294
      [<ffffff9d49b51fdc>] netlink_sendmsg+0x320/0x390
      [<ffffff9d49ae6858>] sock_sendmsg+0x54/0x60
      [<ffffff9d49ae6f94>] ___sys_sendmsg+0x298/0x2b0
      [<ffffff9d49ae98f8>] SyS_sendmsg+0xb4/0xf0
      [<ffffff9d48c83770>] el0_svc_naked+0x24/0x28
      
      Fixes: ceed73a2 ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
      Fixes: 60d58f97 ("net: qualcomm: rmnet: Implement bridge mode")
      Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b37f78f2
  4. Feb 19, 2018
    • Linus Torvalds's avatar
      Linux 4.16-rc2 · 91ab883e
      Linus Torvalds authored
      91ab883e
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 0e06fb5b
      Linus Torvalds authored
      Pull x86 Kconfig fixes from Thomas Gleixner:
       "Three patchlets to correct HIGHMEM64G and CMPXCHG64 dependencies in
        Kconfig when CPU selections are explicitly set to M586 or M686"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/Kconfig: Explicitly enumerate i686-class CPUs in Kconfig
        x86/Kconfig: Exclude i586-class CPUs lacking PAE support from the HIGHMEM64G Kconfig group
        x86/Kconfig: Add missing i586-class CPUs to the X86_CMPXCHG64 Kconfig group
      0e06fb5b
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9ca2c16f
      Linus Torvalds authored
      Pull perf updates from Thomas Gleixner:
       "Perf tool updates and kprobe fixes:
      
         - perf_mmap overwrite mode fixes/overhaul, prep work to get 'perf
           top' using it, making it bearable to use it in large core count
           systems such as Knights Landing/Mill Intel systems (Kan Liang)
      
         - s/390 now uses syscall.tbl, just like x86-64 to generate the
           syscall table id -> string tables used by 'perf trace' (Hendrik
           Brueckner)
      
         - Use strtoull() instead of home grown function (Andy Shevchenko)
      
         - Synchronize kernel ABI headers, v4.16-rc1 (Ingo Molnar)
      
         - Document missing 'perf data --force' option (Sangwon Hong)
      
         - Add perf vendor JSON metrics for ARM Cortex-A53 Processor (William
           Cohen)
      
         - Improve error handling and error propagation of ftrace based
           kprobes so failures when installing kprobes are not silently
           ignored and create dysfunctional tracepoints"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
        kprobes: Propagate error from disarm_kprobe_ftrace()
        kprobes: Propagate error from arm_kprobe_ftrace()
        Revert "tools include s390: Grab a copy of arch/s390/include/uapi/asm/unistd.h"
        perf s390: Rework system call table creation by using syscall.tbl
        perf s390: Grab a copy of arch/s390/kernel/syscall/syscall.tbl
        tools/headers: Synchronize kernel ABI headers, v4.16-rc1
        perf test: Fix test trace+probe_libc_inet_pton.sh for s390x
        perf data: Document missing --force option
        perf tools: Substitute yet another strtoull()
        perf top: Check the latency of perf_top__mmap_read()
        perf top: Switch default mode to overwrite mode
        perf top: Remove lost events checking
        perf hists browser: Add parameter to disable lost event warning
        perf top: Add overwrite fall back
        perf evsel: Expose the perf_missing_features struct
        perf top: Check per-event overwrite term
        perf mmap: Discard legacy interface for mmap read
        perf test: Update mmap read functions for backward-ring-buffer test
        perf mmap: Introduce perf_mmap__read_event()
        perf mmap: Introduce perf_mmap__read_done()
        ...
      9ca2c16f
    • Linus Torvalds's avatar
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 2d6c4e40
      Linus Torvalds authored
      Pull irq updates from Thomas Gleixner:
       "A small set of updates mostly for irq chip drivers:
      
         - MIPS GIC fix for spurious, masked interrupts
      
         - fix for a subtle IPI bug in GICv3
      
         - do not probe GICv3 ITSs that are marked as disabled
      
         - multi-MSI support for GICv2m
      
         - various small cleanups"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqdomain: Re-use DEFINE_SHOW_ATTRIBUTE() macro
        irqchip/bcm: Remove hashed address printing
        irqchip/gic-v2m: Add PCI Multi-MSI support
        irqchip/gic-v3: Ignore disabled ITS nodes
        irqchip/gic-v3: Use wmb() instead of smp_wmb() in gic_raise_softirq()
        irqchip/gic-v3: Change pr_debug message to pr_devel
        irqchip/mips-gic: Avoid spuriously handling masked interrupts
      2d6c4e40
    • Linus Torvalds's avatar
      Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 59e47215
      Linus Torvalds authored
      Pull core fix from Thomas Gleixner:
       "A small fix which adds the missing for_each_cpu_wrap() stub for the UP
        case to avoid build failures"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        cpumask: Make for_each_cpu_wrap() available on UP as well
      59e47215
  5. Feb 18, 2018
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20180217' of git://git.kernel.dk/linux-block · c786427f
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - NVMe pull request from Keith, with fixes all over the map for nvme.
         From various folks.
      
       - Classic polling fix, that avoids a latency issue where we still end
         up waiting for an interrupt in some cases. From Nitesh Shetty.
      
       - Comment typo fix from Minwoo Im.
      
      * tag 'for-linus-20180217' of git://git.kernel.dk/linux-block:
        block: fix a typo in comment of BLK_MQ_POLL_STATS_BKTS
        nvme-rdma: fix sysfs invoked reset_ctrl error flow
        nvmet: Change return code of discard command if not supported
        nvme-pci: Fix timeouts in connecting state
        nvme-pci: Remap CMB SQ entries on every controller reset
        nvme: fix the deadlock in nvme_update_formats
        blk: optimization for classic polling
        nvme: Don't use a stack buffer for keep-alive command
        nvme_fc: cleanup io completion
        nvme_fc: correct abort race condition on resets
        nvme: Fix discard buffer overrun
        nvme: delete NVME_CTRL_LIVE --> NVME_CTRL_CONNECTING transition
        nvme-rdma: use NVME_CTRL_CONNECTING state to mark init process
        nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING
      c786427f
    • Linus Torvalds's avatar
      Merge tag 'mmc-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · fa2139ef
      Linus Torvalds authored
      Pull MMC fixes from Ulf Hansson:
      
       - meson-gx: Revert to earlier tuning process
      
       - bcm2835: Don't overwrite max frequency unconditionally
      
      * tag 'mmc-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: bcm2835: Don't overwrite max frequency unconditionally
        Revert "mmc: meson-gx: include tx phase in the tuning process"
      fa2139ef
    • Linus Torvalds's avatar
      Merge tag 'mtd/fixes-for-4.16-rc2' of git://git.infradead.org/linux-mtd · 4b6415f9
      Linus Torvalds authored
      Pull mtd fixes from Boris Brezillon:
      
       - add missing dependency to NAND_MARVELL Kconfig entry
      
       - use the appropriate OOB layout in the VF610 driver
      
      * tag 'mtd/fixes-for-4.16-rc2' of git://git.infradead.org/linux-mtd:
        mtd: nand: MTD_NAND_MARVELL should depend on HAS_DMA
        mtd: nand: vf610: set correct ooblayout
      4b6415f9
    • Linus Torvalds's avatar
      Merge tag 'powerpc-4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · ee78ad78
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "The main attraction is a fix for a bug in the new drmem code, which
        was causing an oops on boot on some versions of Qemu.
      
        There's also a fix for XIVE (Power9 interrupt controller) on KVM, as
        well as a few other minor fixes.
      
        Thanks to: Corentin Labbe, Cyril Bur, Cédric Le Goater, Daniel Black,
        Nathan Fontenot, Nicholas Piggin"
      
      * tag 'powerpc-4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/pseries: Check for zero filled ibm,dynamic-memory property
        powerpc/pseries: Add empty update_numa_cpu_lookup_table() for NUMA=n
        powerpc/powernv: IMC fix out of bounds memory access at shutdown
        powerpc/xive: Use hw CPU ids when configuring the CPU queues
        powerpc: Expose TSCR via sysfs only on powernv
      ee78ad78
    • Linus Torvalds's avatar
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 74688a02
      Linus Torvalds authored
      Pull arm64 fixes from Catalin Marinas:
       "The bulk of this is the pte accessors annotation to READ/WRITE_ONCE
        (we tried to avoid pushing this during the merge window to avoid
        conflicts)
      
         - Updated the page table accessors to use READ/WRITE_ONCE and prevent
           compiler transformation that could lead to an apparent loss of
           coherency
      
         - Enabled branch predictor hardening for the Falkor CPU
      
         - Fix interaction between kpti enabling and KASan causing the
           recursive page table walking to take a significant time
      
         - Fix some sparse warnings"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: cputype: Silence Sparse warnings
        arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables
        arm64: proc: Set PTE_NG for table entries to avoid traversing them twice
        arm64: Add missing Falkor part number for branch predictor hardening
      74688a02
    • Linus Torvalds's avatar
      Merge tag 'for-linus-4.16a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · f73f047d
      Linus Torvalds authored
      Pull xen fixes from Juergen Gross:
      
       - fixes for the Xen pvcalls frontend driver
      
       - fix for booting Xen pv domains
      
       - fix for the xenbus driver user interface
      
      * tag 'for-linus-4.16a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        pvcalls-front: wait for other operations to return when release passive sockets
        pvcalls-front: introduce a per sock_mapping refcount
        x86/xen: Calculate __max_logical_packages on PV domains
        xenbus: track caller request id
      f73f047d
  6. Feb 17, 2018
    • Stefano Stabellini's avatar
      pvcalls-front: wait for other operations to return when release passive sockets · d1a75e08
      Stefano Stabellini authored
      
      
      Passive sockets can have ongoing operations on them; specifically, we
      have two wait_event_interruptible calls in pvcalls_front_accept.
      
      Add two wake_up calls in pvcalls_front_release, then wait for the
      potential waiters to return and release the sock_mapping refcount.
      
      Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      d1a75e08
    • Stefano Stabellini's avatar
      pvcalls-front: introduce a per sock_mapping refcount · 64d68718
      Stefano Stabellini authored
      
      
      Introduce a per sock_mapping refcount, in addition to the existing
      global refcount. Thanks to the sock_mapping refcount, we can safely wait
      for it to be 1 in pvcalls_front_release before freeing an active socket,
      instead of waiting for the global refcount to be 1.
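
The per-mapping wait described above can be sketched in user-space C. This is only an illustration: `toy_sock_mapping` and its helpers are hypothetical names, C11 atomics stand in for the kernel's atomic_t, and the sketch assumes the releasing path holds one reference of its own, so "refcount is 1" means no other operation is still using the mapping.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a sock_mapping; only the refcount is modelled. */
struct toy_sock_mapping {
    atomic_int refcount;
};

static void toy_map_init(struct toy_sock_mapping *map)
{
    atomic_init(&map->refcount, 0);
}

/* Every operation on the socket brackets its work with enter/exit. */
static void toy_map_enter(struct toy_sock_mapping *map)
{
    atomic_fetch_add(&map->refcount, 1);
}

static void toy_map_exit(struct toy_sock_mapping *map)
{
    atomic_fetch_sub(&map->refcount, 1);
}

/*
 * Called by the release path after it has taken its own reference.
 * True once the releaser is the only remaining user of this mapping.
 * Other mappings are not consulted, unlike with a single global count.
 */
static bool toy_map_only_releaser_left(struct toy_sock_mapping *map)
{
    return atomic_load(&map->refcount) == 1;
}
```

A release path would call toy_map_enter(), then poll toy_map_only_releaser_left() before freeing the mapping; activity on unrelated sockets no longer delays it.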
      
      Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      64d68718
    • Prarit Bhargava's avatar
      x86/xen: Calculate __max_logical_packages on PV domains · 63e708f8
      Prarit Bhargava authored
      The kernel panics on PV domains because native_smp_cpus_done() is
      only called for HVM domains.
      
      Calculate __max_logical_packages for PV domains.
      
      Fixes: b4c0a732 ("x86/smpboot: Fix __max_logical_packages estimate")
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      Tested-and-reported-by: Simon Gaiser <simon@invisiblethingslab.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: xen-devel@lists.xenproject.org
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      63e708f8
    • Joao Martins's avatar
      xenbus: track caller request id · 29fee6ee
      Joao Martins authored
      Commit fd8aa909 ("xen: optimize xenbus driver for multiple concurrent
      xenstore accesses") optimized concurrent xenbus accesses but in doing so
      broke the UABI of /dev/xen/xenbus. Through /dev/xen/xenbus, applications
      are in charge of exchanging xenbus messages with the correct header and
      body. Since that commit, the replies received by an application no
      longer have the header req_id echoed back as it was sent in the request
      (see the specification below for reference), because that field is
      overwritten by the kernel.
      
      struct xsd_sockmsg
      {
        uint32_t type;  /* XS_??? */
        uint32_t req_id;/* Request identifier, echoed in daemon's response.  */
        uint32_t tx_id; /* Transaction id (0 if not related to a transaction). */
        uint32_t len;   /* Length of data following this. */
      
        /* Generally followed by nul-terminated string(s). */
      };
      
      Before, there was only one request at a time, so req_id could simply be
      forwarded back and forth. To allow simultaneous requests we need a
      different req_id for each message, so the kernel keeps a monotonically
      increasing counter for this field and writes it on every request,
      irrespective of the userspace value.
      
      Forwarding the userspace req_id again is not a solution because it would
      open the possibility of a userspace-generated req_id colliding with
      kernel ones. So this patch instead takes another route, which is to
      artificially keep the user req_id while keeping the xenbus logic as is.
      We do that by saving the original req_id before xs_send(), using the
      private kernel counter as req_id, and then, once the reply has arrived
      and been validated, restoring the original req_id.
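
The save, substitute and restore sequence can be sketched as follows. This is a user-space illustration under stated assumptions, not the driver code: `toy_pending_req`, `toy_next_req_id`, `toy_req_prepare()` and `toy_req_complete()` are hypothetical names, and only the header layout is taken from the specification quoted above.

```c
#include <stdint.h>

/* Header layout as quoted in the commit message. */
struct xsd_sockmsg {
    uint32_t type;
    uint32_t req_id;
    uint32_t tx_id;
    uint32_t len;
};

/* Hypothetical per-request bookkeeping kept on the kernel side. */
struct toy_pending_req {
    uint32_t user_req_id;   /* value the application put in the header */
    uint32_t kernel_req_id; /* value actually sent to xenstore */
};

/* Stand-in for the kernel's private, monotonically increasing counter. */
static uint32_t toy_next_req_id;

/* Before sending: remember the caller's req_id, then substitute our own. */
static void toy_req_prepare(struct toy_pending_req *req, struct xsd_sockmsg *msg)
{
    req->user_req_id = msg->req_id;
    req->kernel_req_id = toy_next_req_id++;
    msg->req_id = req->kernel_req_id;
}

/*
 * On reply: match against the kernel id first, then restore the caller's id.
 * Returns 0 on success, -1 if the reply does not belong to this request.
 */
static int toy_req_complete(const struct toy_pending_req *req,
                            struct xsd_sockmsg *reply)
{
    if (reply->req_id != req->kernel_req_id)
        return -1;
    reply->req_id = req->user_req_id;
    return 0;
}
```

Two applications that both send req_id 0 still get distinct ids on the wire, and each sees its own 0 echoed back in the reply.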
      
      Cc: <stable@vger.kernel.org> # 4.11
      Fixes: fd8aa909 ("xen: optimize xenbus driver for multiple concurrent xenstore accesses")
      Reported-by: Bhavesh Davda <bhavesh.davda@oracle.com>
      Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      29fee6ee
    • Robin Murphy's avatar
      arm64: cputype: Silence Sparse warnings · e1a50de3
      Robin Murphy authored
      
      
      Sparse makes a fair bit of noise about our MPIDR mask being implicitly
      long - let's explicitly describe it as such rather than just relying on
      the value forcing automatic promotion.
      
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      e1a50de3
    • Xin Long's avatar
      sctp: remove the left unnecessary check for chunk in sctp_renege_events · 9ab2323c
      Xin Long authored
      Commit fb234035 ("sctp: remove the useless check in
      sctp_renege_events") forgot to remove another check for
      chunk in sctp_renege_events.
      
      Dan found this when doing a static check.
      
      This patch removes that check and also merges the two remaining
      checks into one 'if' statement.
      
      Fixes: fb234035 ("sctp: remove the useless check in sctp_renege_events")
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9ab2323c
    • David Howells's avatar
      rxrpc: Work around usercopy check · a16b8d0c
      David Howells authored
      
      
      Due to a check recently added to copy_to_user(), it's now not permitted to
      copy from slab-held data to userspace unless the slab is whitelisted.  This
      affects rxrpc_recvmsg() when it attempts to place an RXRPC_USER_CALL_ID
      control message in the userspace control message buffer.  A warning is
      generated by usercopy_warn() because the source is the copy of the
      user_call_ID retained in the rxrpc_call struct.
      
      Work around the issue by copying the user_call_ID to a variable on the
      stack and passing that to put_cmsg().
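
The shape of the workaround can be illustrated with a toy model in user-space C. Everything here is hypothetical: `toy_copy_to_user()` only imitates the idea of rejecting copies whose source lies inside a non-whitelisted object, and `toy_call` stands in for the rxrpc_call struct. The real check in the kernel is considerably more involved.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a slab-allocated call record that is not whitelisted. */
struct toy_call {
    unsigned long user_call_ID;
};

static struct toy_call toy_slab_object = { .user_call_ID = 42 };

/* Toy usercopy check: refuse any source range overlapping toy_slab_object. */
static int toy_copy_to_user(void *dst, const void *src, size_t len)
{
    uintptr_t s = (uintptr_t)src;
    uintptr_t lo = (uintptr_t)&toy_slab_object;
    uintptr_t hi = lo + sizeof(toy_slab_object);

    if (s < hi && s + len > lo)
        return -1; /* the kernel would emit the usercopy warning here */
    memcpy(dst, src, len);
    return 0;
}

/* Before the fix: the copy source is the field inside the slab object. */
static int toy_report_direct(unsigned long *user_buf)
{
    return toy_copy_to_user(user_buf, &toy_slab_object.user_call_ID,
                            sizeof(toy_slab_object.user_call_ID));
}

/* After the fix: copy the value to a stack variable and pass that instead. */
static int toy_report_via_stack(unsigned long *user_buf)
{
    unsigned long id = toy_slab_object.user_call_ID;

    return toy_copy_to_user(user_buf, &id, sizeof(id));
}
```

The value delivered to the caller is the same in both cases; only the address the copy reads from changes.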
      
      The warning generated looks like:
      
      	Bad or missing usercopy whitelist? Kernel memory exposure attempt detected from SLUB object 'dmaengine-unmap-128' (offset 680, size 8)!
      	WARNING: CPU: 0 PID: 1401 at mm/usercopy.c:81 usercopy_warn+0x7e/0xa0
      	...
      	RIP: 0010:usercopy_warn+0x7e/0xa0
      	...
      	Call Trace:
      	 __check_object_size+0x9c/0x1a0
      	 put_cmsg+0x98/0x120
      	 rxrpc_recvmsg+0x6fc/0x1010 [rxrpc]
      	 ? finish_wait+0x80/0x80
      	 ___sys_recvmsg+0xf8/0x240
      	 ? __clear_rsb+0x25/0x3d
      	 ? __clear_rsb+0x15/0x3d
      	 ? __clear_rsb+0x25/0x3d
      	 ? __clear_rsb+0x15/0x3d
      	 ? __clear_rsb+0x25/0x3d
      	 ? __clear_rsb+0x15/0x3d
      	 ? __clear_rsb+0x25/0x3d
      	 ? __clear_rsb+0x15/0x3d
      	 ? finish_task_switch+0xa6/0x2b0
      	 ? trace_hardirqs_on_caller+0xed/0x180
      	 ? _raw_spin_unlock_irq+0x29/0x40
      	 ? __sys_recvmsg+0x4e/0x90
      	 __sys_recvmsg+0x4e/0x90
      	 do_syscall_64+0x7a/0x220
      	 entry_SYSCALL_64_after_hwframe+0x26/0x9b
      
      Reported-by: Jonathan Billings <jsbillings@jsbillings.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Tested-by: Jonathan Billings <jsbillings@jsbillings.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a16b8d0c
    • Eric Dumazet's avatar
      tun: fix tun_napi_alloc_frags() frag allocator · 43a08e0f
      Eric Dumazet authored
      <Mark Rutland reported>
          While fuzzing arm64 v4.16-rc1 with Syzkaller, I've been hitting a
          misaligned atomic in __skb_clone:
      
              atomic_inc(&(skb_shinfo(skb)->dataref));
      
          where dataref doesn't have the required natural alignment, and the
          atomic operation faults. e.g. I often see it aligned to a single
          byte boundary rather than a four byte boundary.
      
          AFAICT, the skb_shared_info is misaligned at the instant it's
          allocated in __napi_alloc_skb()
      </end of report>
      
      The problem is caused by tun_napi_alloc_frags() using
      napi_alloc_frag() with user-provided segment sizes,
      leading to other users of this API getting unaligned
      page fragments.
      
      Since we would rather not add padding or alignment to the frags that
      tun_napi_alloc_frags() attaches to the skb, switch to another page
      frag allocator.
      
      As a bonus, skb_page_frag_refill() can use GFP_KERNEL allocations,
      so memory reserves are not depleted as easily.
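
How one odd-sized request disturbs every later user of a shared fragment cache can be shown with a toy bump allocator. This is a deliberately simplified model: `toy_frag_cache` and `toy_frag_alloc()` are hypothetical and do not reflect how the kernel's page fragment caches are actually laid out. Keeping tun's allocations in a separate cache, as the patch does by switching allocators, leaves the shared one untouched.

```c
#include <stddef.h>

/* Toy bump allocator: hands out consecutive byte ranges from one buffer. */
struct toy_frag_cache {
    size_t offset; /* start of the next fragment */
};

/* Returns the start offset of the new fragment and advances the cache. */
static size_t toy_frag_alloc(struct toy_frag_cache *cache, size_t size)
{
    size_t start = cache->offset;

    cache->offset += size;
    return start;
}

/* True if an offset satisfies the given power-of-two alignment. */
static int toy_is_aligned(size_t offset, size_t align)
{
    return (offset & (align - 1)) == 0;
}
```

With a 13-byte segment taken from the shared cache, the next 256-byte request starts at offset 13 and fails a 4-byte alignment test. With the 13-byte segment taken from a private cache, the shared cache still hands out offset 0.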
      
      Fixes: 90e33d45 ("tun: enable napi_gro_frags() for TUN/TAP driver")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      43a08e0f
    • Alexey Kodanev's avatar
      udplite: fix partial checksum initialization · 15f35d49
      Alexey Kodanev authored
      Since UDP-Lite always uses a checksum, the following path is
      triggered when calculating the pseudo header for it:
      
        udp4_csum_init() or udp6_csum_init()
          skb_checksum_init_zero_check()
            __skb_checksum_validate_complete()
      
      The problem can appear if skb->len is less than CHECKSUM_BREAK. In
      this particular case __skb_checksum_validate_complete() also invokes
      __skb_checksum_complete(skb). If UDP-Lite is using a partial checksum
      that covers only part of a packet, the function will report a bad
      checksum and the packet will be dropped.
      
      It can be fixed if we skip skb_checksum_init_zero_check() and only
      set the required pseudo header checksum for UDP-Lite with a partial
      checksum before the udp4_csum_init()/udp6_csum_init() functions return.
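
Why validating the whole packet fails for a partial checksum can be seen in a small user-space sketch. It makes several simplifying assumptions: the toy packet keeps its checksum in bytes 0 and 1, there is no pseudo header, and `toy_fill_csum()` and `toy_csum_ok()` are hypothetical helpers. Only the ones' complement sum itself follows the standard Internet checksum.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (ones' complement sum, complemented) over len bytes. */
static uint16_t toy_inet_csum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8; /* pad the odd trailing byte */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);  /* fold carries back in */
    return (uint16_t)~sum;
}

/* Sender side: checksum only the first cov bytes, store it in bytes 0..1. */
static void toy_fill_csum(uint8_t *pkt, size_t cov)
{
    uint16_t csum;

    pkt[0] = 0;
    pkt[1] = 0;
    csum = toy_inet_csum(pkt, cov);
    pkt[0] = (uint8_t)(csum >> 8);
    pkt[1] = (uint8_t)(csum & 0xff);
}

/* Receiver side: a range validates if its checksum, field included, is 0. */
static int toy_csum_ok(const uint8_t *pkt, size_t n)
{
    return toy_inet_csum(pkt, n) == 0;
}
```

A packet filled with a 6-byte coverage validates over those 6 bytes, but a check over all 10 bytes also sums the uncovered tail and so reports a bad checksum, which is the situation the commit describes for short packets.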
      
      Fixes: ed70fcfc ("net: Call skb_checksum_init in IPv4")
      Fixes: e4f45b7f ("net: Call skb_checksum_init in IPv6")
      Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      15f35d49