  1. Sep 01, 2019
  2. Aug 31, 2019
    • Jakub Kicinski's avatar
      tracing: Correct kdoc formats · c68c9ec1
      Jakub Kicinski authored
      
      
      Fix the following kdoc warnings:
      
      kernel/trace/trace.c:1579: warning: Function parameter or member 'tr' not described in 'update_max_tr_single'
      kernel/trace/trace.c:1579: warning: Function parameter or member 'tsk' not described in 'update_max_tr_single'
      kernel/trace/trace.c:1579: warning: Function parameter or member 'cpu' not described in 'update_max_tr_single'
      kernel/trace/trace.c:1776: warning: Function parameter or member 'type' not described in 'register_tracer'
      kernel/trace/trace.c:2239: warning: Function parameter or member 'task' not described in 'tracing_record_taskinfo'
      kernel/trace/trace.c:2239: warning: Function parameter or member 'flags' not described in 'tracing_record_taskinfo'
      kernel/trace/trace.c:2269: warning: Function parameter or member 'prev' not described in 'tracing_record_taskinfo_sched_switch'
      kernel/trace/trace.c:2269: warning: Function parameter or member 'next' not described in 'tracing_record_taskinfo_sched_switch'
      kernel/trace/trace.c:2269: warning: Function parameter or member 'flags' not described in 'tracing_record_taskinfo_sched_switch'
      kernel/trace/trace.c:3078: warning: Function parameter or member 'ip' not described in 'trace_vbprintk'
      kernel/trace/trace.c:3078: warning: Function parameter or member 'fmt' not described in 'trace_vbprintk'
      kernel/trace/trace.c:3078: warning: Function parameter or member 'args' not described in 'trace_vbprintk'
      
      Link: http://lkml.kernel.org/r/20190828052549.2472-2-jakub.kicinski@netronome.com
      
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      c68c9ec1
    • Jisheng Zhang's avatar
      ftrace/x86: Remove mcount() declaration · 2e815627
      Jisheng Zhang authored
      Commit 562e14f7 ("ftrace/x86: Remove mcount support") removed the
      support for using mcount, so we could remove the mcount() declaration
      to clean up.
      
      Link: http://lkml.kernel.org/r/20190826170150.10f101ba@xhacker.debian
      
      Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      2e815627
    • Xinpeng Liu's avatar
      tracing/probe: Fix null pointer dereference · 19a58ce1
      Xinpeng Liu authored
      BUG: KASAN: null-ptr-deref in trace_probe_cleanup+0x8d/0xd0
      Read of size 8 at addr 0000000000000000 by task syz-executor.0/9746
      trace_probe_cleanup+0x8d/0xd0
      free_trace_kprobe.part.14+0x15/0x50
      alloc_trace_kprobe+0x23e/0x250
      
      Link: http://lkml.kernel.org/r/1565220563-980-1-git-send-email-danielliu861@gmail.com
      
      Fixes: e3dc9f89 ("tracing/probe: Add trace_event_call accesses APIs")
      Signed-off-by: Xinpeng Liu <danielliu861@gmail.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      19a58ce1
    • Denis Efremov's avatar
      tracing: Make exported ftrace_set_clr_event non-static · 595a438c
      Denis Efremov authored
      The function ftrace_set_clr_event is declared static and marked
      EXPORT_SYMBOL_GPL(), which is at best an odd combination. Because the
      function was decided to be a part of API, this commit removes the static
      attribute and adds the declaration to the header.
      
      Link: http://lkml.kernel.org/r/20190704172110.27041-1-efremov@linux.com
      
      Fixes: f45d1225 ("tracing: Kernel access to Ftrace instances")
      Reviewed-by: Joe Jin <joe.jin@oracle.com>
      Signed-off-by: Denis Efremov <efremov@linux.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      595a438c
    • Naveen N. Rao's avatar
      ftrace: Check for successful allocation of hash · 5b0022dd
      Naveen N. Rao authored
      In register_ftrace_function_probe(), we are not checking the return
      value of alloc_and_copy_ftrace_hash(). The subsequent call to
      ftrace_match_records() may end up dereferencing the same. Add a check to
      ensure this doesn't happen.
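
A minimal user-space sketch of the pattern described above: check the result of an allocation before anything dereferences it. The names `demo_hash`, `demo_alloc_hash()` and `demo_register_probe()` are hypothetical stand-ins, not kernel code, and returning `-ENOMEM` on failure is an assumption made for the illustration.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a hash object; not the kernel's struct ftrace_hash. */
struct demo_hash {
	int size_bits;
};

/* Hypothetical allocator that may fail, like any allocation helper. */
static struct demo_hash *demo_alloc_hash(int size_bits, int simulate_failure)
{
	struct demo_hash *hash;

	if (simulate_failure)
		return NULL;

	hash = malloc(sizeof(*hash));
	if (hash)
		hash->size_bits = size_bits;
	return hash;
}

/* Returns the number of buckets on success, or -ENOMEM if allocation failed. */
static int demo_register_probe(int size_bits, int simulate_failure)
{
	struct demo_hash *hash = demo_alloc_hash(size_bits, simulate_failure);
	int buckets;

	/* The check this sketch is about: bail out before any dereference. */
	if (!hash)
		return -ENOMEM;

	buckets = 1 << hash->size_bits;
	free(hash);
	return buckets;
}
```

Without the `if (!hash)` test, the later read of `hash->size_bits` would dereference a NULL pointer whenever the allocation fails.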
      
      Link: http://lkml.kernel.org/r/26e92574f25ad23e7cafa3cf5f7a819de1832cbe.1562249521.git.naveen.n.rao@linux.vnet.ibm.com
      
      Cc: stable@vger.kernel.org
      Fixes: 1ec3a81a ("ftrace: Have each function probe use its own ftrace_ops")
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      5b0022dd
    • Steven Rostedt (VMware)'s avatar
      ftrace: Check for empty hash and comment the race with registering probes · 372e0d01
      Steven Rostedt (VMware) authored
      The race between adding a function probe and reading the probes that exist
      is very subtle. It needs a comment. The issue can also happen if the
      probe has the EMPTY_HASH as its func_hash.
      
      Cc: stable@vger.kernel.org
      Fixes: 7b60f3d8 ("ftrace: Dynamically create the probe ftrace_ops for the trace_array")
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      372e0d01
    • Naveen N. Rao's avatar
      ftrace: Fix NULL pointer dereference in t_probe_next() · 7bd46644
      Naveen N. Rao authored
      LTP testsuite on powerpc results in the below crash:
      
        Unable to handle kernel paging request for data at address 0x00000000
        Faulting instruction address: 0xc00000000029d800
        Oops: Kernel access of bad area, sig: 11 [#1]
        LE SMP NR_CPUS=2048 NUMA PowerNV
        ...
        CPU: 68 PID: 96584 Comm: cat Kdump: loaded Tainted: G        W
        NIP:  c00000000029d800 LR: c00000000029dac4 CTR: c0000000001e6ad0
        REGS: c0002017fae8ba10 TRAP: 0300   Tainted: G        W
        MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 28022422  XER: 20040000
        CFAR: c00000000029d90c DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 0
        ...
        NIP [c00000000029d800] t_probe_next+0x60/0x180
        LR [c00000000029dac4] t_mod_start+0x1a4/0x1f0
        Call Trace:
        [c0002017fae8bc90] [c000000000cdbc40] _cond_resched+0x10/0xb0 (unreliable)
        [c0002017fae8bce0] [c0000000002a15b0] t_start+0xf0/0x1c0
        [c0002017fae8bd30] [c0000000004ec2b4] seq_read+0x184/0x640
        [c0002017fae8bdd0] [c0000000004a57bc] sys_read+0x10c/0x300
        [c0002017fae8be30] [c00000000000b388] system_call+0x5c/0x70
      
      The test (ftrace_set_ftrace_filter.sh) is part of ftrace stress tests
      and the crash happens when the test does 'cat
      $TRACING_PATH/set_ftrace_filter'.
      
      The address points to the second line below, in t_probe_next(), where
      filter_hash is dereferenced:
        hash = iter->probe->ops.func_hash->filter_hash;
        size = 1 << hash->size_bits;
      
      This happens due to a race with register_ftrace_function_probe(). A new
      ftrace_func_probe is created and added into the func_probes list in
      trace_array under ftrace_lock. However, before initializing the filter,
      we drop ftrace_lock, and re-acquire it after acquiring regex_lock. If
      another process is trying to read set_ftrace_filter, it will be able to
      acquire ftrace_lock during this window and it will end up seeing a NULL
      filter_hash.
      
      Fix this by just checking for a NULL filter_hash in t_probe_next(). If
      the filter_hash is NULL, then this probe is just being added and we can
      simply return from here.
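
The check described above can be sketched in user-space C as follows. This is an illustration only: `demo_probe`, `demo_hash` and `demo_probe_bucket_count()` are hypothetical names, and returning -1 for "nothing to iterate yet" is an assumption of the sketch, not the kernel's return convention.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the probe and its filter hash. */
struct demo_hash {
	int size_bits;
};

struct demo_probe {
	struct demo_hash *filter_hash; /* NULL while the probe is still being set up */
};

/*
 * Returns the number of hash buckets to walk, or -1 if the probe has no
 * filter hash yet (it is still being registered by another task).
 */
static long demo_probe_bucket_count(const struct demo_probe *probe)
{
	const struct demo_hash *hash = probe->filter_hash;

	/* Guard first: a reader can observe the probe before its hash exists. */
	if (!hash)
		return -1;

	return 1L << hash->size_bits;
}
```

The guard must come before the `size_bits` read, since that read is the dereference that faulted in the trace above.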
      
      Link: http://lkml.kernel.org/r/05e021f757625cbbb006fad41380323dbe4e3b43.1562249521.git.naveen.n.rao@linux.vnet.ibm.com
      
      Cc: stable@vger.kernel.org
      Fixes: 7b60f3d8 ("ftrace: Dynamically create the probe ftrace_ops for the trace_array")
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      7bd46644
  3. Aug 26, 2019
    • Linus Torvalds's avatar
      Linux 5.3-rc6 · a55aa89a
      Linus Torvalds authored
      v5.3-rc6
      a55aa89a
    • Linus Torvalds's avatar
      Merge tag 'auxdisplay-for-linus-v5.3-rc7' of git://github.com/ojeda/linux · c749088f
      Linus Torvalds authored
      Pull auxdisplay cleanup from Miguel Ojeda:
       "Make ht16k33_fb_fix and ht16k33_fb_var constant (Nishka Dasgupta)"
      
      * tag 'auxdisplay-for-linus-v5.3-rc7' of git://github.com/ojeda/linux:
        auxdisplay: ht16k33: Make ht16k33_fb_fix and ht16k33_fb_var constant
      c749088f
    • Linus Torvalds's avatar
      Merge tag 'for-linus-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml · 32ae83ff
      Linus Torvalds authored
      Pull UML fix from Richard Weinberger:
       "Fix time travel mode"
      
      * tag 'for-linus-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
        um: fix time travel mode
      32ae83ff
    • Linus Torvalds's avatar
      Merge tag 'for-linus-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs · 94a76d9b
      Linus Torvalds authored
      Pull UBIFS and JFFS2 fixes from Richard Weinberger:
       "UBIFS:
         - Don't block too long in writeback_inodes_sb()
         - Fix for a possible overrun of the log head
         - Fix double unlock in orphan_delete()
      
        JFFS2:
         - Remove C++ style from UAPI header and unbreak picky toolchains"
      
      * tag 'for-linus-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
        ubifs: Limit the number of pages in shrink_liability
        ubifs: Correctly initialize c->min_log_bytes
        ubifs: Fix double unlock around orphan_delete()
        jffs2: Remove C++ style comments from uapi header
      94a76d9b
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 146c3d32
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "A few fixes for x86:
      
         - Fix a boot regression caused by the recent bootparam sanitizing
           change, which escaped the attention of all people who reviewed that
           code.
      
         - Address a boot problem on machines with broken E820 tables caused
           by an underflow which ended up placing the trampoline start at
           physical address 0.
      
         - Handle machines which do not advertise a legacy timer of any form,
           but need calibration of the local APIC timer gracefully by making
           the calibration routine independent from the tick interrupt. Marked
           for stable as well as there seem to be quite a few new laptops
           rolled out which expose this.
      
         - Clear the RDRAND CPUID bit on AMD family 15h and 16h CPUs which are
           affected by broken firmware which does not initialize RDRAND
           correctly after resume. Add a command line parameter to override
           this for machines which either do not use suspend/resume or have a
           fixed BIOS. Unfortunately there is no way to detect this on boot,
           so the only safe decision is to turn it off by default.
      
         - Prevent RFLAGS from being clobbered in CALL_NOSPEC on 32bit which
           caused fast KVM instruction emulation to break.
      
         - Explain the Intel CPU model naming convention so that the repeating
           discussions come to an end"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/retpoline: Don't clobber RFLAGS during CALL_NOSPEC on i386
        x86/boot: Fix boot regression caused by bootparam sanitizing
        x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h
        x86/boot/compressed/64: Fix boot on machines with broken E820 table
        x86/apic: Handle missing global clockevent gracefully
        x86/cpu: Explain Intel model naming convention
      146c3d32
    • Linus Torvalds's avatar
      Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5a13fc3d
      Linus Torvalds authored
      Pull timekeeping fix from Thomas Gleixner:
       "A single fix for a regression caused by the generic VDSO
        implementation where a math overflow causes CLOCK_BOOTTIME to become a
        random number generator"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        timekeeping/vsyscall: Prevent math overflow in BOOTTIME update
      5a13fc3d
    • Linus Torvalds's avatar
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8a04c2ee
      Linus Torvalds authored
      Pull scheduler fix from Thomas Gleixner:
       "Handle the worker management in situations where a task is scheduled
        out on a PI lock contention correctly and schedule a new worker if
        possible"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/core: Schedule new worker even if PI-blocked
      8a04c2ee
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 05bbb936
      Linus Torvalds authored
      Pull perf fixes from Thomas Gleixner:
       "Two small fixes for kprobes and perf:
      
         - Prevent a deadlock in kprobe_optimizer() caused by reverse lock
           ordering
      
         - Fix a comment typo"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        kprobes: Fix potential deadlock in kprobe_optimizer()
        perf/x86: Fix typo in comment
      05bbb936
    • Linus Torvalds's avatar
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 44c471e4
      Linus Torvalds authored
      Pull irq fix from Thomas Gleixner:
       "A single fix for an imbalanced kobject operation in the irq descriptor
        code which was unearthed by the new warnings in the kobject code"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq: Properly pair kobject_del() with kobject_add()
      44c471e4
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · f47edb59
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "11 fixes"
      
      Mostly VM fixes, one psi polling fix, and one parisc build fix.
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm/kasan: fix false positive invalid-free reports with CONFIG_KASAN_SW_TAGS=y
        mm/zsmalloc.c: fix race condition in zs_destroy_pool
        mm/zsmalloc.c: migration can leave pages in ZS_EMPTY indefinitely
        mm, page_owner: handle THP splits correctly
        userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx
        psi: get poll_work to run when calling poll syscall next time
        mm: memcontrol: flush percpu vmevents before releasing memcg
        mm: memcontrol: flush percpu vmstats before releasing memcg
        parisc: fix compilation errrors
        mm, page_alloc: move_freepages should not examine struct page of reserved memory
        mm/z3fold.c: fix race between migration and destruction
      f47edb59
  4. Aug 25, 2019
    • Linus Torvalds's avatar
      Merge tag 'dma-mapping-5.3-5' of git://git.infradead.org/users/hch/dma-mapping · e67095fd
      Linus Torvalds authored
      Pull dma-mapping fixes from Christoph Hellwig:
       "Two fixes for regressions in this merge window:
      
         - select the Kconfig symbols for the noncoherent dma arch helpers on
           arm if swiotlb is selected, not just for LPAE, so as not to break
           the Xen build, which uses swiotlb indirectly through swiotlb-xen
      
         - fix the page allocator fallback in dma_alloc_contiguous if the CMA
           allocation fails"
      
      * tag 'dma-mapping-5.3-5' of git://git.infradead.org/users/hch/dma-mapping:
        dma-direct: fix zone selection after an unaddressable CMA allocation
        arm: select the dma-noncoherent symbols for all swiotlb builds
      e67095fd
    • Andrey Ryabinin's avatar
      mm/kasan: fix false positive invalid-free reports with CONFIG_KASAN_SW_TAGS=y · 00fb24a4
      Andrey Ryabinin authored
      Code like this:
      
      	ptr = kmalloc(size, GFP_KERNEL);
      	page = virt_to_page(ptr);
      	offset = offset_in_page(ptr);
      	kfree(page_address(page) + offset);
      
      may produce false-positive invalid-free reports on the kernel with
      CONFIG_KASAN_SW_TAGS=y.
      
      In the example above we lose the original tag assigned to 'ptr', so
      kfree() gets the pointer with the 0xFF tag.  kfree() then finds that the
      0xFF tag differs from the tag in the shadow and prints a false report.
      
      Instead of just comparing tags, do the following:
      
      1) Check that shadow doesn't contain KASAN_TAG_INVALID.  Otherwise it's
         double-free and it doesn't matter what tag the pointer has.
      
      2) If pointer tag is different from 0xFF, make sure that tag in the
         shadow is the same as in the pointer.
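
The two-step check above can be sketched as a small predicate. This is a hedged illustration, not the kernel's implementation: `demo_free_is_invalid()` is a hypothetical name, 0xFF is the match-all tag named in the text, and 0xFE as the value of the invalid marker is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_TAG_KERNEL  0xFF /* pointer tag after the original tag was lost */
#define DEMO_TAG_INVALID 0xFE /* assumed shadow marker for already-freed memory */

/* Returns true if freeing through this pointer should be reported as invalid. */
static bool demo_free_is_invalid(uint8_t ptr_tag, uint8_t shadow_tag)
{
	/* Step 1: freed memory is a double-free regardless of the pointer tag. */
	if (shadow_tag == DEMO_TAG_INVALID)
		return true;

	/* Step 2: only a pointer that still carries a real tag must match. */
	if (ptr_tag != DEMO_TAG_KERNEL && ptr_tag != shadow_tag)
		return true;

	return false;
}
```

With this ordering, a pointer rebuilt through page_address() (tag 0xFF) no longer triggers a report, while a genuine double-free or a mismatched real tag still does.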
      
      Link: http://lkml.kernel.org/r/20190819172540.19581-1-aryabinin@virtuozzo.com
      Fixes: 7f94ffbc ("kasan: add hooks implementation for tag-based mode")
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Reported-by: Walter Wu <walter-zh.wu@mediatek.com>
      Reported-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      00fb24a4
    • Henry Burns's avatar
      mm/zsmalloc.c: fix race condition in zs_destroy_pool · 701d6785
      Henry Burns authored
      In zs_destroy_pool() we call flush_work(&pool->free_work).  However, we
      have no guarantee that migration isn't happening in the background at
      that time.
      
      Since migration can't directly free pages, it relies on free_work being
      scheduled to free the pages.  But there's nothing preventing an
      in-progress migrate from queuing the work *after*
      zs_unregister_migration() has called flush_work(), which would leave
      pages still pointing at the inode when we free it.
      
      Since we know at destroy time all objects should be free, no new
      migrations can come in (since zs_page_isolate() fails for fully-free
      zspages).  This means it is sufficient to track a "# isolated zspages"
      count by class, and have the destroy logic ensure all such pages have
      drained before proceeding.  Keeping that state under the class spinlock
      keeps the logic straightforward.
      
      In this case a memory leak could lead to an eventual crash if compaction
      hits the leaked page.  This crash would only occur if people are
      changing their zswap backend at runtime (which eventually starts
      destruction).
      
      Link: http://lkml.kernel.org/r/20190809181751.219326-2-henryburns@google.com
      Fixes: 48b4800a ("zsmalloc: page migration support")
      Signed-off-by: Henry Burns <henryburns@google.com>
      Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Henry Burns <henrywolfeburns@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Jonathan Adams <jwadams@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      701d6785
    • Henry Burns's avatar
      mm/zsmalloc.c: migration can leave pages in ZS_EMPTY indefinitely · 1a87aa03
      Henry Burns authored
      In zs_page_migrate() we call putback_zspage() after we have finished
      migrating all pages in this zspage.  However, the return value is
      ignored.  If a zs_free() races in between zs_page_isolate() and
      zs_page_migrate(), freeing the last object in the zspage,
      putback_zspage() will leave the page in ZS_EMPTY for potentially an
      unbounded amount of time.
      
      To fix this, we need to do the same thing as zs_page_putback() does:
      schedule free_work to occur.
      
      To avoid duplicated code, move the sequence to a new
      putback_zspage_deferred() function which both zs_page_migrate() and
      zs_page_putback() call.
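
The refactoring described above can be sketched as follows. All names prefixed with `demo_` are hypothetical, and the counter standing in for "schedule free_work" is an assumption made so the behaviour can be observed in user space; the real code queues work rather than counting.

```c
/* Hypothetical fullness states; only "empty" matters for this sketch. */
enum demo_fullness { DEMO_ZS_EMPTY, DEMO_ZS_IN_USE };

struct demo_pool {
	int free_work_scheduled; /* counts how often freeing was requested */
};

struct demo_zspage {
	int inuse; /* number of live objects in the zspage */
};

/* Reports which state the zspage ends up in after being put back. */
static enum demo_fullness demo_putback_zspage(const struct demo_zspage *zspage)
{
	return zspage->inuse == 0 ? DEMO_ZS_EMPTY : DEMO_ZS_IN_USE;
}

/* Shared helper: never ignore the result, schedule freeing for empty zspages. */
static void demo_putback_zspage_deferred(struct demo_pool *pool,
					 const struct demo_zspage *zspage)
{
	if (demo_putback_zspage(zspage) == DEMO_ZS_EMPTY)
		pool->free_work_scheduled++;
}

/* Both callers go through the same helper, so neither can skip the check. */
static void demo_page_migrate(struct demo_pool *pool,
			      const struct demo_zspage *zspage)
{
	demo_putback_zspage_deferred(pool, zspage);
}

static void demo_page_putback(struct demo_pool *pool,
			      const struct demo_zspage *zspage)
{
	demo_putback_zspage_deferred(pool, zspage);
}
```

Routing both paths through one helper means a zspage that became empty during migration is handled the same way as one that became empty during putback.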
      
      Link: http://lkml.kernel.org/r/20190809181751.219326-1-henryburns@google.com
      Fixes: 48b4800a ("zsmalloc: page migration support")
      Signed-off-by: Henry Burns <henryburns@google.com>
      Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Henry Burns <henrywolfeburns@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Jonathan Adams <jwadams@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1a87aa03
    • Vlastimil Babka's avatar
      mm, page_owner: handle THP splits correctly · f7da677b
      Vlastimil Babka authored
      THP splitting path is missing the split_page_owner() call that
      split_page() has.
      
      As a result, split THP pages are wrongly reported in the page_owner file
      as order-9 pages.  Furthermore when the former head page is freed, the
      remaining former tail pages are not listed in the page_owner file at
      all.  This patch fixes that by adding the split_page_owner() call into
      __split_huge_page().
      
      Link: http://lkml.kernel.org/r/20190820131828.22684-2-vbabka@suse.cz
      Fixes: a9627bc5 ("mm/page_owner: introduce split_page_owner and replace manual handling")
      Reported-by: Kirill A. Shutemov <kirill@shutemov.name>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f7da677b
    • Oleg Nesterov's avatar
      userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx · 46d0b24c
      Oleg Nesterov authored
      userfaultfd_release() should clear vm_flags/vm_userfaultfd_ctx even if
      mm->core_state != NULL.
      
      Otherwise a page fault can see userfaultfd_missing() == T and use an
      already freed userfaultfd_ctx.
      
      Link: http://lkml.kernel.org/r/20190820160237.GB4983@redhat.com
      Fixes: 04f5866e ("coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping")
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
      Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      46d0b24c
    • Jason Xing's avatar
      psi: get poll_work to run when calling poll syscall next time · 7b2b55da
      Jason Xing authored
      
      
      Only the first call to the poll syscall lets the user receive POLLPRI
      correctly.  After that, the user always fails to acquire the event
      signal.
      
      Reproduce case:
       1. Get the monitor code in Documentation/accounting/psi.txt
       2. Run it, and wait for the event to be triggered.
       3. Kill and restart the process.
      
      The question is why we can end up with poll_scheduled = 1 but the work
      not running (which would reset it to 0).  And the answer is because the
      scheduling side sees group->poll_kworker under RCU protection and then
      schedules it, but here we cancel the work and destroy the worker.  The
      cancel needs to pair with resetting the poll_scheduled flag.
      
      Link: http://lkml.kernel.org/r/1566357985-97781-1-git-send-email-joseph.qi@linux.alibaba.com
      Signed-off-by: Jason Xing <kerneljasonxing@linux.alibaba.com>
      Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Reviewed-by: Caspar Zhang <caspar@linux.alibaba.com>
      Reviewed-by: Suren Baghdasaryan <surenb@google.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7b2b55da
    • Roman Gushchin's avatar
      mm: memcontrol: flush percpu vmevents before releasing memcg · bb65f89b
      Roman Gushchin authored
      Similar to vmstats, percpu caching of local vmevents leads to an
      accumulation of errors on non-leaf levels.  This happens because some
      leftovers may remain in percpu caches, so that they are never propagated
      up the cgroup tree and simply disappear when the memory cgroup is
      released.
      
      To fix this issue let's accumulate and propagate percpu vmevents values
      before releasing the memory cgroup similar to what we're doing with
      vmstats.
      
      Since on cpu hotplug we do flush percpu vmstats anyway, we can iterate
      only over online cpus.
      
      Link: http://lkml.kernel.org/r/20190819202338.363363-4-guro@fb.com
      Fixes: 42a30035 ("mm: memcontrol: fix recursive statistics correctness & scalabilty")
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bb65f89b
    • Roman Gushchin's avatar
      mm: memcontrol: flush percpu vmstats before releasing memcg · c350a99e
      Roman Gushchin authored
      Percpu caching of local vmstats with the conditional propagation by the
      cgroup tree leads to an accumulation of errors on non-leaf levels.
      
      Let's imagine two nested memory cgroups A and A/B.  Say, a process
      belonging to A/B allocates 100 pagecache pages on the CPU 0.  The percpu
      cache will spill 3 times, so that 32*3=96 pages will be accounted to A/B
      and A atomic vmstat counters, 4 pages will remain in the percpu cache.
      
      Imagine A/B is near memory.max, so that every following allocation
      triggers a direct reclaim on the local CPU.  Say, each such attempt will
      free 16 pages on a new cpu.  That means every percpu cache will have -16
      pages, except the first one, which will have 4 - 16 = -12.  A/B and A
      atomic counters will not be touched at all.
      
      Now a user removes A/B.  All percpu caches are freed and corresponding
      vmstat numbers are forgotten.  A has 96 pages more than expected.
      
      As memory cgroups are created and destroyed, errors do accumulate.  Even
      1-2 page differences can accumulate into large numbers.
      
      To fix this issue let's accumulate and propagate percpu vmstat values
      before releasing the memory cgroup.  At this point these numbers are
      stable and cannot be changed.
      
      Since on cpu hotplug we do flush percpu vmstats anyway, we can iterate
      only over online cpus.
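
The arithmetic in the A and A/B example can be replayed with a small simulation. This is a simplified sketch under stated assumptions: the `demo_` names are hypothetical, the cache spills once the absolute per-CPU delta reaches 32 (chosen so the numbers match the 32*3=96 in the text), and the split of the 100 freed pages across eight CPUs is made up for the illustration.

```c
#include <stdlib.h>

#define DEMO_NR_CPUS 8
#define DEMO_BATCH   32 /* spill threshold assumed for this sketch */

struct demo_memcg {
	long atomic_count;         /* value propagated to this cgroup and its parents */
	long percpu[DEMO_NR_CPUS]; /* per-CPU deltas not yet propagated */
};

/* Add val on one CPU; spill the cached delta once it reaches the batch size. */
static void demo_mod_state(struct demo_memcg *cg, int cpu, long val)
{
	long x = cg->percpu[cpu] + val;

	if (labs(x) >= DEMO_BATCH) {
		cg->atomic_count += x;
		x = 0;
	}
	cg->percpu[cpu] = x;
}

/* The fix in miniature: fold every per-CPU leftover in before release. */
static void demo_flush_percpu(struct demo_memcg *cg)
{
	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		cg->atomic_count += cg->percpu[cpu];
		cg->percpu[cpu] = 0;
	}
}

/* Allocate 100 pages one by one on CPU 0, then free all 100 in small chunks. */
static struct demo_memcg demo_run_scenario(void)
{
	static const long freed[DEMO_NR_CPUS] = { 16, 16, 16, 16, 16, 16, 4, 0 };
	struct demo_memcg cg = { 0 };

	for (int i = 0; i < 100; i++)
		demo_mod_state(&cg, 0, 1);

	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		demo_mod_state(&cg, cpu, -freed[cpu]);

	return cg;
}
```

After the scenario the propagated counter still reads 96 even though every page has been freed; only the flush brings it back to 0, which is the error the commit prevents from being inherited by the parent.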
      
      Link: http://lkml.kernel.org/r/20190819202338.363363-2-guro@fb.com
      Fixes: 42a30035 ("mm: memcontrol: fix recursive statistics correctness & scalabilty")
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c350a99e
    • Qian Cai's avatar
      parisc: fix compilation errrors · bbcb03a9
      Qian Cai authored
      Commit 0cfaee2a ("include/asm-generic/5level-fixup.h: fix variable
      'p4d' set but not used") converted a few functions from macros to static
      inline, which causes parisc to complain,
      
        In file included from include/asm-generic/4level-fixup.h:38:0,
                         from arch/parisc/include/asm/pgtable.h:5,
                         from arch/parisc/include/asm/io.h:6,
                         from include/linux/io.h:13,
                         from sound/core/memory.c:9:
        include/asm-generic/5level-fixup.h:14:18: error: unknown type name 'pgd_t'; did you mean 'pid_t'?
         #define p4d_t    pgd_t
                          ^
        include/asm-generic/5level-fixup.h:24:28: note: in expansion of macro 'p4d_t'
         static inline int p4d_none(p4d_t p4d)
                                    ^~~~~
      
      It is because "4level-fixup.h" is included before "asm/page.h" where
      "pgd_t" is defined.
      
      Link: http://lkml.kernel.org/r/20190815205305.1382-1-cai@lca.pw
      Fixes: 0cfaee2a ("include/asm-generic/5level-fixup.h: fix variable 'p4d' set but not used")
      Signed-off-by: Qian Cai <cai@lca.pw>
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bbcb03a9
    • David Rientjes's avatar
      mm, page_alloc: move_freepages should not examine struct page of reserved memory · cd961038
      David Rientjes authored
      After commit 907ec5fc ("mm: zero remaining unavailable struct
      pages"), struct page of reserved memory is zeroed.  This causes
      page->flags to be 0 and fixes issues related to reading
      /proc/kpageflags, for example, of reserved memory.
      
      The VM_BUG_ON() in move_freepages_block(), however, assumes that
      page_zone() is meaningful even for reserved memory.  That assumption is
      no longer true after the aforementioned commit.
      
      There's no reason why move_freepages_block() should be testing the
      legitimacy of page_zone() for reserved memory; its scope is limited only
      to pages on the zone's freelist.
      
      Note that pfn_valid() can be true for reserved memory: there is a
      backing struct page.  The check for page_to_nid(page) is also buggy but
      reserved memory normally only appears on node 0 so the zeroing doesn't
      affect this.
      
      Move the debug checks to after verifying PageBuddy is true.  This
      isolates the scope of the checks to only be for buddy pages which are on
      the zone's freelist which move_freepages_block() is operating on.  In
      this case, an incorrect node or zone is a bug worthy of being warned
      about (and the examination of struct page is acceptable because this
      memory is not reserved).
      
      Why does move_freepages_block() get called on reserved memory? It's
      simply math after finding a valid free page from the per-zone free area
      to use as fallback.  We find the beginning and end of the pageblock of
      the valid page and that can bring us into memory that was reserved per
      the e820.  pfn_valid() is still true (it's backed by a struct page), but
      since it's zero'd we shouldn't make any inferences here about comparing
      its node or zone.  The current node check just happens to succeed most
      of the time by luck because reserved memory typically appears on node 0.
      
      The fix here is to validate that we actually have buddy pages before
      testing if there's any type of zone or node strangeness going on.
      
      We noticed it almost immediately after bringing 907ec5fc in on
      CONFIG_DEBUG_VM builds.  It depends on finding specific free pages in
      the per-zone free area where the math in move_freepages() will bring the
      start or end pfn into reserved memory and wanting to claim that entire
      pageblock as a new migratetype.  So the path will be rare, require
      CONFIG_DEBUG_VM, and require fallback to a different migratetype.
      
      Some struct pages were already zeroed from reserve pages before
      907ec5fca3c so it theoretically could trigger before this commit.  I
      think it's rare enough under a config option that most people don't run
      that others may not have noticed.  I wouldn't argue against a stable tag
      and the backport should be easy enough, but probably wouldn't single out
      a commit that this is fixing.
      
      Mel said:
      
      : The overhead of the debugging check is higher with this patch although
      : it'll only affect debug builds and the path is not particularly hot.
      : If this was a concern, I think it would be reasonable to simply remove
      : the debugging check as the zone boundaries are checked in
      : move_freepages_block and we never expect a zone/node to be smaller than
      : a pageblock and stuck in the middle of another zone.
      
      Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1908122036560.10779@chino.kir.corp.google.com
      Signed-off-by: David Rientjes <rientjes@google.com>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cd961038
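The reordering described in the commit above can be modelled in userspace. This is a minimal sketch under stated assumptions, not the kernel implementation: `struct model_page`, `PAGEBLOCK_NR_PAGES`, `pageblock_start()` and `model_move_freepages()` are hypothetical names. The model steps one page at a time and counts mismatches instead of triggering a VM_BUG_ON().

```c
#include <stdbool.h>

/* Hypothetical pageblock size; the real value depends on the architecture. */
#define PAGEBLOCK_NR_PAGES 8UL

/* Stand-in for struct page, reduced to the fields the checks look at. */
struct model_page {
	bool buddy;   /* models PageBuddy() */
	int zone_id;  /* models page_zone() */
	int nid;      /* models page_to_nid() */
};

/* Round a pfn down to the first pfn of its pageblock ("simply math"). */
unsigned long pageblock_start(unsigned long pfn)
{
	return pfn & ~(PAGEBLOCK_NR_PAGES - 1);
}

/*
 * Walk the whole pageblock that contains pfn. Pages that are not buddy
 * pages (for example zeroed struct pages of reserved memory) are skipped
 * before their zone or node is examined. Only buddy pages are checked;
 * mismatches are counted in *bad, matching pages are counted as moved.
 */
int model_move_freepages(const struct model_page *map, unsigned long pfn,
			 int zone_id, int nid, int *bad)
{
	unsigned long start = pageblock_start(pfn);
	unsigned long end = start + PAGEBLOCK_NR_PAGES; /* exclusive */
	int moved = 0;

	for (unsigned long i = start; i < end; i++) {
		const struct model_page *page = &map[i];

		if (!page->buddy)
			continue; /* no inference from a possibly zeroed page */

		if (page->zone_id != zone_id || page->nid != nid) {
			(*bad)++; /* a real bug: a free page in the wrong place */
			continue;
		}
		moved++;
	}
	return moved;
}
```

If pfns 8 to 11 of a block are zeroed reserved entries and pfns 12 to 15 are buddy pages in zone 1 on node 1, a call for pfn 13 rounds down to pfn 8, skips the four reserved entries, and moves four pages with no mismatch. Performing the zone comparison before the buddy test would instead compare the zeroed entries against zone 1 and flag them.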
    • Henry Burns's avatar
      mm/z3fold.c: fix race between migration and destruction · d776aaa9
      Henry Burns authored
      
      
      In z3fold_destroy_pool() we call destroy_workqueue(&pool->compact_wq).
      However, we have no guarantee that migration isn't happening in the
      background at that time.
      
      Migration directly calls queue_work_on(pool->compact_wq); if destruction
      wins that race, we are using a destroyed workqueue.
      
      Link: http://lkml.kernel.org/r/20190809213828.202833-1-henryburns@google.com
      Signed-off-by: Henry Burns <henryburns@google.com>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Jonathan Adams <jwadams@google.com>
      Cc: Henry Burns <henrywolfeburns@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d776aaa9
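The commit message above describes the race but not how it is closed. One generic way to close this kind of race is for teardown to announce itself and then wait until no migration is in flight. The sketch below illustrates that pattern only; it is not the z3fold fix. `struct model_pool` and all `model_*` functions are hypothetical, and C11 atomics stand in for whatever synchronisation the kernel code actually uses.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical pool state, reduced to what the race needs. */
struct model_pool {
	atomic_bool destroying; /* set once teardown has begun */
	atomic_int inflight;    /* migrations currently using the pool */
};

/*
 * Migration side: register first, then check for teardown. Returns true
 * if it is safe to queue work; on false the caller must not touch the
 * workqueue.
 */
bool model_migration_enter(struct model_pool *pool)
{
	atomic_fetch_add(&pool->inflight, 1);
	if (atomic_load(&pool->destroying)) {
		atomic_fetch_sub(&pool->inflight, 1);
		return false;
	}
	return true;
}

/* Migration side: called after the work has been queued. */
void model_migration_exit(struct model_pool *pool)
{
	atomic_fetch_sub(&pool->inflight, 1);
}

/*
 * Destruction side: announce teardown, then report whether the
 * workqueue may be destroyed right now. A real implementation would
 * sleep until the in-flight count reaches zero instead of returning.
 */
bool model_destroy_begin(struct model_pool *pool)
{
	atomic_store(&pool->destroying, true);
	return atomic_load(&pool->inflight) == 0;
}
```

Because migration increments the counter before it reads the flag, and destruction sets the flag before it reads the counter, at least one side always sees the other. Either migration backs off, or destruction sees a non-zero count and has to wait.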
    • Linus Torvalds's avatar
      Merge tag 'gpio-v5.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio · 083f0f2c
      Linus Torvalds authored
      Pull GPIO fixes from Linus Walleij:
       "Here is a (hopefully last) set of GPIO fixes for the v5.3 kernel
        cycle. Two are pretty core:
      
         - Fix not reporting open drain/source lines to userspace as "input"
      
         - Fix a minor build error found in randconfigs
      
         - Fix a chip select quirk on the Freescale SPI
      
         - Fix the irqchip initialization semantic order to reflect what it
           was using the old API"
      
      * tag 'gpio-v5.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
        gpio: Fix irqchip initialization order
        gpio: of: fix Freescale SPI CS quirk handling
        gpio: Fix build error of function redefinition
        gpiolib: never report open-drain/source lines as 'input' to user-space
      083f0f2c
    • Linus Torvalds's avatar
      Merge tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux · 36146921
      Linus Torvalds authored
      Pull Hyper-V fixes from Sasha Levin:
      
       - Fix for panics and network failures on PAE guests by Dexuan Cui.
      
       - Fix of a memory leak (and related cleanups) in the hyper-v keyboard
         driver by Dexuan Cui.
      
       - Code cleanups for hyper-v clocksource driver during the merge window
         by Dexuan Cui.
      
       - Fix for a false positive warning in the userspace hyper-v KVP store
         by Vitaly Kuznetsov.
      
      * tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
        Drivers: hv: vmbus: Fix virt_to_hvpfn() for X86_PAE
        Tools: hv: kvp: eliminate 'may be used uninitialized' warning
        Input: hyperv-keyboard: Use in-place iterator API in the channel callback
        Drivers: hv: vmbus: Remove the unused "tsc_page" from struct hv_context
      36146921
    • Linus Torvalds's avatar
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 0a022ecc
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "Two KVM/arm fixes for MMIO emulation and UBSAN.
      
        Unusually, we're routing them via the arm64 tree as per Paolo's
        request on the list:
      
          https://lore.kernel.org/kvm/21ae69a2-2546-29d0-bff6-2ea825e3d968@redhat.com/
      
        We don't actually have any other arm64 fixes pending at the moment
        (touch wood), so I've pulled from Marc, written a merge commit, tagged
        the result and run it through my build/boot/bisect scripts"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        KVM: arm/arm64: VGIC: Properly initialise private IRQ affinity
        KVM: arm/arm64: Only skip MMIO insn once
      0a022ecc
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 17d0fbf4
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Four fixes, three for edge conditions which don't occur very often.
        The lpfc fix mitigates memory exhaustion for some high CPU systems"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: lpfc: Mitigate high memory pre-allocation by SCSI-MQ
        scsi: ufs: Fix NULL pointer dereference in ufshcd_config_vreg_hpm()
        scsi: target: tcmu: avoid use-after-free after command timeout
        scsi: qla2xxx: Fix gnl.l memory leak on adapter init failure
      17d0fbf4
    • Linus Torvalds's avatar
      Merge tag 'xfs-5.3-fixes-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 8942230a
      Linus Torvalds authored
      Pull xfs fix from Darrick Wong:
       "A single patch that fixes a xfs lockup problem when a chown/chgrp
        operation fails due to running out of quota. It has survived the usual
        xfstests runs and merges cleanly with this morning's master:
      
         - Fix a forgotten inode unlock when chown/chgrp fail due to quota"
      
      * tag 'xfs-5.3-fixes-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: fix missing ILOCK unlock when xfs_setattr_nonsize fails due to EDQUOT
      8942230a
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2019-08-24' of git://anongit.freedesktop.org/drm/drm · bc67b17e
      Linus Torvalds authored
      Pull more drm fixes from Dave Airlie:
       "Although the tree built for me fine on arm here, it appears either
        header cleanups in next or some kconfig combo it breaks, so this
        contains a fix to mediatek to include dma-mapping.h explicitly.
      
        There was also one nouveau fix that came in late that I was going to
        leave until next week, but since I was sending this I thought it may
        as well be in here:
      
        mediatek:
         - fix build in some cases
      
        nouveau:
         - fix hang with i2c and mst docks"
      
      * tag 'drm-fixes-2019-08-24' of git://anongit.freedesktop.org/drm/drm:
        drm/mediatek: include dma-mapping header
        drm/nouveau: Don't retry infinitely when receiving no data on i2c over AUX
      bc67b17e
  5. Aug 24, 2019
    • Will Deacon's avatar
      Merge tag 'kvmarm-fixes-for-5.3-3' of... · 087eeea9
      Will Deacon authored
      Merge tag 'kvmarm-fixes-for-5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm/fixes
      
      Pull KVM/arm fixes from Marc Zyngier as per Paolo's request at:
      
        https://lkml.kernel.org/r/21ae69a2-2546-29d0-bff6-2ea825e3d968@redhat.com
      
        "One (hopefully last) set of fixes for KVM/arm for 5.3: an embarrassing
         MMIO emulation regression, and a UBSAN splat. Oh well...
      
         - Don't overskip instructions on MMIO emulation
      
         - Fix UBSAN splat when initializing PPI priorities"
      
      * tag 'kvmarm-fixes-for-5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm:
        KVM: arm/arm64: VGIC: Properly initialise private IRQ affinity
        KVM: arm/arm64: Only skip MMIO insn once
      087eeea9