  1. Jan 20, 2022
    • proc/vmcore: don't fake reading zeroes on surprise vmcore_cb unregistration · 25bc5b0d
      David Hildenbrand authored
      In commit cc5f2704 ("proc/vmcore: convert oldmem_pfn_is_ram callback
      to more generic vmcore callbacks"), we added detection of surprise
      vmcore_cb unregistration after the vmcore was already opened.  Once
      detected, we warn the user and simulate reading zeroes from that point
      on when accessing the vmcore.
      
      The basic reason was that unexpected unregistration, for example, by
      manually unbinding a driver from a device after opening the vmcore, is
      not supported and could result in reading oldmem the vmcore_cb would
      have actually prohibited while registered.  However, something like that
      can similarly be triggered by a user that's really looking for trouble
      simply by unbinding the relevant driver before opening the vmcore -- or
      by disallowing loading the driver in the first place.  So it's actually
      of limited help.
      
      Currently, unregistration can only be triggered via virtio-mem when
      manually unbinding the driver from the device inside the VM; there is no
      way to trigger it from the hypervisor, as hypervisors don't allow for
      unplugging virtio-mem devices -- ripping out system RAM from a VM
      without coordination with the guest is usually not a good idea.
      
      The important part is that unbinding the driver and unregistering the
      vmcore_cb while concurrently reading the vmcore won't crash the system,
      and that is handled by the rwsem.
      
      To make the mechanism more future proof, let's remove the "read zero"
      part, but leave the warning in place.  For example, we could have a
      future driver (like virtio-balloon) that will contact the hypervisor to
      figure out if we already populated a page for a given PFN.
      Hotunplugging such a device and consequently unregistering the vmcore_cb
      could be triggered from the hypervisor without harming the system even
      while kdump is running.  In that case, we don't want to silently end up
      with a vmcore that contains wrong data, because the user inside the VM
      might be unaware of the hypervisor action and might easily miss the
      warning in the log.
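
      The resulting behaviour can be sketched as a small userspace model.
      This is only an illustration under stated assumptions: every model_*
      name is hypothetical and does not match fs/proc/vmcore.c, and the rwsem
      that protects the real callback list is left out for brevity.  After a
      surprise unregistration the model warns once and keeps returning the
      real contents instead of zeroes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace model; names do not match fs/proc/vmcore.c. */
typedef bool (*model_pfn_is_ram_fn)(unsigned long pfn);

static model_pfn_is_ram_fn registered_cb; /* NULL when nothing is registered */
static bool dump_opened;                  /* set once a reader opened the dump */
static bool surprise_unregistered;        /* callback vanished after open */

static void model_register_cb(model_pfn_is_ram_fn cb)
{
	registered_cb = cb;
}

static void model_open(void)
{
	dump_opened = true;
}

/* Unregistering after the dump was opened only warns; it no longer
 * switches later reads to "return zeroes". */
static void model_unregister_cb(void)
{
	registered_cb = NULL;
	if (dump_opened && !surprise_unregistered) {
		surprise_unregistered = true;
		fprintf(stderr, "warning: unexpected callback unregistration\n");
	}
}

/* Copy old memory unless a currently registered callback vetoes the PFN. */
static void model_read_page(unsigned long pfn, const char *oldmem,
			    char *buf, size_t len)
{
	assert(oldmem && buf);
	if (registered_cb && !registered_cb(pfn))
		memset(buf, 0, len);      /* callback says: not RAM */
	else
		memcpy(buf, oldmem, len); /* no callback: read the real data */
}

/* Example callback that vetoes every PFN. */
static bool model_deny_all(unsigned long pfn)
{
	(void)pfn;
	return false;
}
```

      In this model, a read issued while model_deny_all() is registered
      yields zeroes, and the same read after model_unregister_cb() yields the
      source bytes plus a single warning on stderr.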
      
      Link: https://lkml.kernel.org/r/20211111192243.22002-1-david@redhat.com
      
      
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Acked-by: Baoquan He <bhe@redhat.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Philipp Rudo <prudo@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: percpu: add generic pcpu_populate_pte() function · 20c03576
      Kefeng Wang authored
      With NEED_PER_CPU_PAGE_FIRST_CHUNK enabled, we need a function to
      populate the PTE.  This patch adds a generic function,
      pcpu_populate_pte(), which is marked __weak and used on most
      architectures; x86 overrides it with its own implementation.
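
      As a reminder of how a weak default works, here is a minimal sketch
      using the GCC/Clang attribute that the kernel's __weak macro expands
      to.  demo_populate_pte() and its return convention are hypothetical
      and are not the kernel's pcpu_populate_pte().

```c
#include <assert.h>

/* Hypothetical generic default.  The linker uses this definition unless
 * another object file provides a non-weak function with the same name. */
__attribute__((weak)) int demo_populate_pte(unsigned long addr)
{
	/* Sketch assumes 4 KiB-aligned addresses. */
	assert((addr & 0xfffUL) == 0);
	return 0;	/* 0 = handled by the generic path */
}
```

      An architecture that needs different behaviour would define a
      non-weak demo_populate_pte() in its own source file; with no such
      definition linked in, calls resolve to the generic version above.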
      
      Link: https://lkml.kernel.org/r/20211216112359.103822-5-wangkefeng.wang@huawei.com
      
      
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: percpu: add generic pcpu_fc_alloc/free function · 23f91716
      Kefeng Wang authored
      With the previous patch, we can add generic pcpu first chunk
      allocate and free functions to clean up the duplicated definitions on
      each architecture.
      
      Link: https://lkml.kernel.org/r/20211216112359.103822-4-wangkefeng.wang@huawei.com
      
      
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: percpu: add pcpu_fc_cpu_to_node_fn_t typedef · 1ca3fb3a
      Kefeng Wang authored
      Add pcpu_fc_cpu_to_node_fn_t and pass it into pcpu_fc_alloc_fn_t.  The
      pcpu first chunk allocation calls it to allocate memblock memory on
      the corresponding node.  This prepares for the next patch.
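
      The idea of handing the allocator a CPU-to-node callback can be
      sketched in userspace C.  All demo_* names are hypothetical, malloc()
      stands in for the memblock allocation, and the "-1 means no NUMA
      information" convention is an assumption of this sketch only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical callback type: map a CPU number to its memory node. */
typedef int (*demo_cpu_to_node_fn)(int cpu);

struct demo_alloc {
	int node;	/* node the memory was requested on, -1 if unknown */
	void *mem;
};

/* Allocate one unit for @cpu, asking the callback which node to use.
 * A NULL callback is treated as "no NUMA information available". */
static struct demo_alloc demo_fc_alloc(int cpu, size_t size,
					demo_cpu_to_node_fn cpu_to_nd_fn)
{
	struct demo_alloc a;

	assert(size > 0);
	a.node = cpu_to_nd_fn ? cpu_to_nd_fn(cpu) : -1;
	a.mem = malloc(size);	/* a kernel would allocate on a.node here */
	return a;
}

/* Example topology: two CPUs per node. */
static int demo_two_cpus_per_node(int cpu)
{
	return cpu / 2;
}
```

      With demo_two_cpus_per_node() as the callback, a request for CPU 5 is
      attributed to node 2; passing NULL falls back to node -1.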
      
      Link: https://lkml.kernel.org/r/20211216112359.103822-3-wangkefeng.wang@huawei.com
      
      
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: percpu: generalize percpu related config · 7ecd19cf
      Kefeng Wang authored
      Patch series "mm: percpu: Cleanup percpu first chunk function".
      
      When supporting the page-mapping percpu first chunk allocator on arm64,
      we found a lot of duplicated code in the percpu embed/page first chunk
      allocators.  This patchset aims to clean it up and should introduce no
      functional change.
      
      The current support status of 'embed' and 'page' across architectures
      is shown below:
      
      	embed: NEED_PER_CPU_EMBED_FIRST_CHUNK
      	page:  NEED_PER_CPU_PAGE_FIRST_CHUNK
      
      		embed	page
      	------------------------
      	arm64	  Y	 Y
      	mips	  Y	 N
      	powerpc	  Y	 Y
      	riscv	  Y	 N
      	sparc	  Y	 Y
      	x86	  Y	 Y
      	------------------------
      
      The percpu first chunk allocator has two interfaces:
      
       extern int __init pcpu_embed_first_chunk(size_t reserved_size, size_t dyn_size,
                                      size_t atom_size,
                                      pcpu_fc_cpu_distance_fn_t cpu_distance_fn,
      -                               pcpu_fc_alloc_fn_t alloc_fn,
      -                               pcpu_fc_free_fn_t free_fn);
      +                               pcpu_fc_cpu_to_node_fn_t cpu_to_nd_fn);
      
       extern int __init pcpu_page_first_chunk(size_t reserved_size,
      -                               pcpu_fc_alloc_fn_t alloc_fn,
      -                               pcpu_fc_free_fn_t free_fn,
      -                               pcpu_fc_populate_pte_fn_t populate_pte_fn);
      +                               pcpu_fc_cpu_to_node_fn_t cpu_to_nd_fn);
      
      pcpu_fc_alloc_fn_t and pcpu_fc_free_fn_t are removed; we provide generic
      pcpu_fc_alloc() and pcpu_fc_free() functions, which are called from
      pcpu_embed/page_first_chunk().
      
      1) For pcpu_embed_first_chunk(), a pcpu_fc_cpu_to_node_fn_t must be
         provided when the arch supports NUMA.
      
      2) For pcpu_page_first_chunk(), pcpu_fc_populate_pte_fn_t is removed too;
         a generic pcpu_populate_pte() marked '__weak' is provided.  If an arch
         (like x86) needs a different function to populate the PTE, it should
         provide its own implementation.
      
      [1] https://github.com/kevin78/linux.git percpu-cleanup
      
      This patch (of 4):
      
      The HAVE_SETUP_PER_CPU_AREA/NEED_PER_CPU_EMBED_FIRST_CHUNK/
      NEED_PER_CPU_PAGE_FIRST_CHUNK/USE_PERCPU_NUMA_NODE_ID configs have
      duplicate definitions on the platforms that use them.
      
      Move them into mm, drop the redundant definitions and instead just
      select them on applicable platforms.
      
      Link: https://lkml.kernel.org/r/20211216112359.103822-1-wangkefeng.wang@huawei.com
      Link: https://lkml.kernel.org/r/20211216112359.103822-2-wangkefeng.wang@huawei.com
      
      
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>	[arm64]
      Cc: Will Deacon <will@kernel.org>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. Jan 10, 2022
  3. Jan 09, 2022
  4. Jan 08, 2022
  5. Jan 07, 2022
  6. Jan 06, 2022
    • i2c: mpc: Avoid out of bounds memory access · 72a4a87d
      Chris Packham authored
      When performing an I2C transfer where the last message was a write,
      KASAN would complain:
      
        BUG: KASAN: slab-out-of-bounds in mpc_i2c_do_action+0x154/0x630
        Read of size 2 at addr c814e310 by task swapper/2/0
      
        CPU: 2 PID: 0 Comm: swapper/2 Tainted: G    B             5.16.0-rc8 #1
        Call Trace:
        [e5ee9d50] [c08418e8] dump_stack_lvl+0x4c/0x6c (unreliable)
        [e5ee9d70] [c02f8a14] print_address_description.constprop.13+0x64/0x3b0
        [e5ee9da0] [c02f9030] kasan_report+0x1f0/0x204
        [e5ee9de0] [c0c76ee4] mpc_i2c_do_action+0x154/0x630
        [e5ee9e30] [c0c782c4] mpc_i2c_isr+0x164/0x240
        [e5ee9e60] [c00f3a04] __handle_irq_event_percpu+0xf4/0x3b0
        [e5ee9ec0] [c00f3d40] handle_irq_event_percpu+0x80/0x110
        [e5ee9f40] [c00f3e48] handle_irq_event+0x78/0xd0
        [e5ee9f60] [c00fcfec] handle_fasteoi_irq+0x19c/0x370
        [e5ee9fa0] [c00f1d84] generic_handle_irq+0x54/0x80
        [e5ee9fc0] [c0006b54] __do_irq+0x64/0x200
        [e5ee9ff0] [c0007958] __do_IRQ+0xe8/0x1c0
        [c812dd50] [e3eaab20] 0xe3eaab20
        [c812dd90] [c0007a4c] do_IRQ+0x1c/0x30
        [c812dda0] [c0000c04] ExternalInput+0x144/0x160
        --- interrupt: 500 at arch_cpu_idle+0x34/0x60
        NIP:  c000b684 LR: c000b684 CTR: c0019688
        REGS: c812ddb0 TRAP: 0500   Tainted: G    B              (5.16.0-rc8)
        MSR:  00029002 <CE,EE,ME>  CR: 22000488  XER: 20000000
      
        GPR00: c10ef7fc c812de90 c80ff200 c2394718 00000001 00000001 c10e3f90 00000003
        GPR08: 00000000 c0019688 c2394718 fc7d625b 22000484 00000000 21e17000 c208228c
        GPR16: e3e99284 00000000 ffffffff c2390000 c001bac0 c2082288 c812df60 c001ba60
        GPR24: c23949c0 00000018 00080000 00000004 c80ff200 00000002 c2348ee4 c2394718
        NIP [c000b684] arch_cpu_idle+0x34/0x60
        LR [c000b684] arch_cpu_idle+0x34/0x60
        --- interrupt: 500
        [c812de90] [c10e3f90] rcu_eqs_enter.isra.60+0xc0/0x110 (unreliable)
        [c812deb0] [c10ef7fc] default_idle_call+0xbc/0x230
        [c812dee0] [c00af0e8] do_idle+0x1c8/0x200
        [c812df10] [c00af3c0] cpu_startup_entry+0x20/0x30
        [c812df20] [c001e010] start_secondary+0x5d0/0xba0
        [c812dff0] [c00028a0] __secondary_start+0x90/0xdc
      
      This happened because we would overrun the i2c->msgs array on the final
      interrupt for the I2C STOP. This didn't happen if the last message was a
      read because there is no interrupt in that case. Ensure that we only
      access the current message if we are not processing an I2C STOP
      condition.
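
      The shape of such a guard can be sketched as follows.  This is a
      simplified, hypothetical model: the demo_* structures and field names
      are assumptions for illustration and do not mirror the i2c-mpc driver.
      The point is only that the message array is not indexed while the
      STOP interrupt is being handled, when the message index has already
      moved past the last entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified message and transfer state. */
struct demo_msg {
	unsigned short len;
	bool is_read;
};

struct demo_xfer {
	const struct demo_msg *msgs;
	size_t num_msgs;
	size_t curr_msg;	/* equals num_msgs once all messages are done */
	bool stopping;		/* current interrupt belongs to the STOP */
};

/* Return the current message length, or 0 while handling the STOP,
 * so msgs[] is never read one element past its end. */
static unsigned short demo_current_len(const struct demo_xfer *x)
{
	const struct demo_msg *msg = NULL;

	if (!x->stopping) {
		assert(x->curr_msg < x->num_msgs);
		msg = &x->msgs[x->curr_msg];
	}
	return msg ? msg->len : 0;
}
```

      For a one-message transfer, the helper returns that message's length
      while it is in flight and 0 once curr_msg has advanced to num_msgs
      and the STOP is being processed.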
      
      Fixes: 1538d82f ("i2c: mpc: Interrupt driven transfer")
      Reported-by: Maxime Bizon <mbizon@freebox.fr>
      Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
      Signed-off-by: Wolfram Sang <wsa@kernel.org>
    • Merge tag 'socfpga_fix_for_v5.16_part_3' of... · 8922bb65
      Olof Johansson authored
      Merge tag 'socfpga_fix_for_v5.16_part_3' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux into arm/fixes
      
      SoCFPGA dts updates for v5.16, part 3
      - Change the SoCFPGA compatible to "intel,socfpga-qspi"
      - Update dt-bindings document to include "intel,socfpga-qspi"
      
      * tag 'socfpga_fix_for_v5.16_part_3' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux: (361 commits)
        ARM: dts: socfpga: change qspi to "intel,socfpga-qspi"
        dt-bindings: spi: cadence-quadspi: document "intel,socfpga-qspi"
        Linux 5.16-rc7
        mm/hwpoison: clear MF_COUNT_INCREASED before retrying get_any_page()
        mm/damon/dbgfs: protect targets destructions with kdamond_lock
        mm/page_alloc: fix __alloc_size attribute for alloc_pages_exact_nid
        mm: delete unsafe BUG from page_cache_add_speculative()
        mm, hwpoison: fix condition in free hugetlb page path
        MAINTAINERS: mark more list instances as moderated
        kernel/crash_core: suppress unknown crashkernel parameter warning
        mm: mempolicy: fix THP allocations escaping mempolicy restrictions
        kfence: fix memory leak when cat kfence objects
        platform/x86: intel_pmc_core: fix memleak on registration failure
        net: stmmac: dwmac-visconti: Fix value of ETHER_CLK_SEL_FREQ_SEL_2P5M
        r8152: sync ocp base
        r8152: fix the force speed doesn't work for RTL8156
        net: bridge: fix ioctl old_deviceless bridge argument
        net: stmmac: ptp: fix potentially overflowing expression
        net: dsa: tag_ocelot: use traffic class to map priority on injected header
        veth: ensure skb entering GRO are not cloned.
        ...
      
      Link: https://lore.kernel.org/r/20211227103644.566694-1-dinguyen@kernel.org
      
      
      Signed-off-by: Olof Johansson <olof@lixom.net>
    • Merge tag 'reset-fixes-for-v5.16-2' of git://git.pengutronix.de/pza/linux into arm/fixes · fde9ec3c
      Olof Johansson authored
      Reset controller fixes for v5.16, part 2
      
      Fix pm_runtime_resume_and_get() error handling in the
      reset-rzg2l-usbphy-ctrl driver.
      
      * tag 'reset-fixes-for-v5.16-2' of git://git.pengutronix.de/pza/linux:
        reset: renesas: Fix Runtime PM usage
        reset: tegra-bpmp: Revert Handle errors in BPMP response
      
      Link: https://lore.kernel.org/r/20220105172515.273947-1-p.zabel@pengutronix.de
      
      
      Signed-off-by: Olof Johansson <olof@lixom.net>
    • tracing: Tag trace_percpu_buffer as a percpu pointer · f28439db
      Naveen N. Rao authored
      Tag trace_percpu_buffer as a percpu pointer to resolve warnings
      reported by sparse:
        /linux/kernel/trace/trace.c:3218:46: warning: incorrect type in initializer (different address spaces)
        /linux/kernel/trace/trace.c:3218:46:    expected void const [noderef] __percpu *__vpp_verify
        /linux/kernel/trace/trace.c:3218:46:    got struct trace_buffer_struct *
        /linux/kernel/trace/trace.c:3234:9: warning: incorrect type in initializer (different address spaces)
        /linux/kernel/trace/trace.c:3234:9:    expected void const [noderef] __percpu *__vpp_verify
        /linux/kernel/trace/trace.c:3234:9:    got int *
      
      Link: https://lkml.kernel.org/r/ebabd3f23101d89cb75671b68b6f819f5edc830b.1640255304.git.naveen.n.rao@linux.vnet.ibm.com
      
      
      
      Cc: stable@vger.kernel.org
      Reported-by: kernel test robot <lkp@intel.com>
      Fixes: 07d777fe ("tracing: Add percpu buffers for trace_printk()")
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Fix check for trace_percpu_buffer validity in get_trace_buf() · 823e670f
      Naveen N. Rao authored
      With the new osnoise tracer, we are seeing the below splat:
          Kernel attempted to read user page (c7d880000) - exploit attempt? (uid: 0)
          BUG: Unable to handle kernel data access on read at 0xc7d880000
          Faulting instruction address: 0xc0000000002ffa10
          Oops: Kernel access of bad area, sig: 11 [#1]
          LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
          ...
          NIP [c0000000002ffa10] __trace_array_vprintk.part.0+0x70/0x2f0
          LR [c0000000002ff9fc] __trace_array_vprintk.part.0+0x5c/0x2f0
          Call Trace:
          [c0000008bdd73b80] [c0000000001c49cc] put_prev_task_fair+0x3c/0x60 (unreliable)
          [c0000008bdd73be0] [c000000000301430] trace_array_printk_buf+0x70/0x90
          [c0000008bdd73c00] [c0000000003178b0] trace_sched_switch_callback+0x250/0x290
          [c0000008bdd73c90] [c000000000e70d60] __schedule+0x410/0x710
          [c0000008bdd73d40] [c000000000e710c0] schedule+0x60/0x130
          [c0000008bdd73d70] [c000000000030614] interrupt_exit_user_prepare_main+0x264/0x270
          [c0000008bdd73de0] [c000000000030a70] syscall_exit_prepare+0x150/0x180
          [c0000008bdd73e10] [c00000000000c174] system_call_vectored_common+0xf4/0x278
      
      osnoise tracer on ppc64le is triggering osnoise_taint() for negative
      duration in get_int_safe_duration() called from
      trace_sched_switch_callback()->thread_exit().
      
      The problem though is that the check for a valid trace_percpu_buffer is
      incorrect in get_trace_buf(). The check is being done after calculating
      the pointer for the current cpu, rather than on the main percpu pointer.
      Fix the check to be against trace_percpu_buffer.
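
      Why the check has to be made on the base pointer can be sketched in
      userspace C.  The demo_* names, the array standing in for per-CPU
      storage, and the nesting limit of 4 are assumptions of this sketch,
      not the code in kernel/trace/trace.c.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4
#define DEMO_MAX_NESTING 4	/* illustrative limit */

struct demo_trace_buf {
	int nesting;
	char data[64];
};

/* Base "percpu" pointer; stays NULL until the buffers are allocated. */
static struct demo_trace_buf *demo_percpu_base;

static char *demo_get_trace_buf(int cpu)
{
	struct demo_trace_buf *buf;

	/* Test the base pointer first.  A per-CPU address derived from a
	 * NULL base is base + offset, which is not NULL, so testing the
	 * derived pointer would not detect missing buffers. */
	if (!demo_percpu_base)
		return NULL;

	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	buf = &demo_percpu_base[cpu];
	if (buf->nesting >= DEMO_MAX_NESTING)
		return NULL;

	buf->nesting++;
	return buf->data;
}
```

      Before the base pointer is set, demo_get_trace_buf() returns NULL for
      every CPU; once it points at an array of DEMO_NR_CPUS buffers, each
      call hands out that CPU's data area and bumps its nesting count.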
      
      Link: https://lkml.kernel.org/r/a920e4272e0b0635cf20c444707cbce1b2c8973d.1640255304.git.naveen.n.rao@linux.vnet.ibm.com
      
      Cc: stable@vger.kernel.org
      Fixes: e2ace001 ("tracing: Choose static tp_printk buffer by explicit nesting count")
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>