  1. Feb 10, 2021
    • libnvdimm/namespace: Fix visibility of namespace resource attribute · a2560f88
      Dan Williams authored
      commit 13f445d6 upstream.
      
      Legacy pmem namespaces lost support for the "resource" attribute when
      the code was cleaned up to put the permission visibility in the
      declaration. Restore this by listing 'resource' in the default
      attributes.
      
      A new ndctl regression test for pfn_to_online_page() corner cases builds
      on this fix.
      
      Fixes: bfd2e914 ("libnvdimm: Simplify root read-only definition for the 'resource' attribute")
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/161052334995.1805594.12054873528154362921.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tracepoint: Fix race between tracing and removing tracepoint · 059e68da
      Alexey Kardashevskiy authored
      commit c8b186a8 upstream.
      
      When executing a tracepoint, the tracepoint's func is dereferenced twice -
      in __DO_TRACE() (where the returned pointer is checked) and later on in
      __traceiter_##_name where the returned pointer is dereferenced without
      checking which leads to races against tracepoint_removal_sync() and
      crashes.
      
      This adds a check before referencing the pointer in tracepoint_ptr_deref.
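
The idea of re-checking a pointer at the place where it is actually used can be sketched in plain C. This is a minimal user-space illustration, not the kernel code: `struct probe_entry`, `count_hit()` and `run_probes()` are hypothetical names, and the sketch does not model the concurrency itself, only the defensive check in the iterator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of a tracepoint's probe array. */
struct probe_entry {
	void (*func)(void *data, int *hits);
	void *data;
};

/* Example probe: counts how often it was called. */
void count_hit(void *data, int *hits)
{
	(void)data;
	assert(hits != NULL);
	++*hits;
}

/*
 * Call every probe in a NULL-terminated array and return how many ran.
 * The array pointer is checked here as well, because a caller's earlier
 * check may be stale by the time the iterator runs.
 */
int run_probes(const struct probe_entry *table)
{
	int hits = 0;

	if (table == NULL)
		return 0;
	for (; table->func != NULL; table++)
		table->func(table->data, &hits);
	return hits;
}
```

With the check in place, `run_probes(NULL)` simply reports that no probe ran instead of dereferencing a NULL table.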
      
      Link: https://lkml.kernel.org/r/20210202072326.120557-1-aik@ozlabs.ru
      
      Cc: stable@vger.kernel.org
      Fixes: d25e37d8 ("tracepoint: Optimize using static_call()")
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tracing: Use pause-on-trace with the latency tracers · 9e4a668f
      Viktor Rosendahl authored
      commit da7f84cd upstream.
      
      Earlier, tracing was disabled when reading the trace file. This behavior
      was changed with:
      
      commit 06e0a548 ("tracing: Do not disable tracing when reading the
      trace file").
      
      This doesn't seem to work with the latency tracers.
      
      The above mentioned commit did not only change the behavior but also added
      an option to emulate the old behavior. The idea with this patch is to
      enable this pause-on-trace option when the latency tracers are used.
      
      Link: https://lkml.kernel.org/r/20210119164344.37500-2-Viktor.Rosendahl@bmw.de
      
      Cc: stable@vger.kernel.org
      Fixes: 06e0a548 ("tracing: Do not disable tracing when reading the trace file")
      Signed-off-by: Viktor Rosendahl <Viktor.Rosendahl@bmw.de>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kretprobe: Avoid re-registration of the same kretprobe earlier · 8ce84b8e
      Wang ShaoBo authored
      commit 0188b878 upstream.
      
      Our system encountered a re-init error when re-registering the same
      kretprobe: the kretprobe_instance in rp->free_instances was illegally
      accessed after re-init.

      A check that avoids re-registration was introduced for kprobes earlier,
      but register_kretprobe() lacks it. We must check whether the kprobe has
      already been registered before re-initializing the kretprobe; otherwise
      the re-init destroys the data structures of the registered kretprobe,
      which can lead to memory leaks, system crashes, and other unexpected
      behavior.

      Use check_kprobe_rereg() to check whether the kprobe has already been
      registered before running the body of register_kretprobe(), so that a
      warning is emitted and the registration process is terminated.
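
The ordering described above (reject a duplicate before any re-initialization happens) can be sketched in user-space C. Everything here is hypothetical scaffolding: `struct probe`, `probe_is_registered()` and `register_probe()` are illustrative names, and the choice of `-EINVAL` as the rejection code is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_PROBES 8

/* Hypothetical probe object; nr_instances stands in for per-probe state. */
struct probe {
	const char *symbol;
	int nr_instances;
};

static struct probe *registered[MAX_PROBES];

/* Return 1 if this exact probe object is already in the registry. */
int probe_is_registered(const struct probe *p)
{
	for (size_t i = 0; i < MAX_PROBES; i++)
		if (registered[i] == p)
			return 1;
	return 0;
}

/*
 * Register a probe. The duplicate check runs first, so a second call
 * never reaches the (re)initialization of state that is still in use.
 */
int register_probe(struct probe *p)
{
	assert(p != NULL);
	if (probe_is_registered(p))
		return -EINVAL;

	p->nr_instances = 0;	/* initialization is only safe past the check */
	for (size_t i = 0; i < MAX_PROBES; i++) {
		if (registered[i] == NULL) {
			registered[i] = p;
			return 0;
		}
	}
	return -ENOMEM;
}
```

A second `register_probe()` call on the same object fails and leaves `nr_instances` untouched, which is the property the fix is after.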
      
      Link: https://lkml.kernel.org/r/20210128124427.2031088-1-bobo.shaobowang@huawei.com
      
      Cc: stable@vger.kernel.org
      Fixes: 1f0ab409 ("kprobes: Prevent re-registration of the same kprobe")
      [ The above commit should have been done for kretprobes too ]
      Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Acked-by: Ananth N Mavinakayanahalli <ananth@linux.ibm.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
      Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tracing/kprobe: Fix to support kretprobe events on unloaded modules · fb03f14c
      Masami Hiramatsu authored
      commit 97c753e6 upstream.
      
      Fix kprobe_on_func_entry() to return an error code instead of false so that
      register_kretprobe() can return an appropriate error code.

      append_trace_kprobe() expects kprobe registration to return -ENOENT
      when the target symbol is not found, and it then checks whether the target
      module is unloaded or not. If the target module doesn't exist, it
      defers probing the target symbol until the module is loaded.

      However, since register_kretprobe() returns -EINVAL instead of -ENOENT
      in that case, it always fails to put the kretprobe event on unloaded
      modules. e.g.
      
      Kprobe event:
      /sys/kernel/debug/tracing # echo p xfs:xfs_end_io >> kprobe_events
      [   16.515574] trace_kprobe: This probe might be able to register after target module is loaded. Continue.
      
      Kretprobe event: (p -> r)
      /sys/kernel/debug/tracing # echo r xfs:xfs_end_io >> kprobe_events
      sh: write error: Invalid argument
      /sys/kernel/debug/tracing # cat error_log
      [   41.122514] trace_kprobe: error: Failed to register probe event
        Command: r xfs:xfs_end_io
                   ^
      
      To fix this bug, change kprobe_on_func_entry() to detect symbol lookup
      failure and return -ENOENT in that case. Otherwise it returns -EINVAL
      or 0 (succeeded, given address is on the entry).
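
The three-way result described above (0, -ENOENT, -EINVAL) can be illustrated with a small user-space sketch. The symbol table, `on_func_entry()` and `should_defer()` are hypothetical stand-ins and do not reproduce the kernel's symbol lookup.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical symbol table standing in for the kernel's symbol lookup. */
struct sym {
	const char *name;
	unsigned long addr;
};

static const struct sym symtab[] = {
	{ "vfs_read",  0x1000 },
	{ "vfs_write", 0x2000 },
};

/*
 * Return 0 if (name, offset) names a function entry, -ENOENT if the
 * symbol cannot be found at all, and -EINVAL if the symbol exists but
 * the offset does not point at its entry.
 */
int on_func_entry(const char *name, unsigned long offset)
{
	assert(name != NULL);
	for (size_t i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++)
		if (strcmp(symtab[i].name, name) == 0)
			return offset == 0 ? 0 : -EINVAL;
	return -ENOENT;
}

/* A caller can defer the probe only when the symbol is merely missing. */
int should_defer(int err)
{
	return err == -ENOENT;
}
```

Keeping "not found" distinct from "found but invalid" is what lets a caller retry later, once the module that provides the symbol has been loaded.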
      
      Link: https://lkml.kernel.org/r/161176187132.1067016.8118042342894378981.stgit@devnote2
      
      Cc: stable@vger.kernel.org
      Fixes: 59158ec4 ("tracing/kprobes: Check the probe on unloaded module correctly")
      Reported-by: Jianlin Lv <Jianlin.Lv@arm.com>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fgraph: Initialize tracing_graph_pause at task creation · 43b5bdbf
      Steven Rostedt (VMware) authored
      commit 7e0a9220 upstream.
      
      On some archs, the idle task can call into cpu_suspend(). The cpu_suspend()
      will disable or pause function graph tracing, as there are some paths in
      bringing down the CPU that can have issues with its return address being
      modified. The task_struct structure has a "tracing_graph_pause" atomic
      counter; when it is set to something other than zero, the function graph
      tracer will not modify the return address.
      
      The problem is that the tracing_graph_pause counter is initialized when the
      function graph tracer is enabled. This can corrupt the counter for the idle
      task if it is suspended in these architectures.
      
         CPU 1				CPU 2
         -----				-----
        do_idle()
          cpu_suspend()
            pause_graph_tracing()
                task_struct->tracing_graph_pause++ (0 -> 1)
      
      				start_graph_tracing()
      				  for_each_online_cpu(cpu) {
      				    ftrace_graph_init_idle_task(cpu)
      				      task_struct->tracing_graph_pause = 0 (1 -> 0)
      
            unpause_graph_tracing()
                task_struct->tracing_graph_pause-- (0 -> -1)
      
      The above should have gone from 1 to zero, and enabled function graph
      tracing again. But instead, it is set to -1, which keeps it disabled.
      
      There's no reason that the field tracing_graph_pause on the task_struct can
      not be initialized at boot up.
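
The interleaving in the diagram can be replayed sequentially in user-space C to see where the counter ends up. This is only a sketch: `struct task`, the helper names and `replay()` are hypothetical, a plain `int` stands in for the atomic counter, and the two CPUs are modelled as ordered steps in a single thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task with a pause counter; 0 means tracing is allowed. */
struct task {
	int tracing_graph_pause;
};

void pause_graph(struct task *t)
{
	assert(t != NULL);
	t->tracing_graph_pause++;
}

void unpause_graph(struct task *t)
{
	assert(t != NULL);
	t->tracing_graph_pause--;
}

/*
 * Replay the sequence from the diagram and return the final counter.
 * reset_mid_pause = 1 models initializing the counter when tracing is
 * enabled; reset_mid_pause = 0 models initializing it only at creation.
 */
int replay(int reset_mid_pause)
{
	struct task idle = { .tracing_graph_pause = 0 };	/* at creation */

	pause_graph(&idle);			/* cpu_suspend(): 0 -> 1 */
	if (reset_mid_pause)
		idle.tracing_graph_pause = 0;	/* late init: 1 -> 0 */
	unpause_graph(&idle);			/* resume */
	return idle.tracing_graph_pause;
}
```

With the late reset the counter ends at -1 and tracing stays paused; with initialization at creation only, it returns to 0.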
      
      Cc: stable@vger.kernel.org
      Fixes: 380c4b14 ("tracing/function-graph-tracer: append the tracing_graph_flag")
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211339
      Reported-by: <pierre.gondois@arm.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • gpiolib: free device name on error path to fix kmemleak · 8847a756
      Quanyang Wang authored
      commit c351bb64 upstream.
      
      In gpiochip_add_data_with_key, we should check the return value of
      dev_set_name to ensure that device name is allocated successfully
      and then add a label on the error path to free device name to fix
      kmemleak as below:
      
      unreferenced object 0xc2d6fc40 (size 64):
        comm "kworker/0:1", pid 16, jiffies 4294937425 (age 65.120s)
        hex dump (first 32 bytes):
          67 70 69 6f 63 68 69 70 30 00 1a c0 54 63 1a c0  gpiochip0...Tc..
          0c ed 84 c0 48 ed 84 c0 3c ee 84 c0 10 00 00 00  ....H...<.......
        backtrace:
          [<962810f7>] kobject_set_name_vargs+0x2c/0xa0
          [<f50797e6>] dev_set_name+0x2c/0x5c
          [<94abbca9>] gpiochip_add_data_with_key+0xfc/0xce8
          [<5c4193e0>] omap_gpio_probe+0x33c/0x68c
          [<3402f137>] platform_probe+0x58/0xb8
          [<7421e210>] really_probe+0xec/0x3b4
          [<000f8ada>] driver_probe_device+0x58/0xb4
          [<67e0f7f7>] bus_for_each_drv+0x80/0xd0
          [<4de545dc>] __device_attach+0xe8/0x15c
          [<2e4431e7>] bus_probe_device+0x84/0x8c
          [<c18b1de9>] device_add+0x384/0x7c0
          [<5aff2995>] of_platform_device_create_pdata+0x8c/0xb8
          [<061c3483>] of_platform_bus_create+0x198/0x230
          [<5ee6d42a>] of_platform_populate+0x60/0xb8
          [<2647300f>] sysc_probe+0xd18/0x135c
          [<3402f137>] platform_probe+0x58/0xb8
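
The pattern described in this commit (check the result of the name allocation, then release the name on every later failure) can be sketched with a goto-based error path in user-space C. `struct device`, `set_name()` and `add_chip()` are hypothetical stand-ins, and `malloc()`/`free()` replace the kernel allocators; the `fail_later` flag exists only to exercise the error path.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical device that owns a heap-allocated name. */
struct device {
	char *name;
};

/* Stand-in for a name setter that allocates and can therefore fail. */
int set_name(struct device *d, const char *name)
{
	size_t n;

	assert(d != NULL && name != NULL);
	n = strlen(name) + 1;
	d->name = malloc(n);
	if (d->name == NULL)
		return -ENOMEM;
	memcpy(d->name, name, n);
	return 0;
}

/*
 * Set up a chip. If a later step fails, the error path frees the name
 * so that the allocation is not leaked.
 */
int add_chip(struct device *d, const char *name, int fail_later)
{
	int ret = set_name(d, name);

	if (ret)
		return ret;
	if (fail_later) {
		ret = -EBUSY;
		goto err_free_name;
	}
	return 0;

err_free_name:
	free(d->name);
	d->name = NULL;
	return ret;
}
```

After a failed `add_chip()` the device holds no name, so a leak checker has nothing left to report for that allocation.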
      
      Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mac80211: fix station rate table updates on assoc · 2ca1ddc3
      Felix Fietkau authored
      commit 18fe0fae upstream.
      
      If the driver uses .sta_add, station entries are only uploaded after the sta
      is in assoc state. Fix early station rate table updates by deferring them
      until the sta has been uploaded.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Link: https://lore.kernel.org/r/20210201083324.3134-1-nbd@nbd.name
      [use rcu_access_pointer() instead since we won't dereference here]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ovl: implement volatile-specific fsync error behaviour · 8ccf963c
      Sargun Dhillon authored
      commit 335d3fc5 upstream.
      
      Overlayfs's volatile option allows the user to bypass all forced sync calls
      to the upperdir filesystem. This comes at the cost of safety. We can never
      ensure that the user's data is intact, but we can make a best effort to
      expose whether or not the data is likely to be in a bad state.
      
      The best way to handle this for the time being is that if an overlayfs's
      upperdir experiences an error after a volatile mount occurs, that error
      will be returned on fsync, fdatasync, sync, and syncfs. This contrasts
      with the traditional behaviour of the VFS, which fails the call
      once, and only raises an error again if a subsequent fsync error has
      occurred and been raised by the filesystem.
      
      One awkward aspect of the patch is that we have to manually set the
      superblock's errseq_t after the sync_fs callback as opposed to just
      returning an error from syncfs. This is because the call chain looks
      something like this:
      
      sys_syncfs ->
      	sync_filesystem ->
      		__sync_filesystem ->
      			/* The return value is ignored here */
      			sb->s_op->sync_fs(sb)
      			__sync_blockdev
      		/* Where the VFS fetches the error to raise to userspace */
      		errseq_check_and_advance
      
      Because of this we call errseq_set every time the sync_fs callback occurs.
      Due to the nature of this seen / unseen dichotomy, if the upperdir is in an
      inconsistent state at the initial mount time, overlayfs will refuse to
      mount, as overlayfs cannot get a snapshot of the upperdir's errseq that
      will increment on error until the user calls syncfs.
      
      Signed-off-by: Sargun Dhillon <sargun@sargun.me>
      Suggested-by: Amir Goldstein <amir73il@gmail.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Fixes: c86243b0 ("ovl: provide a mount option "volatile"")
      Cc: stable@vger.kernel.org
      Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ovl: avoid deadlock on directory ioctl · a66f82a1
      Miklos Szeredi authored
      commit b854cc65 upstream.
      
      The function ovl_dir_real_file() currently uses the inode lock to serialize
      writes to the od->upperfile field.
      
      However, this function will get called by ovl_ioctl_set_flags(), which
      utilizes the inode lock too.  In this case ovl_dir_real_file() will try to
      claim a lock that is owned by a function in its call stack, which won't get
      released before ovl_dir_real_file() returns.
      
      Fix by replacing the open coded compare and exchange by an explicit atomic
      op.
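
Publishing a pointer with an atomic compare-and-exchange instead of a lock can be sketched with C11 atomics. This is a user-space illustration under assumptions: `struct file`, `struct dir_state` and `publish_upper()` are hypothetical names, and C11 `<stdatomic.h>` stands in for the kernel's own atomic primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical file object and per-directory state. */
struct file {
	int id;
};

struct dir_state {
	_Atomic(struct file *) upperfile;
};

/*
 * Try to publish `candidate` as the upper file without taking a lock.
 * Returns the file that ended up published. If another caller won the
 * race, *loser is set to `candidate` so the caller can release it.
 */
struct file *publish_upper(struct dir_state *od, struct file *candidate,
			   struct file **loser)
{
	struct file *expected = NULL;

	assert(loser != NULL);
	*loser = NULL;
	if (atomic_compare_exchange_strong(&od->upperfile, &expected, candidate))
		return candidate;

	*loser = candidate;	/* someone else published first */
	return expected;	/* the exchange stored the current value here */
}
```

Because no lock is taken, this helper can be called from a path that already holds the lock in question without self-deadlocking.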
      
      Fixes: 61536bed ("ovl: support [S|G]ETFLAGS and FS[S|G]ETXATTR ioctls for directories")
      Cc: stable@vger.kernel.org # v5.10
      Reported-by: Icenowy Zheng <icenowy@aosc.io>
      Tested-by: Icenowy Zheng <icenowy@aosc.io>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ovl: fix dentry leak in ovl_get_redirect · fb8caef7
      Liangyan authored
      commit e04527fe upstream.
      
      We need to lock d_parent->d_lock before calling dget_dlock(); otherwise
      d_lockref may be updated in parallel, as in the call trace below, which
      leaks a dentry->d_lockref reference and risks a crash.
      
           CPU 0                                CPU 1
      ovl_set_redirect                       lookup_fast
        ovl_get_redirect                       __d_lookup
          dget_dlock
            //no lock protection here            spin_lock(&dentry->d_lock)
            dentry->d_lockref.count++            dentry->d_lockref.count++
      
      [   49.799059] PGD 800000061fed7067 P4D 800000061fed7067 PUD 61fec5067 PMD 0
      [   49.799689] Oops: 0002 [#1] SMP PTI
      [   49.800019] CPU: 2 PID: 2332 Comm: node Not tainted 4.19.24-7.20.al7.x86_64 #1
      [   49.800678] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8a46cfe 04/01/2014
      [   49.801380] RIP: 0010:_raw_spin_lock+0xc/0x20
      [   49.803470] RSP: 0018:ffffac6fc5417e98 EFLAGS: 00010246
      [   49.803949] RAX: 0000000000000000 RBX: ffff93b8da3446c0 RCX: 0000000a00000000
      [   49.804600] RDX: 0000000000000001 RSI: 000000000000000a RDI: 0000000000000088
      [   49.805252] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff993cf040
      [   49.805898] R10: ffff93b92292e580 R11: ffffd27f188a4b80 R12: 0000000000000000
      [   49.806548] R13: 00000000ffffff9c R14: 00000000fffffffe R15: ffff93b8da3446c0
      [   49.807200] FS:  00007ffbedffb700(0000) GS:ffff93b927880000(0000) knlGS:0000000000000000
      [   49.807935] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   49.808461] CR2: 0000000000000088 CR3: 00000005e3f74006 CR4: 00000000003606a0
      [   49.809113] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   49.809758] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   49.810410] Call Trace:
      [   49.810653]  d_delete+0x2c/0xb0
      [   49.810951]  vfs_rmdir+0xfd/0x120
      [   49.811264]  do_rmdir+0x14f/0x1a0
      [   49.811573]  do_syscall_64+0x5b/0x190
      [   49.811917]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [   49.812385] RIP: 0033:0x7ffbf505ffd7
      [   49.814404] RSP: 002b:00007ffbedffada8 EFLAGS: 00000297 ORIG_RAX: 0000000000000054
      [   49.815098] RAX: ffffffffffffffda RBX: 00007ffbedffb640 RCX: 00007ffbf505ffd7
      [   49.815744] RDX: 0000000004449700 RSI: 0000000000000000 RDI: 0000000006c8cd50
      [   49.816394] RBP: 00007ffbedffaea0 R08: 0000000000000000 R09: 0000000000017d0b
      [   49.817038] R10: 0000000000000000 R11: 0000000000000297 R12: 0000000000000012
      [   49.817687] R13: 00000000072823d8 R14: 00007ffbedffb700 R15: 00000000072823d8
      [   49.818338] Modules linked in: pvpanic cirrusfb button qemu_fw_cfg atkbd libps2 i8042
      [   49.819052] CR2: 0000000000000088
      [   49.819368] ---[ end trace 4e652b8aa299aa2d ]---
      [   49.819796] RIP: 0010:_raw_spin_lock+0xc/0x20
      [   49.821880] RSP: 0018:ffffac6fc5417e98 EFLAGS: 00010246
      [   49.822363] RAX: 0000000000000000 RBX: ffff93b8da3446c0 RCX: 0000000a00000000
      [   49.823008] RDX: 0000000000000001 RSI: 000000000000000a RDI: 0000000000000088
      [   49.823658] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff993cf040
      [   49.825404] R10: ffff93b92292e580 R11: ffffd27f188a4b80 R12: 0000000000000000
      [   49.827147] R13: 00000000ffffff9c R14: 00000000fffffffe R15: ffff93b8da3446c0
      [   49.828890] FS:  00007ffbedffb700(0000) GS:ffff93b927880000(0000) knlGS:0000000000000000
      [   49.830725] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   49.832359] CR2: 0000000000000088 CR3: 00000005e3f74006 CR4: 00000000003606a0
      [   49.834085] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   49.835792] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      
      Cc: <stable@vger.kernel.org>
      Fixes: a6c60655 ("ovl: redirect on rename-dir")
      Signed-off-by: Liangyan <liangyan.peng@linux.alibaba.com>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • thunderbolt: Fix possible NULL pointer dereference in tb_acpi_add_link() · 0e5cb872
      Mario Limonciello authored
      commit 4d395c5e upstream.
      
      When we walk up the device hierarchy in tb_acpi_add_link() make sure we
      break the loop if the device has no parent. Otherwise we may crash the
      kernel by dereferencing a NULL pointer.
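
A parent walk that stops when there is no parent left can be sketched as follows. The structure and the `is_root_port` flag are hypothetical; the real driver walks kernel device objects and tests a different condition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device node with a link to its parent. */
struct dev {
	struct dev *parent;
	int is_root_port;
};

/*
 * Walk up the hierarchy and return the first device flagged as a root
 * port, or NULL if the top of the hierarchy is reached without a match.
 */
struct dev *find_root_port(struct dev *d)
{
	while (d != NULL && !d->is_root_port)
		d = d->parent;	/* becomes NULL above the topmost device */

	assert(d == NULL || d->is_root_port);
	return d;
}
```

The `d != NULL` test in the loop condition is the whole point: without it, the walk would dereference the missing parent of the topmost device.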
      
      Fixes: b2be2b05 ("thunderbolt: Create device links from ACPI description")
      Cc: stable@vger.kernel.org
      Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
      Acked-by: Yehezkel Bernat <YehezkelShB@gmail.com>
      Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kbuild: fix duplicated flags in DEBUG_CFLAGS · 19155473
      Masahiro Yamada authored
      [ Upstream commit 315da87c ]
      
      Sedat Dilek noticed duplicated flags in DEBUG_CFLAGS when building
      deb-pkg with CONFIG_DEBUG_INFO. For example, 'make CC=clang bindeb-pkg'
      reproduces the issue.
      
      Kbuild recurses to the top Makefile for some targets such as package
      builds.
      
      With commit 121c5d08 ("kbuild: Only add -fno-var-tracking-assignments
      for old GCC versions") applied, DEBUG_CFLAGS is now reset only when
      CONFIG_CC_IS_GCC=y.
      
      Fix it to reset DEBUG_CFLAGS all the time.
      
      Fixes: 121c5d08 ("kbuild: Only add -fno-var-tracking-assignments for old GCC versions")
      Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
      Reviewed-by: Mark Wielaard <mark@klomp.org>
      Reviewed-by: Nathan Chancellor <nathan@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • memblock: do not start bottom-up allocations with kernel_end · 1897a8f0
      Roman Gushchin authored
      [ Upstream commit 2dcb3964 ]
      
      With kaslr the kernel image is placed at a random place, so starting the
      bottom-up allocation with the kernel_end can result in an allocation
      failure and a warning like this one:
      
        hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
        ------------[ cut here ]------------
        memblock: bottom-up allocation failed, memory hotremove may be affected
        WARNING: CPU: 0 PID: 0 at mm/memblock.c:332 memblock_find_in_range_node+0x178/0x25a
        Modules linked in:
        CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0+ #1169
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
        RIP: 0010:memblock_find_in_range_node+0x178/0x25a
        Code: e9 6d ff ff ff 48 85 c0 0f 85 da 00 00 00 80 3d 9b 35 df 00 00 75 15 48 c7 c7 c0 75 59 88 c6 05 8b 35 df 00 01 e8 25 8a fa ff <0f> 0b 48 c7 44 24 20 ff ff ff ff 44 89 e6 44 89 ea 48 c7 c1 70 5c
        RSP: 0000:ffffffff88803d18 EFLAGS: 00010086 ORIG_RAX: 0000000000000000
        RAX: 0000000000000000 RBX: 0000000240000000 RCX: 00000000ffffdfff
        RDX: 00000000ffffdfff RSI: 00000000ffffffea RDI: 0000000000000046
        RBP: 0000000100000000 R08: ffffffff88922788 R09: 0000000000009ffb
        R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000000000
        R13: 0000000000000000 R14: 0000000080000000 R15: 00000001fb42c000
        FS:  0000000000000000(0000) GS:ffffffff88f71000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: ffffa080fb401000 CR3: 00000001fa80a000 CR4: 00000000000406b0
        Call Trace:
          memblock_alloc_range_nid+0x8d/0x11e
          cma_declare_contiguous_nid+0x2c4/0x38c
          hugetlb_cma_reserve+0xdc/0x128
          flush_tlb_one_kernel+0xc/0x20
          native_set_fixmap+0x82/0xd0
          flat_get_apic_id+0x5/0x10
          register_lapic_address+0x8e/0x97
          setup_arch+0x8a5/0xc3f
          start_kernel+0x66/0x547
          load_ucode_bsp+0x4c/0xcd
          secondary_startup_64_no_verify+0xb0/0xbb
        random: get_random_bytes called from __warn+0xab/0x110 with crng_init=0
        ---[ end trace f151227d0b39be70 ]---
      
      At the same time, the kernel image is protected with memblock_reserve(),
      so we can just start searching at PAGE_SIZE.  In this case the bottom-up
      allocation has the same chance of success as a top-down allocation, so
      there is no reason to fall back in the case of a failure.  Altogether, this
      simplifies the logic.
      
      Link: https://lkml.kernel.org/r/20201217201214.3414100-2-guro@fb.com
      Fixes: 8fabc623 ("powerpc: Ensure that swiotlb buffer is allocated from low memory")
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Wonhyuk Yang <vvghjk1234@gmail.com>
      Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • vdpa/mlx5: Restore the hardware used index after change map · 346ea7cc
      Eli Cohen authored
      [ Upstream commit b35ccebe ]
      
      When a change of memory map occurs, the hardware resources are destroyed
      and then re-created with the new memory map. In such a case, we need
      to restore the hardware available and used indices. The driver failed to
      restore the used index; restoring it is added here.

      Also, since the driver fails to reset the available and used
      indices upon device reset, fix this here to avoid a regression caused by
      the fact that the used index may not be zero upon device reset.
      
      Fixes: 1a86b377 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
      Signed-off-by: Eli Cohen <elic@nvidia.com>
      Link: https://lore.kernel.org/r/20210204073618.36336-1-elic@nvidia.com
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • nvmet-tcp: fix out-of-bounds access when receiving multiple h2cdata PDUs · c1debbaf
      Sagi Grimberg authored
      [ Upstream commit cb8563f5 ]
      
      When the host sends multiple h2cdata PDUs, we keep track of
      the receive progress and calculate the scatterlist index and
      offsets.

      The issue is that sg_offset should only be kept for the first
      iov entry we map in the iovec as this is the difference between
      our cursor and the sg entry offset itself.

      In addition, the sg index was calculated wrong because we should
      not round up when dividing the command byte offset by PAGE_SIZE.
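
The difference between rounding down and rounding up when turning a byte offset into a scatterlist index can be shown with a little arithmetic. This sketch assumes every scatterlist entry covers exactly one 4096-byte page; `SKETCH_PAGE_SIZE`, `sg_locate()` and `sg_index_round_up()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed size of each scatterlist entry */

/* Position of a byte offset: entry index plus offset inside that entry. */
struct sg_pos {
	size_t idx;
	size_t off;
};

/* Locate a byte offset by rounding the division down. */
struct sg_pos sg_locate(size_t byte_offset)
{
	struct sg_pos p;

	p.idx = byte_offset / SKETCH_PAGE_SIZE;
	p.off = byte_offset % SKETCH_PAGE_SIZE;
	assert(p.idx * SKETCH_PAGE_SIZE + p.off == byte_offset);
	return p;
}

/* The rounding-up variant, shown only for comparison. */
size_t sg_index_round_up(size_t byte_offset)
{
	return (byte_offset + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

For a byte offset of 5000, rounding down gives entry 1 at offset 904, while rounding up gives entry 2, one entry past the one that actually holds the byte.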
      
      Fixes: 872d26a3 ("nvmet-tcp: add NVMe over TCP target driver")
      Reported-by: Narayan Ayalasomayajula <Narayan.Ayalasomayajula@wdc.com>
      Tested-by: Narayan Ayalasomayajula <Narayan.Ayalasomayajula@wdc.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ARM: dts: sun7i: a20: bananapro: Fix ethernet phy-mode · b9464c5f
      Hermann Lauer authored
      [ Upstream commit a900cac3 ]
      
      BPi Pro needs TX and RX delay for Gbit to work reliably and avoid high
      packet loss rates. The realtek phy driver overrides the settings of the
      pull ups for the delays, so fix this for BananaPro.
      
      Fix the phy-mode description to correctly reflect this so that the
      implementation doesn't reconfigure the delays incorrectly. This
      happened with commit bbc4d71d ("net: phy: realtek: fix rtl8211e
      rx/tx delay config").
      
      Fixes: 10662a33 ("ARM: dts: sun7i: Add dts file for Bananapro board")
      Signed-off-by: Hermann Lauer <Hermann.Lauer@uni-heidelberg.de>
      Signed-off-by: Maxime Ripard <maxime@cerno.tech>
      Link: https://lore.kernel.org/r/20210128111842.GA11919@lemon.iwr.uni-heidelberg.de
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: ipa: pass correct dma_handle to dma_free_coherent() · 38b83bce
      Dan Carpenter authored
      [ Upstream commit 4ace7a6e ]
      
      The "ring->addr = addr;" assignment is done a few lines later so we
      can't use "ring->addr" yet.  The correct dma_handle is "addr".
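
The bug class here, freeing with a struct field that has not been assigned yet instead of the local handle, can be sketched in user-space C. The allocator is faked: `fake_alloc()`, `fake_free()`, `next_handle` and `last_freed_handle` are hypothetical test scaffolding, and the alignment check is only an assumed reason for the error path.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical ring: virt is the CPU pointer, addr the device handle. */
struct ring {
	void *virt;
	uintptr_t addr;
};

uintptr_t next_handle;		/* handle the fake allocator hands out next */
uintptr_t last_freed_handle;	/* handle most recently passed to fake_free() */

void *fake_alloc(size_t size, uintptr_t *handle)
{
	void *virt = malloc(size);

	if (virt != NULL)
		*handle = next_handle;
	return virt;
}

void fake_free(void *virt, uintptr_t handle)
{
	last_freed_handle = handle;
	free(virt);
}

/*
 * Allocate a ring and reject misaligned handles. On the error path the
 * local `addr` is the only valid handle, because r->addr is assigned
 * only after the check has passed.
 */
int ring_init(struct ring *r, size_t size, uintptr_t align)
{
	uintptr_t addr = 0;

	assert(r != NULL && align != 0);
	r->virt = fake_alloc(size, &addr);
	if (r->virt == NULL)
		return -ENOMEM;
	if (addr % align != 0) {
		fake_free(r->virt, addr);	/* not r->addr: still unset */
		r->virt = NULL;
		return -EINVAL;
	}
	r->addr = addr;
	return 0;
}
```

In the failing case the fake free receives the handle that the allocator returned, while the stale value in `r->addr` is never used.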
      
      Fixes: 650d1603 ("soc: qcom: ipa: the generic software interface")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Alex Elder <elder@linaro.org>
      Link: https://lore.kernel.org/r/YBjpTU2oejkNIULT@mwanda
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • r8169: fix WoL on shutdown if CONFIG_DEBUG_SHIRQ is set · 714c19bc
      Heiner Kallweit authored
      [ Upstream commit cc9f07a8 ]
      
      So far phy_disconnect() is called before free_irq(). If CONFIG_DEBUG_SHIRQ
      is set and interrupt is shared, then free_irq() creates an "artificial"
      interrupt by calling the interrupt handler. The "link change" flag is set
      in the interrupt status register, causing phylib to eventually call
      phy_suspend(). Because the net_device is detached from the PHY already,
      the PHY driver can't recognize that WoL is configured and powers down the
      PHY.
      
      Fixes: f1e911d5 ("r8169: add basic phylib support")
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Link: https://lore.kernel.org/r/fe732c2c-a473-9088-3974-df83cfbd6efd@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: mvpp2: TCAM entry enable should be written after SRAM data · 397ae1a2
      Stefan Chulski authored
      [ Upstream commit 43f4a20a ]
      
      The last TCAM data word contains the TCAM enable bit.
      It should be written after the SRAM data, so that the SRAM data is in
      place before the entry is enabled.
      
      Fixes: 3f518509 ("ethernet: Add new driver for Marvell Armada 375 network unit")
      Signed-off-by: Stefan Chulski <stefanc@marvell.com>
      Link: https://lore.kernel.org/r/1612172139-28343-1-git-send-email-stefanc@marvell.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: lapb: Copy the skb before sending a packet · dec629e9
      Xie He authored
      [ Upstream commit 88c7a9fd ]
      
      When sending a packet, we will prepend it with an LAPB header.
      This modifies the shared parts of a cloned skb, so we should copy the
      skb rather than just clone it, before we prepend the header.
      
      In "Documentation/networking/driver.rst" (the 2nd point), it states
      that drivers shouldn't modify the shared parts of a cloned skb when
      transmitting.
      
      The "dev_queue_xmit_nit" function in "net/core/dev.c", which is called
      when an skb is being sent, clones the skb and sends the clone to
      AF_PACKET sockets. Because the LAPB drivers first remove a 1-byte
      pseudo-header before handing over the skb to us, if we don't copy the
      skb before prepending the LAPB header, the first byte of the packets
      received on AF_PACKET sockets can be corrupted.
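
The difference between cloning and copying can be sketched in plain userspace C. This is only a toy model: `struct toy_skb` and the `toy_*` helpers are hypothetical names invented for illustration and are not the kernel's `struct sk_buff` API. A "clone" shares the underlying bytes, a "copy" owns private bytes, so prepending a header through a clone overwrites the byte that another holder of the same buffer still reads.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an skb: NOT the kernel's struct sk_buff. */
struct toy_skb {
	unsigned char *head;  /* start of the underlying byte buffer */
	size_t data_off;      /* offset of the first payload byte */
	size_t len;           /* payload length */
};

/* "Clone": a new descriptor that shares the same bytes. */
static struct toy_skb toy_clone(const struct toy_skb *s)
{
	return *s;
}

/* "Copy": a new descriptor with its own private bytes. Returns 0 on success. */
static int toy_copy(const struct toy_skb *s, struct toy_skb *out)
{
	size_t total = s->data_off + s->len;

	out->head = malloc(total);
	if (!out->head)
		return -1;
	memcpy(out->head, s->head, total);
	out->data_off = s->data_off;
	out->len = s->len;
	return 0;
}

/* Drop n bytes from the front (like removing the 1-byte pseudo-header). */
static int toy_pull(struct toy_skb *s, size_t n)
{
	if (n > s->len)
		return -1;
	s->data_off += n;
	s->len -= n;
	return 0;
}

/* Prepend one header byte, writing into the space in front of the payload. */
static int toy_push(struct toy_skb *s, unsigned char hdr)
{
	if (s->data_off == 0)
		return -1;
	s->data_off--;
	s->len++;
	s->head[s->data_off] = hdr;
	return 0;
}
```

In this model, pulling one byte and then pushing a header through a clone rewrites the first byte seen by every other descriptor of the same buffer, while doing the same through a copy leaves the original bytes untouched.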
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Xie He <xie.he.0141@gmail.com>
      Acked-by: Martin Schiller <ms@dev.tdt.de>
      Link: https://lore.kernel.org/r/2021020105570...
      dec629e9
    • Maor Dickman's avatar
      net/mlx5e: Release skb in case of failure in tc update skb · 6a5c3bac
      Maor Dickman authored
      [ Upstream commit a34ffec8 ]
      
      If the tc update of the skb fails, the packet is dropped
      without freeing the skb.

      Fix this by freeing the skb when the tc update fails.
      
      Fixes: d6d27782 ("net/mlx5: E-Switch, Restore chain id on miss")
      Fixes: c7569097 ("net/mlx5e: Add tc chains offload support for nic flows")
      Signed-off-by: Maor Dickman <maord@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      6a5c3bac
    • Maxim Mikityanskiy's avatar
      net/mlx5e: Update max_opened_tc also when channels are closed · c2b2c4d2
      Maxim Mikityanskiy authored
      [ Upstream commit 5a2ba25a ]
      
      max_opened_tc is used for stats, so that potentially non-zero stats
      won't disappear when num_tc decreases. However, mlx5e_setup_tc_mqprio
      fails to update it in the flow where channels are closed.
      
      This commit fixes it. The new value of priv->channels.params.num_tc is
      always checked on exit. In case of errors it will just be the old value,
      and in case of success it will be the updated value.
      
      Fixes: 05909bab ("net/mlx5e: Avoid reset netdev stats on configuration changes")
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c2b2c4d2
    • Maor Gottlieb's avatar
      net/mlx5: Fix leak upon failure of rule creation · 11c2c8fb
      Maor Gottlieb authored
      [ Upstream commit a5bfe6b4 ]
      
      When creation of a new rule that requires allocation of an FTE fails,
      we need to call tree_put_node() on the FTE in order to release its
      resources.
      
      Fixes: cefc2355 ("net/mlx5: Fix FTE cleanup")
      Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
      Reviewed-by: Alaa Hleihel <alaa@nvidia.com>
      Reviewed-by: Mark Bloch <mbloch@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      11c2c8fb
    • Daniel Jurgens's avatar
      net/mlx5: Fix function calculation for page trees · ada34201
      Daniel Jurgens authored
      [ Upstream commit ed5e83a3 ]
      
      The function calculation always results in a value of 0. This works
      generally, but when the release all pages feature is enabled it will
      result in crashes.
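
As an illustration of how a key calculation can always come out as 0, the sketch below combines a 16-bit function id with a flag placed in bit 16. The helper names are hypothetical, and it is an assumption (not stated in the commit message) that the driver's bug had this exact shape. Masking with `&` can never produce a set bit here, because the id occupies bits 0..15 and the flag occupies bit 16; combining with `|` keeps both.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical buggy variant: the two operands share no bits, so the
 * bitwise AND is 0 for every input. */
static uint32_t key_with_and(uint16_t func_id, bool ec_function)
{
	return func_id & ((uint32_t)ec_function << 16);
}

/* Hypothetical fixed variant: OR keeps the id in bits 0..15 and the
 * flag in bit 16, giving a distinct key per (id, flag) pair. */
static uint32_t key_with_or(uint16_t func_id, bool ec_function)
{
	return (uint32_t)func_id | ((uint32_t)ec_function << 16);
}
```

With a key that is always 0, every function would map to the same tree entry, which is harmless only as long as nothing depends on telling the entries apart.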
      
      Fixes: 0aa12847 ("net/mlx5: Maintain separate page trees for ECPF and PF functions")
      Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
      Reported-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ada34201
    • Lijun Pan's avatar
      ibmvnic: device remove has higher precedence over reset · b5802b74
      Lijun Pan authored
      [ Upstream commit 5e9eff5d ]
      
      Returning -EBUSY in ibmvnic_remove() does not actually hold the
      removal procedure since driver core doesn't care for the return
      value (see __device_release_driver() in drivers/base/dd.c
      calling dev->bus->remove()) though vio_bus_remove
      (in arch/powerpc/platforms/pseries/vio.c) records the
      return value and passes it on. [1]
      
      During the device removal procedure, the check for the resetting
      bit is dropped so that we can continue executing all the
      cleanup calls in the rest of the remove function. Otherwise,
      it can cause latent memory leaks and kernel crashes.
      
      [1] https://lore.kernel.org/linuxppc-dev/20210117101242.dpwayq6wdgfdzirl@pengutronix.de/T/#m48f5befd96bc9842ece2a3ad14f4c27747206a53
      Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Fixes: 7d7195a0 ("ibmvnic: Do not process device remove during device reset")
      Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
      Link: https://lore.kernel.org/r/20210129043402.95744-1-ljp@linux.ibm.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      b5802b74
    • Aleksandr Loktionov's avatar
      i40e: Revert "i40e: don't report link up for a VF who hasn't enabled queues" · cd77dccc
      Aleksandr Loktionov authored
      [ Upstream commit f559a356 ]
      
      This reverts commit 2ad1274f
      
      VF queues were not brought up when the PF was brought up after being
      downed if the VF driver disabled the VF's queues during PF down.
      This could happen in some older or external VF driver implementations.
      The problem was that PF driver used vf->queues_enabled as a condition
      to decide what link-state it would send out which caused the issue.
      
      Remove the check for vf->queues_enabled in the VF link notify.
      Now VF will always be notified of the current link status.
      Also remove the queues_enabled member from i40e_vf structure as it is
      not used anymore. Otherwise VNF implementation was broken and caused
      a link flap.
      
      The original commit was a workaround to avoid breaking existing VFs though
      it's really a fault of the VF code not the PF. The commit should be safe to
      revert as all of the VFs we know of have been fixed. Also, since we now
      know there is a related bug in the workaround, removing it is preferred.
      
      Fixes: 2ad1274f ("i40e: don't report link up for a VF who hasn't enabled")
      Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
      Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      cd77dccc
    • Kevin Lo's avatar
      igc: check return value of ret_val in igc_config_fc_after_link_up · 1ac8bec2
      Kevin Lo authored
      [ Upstream commit b8811456 ]
      
      Check return value from ret_val to make error check actually work.
      
      Fixes: 4eb80801 ("igc: Add setup link functionality")
      Signed-off-by: Kevin Lo <kevlo@kevlo.org>
      Acked-by: Sasha Neftin <sasha.neftin@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      1ac8bec2
    • Kevin Lo's avatar
      igc: set the default return value to -IGC_ERR_NVM in igc_write_nvm_srwr · 0cda1604
      Kevin Lo authored
      [ Upstream commit ebc8d125 ]
      
      This patch sets the default return value to -IGC_ERR_NVM in
      igc_write_nvm_srwr. Without this change it wouldn't lead to a shadow RAM
      write EEWR timeout.
      
      Fixes: ab405612 ("igc: Add NVM support")
      Signed-off-by: Kevin Lo <kevlo@kevlo.org>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      0cda1604
    • Chuck Lever's avatar
      SUNRPC: Fix NFS READs that start at non-page-aligned offsets · 8e081627
      Chuck Lever authored
      [ Upstream commit bad4c6eb ]
      
      Anj Duvnjak reports that the Kodi.tv NFS client is not able to read
      video files from a v5.10.11 Linux NFS server.
      
      The new sendpage-based TCP sendto logic was not attentive to non-
      zero page_base values. nfsd_splice_read() sets that field when a
      READ payload starts in the middle of a page.
      
      The Linux NFS client rarely emits an NFS READ that is not page-
      aligned. All of my testing so far has been with Linux clients, so I
      missed this one.
      
      Reported-by: A. Duvnjak <avian@extremenerds.net>
      BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=211471
      Fixes: 4a85a6a3 ("SUNRPC: Handle TCP socket sends with kernel_sendpage() again")
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Tested-by: A. Duvnjak <avian@extremenerds.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8e081627
    • Zyta Szpak's avatar
      arm64: dts: ls1046a: fix dcfg address range · ceca8bae
      Zyta Szpak authored
      [ Upstream commit aa880c6f ]
      
      Dcfg was overlapping with the clockgen address space, which resulted
      in a failure of memory allocation for dcfg. According to the regs
      description, the dcfg size should not be bigger than 4KB.
      
      Signed-off-by: Zyta Szpak <zr@semihalf.com>
      Fixes: 8126d881 ("arm64: dts: add QorIQ LS1046A SoC support")
      Signed-off-by: Shawn Guo <shawnguo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ceca8bae
    • David Howells's avatar
      rxrpc: Fix deadlock around release of dst cached on udp tunnel · e5ed4e08
      David Howells authored
      [ Upstream commit 5399d522 ]
      
      AF_RXRPC sockets use UDP ports in encap mode.  This causes socket and dst
      from an incoming packet to get stolen and attached to the UDP socket from
      whence it is leaked when that socket is closed.
      
      When a network namespace is removed, the wait for dst records to be cleaned
      up happens before the cleanup of the rxrpc and UDP socket, meaning that the
      wait never finishes.
      
      Fix this by moving the rxrpc (and, by dependence, the afs) private
      per-network namespace registrations to the device group rather than subsys
      group.  This allows cached rxrpc local endpoints to be cleared and their
      UDP sockets closed before we try waiting for the dst records.
      
      The symptom is that lines looking like the following:
      
      	unregister_netdevice: waiting for lo to become free
      
      get emitted at regular intervals after running something like the
      referenced syzbot test.
      
      Thanks to Vadim for tracking this down and working out the fix.
      
      Reported-by: <syzbot+df400f2f24a1677cd7e0@syzkaller.appspotmail.com>
      Reported-by: Vadim Fedorenko <vfedorenko@novek.ru>
      Fixes: 5271953c ("rxrpc: Use the UDP encap_rcv hook")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Vadim Fedorenko <vfedorenko@novek.ru>
      Link: https://lore.kernel.org/r/161196443016.3868642.5577440140646403533.stgit@warthog.procyon.org.uk
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e5ed4e08
    • Heiner Kallweit's avatar
      r8169: work around RTL8125 UDP hw bug · 7fc1a5a5
      Heiner Kallweit authored
      [ Upstream commit 8d520b4d ]
      
      It was reported that on RTL8125 network breaks under heavy UDP load,
      e.g. torrent traffic ([0], from comment 27). Realtek confirmed a hw bug
      and provided me with a test version of the r8125 driver including a
      workaround. Tests confirmed that the workaround fixes the issue.
      I modified the original version of the workaround to meet mainline
      code style.
      
      [0] https://bugzilla.kernel.org/show_bug.cgi?id=209839
      
      v2:
      - rebased to net
      v3:
      - make rtl_skb_is_udp() more robust and use skb_header_pointer()
        to access the ip(v6) header
      v4:
      - remove dependency on ptp_classify.h
      - replace magic number with offsetof(struct udphdr, len)
      
      Fixes: f1bce4ad ("r8169: add support for RTL8125")
      Tested-by: xplo <xplo.bn@gmail.com>
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Link: https://lore.kernel.org/r/6e453d49-1801-e6de-d5f7-d7e6c7526c8f@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7fc1a5a5
    • Marek Szyprowski's avatar
      arm64: dts: meson: switch TFLASH_VDD_EN pin to open drain on Odroid-C4 · ee1709a3
      Marek Szyprowski authored
      [ Upstream commit daf12bee ]
      
      For a proper reboot, the Odroid-C4 board requires the TFLASH_VDD_EN
      pin to be switched to high impedance mode; otherwise the board gets
      stuck in the middle of loading the early stages of the bootloader from
      the SD card.

      This can be achieved by using the OPEN_DRAIN flag instead of
      ACTIVE_HIGH, which leaves the pin in input mode to achieve the high
      state (the pin has a pull-up) and solves the issue.
      
      Suggested-by: Neil Armstrong <narmstrong@baylibre.com>
      Fixes: 326e5751 ("arm64: dts: meson-sm1: add support for Hardkernel ODROID-C4")
      Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
      Signed-off-by: Kevin Hilman <khilman@baylibre.com>
      Link: https://lore.kernel.org/r/20210122055218.27241-1-m.szyprowski@samsung.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ee1709a3
    • Quentin Monnet's avatar
      bpf, preload: Fix build when $(O) points to a relative path · 6f5ee57a
      Quentin Monnet authored
      [ Upstream commit 150a2732 ]
      
      Building the kernel with CONFIG_BPF_PRELOAD, and by providing a relative
      path for the output directory, may fail with the following error:
      
        $ make O=build bindeb-pkg
        ...
        /.../linux/tools/scripts/Makefile.include:5: *** O=build does not exist.  Stop.
        make[7]: *** [/.../linux/kernel/bpf/preload/Makefile:9: kernel/bpf/preload/libbpf.a] Error 2
        make[6]: *** [/.../linux/scripts/Makefile.build:500: kernel/bpf/preload] Error 2
        make[5]: *** [/.../linux/scripts/Makefile.build:500: kernel/bpf] Error 2
        make[4]: *** [/.../linux/Makefile:1799: kernel] Error 2
        make[4]: *** Waiting for unfinished jobs....
      
      In the case above, for the "bindeb-pkg" target, the error is produced by
      the "dummy" check in Makefile.include, called from libbpf's Makefile.
      This check changes directory to $(PWD) before checking for the existence
      of $(O). But at this step we have $(PWD) pointing to "/.../linux/build",
      and $(O) pointing to "build". So the Makefile.include tries in fact to
      assert the existence of a directory named "/.../linux/build/build",
      which does not exist.
      
      Note that the error does not occur for all make targets and
      architectures combinations. This was observed on x86 for "bindeb-pkg",
      or for a regular build for UML [0].
      
      Here are some details. The root Makefile recursively calls itself once,
      after changing directory to $(O). The content for the variable $(PWD) is
      preserved across recursive calls to make, so it is unchanged at this
      step. For "bindeb-pkg", $(PWD) is eventually updated because the target
      writes a new Makefile (as debian/rules) and calls it indirectly through
      dpkg-buildpackage. This script does not preserve $(PWD), which is reset
      to the current working directory when the target in debian/rules is
      called.
      
      Although not investigated, it seems likely that something similar causes
      UML to change its value for $(PWD).
      
      Non-trivial fixes could be to remove the use of $(PWD) from the "dummy"
      check, or to make sure that $(PWD) and $(O) are preserved or updated to
      always play well and form a valid $(PWD)/$(O) path across the different
      targets and architectures. Instead, we take a simpler approach and just
      update $(O) when calling libbpf's Makefile, so it points to an absolute
      path which should always resolve for the "dummy" check run (through
      includes) by that Makefile.
      
      David Gow previously posted a slightly different version of this patch
      as a RFC [0], two months ago or so.
      
        [0] https://lore.kernel.org/bpf/20201119085022.3606135-1-davidgow@google.com/t/#u
      
      Fixes: d71fa5c9 ("bpf: Add kernel module with user mode driver that populates bpffs.")
      Reported-by: David Gow <davidgow@google.com>
      Signed-off-by: Quentin Monnet <quentin@isovalent.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Cc: Brendan Higgins <brendanhiggins@google.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Link: https://lore.kernel.org/bpf/20210126161320.24561-1-quentin@isovalent.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      6f5ee57a
    • Johannes Berg's avatar
      um: virtio: free vu_dev only with the contained struct device · 72c8389f
      Johannes Berg authored
      [ Upstream commit f4172b08 ]
      
      Since struct device is refcounted, we shouldn't free the vu_dev
      immediately when it's removed from the platform device, but only
      when the references actually all go away. Move the freeing to
      the release to accomplish that.
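
The idea of freeing the containing object only from the release callback can be sketched with a toy reference counter in userspace C. All names here (`toy_device`, `toy_vu_dev`, `toy_get`, `toy_put`) are hypothetical and only mimic the shape of the pattern; they are not the kernel's driver-core API.

```c
#include <stdlib.h>

/* Toy refcounted object with a release callback (hypothetical). */
struct toy_device {
	int refcount;
	void (*release)(struct toy_device *dev);
};

/* Toy container that embeds the refcounted object as its first member. */
struct toy_vu_dev {
	struct toy_device dev;
	int *freed_flag;  /* lets a caller observe when the free happened */
};

static void toy_get(struct toy_device *dev)
{
	dev->refcount++;
}

/* Drop one reference; the last put calls release(). */
static void toy_put(struct toy_device *dev)
{
	if (--dev->refcount == 0)
		dev->release(dev);
}

/* The container is freed here, and only here. */
static void toy_vu_release(struct toy_device *dev)
{
	/* Valid because dev is the first member of struct toy_vu_dev. */
	struct toy_vu_dev *vu = (struct toy_vu_dev *)dev;

	*vu->freed_flag = 1;
	free(vu);
}

/* Allocate a container holding one initial reference. */
static struct toy_vu_dev *toy_vu_create(int *freed_flag)
{
	struct toy_vu_dev *vu = calloc(1, sizeof(*vu));

	if (!vu)
		return NULL;
	vu->dev.refcount = 1;
	vu->dev.release = toy_vu_release;
	vu->freed_flag = freed_flag;
	return vu;
}
```

In this sketch, a "remove" path would simply call `toy_put()`; if another holder still has a reference, the memory stays valid until that holder also drops it.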
      
      Fixes: 5d38f324 ("um: drivers: Add virtio vhost-user driver")
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      72c8389f
    • Pan Bian's avatar
      bpf, inode_storage: Put file handler if no storage was found · 571fe1ba
      Pan Bian authored
      [ Upstream commit b9557caa ]
      
      Put file f if inode_storage_ptr() returns NULL.
      
      Fixes: 8ea63684 ("bpf: Implement bpf_local_storage for inodes")
      Signed-off-by: Pan Bian <bianpan2016@163.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: KP Singh <kpsingh@kernel.org>
      Link: https://lore.kernel.org/bpf/20210121020856.25507-1-bianpan2016@163.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      571fe1ba
    • Loris Reiff's avatar
      bpf, cgroup: Fix problematic bounds check · 9447d0f8
      Loris Reiff authored
      [ Upstream commit f4a2da75 ]
      
      Since ctx.optlen is signed, a larger value than max_value could be
      passed, as it is later on used as unsigned, which causes a WARN_ON_ONCE
      in the copy_to_user.
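
A minimal userspace sketch of the signed-versus-unsigned pitfall described above follows. The helper names are hypothetical and this is not the kernel code; it only shows why a signed length needs a lower-bound check before it is used as an unsigned copy size.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper: validate a signed length against both bounds.
 * Checking only "optlen > max_optlen" would let negative values pass. */
static int check_optlen(int optlen, int max_optlen)
{
	if (optlen < 0 || optlen > max_optlen)
		return -EFAULT;
	return 0;
}

/* What a signed length looks like to a routine that takes size_t:
 * a negative value converts to a very large unsigned one. */
static size_t as_copy_size(int optlen)
{
	return (size_t)optlen;
}
```

For example, a length of -1 is not greater than any positive maximum when compared as a signed integer, yet it converts to the largest possible `size_t` when handed to a copy routine.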
      
      Fixes: 0d01da6a ("bpf: implement getsockopt and setsockopt hooks")
      Signed-off-by: Loris Reiff <loris.reiff@liblor.ch>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Stanislav Fomichev <sdf@google.com>
      Link: https://lore.kernel.org/bpf/20210122164232.61770-2-loris.reiff@liblor.ch
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      9447d0f8
    • Loris Reiff's avatar
      bpf, cgroup: Fix optlen WARN_ON_ONCE toctou · ee3844e6
      Loris Reiff authored
      [ Upstream commit bb8b81e3 ]
      
      A toctou issue in `__cgroup_bpf_run_filter_getsockopt` can trigger a
      WARN_ON_ONCE in a check of `copy_from_user`.
      
      `*optlen` is checked to be non-negative in the individual getsockopt
      functions beforehand. Changing `*optlen` in a race to a negative value
      will result in a `copy_from_user(ctx.optval, optval, ctx.optlen)` with
      `ctx.optlen` being a negative integer.
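
A common way to avoid this kind of time-of-check/time-of-use problem is to read the shared value once into a local variable and then validate and use only that local copy. The sketch below shows the pattern in userspace C with hypothetical names; it is not the kernel fix itself, and the `volatile` pointer merely stands in for memory that another thread may change at any time.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Read the shared length exactly once (hypothetical helper). */
static int fetch_once(const volatile int *shared_len)
{
	return *shared_len;
}

/* Validate and use the local snapshot, never the shared location again.
 * Returns the number of bytes copied, or -EFAULT on a bad length. */
static int copy_checked(char *dst, size_t dst_size, const char *src,
			const volatile int *shared_len)
{
	int len = fetch_once(shared_len);

	if (len < 0 || (size_t)len > dst_size)
		return -EFAULT;
	memcpy(dst, src, (size_t)len);
	return len;
}
```

Because the check and the copy both use `len`, a concurrent change to the shared value after the single read cannot turn a validated length into a negative one.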
      
      Fixes: 0d01da6a ("bpf: implement getsockopt and setsockopt hooks")
      Signed-off-by: Loris Reiff <loris.reiff@liblor.ch>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Stanislav Fomichev <sdf@google.com>
      Link: https://lore.kernel.org/bpf/20210122164232.61770-1-loris.reiff@liblor.ch
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ee3844e6
    • Eli Cohen's avatar
      vdpa/mlx5: Fix memory key MTT population · 28ad17a5
      Eli Cohen authored
      [ Upstream commit 710eb8e3 ]
      
      map_direct_mr() assumed that the number of scatter/gather entries
      returned by dma_map_sg_attrs() was equal to the number of segments in
      the sgl list. This led to wrong population of the mkey object. Fix this
      by properly referring to the returned value.
      
      The hardware expects each MTT entry to contain the DMA address of a
      contiguous block of memory of size (1 << mr->log_size) bytes.
      dma_map_sg_attrs() can coalesce several sg entries into a single
      scatter/gather entry of contiguous DMA range so we need to scan the list
      and refer to the size of each s/g entry.
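
The scan described above can be sketched in userspace C as follows. The `mapped_seg` struct and `fill_mtts()` are hypothetical names for illustration and do not reproduce the driver's code. The sketch assumes each mapped entry's length is a multiple of the block size, and it walks the entries actually returned by the mapping step (`nents`), emitting one address per block of `1 << log_size` bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one mapped scatter/gather entry. */
struct mapped_seg {
	uint64_t dma_addr;  /* start of a contiguous DMA range */
	uint64_t len;       /* length of that range in bytes */
};

/* Emit one MTT address per (1 << log_size)-byte block of every mapped
 * entry. Returns the number of MTT entries written (at most max_mtt). */
static size_t fill_mtts(const struct mapped_seg *segs, size_t nents,
			unsigned int log_size, uint64_t *mtt, size_t max_mtt)
{
	uint64_t block = (uint64_t)1 << log_size;
	size_t n = 0;

	for (size_t i = 0; i < nents; i++) {
		/* A coalesced entry may span several blocks. */
		for (uint64_t off = 0; off < segs[i].len; off += block) {
			if (n == max_mtt)
				return n;
			mtt[n++] = segs[i].dma_addr + off;
		}
	}
	return n;
}
```

If three original 4 KiB segments were coalesced into a single 12 KiB mapped entry, iterating per original segment would read past the mapped list, whereas iterating per mapped entry and stepping by block size still yields three MTT addresses.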
      
      In addition, get rid of fill_sg(), whose effect is overwritten by
      populate_mtts().
      
      Fixes: 94abbccd ("vdpa/mlx5: Add shared memory registration code")
      Signed-off-by: Eli Cohen <elic@nvidia.com>
      Link: https://lore.kernel.org/r/20210107071845.GA224876@mtl-vdi-166.wap.labs.mlnx
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      28ad17a5