Skip to content
  1. Jun 30, 2020
    • Taehee Yoo's avatar
      ip_tunnel: fix use-after-free in ip_tunnel_lookup() · aa0405f4
      Taehee Yoo authored
      [ Upstream commit ba61539c
      
       ]
      
      In the datapath, the ip_tunnel_lookup() is used and it internally uses
      fallback tunnel device pointer, which is fb_tunnel_dev.
      This pointer variable should be set to NULL when a fb interface is deleted.
      But there is no routine to set fb_tunnel_dev pointer to NULL.
      So, this pointer will be still used after interface is deleted and
      it eventually results in the use-after-free problem.
      
      Test commands:
          ip netns add A
          ip netns add B
          ip link add eth0 type veth peer name eth1
          ip link set eth0 netns A
          ip link set eth1 netns B
      
          ip netns exec A ip link set lo up
          ip netns exec A ip link set eth0 up
          ip netns exec A ip link add gre1 type gre local 10.0.0.1 \
      	    remote 10.0.0.2
          ip netns exec A ip link set gre1 up
          ip netns exec A ip a a 10.0.100.1/24 dev gre1
          ip netns exec A ip a a 10.0.0.1/24 dev eth0
      
          ip netns exec B ip link set lo up
          ip netns exec B ip link set eth1 up
          ip netns exec B ip link add gre1 type gre local 10.0.0.2 \
      	    remote 10.0.0.1
          ip netns exec B ip link set gre1 up
          ip netns exec B ip a a 10.0.100.2/24 dev gre1
          ip netns exec B ip a a 10.0.0.2/24 dev eth1
          ip netns exec A hping3 10.0.100.2 -2 --flood -d 60000 &
          ip netns del B
      
      Splat looks like:
      [   77.793450][    C3] ==================================================================
      [   77.794702][    C3] BUG: KASAN: use-after-free in ip_tunnel_lookup+0xcc4/0xf30
      [   77.795573][    C3] Read of size 4 at addr ffff888060bd9c84 by task hping3/2905
      [   77.796398][    C3]
      [   77.796664][    C3] CPU: 3 PID: 2905 Comm: hping3 Not tainted 5.8.0-rc1+ #616
      [   77.797474][    C3] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [   77.798453][    C3] Call Trace:
      [   77.798815][    C3]  <IRQ>
      [   77.799142][    C3]  dump_stack+0x9d/0xdb
      [   77.799605][    C3]  print_address_description.constprop.7+0x2cc/0x450
      [   77.800365][    C3]  ? ip_tunnel_lookup+0xcc4/0xf30
      [   77.800908][    C3]  ? ip_tunnel_lookup+0xcc4/0xf30
      [   77.801517][    C3]  ? ip_tunnel_lookup+0xcc4/0xf30
      [   77.802145][    C3]  kasan_report+0x154/0x190
      [   77.802821][    C3]  ? ip_tunnel_lookup+0xcc4/0xf30
      [   77.803503][    C3]  ip_tunnel_lookup+0xcc4/0xf30
      [   77.804165][    C3]  __ipgre_rcv+0x1ab/0xaa0 [ip_gre]
      [   77.804862][    C3]  ? rcu_read_lock_sched_held+0xc0/0xc0
      [   77.805621][    C3]  gre_rcv+0x304/0x1910 [ip_gre]
      [   77.806293][    C3]  ? lock_acquire+0x1a9/0x870
      [   77.806925][    C3]  ? gre_rcv+0xfe/0x354 [gre]
      [   77.807559][    C3]  ? erspan_xmit+0x2e60/0x2e60 [ip_gre]
      [   77.808305][    C3]  ? rcu_read_lock_sched_held+0xc0/0xc0
      [   77.809032][    C3]  ? rcu_read_lock_held+0x90/0xa0
      [   77.809713][    C3]  gre_rcv+0x1b8/0x354 [gre]
      [ ... ]
      
      Suggested-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Fixes: c5441932
      
       ("GRE: Refactor GRE tunneling code.")
      Signed-off-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      aa0405f4
    • David Christensen's avatar
      tg3: driver sleeps indefinitely when EEH errors exceed eeh_max_freezes · c55e6da8
      David Christensen authored
      [ Upstream commit 3a2656a2
      
       ]
      
      The driver function tg3_io_error_detected() calls napi_disable twice,
      without an intervening napi_enable, when the number of EEH errors exceeds
      eeh_max_freezes, resulting in an indefinite sleep while holding rtnl_lock.
      
      Add check for pcierr_recovery which skips code already executed for the
      "Frozen" state.
      
      Signed-off-by: default avatarDavid Christensen <drc@linux.vnet.ibm.com>
      Reviewed-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c55e6da8
    • Jeremy Kerr's avatar
      net: usb: ax88179_178a: fix packet alignment padding · 585fe306
      Jeremy Kerr authored
      [ Upstream commit e869e7a1
      
       ]
      
      Using a AX88179 device (0b95:1790), I see two bytes of appended data on
      every RX packet. For example, this 48-byte ping, using 0xff as a
      payload byte:
      
        04:20:22.528472 IP 192.168.1.1 > 192.168.1.2: ICMP echo request, id 2447, seq 1, length 64
      	0x0000:  000a cd35 ea50 000a cd35 ea4f 0800 4500
      	0x0010:  0054 c116 4000 4001 f63e c0a8 0101 c0a8
      	0x0020:  0102 0800 b633 098f 0001 87ea cd5e 0000
      	0x0030:  0000 dcf2 0600 0000 0000 ffff ffff ffff
      	0x0040:  ffff ffff ffff ffff ffff ffff ffff ffff
      	0x0050:  ffff ffff ffff ffff ffff ffff ffff ffff
      	0x0060:  ffff 961f
      
      Those last two bytes - 96 1f - aren't part of the original packet.
      
      In the ax88179 RX path, the usbnet rx_fixup function trims a 2-byte
      'alignment pseudo header' from the start of the packet, and sets the
      length from a per-packet field populated by hardware. It looks like that
      length field *includes* the 2-byte header; the current driver assumes
      that it's excluded.
      
      This change trims the 2-byte alignment header after we've set the packet
      length, so the resulting packet length is correct. While we're moving
      the comment around, this also fixes the spelling of 'pseudo'.
      
      Signed-off-by: default avatarJeremy Kerr <jk@ozlabs.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      585fe306
    • Yang Yingliang's avatar
      net: fix memleak in register_netdevice() · 81f8ebaa
      Yang Yingliang authored
      [ Upstream commit 814152a8
      
       ]
      
      I got a memleak report when doing some fuzz test:
      
      unreferenced object 0xffff888112584000 (size 13599):
        comm "ip", pid 3048, jiffies 4294911734 (age 343.491s)
        hex dump (first 32 bytes):
          74 61 70 30 00 00 00 00 00 00 00 00 00 00 00 00  tap0............
          00 ee d9 19 81 88 ff ff 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<000000002f60ba65>] __kmalloc_node+0x309/0x3a0
          [<0000000075b211ec>] kvmalloc_node+0x7f/0xc0
          [<00000000d3a97396>] alloc_netdev_mqs+0x76/0xfc0
          [<00000000609c3655>] __tun_chr_ioctl+0x1456/0x3d70
          [<000000001127ca24>] ksys_ioctl+0xe5/0x130
          [<00000000b7d5e66a>] __x64_sys_ioctl+0x6f/0xb0
          [<00000000e1023498>] do_syscall_64+0x56/0xa0
          [<000000009ec0eb12>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      unreferenced object 0xffff888111845cc0 (size 8):
        comm "ip", pid 3048, jiffies 4294911734 (age 343.491s)
        hex dump (first 8 bytes):
          74 61 70 30 00 88 ff ff                          tap0....
        backtrace:
          [<000000004c159777>] kstrdup+0x35/0x70
          [<00000000d8b496ad>] kstrdup_const+0x3d/0x50
          [<00000000494e884a>] kvasprintf_const+0xf1/0x180
          [<0000000097880a2b>] kobject_set_name_vargs+0x56/0x140
          [<000000008fbdfc7b>] dev_set_name+0xab/0xe0
          [<000000005b99e3b4>] netdev_register_kobject+0xc0/0x390
          [<00000000602704fe>] register_netdevice+0xb61/0x1250
          [<000000002b7ca244>] __tun_chr_ioctl+0x1cd1/0x3d70
          [<000000001127ca24>] ksys_ioctl+0xe5/0x130
          [<00000000b7d5e66a>] __x64_sys_ioctl+0x6f/0xb0
          [<00000000e1023498>] do_syscall_64+0x56/0xa0
          [<000000009ec0eb12>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      unreferenced object 0xffff88811886d800 (size 512):
        comm "ip", pid 3048, jiffies 4294911734 (age 343.491s)
        hex dump (first 32 bytes):
          00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00  .....N..........
          ff ff ff ff ff ff ff ff c0 66 3d a3 ff ff ff ff  .........f=.....
        backtrace:
          [<0000000050315800>] device_add+0x61e/0x1950
          [<0000000021008dfb>] netdev_register_kobject+0x17e/0x390
          [<00000000602704fe>] register_netdevice+0xb61/0x1250
          [<000000002b7ca244>] __tun_chr_ioctl+0x1cd1/0x3d70
          [<000000001127ca24>] ksys_ioctl+0xe5/0x130
          [<00000000b7d5e66a>] __x64_sys_ioctl+0x6f/0xb0
          [<00000000e1023498>] do_syscall_64+0x56/0xa0
          [<000000009ec0eb12>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      If call_netdevice_notifiers() failed, then rollback_registered()
      calls netdev_unregister_kobject() which holds the kobject. The
      reference cannot be put because the netdev won't be add to todo
      list, so it will leads a memleak, we need put the reference to
      avoid memleak.
      
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Signed-off-by: default avatarYang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      81f8ebaa
    • Al Viro's avatar
      fix a braino in "sparc32: fix register window handling in genregs32_[gs]et()" · 1a78856e
      Al Viro authored
      [ Upstream commit 9d964e1b ]
      
      lost npc in PTRACE_SETREGSET, breaking PTRACE_SETREGS as well
      
      Fixes: cf51e129
      
       "sparc32: fix register window handling in genregs32_[gs]et()"
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1a78856e
    • Valentin Longchamp's avatar
      net: sched: export __netdev_watchdog_up() · ab1276b9
      Valentin Longchamp authored
      [ Upstream commit 1a3db27a ]
      
      Since the quiesce/activate rework, __netdev_watchdog_up() is directly
      called in the ucc_geth driver.
      
      Unfortunately, this function is not available for modules and thus
      ucc_geth cannot be built as a module anymore. Fix it by exporting
      __netdev_watchdog_up().
      
      Since the commit introducing the regression was backported to stable
      branches, this one should ideally be as well.
      
      Fixes: 79dde73c
      
       ("net/ethernet/freescale: rework quiesce/activate for ucc_geth")
      Signed-off-by: default avatarValentin Longchamp <valentin@longchamp.me>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ab1276b9
    • Ridge Kennedy's avatar
      l2tp: Allow duplicate session creation with UDP · 387654fc
      Ridge Kennedy authored
      commit 0d0d9a38 upstream.
      
      In the past it was possible to create multiple L2TPv3 sessions with the
      same session id as long as the sessions belonged to different tunnels.
      The resulting sessions had issues when used with IP encapsulated tunnels,
      but worked fine with UDP encapsulated ones. Some applications began to
      rely on this behaviour to avoid having to negotiate unique session ids.
      
      Some time ago a change was made to require session ids to be unique across
      all tunnels, breaking the applications making use of this "feature".
      
      This change relaxes the duplicate session id check to allow duplicates
      if both of the colliding sessions belong to UDP encapsulated tunnels.
      
      Fixes: dbdbc73b
      
       ("l2tp: fix duplicate session creation")
      Signed-off-by: default avatarRidge Kennedy <ridge.kennedy@alliedtelesis.co.nz>
      Acked-by: default avatarJames Chapman <jchapman@katalix.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      387654fc
    • Martin Wilck's avatar
      scsi: scsi_devinfo: handle non-terminated strings · de4700c2
      Martin Wilck authored
      commit ba69ead9 upstream.
      
      devinfo->vendor and devinfo->model aren't necessarily
      zero-terminated.
      
      Fixes: b8018b97
      
       "scsi_devinfo: fixup string compare"
      Signed-off-by: default avatarMartin Wilck <mwilck@suse.com>
      Reviewed-by: default avatarBart Van Assche <bart.vanassche@wdc.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      de4700c2
    • Toke Høiland-Jørgensen's avatar
      net: Revert "pkt_sched: fq: use proper locking in fq_dump_stats()" · 1ab4cc9c
      Toke Høiland-Jørgensen authored
      This reverts commit 191cf872 which is
      commit 695b4ec0
      
       upstream.
      
      That commit should never have been backported since it relies on a change in
      locking semantics that was introduced in v4.8 and not backported. Because of
      this, the backported commit to sch_fq leads to lockups because of the double
      locking.
      
      Signed-off-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1ab4cc9c
    • Ahmed S. Darwish's avatar
      net: core: device_rename: Use rwsem instead of a seqcount · 602c47fb
      Ahmed S. Darwish authored
      [ Upstream commit 11d6011c ]
      
      Sequence counters write paths are critical sections that must never be
      preempted, and blocking, even for CONFIG_PREEMPTION=n, is not allowed.
      
      Commit 5dbe7c17 ("net: fix kernel deadlock with interface rename and
      netdev name retrieval.") handled a deadlock, observed with
      CONFIG_PREEMPTION=n, where the devnet_rename seqcount read side was
      infinitely spinning: it got scheduled after the seqcount write side
      blocked inside its own critical section.
      
      To fix that deadlock, among other issues, the commit added a
      cond_resched() inside the read side section. While this will get the
      non-preemptible kernel eventually unstuck, the seqcount reader is fully
      exhausting its slice just spinning -- until TIF_NEED_RESCHED is set.
      
      The fix is also still broken: if the seqcount reader belongs to a
      real-time scheduling policy, it can spin forever and the kernel will
      livelock.
      
      Disabling preemption over the seqcount write side critical section will
      not work: inside it are a number of GFP_KERNEL allocations and mutex
      locking through the drivers/base/ :: device_rename() call chain.
      
      >From all the above, replace the seqcount with a rwsem.
      
      Fixes: 5dbe7c17 (net: fix kernel deadlock with interface rename and netdev name retrieval.)
      Fixes: 30e6c9fa (net: devnet_rename_seq should be a seqcount)
      Fixes: c91f6df2
      
       (sockopt: Change getsockopt() of SO_BINDTODEVICE to return an interface name)
      Cc: <stable@vger.kernel.org>
      Reported-by: kbuild test robot <lkp@intel.com> [ v1 missing up_read() on error exit ]
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com> [ v1 missing up_read() on error exit ]
      Signed-off-by: default avatarAhmed S. Darwish <a.darwish@linutronix.de>
      Reviewed-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      602c47fb
    • Thomas Gleixner's avatar
      sched/rt, net: Use CONFIG_PREEMPTION.patch · 20f05ba1
      Thomas Gleixner authored
      [ Upstream commit 2da2b32f
      
       ]
      
      CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by CONFIG_PREEMPT_RT.
      Both PREEMPT and PREEMPT_RT require the same functionality which today
      depends on CONFIG_PREEMPT.
      
      Update the comment to use CONFIG_PREEMPTION.
      
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: netdev@vger.kernel.org
      Link: https://lore.kernel.org/r/20191015191821.11479-22-bigeasy@linutronix.de
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      20f05ba1
    • Chen Yu's avatar
      e1000e: Do not wake up the system via WOL if device wakeup is disabled · 496fed53
      Chen Yu authored
      [ Upstream commit 6bf6be11 ]
      
      Currently the system will be woken up via WOL(Wake On LAN) even if the
      device wakeup ability has been disabled via sysfs:
       cat /sys/devices/pci0000:00/0000:00:1f.6/power/wakeup
       disabled
      
      The system should not be woken up if the user has explicitly
      disabled the wake up ability for this device.
      
      This patch clears the WOL ability of this network device if the
      user has disabled the wake up ability in sysfs.
      
      Fixes: bc7f75fa
      
       ("[E1000E]: New pci-express e1000 driver")
      Reported-by: default avatar"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Reviewed-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: default avatarChen Yu <yu.c.chen@intel.com>
      Tested-by: default avatarAaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      496fed53
    • Jiri Olsa's avatar
      kretprobe: Prevent triggering kretprobe from within kprobe_flush_task · 7b692b96
      Jiri Olsa authored
      [ Upstream commit 9b38cc70 ]
      
      Ziqian reported lockup when adding retprobe on _raw_spin_lock_irqsave.
      My test was also able to trigger lockdep output:
      
       ============================================
       WARNING: possible recursive locking detected
       5.6.0-rc6+ #6 Not tainted
       --------------------------------------------
       sched-messaging/2767 is trying to acquire lock:
       ffffffff9a492798 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_hash_lock+0x52/0xa0
      
       but task is already holding lock:
       ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(&(kretprobe_table_locks[i].lock));
         lock(&(kretprobe_table_locks[i].lock));
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       1 lock held by sched-messaging/2767:
        #0: ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50
      
       stack backtrace:
       CPU: 3 PID: 2767 Comm: sched-messaging Not tainted 5.6.0-rc6+ #6
       Call Trace:
        dump_stack+0x96/0xe0
        __lock_acquire.cold.57+0x173/0x2b7
        ? native_queued_spin_lock_slowpath+0x42b/0x9e0
        ? lockdep_hardirqs_on+0x590/0x590
        ? __lock_acquire+0xf63/0x4030
        lock_acquire+0x15a/0x3d0
        ? kretprobe_hash_lock+0x52/0xa0
        _raw_spin_lock_irqsave+0x36/0x70
        ? kretprobe_hash_lock+0x52/0xa0
        kretprobe_hash_lock+0x52/0xa0
        trampoline_handler+0xf8/0x940
        ? kprobe_fault_handler+0x380/0x380
        ? find_held_lock+0x3a/0x1c0
        kretprobe_trampoline+0x25/0x50
        ? lock_acquired+0x392/0xbc0
        ? _raw_spin_lock_irqsave+0x50/0x70
        ? __get_valid_kprobe+0x1f0/0x1f0
        ? _raw_spin_unlock_irqrestore+0x3b/0x40
        ? finish_task_switch+0x4b9/0x6d0
        ? __switch_to_asm+0x34/0x70
        ? __switch_to_asm+0x40/0x70
      
      The code within the kretprobe handler checks for probe reentrancy,
      so we won't trigger any _raw_spin_lock_irqsave probe in there.
      
      The problem is in outside kprobe_flush_task, where we call:
      
        kprobe_flush_task
          kretprobe_table_lock
            raw_spin_lock_irqsave
              _raw_spin_lock_irqsave
      
      where _raw_spin_lock_irqsave triggers the kretprobe and installs
      kretprobe_trampoline handler on _raw_spin_lock_irqsave return.
      
      The kretprobe_trampoline handler is then executed with already
      locked kretprobe_table_locks, and first thing it does is to
      lock kretprobe_table_locks ;-) the whole lockup path like:
      
        kprobe_flush_task
          kretprobe_table_lock
            raw_spin_lock_irqsave
              _raw_spin_lock_irqsave ---> probe triggered, kretprobe_trampoline installed
      
              ---> kretprobe_table_locks locked
      
              kretprobe_trampoline
                trampoline_handler
                  kretprobe_hash_lock(current, &head, &flags);  <--- deadlock
      
      Adding kprobe_busy_begin/end helpers that mark code with fake
      probe installed to prevent triggering of another kprobe within
      this code.
      
      Using these helpers in kprobe_flush_task, so the probe recursion
      protection check is hit and the probe is never set to prevent
      above lockup.
      
      Link: http://lkml.kernel.org/r/158927059835.27680.7011202830041561604.stgit@devnote2
      
      Fixes: ef53d9c5
      
       ("kprobes: improve kretprobe scalability with hashed locking")
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: "Gustavo A . R . Silva" <gustavoars@kernel.org>
      Cc: Anders Roxell <anders.roxell@linaro.org>
      Cc: "Naveen N . Rao" <naveen.n.rao@linux.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Reported-by: default avatar"Ziqian SUN (Zamir)" <zsun@redhat.com>
      Acked-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7b692b96
    • Masami Hiramatsu's avatar
      x86/kprobes: Avoid kretprobe recursion bug · 7f0e8bc7
      Masami Hiramatsu authored
      [ Upstream commit b191fa96
      
       ]
      
      Avoid kretprobe recursion loop bg by setting a dummy
      kprobes to current_kprobe per-CPU variable.
      
      This bug has been introduced with the asm-coded trampoline
      code, since previously it used another kprobe for hooking
      the function return placeholder (which only has a nop) and
      trampoline handler was called from that kprobe.
      
      This revives the old lost kprobe again.
      
      With this fix, we don't see deadlock anymore.
      
      And you can see that all inner-called kretprobe are skipped.
      
        event_1                                  235               0
        event_2                                19375           19612
      
      The 1st column is recorded count and the 2nd is missed count.
      Above shows (event_1 rec) + (event_2 rec) ~= (event_2 missed)
      (some difference are here because the counter is racy)
      
      Reported-by: default avatarAndrea Righi <righi.andrea@gmail.com>
      Tested-by: default avatarAndrea Righi <righi.andrea@gmail.com>
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Acked-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Fixes: c9becf58
      
       ("[PATCH] kretprobe: kretprobe-booster")
      Link: http://lkml.kernel.org/r/155094064889.6137.972160690963039.stgit@devbox
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7f0e8bc7
    • Naveen N Rao's avatar
      powerpc/kprobes: Fixes for kprobe_lookup_name() on BE · 190d2217
      Naveen N Rao authored
      [ Upstream commit 30176466
      
       ]
      
      Fix two issues with kprobes.h on BE which were exposed with the
      optprobes work:
        - one, having to do with a missing include for linux/module.h for
          MODULE_NAME_LEN -- this didn't show up previously since the only
          users of kprobe_lookup_name were in kprobes.c, which included
          linux/module.h through other headers, and
        - two, with a missing const qualifier for a local variable which ends
          up referring a string literal. Again, this is unique to how
          kprobe_lookup_name is being invoked in optprobes.c
      
      Signed-off-by: default avatarNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      190d2217
    • Masami Hiramatsu's avatar
      kprobes: Fix to protect kick_kprobe_optimizer() by kprobe_mutex · f637e2c7
      Masami Hiramatsu authored
      [ Upstream commit 1a0aa991 ]
      
      In kprobe_optimizer() kick_kprobe_optimizer() is called
      without kprobe_mutex, but this can race with other caller
      which is protected by kprobe_mutex.
      
      To fix that, expand kprobe_mutex protected area to protect
      kick_kprobe_optimizer() call.
      
      Link: http://lkml.kernel.org/r/158927057586.27680.5036330063955940456.stgit@devnote2
      
      Fixes: cd7ebe22
      
       ("kprobes: Use text_poke_smp_batch for optimizing")
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: "Gustavo A . R . Silva" <gustavoars@kernel.org>
      Cc: Anders Roxell <anders.roxell@linaro.org>
      Cc: "Naveen N . Rao" <naveen.n.rao@linux.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ziqian SUN <zsun@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f637e2c7
    • Eric Biggers's avatar
      crypto: algboss - don't wait during notifier callback · 44f5e4fc
      Eric Biggers authored
      commit 77251e41 upstream.
      
      When a crypto template needs to be instantiated, CRYPTO_MSG_ALG_REQUEST
      is sent to crypto_chain.  cryptomgr_schedule_probe() handles this by
      starting a thread to instantiate the template, then waiting for this
      thread to complete via crypto_larval::completion.
      
      This can deadlock because instantiating the template may require loading
      modules, and this (apparently depending on userspace) may need to wait
      for the crc-t10dif module (lib/crc-t10dif.c) to be loaded.  But
      crc-t10dif's module_init function uses crypto_register_notifier() and
      therefore takes crypto_chain.rwsem for write.  That can't proceed until
      the notifier callback has finished, as it holds this semaphore for read.
      
      Fix this by removing the wait on crypto_larval::completion from within
      cryptomgr_schedule_probe().  It's actually unnecessary because
      crypto_alg_mod_lookup() calls crypto_larval_wait() itself after sending
      CRYPTO_MSG_ALG_REQUEST.
      
      This only actually became a problem in v4.20 due to commit b7637754
      
      
      ("crc-t10dif: Pick better transform if one becomes available"), but the
      unnecessary wait was much older.
      
      BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207159
      Reported-by: default avatarMike Gerow <gerow@google.com>
      Fixes: 39871037
      
       ("crypto: algapi - Move larval completion into algboss")
      Cc: <stable@vger.kernel.org> # v3.6+
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Reported-by: default avatarKai Lüke <kai@kinvolk.io>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      44f5e4fc
    • Ahmed S. Darwish's avatar
      block: nr_sects_write(): Disable preemption on seqcount write · 578c6ed2
      Ahmed S. Darwish authored
      [ Upstream commit 15b81ce5 ]
      
      For optimized block readers not holding a mutex, the "number of sectors"
      64-bit value is protected from tearing on 32-bit architectures by a
      sequence counter.
      
      Disable preemption before entering that sequence counter's write side
      critical section. Otherwise, the read side can preempt the write side
      section and spin for the entire scheduler tick. If the reader belongs to
      a real-time scheduling class, it can spin forever and the kernel will
      livelock.
      
      Fixes: c83f6bf9
      
       ("block: add partition resize function to blkpg ioctl")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAhmed S. Darwish <a.darwish@linutronix.de>
      Reviewed-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      578c6ed2
    • Al Viro's avatar
      sparc64: fix misuses of access_process_vm() in genregs32_[sg]et() · 885237ae
      Al Viro authored
      [ Upstream commit 142cd252 ]
      
      We do need access_process_vm() to access the target's reg_window.
      However, access to caller's memory (storing the result in
      genregs32_get(), fetching the new values in case of genregs32_set())
      should be done by normal uaccess primitives.
      
      Fixes: ad4f9576
      
       ([SPARC64]: Fix user accesses in regset code.)
      Cc: stable@kernel.org
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      885237ae
    • Lyude Paul's avatar
      drm/dp_mst: Increase ACT retry timeout to 3s · a039515c
      Lyude Paul authored
      [ Upstream commit 873a95e0
      
       ]
      
      Currently we only poll for an ACT up to 30 times, with a busy-wait delay
      of 100µs between each attempt - giving us a timeout of 2900µs. While
      this might seem sensible, it would appear that in certain scenarios it
      can take dramatically longer then that for us to receive an ACT. On one
      of the EVGA MST hubs that I have available, I observed said hub
      sometimes taking longer then a second before signalling the ACT. These
      delays mostly seem to occur when previous sideband messages we've sent
      are NAKd by the hub, however it wouldn't be particularly surprising if
      it's possible to reproduce times like this simply by introducing branch
      devices with large LCTs since payload allocations have to take effect on
      every downstream device up to the payload's target.
      
      So, instead of just retrying 30 times we poll for the ACT for up to 3ms,
      and additionally use usleep_range() to avoid a very long and rude
      busy-wait. Note that the previous retry count of 30 appears to have been
      arbitrarily chosen, as I can't find any mention of a recommended timeout
      or retry count for ACTs in the DisplayPort 2.0 specification. This also
      goes for the range we were previously using for udelay(), although I
      suspect that was just copied from the recommended delay for link
      training on SST devices.
      
      Changes since v1:
      * Use readx_poll_timeout() instead of open-coding timeout loop - Sean
        Paul
      Changes since v2:
      * Increase poll interval to 200us - Sean Paul
      * Print status in hex when we timeout waiting for ACT - Sean Paul
      
      Signed-off-by: default avatarLyude Paul <lyude@redhat.com>
      Fixes: ad7f8a1f
      
       ("drm/helper: add Displayport multi-stream helper (v0.6)")
      Cc: Sean Paul <sean@poorly.run>
      Cc: <stable@vger.kernel.org> # v3.17+
      Reviewed-by: default avatarSean Paul <sean@poorly.run>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200406221253.1307209-4-lyude@redhat.com
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a039515c
    • Jeffle Xu's avatar
      ext4: fix partial cluster initialization when splitting extent · bd0fc767
      Jeffle Xu authored
      [ Upstream commit cfb3c85a ]
      
      Fix the bug when calculating the physical block number of the first
      block in the split extent.
      
      This bug will cause xfstests shared/298 failure on ext4 with bigalloc
      enabled occasionally. Ext4 error messages indicate that previously freed
      blocks are being freed again, and the following fsck will fail due to
      the inconsistency of block bitmap and bg descriptor.
      
      The following is an example case:
      
      1. First, Initialize a ext4 filesystem with cluster size '16K', block size
      '4K', in which case, one cluster contains four blocks.
      
      2. Create one file (e.g., xxx.img) on this ext4 filesystem. Now the extent
      tree of this file is like:
      
      ...
      36864:[0]4:220160
      36868:[0]14332:145408
      51200:[0]2:231424
      ...
      
      3. Then execute PUNCH_HOLE fallocate on this file. The hole range is
      like:
      
      ..
      ext4_ext_remove_space: dev 254,16 ino 12 since 49506 end 49506 depth 1
      ext4_ext_remove_space: dev 254,16 ino 12 since 49544 end 49546 depth 1
      ext4_ext_remove_space: dev 254,16 ino 12 since 49605 end 49607 depth 1
      ...
      
      4. Then the extent tree of this file after punching is like
      
      ...
      49507:[0]37:158047
      49547:[0]58:158087
      ...
      
      5. Detailed procedure of punching hole [49544, 49546]
      
      5.1. The block address space:
      ```
      lblk        ~49505  49506   49507~49543     49544~49546    49547~
      	  ---------+------+-------------+----------------+--------
      	    extent | hole |   extent	|	hole	 | extent
      	  ---------+------+-------------+----------------+--------
      pblk       ~158045  158046  158047~158083  158084~158086   158087~
      ```
      
      5.2. The detailed layout of cluster 39521:
      ```
      		cluster 39521
      	<------------------------------->
      
      		hole		  extent
      	<----------------------><--------
      
      lblk      49544   49545   49546   49547
      	+-------+-------+-------+-------+
      	|	|	|	|	|
      	+-------+-------+-------+-------+
      pblk     158084  1580845  158086  158087
      ```
      
      5.3. The ftrace output when punching hole [49544, 49546]:
      - ext4_ext_remove_space (start 49544, end 49546)
        - ext4_ext_rm_leaf (start 49544, end 49546, last_extent [49507(158047), 40], partial [pclu 39522 lblk 0 state 2])
          - ext4_remove_blocks (extent [49507(158047), 40], from 49544 to 49546, partial [pclu 39522 lblk 0 state 2]
            - ext4_free_blocks: (block 158084 count 4)
              - ext4_mballoc_free (extent 1/6753/1)
      
      5.4. Ext4 error message in dmesg:
      EXT4-fs error (device vdb): mb_free_blocks:1457: group 1, block 158084:freeing already freed block (bit 6753); block bitmap corrupt.
      EXT4-fs error (device vdb): ext4_mb_generate_buddy:747: group 1, block bitmap and bg descriptor inconsistent: 19550 vs 19551 free clusters
      
      In this case, the whole cluster 39521 is freed mistakenly when freeing
      pblock 158084~158086 (i.e., the first three blocks of this cluster),
      although pblock 158087 (the last remaining block of this cluster) has
      not been freed yet.
      
      The root cause of this isuue is that, the pclu of the partial cluster is
      calculated mistakenly in ext4_ext_remove_space(). The correct
      partial_cluster.pclu (i.e., the cluster number of the first block in the
      next extent, that is, lblock 49597 (pblock 158086)) should be 39521 rather
      than 39522.
      
      Fixes: f4226d9e
      
       ("ext4: fix partial cluster initialization")
      Signed-off-by: default avatarJeffle Xu <jefflexu@linux.alibaba.com>
      Reviewed-by: default avatarEric Whitney <enwlinux@gmail.com>
      Cc: stable@kernel.org # v3.19+
      Link: https://lore.kernel.org/r/1590121124-37096-1-git-send-email-jefflexu@linux.alibaba.com
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      bd0fc767
    • Tom Rix's avatar
      selinux: fix double free · 13f03f2b
      Tom Rix authored
      commit 65de5096
      
       upstream.
      
      Clang's static analysis tool reports these double free memory errors.
      
      security/selinux/ss/services.c:2987:4: warning: Attempt to free released memory [unix.Malloc]
                              kfree(bnames[i]);
                              ^~~~~~~~~~~~~~~~
      security/selinux/ss/services.c:2990:2: warning: Attempt to free released memory [unix.Malloc]
              kfree(bvalues);
              ^~~~~~~~~~~~~~
      
      So improve the security_get_bools error handling by freeing these variables
      and setting their return pointers to NULL and the return len to 0
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarTom Rix <trix@redhat.com>
      Acked-by: default avatarStephen Smalley <stephen.smalley.work@gmail.com>
      Signed-off-by: default avatarPaul Moore <paul@paul-moore.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      13f03f2b
    • Huacai Chen's avatar
      drm/qxl: Use correct notify port address when creating cursor ring · 311e2f76
      Huacai Chen authored
      commit 80e5f89d
      
       upstream.
      
      The command ring and cursor ring use different notify port addresses
      definition: QXL_IO_NOTIFY_CMD and QXL_IO_NOTIFY_CURSOR. However, in
      qxl_device_init() we use QXL_IO_NOTIFY_CMD to create both command ring
      and cursor ring. This doesn't cause any problems now, because QEMU's
      behaviors on QXL_IO_NOTIFY_CMD and QXL_IO_NOTIFY_CURSOR are the same.
      However, QEMU's behavior may be change in future, so let's fix it.
      
      P.S.: In the X.org QXL driver, the notify port address of cursor ring
            is correct.
      
      Signed-off-by: default avatarHuacai Chen <chenhc@lemote.com>
      Cc: <stable@vger.kernel.org>
      Link: http://patchwork.freedesktop.org/patch/msgid/1585635488-17507-1-git-send-email-chenhc@lemote.com
      Signed-off-by: default avatarGerd Hoffmann <kraxel@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      311e2f76
    • Lyude Paul's avatar
      drm/dp_mst: Reformat drm_dp_check_act_status() a bit · 8a654598
      Lyude Paul authored
      commit a5cb5fa6
      
       upstream.
      
      Just add a bit more line wrapping, get rid of some extraneous
      whitespace, remove an unneeded goto label, and move around some variable
      declarations. No functional changes here.
      
      Signed-off-by: default avatarLyude Paul <lyude@redhat.com>
      [this isn't a fix, but it's needed for the fix that comes after this]
      Fixes: ad7f8a1f
      
       ("drm/helper: add Displayport multi-stream helper (v0.6)")
      Cc: Sean Paul <sean@poorly.run>
      Cc: <stable@vger.kernel.org> # v3.17+
      Reviewed-by: default avatarSean Paul <sean@poorly.run>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200406221253.1307209-3-lyude@redhat.com
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8a654598
    • Wolfram Sang's avatar
      drm: encoder_slave: fix refcouting error for modules · 64565a55
      Wolfram Sang authored
      [ Upstream commit f78d4032 ]
      
      module_put() balances try_module_get(), not request_module(). Fix the
      error path to match that.
      
      Fixes: 2066facc
      
       ("drm/kms: slave encoder interface.")
      Signed-off-by: default avatarWolfram Sang <wsa+renesas@sang-engineering.com>
      Reviewed-by: default avatarEmil Velikov <emil.l.velikov@gmail.com>
      Acked-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: default avatarWolfram Sang <wsa@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      64565a55
    • Kai-Heng Feng's avatar
      libata: Use per port sync for detach · d8a71a53
      Kai-Heng Feng authored
      [ Upstream commit b5292111 ]
      
      Commit 130f4caf ("libata: Ensure ata_port probe has completed before
      detach") may cause system freeze during suspend.
      
      Using async_synchronize_full() in PM callbacks is wrong, since async
      callbacks that are already scheduled may wait for not-yet-scheduled
      callbacks, causes a circular dependency.
      
      Instead of using big hammer like async_synchronize_full(), use async
      cookie to make sure port probe are synced, without affecting other
      scheduled PM callbacks.
      
      Fixes: 130f4caf
      
       ("libata: Ensure ata_port probe has completed before detach")
      Suggested-by: default avatarJohn Garry <john.garry@huawei.com>
      Signed-off-by: default avatarKai-Heng Feng <kai.heng.feng@canonical.com>
      Tested-by: default avatarJohn Garry <john.garry@huawei.com>
      BugLink: https://bugs.launchpad.net/bugs/1867983
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d8a71a53
    • Jason Yan's avatar
      block: Fix use-after-free in blkdev_get() · f9aa90e1
      Jason Yan authored
      [ Upstream commit 2d3a8e2d ]
      
      In blkdev_get() we call __blkdev_get() to do some internal jobs and if
      there is some errors in __blkdev_get(), the bdput() is called which
      means we have released the refcount of the bdev (actually the refcount of
      the bdev inode). This means we cannot access bdev after that point. But
      acctually bdev is still accessed in blkdev_get() after calling
      __blkdev_get(). This results in use-after-free if the refcount is the
      last one we released in __blkdev_get(). Let's take a look at the
      following scenerio:
      
        CPU0            CPU1                    CPU2
      blkdev_open     blkdev_open           Remove disk
                        bd_acquire
      		  blkdev_get
      		    __blkdev_get      del_gendisk
      					bdev_unhash_inode
        bd_acquire          bdev_get_gendisk
          bd_forget           failed because of unhashed
      	  bdput
      	              bdput (the last one)
      		        bdev_evict_inode
      
      	  	    access bdev => use after free
      
      [  459.350216] BUG: KASAN: use-after-free in __lock_acquire+0x24c1/0x31b0
      [  459.351190] Read of size 8 at addr ffff88806c815a80 by task syz-executor.0/20132
      [  459.352347]
      [  459.352594] CPU: 0 PID: 20132 Comm: syz-executor.0 Not tainted 4.19.90 #2
      [  459.353628] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      [  459.354947] Call Trace:
      [  459.355337]  dump_stack+0x111/0x19e
      [  459.355879]  ? __lock_acquire+0x24c1/0x31b0
      [  459.356523]  print_address_description+0x60/0x223
      [  459.357248]  ? __lock_acquire+0x24c1/0x31b0
      [  459.357887]  kasan_report.cold+0xae/0x2d8
      [  459.358503]  __lock_acquire+0x24c1/0x31b0
      [  459.359120]  ? _raw_spin_unlock_irq+0x24/0x40
      [  459.359784]  ? lockdep_hardirqs_on+0x37b/0x580
      [  459.360465]  ? _raw_spin_unlock_irq+0x24/0x40
      [  459.361123]  ? finish_task_switch+0x125/0x600
      [  459.361812]  ? finish_task_switch+0xee/0x600
      [  459.362471]  ? mark_held_locks+0xf0/0xf0
      [  459.363108]  ? __schedule+0x96f/0x21d0
      [  459.363716]  lock_acquire+0x111/0x320
      [  459.364285]  ? blkdev_get+0xce/0xbe0
      [  459.364846]  ? blkdev_get+0xce/0xbe0
      [  459.365390]  __mutex_lock+0xf9/0x12a0
      [  459.365948]  ? blkdev_get+0xce/0xbe0
      [  459.366493]  ? bdev_evict_inode+0x1f0/0x1f0
      [  459.367130]  ? blkdev_get+0xce/0xbe0
      [  459.367678]  ? destroy_inode+0xbc/0x110
      [  459.368261]  ? mutex_trylock+0x1a0/0x1a0
      [  459.368867]  ? __blkdev_get+0x3e6/0x1280
      [  459.369463]  ? bdev_disk_changed+0x1d0/0x1d0
      [  459.370114]  ? blkdev_get+0xce/0xbe0
      [  459.370656]  blkdev_get+0xce/0xbe0
      [  459.371178]  ? find_held_lock+0x2c/0x110
      [  459.371774]  ? __blkdev_get+0x1280/0x1280
      [  459.372383]  ? lock_downgrade+0x680/0x680
      [  459.373002]  ? lock_acquire+0x111/0x320
      [  459.373587]  ? bd_acquire+0x21/0x2c0
      [  459.374134]  ? do_raw_spin_unlock+0x4f/0x250
      [  459.374780]  blkdev_open+0x202/0x290
      [  459.375325]  do_dentry_open+0x49e/0x1050
      [  459.375924]  ? blkdev_get_by_dev+0x70/0x70
      [  459.376543]  ? __x64_sys_fchdir+0x1f0/0x1f0
      [  459.377192]  ? inode_permission+0xbe/0x3a0
      [  459.377818]  path_openat+0x148c/0x3f50
      [  459.378392]  ? kmem_cache_alloc+0xd5/0x280
      [  459.379016]  ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  459.379802]  ? path_lookupat.isra.0+0x900/0x900
      [  459.380489]  ? __lock_is_held+0xad/0x140
      [  459.381093]  do_filp_open+0x1a1/0x280
      [  459.381654]  ? may_open_dev+0xf0/0xf0
      [  459.382214]  ? find_held_lock+0x2c/0x110
      [  459.382816]  ? lock_downgrade+0x680/0x680
      [  459.383425]  ? __lock_is_held+0xad/0x140
      [  459.384024]  ? do_raw_spin_unlock+0x4f/0x250
      [  459.384668]  ? _raw_spin_unlock+0x1f/0x30
      [  459.385280]  ? __alloc_fd+0x448/0x560
      [  459.385841]  do_sys_open+0x3c3/0x500
      [  459.386386]  ? filp_open+0x70/0x70
      [  459.386911]  ? trace_hardirqs_on_thunk+0x1a/0x1c
      [  459.387610]  ? trace_hardirqs_off_caller+0x55/0x1c0
      [  459.388342]  ? do_syscall_64+0x1a/0x520
      [  459.388930]  do_syscall_64+0xc3/0x520
      [  459.389490]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  459.390248] RIP: 0033:0x416211
      [  459.390720] Code: 75 14 b8 02 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83
      04 19 00 00 c3 48 83 ec 08 e8 0a fa ff ff 48 89 04 24 b8 02 00 00 00 0f
         05 <48> 8b 3c 24 48 89 c2 e8 53 fa ff ff 48 89 d0 48 83 c4 08 48 3d
            01
      [  459.393483] RSP: 002b:00007fe45dfe9a60 EFLAGS: 00000293 ORIG_RAX: 0000000000000002
      [  459.394610] RAX: ffffffffffffffda RBX: 00007fe45dfea6d4 RCX: 0000000000416211
      [  459.395678] RDX: 00007fe45dfe9b0a RSI: 0000000000000002 RDI: 00007fe45dfe9b00
      [  459.396758] RBP: 000000000076bf20 R08: 0000000000000000 R09: 000000000000000a
      [  459.397930] R10: 0000000000000075 R11: 0000000000000293 R12: 00000000ffffffff
      [  459.399022] R13: 0000000000000bd9 R14: 00000000004cdb80 R15: 000000000076bf2c
      [  459.400168]
      [  459.400430] Allocated by task 20132:
      [  459.401038]  kasan_kmalloc+0xbf/0xe0
      [  459.401652]  kmem_cache_alloc+0xd5/0x280
      [  459.402330]  bdev_alloc_inode+0x18/0x40
      [  459.402970]  alloc_inode+0x5f/0x180
      [  459.403510]  iget5_locked+0x57/0xd0
      [  459.404095]  bdget+0x94/0x4e0
      [  459.404607]  bd_acquire+0xfa/0x2c0
      [  459.405113]  blkdev_open+0x110/0x290
      [  459.405702]  do_dentry_open+0x49e/0x1050
      [  459.406340]  path_openat+0x148c/0x3f50
      [  459.406926]  do_filp_open+0x1a1/0x280
      [  459.407471]  do_sys_open+0x3c3/0x500
      [  459.408010]  do_syscall_64+0xc3/0x520
      [  459.408572]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  459.409415]
      [  459.409679] Freed by task 1262:
      [  459.410212]  __kasan_slab_free+0x129/0x170
      [  459.410919]  kmem_cache_free+0xb2/0x2a0
      [  459.411564]  rcu_process_callbacks+0xbb2/0x2320
      [  459.412318]  __do_softirq+0x225/0x8ac
      
      Fix this by delaying bdput() to the end of blkdev_get() which means we
      have finished accessing bdev.
      
      Fixes: 77ea887e
      
       ("implement in-kernel gendisk events handling")
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Signed-off-by: default avatarJason Yan <yanaijie@huawei.com>
      Tested-by: default avatarSedat Dilek <sedat.dilek@gmail.com>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Ming Lei <ming.lei@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f9aa90e1
    • Zhiqiang Liu's avatar
      bcache: fix potential deadlock problem in btree_gc_coalesce · 9517bec2
      Zhiqiang Liu authored
      [ Upstream commit be23e837 ]
      
      coccicheck reports:
        drivers/md//bcache/btree.c:1538:1-7: preceding lock on line 1417
      
      In btree_gc_coalesce func, if the coalescing process fails, we will goto
      to out_nocoalesce tag directly without releasing new_nodes[i]->write_lock.
      Then, it will cause a deadlock when trying to acquire new_nodes[i]->
      write_lock for freeing new_nodes[i] before return.
      
      btree_gc_coalesce func details as follows:
      	if alloc new_nodes[i] fails:
      		goto out_nocoalesce;
      	// obtain new_nodes[i]->write_lock
      	mutex_lock(&new_nodes[i]->write_lock)
      	// main coalescing process
      	for (i = nodes - 1; i > 0; --i)
      		[snipped]
      		if coalescing process fails:
      			// Here, directly goto out_nocoalesce
      			 // tag will cause a deadlock
      			goto out_nocoalesce;
      		[snipped]
      	// release new_nodes[i]->write_lock
      	mutex_unlock(&new_nodes[i]->write_lock)
      	// coalesing succ, return
      	return;
      out_nocoalesce:
      	btree_node_free(new_nodes[i])	// free new_nodes[i]
      	// obtain new_nodes[i]->write_lock
      	mutex_lock(&new_nodes[i]->write_lock);
      	// set flag for reuse
      	clear_bit(BTREE_NODE_dirty, &ew_nodes[i]->flags);
      	// release new_nodes[i]->write_lock
      	mutex_unlock(&new_nodes[i]->write_lock);
      
      To fix the problem, we add a new tag 'out_unlock_nocoalesce' for
      releasing new_nodes[i]->write_lock before out_nocoalesce tag. If
      coalescing process fails, we will go to out_unlock_nocoalesce tag
      for releasing new_nodes[i]->write_lock before free new_nodes[i] in
      out_nocoalesce tag.
      
      (Coly Li helps to clean up commit log format.)
      
      Fixes: 2a285686
      
       ("bcache: btree locking rework")
      Signed-off-by: default avatarZhiqiang Liu <liuzhiqiang26@huawei.com>
      Signed-off-by: default avatarColy Li <colyli@suse.de>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      9517bec2
    • Gaurav Singh's avatar
      perf report: Fix NULL pointer dereference in hists__fprintf_nr_sample_events() · d3e4550e
      Gaurav Singh authored
      [ Upstream commit 11b6e548 ]
      
      The 'evname' variable can be NULL, as it is checked a few lines back,
      check it before using.
      
      Fixes: 9e207ddf
      
       ("perf report: Show call graph from reference events")
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/
      Signed-off-by: default avatarGaurav Singh <gaurav1086@gmail.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d3e4550e
    • Qais Yousef's avatar
      usb/ehci-platform: Set PM runtime as active on resume · 13af14df
      Qais Yousef authored
      [ Upstream commit 16bdc04c
      
       ]
      
      Follow suit of ohci-platform.c and perform pm_runtime_set_active() on
      resume.
      
      ohci-platform.c had a warning reported due to the missing
      pm_runtime_set_active() [1].
      
      [1] https://lore.kernel.org/lkml/20200323143857.db5zphxhq4hz3hmd@e107158-lin.cambridge.arm.com/
      
      Acked-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Signed-off-by: default avatarQais Yousef <qais.yousef@arm.com>
      CC: Tony Prisk <linux@prisktech.co.nz>
      CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      CC: Mathias Nyman <mathias.nyman@intel.com>
      CC: Oliver Neukum <oneukum@suse.de>
      CC: linux-arm-kernel@lists.infradead.org
      CC: linux-usb@vger.kernel.org
      CC: linux-kernel@vger.kernel.org
      Link: https://lore.kernel.org/r/20200518154931.6144-3-qais.yousef@arm.com
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      13af14df
    • Qais Yousef's avatar
      usb/xhci-plat: Set PM runtime as active on resume · 737c975d
      Qais Yousef authored
      [ Upstream commit 79112cc3
      
       ]
      
      Follow suit of ohci-platform.c and perform pm_runtime_set_active() on
      resume.
      
      ohci-platform.c had a warning reported due to the missing
      pm_runtime_set_active() [1].
      
      [1] https://lore.kernel.org/lkml/20200323143857.db5zphxhq4hz3hmd@e107158-lin.cambridge.arm.com/
      
      Signed-off-by: default avatarQais Yousef <qais.yousef@arm.com>
      CC: Tony Prisk <linux@prisktech.co.nz>
      CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      CC: Mathias Nyman <mathias.nyman@intel.com>
      CC: Oliver Neukum <oneukum@suse.de>
      CC: linux-arm-kernel@lists.infradead.org
      CC: linux-usb@vger.kernel.org
      CC: linux-kernel@vger.kernel.org
      Link: https://lore.kernel.org/r/20200518154931.6144-2-qais.yousef@arm.com
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      737c975d
    • Christophe JAILLET's avatar
      scsi: acornscsi: Fix an error handling path in acornscsi_probe() · 4d046d71
      Christophe JAILLET authored
      [ Upstream commit 42c76c98 ]
      
      'ret' is known to be 0 at this point.  Explicitly return -ENOMEM if one of
      the 'ecardm_iomap()' calls fail.
      
      Link: https://lore.kernel.org/r/20200530081622.577888-1-christophe.jaillet@wanadoo.fr
      Fixes: e95a1b65
      
       ("[ARM] rpc: acornscsi: update to new style ecard driver")
      Signed-off-by: default avatarChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4d046d71
    • tannerlove's avatar
      selftests/net: in timestamping, strncpy needs to preserve null byte · 7f5c8bd3
      tannerlove authored
      [ Upstream commit 8027bc03 ]
      
      If user passed an interface option longer than 15 characters, then
      device.ifr_name and hwtstamp.ifr_name became non-null-terminated
      strings. The compiler warned about this:
      
      timestamping.c:353:2: warning: ‘strncpy’ specified bound 16 equals \
      destination size [-Wstringop-truncation]
        353 |  strncpy(device.ifr_name, interface, sizeof(device.ifr_name));
      
      Fixes: cb9eff09
      
       ("net: new user space API for time stamping of incoming and outgoing packets")
      Signed-off-by: default avatarTanner Love <tannerlove@google.com>
      Acked-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7f5c8bd3
    • Nick Desaulniers's avatar
      elfnote: mark all .note sections SHF_ALLOC · 7e5c47b8
      Nick Desaulniers authored
      [ Upstream commit 51da9dfb
      
       ]
      
      ELFNOTE_START allows callers to specify flags for .pushsection assembler
      directives.  All callsites but ELF_NOTE use "a" for SHF_ALLOC.  For vdso's
      that explicitly use ELF_NOTE_START and BUILD_SALT, the same section is
      specified twice after preprocessing, once with "a" flag, once without.
      Example:
      
      .pushsection .note.Linux, "a", @note ;
      .pushsection .note.Linux, "", @note ;
      
      While GNU as allows this ordering, it warns for the opposite ordering,
      making these directives position dependent.  We'd prefer not to precisely
      match this behavior in Clang's integrated assembler.  Instead, the non
      __ASSEMBLY__ definition of ELF_NOTE uses
      __attribute__((section(".note.Linux"))) which is created with SHF_ALLOC,
      so let's make the __ASSEMBLY__ definition of ELF_NOTE consistent with C
      and just always use "a" flag.
      
      This allows Clang to assemble a working mainline (5.6) kernel via:
      $ make CC=clang AS=clang
      
      Signed-off-by: default avatarNick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Reviewed-by: default avatarFangrui Song <maskray@google.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Link: https://github.com/ClangBuiltLinux/linux/issues/913
      Link: http://lkml.kernel.org/r/20200325231250.99205-1-ndesaulniers@google.com
      Debugged-by: default avatarIlie Halip <ilie.halip@gmail.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7e5c47b8
    • Arnd Bergmann's avatar
      include/linux/bitops.h: avoid clang shift-count-overflow warnings · b656010d
      Arnd Bergmann authored
      [ Upstream commit bd93f003
      
       ]
      
      Clang normally does not warn about certain issues in inline functions when
      it only happens in an eliminated code path. However if something else
      goes wrong, it does tend to complain about the definition of hweight_long()
      on 32-bit targets:
      
        include/linux/bitops.h:75:41: error: shift count >= width of type [-Werror,-Wshift-count-overflow]
                return sizeof(w) == 4 ? hweight32(w) : hweight64(w);
                                                       ^~~~~~~~~~~~
        include/asm-generic/bitops/const_hweight.h:29:49: note: expanded from macro 'hweight64'
         define hweight64(w) (__builtin_constant_p(w) ? __const_hweight64(w) : __arch_hweight64(w))
                                                        ^~~~~~~~~~~~~~~~~~~~
        include/asm-generic/bitops/const_hweight.h:21:76: note: expanded from macro '__const_hweight64'
         define __const_hweight64(w) (__const_hweight32(w) + __const_hweight32((w) >> 32))
                                                                                   ^  ~~
        include/asm-generic/bitops/const_hweight.h:20:49: note: expanded from macro '__const_hweight32'
         define __const_hweight32(w) (__const_hweight16(w) + __const_hweight16((w) >> 16))
                                                        ^
        include/asm-generic/bitops/const_hweight.h:19:72: note: expanded from macro '__const_hweight16'
         define __const_hweight16(w) (__const_hweight8(w)  + __const_hweight8((w)  >> 8 ))
                                                                               ^
        include/asm-generic/bitops/const_hweight.h:12:9: note: expanded from macro '__const_hweight8'
                  (!!((w) & (1ULL << 2))) +     \
      
      Adding an explicit cast to __u64 avoids that warning and makes it easier
      to read other output.
      
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Acked-by: default avatarChristian Brauner <christian.brauner@ubuntu.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Link: http://lkml.kernel.org/r/20200505135513.65265-1-arnd@arndb.de
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      b656010d
    • Jann Horn's avatar
      lib/zlib: remove outdated and incorrect pre-increment optimization · f0db9a64
      Jann Horn authored
      [ Upstream commit acaab733
      
       ]
      
      The zlib inflate code has an old micro-optimization based on the
      assumption that for pre-increment memory accesses, the compiler will
      generate code that fits better into the processor's pipeline than what
      would be generated for post-increment memory accesses.
      
      This optimization was already removed in upstream zlib in 2016:
      https://github.com/madler/zlib/commit/9aaec95e8211
      
      This optimization causes UB according to C99, which says in section 6.5.6
      "Additive operators": "If both the pointer operand and the result point to
      elements of the same array object, or one past the last element of the
      array object, the evaluation shall not produce an overflow; otherwise, the
      behavior is undefined".
      
      This UB is not only a theoretical concern, but can also cause trouble for
      future work on compiler-based sanitizers.
      
      According to the zlib commit, this optimization also is not optimal
      anymore with modern compilers.
      
      Replace uses of OFF, PUP and UP_UNALIGNED with their definitions in the
      POSTINC case, and remove the macro definitions, just like in the upstream
      patch.
      
      Signed-off-by: default avatarJann Horn <jannh@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Cc: Mikhail Zaslonko <zaslonko@linux.ibm.com>
      Link: http://lkml.kernel.org/r/20200507123112.252723-1-jannh@google.com
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f0db9a64
    • Qiushi Wu's avatar
      scsi: iscsi: Fix reference count leak in iscsi_boot_create_kobj · d9aa44ef
      Qiushi Wu authored
      [ Upstream commit 0267ffce
      
       ]
      
      kobject_init_and_add() takes reference even when it fails. If this
      function returns an error, kobject_put() must be called to properly
      clean up the memory associated with the object.
      
      Link: https://lore.kernel.org/r/20200528201353.14849-1-wu000273@umn.edu
      Reviewed-by: default avatarLee Duncan <lduncan@suse.com>
      Signed-off-by: default avatarQiushi Wu <wu000273@umn.edu>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d9aa44ef
    • Bob Peterson's avatar
      gfs2: Allow lock_nolock mount to specify jid=X · 536fd514
      Bob Peterson authored
      [ Upstream commit ea22eee4
      
       ]
      
      Before this patch, a simple typo accidentally added \n to the jid=
      string for lock_nolock mounts. This made it impossible to mount a
      gfs2 file system with a journal other than journal0. Thus:
      
      mount -tgfs2 -o hostdata="jid=1" <device> <mount pt>
      
      Resulted in:
      mount: wrong fs type, bad option, bad superblock on <device>
      
      In most cases this is not a problem. However, for debugging and
      testing purposes we sometimes want to test the integrity of other
      journals. This patch removes the unnecessary \n and thus allows
      lock_nolock users to specify an alternate journal.
      
      Signed-off-by: default avatarBob Peterson <rpeterso@redhat.com>
      Signed-off-by: default avatarAndreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      536fd514
    • Stafford Horne's avatar
      openrisc: Fix issue with argument clobbering for clone/fork · d9f4137a
      Stafford Horne authored
      [ Upstream commit 6bd140e1
      
       ]
      
      Working on the OpenRISC glibc port I found that sometimes clone was
      working strange.  That the tls data argument sent in r7 was always
      wrong.  Further investigation revealed that the arguments were getting
      clobbered in the entry code.  This patch removes the code that writes to
      the argument registers.  This was likely due to some old code hanging
      around.
      
      This patch fixes this up for clone and fork.  This fork clobber is
      harmless but also useless so remove.
      
      Signed-off-by: default avatarStafford Horne <shorne@gmail.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d9f4137a
    • Xiyu Yang's avatar
      ASoC: fsl_asrc_dma: Fix dma_chan leak when config DMA channel failed · 0e8108e0
      Xiyu Yang authored
      [ Upstream commit 36124fb1
      
       ]
      
      fsl_asrc_dma_hw_params() invokes dma_request_channel() or
      fsl_asrc_get_dma_channel(), which returns a reference of the specified
      dma_chan object to "pair->dma_chan[dir]" with increased refcnt.
      
      The reference counting issue happens in one exception handling path of
      fsl_asrc_dma_hw_params(). When config DMA channel failed for Back-End,
      the function forgets to decrease the refcnt increased by
      dma_request_channel() or fsl_asrc_get_dma_channel(), causing a refcnt
      leak.
      
      Fix this issue by calling dma_release_channel() when config DMA channel
      failed.
      
      Signed-off-by: default avatarXiyu Yang <xiyuyang19@fudan.edu.cn>
      Signed-off-by: default avatarXin Tan <tanxin.ctf@gmail.com>
      Link: https://lore.kernel.org/r/1590415966-52416-1-git-send-email-xiyuyang19@fudan.edu.cn
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      0e8108e0