  1. Oct 08, 2019
    • net: ipv4: avoid mixed n_redirects and rate_tokens usage · 562c2ff7
      Paolo Abeni authored
      [ Upstream commit b406472b ]
      
      Since commit c09551c6 ("net: ipv4: use a dedicated counter
      for icmp_v4 redirect packets") we use 'n_redirects' to account
      for redirect packets, but we still use 'rate_tokens' to compute
      the redirect packets exponential backoff.
      
      If the device sent any ICMP error packet to the relevant peer
      after sending a redirect, it will also update 'rate_tokens' according
      to the leaky bucket scheme; typically 'rate_tokens' will rise
      above BITS_PER_LONG and the redirect packets backoff algorithm
      will produce undefined behavior.
      
      Fix the issue using 'n_redirects' to compute the exponential backoff
      in ip_rt_send_redirect().
      
      Note that we still clear rate_tokens after a redirect silence period,
      to avoid changing an established behaviour.
      
      The root cause predates git history. In the critical scenario, the
      kernel stopped sending redirects before the mentioned commit; after
      the mentioned commit the behavior is more random.
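
      As a rough illustration of why the shift count matters, here is a
      minimal userspace sketch. The struct and helper names are hypothetical
      and do not come from the kernel sources; only the field names follow
      the commit message. It assumes the backoff has the form
      "base delay << counter", which is undefined in C once the counter
      reaches the width of unsigned long.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define BITS_PER_LONG ((unsigned)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical stand-in for the per-peer state named in the commit message. */
struct peer_state {
    unsigned long rate_last;   /* time of the last redirect, in ticks */
    unsigned long rate_tokens; /* leaky-bucket tokens, also touched by ICMP errors */
    unsigned int n_redirects;  /* redirects sent since the last silence period */
};

/* A left shift is only defined while the count is below the operand width. */
static bool shift_is_defined(unsigned long count)
{
    return count < BITS_PER_LONG;
}

/* Exponential backoff driven by the dedicated redirect counter. */
static unsigned long redirect_backoff(const struct peer_state *p,
                                      unsigned long base_delay)
{
    assert(shift_is_defined(p->n_redirects));
    return base_delay << p->n_redirects;
}
```

      With n_redirects = 3 and a base delay of 5 ticks the backoff is 40
      ticks, while a rate_tokens value such as 200 would not be a valid
      shift count at all.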
      
      Reported-by: Xiumei Mu <xmu@redhat.com>
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Fixes: c09551c6 ("net: ipv4: use a dedicated counter for icmp_v4 redirect packets")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv6: drop incoming packets having a v4mapped source address · 33b301a1
      Eric Dumazet authored
      [ Upstream commit 6af1799a ]
      
      This began with a syzbot report. syzkaller was injecting
      IPv6 TCP SYN packets having a v4mapped source address.
      
      After an unsuccessful 4-tuple lookup, TCP creates a request
      socket (SYN_RECV) and calls reqsk_queue_hash_req()
      
      reqsk_queue_hash_req() calls sk_ehashfn(sk)
      
      At this point we have AF_INET6 sockets, and the heuristic
      used by sk_ehashfn() to either hash the IPv4 or IPv6 addresses
      is to use ipv6_addr_v4mapped(&sk->sk_v6_daddr)
      
      For the particular spoofed packet, we end up hashing V4 addresses
      which were not initialized by the TCP IPv6 stack, so KMSAN fired
      a warning.
      
      I first fixed sk_ehashfn() to test both source and destination addresses,
      but then faced various problems, including user-space programs
      like packetdrill that had similar assumptions.
      
      Instead of trying to fix the whole ecosystem, it is better
      to admit that we have a dual stack behavior, and that we
      cannot build Linux kernels without the V4 stack anyway.
      
      The dual stack API automatically forces the traffic to be IPv4
      if v4mapped addresses are used at bind() or connect(), so it makes
      no sense to allow IPv6 traffic to use the same v4mapped class.
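
      The address class in question can be recognised from userspace with
      the standard IN6_IS_ADDR_V4MAPPED() macro. The sketch below is not the
      kernel code; should_drop_v6_source() is a hypothetical helper that
      only mirrors the policy described above (drop when the source is
      ::ffff:a.b.c.d), and dropping unparsable input is an extra assumption
      of this sketch.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>

/*
 * Hypothetical policy helper: returns true when an incoming IPv6 packet
 * with this textual source address would be dropped.
 */
static bool should_drop_v6_source(const char *src_text)
{
    struct in6_addr src;

    /* Unparsable input is dropped in this sketch (assumption). */
    if (inet_pton(AF_INET6, src_text, &src) != 1)
        return true;

    /* v4mapped addresses have the form ::ffff:a.b.c.d */
    return IN6_IS_ADDR_V4MAPPED(&src) != 0;
}
```

      For example, "::ffff:192.0.2.1" is classified as v4mapped, while
      "2001:db8::1" is not.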
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • hso: fix NULL-deref on tty open · 78c01443
      Johan Hovold authored
      [ Upstream commit 8353da9f ]
      
      Fix NULL-pointer dereference on tty open due to a failure to handle a
      missing interrupt-in endpoint when probing modem ports:
      
      	BUG: kernel NULL pointer dereference, address: 0000000000000006
      	...
      	RIP: 0010:tiocmget_submit_urb+0x1c/0xe0 [hso]
      	...
      	Call Trace:
      	hso_start_serial_device+0xdc/0x140 [hso]
      	hso_serial_open+0x118/0x1b0 [hso]
      	tty_open+0xf1/0x490
      
      Fixes: 542f5482 ("tty: Modem functions for the HSO driver")
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • erspan: remove the incorrect mtu limit for erspan · 881d4609
      Haishuang Yan authored
      [ Upstream commit 0e141f75 ]
      
      The erspan driver calls ether_setup(). After commit 61e84623
      ("net: centralize net_device min/max MTU checking"), the range
      of mtu is [min_mtu, max_mtu], which is [68, 1500] by default.
      
      As a result, the mtu of the erspan device cannot be set higher
      than 1500; this limit is not correct for an ipgre tap device.
      
      Tested:
      Before patch:
      # ip link set erspan0 mtu 1600
      Error: mtu greater than device maximum.
      After patch:
      # ip link set erspan0 mtu 1600
      # ip -d link show erspan0
      21: erspan0@NONE: <BROADCAST,MULTICAST> mtu 1600 qdisc noop state DOWN
      mode DEFAULT group default qlen 1000
          link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 0
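
      The "maxmtu 0" in the output above suggests that a max_mtu of zero
      means "no upper bound". A minimal sketch of such a range check,
      assuming that convention; struct netdev_sketch and set_mtu() are
      hypothetical names, not the kernel's:

```c
#include <errno.h>

/* Hypothetical, reduced view of a network device's MTU settings. */
struct netdev_sketch {
    unsigned int mtu;
    unsigned int min_mtu;
    unsigned int max_mtu; /* 0 is assumed to mean "no upper limit" */
};

/* Apply new_mtu if it lies in [min_mtu, max_mtu]; 0 on success. */
static int set_mtu(struct netdev_sketch *dev, unsigned int new_mtu)
{
    if (new_mtu < dev->min_mtu)
        return -EINVAL;
    /* The upper bound is only enforced when one is configured. */
    if (dev->max_mtu > 0 && new_mtu > dev->max_mtu)
        return -EINVAL;
    dev->mtu = new_mtu;
    return 0;
}
```

      With the ether_setup() defaults [68, 1500] a request for 1600 is
      rejected; with max_mtu cleared to 0 the same request succeeds.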
      
      Fixes: 61e84623 ("net: centralize net_device min/max MTU checking")
      Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cxgb4:Fix out-of-bounds MSI-X info array access · 382aa3a9
      Vishal Kulkarni authored
      [ Upstream commit 6b517374 ]
      
      When fetching free MSI-X vectors for ULDs, check for the error code
      before accessing MSI-X info array. Otherwise, an out-of-bounds access is
      attempted, which results in kernel panic.
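
      The general pattern can be sketched as follows. This is not the cxgb4
      code: the table, get_free_vec() and claim_vec() are hypothetical names
      used only to show why a negative error code must be checked before it
      is used as an array index.

```c
#include <errno.h>
#include <stddef.h>

#define NUM_VECS 4

/* Hypothetical per-vector bookkeeping entry. */
struct msix_info {
    int in_use;
};

/* Return the index of a free vector, or -ENOSPC when none is left. */
static int get_free_vec(const struct msix_info *tbl)
{
    for (int i = 0; i < NUM_VECS; i++)
        if (!tbl[i].in_use)
            return i;
    return -ENOSPC;
}

/* Claim a vector; the error code is checked before indexing the table. */
static struct msix_info *claim_vec(struct msix_info *tbl)
{
    int idx = get_free_vec(tbl);

    if (idx < 0)
        return NULL; /* tbl[idx] would be an out-of-bounds access */
    tbl[idx].in_use = 1;
    return &tbl[idx];
}
```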
      
      Fixes: 94cdb8bb ("cxgb4: Add support for dynamic allocation of resources for ULD")
      Signed-off-by: Shahjada Abul Husain <shahjada@chelsio.com>
      Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bpf: fix use after free in prog symbol exposure · 47569360
      Daniel Borkmann authored
      commit c751798a upstream.
      
      syzkaller managed to trigger the warning in bpf_jit_free() which checks via
      bpf_prog_kallsyms_verify_off() for potentially unlinked JITed BPF progs
      in kallsyms, and subsequently trips over GPF when walking kallsyms entries:
      
        [...]
        8021q: adding VLAN 0 to HW filter on device batadv0
        8021q: adding VLAN 0 to HW filter on device batadv0
        WARNING: CPU: 0 PID: 9869 at kernel/bpf/core.c:810 bpf_jit_free+0x1e8/0x2a0
        Kernel panic - not syncing: panic_on_warn set ...
        CPU: 0 PID: 9869 Comm: kworker/0:7 Not tainted 5.0.0-rc8+ #1
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
        Workqueue: events bpf_prog_free_deferred
        Call Trace:
         __dump_stack lib/dump_stack.c:77 [inline]
         dump_stack+0x113/0x167 lib/dump_stack.c:113
         panic+0x212/0x40b kernel/panic.c:214
         __warn.cold.8+0x1b/0x38 kernel/panic.c:571
         report_bug+0x1a4/0x200 lib/bug.c:186
         fixup_bug arch/x86/kernel/traps.c:178 [inline]
         do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
         do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290
         invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
        RIP: 0010:bpf_jit_free+0x1e8/0x2a0
        Code: 02 4c 89 e2 83 e2 07 38 d0 7f 08 84 c0 0f 85 86 00 00 00 48 ba 00 02 00 00 00 00 ad de 0f b6 43 02 49 39 d6 0f 84 5f fe ff ff <0f> 0b e9 58 fe ff ff 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1
        RSP: 0018:ffff888092f67cd8 EFLAGS: 00010202
        RAX: 0000000000000007 RBX: ffffc90001947000 RCX: ffffffff816e9d88
        RDX: dead000000000200 RSI: 0000000000000008 RDI: ffff88808769f7f0
        RBP: ffff888092f67d00 R08: fffffbfff1394059 R09: fffffbfff1394058
        R10: fffffbfff1394058 R11: ffffffff89ca02c7 R12: ffffc90001947002
        R13: ffffc90001947020 R14: ffffffff881eca80 R15: ffff88808769f7e8
        BUG: unable to handle kernel paging request at fffffbfff400d000
        #PF error: [normal kernel read fault]
        PGD 21ffee067 P4D 21ffee067 PUD 21ffed067 PMD 9f942067 PTE 0
        Oops: 0000 [#1] PREEMPT SMP KASAN
        CPU: 0 PID: 9869 Comm: kworker/0:7 Not tainted 5.0.0-rc8+ #1
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
        Workqueue: events bpf_prog_free_deferred
        RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:495 [inline]
        RIP: 0010:bpf_tree_comp kernel/bpf/core.c:558 [inline]
        RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
        RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
        RIP: 0010:bpf_prog_kallsyms_find+0x107/0x2e0 kernel/bpf/core.c:632
        Code: 00 f0 ff ff 44 38 c8 7f 08 84 c0 0f 85 fa 00 00 00 41 f6 45 02 01 75 02 0f 0b 48 39 da 0f 82 92 00 00 00 48 89 d8 48 c1 e8 03 <42> 0f b6 04 30 84 c0 74 08 3c 03 0f 8e 45 01 00 00 8b 03 48 c1 e0
        [...]
      
      Upon further debugging, it turns out that whenever we trigger this
      issue, the kallsyms removal in bpf_prog_ksym_node_del() was /skipped/
      but yet bpf_jit_free() reported that the entry is /in use/.
      
      Problem is that symbol exposure via bpf_prog_kallsyms_add() but also
      perf_event_bpf_event() were done /after/ bpf_prog_new_fd(). Once the
      fd is exposed to the public, a parallel close request came in right
      before we attempted to do the bpf_prog_kallsyms_add().
      
      Given at this time the prog reference count is one, we start to rip
      everything underneath us via bpf_prog_release() -> bpf_prog_put().
      The memory is eventually released via deferred free, so we're seeing
      that bpf_jit_free() has a kallsym entry because we added it from
      bpf_prog_load() but /after/ bpf_prog_put() from the remote CPU.
      
      Therefore, move both notifications /before/ we install the fd. The
      issue was never seen between bpf_prog_alloc_id() and bpf_prog_new_fd()
      because upon bpf_prog_get_fd_by_id() we'll take another reference to
      the BPF prog, so we're still holding the original reference from the
      bpf_prog_load().
      
      Fixes: 6ee52e2a ("perf, bpf: Introduce PERF_RECORD_BPF_EVENT")
      Fixes: 74451e66 ("bpf: make jited programs visible in traces")
      Reported-by: <syzbot+bd3bba6ff3fcea7a6ec6@syzkaller.appspotmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Song Liu <songliubraving@fb.com>
      Signed-off-by: Zubin Mithra <zsm@chromium.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • kmemleak: increase DEBUG_KMEMLEAK_EARLY_LOG_SIZE default to 16K · f892d2f0
      Nicolas Boichat authored
      [ Upstream commit b751c52b ]
      
      The current default value (400) is too low on many systems (e.g. some
      ARM64 platforms take up 1000+ entries).
      
      syzbot uses 16000 as its default value, which has proved to be enough
      on beefy configurations, so let's pick that value.
      
      This consumes more RAM on boot (each entry is 160 bytes, so in total
      ~2.5MB of RAM), but the memory would later be freed (early_log is
      __initdata).
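
      The RAM estimate can be checked with a one-line computation. The
      helper name below is hypothetical; the entry size (160 bytes) and the
      entry counts are the ones quoted in this message.

```c
#include <stddef.h>

/* Total size of the early log buffer for a given number of entries. */
static size_t early_log_bytes(size_t entries, size_t entry_size)
{
    return entries * entry_size;
}
```

      16000 entries of 160 bytes come to 2,560,000 bytes (about 2.44 MiB),
      compared with 64,000 bytes for the old default of 400 entries.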
      
      Link: http://lkml.kernel.org/r/20190730154027.101525-1-drinkcat@chromium.org
      Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
      Suggested-by: Dmitry Vyukov <dvyukov@google.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Joe Lawrence <joe.lawrence@redhat.com>
      Cc: Uladzislau Rezki <urezki@gmail.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ocfs2: wait for recovering done after direct unlock request · f886e1af
      Changwei Ge authored
      [ Upstream commit 0a3775e4 ]
      
      There is a scenario causing ocfs2 umount hang when multiple hosts are
      rebooting at the same time.
      
      NODE1                           NODE2               NODE3
      send unlock request to NODE2
                                      dies
                                                          become recovery master
                                                          recover NODE2
      find NODE2 dead
      mark resource RECOVERING
      directly remove lock from grant list
      calculate usage but RECOVERING marked
      **miss the window of purging
      clear RECOVERING
      
      To reproduce this issue, crash a host and then umount ocfs2
      from another node.
      
      To solve this, let the unlock process wait until recovery is done.
      
      Link: http://lkml.kernel.org/r/1550124866-20367-1-git-send-email-gechangwei@live.cn
      Signed-off-by: Changwei Ge <gechangwei@live.cn>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • kbuild: clean compressed initramfs image · b3008938
      Greg Thelen authored
      [ Upstream commit 6279eb3d ]
      
      Since 9e3596b0 ("kbuild: initramfs cleanup, set target from Kconfig")
      "make clean" leaves behind compressed initramfs images.  Example:
      
        $ make defconfig
        $ sed -i 's|CONFIG_INITRAMFS_SOURCE=""|CONFIG_INITRAMFS_SOURCE="/tmp/ir.cpio"|' .config
        $ make olddefconfig
        $ make -s
        $ make -s clean
        $ git clean -ndxf | grep initramfs
        Would remove usr/initramfs_data.cpio.gz
      
      clean rules do not have CONFIG_* context so they do not know which
      compression format was used.  Thus they don't know which files to delete.
      
      Tell clean to delete all possible compression formats.
      
      Once patched, usr/initramfs_data.cpio.gz and friends are deleted by
      "make clean".
      
      Link: http://lkml.kernel.org/r/20190722063251.55541-1-gthelen@google.com
      Fixes: 9e3596b0 ("kbuild: initramfs cleanup, set target from Kconfig")
      Signed-off-by: Greg Thelen <gthelen@google.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • hypfs: Fix error number left in struct pointer member · fca23609
      David Howells authored
      [ Upstream commit b54c64f7 ]
      
      In hypfs_fill_super(), if hypfs_create_update_file() fails,
      sbi->update_file is left holding an error number.  This is passed to
      hypfs_kill_super() which doesn't check for this.
      
      Fix this by not setting sbi->update_file until after we've checked for
      error.
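
      The shape of the fix can be sketched in userspace as below. The
      err_ptr()/is_err()/ptr_err() helpers only imitate the kernel's
      error-pointer convention, and create_update_file() and fill_super()
      are hypothetical stand-ins, not the hypfs functions themselves.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitation of the kernel's error-pointer helpers. */
static void *err_ptr(long err) { return (void *)(intptr_t)err; }
static int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)(-MAX_ERRNO);
}
static long ptr_err(const void *p) { return (long)(intptr_t)p; }

struct sb_info {
    void *update_file; /* must only ever hold NULL or a valid pointer */
};

/* Hypothetical creator that can be told to fail. */
static void *create_update_file(int fail)
{
    static int file_object;
    return fail ? err_ptr(-ENOMEM) : (void *)&file_object;
}

static int fill_super(struct sb_info *sbi, int fail)
{
    /* Keep the result in a local until it is known to be valid. */
    void *file = create_update_file(fail);

    if (is_err(file))
        return (int)ptr_err(file); /* sbi->update_file is untouched */
    sbi->update_file = file;
    return 0;
}
```

      On failure the member keeps its previous value (NULL here), so a later
      cleanup routine never sees an error number disguised as a pointer.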
      
      Fixes: 24bbb1fa ("[PATCH] s390_hypfs filesystem")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      cc: linux-s390@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • pktcdvd: remove warning on attempting to register non-passthrough dev · 1a63ec5f
      Jens Axboe authored
      [ Upstream commit eb09b3cc ]
      
      Anatoly reports that he gets the below warning when booting -git on
      a sparc64 box on debian unstable:
      
      ...
      [   13.352975] aes_sparc64: Using sparc64 aes opcodes optimized AES
      implementation
      [   13.428002] ------------[ cut here ]------------
      [   13.428081] WARNING: CPU: 21 PID: 586 at
      drivers/block/pktcdvd.c:2597 pkt_setup_dev+0x2e4/0x5a0 [pktcdvd]
      [   13.428147] Attempt to register a non-SCSI queue
      [   13.428184] Modules linked in: pktcdvd libdes cdrom aes_sparc64
      n2_rng md5_sparc64 sha512_sparc64 rng_core sha256_sparc64 flash
      sha1_sparc64 ip_tables x_tables ipv6 crc_ccitt nf_defrag_ipv6 autofs4
      ext4 crc16 mbcache jbd2 raid10 raid456 async_raid6_recov async_memcpy
      async_pq async_xor xor async_tx raid6_pq raid1 raid0 multipath linear
      md_mod crc32c_sparc64
      [   13.428452] CPU: 21 PID: 586 Comm: pktsetup Not tainted
      5.3.0-10169-g574cc4539762 #1234
      [   13.428507] Call Trace:
      [   13.428542]  [00000000004635c0] __warn+0xc0/0x100
      [   13.428582]  [0000000000463634] warn_slowpath_fmt+0x34/0x60
      [   13.428626]  [000000001045b244] pkt_setup_dev+0x2e4/0x5a0 [pktcdvd]
      [   13.428674]  [000000001045ccf4] pkt_ctl_ioctl+0x94/0x220 [pktcdvd]
      [   13.428724]  [00000000006b95c8] do_vfs_ioctl+0x628/0x6e0
      [   13.428764]  [00000000006b96c8] ksys_ioctl+0x48/0x80
      [   13.428803]  [00000000006b9714] sys_ioctl+0x14/0x40
      [   13.428847]  [0000000000406294] linux_sparc_syscall+0x34/0x44
      [   13.428890] irq event stamp: 4181
      [   13.428924] hardirqs last  enabled at (4189): [<00000000004e0a74>]
      console_unlock+0x634/0x6c0
      [   13.428984] hardirqs last disabled at (4196): [<00000000004e0540>]
      console_unlock+0x100/0x6c0
      [   13.429048] softirqs last  enabled at (3978): [<0000000000b2e2d8>]
      __do_softirq+0x498/0x520
      [   13.429110] softirqs last disabled at (3967): [<000000000042cfb4>]
      do_softirq_own_stack+0x34/0x60
      [   13.429172] ---[ end trace 2220ca468f32967d ]---
      [   13.430018] pktcdvd: setup of pktcdvd device failed
      [   13.455589] des_sparc64: Using sparc64 des opcodes optimized DES
      implementation
      [   13.515334] camellia_sparc64: Using sparc64 camellia opcodes
      optimized CAMELLIA implementation
      [   13.522856] pktcdvd: setup of pktcdvd device failed
      [   13.529327] pktcdvd: setup of pktcdvd device failed
      [   13.532932] pktcdvd: setup of pktcdvd device failed
      [   13.536165] pktcdvd: setup of pktcdvd device failed
      [   13.539372] pktcdvd: setup of pktcdvd device failed
      [   13.542834] pktcdvd: setup of pktcdvd device failed
      [   13.546536] pktcdvd: setup of pktcdvd device failed
      [   15.431071] XFS (dm-0): Mounting V5 Filesystem
      ...
      
      Apparently debian auto-attaches any cdrom-like device to pktcdvd, which
      can lead to the above warning. There's really no reason to warn in this
      situation, so kill the warning.
      
      Reported-by: Anatoly Pugachev <matorola@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • fat: work around race with userspace's read via blockdev while mounting · 3554b8a1
      OGAWA Hirofumi authored
      [ Upstream commit 07bfa441 ]
      
      If userspace reads the buffer via blockdev while mounting,
      sb_getblk()+modify can race with buffer read via blockdev.
      
      For example,
      
                  FS                               userspace
          bh = sb_getblk()
          modify bh->b_data
                                        read
      				    ll_rw_block(bh)
      				      fill bh->b_data by on-disk data
      				      /* lost modified data by FS */
      				      set_buffer_uptodate(bh)
          set_buffer_uptodate(bh)
      
      Userspace should not use the blockdev while mounting, but udev seems
      to be doing this already.  Although I think udev should try to avoid
      this, work around the race at the cost of a small overhead.
      
      Link: http://lkml.kernel.org/r/87pnk7l3sw.fsf_-_@mail.parknet.co.jp
      Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Reported-by: Jan Stancek <jstancek@redhat.com>
      Tested-by: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ARM: 8903/1: ensure that usable memory in bank 0 starts from a PMD-aligned address · 735ad033
      Mike Rapoport authored
      [ Upstream commit 00d2ec1e ]
      
      The calculation of memblock_limit in adjust_lowmem_bounds() assumes that
      bank 0 starts from a PMD-aligned address. However, the beginning of the
      first bank may be NOMAP memory, in which case the start of usable memory
      will not be aligned to a PMD boundary. The memblock_limit will then
      be set to the end of the NOMAP region, which will prevent any memblock
      allocations.
      
      Mark the region between the end of the NOMAP area and the next PMD-aligned
      address as NOMAP as well, so that the usable memory will start at a
      PMD-aligned address.
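
      The alignment arithmetic can be sketched as below. The 2 MiB PMD size
      is an assumption made for this example, and both helper names are
      hypothetical; the real logic lives in adjust_lowmem_bounds().

```c
#include <stdint.h>

/* Assumed PMD size for this sketch: 2 MiB. */
#define PMD_SIZE_SKETCH (UINT64_C(2) << 20)

/* Round addr up to the next PMD boundary (no-op if already aligned). */
static uint64_t round_up_pmd(uint64_t addr)
{
    return (addr + PMD_SIZE_SKETCH - 1) & ~(PMD_SIZE_SKETCH - 1);
}

/* Bytes after the NOMAP area that also need to be marked NOMAP. */
static uint64_t extra_nomap_len(uint64_t nomap_end)
{
    return round_up_pmd(nomap_end) - nomap_end;
}
```

      For a NOMAP area ending at 0x80100000, usable memory would start at
      0x80200000 and the 1 MiB in between would be marked NOMAP as well.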
      
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • security: smack: Fix possible null-pointer dereferences in smack_socket_sock_rcv_skb() · 4b1e27b3
      Jia-Ju Bai authored
      [ Upstream commit 3f4287e7 ]
      
      In smack_socket_sock_rcv_skb(), there is an if statement
      on line 3920 to check whether skb is NULL:
          if (skb && skb->secmark != 0)
      
      This check indicates skb can be NULL in some cases.
      
      But on lines 3931 and 3932, skb is used:
          ad.a.u.net->netif = skb->skb_iif;
          ipv6_skb_to_auditdata(skb, &ad.a, NULL);
      
      Thus, possible null-pointer dereferences may occur when skb is NULL.
      
      To fix these possible bugs, an if statement is added to check skb.
      
      These bugs are found by a static analysis tool STCheck written by us.
      
      Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
      Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • PCI: exynos: Propagate errors for optional PHYs · 6001d7e3
      Thierry Reding authored
      [ Upstream commit ddd69600 ]
      
      devm_of_phy_get() can fail for a number of reasons besides probe
      deferral. It can for example return -ENOMEM if it runs out of memory as
      it tries to allocate devres structures. Propagating only -EPROBE_DEFER
      is problematic because it results in these legitimately fatal errors
      being treated as "PHY not specified in DT".
      
      What we really want is to ignore the optional PHYs only if they have not
      been specified in DT. devm_of_phy_get() returns -ENODEV in this case, so
      that's the special case that we need to handle. So we propagate all
      errors, except -ENODEV, so that real failures will still cause the
      driver to fail probe.
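
      The error-handling rule described above ("-ENODEV means absent,
      anything else is fatal") can be sketched as follows.
      resolve_optional() is a hypothetical helper, and EPROBE_DEFER_SKETCH
      only stands in for the kernel-internal EPROBE_DEFER value, which
      userspace headers do not define.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the kernel-internal EPROBE_DEFER code (assumption). */
#define EPROBE_DEFER_SKETCH 517

/*
 * lookup_err is the result of looking up an optional resource:
 * 0 on success, a negative error code otherwise.
 * Returns 0 when probing may continue, or the error to propagate.
 */
static int resolve_optional(int lookup_err, bool *present)
{
    *present = false;

    if (lookup_err == -ENODEV)
        return 0;          /* not specified in DT: carry on without it */
    if (lookup_err < 0)
        return lookup_err; /* real failure: fail the probe */

    *present = true;
    return 0;
}
```

      Under this rule -ENOMEM and the deferral code are both propagated,
      and only -ENODEV is treated as "not specified".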
      
      Signed-off-by: Thierry Reding <treding@nvidia.com>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: Andrew Murray <andrew.murray@arm.com>
      Cc: Jingoo Han <jingoohan1@gmail.com>
      Cc: Kukjin Kim <kgene@kernel.org>
      Cc: Krzysztof Kozlowski <krzk@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • PCI: imx6: Propagate errors for optional regulators · 67681e47
      Thierry Reding authored
      [ Upstream commit 2170a09f ]
      
      regulator_get_optional() can fail for a number of reasons besides probe
      deferral. It can for example return -ENOMEM if it runs out of memory as
      it tries to allocate data structures. Propagating only -EPROBE_DEFER is
      problematic because it results in these legitimately fatal errors being
      treated as "regulator not specified in DT".
      
      What we really want is to ignore the optional regulators only if they
      have not been specified in DT. regulator_get_optional() returns -ENODEV
      in this case, so that's the special case that we need to handle. So we
      propagate all errors, except -ENODEV, so that real failures will still
      cause the driver to fail probe.
      
      Signed-off-by: Thierry Reding <treding@nvidia.com>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: Andrew Murray <andrew.murray@arm.com>
      Cc: Richard Zhu <hongxing.zhu@nxp.com>
      Cc: Lucas Stach <l.stach@pengutronix.de>
      Cc: Shawn Guo <shawnguo@kernel.org>
      Cc: Sascha Hauer <s.hauer@pengutronix.de>
      Cc: Fabio Estevam <festevam@gmail.com>
      Cc: kernel@pengutronix.de
      Cc: linux-imx@nxp.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • PCI: rockchip: Propagate errors for optional regulators · d0259726
      Thierry Reding authored
      [ Upstream commit 0e3ff0ac ]
      
      regulator_get_optional() can fail for a number of reasons besides probe
      deferral. It can for example return -ENOMEM if it runs out of memory as
      it tries to allocate data structures. Propagating only -EPROBE_DEFER is
      problematic because it results in these legitimately fatal errors being
      treated as "regulator not specified in DT".
      
      What we really want is to ignore the optional regulators only if they
      have not been specified in DT. regulator_get_optional() returns -ENODEV
      in this case, so that's the special case that we need to handle. So we
      propagate all errors, except -ENODEV, so that real failures will still
      cause the driver to fail probe.
      
      Tested-by: Heiko Stuebner <heiko@sntech.de>
      Signed-off-by: Thierry Reding <treding@nvidia.com>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: Andrew Murray <andrew.murray@arm.com>
      Reviewed-by: Heiko Stuebner <heiko@sntech.de>
      Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
      Cc: Shawn Lin <shawn.lin@rock-chips.com>
      Cc: Heiko Stuebner <heiko@sntech.de>
      Cc: linux-rockchip@lists.infradead.org
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • HID: apple: Fix stuck function keys when using FN · d13fa754
      Joao Moreno authored
      [ Upstream commit aec256d0 ]
      
      This fixes an issue in which key down events for function keys would be
      repeatedly emitted even after the user has raised the physical key. For
      example, the driver fails to emit the F5 key up event when going through
      the following steps:
      - fnmode=1: hold FN, hold F5, release FN, release F5
      - fnmode=2: hold F5, hold FN, release F5, release FN
      
      The repeated F5 key down events can be easily verified using xev.
      
      Signed-off-by: Joao Moreno <mail@joaomoreno.com>
      Co-developed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • rtc: snvs: fix possible race condition · dc93018d
      Anson Huang authored
      [ Upstream commit 6fd4fe9b ]
      
      The RTC IRQ is requested before the struct rtc_device is allocated,
      this may lead to a NULL pointer dereference in IRQ handler.
      
      To fix this issue, allocate the rtc_device struct with
      devm_rtc_allocate_device before requesting the RTC IRQ, and use
      rtc_register_device to register the RTC device.
      
      Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
      Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
      Link: https://lore.kernel.org/r/20190716071858.36750-1-Anson.Huang@nxp.com
      Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ARM: 8898/1: mm: Don't treat faults reported from cache maintenance as writes · 1edc3a5f
      Will Deacon authored
      [ Upstream commit 83402036 ]
      
      Translation faults arising from cache maintenance instructions are
      rather unhelpfully reported with an FSR value where the WnR field is set
      to 1, indicating that the faulting access was a write. Since cache
      maintenance instructions on 32-bit ARM do not require any particular
      permissions, this can cause our private 'cacheflush' system call to fail
      spuriously if a translation fault is generated due to page aging when
      targeting a read-only VMA.
      
      In this situation, we will return -EFAULT to userspace, although this is
      unfortunately suppressed by the popular '__builtin___clear_cache()'
      intrinsic provided by GCC, which returns void.
      
      Although it's tempting to write this off as a userspace issue, we can
      actually do a little bit better on CPUs that support LPAE, even if the
      short-descriptor format is in use. On these CPUs, cache maintenance
      faults additionally set the CM field in the FSR, which we can use to
      suppress the write permission checks in the page fault handler and
      succeed in performing cache maintenance to read-only areas even in the
      presence of a translation fault.
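
      In other words, a fault is treated as a write only when WnR is set and
      CM is clear. A minimal sketch of that predicate follows; the bit
      positions (WnR at bit 11, CM at bit 13) are assumptions of this
      example rather than values quoted from this message, and
      fault_is_write() is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed FSR bit positions for this sketch. */
#define FSR_WRITE (UINT32_C(1) << 11) /* WnR: access was reported as a write */
#define FSR_CM    (UINT32_C(1) << 13) /* CM: fault came from cache maintenance */

/* Apply write-permission checks only to genuine write faults. */
static bool fault_is_write(uint32_t fsr)
{
    return (fsr & FSR_WRITE) && !(fsr & FSR_CM);
}
```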
      
      Reported-by: Orion Hodson <oth@google.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • Miroslav Benes's avatar
      livepatch: Nullify obj->mod in klp_module_coming()'s error path · 95364cf7
      Miroslav Benes authored
      [ Upstream commit 4ff96fb5 ]
      
      klp_module_coming() is called for every module appearing in the system.
      It sets obj->mod to a patched module for klp_object obj. Unfortunately
      it leaves it set even if an error happens later in the function and the
      patched module is not allowed to be loaded.
      
      klp_is_object_loaded() uses obj->mod variable and could currently give a
      wrong return value. The bug is probably harmless as of now.
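
      The pattern can be sketched in userspace as follows. The types are
      simplified stand-ins, and object_is_loaded() and module_coming()
      are hypothetical helpers, not the kernel functions themselves:

```c
#include <stddef.h>

struct module { const char *name; };       /* stand-in for the kernel type */
struct klp_object { struct module *mod; }; /* only the field discussed above */

/* Simplified version of a "loaded" check that relies on obj->mod. */
static int object_is_loaded(const struct klp_object *obj)
{
	return obj->mod != NULL;
}

/*
 * Hypothetical "module coming" handler. patch_result stands in for any
 * later step that can fail; non-zero means the module will not be loaded.
 */
static int module_coming(struct klp_object *obj, struct module *mod,
			 int patch_result)
{
	obj->mod = mod;
	if (patch_result != 0) {
		obj->mod = NULL; /* undo the assignment on the error path */
		return patch_result;
	}
	return 0;
}
```

      Without the reset on the error path, object_is_loaded() would keep
      reporting a module that was never allowed to load.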
      
      Signed-off-by: Miroslav Benes <mbenes@suse.cz>
      Reviewed-by: Petr Mladek <pmladek@suse.com>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      95364cf7
    • Nishka Dasgupta's avatar
      PCI: tegra: Fix OF node reference leak · 2d91f917
      Nishka Dasgupta authored
      [ Upstream commit 9e38e690 ]
      
      Each iteration of for_each_child_of_node() executes of_node_put() on the
      previous node, but in some return paths in the middle of the loop
      of_node_put() is missing thus causing a reference leak.
      
      Hence stash these mid-loop return values in a variable 'err' and add a
      new label err_node_put which executes of_node_put() on the previous node
      and returns 'err' on failure.
      
      Change the mid-loop return statements to jump to this label to
      fix the reference leak.
      
      Issue found with Coccinelle.
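
      A userspace sketch of the err_node_put pattern, assuming a toy
      reference counter in place of the device tree API. struct node,
      node_get(), node_put() and parse_children() are hypothetical names:

```c
#include <stddef.h>

struct node { int refcount; int valid; };

static void node_get(struct node *n) { n->refcount++; }
static void node_put(struct node *n) { n->refcount--; }

/*
 * Walk the children the way the iterator is described above: each step
 * takes a reference on the current child and drops the one held on the
 * previous child. Leaving the loop early therefore leaves one reference
 * held, which the err_node_put label releases.
 */
static int parse_children(struct node *children, size_t count)
{
	struct node *np = NULL;
	int err = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		if (np)
			node_put(np);
		np = &children[i];
		node_get(np);

		if (!np->valid) {
			err = -22; /* stands in for -EINVAL */
			goto err_node_put;
		}
	}
	if (np)
		node_put(np); /* the iterator drops the last reference itself */
	return 0;

err_node_put:
	node_put(np);
	return err;
}
```

      Every path out of the function leaves all reference counts
      balanced, which is the property the fix restores.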
      
      Signed-off-by: Nishka Dasgupta <nishkadg.linux@gmail.com>
      [lorenzo.pieralisi@arm.com: rewrote commit log]
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      2d91f917
    • Kai-Heng Feng's avatar
      mfd: intel-lpss: Remove D3cold delay · 54e57bee
      Kai-Heng Feng authored
      [ Upstream commit 76380a60 ]
      
      A Goodix touchpad may drop its first few input events when the
      i2c-designware-platdrv and intel-lpss devices it is connected to take
      too long to runtime resume from the runtime suspended state.
      
      This issue happens because the touchpad has a rather small buffer that
      stores up to 13 input events, so if the host doesn't read those events in
      time (i.e. runtime resume takes too long), events are dropped from the
      touchpad's buffer.
      
      The bottleneck is the D3cold delay incurred when transitioning from
      D3cold to D0, so remove the delay to make the resume faster. I've tested
      some systems with intel-lpss and haven't seen any regression.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202683
      Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      54e57bee
    • Hans de Goede's avatar
      i2c-cht-wc: Fix lockdep warning · 95072b17
      Hans de Goede authored
      [ Upstream commit 232219b9 ]
      
      When the kernel is built with lockdep support and the i2c-cht-wc driver is
      used, the following warning is shown:
      
      [   66.674334] ======================================================
      [   66.674337] WARNING: possible circular locking dependency detected
      [   66.674340] 5.3.0-rc4+ #83 Not tainted
      [   66.674342] ------------------------------------------------------
      [   66.674345] systemd-udevd/1232 is trying to acquire lock:
      [   66.674349] 00000000a74dab07 (intel_soc_pmic_chtwc:167:(&cht_wc_regmap_cfg)->lock){+.+.}, at: regmap_write+0x31/0x70
      [   66.674360]
                     but task is already holding lock:
      [   66.674362] 00000000d44a85b7 (i2c_register_adapter){+.+.}, at: i2c_smbus_xfer+0x49/0xf0
      [   66.674370]
                     which lock already depends on the new lock.
      
      [   66.674371]
                     the existing dependency chain (in reverse order) is:
      [   66.674374]
                     -> #1 (i2c_register_adapter){+.+.}:
      [   66.674381]        rt_mutex_lock_nested+0x46/0x60
      [   66.674384]        i2c_smbus_xfer+0x49/0xf0
      [   66.674387]        i2c_smbus_read_byte_data+0x45/0x70
      [   66.674391]        cht_wc_byte_reg_read+0x35/0x50
      [   66.674394]        _regmap_read+0x63/0x1a0
      [   66.674396]        _regmap_update_bits+0xa8/0xe0
      [   66.674399]        regmap_update_bits_base+0x63/0xa0
      [   66.674403]        regmap_irq_update_bits.isra.0+0x3b/0x50
      [   66.674406]        regmap_add_irq_chip+0x592/0x7a0
      [   66.674409]        devm_regmap_add_irq_chip+0x89/0xed
      [   66.674412]        cht_wc_probe+0x102/0x158
      [   66.674415]        i2c_device_probe+0x95/0x250
      [   66.674419]        really_probe+0xf3/0x380
      [   66.674422]        driver_probe_device+0x59/0xd0
      [   66.674425]        device_driver_attach+0x53/0x60
      [   66.674428]        __driver_attach+0x92/0x150
      [   66.674431]        bus_for_each_dev+0x7d/0xc0
      [   66.674434]        bus_add_driver+0x14d/0x1f0
      [   66.674437]        driver_register+0x6d/0xb0
      [   66.674440]        i2c_register_driver+0x45/0x80
      [   66.674445]        do_one_initcall+0x60/0x2f4
      [   66.674450]        kernel_init_freeable+0x20d/0x2b4
      [   66.674453]        kernel_init+0xa/0x10c
      [   66.674457]        ret_from_fork+0x3a/0x50
      [   66.674459]
                     -> #0 (intel_soc_pmic_chtwc:167:(&cht_wc_regmap_cfg)->lock){+.+.}:
      [   66.674465]        __lock_acquire+0xe07/0x1930
      [   66.674468]        lock_acquire+0x9d/0x1a0
      [   66.674472]        __mutex_lock+0xa8/0x9a0
      [   66.674474]        regmap_write+0x31/0x70
      [   66.674480]        cht_wc_i2c_adap_smbus_xfer+0x72/0x240 [i2c_cht_wc]
      [   66.674483]        __i2c_smbus_xfer+0x1a3/0x640
      [   66.674486]        i2c_smbus_xfer+0x67/0xf0
      [   66.674489]        i2c_smbus_read_byte_data+0x45/0x70
      [   66.674494]        bq24190_probe+0x26b/0x410 [bq24190_charger]
      [   66.674497]        i2c_device_probe+0x189/0x250
      [   66.674500]        really_probe+0xf3/0x380
      [   66.674503]        driver_probe_device+0x59/0xd0
      [   66.674506]        device_driver_attach+0x53/0x60
      [   66.674509]        __driver_attach+0x92/0x150
      [   66.674512]        bus_for_each_dev+0x7d/0xc0
      [   66.674515]        bus_add_driver+0x14d/0x1f0
      [   66.674518]        driver_register+0x6d/0xb0
      [   66.674521]        i2c_register_driver+0x45/0x80
      [   66.674524]        do_one_initcall+0x60/0x2f4
      [   66.674528]        do_init_module+0x5c/0x230
      [   66.674531]        load_module+0x2707/0x2a20
      [   66.674534]        __do_sys_init_module+0x188/0x1b0
      [   66.674537]        do_syscall_64+0x5c/0xb0
      [   66.674541]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [   66.674543]
                     other info that might help us debug this:
      
      [   66.674545]  Possible unsafe locking scenario:
      
      [   66.674547]        CPU0                    CPU1
      [   66.674548]        ----                    ----
      [   66.674550]   lock(i2c_register_adapter);
      [   66.674553]                                lock(intel_soc_pmic_chtwc:167:(&cht_wc_regmap_cfg)->lock);
      [   66.674556]                                lock(i2c_register_adapter);
      [   66.674559]   lock(intel_soc_pmic_chtwc:167:(&cht_wc_regmap_cfg)->lock);
      [   66.674561]
                      *** DEADLOCK ***
      
      The problem is that the CHT Whiskey Cove PMIC's builtin i2c-adapter is
      itself a part of an i2c-client (the PMIC). This means that transfers done
      through it take adapter->bus_lock twice, once for the parent i2c-adapter
      and once for its own bus_lock. Lockdep does not like this nested locking.
      
      To make lockdep happy in the case of busses with muxes, the i2c-core's
      i2c_adapter_lock_bus function calls:
      
       rt_mutex_lock_nested(&adapter->bus_lock, i2c_adapter_depth(adapter));
      
      But i2c_adapter_depth only works when the direct parent of the adapter is
      another adapter, as it is only meant for muxes. In this case there is an
      i2c-client and MFD instantiated platform_device in the parent->child chain
      between the 2 devices.
      
      This commit overrides the default i2c_lock_operations, passing a hardcoded
      depth of 1 to rt_mutex_lock_nested, making lockdep happy.
      
      Note that if there were to be a mux attached to the i2c-cht-wc adapter,
      this would break things again since the i2c-mux code expects the
      root-adapter to have a locking depth of 0. But the i2c-cht-wc adapter
      always has only 1 client directly attached in the form of the charger IC
      paired with the CHT Whiskey Cove PMIC.
      
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      95072b17
    • Nathan Chancellor's avatar
      MIPS: tlbex: Explicitly cast _PAGE_NO_EXEC to a boolean · 9875f047
      Nathan Chancellor authored
      [ Upstream commit c59ae0a1 ]
      
      clang warns:
      
      arch/mips/mm/tlbex.c:634:19: error: use of logical '&&' with constant
      operand [-Werror,-Wconstant-logical-operand]
              if (cpu_has_rixi && _PAGE_NO_EXEC) {
                               ^  ~~~~~~~~~~~~~
      arch/mips/mm/tlbex.c:634:19: note: use '&' for a bitwise operation
              if (cpu_has_rixi && _PAGE_NO_EXEC) {
                               ^~
                               &
      arch/mips/mm/tlbex.c:634:19: note: remove constant to silence this
      warning
              if (cpu_has_rixi && _PAGE_NO_EXEC) {
                              ~^~~~~~~~~~~~~~~~
      1 error generated.
      
      Explicitly cast this value to a boolean so that clang understands we
      intend for this to be a non-zero value.
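
      One way to express such a conversion is a double negation, sketched
      below. PAGE_NO_EXEC_DEMO and cpu_has_rixi_demo are hypothetical
      stand-ins, since the real values depend on the kernel configuration:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the configuration-dependent kernel symbols. */
#define PAGE_NO_EXEC_DEMO (1UL << 3)
static bool cpu_has_rixi_demo = true;

static bool uses_no_exec_bit(void)
{
	/*
	 * !! collapses any non-zero constant to 1, which states the
	 * boolean intent explicitly instead of leaving a bit mask as an
	 * operand of &&.
	 */
	return cpu_has_rixi_demo && !!PAGE_NO_EXEC_DEMO;
}
```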
      
      Fixes: 00bf1c69 ("MIPS: tlbex: Avoid placing software PTE bits in Entry* PFN fields")
      Link: https://github.com/ClangBuiltLinux/linux/issues/609
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: linux-mips@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: clang-built-linux@googlegroups.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      9875f047
    • Chris Wilson's avatar
      dma-buf/sw_sync: Synchronize signal vs syncpt free · 8ccf3623
      Chris Wilson authored
      [ Upstream commit d3c6dd1f ]
      
      During release of the syncpt, we remove it from the list of syncpt and
      the tree, but only if it has not already been removed. However, during
      signaling, we first remove the syncpt from the list. So, if we
      concurrently free and signal the syncpt, the free may decide that it is
      not part of the tree and immediately free itself -- meanwhile the
      signaler goes on to use the now freed data structure.
      
      In particular, we get struck by commit 0e2f733a ("dma-buf: make
      dma_fence structure a bit smaller v2") as the cb_list is immediately
      clobbered by the kfree_rcu.
      
      v2: Avoid calling into timeline_fence_release() from under the spinlock
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111381
      Fixes: d3862e44 ("dma-buf/sw-sync: Fix locking around sync_timeline lists")
      References: 0e2f733a ("dma-buf: make dma_fence structure a bit smaller v2")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Cc: Sean Paul <seanpaul@chromium.org>
      Cc: Gustavo Padovan <gustavo@padovan.org>
      Cc: Christian König <christian.koenig@amd.com>
      Cc: <stable@vger.kernel.org> # v4.14+
      Acked-by: Christian König <christian.koenig@amd.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190812154247.20508-1-chris@chris-wilson.co.uk
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8ccf3623
    • Bart Van Assche's avatar
      scsi: core: Reduce memory required for SCSI logging · 86906603
      Bart Van Assche authored
      [ Upstream commit dccc96ab ]
      
      The data structure used for log messages is so large that it can cause a
      boot failure. Since allocations from that data structure can fail anyway,
      use kmalloc() / kfree() instead of that data structure.
      
      See also https://bugzilla.kernel.org/show_bug.cgi?id=204119.
      See also commit ded85c19 ("scsi: Implement per-cpu logging buffer") # v4.0.
      
      Reported-by: Jan Palus <jpalus@fastmail.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Johannes Thumshirn <jthumshirn@suse.de>
      Cc: Ming Lei <ming.lei@redhat.com>
      Cc: Jan Palus <jpalus@fastmail.com>
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      86906603
    • Eugen Hristev's avatar
      clk: at91: select parent if main oscillator or bypass is enabled · 6682e6f6
      Eugen Hristev authored
      [ Upstream commit 69a6bcde ]
      
      Selecting the right parent for the main clock is done using only
      the main oscillator enabled bit.
      If this oscillator is bypassed by an external signal (no driving
      on the XOUT line), we still use the external clock, but with the BYPASS
      bit set. So, in this case we must select the same parent as before.
      Create a macro that will select the right parent considering both bits from
      the MOR register.
      Use this macro when looking for the right parent.
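
      A sketch of what such a macro could look like. The macro names, the
      bit positions and the meaning of the parent indices are assumptions
      made for illustration, not taken from the driver:

```c
/* Bit positions in the MOR register value, assumed for this sketch. */
#define MOR_MOSCEN    (1u << 0) /* main oscillator enabled */
#define MOR_OSCBYPASS (1u << 1) /* oscillator bypassed by an external signal */

/*
 * Parent index 1 is taken to mean the external clock path and 0 the
 * other parent. Either bit being set selects the external path.
 */
#define main_parent_select(mor) \
	(((mor) & (MOR_MOSCEN | MOR_OSCBYPASS)) ? 1 : 0)
```

      Testing both bits in one mask keeps the bypass case and the
      oscillator-enabled case on the same parent.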
      
      Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com>
      Link: https://lkml.kernel.org/r/1568042692-11784-2-git-send-email-eugen.hristev@microchip.com
      Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
      Signed-off-by: Stephen Boyd <sboyd@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      6682e6f6
    • Arnd Bergmann's avatar
      arm64: fix unreachable code issue with cmpxchg · 8b524b10
      Arnd Bergmann authored
      [ Upstream commit 920fdab7 ]
      
      On arm64 builds with clang, __cmpxchg_mb is sometimes not inlined
      when CONFIG_OPTIMIZE_INLINING is set.
      Clang then fails a compile-time assertion, because it cannot tell at
      compile time what the size of the argument is:
      
      mm/memcontrol.o: In function `__cmpxchg_mb':
      memcontrol.c:(.text+0x1a4c): undefined reference to `__compiletime_assert_175'
      memcontrol.c:(.text+0x1a4c): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `__compiletime_assert_175'
      
      Mark all of the cmpxchg() style functions as __always_inline to
      ensure that the compiler can see the result.
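
      A userspace sketch of a size-dispatching helper forced inline. It
      uses the GCC/Clang __atomic builtins rather than the arm64
      assembly, and ALWAYS_INLINE and cmpxchg_demo() are hypothetical
      names:

```c
#include <stdint.h>

/* GCC/Clang attribute, spelled out here under a local name. */
#define ALWAYS_INLINE inline __attribute__((__always_inline__))

/*
 * Compare-and-exchange dispatched on operand size. Once inlined, `size`
 * is a constant at each call site, so the compiler can drop the cases
 * that cannot be reached. Returns the value found at ptr beforehand.
 */
static ALWAYS_INLINE uint64_t cmpxchg_demo(volatile void *ptr, uint64_t old,
					   uint64_t new_val, int size)
{
	switch (size) {
	case 4: {
		uint32_t expected = (uint32_t)old;

		__atomic_compare_exchange_n((volatile uint32_t *)ptr, &expected,
					    (uint32_t)new_val, 0,
					    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
		return expected;
	}
	case 8: {
		uint64_t expected = old;

		__atomic_compare_exchange_n((volatile uint64_t *)ptr, &expected,
					    new_val, 0,
					    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
		return expected;
	}
	default:
		return 0; /* the kernel places a build-time assertion here */
	}
}
```

      If the helper were left out of line, `size` would be an ordinary
      runtime argument and the default case could no longer be proven
      unreachable, which is the situation the linker error above shows.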
      
      Acked-by: Nick Desaulniers <ndesaulniers@google.com>
      Reported-by: Nathan Chancellor <natechancellor@gmail.com>
      Link: https://github.com/ClangBuiltLinux/linux/issues/648
      Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
      Tested-by: Nathan Chancellor <natechancellor@gmail.com>
      Reviewed-by: Andrew Murray <andrew.murray@arm.com>
      Tested-by: Andrew Murray <andrew.murray@arm.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8b524b10
    • Nathan Lynch's avatar
      powerpc/pseries: correctly track irq state in default idle · 71197bca
      Nathan Lynch authored
      [ Upstream commit 92c94dfb ]
      
      prep_irq_for_idle() is intended to be called before entering
      H_CEDE (and it is used by the pseries cpuidle driver). However the
      default pseries idle routine does not call it, leading to mismanaged
      lazy irq state when the cpuidle driver isn't in use. Manifestations of
      this include:
      
      * Dropped IPIs in the time immediately after a cpu comes
        online (before it has installed the cpuidle handler), making the
        online operation block indefinitely waiting for the new cpu to
        respond.
      
      * Hitting this WARN_ON in arch_local_irq_restore():
      	/*
      	 * We should already be hard disabled here. We had bugs
      	 * where that wasn't the case so let's dbl check it and
      	 * warn if we are wrong. Only do that when IRQ tracing
      	 * is enabled as mfmsr() can be costly.
      	 */
      	if (WARN_ON_ONCE(mfmsr() & MSR_EE))
      		__hard_irq_disable();
      
      Call prep_irq_for_idle() from pseries_lpar_idle() and honor its
      result.
      
      Fixes: 363edbe2 ("powerpc: Default arch idle could cede processor on pseries")
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20190910225244.25056-1-nathanl@linux.ibm.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      71197bca
    • Nicholas Piggin's avatar
      powerpc/64s/exception: machine check use correct cfar for late handler · 7c3e8297
      Nicholas Piggin authored
      [ Upstream commit 0b66370c ]
      
      Bare metal machine checks run an "early" handler in real mode before
      running the main handler which reports the event.
      
      The main handler runs exactly as a normal interrupt handler, after the
      "windup" which sets registers back as they were at interrupt entry.
      CFAR does not get restored by the windup code, so that will be wrong
      when the handler is run.
      
      Restore the CFAR to the saved value before running the late handler.
      
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20190802105709.27696-8-npiggin@gmail.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7c3e8297
    • Jean Delvare's avatar
      drm/amdgpu/si: fix ASIC tests · 02c333e2
      Jean Delvare authored
      [ Upstream commit 77efe48a ]
      
      Comparing adev->family with CHIP constants is not correct.
      adev->family can only be compared with AMDGPU_FAMILY constants and
      adev->asic_type is the struct member to compare with CHIP constants.
      They are separate identification spaces.
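
      The distinction can be sketched with two unrelated enums. All
      names and values below are hypothetical and only mirror the shape
      of the family and asic_type members:

```c
#include <stdbool.h>

/* Two separate identification spaces with hypothetical values. */
enum demo_family { DEMO_FAMILY_SI = 110 };
enum demo_chip { DEMO_CHIP_TAHITI = 0, DEMO_CHIP_OLAND = 3, DEMO_CHIP_HAINAN = 4 };

struct demo_device {
	enum demo_family family;  /* compare only with DEMO_FAMILY_* */
	enum demo_chip asic_type; /* compare only with DEMO_CHIP_* */
};

static bool is_hainan(const struct demo_device *adev)
{
	/* Chip constants are matched against asic_type, never family. */
	return adev->asic_type == DEMO_CHIP_HAINAN;
}
```

      Comparing family against a chip constant would compile in C, but
      the result would depend only on how the two enums happen to be
      numbered.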
      
      Signed-off-by: Jean Delvare <jdelvare@suse.de>
      Fixes: 62a37553 ("drm/amdgpu: add si implementation v10")
      Cc: Ken Wang <Qingqing.Wang@amd.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: "Christian König" <christian.koenig@amd.com>
      Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      02c333e2
    • Mark Menzynski's avatar
      drm/nouveau/volt: Fix for some cards having 0 maximum voltage · 2e137f0e
      Mark Menzynski authored
      [ Upstream commit a1af2afb ]
      
      Some vbioses, mostly Fermi, appear to have a zero max voltage. That
      causes Nouveau to not parse voltage entries, so users are not able to
      set higher clocks.
      
      When this value was changed, the Nvidia driver still appeared to ignore
      it, and I wasn't able to find out why, so the code ignores the value if
      it is zero.
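
      A hedged sketch of treating a zero maximum as "no limit reported".
      The helper voltage_entry_allowed() and its microvolt parameters are
      hypothetical and do not reflect the actual Nouveau code:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Accept a voltage entry unless a real (non-zero) maximum rules it out.
 * A maximum of zero is ignored instead of rejecting every entry.
 */
static bool voltage_entry_allowed(uint32_t entry_uv, uint32_t max_uv)
{
	if (max_uv == 0)
		return true;
	return entry_uv <= max_uv;
}
```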
      
      CC: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Signed-off-by: Mark Menzynski <mmenzyns@redhat.com>
      Reviewed-by: Karol Herbst <kherbst@redhat.com>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      2e137f0e
    • hexin's avatar
      vfio_pci: Restore original state on release · ed9544ca
      hexin authored
      [ Upstream commit 92c80268 ]
      
      vfio_pci_enable() saves the device's initial configuration information
      with the intent that it is restored in vfio_pci_disable().  However,
      the commit referenced in Fixes: below replaced the call to
      __pci_reset_function_locked(), which is not wrapped in a state save
      and restore, with pci_try_reset_function(), which overwrites the
      restored device state with the current state before applying it to the
      device.  Reinstate use of __pci_reset_function_locked() to return to
      the desired behavior.
      
      Fixes: 890ed578 ("vfio-pci: Use pci "try" reset interface")
      Signed-off-by: hexin <hexin15@baidu.com>
      Signed-off-by: Liu Qi <liuqi16@baidu.com>
      Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ed9544ca
    • Sowjanya Komatineni's avatar
      pinctrl: tegra: Fix write barrier placement in pmx_writel · b64f7180
      Sowjanya Komatineni authored
      [ Upstream commit c2cf351e ]
      
      pmx_writel uses writel, which inserts a write barrier before the
      register write.
      
      Replace writel with writel_relaxed followed by a readback and a
      memory barrier, to ensure the write operation has completed so that
      the pinctrl change succeeds.
      
      Acked-by: Thierry Reding <treding@nvidia.com>
      Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
      Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
      Link: https://lore.kernel.org/r/1565984527-5272-2-git-send-email-skomatineni@nvidia.com
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      b64f7180
    • Nathan Lynch's avatar
      powerpc/pseries/mobility: use cond_resched when updating device tree · e9b7c5d8
      Nathan Lynch authored
      [ Upstream commit ccfb5bd7 ]
      
      After a partition migration, pseries_devicetree_update() processes
      changes to the device tree communicated from the platform to
      Linux. This is a relatively heavyweight operation, with multiple
      device tree searches, memory allocations, and conversations with
      partition firmware.
      
      There are a few levels of nested loops which are bounded only by
      decisions made by the platform, outside of Linux's control, and indeed
      we have seen RCU stalls on large systems while executing this call
      graph. Use cond_resched() in these loops so that the cpu is yielded
      when needed.
      
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20190802192926.19277-4-nathanl@linux.ibm.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e9b7c5d8
    • Christophe Leroy's avatar
      powerpc/futex: Fix warning: 'oldval' may be used uninitialized in this function · 31b0a491
      Christophe Leroy authored
      [ Upstream commit 38a0d0cd ]
      
      We see warnings such as:
        kernel/futex.c: In function 'do_futex':
        kernel/futex.c:1676:17: warning: 'oldval' may be used uninitialized in this function [-Wmaybe-uninitialized]
           return oldval == cmparg;
                         ^
        kernel/futex.c:1651:6: note: 'oldval' was declared here
          int oldval, ret;
              ^
      
      This is because arch_futex_atomic_op_inuser() only sets *oval if ret
      is 0 and GCC doesn't see that it will only use it when ret is 0.
      
      Anyway, the non-zero ret path is an error path that won't suffer from
      setting *oval, and as *oval is a local var in futex_atomic_op_inuser()
      it will have no impact.
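
      A minimal userspace sketch of storing the output on every path.
      fetch_old_value() is a hypothetical helper, and the error value
      merely stands in for -EFAULT:

```c
/*
 * Read the old value through uaddr and report it through oval.
 * The store to *oval happens on the error path too, so the caller's
 * variable is always initialized when this function returns.
 */
static int fetch_old_value(const int *uaddr, int *oval)
{
	int ret = 0;
	int oldval = 0;

	if (!uaddr)
		ret = -14; /* stands in for -EFAULT */
	else
		oldval = *uaddr;

	*oval = oldval; /* unconditional: harmless when ret is non-zero */
	return ret;
}
```

      The caller still has to check the return value; the unconditional
      store only removes the path on which the output is never written.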
      
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      [mpe: reword change log slightly]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/86b72f0c134367b214910b27b9a6dd3321af93bb.1565774657.git.christophe.leroy@c-s.fr
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      31b0a491
    • Nathan Lynch's avatar
      powerpc/rtas: use device model APIs and serialization during LPM · c643f5b6
      Nathan Lynch authored
      [ Upstream commit a6717c01 ]
      
      The LPAR migration implementation and userspace-initiated cpu hotplug
      can interleave their executions like so:
      
      1. Set cpu 7 offline via sysfs.
      
      2. Begin a partition migration, whose implementation requires the OS
         to ensure all present cpus are online; cpu 7 is onlined:
      
           rtas_ibm_suspend_me -> rtas_online_cpus_mask -> cpu_up
      
         This sets cpu 7 online in all respects except for the cpu's
         corresponding struct device; dev->offline remains true.
      
      3. Set cpu 7 online via sysfs. _cpu_up() determines that cpu 7 is
         already online and returns success. The driver core (device_online)
         sets dev->offline = false.
      
      4. The migration completes and restores cpu 7 to offline state:
      
           rtas_ibm_suspend_me -> rtas_offline_cpus_mask -> cpu_down
      
      This leaves cpu 7 in a state where the driver core considers the cpu
      device online, but in all other respects it is offline and
      unused. Attempts to online the cpu via sysfs appear to succeed but the
      driver core actually does not pass the request to the lower-level
      cpuhp support code. This makes the cpu unusable until the cpu device
      is manually set offline and then online again via sysfs.
      
      Instead of directly calling cpu_up/cpu_down, the migration code should
      use the higher-level device core APIs to maintain consistent state and
      serialize operations.
      
      Fixes: 120496ac ("powerpc: Bring all threads online prior to migration/hibernation")
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20190802192926.19277-2-nathanl@linux.ibm.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c643f5b6
    • Cédric Le Goater's avatar
      powerpc/xmon: Check for HV mode when dumping XIVE info from OPAL · 7d0b30d2
      Cédric Le Goater authored
      [ Upstream commit c3e0dbd7 ]
      
      Currently, the xmon 'dx' command calls OPAL to dump the XIVE state in
      the OPAL logs and also outputs some of the fields of the internal XIVE
      structures in Linux. The OPAL calls can only be done on baremetal
      (PowerNV) and they crash a pseries machine. Fix by checking the
      hypervisor feature of the CPU.
      
      Signed-off-by: Cédric Le Goater <clg@kaod.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20190814154754.23682-2-clg@kaod.org
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7d0b30d2
    • Stephen Boyd's avatar
      clk: zx296718: Don't reference clk_init_data after registration · c96b058d
      Stephen Boyd authored
      [ Upstream commit 1a4549c1 ]
      
      A future patch is going to change semantics of clk_register() so that
      clk_hw::init is guaranteed to be NULL after a clk is registered. Avoid
      referencing this member here so that we don't run into NULL pointer
      exceptions.
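
      The idea can be sketched with stand-in types: copy whatever is
      needed from the init data before registering. The structures and
      both functions below are hypothetical, and register_demo() only
      models the announced semantics:

```c
#include <stddef.h>

struct init_data { const char *name; };
struct clk_hw_demo { const struct init_data *init; };

/* Models registration that clears the init pointer once it is done. */
static int register_demo(struct clk_hw_demo *hw)
{
	hw->init = NULL;
	return 0;
}

/* Returns the clock name, or NULL if registration failed. */
static const char *register_and_name(struct clk_hw_demo *hw)
{
	const char *name = hw->init->name; /* read before registering */

	if (register_demo(hw) != 0)
		return NULL;
	return name; /* hw->init must not be dereferenced from here on */
}
```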
      
      Cc: Jun Nie <jun.nie@linaro.org>
      Cc: Shawn Guo <shawnguo@kernel.org>
      Signed-off-by: Stephen Boyd <sboyd@kernel.org>
      Link: https://lkml.kernel.org/r/20190815160020.183334-3-sboyd@kernel.org
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c96b058d