Skip to content
  1. Aug 26, 2020
    • Linus Torvalds's avatar
      Merge tag 'nfsd-5.9-1' of git://git.linux-nfs.org/projects/cel/cel-2.6 · 2ac69819
      Linus Torvalds authored
      Pull nfs server fixes from Chuck Lever:
      
       - Eliminate an oops introduced in v5.8
      
       - Remove a duplicate #include added by nfsd-5.9
      
      * tag 'nfsd-5.9-1' of git://git.linux-nfs.org/projects/cel/cel-2.6:
        SUNRPC: remove duplicate include
        nfsd: fix oops on mixed NFSv4/NFSv3 client access
      2ac69819
    • Linus Torvalds's avatar
      Merge tag 'm68knommu-for-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu · abb3438d
      Linus Torvalds authored
      Pull m68knommu fix from Greg Ungerer:
       "Only a single fix for the binfmt_flat loader (reverting a recent
        change)"
      
      * tag 'm68knommu-for-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
        binfmt_flat: revert "binfmt_flat: don't offset the data start"
      abb3438d
    • Linus Torvalds's avatar
      Merge tag 'libnvdimm-fix-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · d7d1f235
      Linus Torvalds authored
      Pull libnvdimm fixes from Vishal Verma:
       "A couple of minor fixes for things merged in 5.9-rc1.
      
        One is an out-of-bounds access caught by KASAN, and the second is a
        tweak to some overzealous logging about dax support even for
        traditional block devices which was unnecessary"
      
      * tag 'libnvdimm-fix-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        dax: do not print error message for non-persistent memory block device
        libnvdimm: KASAN: global-out-of-bounds Read in internal_create_group
      d7d1f235
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid · b9aa527a
      Linus Torvalds authored
      Pull HID fixes from Jiri Kosina:
      
       - regression fix / revert of a commit that intended to reduce probing
         delay by ~50ms, but introduced a race that causes quite a few devices
         not to enumerate, or get stuck on first IRQ
      
       - buffer overflow fix in hiddev, from Peilin Ye
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
        Revert "HID: usbhid: do not sleep when opening device"
        HID: hiddev: Fix slab-out-of-bounds write in hiddev_ioctl_usage()
        HID: quirks: Always poll three more Lenovo PixArt mice
        HID: i2c-hid: Always sleep 60ms after I2C_HID_PWR_ON commands
        HID: macally: Constify macally_id_table
        HID: cougar: Constify cougar_id_table
      b9aa527a
  2. Aug 25, 2020
    • Gustavo A. R. Silva's avatar
      lib: Revert use of fallthrough pseudo-keyword in lib/ · 6a9dc5fd
      Gustavo A. R. Silva authored
      The following build error for powerpc64 was reported by Nathan Chancellor:
      
        "$ scripts/config --file arch/powerpc/configs/powernv_defconfig -e KERNEL_XZ
      
         $ make -skj"$(nproc)" ARCH=powerpc CROSS_COMPILE=powerpc64le-linux- distclean powernv_defconfig zImage
         ...
         In file included from arch/powerpc/boot/../../../lib/decompress_unxz.c:234,
                          from arch/powerpc/boot/decompress.c:38:
         arch/powerpc/boot/../../../lib/xz/xz_dec_stream.c: In function 'dec_main':
         arch/powerpc/boot/../../../lib/xz/xz_dec_stream.c:586:4: error: 'fallthrough' undeclared (first use in this function)
           586 |    fallthrough;
               |    ^~~~~~~~~~~
      
         This will end up affecting distribution configurations such as Debian
         and OpenSUSE according to my testing. I am not sure what the solution
         is, the PowerPC wrapper does not set -D__KERNEL__ so I am not sure
         that compiler_attributes.h can be safely included."
      
      In order to avoid these sort of p...
      6a9dc5fd
    • Linus Torvalds's avatar
      Merge tag 'for-5.9-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 9907ab37
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
      
       - fix swapfile activation on subvolumes with deleted snapshots
      
       - error value mixup when removing directory entries from tree log
      
       - fix lzo compression level reset after previous level setting
      
       - fix space cache memory leak after transaction abort
      
       - fix const function attribute
      
       - more error handling improvements
      
      * tag 'for-5.9-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: detect nocow for swap after snapshot delete
        btrfs: check the right error variable in btrfs_del_dir_entries_in_log
        btrfs: fix space cache memory leak after transaction abort
        btrfs: use the correct const function attribute for btrfs_get_num_csums
        btrfs: reset compression level for lzo on remount
        btrfs: handle errors from async submission
      9907ab37
    • Linus Torvalds's avatar
      Merge tag 'io_uring-5.9-2020-08-23' of git://git.kernel.dk/linux-block · c41c3ec4
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - NVMe pull request from Sagi:
             - nvme completion rework from Christoph and Chao that mostly came
               from a bit of divergence of how we classify errors related to
               pathing/retry etc.
             - nvmet passthru fixes from Chaitanya
             - minor nvmet fixes from Amit and I
             - mpath round-robin path selection fix from Martin
             - ignore noiob for zoned devices from Keith
             - minor nvme-fc fix from Tianjia"
      
       - BFQ cgroup leak fix (Dmitry)
      
       - block layer MAINTAINERS addition (Geert)
      
       - fix null_blk FUA checking (Hou)
      
       - get_max_io_size() size fix (Keith)
      
       - fix block page_is_mergeable() for compound pages (Matthew)
      
       - discard granularity fixes (Ming)
      
       - IO scheduler ordering fix (Ming)
      
       - misc fixes
      
      * tag 'io_uring-5.9-2020-08-23' of git://git.kernel.dk/linux-block: (31 commits)
        null_blk: fix passing of REQ_FUA flag in null_handle_rq
        nvmet: Disable keep-alive timer when kato is cleared to 0h
        nvme: redirect commands on dying queue
        nvme: just check the status code type in nvme_is_path_error
        nvme: refactor command completion
        nvme: rename and document nvme_end_request
        nvme: skip noiob for zoned devices
        nvme-pci: fix PRP pool size
        nvme-pci: Use u32 for nvme_dev.q_depth and nvme_queue.q_depth
        nvme: Use spin_lock_irq() when taking the ctrl->lock
        nvmet: call blk_mq_free_request() directly
        nvmet: fix oops in pt cmd execution
        nvmet: add ns tear down label for pt-cmd handling
        nvme: multipath: round-robin: eliminate "fallback" variable
        nvme: multipath: round-robin: fix single non-optimized path case
        nvme-fc: Fix wrong return value in __nvme_fc_init_request()
        nvmet-passthru: Reject commands with non-sgl flags set
        nvmet: fix a memory leak
        blkcg: fix memleak for iolatency
        MAINTAINERS: Add missing header files to BLOCK LAYER section
        ...
      c41c3ec4
    • Linus Torvalds's avatar
      Merge tag 'fallthrough-pseudo-keyword-5.9-rc3' of... · 2bf74771
      Linus Torvalds authored
      Merge tag 'fallthrough-pseudo-keyword-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
      
      Pull 'fallthrough' keyword conversion from Gustavo A. R. Silva:
       "A tree-wide patch that replaces tons (2484) of /* fall through */
        comments, and its variants, with the new pseudo-keyword macro
        fallthrough[1]. Also, remove unnecessary fall-through markings when it
        is the case.
      
        There are currently 1167 intances of this fallthrough pseudo-keyword
        macro in mainline (5.9-rc2), that have been introduced over the last
        couple of development cycles:
      
          $ git grep -nw 'fallthrough;' | wc -l
          1167
      
        The global adoption of the fallthrough pseudo-keyword is something
        certain to happen; so, better sooner than later. :) This will also
        save everybody's time and thousands of lines of unnecessarily
        repetitive changelog text.
      
        After applying this patch on top of 5.9-rc2, we'll have a total of
        3651 instances of this macro:
      
          $ git grep -nw 'fallthrough;' | wc -l
          3651
      
        This treewide patch doesn't address ALL fall-through markings in all
        subsystems at once because I have previously sent out patches for some
        of such subsystems separately, and I will follow up on them; however,
        this definitely contributes most of the work needed to replace all the
        fall-through markings with the fallthrough pseudo-keyword macro in the
        whole codebase.
      
        I have build-tested this patch on 10 different architectures: x86_64,
        i386, arm64, powerpc, s390, sparc64, sh, m68k, powerpc64 and alpha
        (allyesconfig for all of them). This is in linux-next already and
        kernel test robot has also helped me to successfully build-test early
        versions of this patch[2][3][4][5]"
      
      [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
      [2] https://lore.kernel.org/lkml/5f3cc99a.HgvOW3rH0mD0RmkM%25lkp@intel.com/
      [3] https://lore.kernel.org/lkml/5f3dd1d2.l1axczH+t4hMBZ63%25lkp@intel.com/
      [4] https://lore.kernel.org/lkml/5f3e977a.mwYHUIObbR4SHr0B%25lkp@intel.com/
      [5] https://lore.kernel.org/lkml/5f3f9e1c.qsyb%2FaySkiXNpkO4%25lkp@intel.com/
      
      * tag 'fallthrough-pseudo-keyword-5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
        treewide: Use fallthrough pseudo-keyword
      2bf74771
  3. Aug 24, 2020
    • Max Filippov's avatar
      binfmt_flat: revert "binfmt_flat: don't offset the data start" · 2217b982
      Max Filippov authored
      binfmt_flat loader uses the gap between text and data to store data
      segment pointers for the libraries. Even in the absence of shared
      libraries it stores at least one pointer to the executable's own data
      segment. Text and data can go back to back in the flat binary image and
      without offsetting data segment last few instructions in the text
      segment may get corrupted by the data segment pointer.
      
      Fix it by reverting commit a2357223 ("binfmt_flat: don't offset the
      data start").
      
      Cc: stable@vger.kernel.org
      Fixes: a2357223
      
       ("binfmt_flat: don't offset the data start")
      Signed-off-by: default avatarMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: default avatarGreg Ungerer <gerg@linux-m68k.org>
      2217b982
    • Gustavo A. R. Silva's avatar
      treewide: Use fallthrough pseudo-keyword · df561f66
      Gustavo A. R. Silva authored
      
      
      Replace the existing /* fall through */ comments and its variants with
      the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
      fall-through markings when it is the case.
      
      [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
      
      Signed-off-by: default avatarGustavo A. R. Silva <gustavoars@kernel.org>
      df561f66
    • Linus Torvalds's avatar
      Linux 5.9-rc2 · d012a719
      Linus Torvalds authored
      d012a719
    • Linus Torvalds's avatar
      Merge tag 'powerpc-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · cb957121
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Add perf support for emitting extended registers for power10.
      
       - A fix for CPU hotplug on pseries, where on large/loaded systems we
         may not wait long enough for the CPU to be offlined, leading to
         crashes.
      
       - Addition of a raw cputable entry for Power10, which is not required
         to boot, but is required to make our PMU setup work correctly in
         guests.
      
       - Three fixes for the recent changes on 32-bit Book3S to move modules
         into their own segment for strict RWX.
      
       - A fix for a recent change in our powernv PCI code that could lead to
         crashes.
      
       - A change to our perf interrupt accounting to avoid soft lockups when
         using some events, found by syzkaller.
      
       - A change in the way we handle power loss events from the hypervisor
         on pseries. We no longer immediately shut down if we're told we're
         running on a UPS.
      
       - A few other minor fixes.
      
      Thanks to Alexey Kardashevskiy, Andreas Schwab, Aneesh Kumar K.V, Anju T
      Sudhakar, Athira Rajeev, Christophe Leroy, Frederic Barrat, Greg Kurz,
      Kajol Jain, Madhavan Srinivasan, Michael Neuling, Michael Roth,
      Nageswara R Sastry, Oliver O'Halloran, Thiago Jung Bauermann,
      Vaidyanathan Srinivasan, Vasant Hegde.
      
      * tag 'powerpc-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/perf/hv-24x7: Move cpumask file to top folder of hv-24x7 driver
        powerpc/32s: Fix module loading failure when VMALLOC_END is over 0xf0000000
        powerpc/pseries: Do not initiate shutdown when system is running on UPS
        powerpc/perf: Fix soft lockups due to missed interrupt accounting
        powerpc/powernv/pci: Fix possible crash when releasing DMA resources
        powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death
        powerpc/32s: Fix is_module_segment() when MODULES_VADDR is defined
        powerpc/kasan: Fix KASAN_SHADOW_START on BOOK3S_32
        powerpc/fixmap: Fix the size of the early debug area
        powerpc/pkeys: Fix build error with PPC_MEM_KEYS disabled
        powerpc/kernel: Cleanup machine check function declarations
        powerpc: Add POWER10 raw mode cputable entry
        powerpc/perf: Add extended regs support for power10 platform
        powerpc/perf: Add support for outputting extended regs in perf intr_regs
        powerpc: Fix P10 PVR revision in /proc/cpuinfo for SMT4 cores
      cb957121
    • Linus Torvalds's avatar
      Merge tag 'x86-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 550c2129
      Linus Torvalds authored
      Pull x86 fix from Thomas Gleixner:
       "A single fix for x86 which removes the RDPID usage from the paranoid
        entry path and unconditionally uses LSL to retrieve the CPU number.
      
        RDPID depends on MSR_TSX_AUX. KVM has an optmization to avoid
        expensive MRS read/writes on VMENTER/EXIT. It caches the MSR values
        and restores them either when leaving the run loop, on preemption or
        when going out to user space. MSR_TSX_AUX is part of that lazy MSR
        set, so after writing the guest value and before the lazy restore any
        exception using the paranoid entry will read the guest value and use
        it as CPU number to retrieve the GSBASE value for the current CPU when
        FSGSBASE is enabled. As RDPID is only used in that particular entry
        path, there is no reason to burden VMENTER/EXIT with two extra MSR
        writes. Remove the RDPID optimization, which is not even backed by
        numbers from the paranoid entry path instead"
      
      * tag 'x86-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/entry/64: Do not use RDPID in paranoid entry to accomodate KVM
      550c2129
    • Linus Torvalds's avatar
      Merge tag 'perf-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · cea05c19
      Linus Torvalds authored
      Pull x86 perf fix from Thomas Gleixner:
       "A single update for perf on x86 which has support for the broken down
        bandwith counters"
      
      * tag 'perf-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/intel/uncore: Add BW counters for GT, IA and IO breakdown
      cea05c19
    • Linus Torvalds's avatar
      Merge tag 'efi-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 10c091b6
      Linus Torvalds authored
      Pull EFI fixes from Thomas Gleixner:
      
       - Enforce NX on RO data in mixed EFI mode
      
       - Destroy workqueue in an error handling path to prevent UAF
      
       - Stop argument parser at '--' which is the delimiter for init
      
       - Treat a NULL command line pointer as empty instead of dereferncing it
         unconditionally.
      
       - Handle an unterminated command line correctly
      
       - Cleanup the 32bit code leftovers and remove obsolete documentation
      
      * tag 'efi-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        Documentation: efi: remove description of efi=old_map
        efi/x86: Move 32-bit code into efi_32.c
        efi/libstub: Handle unterminated cmdline
        efi/libstub: Handle NULL cmdline
        efi/libstub: Stop parsing arguments at "--"
        efi: add missed destroy_workqueue when efisubsys_init fails
        efi/x86: Mark kernel rodata non-executable for mixed mode
      10c091b6
    • Linus Torvalds's avatar
      Merge tag 'core-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e99b2507
      Linus Torvalds authored
      Pull entry fix from Thomas Gleixner:
       "A single bug fix for the common entry code.
      
        The transcription of the x86 version messed up the reload of the
        syscall number from pt_regs after ptrace and seccomp which breaks
        syscall number rewriting"
      
      * tag 'core-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        core/entry: Respect syscall number rewrites
      e99b2507
    • Linus Torvalds's avatar
      Merge tag 'edac_urgent_for_v5.9_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras · d9232cb7
      Linus Torvalds authored
      Pull EDAC fix from Borislav Petkov:
       "A single fix correcting a reversed error severity determination check
        which lead to a recoverable error getting marked as fatal, by Tony
        Luck"
      
      * tag 'edac_urgent_for_v5.9_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
        EDAC/{i7core,sb,pnd2,skx}: Fix error event severity
      d9232cb7
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 9d045ed1
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Nothing earth shattering here, lots of small fixes (f.e. missing RCU
        protection, bad ref counting, missing memset(), etc.) all over the
        place:
      
         1) Use get_file_rcu() in task_file iterator, from Yonghong Song.
      
         2) There are two ways to set remote source MAC addresses in macvlan
            driver, but only one of which validates things properly. Fix this.
            From Alvin Šipraga.
      
         3) Missing of_node_put() in gianfar probing, from Sumera
            Priyadarsini.
      
         4) Preserve device wanted feature bits across multiple netlink
            ethtool requests, from Maxim Mikityanskiy.
      
         5) Fix rcu_sched stall in task and task_file bpf iterators, from
            Yonghong Song.
      
         6) Avoid reset after device destroy in ena driver, from Shay
            Agroskin.
      
         7) Missing memset() in netlink policy export reallocation path, from
            Johannes Berg.
      
         8) Fix info leak in __smc_diag_dump(), from Peilin Ye.
      
         9) Decapsulate ECN properly for ipv6 in ipv4 tunnels, from Mark
            Tomlinson.
      
        10) Fix number of data stream negotiation in SCTP, from David Laight.
      
        11) Fix double free in connection tracker action module, from Alaa
            Hleihel.
      
        12) Don't allow empty NHA_GROUP attributes, from Nikolay Aleksandrov"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (46 commits)
        net: nexthop: don't allow empty NHA_GROUP
        bpf: Fix two typos in uapi/linux/bpf.h
        net: dsa: b53: check for timeout
        tipc: call rcu_read_lock() in tipc_aead_encrypt_done()
        net/sched: act_ct: Fix skb double-free in tcf_ct_handle_fragments() error flow
        net: sctp: Fix negotiation of the number of data streams.
        dt-bindings: net: renesas, ether: Improve schema validation
        gre6: Fix reception with IP6_TNL_F_RCV_DSCP_COPY
        hv_netvsc: Fix the queue_mapping in netvsc_vf_xmit()
        hv_netvsc: Remove "unlikely" from netvsc_select_queue
        bpf: selftests: global_funcs: Check err_str before strstr
        bpf: xdp: Fix XDP mode when no mode flags specified
        selftests/bpf: Remove test_align leftovers
        tools/resolve_btfids: Fix sections with wrong alignment
        net/smc: Prevent kernel-infoleak in __smc_diag_dump()
        sfc: fix build warnings on 32-bit
        net: phy: mscc: Fix a couple of spelling mistakes "spcified" -> "specified"
        libbpf: Fix map index used in error message
        net: gemini: Fix missing free_netdev() in error path of gemini_ethernet_port_probe()
        net: atlantic: Use readx_poll_timeout() for large timeout
        ...
      9d045ed1
  4. Aug 23, 2020
    • Linus Torvalds's avatar
      Merge branch 'work.epoll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · f320ac6e
      Linus Torvalds authored
      Pull epoll fixes from Al Viro:
       "Fix reference counting and clean up exit paths"
      
      * 'work.epoll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        do_epoll_ctl(): clean the failure exits up a bit
        epoll: Keep a reference on files added to the check list
      f320ac6e
    • Al Viro's avatar
      52c47969
    • Marc Zyngier's avatar
      epoll: Keep a reference on files added to the check list · a9ed4a65
      Marc Zyngier authored
      
      
      When adding a new fd to an epoll, and that this new fd is an
      epoll fd itself, we recursively scan the fds attached to it
      to detect cycles, and add non-epool files to a "check list"
      that gets subsequently parsed.
      
      However, this check list isn't completely safe when deletions
      can happen concurrently. To sidestep the issue, make sure that
      a struct file placed on the check list sees its f_count increased,
      ensuring that a concurrent deletion won't result in the file
      disapearing from under our feet.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMarc Zyngier <maz@kernel.org>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      a9ed4a65
    • Nikolay Aleksandrov's avatar
      net: nexthop: don't allow empty NHA_GROUP · eeaac363
      Nikolay Aleksandrov authored
      Currently the nexthop code will use an empty NHA_GROUP attribute, but it
      requires at least 1 entry in order to function properly. Otherwise we
      end up derefencing null or random pointers all over the place due to not
      having any nh_grp_entry members allocated, nexthop code relies on having at
      least the first member present. Empty NHA_GROUP doesn't make any sense so
      just disallow it.
      Also add a WARN_ON for any future users of nexthop_create_group().
      
       BUG: kernel NULL pointer dereference, address: 0000000000000080
       #PF: supervisor read access in kernel mode
       #PF: error_code(0x0000) - not-present page
       PGD 0 P4D 0
       Oops: 0000 [#1] SMP
       CPU: 0 PID: 558 Comm: ip Not tainted 5.9.0-rc1+ #93
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
       RIP: 0010:fib_check_nexthop+0x4a/0xaa
       Code: 0f 84 83 00 00 00 48 c7 02 80 03 f7 81 c3 40 80 fe fe 75 12 b8 ea ff ff ff 48 85 d2 74 6b 48 c7 02 40 03 f7 81 c3 48 8b 40 10 <48> 8b 80 80 00 00 00 eb 36 80 78 1a 00 74 12 b8 ea ff ff ff 48 85
       RSP: 0018:ffff88807983ba00 EFLAGS: 00010213
       RAX: 0000000000000000 RBX: ffff88807983bc00 RCX: 0000000000000000
       RDX: ffff88807983bc00 RSI: 0000000000000000 RDI: ffff88807bdd0a80
       RBP: ffff88807983baf8 R08: 0000000000000dc0 R09: 000000000000040a
       R10: 0000000000000000 R11: ffff88807bdd0ae8 R12: 0000000000000000
       R13: 0000000000000000 R14: ffff88807bea3100 R15: 0000000000000001
       FS:  00007f10db393700(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000080 CR3: 000000007bd0f004 CR4: 00000000003706f0
       Call Trace:
        fib_create_info+0x64d/0xaf7
        fib_table_insert+0xf6/0x581
        ? __vma_adjust+0x3b6/0x4d4
        inet_rtm_newroute+0x56/0x70
        rtnetlink_rcv_msg+0x1e3/0x20d
        ? rtnl_calcit.isra.0+0xb8/0xb8
        netlink_rcv_skb+0x5b/0xac
        netlink_unicast+0xfa/0x17b
        netlink_sendmsg+0x334/0x353
        sock_sendmsg_nosec+0xf/0x3f
        ____sys_sendmsg+0x1a0/0x1fc
        ? copy_msghdr_from_user+0x4c/0x61
        ___sys_sendmsg+0x63/0x84
        ? handle_mm_fault+0xa39/0x11b5
        ? sockfd_lookup_light+0x72/0x9a
        __sys_sendmsg+0x50/0x6e
        do_syscall_64+0x54/0xbe
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
       RIP: 0033:0x7f10dacc0bb7
       Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 8b 05 9a 4b 2b 00 85 c0 75 2e 48 63 ff 48 63 d2 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 b1 f2 2a 00 f7 d8 64 89 02 48
       RSP: 002b:00007ffcbe628bf8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
       RAX: ffffffffffffffda RBX: 00007ffcbe628f80 RCX: 00007f10dacc0bb7
       RDX: 0000000000000000 RSI: 00007ffcbe628c60 RDI: 0000000000000003
       RBP: 000000005f41099c R08: 0000000000000001 R09: 0000000000000008
       R10: 00000000000005e9 R11: 0000000000000246 R12: 0000000000000000
       R13: 0000000000000000 R14: 00007ffcbe628d70 R15: 0000563a86c6e440
       Modules linked in:
       CR2: 0000000000000080
      
      CC: David Ahern <dsahern@gmail.com>
      Fixes: 430a0491
      
       ("nexthop: Add support for nexthop groups")
      Reported-by: default avatar <syzbot+a61aa19b0c14c8770bd9@syzkaller.appspotmail.com>
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      eeaac363
    • Linus Torvalds's avatar
      Merge tag 'kbuild-fixes-v5.9' of... · c3d8f220
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - move -Wsign-compare warning from W=2 to W=3
      
       - fix the keyword _restrict to __restrict in genksyms
      
       - fix more bugs in qconf
      
      * tag 'kbuild-fixes-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kconfig: qconf: replace deprecated QString::sprintf() with QTextStream
        kconfig: qconf: remove redundant help in the info view
        kconfig: qconf: remove qInfo() to get back Qt4 support
        kconfig: qconf: remove unused colNr
        kconfig: qconf: fix the popup menu in the ConfigInfoView window
        kconfig: qconf: fix signal connection to invalid slots
        genksyms: keywords: Use __restrict not _restrict
        kbuild: remove redundant patterns in filter/filter-out
        extract-cert: add static to local data
        Makefile.extrawarn: Move sign-compare from W=2 to W=3
      c3d8f220
    • Linus Torvalds's avatar
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · dd105d64
      Linus Torvalds authored
      Pull arm64 fixes from Catalin Marinas:
      
       - Allow booting of late secondary CPUs affected by erratum 1418040
         (currently they are parked if none of the early CPUs are affected by
         this erratum).
      
       - Add the 32-bit vdso Makefile to the vdso_install rule so that 'make
         vdso_install' installs the 32-bit compat vdso when it is compiled.
      
       - Print a warning that untrusted guests without a CPU erratum
         workaround (Cortex-A57 832075) may deadlock the affected system.
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        ARM64: vdso32: Install vdso32 from vdso_install
        KVM: arm64: Print warning when cpu erratum can cause guests to deadlock
        arm64: Allow booting of late CPUs affected by erratum 1418040
        arm64: Move handling of erratum 1418040 into C code
      dd105d64
    • Linus Torvalds's avatar
      Merge tag 's390-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · d57ce840
      Linus Torvalds authored
      Pull s390 fixes from Vasily Gorbik:
      
       - a couple of fixes for storage key handling relevant for debugging
      
       - add cond_resched into potentially slow subchannels scanning loop
      
       - fixes for PF/VF linking and to ignore stale PCI configuration request
         events
      
      * tag 's390-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/pci: fix PF/VF linking on hot plug
        s390/pci: re-introduce zpci_remove_device()
        s390/pci: fix zpci_bus_link_virtfn()
        s390/ptrace: fix storage key handling
        s390/runtime_instrumentation: fix storage key handling
        s390/pci: ignore stale configuration request event
        s390/cio: add cond_resched() in the slow_eval_known_fn() loop
      d57ce840
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · b2d9e996
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
      
       - PAE and PKU bugfixes for x86
      
       - selftests fix for new binutils
      
       - MMU notifier fix for arm64
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: arm64: Only reschedule if MMU_NOTIFIER_RANGE_BLOCKABLE is not set
        KVM: Pass MMU notifier range flags to kvm_unmap_hva_range()
        kvm: x86: Toggling CR4.PKE does not load PDPTEs in PAE mode
        kvm: x86: Toggling CR4.SMAP does not load PDPTEs in PAE mode
        KVM: x86: fix access code passed to gva_to_gpa
        selftests: kvm: Use a shorter encoding to clear RAX
      b2d9e996
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 9e574b74
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "23 fixes in 5 drivers (qla2xxx, ufs, scsi_debug, fcoe, zfcp). The bulk
        of the changes are in qla2xxx and ufs and all are mostly small and
        definitely don't impact the core"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (23 commits)
        Revert "scsi: qla2xxx: Disable T10-DIF feature with FC-NVMe during probe"
        Revert "scsi: qla2xxx: Fix crash on qla2x00_mailbox_command"
        scsi: qla2xxx: Fix null pointer access during disconnect from subsystem
        scsi: qla2xxx: Check if FW supports MQ before enabling
        scsi: qla2xxx: Fix WARN_ON in qla_nvme_register_hba
        scsi: qla2xxx: Allow ql2xextended_error_logging special value 1 to be set anytime
        scsi: qla2xxx: Reduce noisy debug message
        scsi: qla2xxx: Fix login timeout
        scsi: qla2xxx: Indicate correct supported speeds for Mezz card
        scsi: qla2xxx: Flush I/O on zone disable
        scsi: qla2xxx: Flush all sessions on zone disable
        scsi: qla2xxx: Use MBX_TOV_SECONDS for mailbox command timeout values
        scsi: scsi_debug: Fix scp is NULL errors
        scsi: zfcp: Fix use-after-free in request timeout handlers
        scsi: ufs: No need to send Abort Task if the task in DB was cleared
        scsi: ufs: Clean up completed request without interrupt notification
        scsi: ufs: Improve interrupt handling for shared interrupts
        scsi: ufs: Fix interrupt error message for shared interrupts
        scsi: ufs-pci: Add quirk for broken auto-hibernate for Intel EHL
        scsi: ufs-mediatek: Fix incorrect time to wait link status
        ...
      9e574b74
    • Linus Torvalds's avatar
      Merge tag 'devicetree-fixes-for-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · d6af6330
      Linus Torvalds authored
      Pull devicetree fixes from Rob Herring:
       "Another set of DT fixes:
      
         - restore range parsing error check
      
         - workaround PCI range parsing with missing 'device_type' now
           required
      
         - correct description of 'phy-connection-type'
      
         - fix erroneous matching on 'snps,dw-pcie' by 'intel,lgm-pcie' schema
      
         - a couple of grammar and whitespace fixes
      
         - update Shawn Guo's email"
      
      * tag 'devicetree-fixes-for-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        dt-bindings: vendor-prefixes: Remove trailing whitespace
        dt-bindings: net: correct description of phy-connection-type
        dt-bindings: PCI: intel,lgm-pcie: Fix matching on all snps,dw-pcie instances
        of: address: Work around missing device_type property in pcie nodes
        dt: writing-schema: Miscellaneous grammar fixes
        dt-bindings: Use Shawn Guo's preferred e-mail for i.MX bindings
        of/address: check for invalid range.cpu_addr
      d6af6330
  5. Aug 22, 2020
    • Hou Pu's avatar
      null_blk: fix passing of REQ_FUA flag in null_handle_rq · 2d62e6b0
      Hou Pu authored
      REQ_FUA should be checked using rq->cmd_flags instead of req_op().
      
      Fixes: deb78b41
      
       ("nullb: emulate cache")
      Signed-off-by: default avatarHou Pu <houpu@bytedance.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      2d62e6b0
    • Amit Engel's avatar
      nvmet: Disable keep-alive timer when kato is cleared to 0h · 0d3b6a8d
      Amit Engel authored
      
      
      Based on nvme spec, when keep alive timeout is set to zero
      the keep-alive timer should be disabled.
      
      Signed-off-by: default avatarAmit Engel <amit.engel@dell.com>
      Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      0d3b6a8d
    • Chao Leng's avatar
      nvme: redirect commands on dying queue · 5eac5f33
      Chao Leng authored
      
      
      If a command send through nvme-multipath failed on a dying queue, resend it
      on another path.
      
      Signed-off-by: default avatarChao Leng <lengchao@huawei.com>
      [hch: rebased on top of the completion refactoring]
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Reviewed-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      5eac5f33
    • Christoph Hellwig's avatar
      nvme: just check the status code type in nvme_is_path_error · 1e41f3bd
      Christoph Hellwig authored
      
      
      Check the SCT sub-field for a path related status instead of enumerating
      invididual status code.  As of NVMe 1.4 this adds "Internal Path Error"
      and "Controller Pathing Error" to the list, but it also future proofs for
      additional status codes added to the category.
      
      Suggested-by: default avatarChao Leng <lengchao@huawei.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      1e41f3bd
    • Christoph Hellwig's avatar
      nvme: refactor command completion · 5ddaabe8
      Christoph Hellwig authored
      
      
      Lift all the code to decide the dispostition of a completed command
      from nvme_complete_rq and nvme_failover_req into a new helper, which
      returns an emum of the potential actions.  nvme_complete_rq then
      just switches on those and calls the proper helper for the action.
      
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      5ddaabe8
    • Christoph Hellwig's avatar
      nvme: rename and document nvme_end_request · 2eb81a33
      Christoph Hellwig authored
      
      
      nvme_end_request is a bit misnamed, as it wraps around the
      blk_mq_complete_* API.  It's semantics also are non-trivial, so give it
      a more descriptive name and add a comment explaining the semantics.
      
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Reviewed-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      2eb81a33
    • Keith Busch's avatar
      nvme: skip noiob for zoned devices · c41ad98b
      Keith Busch authored
      
      
      Zoned block devices reuse the chunk_sectors queue limit to define zone
      boundaries. If a such a device happens to also report an optimal
      boundary, do not use that to define the chunk_sectors as that may
      intermittently interfere with io splitting and zone size queries.
      
      Signed-off-by: default avatarKeith Busch <kbusch@kernel.org>
      Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      c41ad98b
    • Christoph Hellwig's avatar
      nvme-pci: fix PRP pool size · c61b82c7
      Christoph Hellwig authored
      
      
      All operations are based on the controller, not the host page size.
      Switch the dma pool to use the controller page size as well to avoid
      massive overallocations on large page size systems.
      
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarKeith Busch <kbusch@kernel.org>
      Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      c61b82c7
    • John Garry's avatar
      nvme-pci: Use u32 for nvme_dev.q_depth and nvme_queue.q_depth · 7442ddce
      John Garry authored
      Recently nvme_dev.q_depth was changed from an int to u16 type.
      
      This falls over for the queue depth calculation in nvme_pci_enable(),
      where NVME_CAP_MQES(dev->ctrl.cap) + 1 may overflow as a u16, as
      NVME_CAP_MQES() is a 16b number also. That happens for me, and this is the
      result:
      
      root@ubuntu:/home/john# [148.272996] Unable to handle kernel NULL pointer
      dereference at virtual address 0000000000000010
      Mem abort info:
      ESR = 0x96000004
      EC = 0x25: DABT (current EL), IL = 32 bits
      SET = 0, FnV = 0
      EA = 0, S1PTW = 0
      Data abort info:
      ISV = 0, ISS = 0x00000004
      CM = 0, WnR = 0
      user pgtable: 4k pages, 48-bit VAs, pgdp=00000a27bf3c9000
      [0000000000000010] pgd=0000000000000000, p4d=0000000000000000
      Internal error: Oops: 96000004 [#1] PREEMPT SMP
      Modules linked in: nvme nvme_core
      CPU: 56 PID: 256 Comm: kworker/u195:0 Not tainted
      5.8.0-next-20200812 #27
      Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 -
      V1.16.01 03/15/2019
      Workqueue: nvme-reset-wq nvme_reset_work [nvme]
      pstate: 80c00009 (Nzcv daif +PAN +UAO BTYPE=--)
      pc : __sg_alloc_table_from_pages+0xec/0x238
      lr : __sg_alloc_table_from_pages+0xc8/0x238
      sp : ffff800013ccbad0
      x29: ffff800013ccbad0 x28: ffff0a27b3d380a8
      x27: 0000000000000000 x26: 0000000000002dc2
      x25: 0000000000000dc0 x24: 0000000000000000
      x23: 0000000000000000 x22: ffff800013ccbbe8
      x21: 0000000000000010 x20: 0000000000000000
      x19: 00000000fffff000 x18: ffffffffffffffff
      x17: 00000000000000c0 x16: fffffe289eaf6380
      x15: ffff800011b59948 x14: ffff002bc8fe98f8
      x13: ff00000000000000 x12: ffff8000114ca000
      x11: 0000000000000000 x10: ffffffffffffffff
      x9 : ffffffffffffffc0 x8 : ffff0a27b5f9b6a0
      x7 : 0000000000000000 x6 : 0000000000000001
      x5 : ffff0a27b5f9b680 x4 : 0000000000000000
      x3 : ffff0a27b5f9b680 x2 : 0000000000000000
       x1 : 0000000000000001 x0 : 0000000000000000
       Call trace:
      __sg_alloc_table_from_pages+0xec/0x238
      sg_alloc_table_from_pages+0x18/0x28
      iommu_dma_alloc+0x474/0x678
      dma_alloc_attrs+0xd8/0xf0
      nvme_alloc_queue+0x114/0x160 [nvme]
      nvme_reset_work+0xb34/0x14b4 [nvme]
      process_one_work+0x1e8/0x360
      worker_thread+0x44/0x478
      kthread+0x150/0x158
      ret_from_fork+0x10/0x34
       Code: f94002c3 6b01017f 540007c2 11000486 (f8645aa5)
      ---[ end trace 89bb2b72d59bf925 ]---
      
      Fix by making onto a u32.
      
      Also use u32 for nvme_dev.q_depth, as we assign this value from
      nvme_dev.q_depth, and nvme_dev.q_depth will possibly hold 65536 - this
      avoids the same crash as above.
      
      Fixes: 61f3b896
      
       ("nvme-pci: use unsigned for io queue depth")
      Signed-off-by: default avatarJohn Garry <john.garry@huawei.com>
      Reviewed-by: default avatarKeith Busch <kbusch@kernel.org>
      Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      7442ddce
    • Logan Gunthorpe's avatar
      nvme: Use spin_lock_irq() when taking the ctrl->lock · ecbcdf0c
      Logan Gunthorpe authored
      When locking the ctrl->lock spinlock IRQs need to be disabled to avoid a
      dead lock. The new spin_lock() calls recently added produce the
      following lockdep warning when running the blktest nvme/003:
      
          ================================
          WARNING: inconsistent lock state
          --------------------------------
          inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
          ksoftirqd/2/22 [HC0[0]:SC1[1]:HE0:SE0] takes:
          ffff888276a8c4c0 (&ctrl->lock){+.?.}-{2:2}, at: nvme_keep_alive_end_io+0x50/0xc0
          {SOFTIRQ-ON-W} state was registered at:
            lock_acquire+0x164/0x500
            _raw_spin_lock+0x28/0x40
            nvme_get_effects_log+0x37/0x1c0
            nvme_init_identify+0x9e4/0x14f0
            nvme_reset_work+0xadd/0x2360
            process_one_work+0x66b/0xb70
            worker_thread+0x6e/0x6c0
            kthread+0x1e7/0x210
            ret_from_fork+0x22/0x30
          irq event stamp: 1449221
          hardirqs last  enabled at (1449220): [<ffffffff81c58e69>] ktime_get+0xf9/0x140
          hardirqs last disabled at (1449221): [<ffffffff83129665>] _raw_spin_lock_irqsave+0x25/0x60
          softirqs last  enabled at (1449210): [<ffffffff83400447>] __do_softirq+0x447/0x595
          softirqs last disabled at (1449215): [<ffffffff81b489b5>] run_ksoftirqd+0x35/0x50
      
          other info that might help us debug this:
           Possible unsafe locking scenario:
      
                 CPU0
                 ----
            lock(&ctrl->lock);
            <Interrupt>
              lock(&ctrl->lock);
      
           *** DEADLOCK ***
      
          no locks held by ksoftirqd/2/22.
      
          stack backtrace:
          CPU: 2 PID: 22 Comm: ksoftirqd/2 Not tainted 5.8.0-rc4-eid-vmlocalyes-dbg-00157-g7236657c6b3a #1450
          Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
          Call Trace:
           dump_stack+0xc8/0x11a
           print_usage_bug.cold.63+0x235/0x23e
           mark_lock+0xa9c/0xcf0
           __lock_acquire+0xd9a/0x2b50
           lock_acquire+0x164/0x500
           _raw_spin_lock_irqsave+0x40/0x60
           nvme_keep_alive_end_io+0x50/0xc0
           blk_mq_end_request+0x158/0x210
           nvme_complete_rq+0x146/0x500
           nvme_loop_complete_rq+0x26/0x30 [nvme_loop]
           blk_done_softirq+0x187/0x1e0
           __do_softirq+0x118/0x595
           run_ksoftirqd+0x35/0x50
           smpboot_thread_fn+0x1d3/0x310
           kthread+0x1e7/0x210
           ret_from_fork+0x22/0x30
      
      Fixes: be93e87e
      
       ("nvme: support for multiple Command Sets Supported and Effects log pages")
      Signed-off-by: default avatarLogan Gunthorpe <logang@deltatee.com>
      Reviewed-by: default avatarKeith Busch <kbusch@kernel.org>
      Tested-by: default avatarChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: default avatarChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      ecbcdf0c
    • Chaitanya Kulkarni's avatar
      nvmet: call blk_mq_free_request() directly · 7ee51cf6
      Chaitanya Kulkarni authored
      
      
      Instead of calling blk_put_request() which calls blk_mq_free_request(),
      call blk_mq_free_request() directly for NVMeOF passthru. This is to
      mainly avoid an extra function call in the completion path
      nvmet_passthru_req_done().
      
      Signed-off-by: default avatarChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: default avatarKeith Busch <kbusch@kernel.org>
      Reviewed-by: default avatarLogan Gunthorpe <logang@deltatee.com>
      Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      7ee51cf6
    • Chaitanya Kulkarni's avatar
      nvmet: fix oops in pt cmd execution · a2138fd4
      Chaitanya Kulkarni authored
      
      
      In the existing NVMeOF Passthru core command handling on failure of
      nvme_alloc_request() it errors out with rq value set to NULL. In the
      error handling path it calls blk_put_request() without checking if
      rq is set to NULL or not which produces following Oops:-
      
      [ 1457.346861] BUG: kernel NULL pointer dereference, address: 0000000000000000
      [ 1457.347838] #PF: supervisor read access in kernel mode
      [ 1457.348464] #PF: error_code(0x0000) - not-present page
      [ 1457.349085] PGD 0 P4D 0
      [ 1457.349402] Oops: 0000 [#1] SMP NOPTI
      [ 1457.349851] CPU: 18 PID: 10782 Comm: kworker/18:2 Tainted: G           OE     5.8.0-rc4nvme-5.9+ #35
      [ 1457.350951] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e3214
      [ 1457.352347] Workqueue: events nvme_loop_execute_work [nvme_loop]
      [ 1457.353062] RIP: 0010:blk_mq_free_request+0xe/0x110
      [ 1457.353651] Code: 3f ff ff ff 83 f8 01 75 0d 4c 89 e7 e8 1b db ff ff e9 2d ff ff ff 0f 0b eb ef 66 8
      [ 1457.355975] RSP: 0018:ffffc900035b7de0 EFLAGS: 00010282
      [ 1457.356636] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
      [ 1457.357526] RDX: ffffffffa060bd05 RSI: 0000000000000000 RDI: 0000000000000000
      [ 1457.358416] RBP: 0000000000000037 R08: 0000000000000000 R09: 0000000000000000
      [ 1457.359317] R10: 0000000000000000 R11: 000000000000006d R12: 0000000000000000
      [ 1457.360424] R13: ffff8887ffa68600 R14: 0000000000000000 R15: ffff8888150564c8
      [ 1457.361322] FS:  0000000000000000(0000) GS:ffff888814600000(0000) knlGS:0000000000000000
      [ 1457.362337] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1457.363058] CR2: 0000000000000000 CR3: 000000081c0ac000 CR4: 00000000003406e0
      [ 1457.363973] Call Trace:
      [ 1457.364296]  nvmet_passthru_execute_cmd+0x150/0x2c0 [nvmet]
      [ 1457.364990]  process_one_work+0x24e/0x5a0
      [ 1457.365493]  ? __schedule+0x353/0x840
      [ 1457.365957]  worker_thread+0x3c/0x380
      [ 1457.366426]  ? process_one_work+0x5a0/0x5a0
      [ 1457.366948]  kthread+0x135/0x150
      [ 1457.367362]  ? kthread_create_on_node+0x60/0x60
      [ 1457.367934]  ret_from_fork+0x22/0x30
      [ 1457.368388] Modules linked in: nvme_loop(OE) nvmet(OE) nvme_fabrics(OE) null_blk nvme(OE) nvme_corer
      [ 1457.368414]  ata_piix crc32c_intel virtio_pci libata virtio_ring serio_raw t10_pi virtio floppy dm_]
      [ 1457.380849] CR2: 0000000000000000
      [ 1457.381288] ---[ end trace c6cab61bfd1f68fd ]---
      [ 1457.381861] RIP: 0010:blk_mq_free_request+0xe/0x110
      [ 1457.382469] Code: 3f ff ff ff 83 f8 01 75 0d 4c 89 e7 e8 1b db ff ff e9 2d ff ff ff 0f 0b eb ef 66 8
      [ 1457.384749] RSP: 0018:ffffc900035b7de0 EFLAGS: 00010282
      [ 1457.385393] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
      [ 1457.386264] RDX: ffffffffa060bd05 RSI: 0000000000000000 RDI: 0000000000000000
      [ 1457.387142] RBP: 0000000000000037 R08: 0000000000000000 R09: 0000000000000000
      [ 1457.388029] R10: 0000000000000000 R11: 000000000000006d R12: 0000000000000000
      [ 1457.388914] R13: ffff8887ffa68600 R14: 0000000000000000 R15: ffff8888150564c8
      [ 1457.389798] FS:  0000000000000000(0000) GS:ffff888814600000(0000) knlGS:0000000000000000
      [ 1457.390796] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1457.391508] CR2: 0000000000000000 CR3: 000000081c0ac000 CR4: 00000000003406e0
      [ 1457.392525] Kernel panic - not syncing: Fatal exception
      [ 1457.394138] Kernel Offset: disabled
      [ 1457.394677] ---[ end Kernel panic - not syncing: Fatal exception ]---
      
      We fix this Oops by adding a new goto label out_put_req and reordering
      the blk_put_request call to avoid calling blk_put_request() with rq
      value is set to NULL. Here we also update the rest of the code
      accordingly.
      
      Fixes: 06b7164dfdc0 ("nvmet: add passthru code to process commands")
      Signed-off-by: default avatarChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: default avatarLogan Gunthorpe <logang@deltatee.com>
      Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      a2138fd4