  1. Dec 23, 2019
    • libbpf: Fix build on read-only filesystems · fa633a0f
      Namhyung Kim authored
      
      
      I got the following error when I tried to build perf on a read-only
      filesystem with O=dir option.
      
        $ cd /some/where/ro/linux/tools/perf
        $ make O=$HOME/build/perf
        ...
          CC       /home/namhyung/build/perf/lib.o
        /bin/sh: bpf_helper_defs.h: Read-only file system
        make[3]: *** [Makefile:184: bpf_helper_defs.h] Error 1
        make[2]: *** [Makefile.perf:778: /home/namhyung/build/perf/libbpf.a] Error 2
        make[2]: *** Waiting for unfinished jobs....
          LD       /home/namhyung/build/perf/libperf-in.o
          AR       /home/namhyung/build/perf/libperf.a
          PERF_VERSION = 5.4.0
        make[1]: *** [Makefile.perf:225: sub-make] Error 2
        make: *** [Makefile:70: all] Error 2
      
      This happened because bpf_helper_defs.h was generated in the current
      directory. Generate it in the OUTPUT directory instead.
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/20191223061326.843366-1-namhyung@kernel.org
      fa633a0f
    • bpf: Fix precision tracking for unbounded scalars · f54c7898
      Daniel Borkmann authored
      Anatoly has been fuzzing with kBdysch harness and reported a hang in one
      of the outcomes. Upon closer analysis, it turns out that precise scalar
      value tracking is missing a few precision markings for unknown scalars:
      
        0: R1=ctx(id=0,off=0,imm=0) R10=fp0
        0: (b7) r0 = 0
        1: R0_w=invP0 R1=ctx(id=0,off=0,imm=0) R10=fp0
        1: (35) if r0 >= 0xf72e goto pc+0
        --> only follow fallthrough
        2: R0_w=invP0 R1=ctx(id=0,off=0,imm=0) R10=fp0
        2: (35) if r0 >= 0x80fe0000 goto pc+0
        --> only follow fallthrough
        3: R0_w=invP0 R1=ctx(id=0,off=0,imm=0) R10=fp0
        3: (14) w0 -= -536870912
        4: R0_w=invP536870912 R1=ctx(id=0,off=0,imm=0) R10=fp0
        4: (0f) r1 += r0
        5: R0_w=invP536870912 R1_w=inv(id=0) R10=fp0
        5: (55) if r1 != 0x104c1500 goto pc+0
        --> push other branch for later analysis
        R0_w=invP536870912 R1_w=inv273421568 R10=fp0
        6: R0_w=invP536870912 R1_w=inv273421568 R10=fp0
        6: (b7) r0 = 0
        7: R0=invP0 R1=inv273421568 R10=fp0
        7: (76) if w1 s>= 0xffffff00 goto pc+3
        --> only follow goto
        11: R0=invP0 R1=inv273421568 R10=fp0
        11: (95) exit
        6: R0_w=invP536870912 R1_w=inv(id=0) R10=fp0
        6: (b7) r0 = 0
        propagating r0
        7: safe
        processed 11 insns [...]
      
      In the analysis of the second path, coming after the successful exit above,
      the path is pruned at insn 7. The pruning analysis found that R0 is precise
      P0 in both states and R1 is a non-precise scalar in both states; given that
      the prior path with R1 as a non-precise scalar succeeded, this one is
      therefore considered safe as well.
      
      However, the problem is that, given the condition at insn 7 in the first run,
      we only followed the goto and didn't push the other branch for later analysis.
      We've never walked the few insns in there, and therefore dead-code sanitation
      rewrites them as goto pc-1, causing the hang depending on the skb address
      hitting these conditions. The issue is that R1 should have been marked as
      precise as well, such that pruning enforces the range check and concludes
      that the new R1 is not in range of the old R1. In insn 4, we mark R1 (skb) as
      unknown scalar via __mark_reg_unbounded() but not mark_reg_unbounded() and
      therefore regs->precise remains as false.
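
      The pruning decision described above can be sketched with a toy model. This
      is not the verifier's code: ScalarReg, range_within() and scalar_safe() are
      hypothetical Python stand-ins that track only unsigned bounds and the
      precise flag, under the assumption that two non-precise scalars are always
      treated as equivalent.

```python
from dataclasses import dataclass

U64_MAX = (1 << 64) - 1


@dataclass
class ScalarReg:
    """Toy scalar register: unsigned bounds plus the 'precise' flag."""
    umin: int = 0
    umax: int = U64_MAX
    precise: bool = False


def range_within(old: ScalarReg, cur: ScalarReg) -> bool:
    """True if every value cur may hold was already covered by old."""
    return old.umin <= cur.umin and cur.umax <= old.umax


def scalar_safe(old: ScalarReg, cur: ScalarReg) -> bool:
    """Decide whether the current path may be pruned against an old state."""
    if not old.precise and not cur.precise:
        # Neither value is claimed to matter, so any scalar is accepted.
        return True
    # A precise value matters: the new range must fit inside the old one.
    return range_within(old, cur)


def r1_states(precise: bool):
    """R1 at insn 7: a known constant on the first path, unknown on the second."""
    old_r1 = ScalarReg(umin=0x104C1500, umax=0x104C1500, precise=precise)
    new_r1 = ScalarReg(precise=precise)
    return old_r1, new_r1


# Without the precision mark the second path is pruned; with it, it is walked.
pruned_without_mark = scalar_safe(*r1_states(precise=False))
pruned_with_mark = scalar_safe(*r1_states(precise=True))
print(pruned_without_mark, pruned_with_mark)
```

      In this model the missing precision mark is the only thing that lets the
      unknown R1 be accepted against the old constant R1.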
      
      Back in b5dc0163 ("bpf: precise scalar_value tracking"), this was not
      the case since marking out of __mark_reg_unbounded() had this covered as well.
      Once both are set as precise in insn 4, as they should have been, pruning
      sees that R1 was 0x104c1500 in the prior fall-through path and is now
      completely unknown, so the check at insn 7 concludes that we need to
      continue walking.
      Analysis after the fix:
      
        0: R1=ctx(id=0,off=0,imm=0) R10=fp0
        0: (b7) r0 = 0
        1: R0_w=invP0 R1=ctx(id=0,off=0,imm=0) R10=fp0
        1: (35) if r0 >= 0xf72e goto pc+0
        2: R0_w=invP0 R1=ctx(id=0,off=0,imm=0) R10=fp0
        2: (35) if r0 >= 0x80fe0000 goto pc+0
        3: R0_w=invP0 R1=ctx(id=0,off=0,imm=0) R10=fp0
        3: (14) w0 -= -536870912
        4: R0_w=invP536870912 R1=ctx(id=0,off=0,imm=0) R10=fp0
        4: (0f) r1 += r0
        5: R0_w=invP536870912 R1_w=invP(id=0) R10=fp0
        5: (55) if r1 != 0x104c1500 goto pc+0
        R0_w=invP536870912 R1_w=invP273421568 R10=fp0
        6: R0_w=invP536870912 R1_w=invP273421568 R10=fp0
        6: (b7) r0 = 0
        7: R0=invP0 R1=invP273421568 R10=fp0
        7: (76) if w1 s>= 0xffffff00 goto pc+3
        11: R0=invP0 R1=invP273421568 R10=fp0
        11: (95) exit
        6: R0_w=invP536870912 R1_w=invP(id=0) R10=fp0
        6: (b7) r0 = 0
        7: R0_w=invP0 R1_w=invP(id=0) R10=fp0
        7: (76) if w1 s>= 0xffffff00 goto pc+3
        R0_w=invP0 R1_w=invP(id=0) R10=fp0
        8: R0_w=invP0 R1_w=invP(id=0) R10=fp0
        8: (a5) if r0 < 0x2007002a goto pc+0
        9: R0_w=invP0 R1_w=invP(id=0) R10=fp0
        9: (57) r0 &= -16316416
        10: R0_w=invP0 R1_w=invP(id=0) R10=fp0
        10: (a6) if w0 < 0x1201 goto pc+0
        11: R0_w=invP0 R1_w=invP(id=0) R10=fp0
        11: (95) exit
        11: R0=invP0 R1=invP(id=0) R10=fp0
        11: (95) exit
        processed 16 insns [...]
      
      Fixes: 6754172c ("bpf: fix precision tracking in presence of bpf2bpf calls")
      Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20191222223740.25297-1-daniel@iogearbox.net
      f54c7898
    • Merge tag 'xfs-5.5-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · c6017471
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
       "Fix a few bugs that could lead to corrupt files, fsck complaints, and
        filesystem crashes:
      
         - Minor documentation fixes
      
         - Fix a file corruption due to read racing with an insert range
           operation.
      
         - Fix log reservation overflows when allocating large rt extents
      
         - Fix a buffer log item flags check
      
         - Don't allow administrators to mount with sunit= options that will
           cause later xfs_repair complaints about the root directory being
           suspicious because the fs geometry appeared inconsistent
      
         - Fix a non-static helper that should have been static"
      
      * tag 'xfs-5.5-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: Make the symbol 'xfs_rtalloc_log_count' static
        xfs: don't commit sunit/swidth updates to disk if that would cause repair failures
        xfs: split the sunit parameter update into two parts
        xfs: refactor agfl length computation function
        libxfs...
      c6017471
    • Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · a3965607
      Linus Torvalds authored
      Pull ext4 bug fixes from Ted Ts'o:
       "Ext4 bug fixes, including a regression fix"
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: clarify impact of 'commit' mount option
        ext4: fix unused-but-set-variable warning in ext4_add_entry()
        jbd2: fix kernel-doc notation warning
        ext4: use RCU API in debug_print_tree
        ext4: validate the debug_want_extra_isize mount option at parse time
        ext4: reserve revoke credits in __ext4_new_inode
        ext4: unlock on error in ext4_expand_extra_isize()
        ext4: optimize __ext4_check_dir_entry()
        ext4: check for directory entries too close to block end
        ext4: fix ext4_empty_dir() for directories with holes
      a3965607
    • Merge tag 'block-5.5-20191221' of git://git.kernel.dk/linux-block · 44579f35
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Let's try this one again, this time without the compat_ioctl changes.
        We've got those fixed up, but that can go out next week.
      
        This contains:
      
         - block queue flush lockdep annotation (Bart)
      
         - Type fix for bsg_queue_rq() (Bart)
      
         - Three dasd fixes (Stefan, Jan)
      
         - nbd deadlock fix (Mike)
      
         - Error handling bio user map fix (Yang)
      
         - iocost fix (Tejun)
      
         - sbitmap waitqueue addition fix that affects the kyber IO scheduler
           (David)"
      
      * tag 'block-5.5-20191221' of git://git.kernel.dk/linux-block:
        sbitmap: only queue kyber's wait callback if not already active
        block: fix memleak when __blk_rq_map_user_iov() is failed
        s390/dasd: fix typo in copyright statement
        s390/dasd: fix memleak in path handling error case
        s390/dasd/cio: Interpret ccw_device_get_mdc return value correctly
        block: Fix a lockdep complaint triggered by request queue flushing
        block: Fix the type of 'sts' in bsg_queue_rq()
        block: end bio with BLK_STS_AGAIN in case of non-mq devs and REQ_NOWAIT
        nbd: fix shutdown and recv work deadlock v2
        iocost: over-budget forced IOs should schedule async delay
      44579f35
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · a313c8e0
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
       "PPC:
         - Fix a bug where we try to do an ultracall on a system without an
           ultravisor
      
        KVM:
         - Fix uninitialised sysreg accessor
         - Fix handling of demand-paged device mappings
         - Stop spamming the console on IMPDEF sysregs
         - Relax mappings of writable memslots
         - Assorted cleanups
      
        MIPS:
         - Now orphan, James Hogan is stepping down
      
        x86:
         - MAINTAINERS change, so long Radim and thanks for all the fish
         - supported CPUID fixes for AMD machines without SPEC_CTRL"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        MAINTAINERS: remove Radim from KVM maintainers
        MAINTAINERS: Orphan KVM for MIPS
        kvm: x86: Host feature SSBD doesn't imply guest feature AMD_SSBD
        kvm: x86: Host feature SSBD doesn't imply guest feature SPEC_CTRL_SSBD
        KVM: PPC: Book3S HV: Don't do ultravisor calls on systems without ultravisor
        KVM: arm/arm64: Properly handle faulting of device mappings
        KVM: arm64: Ensure 'params' is initialised when looking up sys register
        KVM: arm/arm64: Remove excessive permission check in kvm_arch_prepare_memory_region
        KVM: arm64: Don't log IMP DEF sysreg traps
        KVM: arm64: Sanely ratelimit sysreg messages
        KVM: arm/arm64: vgic: Use wrapper function to lock/unlock all vcpus in kvm_vgic_create()
        KVM: arm/arm64: vgic: Fix potential double free dist->spis in __kvm_vgic_destroy()
        KVM: arm/arm64: Get rid of unused arg in cpu_init_hyp_mode()
      a313c8e0
    • Merge tag 'riscv/for-v5.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 7214618c
      Linus Torvalds authored
      Pull RISC-V fixes from Paul Walmsley:
       "Several fixes, and one cleanup, for RISC-V.
      
        Fixes:
      
         - Fix an error in a Kconfig file that resulted in an undefined
           Kconfig option "CONFIG_CONFIG_MMU"
      
         - Fix undefined Kconfig option "CONFIG_CONFIG_MMU"
      
         - Fix scratch register clearing in M-mode (affects nommu users)
      
         - Fix a mismerge on my part that broke the build for
           CONFIG_SPARSEMEM_VMEMMAP users
      
        Cleanup:
      
         - Move SiFive L2 cache-related code to drivers/soc, per request"
      
      * tag 'riscv/for-v5.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: move sifive_l2_cache.c to drivers/soc
        riscv: define vmemmap before pfn_to_page calls
        riscv: fix scratch register clearing in M-mode.
        riscv: Fix use of undefined config option CONFIG_CONFIG_MMU
      7214618c
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 78bac77b
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Several nf_flow_table_offload fixes from Pablo Neira Ayuso,
          including adding a missing ipv6 match description.
      
       2) Several heap overflow fixes in mwifiex from qize wang and Ganapathi
          Bhat.
      
       3) Fix uninit value in bond_neigh_init(), from Eric Dumazet.
      
       4) Fix non-ACPI probing of nxp-nci, from Stephan Gerhold.
      
       5) Fix use after free in tipc_disc_rcv(), from Tuong Lien.
      
       6) Enforce limit of 33 tail calls in mips and riscv JIT, from Paul
          Chaignon.
      
       7) Multicast MAC limit test is off by one in qede, from Manish Chopra.
      
       8) Fix established socket lookup race when socket goes from
          TCP_ESTABLISHED to TCP_LISTEN, caused by a missing intervening
          RCU grace period. From Eric Dumazet.
      
       9) Don't send empty SKBs from tcp_write_xmit(), also from Eric Dumazet.
      
      10) Fix active backup transition after link failure in bonding, from
          Mahesh Bandewar.
      
      11) Avoid zero sized hash table in...
      78bac77b
    • pipe: fix empty pipe check in pipe_write() · 0dd1e377
      Jan Stancek authored
      LTP pipeio_1 test is hanging with v5.5-rc2-385-gb8e382a185eb,
      with read side observing empty pipe and sleeping and write
      side running out of space and then sleeping as well. In this
      scenario there are 5 writers and 1 reader.
      
      Problem is that after pipe_write() reacquires pipe lock, it
      re-checks for empty pipe with potentially stale 'head' and
      doesn't wake up read side anymore. pipe->tail can advance
      beyond 'head', because there are multiple writers.
      
      Use pipe->head for empty pipe check after reacquiring lock
      to observe current state.
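
      The stale-head problem can be illustrated with a toy ring-buffer model. This
      is a simplified sketch, not the kernel code: ToyPipe and its methods are
      hypothetical, locking and wakeups are left out, and only the head/tail
      comparison is modelled.

```python
def pipe_empty(head: int, tail: int) -> bool:
    """A ring-buffer pipe is empty when head and tail coincide."""
    return head == tail


class ToyPipe:
    """Counters only: writers advance head, the reader advances tail."""

    def __init__(self):
        self.head = 0
        self.tail = 0

    def write(self, nbufs: int = 1) -> None:
        self.head += nbufs

    def read_all(self) -> None:
        self.tail = self.head


pipe = ToyPipe()
pipe.write(3)
stale_head = pipe.head   # writer A remembers head, then sleeps without the lock

pipe.read_all()          # the reader drains the pipe
pipe.write(2)            # other writers add more data
pipe.read_all()          # the reader drains again; tail is now past stale_head

# Writer A wakes up, reacquires the lock and checks whether the pipe is empty.
was_empty_stale = pipe_empty(stale_head, pipe.tail)   # check with the old snapshot
was_empty_fresh = pipe_empty(pipe.head, pipe.tail)    # check with the current head
print(was_empty_stale, was_empty_fresh)
```

      With the stale snapshot the pipe looks non-empty, so in this model the
      writer would skip waking the reader even though the pipe really is empty.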
      
      Testing: With the patch, LTP pipeio_1 ran successfully in a loop for 1 hour.
               Without the patch, it hung within a minute.
      
      Fixes: 1b6b26ae ("pipe: fix and clarify pipe write wakeup logic")
      Reported-by: Rachel Sibley <rasibley@redhat.com>
      Signed-off-by: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0dd1e377
  2. Dec 22, 2019
  3. Dec 21, 2019
    • kbuild: clarify the difference between obj-y and obj-m w.r.t. descending · 28f94a44
      Masahiro Yamada authored
      
      
      Kbuild descends into a directory by either 'y' or 'm', but there is an
      important difference.
      
      Kbuild combines the built-in objects into built-in.a in each directory.
      The built-in.a in the directory visited by obj-y is merged into the
      built-in.a in the parent directory. This merge happens recursively
      when Kbuild is ascending back towards the top directory, then built-in
      objects are linked into vmlinux eventually. This works properly only
      when the Makefile specifying obj-y is reachable by the chain of obj-y.
      
      On the other hand, Kbuild does not take built-in.a from the directory
      visited by obj-m. That is, all the objects in that directory are
      supposed to be modular. If Kbuild descends into a directory by obj-m,
      but the Makefile in the sub-directory specifies obj-y, those objects
      are just left orphaned.
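
      The merge rule can be sketched as a small recursive walk. This is only an
      illustrative model, not Kbuild itself: the dictionary layout, the
      collect_builtin() helper and the object names are all made up for the
      example.

```python
def collect_builtin(directory: dict) -> list:
    """Return the objects that end up in this directory's built-in.a."""
    objs = list(directory.get("obj-y", []))
    for how, subdir in directory.get("subdirs", []):
        # Kbuild descends into the sub-directory for both 'y' and 'm'.
        sub_builtin = collect_builtin(subdir)
        if how == "y":
            # Only a built-in.a reached via obj-y is merged into the parent.
            objs.extend(sub_builtin)
        # For 'm' the sub-directory's built-in.a is not taken, so any obj-y
        # objects listed there never reach vmlinux.
    return objs


tree = {
    "obj-y": ["core.o"],
    "subdirs": [
        ("y", {"obj-y": ["a.o"]}),
        ("m", {"obj-y": ["b.o"], "obj-m": ["c.o"]}),
    ],
}

vmlinux_objs = collect_builtin(tree)
print(vmlinux_objs)
```

      In this model b.o is compiled but never linked, which is the orphaned
      object case described above.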
      
      The current statement "Kbuild only uses this information to decide that
      it needs to visit the directory" is misleading. Clarify the difference.
      
      Reported-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Johan Hovold <johan@kernel.org>
      28f94a44
    • Merge branch 'parisc-5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 62104694
      Linus Torvalds authored
      Pull parisc fixes from Helge Deller:
       "Two build error fixes, one for the soft_offline_page() parameter
        change and one for a specific KEXEC/KEXEC_FILE configuration, as well
        as a compiler and a linker warning fix"
      
      * 'parisc-5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: Fix compiler warnings in debug_core.c
        parisc: soft_offline_page() now takes the pfn
        parisc: add missing __init annotation
        parisc: fix compilation when KEXEC=n and KEXEC_FILE=y
      62104694
    • Merge tag 'for-linus-5.5b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 62af608b
      Linus Torvalds authored
      Pull xen fixes from Juergen Gross:
       "This contains two cleanup patches and a small series for supporting
        reloading the Xen block backend driver"
      
      * tag 'for-linus-5.5b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/grant-table: remove multiple BUG_ON on gnttab_interface
        xen-blkback: support dynamic unbind/bind
        xen/interface: re-define FRONT/BACK_RING_ATTACH()
        xenbus: limit when state is forced to closed
        xenbus: move xenbus_dev_shutdown() into frontend code...
        xen/blkfront: Adjust indentation in xlvbd_alloc_gendisk
      62af608b
    • Merge tag 'powerpc-5.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 6d04182d
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "Two weeks' worth of accumulated fixes:
      
         - A fix for a performance regression seen on PowerVM LPARs using
           dedicated CPUs, caused by our vcpu_is_preempted() returning true
           even for idle CPUs.
      
         - One of the ultravisor support patches broke KVM on big endian hosts
           in v5.4.
      
         - Our KUAP (Kernel User Access Prevention) code missed allowing
           access in __clear_user(), which could lead to an oops or erroneous
           SEGV when triggered via PTRACE_GETREGSET.
      
         - Two fixes for the ocxl driver, an open/remove race, and a memory
           leak in an error path.
      
         - A handful of other small fixes.
      
        Thanks to: Andrew Donnellan, Christian Zigotzky, Christophe Leroy,
        Christoph Hellwig, Daniel Axtens, David Hildenbrand, Frederic Barrat,
        Gautham R. Shenoy, Greg Kurz, Ihor Pasichnyk, Juri Lelli, Marcus
        Comstedt, Mike Rapoport, Parth Shah, Srikar Dronamraju, Vaidyanathan
        Srinivasan"
      
      * tag 'powerpc-5.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        KVM: PPC: Book3S HV: Fix regression on big endian hosts
        powerpc: Fix __clear_user() with KUAP enabled
        powerpc/pseries/cmm: fix managed page counts when migrating pages between zones
        powerpc/8xx: fix bogus __init on mmu_mapin_ram_chunk()
        ocxl: Fix potential memory leak on context creation
        powerpc/irq: fix stack overflow verification
        powerpc: Ensure that swiotlb buffer is allocated from low memory
        powerpc/shared: Use static key to detect shared processor
        powerpc/vcpu: Assume dedicated processors as non-preempt
        ocxl: Fix concurrent AFU open and device removal
      6d04182d
    • Merge branch 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5c741e25
      Linus Torvalds authored
      Pull x86 RAS fixes from Borislav Petkov:
       "Three urgent RAS fixes for the AMD side of things:
      
         - initialize struct mce.bank so that calculated error severity on AMD
           SMCA machines is correct
      
         - do not send IPIs early during bank initialization, when interrupts
           are disabled
      
         - a fix for when only a subset of MCA banks are enabled, which led to
           boot hangs on some new AMD CPUs"
      
      * 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mce: Fix possibly incorrect severity calculation on AMD
        x86/MCE/AMD: Allow Reserved types to be overwritten in smca_banks[]
        x86/MCE/AMD: Do not use rdmsr_safe_on_cpu() in smca_configure()
      5c741e25
    • Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 12ac9a08
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "One core framework fix to walk the orphan list and match up clks to
        parents when clk providers register the DT provider after registering
        all their clks (as they should).
      
        Then a handful of driver fixes for the qcom, imx, and at91 drivers.
      
        The driver fixes are relatively small fixes for incorrect register
        settings or missing locks causing race conditions"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: qcom: Avoid SMMU/cx gdsc corner cases
        clk: qcom: gcc-sc7180: Fix setting flag for votable GDSCs
        clk: Move clk_core_reparent_orphans() under CONFIG_OF
        clk: at91: fix possible deadlock
        clk: walk orphan list on clock provider registration
        clk: imx: pll14xx: fix clk_pll14xx_wait_lock
        clk: imx: clk-imx7ulp: Add missing sentinel of ulp_div_table
        clk: imx: clk-composite-8m: add lock to gate/mux
      12ac9a08
    • Merge branch 'sfc-fix-bugs-introduced-by-XDP-patches' · 4bfeadfc
      David S. Miller authored
      
      
      Edward Cree says:
      
      ====================
      sfc: fix bugs introduced by XDP patches
      
      Two fixes for bugs introduced by the XDP support in the sfc driver.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4bfeadfc
    • sfc: Include XDP packet headroom in buffer step size. · 11a14dc8
      Charles McLachlan authored
      Correct a mismatch between rx_page_buf_step and the actual step size
      used when filling buffer pages.
      
      This patch fixes the page overrun that occurred when the MTU was set to
      anything bigger than 1692.
      
      Fixes: 3990a8ff ("sfc: allocate channels for XDP tx queues")
      Signed-off-by: Charles McLachlan <cmclachlan@solarflare.com>
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      11a14dc8
    • sfc: fix channel allocation with brute force · 8700aff0
      Edward Cree authored
      It was possible for channel allocation logic to get confused between what
       it had and what it wanted, and end up trying to use the same channel for
       both PTP and regular TX.  This led to a kernel panic:
          BUG: unable to handle page fault for address: 0000000000047635
          #PF: supervisor write access in kernel mode
          #PF: error_code(0x0002) - not-present page
          PGD 0 P4D 0
          Oops: 0002 [#1] SMP PTI
          CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0-rc3-ehc14+ #900
          Hardware name: Dell Inc. PowerEdge R710/0M233H, BIOS 6.4.0 07/23/2013
          RIP: 0010:native_queued_spin_lock_slowpath+0x188/0x1e0
          Code: f3 90 48 8b 32 48 85 f6 74 f6 eb e8 c1 ee 12 83 e0 03 83 ee 01 48 c1 e0 05 48 63 f6 48 05 c0 98 02 00 48 03 04 f5 a0 c6 ed 81 <48> 89 10 8b 42 08 85 c0 75 09 f3 90 8b 42 08 85 c0 74 f7 48 8b 32
          RSP: 0018:ffffc90000003d28 EFLAGS: 00010006
          RAX: 0000000000047635 RBX: 0000000000000246 RCX: 0000000000040000
          RDX: ffff888627a298c0 RSI: 0000000000003ffe RDI: ffff88861f6b8dd4
          RBP: ffff8886225c6e00 R08: 0000000000040000 R09: 0000000000000000
          R10: 0000000616f080c6 R11: 00000000000000c0 R12: ffff88861f6b8dd4
          R13: ffffc90000003dc8 R14: ffff88861942bf00 R15: ffff8886150f2000
          FS:  0000000000000000(0000) GS:ffff888627a00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 0000000000047635 CR3: 000000000200a000 CR4: 00000000000006f0
          Call Trace:
           <IRQ>
           _raw_spin_lock_irqsave+0x22/0x30
           skb_queue_tail+0x1b/0x50
           sock_queue_err_skb+0x9d/0xf0
           __skb_complete_tx_timestamp+0x9d/0xc0
           efx_dequeue_buffer+0x126/0x180 [sfc]
           efx_xmit_done+0x73/0x1c0 [sfc]
           efx_ef10_ev_process+0x56a/0xfe0 [sfc]
           ? tick_sched_do_timer+0x60/0x60
           ? timerqueue_add+0x5d/0x70
           ? enqueue_hrtimer+0x39/0x90
           efx_poll+0x111/0x380 [sfc]
           ? rcu_accelerate_cbs+0x50/0x160
           net_rx_action+0x14a/0x400
           __do_softirq+0xdd/0x2d0
           irq_exit+0xa0/0xb0
           do_IRQ+0x53/0xe0
           common_interrupt+0xf/0xf
           </IRQ>
      
      In the long run we intend to rewrite the channel allocation code, but for
       'net' fix this by allocating extra_channels, and giving them TX queues,
       even if we do not in fact need them (e.g. on NICs without MAC TX
       timestamping), and thereby using simpler logic to assign the channels
       once they're allocated.
      
      Fixes: 3990a8ff ("sfc: allocate channels for XDP tx queues")
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8700aff0
    • net: dst: Force 4-byte alignment of dst_metrics · 258a980d
      Geert Uytterhoeven authored
      When storing a pointer to a dst_metrics structure in dst_entry._metrics,
      two flags are added in the least significant bits of the pointer value.
      Hence this assumes all pointers to dst_metrics structures have at least
      4-byte alignment.
      
      However, on m68k, the minimum alignment of 32-bit values is 2 bytes, not
      4 bytes.  Hence in some kernel builds, dst_default_metrics may be only
      2-byte aligned, leading to obscure boot warnings like:
      
          WARNING: CPU: 0 PID: 7 at lib/refcount.c:28 refcount_warn_saturate+0x44/0x9a
          refcount_t: underflow; use-after-free.
          Modules linked in:
          CPU: 0 PID: 7 Comm: ksoftirqd/0 Tainted: G        W         5.5.0-rc2-atari-01448-g114a1a1038af891d-dirty #261
          Stack from 10835e6c:
      	    10835e6c 0038134f 00023fa6 00394b0f 0000001c 00000009 00321560 00023fea
      	    00394b0f 0000001c 001a70f8 00000009 00000000 10835eb4 00000001 00000000
      	    04208040 0000000a 00394b4a 10835ed4 00043aa8 001a70f8 00394b0f 0000001c
      	    00000009 00394b4a 0026aba8 003215a4 00000003 00000000 0026d5a8 00000001
      	    003215a4 003a4361 003238d6 000001f0 00000000 003215a4 10aa3b00 00025e84
      	    003ddb00 10834000 002416a8 10aa3b00 00000000 00000080 000aa038 0004854a
          Call Trace: [<00023fa6>] __warn+0xb2/0xb4
           [<00023fea>] warn_slowpath_fmt+0x42/0x64
           [<001a70f8>] refcount_warn_saturate+0x44/0x9a
           [<00043aa8>] printk+0x0/0x18
           [<001a70f8>] refcount_warn_saturate+0x44/0x9a
           [<0026aba8>] refcount_sub_and_test.constprop.73+0x38/0x3e
           [<0026d5a8>] ipv4_dst_destroy+0x5e/0x7e
           [<00025e84>] __local_bh_enable_ip+0x0/0x8e
           [<002416a8>] dst_destroy+0x40/0xae
      
      Fix this by forcing 4-byte alignment of all dst_metrics structures.
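
      The effect of a 2-byte-aligned address on the flag bits can be shown with
      plain integer arithmetic. This is a sketch only: the constant names are
      modelled on the two flags described above, the helper functions are
      hypothetical, and the addresses are made-up example values.

```python
# Two flags live in the two least significant bits of the stored pointer value.
DST_METRICS_READ_ONLY = 0x1
DST_METRICS_REFCOUNTED = 0x2
DST_METRICS_FLAGS = 0x3


def pack_metrics(addr: int, flags: int) -> int:
    """Store the flags in the low bits of the address."""
    return addr | flags


def metrics_ptr(value: int) -> int:
    """Recover the address by masking the flag bits off."""
    return value & ~DST_METRICS_FLAGS


def metrics_flags(value: int) -> int:
    """Recover the flags from the low bits."""
    return value & DST_METRICS_FLAGS


aligned_addr = 0x1000      # 4-byte aligned: both low bits are free
misaligned_addr = 0x1002   # 2-byte aligned only: bit 1 is part of the address

good = pack_metrics(aligned_addr, DST_METRICS_READ_ONLY)
bad = pack_metrics(misaligned_addr, DST_METRICS_READ_ONLY)

print(hex(metrics_ptr(good)), metrics_flags(good))
print(hex(metrics_ptr(bad)), metrics_flags(bad))
```

      For the misaligned address, bit 1 of the address is read back as the
      second flag and the recovered pointer no longer matches the original,
      which is consistent with the spurious refcount warning shown above.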
      
      Fixes: e5fd387a ("ipv6: do not overwrite inetpeer metrics prematurely")
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      258a980d
    • selftests: pmtu: fix init mtu value in description · 15204477
      Hangbin Liu authored
      There is no a_r3, a_r4 in the testing topology.
      It should be b_r1, b_r2. Also b_r1 mtu is 1400 and b_r2 mtu is 1500.
      
      Fixes: e44e428f ("selftests: pmtu: add basic IPv4 and IPv6 PMTU tests")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Acked-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      15204477
    • hv_netvsc: Fix unwanted rx_table reset · b0689faa
      Haiyang Zhang authored
      In the existing code, the receive indirection table, rx_table, is in
      struct rndis_device, which is reset when changing MTU, ringparam,
      etc. User-configured receive indirection table values are then lost.

      To fix this, move rx_table to struct net_device_context, and check
      netif_is_rxfh_configured(), so rx_table is set to the default only
      if there is no user-configured value.
      
      Fixes: ff4a4419 ("netvsc: allow get/set of RSS indirection table")
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b0689faa
    • net: phy: ensure that phy IDs are correctly typed · 7d49a32a
      Russell King authored
      PHY IDs are 32-bit unsigned quantities. Ensure that they are always
      treated as such, and not passed around as "int"s.
      
      Fixes: 13d0ab67 ("net: phy: check return code when requesting PHY driver module")
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7d49a32a
    • mod_devicetable: fix PHY module format · d2ed49cf
      Russell King authored
      When a PHY is probed, if the top bit is set, we end up requesting a
      module with the string "mdio:-10101110000000100101000101010001" -
      the top bit is printed as a signed -1 value. This leads to the module
      not being loaded.
      
      Fix the module format string and the macro generating the values for
      it to ensure that we only print unsigned types and the top bit is
      always 0/1. We correctly end up with
      "mdio:10101110000000100101000101010001".
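
      The two strings above can be reproduced with a small model of the per-bit
      formatting. This is a sketch, not the kernel macro: as_s32(),
      alias_signed() and alias_unsigned() are hypothetical helpers, and the
      model assumes the top bit was obtained by right-shifting a signed 32-bit
      value without masking it.

```python
BITS = "10101110000000100101000101010001"  # bit string quoted in the commit message
PHY_ID = int(BITS, 2)


def as_s32(value: int) -> int:
    """Reinterpret a 32-bit pattern as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def alias_signed(phy_id: int) -> str:
    """Model of the old behaviour: signed ID, top bit shifted but not masked."""
    signed_id = as_s32(phy_id)
    digits = [signed_id >> 31]  # arithmetic shift: -1 when the top bit is set
    digits += [(signed_id >> bit) & 1 for bit in range(30, -1, -1)]
    return "mdio:" + "".join(str(d) for d in digits)


def alias_unsigned(phy_id: int) -> str:
    """Model of the fixed behaviour: unsigned ID, every bit masked to 0/1."""
    unsigned_id = phy_id & 0xFFFFFFFF
    return "mdio:" + "".join(
        str((unsigned_id >> bit) & 1) for bit in range(31, -1, -1)
    )


print(alias_signed(PHY_ID))
print(alias_unsigned(PHY_ID))
```

      The signed variant only differs for IDs with the top bit set, which is why
      the problem shows up for some PHYs and not others.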
      
      Fixes: 8626d3b4 ("phylib: Support phy module autoloading")
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d2ed49cf
    • qede: Disable hardware gro when xdp prog is installed · 4c8dc005
      Manish Chopra authored
      Commit 18c602de ("qede: Use NETIF_F_GRO_HW.") introduced
      a regression in the driver: when an XDP program is installed on a
      qede device, the device's aggregation feature (hardware GRO) is not
      disabled, which is unexpected with XDP.
      
      Fixes: 18c602de ("qede: Use NETIF_F_GRO_HW.")
      Signed-off-by: Manish Chopra <manishc@marvell.com>
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Reviewed-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4c8dc005
    • Merge branch 'ena-fixes-of-interrupt-moderation-bugs' · 9f5e508b
      David S. Miller authored
      
      
      Arthur Kiyanovski says:
      
      ====================
      ena: fixes of interrupt moderation bugs
      
      Differences from V1:
      1. Updated default tx interrupt moderation to 64us
      2. Added "Fixes:" tags.
      3. Removed cosmetic changes that are not relevant for these bug fixes
      
      This patchset includes a couple of fixes for bugs in the implementation of
      interrupt moderation.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9f5e508b
    • net: ena: fix issues in setting interrupt moderation params in ethtool · 41c53caa
      Arthur Kiyanovski authored
      Issue 1:
      --------
      Reproduction steps:
      1. sudo ethtool -C eth0 rx-usecs 128
      2. sudo ethtool -C eth0 adaptive-rx on
      3. sudo ethtool -C eth0 adaptive-rx off
      4. ethtool -c eth0
      
      expected output: rx-usecs 128
      actual output: rx-usecs 0
      
      Reason for issue:
      In stage 3, ethtool userspace calls first the ena_get_coalesce() handler
      to get the current value of all properties, and then the ena_set_coalesce()
      handler. When ena_get_coalesce() is called the adaptive interrupt
      moderation is still on. There is an if in the code that returns the
      rx_coalesce_usecs only if the adaptive interrupt moderation is off.
      And since it is still on, rx_coalesce_usecs is not set, meaning it
      stays 0.
      
      Solution to issue:
      Remove this if, since static interrupt moderation intervals have nothing
      to do with dynamic ones.
      
      Issue 2:
      --------
      Reproduction steps:
      1. sudo ethtool -C eth0 adaptive-rx on
      2. sudo ethtool -C eth0 rx-usecs 128
      3. ethtool -c eth0
      
      expected output: rx-usecs 128
      actual output: rx-usecs 0
      
      Reason for issue:
      In stage 2, when ena_set_coalesce() is called, the handler tests if
      rx adaptive interrupt moderation is on, and if it is, it returns before
      getting to the part in the function that sets the rx non-adaptive
      interrupt moderation interval.
      
      Solution to issue:
      Remove the return from the function when rx adaptive interrupt moderation
      is on.
      
      Also cleaned up the fixed code in ena_set_coalesce by grouping together
      adaptive interrupt moderation toggling, and using && instead of nested
      ifs.
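
      Both reproduction sequences can be replayed against a toy model of the two
      handlers. This is a sketch under stated assumptions, not the driver code:
      ToyEna, ethtool_set(), issue_1() and issue_2() are hypothetical, and the
      model assumes ethtool userspace always does a get, overlays the requested
      change, and then does a set, as described for Issue 1.

```python
class ToyEna:
    """Toy device with the old (fixed=False) or new (fixed=True) handler logic."""

    def __init__(self, fixed: bool):
        self.fixed = fixed
        self.rx_usecs = 0
        self.adaptive_rx = False

    def get_coalesce(self) -> dict:
        coalesce = {"rx_usecs": 0, "adaptive_rx": self.adaptive_rx}
        # Old logic: report rx_usecs only while adaptive moderation is off.
        if self.fixed or not self.adaptive_rx:
            coalesce["rx_usecs"] = self.rx_usecs
        return coalesce

    def set_coalesce(self, coalesce: dict) -> None:
        if not self.fixed and coalesce["adaptive_rx"]:
            # Old logic: return before storing the non-adaptive interval.
            self.adaptive_rx = True
            return
        self.adaptive_rx = coalesce["adaptive_rx"]
        self.rx_usecs = coalesce["rx_usecs"]


def ethtool_set(dev: ToyEna, **changes) -> None:
    """Model of 'ethtool -C': read current values, overlay changes, write back."""
    coalesce = dev.get_coalesce()
    coalesce.update(changes)
    dev.set_coalesce(coalesce)


def issue_1(dev: ToyEna) -> int:
    ethtool_set(dev, rx_usecs=128)
    ethtool_set(dev, adaptive_rx=True)
    ethtool_set(dev, adaptive_rx=False)
    return dev.get_coalesce()["rx_usecs"]


def issue_2(dev: ToyEna) -> int:
    ethtool_set(dev, adaptive_rx=True)
    ethtool_set(dev, rx_usecs=128)
    return dev.get_coalesce()["rx_usecs"]


print(issue_1(ToyEna(fixed=False)), issue_1(ToyEna(fixed=True)))
print(issue_2(ToyEna(fixed=False)), issue_2(ToyEna(fixed=True)))
```

      In this model the old logic yields rx-usecs 0 for both sequences, while
      the handlers without the conditional and the early return keep the value
      at 128.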
      
      Fixes: b3db86dc ("net: ena: reimplement set/get_coalesce()")
      Fixes: 0eda8479 ("net: ena: fix retrieval of nonadaptive interrupt moderation intervals")
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      41c53caa