Skip to content
  1. Feb 09, 2018
  2. Feb 08, 2018
  3. Feb 06, 2018
    • Daniel Borkmann's avatar
      Merge branch 'bpf-sockmap-fixes' · 41b0530e
      Daniel Borkmann authored
      
      
      John Fastabend says:
      
      ====================
      A set of fixes for sockmap to resolve programs referencing sockmaps
      and closing without deleting all entries in the map and/or not detaching
      BPF programs attached to the map. Both leaving entries in the map and
      not detaching programs may result in the map failing to be removed by
      BPF infrastructure due to reference counts never reaching zero.
      
      For this we pull in the ULP infrastructure to hook into the close()
      hook of the sock layer. This seemed natural because we have additional
      sockmap features (to add support for TX hooks) that will also use the
      ULP infrastructure. This allows us to cleanup entries in the map when
      socks are closed() and avoid trying to get the sk_state_change() hook
      to fire in all cases.
      
      The second issue resolved here occurs when users don't detach
      programs. The gist is a refcnt issue resolved by implementing the
      release callback. See patch for details.
      
      For testing I ran both sample/sockmap and selftests bpf/test_maps.c.
      Dave Watson ran TLS test suite on v1 version of the patches without
      the put_module error path change.
      
      v4 fix missing rcu_unlock()
      v3 wrap psock reference in RCU
      v2 changes rebased onto bpf-next with small update adding module_put
      ====================
      
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      41b0530e
    • John Fastabend's avatar
      bpf: sockmap, fix leaking maps with attached but not detached progs · 3d9e9526
      John Fastabend authored
      When a program is attached to a map we increment the program refcnt
      to ensure that the program is not removed while it is potentially
      being referenced from sockmap side. However, if this same program
      also references the map (this is a reasonably common pattern in
      my programs) then the verifier will also increment the maps refcnt
      from the verifier. This is to ensure the map doesn't get garbage
      collected while the program has a reference to it.
      
      So we are left in a state where the map holds the refcnt on the
      program stopping it from being removed and releasing the map refcnt.
      And vice versa the program holds a refcnt on the map stopping it
      from releasing the refcnt on the prog.
      
      All this is fine as long as users detach the program while the
      map fd is still around. But, if the user omits this detach command
      we are left with a dangling map we can no longer release.
      
      To resolve this when the map fd is released decrement the program
      references and remove any reference from the map to the program.
      This fixes the issue with possibly dangling map and creates a
      user side API constraint. That is, the map fd must be held open
      for programs to be attached to a map.
      
      Fixes: 174a79ff
      
       ("bpf: sockmap with sk redirect support")
      Signed-off-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      3d9e9526
    • John Fastabend's avatar
      bpf: sockmap, add sock close() hook to remove socks · 1aa12bdf
      John Fastabend authored
      The selftests test_maps program was leaving dangling BPF sockmap
      programs around because not all psock elements were removed from
      the map. The elements in turn hold a reference on the BPF program
      they are attached to causing BPF programs to stay open even after
      test_maps has completed.
      
      The original intent was that sk_state_change() would be called
      when TCP socks went through TCP_CLOSE state. However, because
      socks may be in SOCK_DEAD state or the sock may be a listening
      socket the event is not always triggered.
      
      To resolve this use the ULP infrastructure and register our own
      proto close() handler. This fixes the above case.
      
      Fixes: 174a79ff
      
       ("bpf: sockmap with sk redirect support")
      Reported-by: default avatarPrashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
      Signed-off-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      1aa12bdf
    • John Fastabend's avatar
      net: add a UID to use for ULP socket assignment · b11a632c
      John Fastabend authored
      
      
      Create a UID field and enum that can be used to assign ULPs to
      sockets. This saves a set of string comparisons if the ULP id
      is known.
      
      For sockmap, which is added in the next patches, a ULP is used to
      hook into TCP sockets close state. In this case the ULP being added
      is done at map insert time and the ULP is known and done on the kernel
      side. In this case the named lookup is not needed. Because we don't
      want to expose psock internals to user space socket options a user
      visible flag is also added. For TLS this is set for BPF it will be
      cleared.
      
      Alos remove pr_notice, user gets an error code back and should check
      that rather than rely on logs.
      
      Signed-off-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      b11a632c
    • Yonghong Song's avatar
      tools/bpf: fix batch-mode test failure of test_xdp_redirect.sh · 7b4eb53d
      Yonghong Song authored
      The tests at tools/testing/selftests/bpf can run in patch mode, e.g.,
          make -C tools/testing/selftests/bpf run_tests
      
      With the batch mode, I experimented intermittent test failure of
      test_xdp_redirect.sh.
          ....
          selftests: test_xdp_redirect [PASS]
          selftests: test_xdp_redirect.sh [PASS]
          RTNETLINK answers: File exists
          selftests: test_xdp_meta [FAILED]
          selftests: test_xdp_meta.sh [FAIL]
          ....
      
      The following illustrates what caused the failure:
           (1). test_xdp_redirect creates veth pairs (veth1,veth11) and
                (veth2,veth22), and assign veth11 and veth22 to namespace
                ns1 and ns2 respectively.
           (2). at the end of test_xdp_redirect test, ns1 and ns2 are
                deleted. During this process, the deletion of actual
                namespace resources, including deletion of veth1{1} and veth2{2},
                is put into a workqueue to be processed asynchronously.
           (3). test_xdp_meta tries to create veth pair (veth1, veth2).
                The previous veth deletions in step (2) have not finished yet,
                and veth1 or veth2 may be still valid in the kernel, thus
                causing the failure.
      
      The fix is to explicitly delete the veth pair before test_xdp_redirect
      exits. Only one end of veth needs deletion as the kernel will delete
      the other end automatically. Also test_xdp_meta is also fixed in
      similar manner to avoid future potential issues.
      
      Fixes: 996139e8 ("selftests: bpf: add a test for XDP redirect")
      Fixes: 22c88526
      
       ("bpf: improve selftests and add tests for meta pointer")
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      7b4eb53d
  4. Feb 05, 2018
    • Yonghong Song's avatar
      bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y · 09584b40
      Yonghong Song authored
      With CONFIG_BPF_JIT_ALWAYS_ON is defined in the config file,
      tools/testing/selftests/bpf/test_kmod.sh failed like below:
        [root@localhost bpf]# ./test_kmod.sh
        sysctl: setting key "net.core.bpf_jit_enable": Invalid argument
        [ JIT enabled:0 hardened:0 ]
        [  132.175681] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096
        [  132.458834] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
        [ JIT enabled:1 hardened:0 ]
        [  133.456025] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096
        [  133.730935] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
        [ JIT enabled:1 hardened:1 ]
        [  134.769730] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096
        [  135.050864] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
        [ JIT enabled:1 hardened:2 ]
        [  136.442882] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096
        [  136.821810] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
        [root@localhost bpf]#
      
      The test_kmod.sh load/remove test_bpf.ko multiple times with different
      settings for sysctl net.core.bpf_jit_{enable,harden}. The failed test #297
      of test_bpf.ko is designed such that JIT always fails.
      
      Commit 290af866 (bpf: introduce BPF_JIT_ALWAYS_ON config)
      introduced the following tightening logic:
          ...
              if (!bpf_prog_is_dev_bound(fp->aux)) {
                      fp = bpf_int_jit_compile(fp);
          #ifdef CONFIG_BPF_JIT_ALWAYS_ON
                      if (!fp->jited) {
                              *err = -ENOTSUPP;
                              return fp;
                      }
          #endif
          ...
      With this logic, Test #297 always gets return value -ENOTSUPP
      when CONFIG_BPF_JIT_ALWAYS_ON is defined, causing the test failure.
      
      This patch fixed the failure by marking Test #297 as expected failure
      when CONFIG_BPF_JIT_ALWAYS_ON is defined.
      
      Fixes: 290af866
      
       (bpf: introduce BPF_JIT_ALWAYS_ON config)
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      09584b40
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · a6b88814
      David S. Miller authored
      
      
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf 2018-02-02
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) support XDP attach in libbpf, from Eric.
      
      2) minor fixes, from Daniel, Jakub, Yonghong, Alexei.
      ====================
      
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a6b88814
    • Linus Torvalds's avatar
      Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 35277995
      Linus Torvalds authored
      Pull spectre/meltdown updates from Thomas Gleixner:
       "The next round of updates related to melted spectrum:
      
         - The initial set of spectre V1 mitigations:
      
             - Array index speculation blocker and its usage for syscall,
               fdtable and the n180211 driver.
      
             - Speculation barrier and its usage in user access functions
      
         - Make indirect calls in KVM speculation safe
      
         - Blacklisting of known to be broken microcodes so IPBP/IBSR are not
           touched.
      
         - The initial IBPB support and its usage in context switch
      
         - The exposure of the new speculation MSRs to KVM guests.
      
         - A fix for a regression in x86/32 related to the cpu entry area
      
         - Proper whitelisting for known to be safe CPUs from the mitigations.
      
         - objtool fixes to deal proper with retpolines and alternatives
      
         - Exclude __init functions from retpolines which speeds up the boot
           process.
      
         - Removal of the syscall64 fast path and related cleanups and
           simplifications
      
         - Removal of the unpatched paravirt mode which is yet another source
           of indirect unproteced calls.
      
         - A new and undisputed version of the module mismatch warning
      
         - A couple of cleanup and correctness fixes all over the place
      
        Yet another step towards full mitigation. There are a few things still
        missing like the RBS underflow mitigation for Skylake and other small
        details, but that's being worked on.
      
        That said, I'm taking a belated christmas vacation for a week and hope
        that everything is magically solved when I'm back on Feb 12th"
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
        KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL
        KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL
        KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES
        KVM/x86: Add IBPB support
        KVM/x86: Update the reverse_cpuid list to include CPUID_7_EDX
        x86/speculation: Fix typo IBRS_ATT, which should be IBRS_ALL
        x86/pti: Mark constant arrays as __initconst
        x86/spectre: Simplify spectre_v2 command line parsing
        x86/retpoline: Avoid retpolines for built-in __init functions
        x86/kvm: Update spectre-v1 mitigation
        KVM: VMX: make MSR bitmaps per-VCPU
        x86/paravirt: Remove 'noreplace-paravirt' cmdline option
        x86/speculation: Use Indirect Branch Prediction Barrier in context switch
        x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel
        x86/spectre: Fix spelling mistake: "vunerable"-> "vulnerable"
        x86/spectre: Report get_user mitigation for spectre_v1
        nl80211: Sanitize array index in parse_txq_params
        vfs, fdtable: Prevent bounds-check bypass via speculative execution
        x86/syscall: Sanitize syscall table de-references under speculation
        x86/get_user: Use pointer masking to limit speculation
        ...
      35277995
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 0a646e9c
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "A small set of changes:
      
         - a fixup for kexec related to 5-level paging mode. That covers most
           of the cases except kexec from a 5-level kernel to a 4-level
           kernel. The latter needs more work and is going to come in 4.17
      
         - two trivial fixes for build warnings triggered by LTO and gcc-8"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/power: Fix swsusp_arch_resume prototype
        x86/dumpstack: Avoid uninitlized variable
        x86/kexec: Make kexec (mostly) work in 5-level paging mode
      0a646e9c
    • Linus Torvalds's avatar
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f74a127f
      Linus Torvalds authored
      Pull irq fixes from Thomas Gleixner:
       "Two small changes:
      
         - a fix for a interrupt regression caused by the vector management
           changes in 4.15 affecting museum pieces which rely on interrupt
           probing for legacy (e.g. parallel port) devices.
      
           One of the startup calls in the autoprobe code was not changed to
           the new activate_and_startup() function resulting in a warning and
           as a consequence failing to discover the device interrupt.
      
         - a trivial update to the copyright/license header of the STM32 irq
           chip driver"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq: Make legacy autoprobing work again
        irqchip/stm32: Fix copyright
      f74a127f
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20180204' of git://git.kernel.dk/linux-block · 64b28683
      Linus Torvalds authored
      Pull more block updates from Jens Axboe:
       "Most of this is fixes and not new code/features:
      
         - skd fix from Arnd, fixing a build error dependent on sla allocator
           type.
      
         - blk-mq scheduler discard merging fixes, one from me and one from
           Keith. This fixes a segment miscalculation for blk-mq-sched, where
           we mistakenly think two segments are physically contigious even
           though the request isn't carrying real data. Also fixes a bio-to-rq
           merge case.
      
         - Don't re-set a bit on the buffer_head flags, if it's already set.
           This can cause scalability concerns on bigger machines and
           workloads. From Kemi Wang.
      
         - Add BLK_STS_DEV_RESOURCE return value to blk-mq, allowing us to
           distuingish between a local (device related) resource starvation
           and a global one. The latter might happen without IO being in
           flight, so it has to be handled a bit differently. From Ming"
      
      * tag 'for-linus-20180204' of git://git.kernel.dk/linux-block:
        block: skd: fix incorrect linux/slab_def.h inclusion
        buffer: Avoid setting buffer bits that are already set
        blk-mq-sched: Enable merging discard bio into request
        blk-mq: fix discard merge with scheduler attached
        blk-mq: introduce BLK_STS_DEV_RESOURCE
      64b28683
    • Linus Torvalds's avatar
      Merge tag 'ntb-4.16' of git://github.com/jonmason/ntb · d3658c22
      Linus Torvalds authored
      Pull NTB updates from Jon Mason:
       "Bug fixes galore, removal of the ntb atom driver, and updates to the
        ntb tools and tests to support the multi-port interface"
      
      * tag 'ntb-4.16' of git://github.com/jonmason/ntb: (37 commits)
        NTB: ntb_perf: fix cast to restricted __le32
        ntb_perf: Fix an error code in perf_copy_chunk()
        ntb_hw_switchtec: Make function switchtec_ntb_remove() static
        NTB: ntb_tool: fix memory leak on 'buf' on error exit path
        NTB: ntb_perf: fix printing of resource_size_t
        NTB: ntb_hw_idt: Set NTB_TOPO_SWITCH topology
        NTB: ntb_test: Update ntb_perf tests
        NTB: ntb_test: Update ntb_tool MW tests
        NTB: ntb_test: Add ntb_tool Message tests
        NTB: ntb_test: Update ntb_tool Scratchpad tests
        NTB: ntb_test: Update ntb_tool DB tests
        NTB: ntb_test: Update ntb_tool link tests
        NTB: ntb_test: Add ntb_tool port tests
        NTB: ntb_test: Safely use paths with whitespace
        NTB: ntb_perf: Add full multi-port NTB API support
        NTB: ntb_tool: Add full multi-port NTB API support
        NTB: ntb_pp: Add full multi-port NTB API support
        NTB: Fix UB/bug in ntb_mw_get_align()
        NTB: Set dma mask and dma coherent mask to NTB devices
        NTB: Rename NTB messaging API methods
        ...
      d3658c22
    • Linus Torvalds's avatar
      Merge tag 'mailbox-v4.16' of git://git.linaro.org/landing-teams/working/fujitsu/integration · 8ac4840a
      Linus Torvalds authored
      Pull mailbox updates from Jassi Brar:
       "Misc driver changes only:
      
         - TI-MsgMgr: Fix print format for a printk
      
         - TI-MSgMgr: SPDX license switch for the driver
      
         - QCOM-IPC: Convert driver to use regmap
      
         - QCOM-IPC: Spawn sibling clock device from mailbox driver"
      
      * tag 'mailbox-v4.16' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
        dt-bindings: mailbox: qcom: Document the APCS clock binding
        mailbox: qcom: Create APCS child device for clock controller
        mailbox: qcom: Convert APCS IPC driver to use regmap
        mailbox: ti-msgmgr: Use %zu for size_t print format
        mailbox: ti-msgmgr: Switch to SPDX Licensing
      8ac4840a
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 4141cf67
      Linus Torvalds authored
      Pull i2c updates from Wolfram Sang:
       "I2C has the following changes for you:
      
         - new flag to mark DMA safe buffers in i2c_msg. Also, some
           infrastructure around it. And docs.
      
         - huge refactoring of the at24 driver led by the new maintainer
           Bartosz
      
         - update I2C bus recovery to send STOP after recovery
      
         - conversion from gpio to gpiod for I2C bus recovery
      
         - adding a fault-injector to the i2c-gpio driver
      
         - lots of small driver improvements, and bigger ones to
           i2c-sh_mobile"
      
      * 'i2c/for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (99 commits)
        i2c: mv64xxx: Add myself as maintainer for this driver
        i2c: mv64xxx: Fix clock resource by adding an optional bus clock
        i2c: mv64xxx: Remove useless test before clk_disable_unprepare
        i2c: mxs: use true and false for boolean values
        i2c: meson: update doc description to fix build warnings
        i2c: meson: add configurable divider factors
        dt-bindings: i2c: update documentation for the Meson-AXG
        i2c: imx-lpi2c: add runtime pm support
        i2c: rcar: fix some trivial typos in comments
        i2c: davinci: fix the cpufreq transition
        i2c: rk3x: add proper kerneldoc header
        i2c: rk3x: account for const type of of_device_id.data
        i2c: acorn: remove outdated path from file header
        i2c: acorn: add MODULE_LICENSE tag
        i2c: rcar: implement bus recovery
        i2c: send STOP after successful bus recovery
        i2c: ensure SDA is released in recovery if SDA is controllable
        i2c: add 'set_sda' to bus_recovery_info
        i2c: add identifier in declarations for i2c_bus_recovery
        i2c: make kerneldoc about bus recovery more precise
        ...
      4141cf67
    • Linus Torvalds's avatar
      Merge tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt · 3462ac57
      Linus Torvalds authored
      Pull fscrypt updates from Ted Ts'o:
       "Refactor support for encrypted symlinks to move common code to fscrypt"
      
      Ted also points out about the merge:
       "This makes the f2fs symlink code use the fscrypt_encrypt_symlink()
        from the fscrypt tree. This will end up dropping the kzalloc() ->
        f2fs_kzalloc() change, which means the fscrypt-specific allocation
        won't get tested by f2fs's kmalloc error injection system; which is
        fine"
      
      * tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt: (26 commits)
        fscrypt: fix build with pre-4.6 gcc versions
        fscrypt: remove 'ci' parameter from fscrypt_put_encryption_info()
        fscrypt: document symlink length restriction
        fscrypt: fix up fscrypt_fname_encrypted_size() for internal use
        fscrypt: define fscrypt_fname_alloc_buffer() to be for presented names
        fscrypt: calculate NUL-padding length in one place only
        fscrypt: move fscrypt_symlink_data to fscrypt_private.h
        fscrypt: remove fscrypt_fname_usr_to_disk()
        ubifs: switch to fscrypt_get_symlink()
        ubifs: switch to fscrypt ->symlink() helper functions
        ubifs: free the encrypted symlink target
        f2fs: switch to fscrypt_get_symlink()
        f2fs: switch to fscrypt ->symlink() helper functions
        ext4: switch to fscrypt_get_symlink()
        ext4: switch to fscrypt ->symlink() helper functions
        fscrypt: new helper function - fscrypt_get_symlink()
        fscrypt: new helper functions for ->symlink()
        fscrypt: trim down fscrypt.h includes
        fscrypt: move fscrypt_is_dot_dotdot() to fs/crypto/fname.c
        fscrypt: move fscrypt_valid_enc_modes() to fscrypt_private.h
        ...
      3462ac57
  5. Feb 04, 2018