  1. Jul 14, 2020
    • bpf: Add resolve_btfids tool to resolve BTF IDs in ELF object · fbbb68de
      Jiri Olsa authored
      
      
      The resolve_btfids tool scans an ELF object for the .BTF_ids section
      and resolves its symbols with BTF ID values.
      
      It will be used at link time to resolve arrays of BTF ID values
      used in the verifier, so these IDs do not need to be resolved at
      runtime.
      
      The expected layout of the .BTF_ids section is described in the
      main.c header. Related kernel changes are coming in following changes.
      
      Build issue reported by 0-DAY CI Kernel Test Service.
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200711215329.41165-2-jolsa@kernel.org
      fbbb68de
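The idea of resolving IDs ahead of time, so nothing has to be looked up at runtime, can be sketched roughly as follows. This is only an illustration under stated assumptions: `struct type_entry`, `struct id_slot`, and `resolve_ids()` are hypothetical names, and the sketch does not reproduce the actual .BTF_ids layout or the tool's algorithm.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name-to-ID table, standing in for the type information. */
struct type_entry {
	const char *name;
	unsigned int id;
};

/* Hypothetical placeholder: a named slot whose id stays 0 until resolved. */
struct id_slot {
	const char *name;
	unsigned int id;
};

/*
 * Fill in every slot whose name is found in the table.
 * Returns the number of slots that could not be resolved.
 */
size_t resolve_ids(struct id_slot *slots, size_t nr_slots,
		   const struct type_entry *types, size_t nr_types)
{
	size_t unresolved = 0;

	for (size_t i = 0; i < nr_slots; i++) {
		size_t j;

		for (j = 0; j < nr_types; j++) {
			if (strcmp(slots[i].name, types[j].name) == 0) {
				slots[i].id = types[j].id;
				break;
			}
		}
		if (j == nr_types)
			unresolved++;	/* name not present in the table */
	}
	return unresolved;
}
```

Once the slots are filled in at build time, code that consumes them can read plain integers instead of searching by name.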
  2. Jul 11, 2020
  3. Jul 10, 2020
    • libbpf: Fix memory leak and optimize BTF sanitization · 5c3320d7
      Andrii Nakryiko authored
      Coverity's static analysis helpfully reported a memory leak introduced by
      0f0e55d8 ("libbpf: Improve BTF sanitization handling"). While fixing it,
      I realized that btf__new() already creates a memory copy, so there is no need
      for the caller to make one. So this patch also fixes the misleading btf__new()
      signature by making data a `const void *` input parameter. And it avoids
      unnecessary memory allocation and copy in BTF sanitization code altogether.
      
      Fixes: 0f0e55d8 ("libbpf: Improve BTF sanitization handling")
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20200710011023.1655008-1-andriin@fb.com
      5c3320d7
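The signature point above can be illustrated with a small sketch: a constructor that makes its own private copy can accept `const void *`, which tells callers they need no defensive copy of their own. The names `struct blob`, `blob_new()`, and `blob_free()` are hypothetical and are not libbpf API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical object that owns a private copy of its input bytes. */
struct blob {
	void *data;
	size_t size;
};

/* Takes const input because it never modifies or keeps the caller's buffer. */
struct blob *blob_new(const void *data, size_t size)
{
	struct blob *b;

	if (!data || size == 0)
		return NULL;
	b = calloc(1, sizeof(*b));
	if (!b)
		return NULL;
	b->data = malloc(size);
	if (!b->data) {
		free(b);	/* do not leak the container on failure */
		return NULL;
	}
	memcpy(b->data, data, size);
	b->size = size;
	return b;
}

/* Releases the private copy and the container. */
void blob_free(struct blob *b)
{
	if (!b)
		return;
	free(b->data);
	free(b);
}
```

With this shape, a caller can pass a read-only buffer directly and free or reuse it right after the call returns.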
  4. Jul 09, 2020
    • Merge branch 'bpf-libbpf-old-kernel' · 2977282b
      Daniel Borkmann authored
      
      
      Andrii Nakryiko says:
      
      ====================
      This patch set improves libbpf's support of old kernels, missing features like
      BTF support, global variables support, etc.
      
      Most critical one is a silent drop of CO-RE relocations if libbpf fails to
      load BTF (despite sanitization efforts). This is frequently the case for
      kernels that have no BTF support whatsoever. There are still useful BPF
      applications that could work on such kernels and do rely on CO-RE. To that
      end, this series revamps the way BTF is handled in libbpf. Failure to load BTF
      into kernel doesn't prevent libbpf from using BTF in its full capability
      (e.g., for CO-RE relocations) internally.
      
      Another issue that was identified was reliance of perf_buffer__new() on
      BPF_OBJ_GET_INFO_BY_FD command, which is more recent than perf_buffer support
      itself. Furthermore, BPF_OBJ_GET_INFO_BY_FD is needed just for some sanity
      checks to provide better user errors, so could be safely omitted if kernel
      doesn't provide it.
      
      Perf_buffer selftest was adjusted to use skeleton, instead of bpf_prog_load().
      The latter uses BPF_F_TEST_RND_HI32 flag, which is a relatively recent
      addition and unnecessarily fails selftest in libbpf's Travis CI tests. By using
      skeleton we both get a shorter selftest and it works on pretty ancient kernels,
      giving better libbpf test coverage.
      
      One new selftest was added that relies on basic CO-RE features, but otherwise
      doesn't expect any recent features (like global variables) from kernel. Again,
      it's good to have better coverage of old kernels in libbpf testing.
      ====================
      
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      2977282b
    • selftests/bpf: Switch perf_buffer test to tracepoint and skeleton · 6984cbc6
      Andrii Nakryiko authored
      
      
      Switch perf_buffer test to use skeleton to avoid use of bpf_prog_load() and
      make test a bit more succinct. Also switch BPF program to use tracepoint
      instead of kprobe, as that allows supporting older kernels, which had
      tracepoint support before kprobe support in the form that libbpf expects
      (i.e., libbpf expects /sys/bus/event_source/devices/kprobe/type, which doesn't
      always exist on old kernels).
      
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20200708015318.3827358-7-andriin@fb.com
      6984cbc6
    • libbpf: Handle missing BPF_OBJ_GET_INFO_BY_FD gracefully in perf_buffer · 0e289487
      Andrii Nakryiko authored
      
      
      perf_buffer__new() is relying on BPF_OBJ_GET_INFO_BY_FD availability for a few
      sanity checks. OBJ_GET_INFO for maps is actually a much more recent feature than
      perf_buffer support itself, so this causes unnecessary problems on old kernels
      before BPF_OBJ_GET_INFO_BY_FD was added.
      
      This patch makes those sanity checks optional and just assumes the best if the
      command is not supported. If user specified something incorrectly (e.g., wrong
      map type), kernel will reject it later anyway, except user won't get a nice
      explanation as to why it failed. This seems like a good trade-off for
      supporting perf_buffer on old kernels.
      
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20200708015318.3827358-6-andriin@fb.com
      0e289487
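The "optional sanity check" pattern described above could look roughly like the sketch below. It is not the libbpf code: `check_map_type()` and the `query_*` stand-ins are hypothetical, and treating `-EINVAL` from the query as "command not supported" is an assumption made for this illustration.

```c
#include <errno.h>

/* Hypothetical query: returns 0 and fills *type, or a negative errno. */
typedef int (*map_type_query_fn)(int fd, int *type);

/* Stand-in for an old kernel that does not know the query command. */
int query_unsupported(int fd, int *type)
{
	(void)fd;
	(void)type;
	return -EINVAL;
}

/* Stand-in for a kernel that answers and reports map type 4. */
int query_reports_type_4(int fd, int *type)
{
	(void)fd;
	*type = 4;
	return 0;
}

/* Stand-in for a query that fails for a real reason (bad descriptor). */
int query_bad_fd(int fd, int *type)
{
	(void)fd;
	(void)type;
	return -EBADF;
}

/*
 * Returns 0 if the map may be used, a negative errno otherwise.
 * Assumption: -EINVAL from the query means the command is unsupported.
 */
int check_map_type(map_type_query_fn query, int fd, int expected_type)
{
	int type = 0;
	int err = query(fd, &type);

	if (err == -EINVAL)
		return 0;	/* cannot verify: skip the check, assume the best */
	if (err)
		return err;	/* genuine failure is still reported */
	return type == expected_type ? 0 : -EINVAL;
}
```

The trade-off matches the one in the message: when the query is unavailable, a wrong map type is no longer caught here with a clear error and is left for a later step to reject.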
    • selftests/bpf: Add test relying only on CO-RE and no recent kernel features · fcda189a
      Andrii Nakryiko authored
      
      
      Add a test that relies on CO-RE, but doesn't expect any of the recent
      features, not available on old kernels. This is useful for Travis CI tests
      running against very old kernels (e.g., libbpf has 4.9 kernel testing now), to
      verify that CO-RE still works, even if kernel itself doesn't support BTF yet,
      as long as there is .BTF embedded into vmlinux image by pahole. Given most of
      CO-RE doesn't require any kernel awareness of BTF, it is a useful test to
      validate that libbpf's BTF sanitization is working well even with ancient
      kernels.
      
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20200708015318.3827358-5-andriin@fb.com
      fcda189a
    • libbpf: Improve BTF sanitization handling · 0f0e55d8
      Andrii Nakryiko authored
      
      
      Change sanitization process to preserve original BTF, which might be used by
      libbpf itself for Kconfig externs, CO-RE relocs, etc, even if kernel is old
      and doesn't support BTF. To achieve that, if libbpf detects the need for BTF
      sanitization, it would clone original BTF, sanitize it in-place, attempt to
      load it into kernel, and if successful, will preserve loaded BTF FD in
      original `struct btf`, while freeing sanitized local copy.
      
      If kernel doesn't support any BTF, original btf and btf_ext will still be
      preserved to be used later for CO-RE relocation and other BTF-dependent libbpf
      features, which don't depend on kernel BTF support.
      
      Patch takes care to not specify BTF and BTF.ext features when loading BPF
      programs and/or maps, if it was detected that kernel doesn't support BTF
      features.
      
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20200708015318.3827358-4-andriin@fb.com
      0f0e55d8
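The clone, sanitize, load, keep-the-FD sequence described above can be sketched as follows. Everything here is a hypothetical stand-in: `struct type_blob`, `load_sanitized()`, the two fake loaders, and the toy rule that byte 0xFF is what an old kernel rejects. It is meant to show the shape of the flow, not libbpf's implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for `struct btf`: raw bytes plus a kernel FD. */
struct type_blob {
	unsigned char *data;
	size_t size;
	int fd;			/* -1 while not loaded */
};

/* Hypothetical loader: returns an FD >= 0 on success, negative on failure. */
typedef int (*load_fn)(const unsigned char *data, size_t size);

/* Fake kernel that rejects any blob containing byte 0xFF. */
int fake_kernel_load(const unsigned char *data, size_t size)
{
	for (size_t i = 0; i < size; i++)
		if (data[i] == 0xFF)
			return -1;
	return 7;		/* arbitrary FD number for the sketch */
}

/* Fake kernel with no support at all. */
int no_support_load(const unsigned char *data, size_t size)
{
	(void)data;
	(void)size;
	return -1;
}

/* Toy sanitization: rewrite the rejected byte value in place. */
static void sanitize(unsigned char *data, size_t size)
{
	for (size_t i = 0; i < size; i++)
		if (data[i] == 0xFF)
			data[i] = 0;
}

/*
 * Load a sanitized copy while leaving the original bytes untouched.
 * On success the FD is recorded in the original; on failure the
 * original stays usable for purely local work.
 */
int load_sanitized(struct type_blob *orig, load_fn load)
{
	unsigned char *copy = malloc(orig->size);
	int fd;

	if (!copy)
		return -1;
	memcpy(copy, orig->data, orig->size);
	sanitize(copy, orig->size);
	fd = load(copy, orig->size);
	free(copy);		/* the sanitized copy is only needed for loading */
	if (fd >= 0)
		orig->fd = fd;
	return fd;
}
```

The key property is that a failed load changes nothing in the original object, so features that need only the local data keep working.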
    • libbpf: Add btf__set_fd() for more control over loaded BTF FD · 81372e12
      Andrii Nakryiko authored
      
      
      Add setter for BTF FD to allow application more fine-grained control in more
      advanced scenarios. Storing BTF FD inside `struct btf` provides little benefit
      and probably would be better done differently (e.g., btf__load() could just
      return FD on success), but we are stuck with this due to backwards
      compatibility. The main problem is that it's impossible to load BTF and then
      free user-space memory, but keep FD intact, because `struct btf` assumes
      ownership of that FD upon successful load and will attempt to close it during
      btf__free(). To allow callers (e.g., libbpf itself for BTF sanitization) to
      have more control over this, add btf__set_fd() to allow resetting the FD
      arbitrarily, if necessary.
      
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20200708015318.3827358-3-andriin@fb.com
      81372e12
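The ownership problem described above (an object that closes its FD on free) and the way a setter works around it can be sketched like this. `struct fd_owner` and its functions are hypothetical; they mirror the described behavior of `struct btf` only loosely.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical object that owns a file descriptor. */
struct fd_owner {
	int fd;			/* -1 means "owns nothing" */
};

/* Setter: lets the caller replace the FD, or detach it by passing -1. */
void fd_owner_set_fd(struct fd_owner *o, int fd)
{
	o->fd = fd;
}

/* Destructor closes the FD only if the object still owns one. */
void fd_owner_free(struct fd_owner *o)
{
	if (!o)
		return;
	if (o->fd >= 0)
		close(o->fd);
	free(o);
}

/* Free the user-space object but hand the still-open FD to the caller. */
int fd_owner_release(struct fd_owner *o)
{
	int fd = o->fd;

	fd_owner_set_fd(o, -1);	/* detach first so free() will not close it */
	fd_owner_free(o);
	return fd;
}
```

Without the setter, the only way to drop the user-space memory would also close the descriptor, which is exactly the limitation the message describes.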
    • libbpf: Make BTF finalization strict · bfc96656
      Andrii Nakryiko authored
      
      
      With valid ELF and valid BTF, there is no reason (apart from bugs) why BTF
      finalization should fail. So make it strict and return error if it fails. This
      makes CO-RE relocations more reliable, as they are not going to be just
      silently skipped if BTF finalization fails.
      
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20200708015318.3827358-2-andriin@fb.com
      bfc96656
    • selftests/bpf: test_progs avoid minus shell exit codes · b8c50df0
      Jesper Dangaard Brouer authored
      There are a number of places in test_progs that use minus-1 as the argument
      to exit(). This is confusing as a process exit status is masked to be a
      number between 0 and 255 as defined in man exit(3). Thus, users will see
      status 255 instead of minus-1.
      
      This patch uses positive exit code 3 instead of minus-1. These cases are put
      in the same group of infrastructure setup errors.
      
      Fixes: fd27b183 ("selftests/bpf: Reset process and thread affinity after each test/sub-test")
      Fixes: 811d7e37 ("bpf: selftests: Restore netns after each test")
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/159410594499.1093222.11080787853132708654.stgit@firesoul
      b8c50df0
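The masking of the exit status can be observed directly on a POSIX system. The helper below is a hypothetical illustration (`observed_exit_status()` is not part of test_progs): it terminates a child process with a given code and returns what the parent sees.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Terminate a child with `code` and return the status the parent
 * observes, or -1 if the child could not be created or reaped.
 */
int observed_exit_status(int code)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);	/* child: terminate immediately */
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);	/* only the low 8 bits survive */
}
```

Calling it with -1 yields 255 in the parent, while a small positive code such as 3 is reported unchanged.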
    • selftests/bpf: test_progs use another shell exit on non-actions · 3220fb66
      Jesper Dangaard Brouer authored
      This is a follow up adjustment to commit 6c92bd5c ("selftests/bpf:
      Test_progs indicate to shell on non-actions"), that returns shell exit
      indication EXIT_FAILURE (value 1) when user selects a non-existing test.
      
      The problem with using EXIT_FAILURE is that a shell script cannot tell
      the difference between a non-existing test and the test failing.
      
      This patch uses value 2 as shell exit indication.
      (As an aside, unrecognized option parameters use value 64.)
      
      Fixes: 6c92bd5c ("selftests/bpf: Test_progs indicate to shell on non-actions")
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/159410593992.1093222.90072558386094370.stgit@firesoul
      3220fb66
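With distinct values, a caller can tell the outcomes apart. The sketch below maps the status values named in these commit messages (1, 2, 64, plus 3 from b8c50df0) to outcomes; the enumerator and function names are hypothetical and chosen only for this illustration.

```c
#include <stdlib.h>

/* Status values from the commit messages; the names are hypothetical. */
enum {
	STATUS_NO_TEST = 2,	/* user selected a non-existing test */
	STATUS_SETUP_ERROR = 3,	/* infrastructure setup error (b8c50df0) */
	STATUS_BAD_OPTION = 64,	/* unrecognized option parameter */
};

enum outcome {
	OUTCOME_PASS,
	OUTCOME_TEST_FAILED,
	OUTCOME_NO_TEST,
	OUTCOME_SETUP_ERROR,
	OUTCOME_BAD_OPTION,
	OUTCOME_UNKNOWN,
};

/* Map an observed exit status to what it says about the run. */
enum outcome classify_status(int status)
{
	switch (status) {
	case EXIT_SUCCESS:
		return OUTCOME_PASS;
	case EXIT_FAILURE:
		return OUTCOME_TEST_FAILED;
	case STATUS_NO_TEST:
		return OUTCOME_NO_TEST;
	case STATUS_SETUP_ERROR:
		return OUTCOME_SETUP_ERROR;
	case STATUS_BAD_OPTION:
		return OUTCOME_BAD_OPTION;
	default:
		return OUTCOME_UNKNOWN;
	}
}
```

Before this change, a non-existing test and a failing test would both have landed in the `EXIT_FAILURE` case.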
    • bpf: Fix another bpftool segfault without skeleton code enabled · 625eb8e8
      Louis Peens authored
      emit_obj_refs_json needs to be added, just as was done for emit_obj_refs_plain,
      to prevent segfaults, similar to commit 8ae4121b ("bpf: Fix bpftool
      without skeleton code enabled"). See the error below:
      
          # ./bpftool -p prog
          {
              "error": "bpftool built without PID iterator support"
          },[{
                  "id": 2,
                  "type": "cgroup_skb",
                  "tag": "7be49e3934a125ba",
                  "gpl_compatible": true,
                  "loaded_at": 1594052789,
                  "uid": 0,
                  "bytes_xlated": 296,
                  "jited": true,
                  "bytes_jited": 203,
                  "bytes_memlock": 4096,
                  "map_ids": [2,3
          Segmentation fault (core dumped)
      
      The same happens for ./bpftool -p map, as well as ./bpftool -j prog/map.
      
      Fixes: d53dee3f ("tools/bpftool: Show info for processes holding BPF map/prog/link/btf FDs")
      Signed-off-by: Louis Peens <louis.peens@netronome.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Simon Horman <simon.horman@netronome.com>
      Reviewed-by: Quentin Monnet <quentin@isovalent.com>
      Link: https://lore.kernel.org/bpf/20200708110827.7673-1-louis.peens@netronome.com
      625eb8e8
  5. Jul 08, 2020
  6. Jul 07, 2020
    • mptcp: fix race in subflow_data_ready() · d47a7215
      Davide Caratti authored
      
      
      syzkaller was able to make the kernel reach subflow_data_ready() for a
      server subflow that was closed before subflow_finish_connect() completed.
      In these cases we can avoid using the path for regular/fallback MPTCP
      data, and just wake the main socket, to avoid the following warning:
      
       WARNING: CPU: 0 PID: 9370 at net/mptcp/subflow.c:885
       subflow_data_ready+0x1e6/0x290 net/mptcp/subflow.c:885
       Kernel panic - not syncing: panic_on_warn set ...
       CPU: 0 PID: 9370 Comm: syz-executor.0 Not tainted 5.7.0 #106
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
       rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
       Call Trace:
        <IRQ>
        __dump_stack lib/dump_stack.c:77 [inline]
        dump_stack+0xb7/0xfe lib/dump_stack.c:118
        panic+0x29e/0x692 kernel/panic.c:221
        __warn.cold+0x2f/0x3d kernel/panic.c:582
        report_bug+0x28b/0x2f0 lib/bug.c:195
        fixup_bug arch/x86/kernel/traps.c:105 [inline]
        fixup_bug arch/x86/kernel/traps.c:100 [inline]
        do_error_trap+0x10f/0x180 arch/x86/kernel/traps.c:197
        do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:216
        invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:1027
       RIP: 0010:subflow_data_ready+0x1e6/0x290 net/mptcp/subflow.c:885
       Code: 04 02 84 c0 74 06 0f 8e 91 00 00 00 41 0f b6 5e 48 31 ff 83 e3 18
       89 de e8 37 ec 3d fe 84 db 0f 85 65 ff ff ff e8 fa ea 3d fe <0f> 0b e9
       59 ff ff ff e8 ee ea 3d fe 48 89 ee 4c 89 ef e8 f3 77 ff
       RSP: 0018:ffff88811b2099b0 EFLAGS: 00010206
       RAX: ffff888111197000 RBX: 0000000000000000 RCX: ffffffff82fbc609
       RDX: 0000000000000100 RSI: ffffffff82fbc616 RDI: 0000000000000001
       RBP: ffff8881111bc800 R08: ffff888111197000 R09: ffffed10222a82af
       R10: ffff888111541577 R11: ffffed10222a82ae R12: 1ffff11023641336
       R13: ffff888111541000 R14: ffff88810fd4ca00 R15: ffff888111541570
        tcp_child_process+0x754/0x920 net/ipv4/tcp_minisocks.c:841
        tcp_v4_do_rcv+0x749/0x8b0 net/ipv4/tcp_ipv4.c:1642
        tcp_v4_rcv+0x2666/0x2e60 net/ipv4/tcp_ipv4.c:1999
        ip_protocol_deliver_rcu+0x29/0x1f0 net/ipv4/ip_input.c:204
        ip_local_deliver_finish net/ipv4/ip_input.c:231 [inline]
        NF_HOOK include/linux/netfilter.h:421 [inline]
        ip_local_deliver+0x2da/0x390 net/ipv4/ip_input.c:252
        dst_input include/net/dst.h:441 [inline]
        ip_rcv_finish net/ipv4/ip_input.c:428 [inline]
        ip_rcv_finish net/ipv4/ip_input.c:414 [inline]
        NF_HOOK include/linux/netfilter.h:421 [inline]
        ip_rcv+0xef/0x140 net/ipv4/ip_input.c:539
        __netif_receive_skb_one_core+0x197/0x1e0 net/core/dev.c:5268
        __netif_receive_skb+0x27/0x1c0 net/core/dev.c:5382
        process_backlog+0x1e5/0x6d0 net/core/dev.c:6226
        napi_poll net/core/dev.c:6671 [inline]
        net_rx_action+0x3e3/0xd70 net/core/dev.c:6739
        __do_softirq+0x18c/0x634 kernel/softirq.c:292
        do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1082
        </IRQ>
        do_softirq.part.0+0x26/0x30 kernel/softirq.c:337
        do_softirq arch/x86/include/asm/preempt.h:26 [inline]
        __local_bh_enable_ip+0x46/0x50 kernel/softirq.c:189
        local_bh_enable include/linux/bottom_half.h:32 [inline]
        rcu_read_unlock_bh include/linux/rcupdate.h:723 [inline]
        ip_finish_output2+0x78a/0x19c0 net/ipv4/ip_output.c:229
        __ip_finish_output+0x471/0x720 net/ipv4/ip_output.c:306
        dst_output include/net/dst.h:435 [inline]
        ip_local_out+0x181/0x1e0 net/ipv4/ip_output.c:125
        __ip_queue_xmit+0x7a1/0x14e0 net/ipv4/ip_output.c:530
        __tcp_transmit_skb+0x19dc/0x35e0 net/ipv4/tcp_output.c:1238
        __tcp_send_ack.part.0+0x3c2/0x5b0 net/ipv4/tcp_output.c:3785
        __tcp_send_ack net/ipv4/tcp_output.c:3791 [inline]
        tcp_send_ack+0x7d/0xa0 net/ipv4/tcp_output.c:3791
        tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:6040 [inline]
        tcp_rcv_state_process+0x36a4/0x49c2 net/ipv4/tcp_input.c:6209
        tcp_v4_do_rcv+0x343/0x8b0 net/ipv4/tcp_ipv4.c:1651
        sk_backlog_rcv include/net/sock.h:996 [inline]
        __release_sock+0x1ad/0x310 net/core/sock.c:2548
        release_sock+0x54/0x1a0 net/core/sock.c:3064
        inet_wait_for_connect net/ipv4/af_inet.c:594 [inline]
        __inet_stream_connect+0x57e/0xd50 net/ipv4/af_inet.c:686
        inet_stream_connect+0x53/0xa0 net/ipv4/af_inet.c:725
        mptcp_stream_connect+0x171/0x5f0 net/mptcp/protocol.c:1920
        __sys_connect_file net/socket.c:1854 [inline]
        __sys_connect+0x267/0x2f0 net/socket.c:1871
        __do_sys_connect net/socket.c:1882 [inline]
        __se_sys_connect net/socket.c:1879 [inline]
        __x64_sys_connect+0x6f/0xb0 net/socket.c:1879
        do_syscall_64+0xb7/0x3d0 arch/x86/entry/common.c:295
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
       RIP: 0033:0x7fb577d06469
       Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89
       f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
       f0 ff ff 73 01 c3 48 8b 0d ff 49 2b 00 f7 d8 64 89 01 48
       RSP: 002b:00007fb5783d5dd8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
       RAX: ffffffffffffffda RBX: 000000000068bfa0 RCX: 00007fb577d06469
       RDX: 000000000000004d RSI: 0000000020000040 RDI: 0000000000000003
       RBP: 00000000ffffffff R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
       R13: 000000000041427c R14: 00007fb5783d65c0 R15: 0000000000000003
      
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/39
      Reported-by: Christoph Paasch <cpaasch@apple.com>
      Fixes: e1ff9e82 ("net: mptcp: improve fallback to TCP")
      Suggested-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d47a7215
    • Replace HTTP links with HTTPS ones: IPv* · 7a6498eb
      Alexander A. Klimov authored
      
      
      Rationale:
      Reduces attack surface on kernel devs opening the links for MITM
      as HTTPS traffic is much harder to manipulate.
      
      Deterministic algorithm:
      For each file:
        If not .svg:
          For each line:
            If doesn't contain `\bxmlns\b`:
              For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
                If both the HTTP and HTTPS versions
                return 200 OK and serve the same content:
                  Replace HTTP with HTTPS.
      
      Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7a6498eb
    • Merge branch 'qed-warnings-cleanup' · 1eafa736
      David S. Miller authored
      
      
      Alexander Lobakin says:
      
      ====================
      net: qed/qede: W=1 C=1 warnings cleanup
      
      This set cleans qed/qede build log under W=1 C=1 with GCC 8 and
      sparse 0.6.2. The only thing left is "context imbalance -- unexpected
      unlock" in one of the source files, which will be issued later during
      the refactoring cycles.
      
      The biggest part is handling the endianness warnings. The current code
      often just assumes that both host and device operate in LE, which is
      obviously incorrect (despite the fact that it's true for x86 platforms),
      and makes sparse {s,m}ad.
      
      The rest of the series is mostly random non-functional fixes
      here-and-there.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1eafa736
    • net: qede: fix BE vs CPU comparison · fd081662
      Alexander Lobakin authored
      
      
      Flow Dissector's keys are mostly Network / Big Endian. U{16,32}_MAX are
      the same in either byte order, but let's make sparse happy by
      wrapping them in no-op conversions.
      
      Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
      Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fd081662
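The claim that the all-ones constants look the same in either byte order is easy to check in user space. The helpers below are hypothetical and use the standard `htons()`/`htonl()` conversions rather than the kernel's own byte-order macros.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Returns 1 if a 16-bit value is identical in host and network order. */
int same_in_both_orders16(uint16_t v)
{
	return htons(v) == v;
}

/* Returns 1 if a 32-bit value is identical in host and network order. */
int same_in_both_orders32(uint32_t v)
{
	return htonl(v) == v;
}
```

Both helpers return 1 for `UINT16_MAX` and `UINT32_MAX` on any host, because swapping bytes that are all 0xFF changes nothing; the conversion only matters to a type checker that tracks which byte order a value is declared to be in.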