  1. May 22, 2020
      i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL · 3b4f0b66
      Björn Töpel authored
      
      
      Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL
      APIs. The AF_XDP zero-copy rx_bi ring is now simply a struct xdp_buff
      pointer.
      
      v4->v5: Fixed "warning: Excess function parameter 'bi' description in
              'i40e_construct_skb_zc'". (Jakub)
      
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Cc: intel-wired-lan@lists.osuosl.org
      Link: https://lore.kernel.org/bpf/20200520192103.355233-9-bjorn.topel@gmail.com
      3b4f0b66
      i40e: Separate kernel allocated rx_bi rings from AF_XDP rings · be1222b5
      Björn Töpel authored
      
      
      Continuing the path to support MEM_TYPE_XSK_BUFF_POOL, the AF_XDP
      zero-copy/sk_buff rx_bi rings are now separate. Functions to properly
      allocate the different rings are added as well.
      
      v3->v4: Made i40e_fd_handle_status() static. (kbuild test robot)
      v4->v5: Fix kdoc for i40e_clean_programming_status(). (Jakub)
      
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Cc: intel-wired-lan@lists.osuosl.org
      Link: https://lore.kernel.org/bpf/20200520192103.355233-8-bjorn.topel@gmail.com
      be1222b5
      i40e: Refactor rx_bi accesses · e1675f97
      Björn Töpel authored
      
      
      As a first step to migrate i40e to the new MEM_TYPE_XSK_BUFF_POOL
      APIs, code that accesses the rx_bi (SW/shadow ring) is refactored to
      use an accessor function.
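
The accessor idea can be sketched in user-space C as follows. This is a minimal illustration only: the toy_* types and functions are hypothetical stand-ins, not the real i40e structures or the accessor the patch adds.

```c
#include <stddef.h>

/* Toy stand-ins for the driver structures; not the real i40e types. */
struct toy_rx_buffer {
	void *page;
	unsigned int page_offset;
};

struct toy_ring {
	struct toy_rx_buffer *rx_bi;	/* software (shadow) ring */
	unsigned int count;		/* number of entries in the ring */
};

/* The single place that knows how rx_bi is laid out. */
static struct toy_rx_buffer *toy_rx_bi(struct toy_ring *ring, unsigned int idx)
{
	return &ring->rx_bi[idx];
}

/* Callers go through the accessor instead of indexing rx_bi directly. */
static unsigned int toy_next_offset(struct toy_ring *ring, unsigned int idx)
{
	return toy_rx_bi(ring, idx)->page_offset;
}
```

With every access funnelled through one function, a later change to the ring's element type only has to touch the accessor and its direct users.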
      
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Cc: intel-wired-lan@lists.osuosl.org
      Link: https://lore.kernel.org/bpf/20200520192103.355233-7-bjorn.topel@gmail.com
      e1675f97
      xsk: Introduce AF_XDP buffer allocation API · 2b43470a
      Björn Töpel authored
      
      
      In order to simplify AF_XDP zero-copy enablement for NIC driver
      developers, a new AF_XDP buffer allocation API is added. The
      implementation is based on a single core (single producer/consumer)
      buffer pool for the AF_XDP UMEM.
      
      A buffer is allocated using the xsk_buff_alloc() function, and
      returned using xsk_buff_free(). If a buffer is disassociated from the
      pool, e.g. when it is passed to an AF_XDP socket, the buffer is
      said to be released. Currently, the release function is only used by
      the AF_XDP internals and is not visible to the driver.
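
The alloc/free pairing on a single-core pool can be pictured with a small user-space model. This is a sketch under stated assumptions: toy_pool, toy_buff_alloc() and toy_buff_free() are hypothetical names and do not reproduce the signatures or internals of xsk_buff_alloc()/xsk_buff_free().

```c
#include <stddef.h>

#define TOY_POOL_SIZE 4

struct toy_buff {
	unsigned char data[64];
};

/* Single producer/consumer free list: no locking, one core only. */
struct toy_pool {
	struct toy_buff bufs[TOY_POOL_SIZE];
	struct toy_buff *free_list[TOY_POOL_SIZE];
	size_t free_cnt;
};

static void toy_pool_init(struct toy_pool *pool)
{
	size_t i;

	for (i = 0; i < TOY_POOL_SIZE; i++)
		pool->free_list[i] = &pool->bufs[i];
	pool->free_cnt = TOY_POOL_SIZE;
}

/* Take a buffer from the pool; NULL when the pool is exhausted. */
static struct toy_buff *toy_buff_alloc(struct toy_pool *pool)
{
	if (pool->free_cnt == 0)
		return NULL;
	return pool->free_list[--pool->free_cnt];
}

/* Hand a buffer back so that a later allocation can reuse it. */
static void toy_buff_free(struct toy_pool *pool, struct toy_buff *buf)
{
	pool->free_list[pool->free_cnt++] = buf;
}
```

Because only one core touches the free list, the model needs no atomics or locks, which is the simplification the single producer/consumer design buys.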
      
      Drivers using this API should register the XDP memory model with the
      new MEM_TYPE_XSK_BUFF_POOL type.
      
      The API is defined in net/xdp_sock_drv.h.
      
      The buffer type is struct xdp_buff, and follows the lifetime of
      regular xdp_buffs, i.e. the lifetime of an xdp_buff is restricted to
      a NAPI context. In other words, the API is not replacing xdp_frames.
      
      In addition to introducing the API and implementations, the AF_XDP
      core is migrated to use the new APIs.
      
      rfc->v1: Fixed build errors/warnings for m68k and riscv. (kbuild test
               robot)
               Added headroom/chunk size getter. (Maxim/Björn)
      
      v1->v2: Swapped SoBs. (Maxim)
      
      v2->v3: Initialize struct xdp_buff member frame_sz. (Björn)
              Add API to query the DMA address of a frame. (Maxim)
              Do DMA sync for CPU till the end of the frame to handle
              possible growth (frame_sz). (Maxim)
      
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200520192103.355233-6-bjorn.topel@gmail.com
      2b43470a
      xsk: Move defines only used by AF_XDP internals to xsk.h · 89e4a376
      Björn Töpel authored
      
      
      Move the XSK_NEXT_PG_CONTIG_{MASK,SHIFT} and
      XDP_UMEM_USES_NEED_WAKEUP defines from xdp_sock.h to the AF_XDP
      internal xsk.h file. Also, start using the BIT{,_ULL} macros instead of
      explicit shifts.
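
For readers unfamiliar with the BIT{,_ULL} helpers, the following user-space sketch shows the equivalence with explicit shifts. The BIT()/BIT_ULL() definitions here are local stand-ins for the kernel macros, and the TOY_* flags are hypothetical, chosen only for illustration.

```c
#include <stdint.h>

/* User-space stand-ins for the kernel's BIT() and BIT_ULL() helpers. */
#define BIT(nr)		(1UL << (nr))
#define BIT_ULL(nr)	(1ULL << (nr))

/* Hypothetical flags, written both ways for comparison. */
#define TOY_FLAG_SHIFTED	(1 << 1)
#define TOY_FLAG_BIT		BIT(1)

#define TOY_MASK64_SHIFTED	(1ULL << 40)
#define TOY_MASK64_BIT		BIT_ULL(40)

/* Non-zero if flag is set in flags. */
static int toy_flag_is_set(uint64_t flags, uint64_t flag)
{
	return (flags & flag) != 0;
}
```

The macro form makes the intended width explicit, which matters for bits above 31 where a plain `1 << n` on an int would overflow.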
      
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200520192103.355233-5-bjorn.topel@gmail.com
      89e4a376
      xsk: Move driver interface to xdp_sock_drv.h · a71506a4
      Magnus Karlsson authored
      
      
      Move the AF_XDP zero-copy driver interface to its own include file
      called xdp_sock_drv.h. This should make it clearer to NIC driver
      implementors which functions to use for zero-copy support.

      v4->v5: Fix -Wmissing-prototypes by including header file. (Jakub)
      
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200520192103.355233-4-bjorn.topel@gmail.com
      a71506a4
      xsk: Move xskmap.c to net/xdp/ · d20a1676
      Björn Töpel authored
      
      
      The XSKMAP is partly implemented by net/xdp/xsk.c. Move xskmap.c from
      kernel/bpf/ to net/xdp/, which is the logical place for AF_XDP related
      code. Also, move AF_XDP struct definitions, and function declarations
      only used by AF_XDP internals into net/xdp/xsk.h.
      
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200520192103.355233-3-bjorn.topel@gmail.com
      d20a1676
      xsk: Fix xsk_umem_xdp_frame_sz() · 44ac082b
      Björn Töpel authored
      When calculating the "data_hard_end" for an XDP buffer coming from
      AF_XDP zero-copy mode, the return value of xsk_umem_xdp_frame_sz() is
      added to "data_hard_start".

      Currently, the chunk size of the UMEM is returned by
      xsk_umem_xdp_frame_sz(). This is not correct if the fixed UMEM
      headroom is non-zero. Fix this by returning the chunk_size without the
      UMEM headroom.
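
The arithmetic behind the fix can be sketched with a toy model. The struct and function names below are hypothetical, and the model assumes that "data_hard_start" sits directly after the fixed UMEM headroom within a chunk.

```c
#include <stdint.h>

/* Toy model of a UMEM; field names are illustrative only. */
struct toy_umem {
	uint32_t chunk_size;	/* size of one UMEM chunk */
	uint32_t headroom;	/* fixed headroom reserved at the chunk start */
};

/* Before the fix: the whole chunk is reported as the frame size. */
static uint32_t toy_frame_sz_old(const struct toy_umem *umem)
{
	return umem->chunk_size;
}

/* After the fix: the headroom is not counted as part of the frame. */
static uint32_t toy_frame_sz_fixed(const struct toy_umem *umem)
{
	return umem->chunk_size - umem->headroom;
}

/* data_hard_end as an offset from the chunk start, assuming that
 * data_hard_start sits right after the fixed headroom. */
static uint32_t toy_hard_end_offset(const struct toy_umem *umem,
				    uint32_t frame_sz)
{
	return umem->headroom + frame_sz;
}
```

With a 2048-byte chunk and 256 bytes of headroom, the old value places the end at offset 2304, past the chunk, while the fixed value lands exactly on 2048.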
      
      Fixes: 2a637c5b ("xdp: For Intel AF_XDP drivers add XDP frame_sz")
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200520192103.355233-2-bjorn.topel@gmail.com
      44ac082b
  2. May 20, 2020
      selftests/bpf: Convert bpf_iter_test_kern{3, 4}.c to define own bpf_iter_meta · dda18a5c
      Andrii Nakryiko authored
      b9f4c01f ("selftest/bpf: Make bpf_iter selftest compilable against old vmlinux.h")
      missed the fact that bpf_iter_test_kern{3,4}.c are not just including
      bpf_iter_test_kern_common.h and need similar bpf_iter_meta re-definition
      explicitly.
      
      Fixes: b9f4c01f ("selftest/bpf: Make bpf_iter selftest compilable against old vmlinux.h")
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200519192341.134360-1-andriin@fb.com
      dda18a5c
      selftest/bpf: Make bpf_iter selftest compilable against old vmlinux.h · b9f4c01f
      Andrii Nakryiko authored
      
      
      It's good to be able to compile bpf_iter selftest even on systems that don't
      have the very latest vmlinux.h, e.g., for libbpf tests against older kernels in
      Travis CI. To that end, re-define bpf_iter_meta and corresponding bpf_iter
      context structs in each selftest. To avoid type clashes with vmlinux.h, rename
      vmlinux.h's definitions to get them out of the way.
      
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Yonghong Song <yhs@fb.com>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Link: https://lore.kernel.org/bpf/20200518234516.3915052-1-andriin@fb.com
      b9f4c01f
      tools/bpf: sync bpf.h · fb53d3b6
      Alexei Starovoitov authored
      
      
      Sync tools/include/uapi/linux/bpf.h from include/uapi.
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      fb53d3b6
      Merge branch 'getpeername' · 0e5633ac
      Alexei Starovoitov authored
      
      
      Daniel Borkmann says:
      
      ====================
      Trivial patch to add get{peer,sock}name cgroup attach types to the BPF
      sock_addr programs in order to enable rewriting sockaddr structs from
      both calls along with libbpf and bpftool support as well as selftests.
      
      Thanks!
      
      v1 -> v2:
        - use __u16 for ports in start_server_with_port() signature and in
          expected_{local,peer} ports in the test case (Andrey)
        - Added both Andrii's and Andrey's ACKs
      ====================
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      0e5633ac
      bpf, testing: Add get{peer, sock}name selftests to test_progs · 566fc3f5
      Daniel Borkmann authored
      
      
      Extend the existing connect_force_port test to assert get{peer,sock}name programs
      as well. The workflow for e.g. IPv4 is as follows: i) server binds to a concrete
      port, ii) client calls getsockname() on the server fd, which exposes 1.2.3.4:60000
      to the client, iii) client connects to service address 1.2.3.4:60000, binds to a
      concrete local address (127.0.0.1:22222) and remaps the service address to a
      concrete backend address (127.0.0.1:60123), iv) client then calls getsockname()
      on its own fd to verify the local address (127.0.0.1:22222) and getpeername() on
      its own fd, which then publishes the service address (1.2.3.4:60000) instead of
      the actual backend. The same workflow is done for IPv6, just with different
      address/port tuples.
      
        # ./test_progs -t connect_force_port
        #14 connect_force_port:OK
        Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
      
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Andrey Ignatov <rdna@fb.com>
      Link: https://lore.kernel.org/bpf/3343da6ad08df81af715a95d61a84fb4a960f2bf.1589841594.git.daniel@iogearbox.net
      566fc3f5
      bpf, bpftool: Enable get{peer, sock}name attach types · 05ee19c1
      Daniel Borkmann authored
      
      
      Make bpftool aware and add the new get{peer,sock}name attach types to its
      cli, documentation and bash completion to allow attachment/detachment of
      sock_addr programs there.
      
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Andrey Ignatov <rdna@fb.com>
      Link: https://lore.kernel.org/bpf/9765b3d03e4c29210c4df56a9cc7e52f5f7bb5ef.1589841594.git.daniel@iogearbox.net
      05ee19c1
      bpf, libbpf: Enable get{peer, sock}name attach types · f15ed018
      Daniel Borkmann authored
      
      
      Trivial patch to add the new get{peer,sock}name attach types to the section
      definitions in order to hook them up to sock_addr cgroup program type.
      
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Andrey Ignatov <rdna@fb.com>
      Link: https://lore.kernel.org/bpf/7fcd4b1e41a8ebb364754a5975c75a7795051bd2.1589841594.git.daniel@iogearbox.net
      f15ed018
      bpf: Add get{peer, sock}name attach types for sock_addr · 1b66d253
      Daniel Borkmann authored
      As stated in 983695fa ("bpf: fix unconnected udp hooks"), the objective
      for the existing cgroup connect/sendmsg/recvmsg/bind BPF hooks is to be
      transparent to applications. In Cilium we make use of these hooks [0] in
      order to enable E-W load balancing for existing Kubernetes service types
      for all Cilium managed nodes in the cluster. Those backends can be local
      or remote. The main advantage of this approach is that it operates as close
      as possible to the socket, and therefore allows to avoid packet-based NAT
      given in connect/sendmsg/recvmsg hooks we only need to xlate sock addresses.
      
      This also allows to expose NodePort services on loopback addresses in the
      host namespace, for example. As another advantage, this also efficiently
      blocks bind requests for applications in the host namespace for exposed
      ports. However, one missing item is that we also need to perform reverse
      xlation for inet{,6}_getname() hooks such that we can return the service
      IP/port tuple back to the application instead of the remote peer address.
      
      The vast majority of applications do not care about getpeername(), but
      on a few occasions we've seen breakage when validating the peer's address,
      since it unexpectedly returns the backend tuple instead of the service one.
      Therefore, this trivial patch adds a getpeername() as well as a
      getsockname() BPF cgroup hook for both IPv4 and IPv6, so that the returned
      address can be customised to address this situation.
      
      Simple example:
      
        # ./cilium/cilium service list
        ID   Frontend     Service Type   Backend
        1    1.2.3.4:80   ClusterIP      1 => 10.0.0.10:80
      
      Before; curl's verbose output example, no getpeername() reverse xlation:
      
        # curl --verbose 1.2.3.4
        * Rebuilt URL to: 1.2.3.4/
        *   Trying 1.2.3.4...
        * TCP_NODELAY set
        * Connected to 1.2.3.4 (10.0.0.10) port 80 (#0)
        > GET / HTTP/1.1
        > Host: 1.2.3.4
        > User-Agent: curl/7.58.0
        > Accept: */*
        [...]
      
      After; with getpeername() reverse xlation:
      
        # curl --verbose 1.2.3.4
        * Rebuilt URL to: 1.2.3.4/
        *   Trying 1.2.3.4...
        * TCP_NODELAY set
        * Connected to 1.2.3.4 (1.2.3.4) port 80 (#0)
        > GET / HTTP/1.1
        >  Host: 1.2.3.4
        > User-Agent: curl/7.58.0
        > Accept: */*
        [...]
      
      Originally, I had both under a BPF_CGROUP_INET{4,6}_GETNAME type and exposed
      peer to the context similar as in inet{,6}_getname() fashion, but API-wise
      this is suboptimal as it always enforces programs having to test for ctx->peer
      which can easily be missed, hence BPF_CGROUP_INET{4,6}_GET{PEER,SOCK}NAME split.
      Similarly, the checked return code is on tnum_range(1, 1), but if a use case
      comes up in future, it can easily be changed to return an error code instead.
      Helper and ctx member access is the same as with connect/sendmsg/etc hooks.
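
The reverse translation described above can be modelled in plain user-space C. This is only a conceptual sketch: the toy_* types and the table lookup are hypothetical and say nothing about how Cilium or the BPF hooks store their state.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy address tuple (host byte order for readability). */
struct toy_tuple {
	uint32_t ip;
	uint16_t port;
};

/* One service-to-backend mapping, as a connect() hook might record it. */
struct toy_xlate {
	struct toy_tuple service;
	struct toy_tuple backend;
};

#define TOY_IP(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Reverse translation: if the peer is a known backend, report the
 * service tuple instead; otherwise leave the address untouched. */
static struct toy_tuple toy_getpeername_xlate(const struct toy_xlate *tbl,
					      size_t n, struct toy_tuple peer)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tbl[i].backend.ip == peer.ip &&
		    tbl[i].backend.port == peer.port)
			return tbl[i].service;
	}
	return peer;
}
```

In the curl example, the table would hold one entry mapping service 1.2.3.4:80 to backend 10.0.0.10:80, so a lookup for the backend tuple yields the service tuple that the application originally connected to.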
      
        [0] https://github.com/cilium/cilium/blob/master/bpf/bpf_sock.c
      
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Andrey Ignatov <rdna@fb.com>
      Link: https://lore.kernel.org/bpf/61a479d759b2482ae3efb45546490bacd796a220.1589841594.git.daniel@iogearbox.net
      1b66d253
  3. May 19, 2020
  4. May 16, 2020
      bpf: Selftests, add ktls tests to test_sockmap · 96586dd9
      John Fastabend authored
      
      
      Until now we have only had minimal ktls+sockmap testing when being
      used with helpers and different sendmsg/sendpage patterns. Add a
      pass with ktls here.
      
      To run just ktls tests,
      
       $ ./test_sockmap --whitelist="ktls"
      
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
      Link: https://lore.kernel.org/bpf/158939736278.15176.5435314315563203761.stgit@john-Precision-5820-Tower
      96586dd9
      bpf: Selftests, add blacklist to test_sockmap · a7238f7c
      John Fastabend authored
      
      
      This adds a blacklist to test_sockmap. For example, now we can run
      all apply and cork tests except those with timeouts by doing,
      
       $ ./test_sockmap --whitelist "apply,cork" --blacklist "hang"
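
How a whitelist and a blacklist might combine can be sketched as below. This is an assumption-laden illustration: the toy_* helpers are hypothetical, and substring matching against the test name is a guess at the semantics, not a copy of the test_sockmap code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if any comma-separated entry of list occurs as a substring of name. */
static bool toy_list_matches(const char *list, const char *name)
{
	char entry[64];

	while (list && *list) {
		const char *comma = strchr(list, ',');
		size_t len = comma ? (size_t)(comma - list) : strlen(list);

		if (len > 0 && len < sizeof(entry)) {
			memcpy(entry, list, len);
			entry[len] = '\0';
			if (strstr(name, entry))
				return true;
		}
		list = comma ? comma + 1 : NULL;
	}
	return false;
}

/* Run a test if it passes the whitelist (when one is given) and is
 * not matched by the blacklist. */
static bool toy_should_run(const char *whitelist, const char *blacklist,
			   const char *name)
{
	if (whitelist && !toy_list_matches(whitelist, name))
		return false;
	return !toy_list_matches(blacklist, name);
}
```

Under these assumptions, a test whose name contains both "cork" and "hang" is selected by the whitelist but then excluded by the blacklist.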
      
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
      Link: https://lore.kernel.org/bpf/158939734350.15176.6643981099665208826.stgit@john-Precision-5820-Tower
      a7238f7c
      bpf: Selftests, add whitelist option to test_sockmap · 065a74cb
      John Fastabend authored
      
      
      Allow running specific tests with a comma-delimited whitelist. For example,
      to run all apply and cork tests:
      
       $ ./test_sockmap --whitelist="cork,apply"
      
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
      Link: https://lore.kernel.org/bpf/158939732464.15176.1959113294944564542.stgit@john-Precision-5820-Tower
      065a74cb
      bpf: Selftests, provide verbose option for selftests execution · b98ca90c
      John Fastabend authored
      
      
      Pass options from command line args into individual tests, which allows us
      to use the verbose option from the command line with selftests. Now, when
      the verbose option is set, individual subtest details will be printed. Also,
      we can consolidate cgroup bring up and tear down.

      Additionally, just setting verbose is very noisy, so introduce verbose=1
      and verbose=2. Really, verbose=2 is only useful when developing tests
      or debugging some specific issue.
      
      For example now we get output like this with --verbose,
      
      #20/17 sockhash:txmsg test pull-data:OK
       [TEST 160]: (512, 1, 3, sendpage, pop (1,3),): msg_loop_rx: iov_count 1 iov_buf 1 cnt 512 err 0
       [TEST 161]: (100, 1, 5, sendpage, pop (1,3),): msg_loop_rx: iov_count 1 iov_buf 3 cnt 100 err 0
       [TEST 162]: (2, 1024, 256, sendpage, pop (4096,8192),): msg_loop_rx: iov_count 1 iov_buf 255 cnt 2 err 0
       [TEST 163]: (512, 1, 3, sendpage, redir,pop (1,3),): msg_loop_rx: iov_count 1 iov_buf 1 cnt 512 err 0
       [TEST 164]: (100, 1, 5, sendpage, redir,pop (1,3),): msg_loop_rx: iov_count 1 iov_buf 3 cnt 100 err 0
       [TEST 165]: (512, 1, 3, sendpage, cork 512,pop (1,3),): msg_loop_rx: iov_count 1 iov_buf 1 cnt 512 err 0
       [TEST 166]: (100, 1, 5, sendpage, cork 512,pop (1,3),): msg_loop_rx: iov_count 1 iov_buf 3 cnt 100 err 0
       [TEST 167]: (512, 1, 3, sendpage, redir,cork 4,pop (1,3),): msg_loop_rx: iov_count 1 iov_buf 1 cnt 512 err 0
       [TEST 168]: (100, 1, 5, sendpage, redir,cork 4,pop (1,3),): msg_loop_rx: iov_count 1 iov_buf 3 cnt 100 err 0
      
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
      Link: https://lore.kernel.org/bpf/158939730412.15176.1975675235035143367.stgit@john-Precision-5820-Tower
      b98ca90c
      bpf: Selftests, break down test_sockmap into subtests · 328aa08a
      John Fastabend authored
      
      
      At the moment test_sockmap runs all 800+ tests ungrouped, which is not
      ideal because it makes it hard to see what is failing but also, more
      importantly, it's hard to confirm all cases are tested. Additionally,
      after inspecting we noticed the runtime is bloated because we run
      many duplicate tests. Worse, some of these tests are known error cases
      that wait for the recvmsg handler to time out, which creates long delays.
      Also, we noted some tests were not clearing their options and, as a
      result, the following tests would run with extra and incorrect options.

      Fix this by reorganizing test code so it's clear what tests are running
      and when. Then it becomes easy to remove duplication and run tests with
      only the set of send/recv patterns that are relevant.
      
      To accomplish this break test_sockmap into subtests and remove
      unnecessary duplication. The output is more readable now and
      the runtime reduced.
      
      Now default output prints subtests like this,
      
       $ ./test_sockmap
       # 1/ 6  sockmap:txmsg test passthrough:OK
       ...
       #22/ 1 sockhash:txmsg test push/pop data:OK
       Pass: 22 Fail: 0
      
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
      Link: https://lore.kernel.org/bpf/158939728384.15176.13601520183665880762.stgit@john-Precision-5820-Tower
      328aa08a
      bpf: Selftests, improve test_sockmap total bytes counter · 18d4e900
      John Fastabend authored
      
      
      The recv thread in test_sockmap waits to receive all bytes from the sender,
      but when pop data is used it may wait for more bytes than are actually
      sent. This stalls the test harness for multiple seconds. Because this
      happens in multiple tests, it slows down the selftest run.
      
      Fix by doing a better job of accounting for total bytes when pop helpers
      are used.
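
The accounting problem can be illustrated with a small calculation. This is a simplified sketch: toy_expected_rx_bytes() is a hypothetical helper, and it assumes that every one of the cnt sends has exactly pop bytes removed before delivery, which may not match the test's real bookkeeping.

```c
#include <stddef.h>

/* Bytes the receiver should wait for, assuming each of the cnt sends
 * has pop bytes removed before delivery (a simplification). */
static size_t toy_expected_rx_bytes(size_t iov_count, size_t iov_length,
				    size_t cnt, size_t pop)
{
	size_t sent = iov_count * iov_length * cnt;
	size_t popped = pop * cnt;

	return popped > sent ? 0 : sent - popped;
}
```

A receiver that waits for the full sent total when bytes were popped would block until its timeout, which is the stall the commit describes.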
      
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
      Link: https://lore.kernel.org/bpf/158939726542.15176.5964532245173539540.stgit@john-Precision-5820-Tower
      18d4e900
      bpf: Selftests, print error in test_sockmap error cases · 248aba1d
      John Fastabend authored
      
      
      It's helpful to know the error value if an error occurs.
      
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
      Link: https://lore.kernel.org/bpf/158939724566.15176.12079885932643225626.stgit@john-Precision-5820-Tower
      248aba1d
      bpf: Selftests, sockmap test prog run without setting cgroup · 13a5f3ff
      John Fastabend authored
      
      
      Running test_sockmap with arguments to specify a test pattern requires
      including a cgroup argument. Instead of requiring this, create a cgroup
      if the option is not provided.
      
      This is not used by selftest runs but I use it when I want to test a
      specific test. Most useful when developing new code and/or tests.
      
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
      Link: https://lore.kernel.org/bpf/158939722675.15176.6294210959489131688.stgit@john-Precision-5820-Tower
      13a5f3ff
      bpf: Selftests, remove prints from sockmap tests · d79a3212
      John Fastabend authored
      
      
      The prints in the test_sockmap programs were only useful when we
      didn't have enough control over test infrastructure to know from
      user program what was being pushed into kernel side.
      
      Now that we have, or will shortly have, better test controls, let's
      remove the prints. This means we can remove half the programs
      and clean up the bpf side.
      
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
      Link: https://lore.kernel.org/bpf/158939720756.15176.9806965887313279429.stgit@john-Precision-5820-Tower
      d79a3212
      bpf: Selftests, move sockmap bpf prog header into progs · 991e35ee
      John Fastabend authored
      
      
      Moves test_sockmap_kern.h into progs directory but does not change
      code at all.
      
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
      Link: https://lore.kernel.org/bpf/158939718921.15176.5766299102332077086.stgit@john-Precision-5820-Tower
      991e35ee
      selftests/bpf: Move test_align under test_progs · 3b09d27c
      Stanislav Fomichev authored
      
      
      There is a much higher chance we can see the regressions if the
      test is part of test_progs.
      
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20200515194904.229296-2-sdf@google.com
      3b09d27c
      selftests/bpf: Fix test_align verifier log patterns · 5366d226
      Stanislav Fomichev authored
      Commit 294f2fc6 ("bpf: Verifer, adjust_scalar_min_max_vals to always
      call update_reg_bounds()") changed the way the verifier logs some of its
      state; adjust test_align accordingly. Where possible, I tried not to copy-paste
      the entire log line and resorted to dropping the last closing brace instead.
      
      Fixes: 294f2fc6 ("bpf: Verifer, adjust_scalar_min_max_vals to always call update_reg_bounds()")
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20200515194904.229296-1-sdf@google.com
      5366d226
      libbpf, hashmap: Fix signedness warnings · 8d35d74f
      Ian Rogers authored
      
      
      Fixes the following warnings:
      
        hashmap.c: In function ‘hashmap__clear’:
        hashmap.h:150:20: error: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare]
          150 |  for (bkt = 0; bkt < map->cap; bkt++)        \
      
        hashmap.c: In function ‘hashmap_grow’:
        hashmap.h:150:20: error: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare]
          150 |  for (bkt = 0; bkt < map->cap; bkt++)        \
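
One common way to avoid this class of warning is to give the loop index the same unsigned type as the bound. The sketch below uses hypothetical toy_* names and is not the libbpf hashmap code; whether the commit used exactly this approach is not stated in the message.

```c
#include <stddef.h>

struct toy_map {
	size_t cap;		/* number of buckets */
	int *buckets;
};

/* Comparing an int index with a size_t bound is what -Wsign-compare
 * flags; a size_t index keeps both sides of the comparison unsigned. */
static size_t toy_count_used(const struct toy_map *map)
{
	size_t bkt, used = 0;

	for (bkt = 0; bkt < map->cap; bkt++)
		if (map->buckets[bkt])
			used++;
	return used;
}
```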
      
      Signed-off-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/20200515165007.217120-4-irogers@google.com
      8d35d74f
      libbpf, hashmap: Remove unused #include · f516acd5
      Ian Rogers authored
      
      
      Remove #include of libbpf_internal.h that is unused.
      
      Discussed in this thread:
      https://lore.kernel.org/lkml/CAEf4BzZRmiEds_8R8g4vaAeWvJzPb4xYLnpF0X2VNY8oTzkphQ@mail.gmail.com/
      
      Signed-off-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/20200515165007.217120-3-irogers@google.com
      f516acd5
      bpf: Fix check_return_code to only allow [0,1] in trace_iter progs · 2ec0616e
      Daniel Borkmann authored
      As per 15d83c4d ("bpf: Allow loading of a bpf_iter program") we only
      allow a range of [0,1] for return codes. Therefore BPF_TRACE_ITER relies
      on the default tnum_range(0, 1) which is set in the range var. On the recent
      merge of net into net-next, commit e92888c7 ("bpf: Enforce returning 0 for
      fentry/fexit progs") got pulled in and caused a merge conflict with the
      changes from 15d83c4d. The resolution had a small hiccup in that it
      removed the [0,1] range restriction again so that BPF_TRACE_ITER would
      have no enforcement. Fix it by adding it back.
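
What a [0,1] return-code range means can be shown with a user-space model of a tracked number (a value plus a mask of unknown bits). This is a sketch loosely modelled on the verifier's tnum concept; all toy_* names are hypothetical and the code is not the kernel implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Tracked number: bits set in mask are unknown, the rest come from value. */
struct toy_tnum {
	uint64_t value;
	uint64_t mask;
};

/* Position of the highest set bit, counting from 1; 0 when x == 0. */
static unsigned int toy_fls64(uint64_t x)
{
	unsigned int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/* Smallest tracked number covering every value in [min, max]. */
static struct toy_tnum toy_tnum_range(uint64_t min, uint64_t max)
{
	uint64_t chi = min ^ max, delta;
	unsigned int bits = toy_fls64(chi);
	struct toy_tnum t;

	if (bits > 63) {		/* 1ULL << 64 would be undefined */
		t.value = 0;
		t.mask = UINT64_MAX;
		return t;
	}
	delta = (1ULL << bits) - 1;
	t.value = min & ~delta;
	t.mask = delta;
	return t;
}

/* True if the constant ret is representable by range. */
static bool toy_ret_allowed(struct toy_tnum range, uint64_t ret)
{
	return (ret & ~range.mask) == range.value;
}
```

In this model a range of (0, 1) accepts return codes 0 and 1 and rejects 2, while a range of (0, 0) accepts only 0.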
      
      Fixes: da07f52d ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      2ec0616e
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · da07f52d
      David S. Miller authored
      
      
      Move the bpf verifier trace check into the new switch statement in
      HEAD.
      
      Resolve the overlapping changes in hinic, where bug fixes overlap
      the addition of VF support.
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      da07f52d
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · f85c1598
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix sk_psock reference count leak on receive, from Xiyu Yang.
      
       2) CONFIG_HNS should be invisible, from Geert Uytterhoeven.
      
       3) Don't allow locking route MTUs in ipv6, RFCs actually forbid this,
          from Maciej Żenczykowski.
      
       4) ipv4 route redirect backoff wasn't actually enforced, from Paolo
          Abeni.
      
       5) Fix netprio cgroup v2 leak, from Zefan Li.
      
       6) Fix infinite loop on rmmod in conntrack, from Florian Westphal.
      
       7) Fix tcp SO_RCVLOWAT hangs, from Eric Dumazet.
      
       8) Various bpf probe handling fixes, from Daniel Borkmann.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (68 commits)
        selftests: mptcp: pm: rm the right tmp file
        dpaa2-eth: properly handle buffer size restrictions
        bpf: Restrict bpf_trace_printk()'s %s usage and add %pks, %pus specifier
        bpf: Add bpf_probe_read_{user, kernel}_str() to do_refine_retval_range
        bpf: Restrict bpf_probe_read{, str}() only to archs where they work
        MAINTAINERS: Mark networking drivers as Maintained.
        ipmr: Add lockdep expression to ipmr_for_each_table macro
        ipmr: Fix RCU list debugging warning
        drivers: net: hamradio: Fix suspicious RCU usage warning in bpqether.c
        net: phy: broadcom: fix BCM54XX_SHD_SCR3_TRDDAPD value for BCM54810
        tcp: fix error recovery in tcp_zerocopy_receive()
        MAINTAINERS: Add Jakub to networking drivers.
        MAINTAINERS: another add of Karsten Graul for S390 networking
        drivers: ipa: fix typos for ipa_smp2p structure doc
        pppoe: only process PADT targeted at local interfaces
        selftests/bpf: Enforce returning 0 for fentry/fexit programs
        bpf: Enforce returning 0 for fentry/fexit progs
        net: stmmac: fix num_por initialization
        security: Fix the default value of secid_to_secctx hook
        libbpf: Fix register naming in PT_REGS s390 macros
        ...
      f85c1598
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · d5dfe4f1
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "A few minor bug fixes for user visible defects, and one regression:
      
         - Various bugs from static checkers and syzkaller
      
         - Add missing error checking in mlx4
      
         - Prevent RTNL lock recursion in i40iw
      
         - Fix segfault in cxgb4 in peer abort cases
      
         - Fix a regression added in 5.7 where the IB_EVENT_DEVICE_FATAL could
           be lost, and wasn't delivered to all the FDs"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        RDMA/uverbs: Move IB_EVENT_DEVICE_FATAL to destroy_uobj
        RDMA/uverbs: Do not discard the IB_EVENT_DEVICE_FATAL event
        RDMA/iw_cxgb4: Fix incorrect function parameters
        RDMA/core: Fix double put of resource
        IB/core: Fix potential NULL pointer dereference in pkey cache
        IB/hfi1: Fix another case where pq is left on waitlist
        IB/i40iw: Remove bogus call to netdev_master_upper_dev_get()
        IB/mlx4: Test return value of calls to ib_get_cached_pkey
        RDMA/rxe: Always return ERR_PTR from rxe_create_mmap_info()
        i40iw: Fix error handling in i40iw_manage_arp_cache()
      d5dfe4f1