  1. Oct 17, 2018
    • bpf: sockmap, support for msg_peek in sk_msg with redirect ingress · 02c558b2
      John Fastabend authored
      
      
      This adds support for the MSG_PEEK flag when doing redirect to ingress
      and receiving on the sk_msg psock queue. Previously the flag was
      ignored, which could confuse applications that expected it to work
      as normal.
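For reference, the "normal" MSG_PEEK behaviour that applications expect can be sketched in userspace as below. This is a minimal illustration on a plain AF_UNIX socket pair, not on a sockmap redirect path; the function name `peek_then_read` is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 if a MSG_PEEK read leaves the data queued for the next read,
 * -1 on any socket error or mismatch. Plain socketpair, no sockmap involved.
 */
int peek_then_read(void)
{
	static const char payload[] = "hello";
	char peeked[sizeof(payload)] = {0};
	char consumed[sizeof(payload)] = {0};
	int sv[2];
	int ret = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	if (send(sv[0], payload, sizeof(payload), 0) != (ssize_t)sizeof(payload))
		goto out;

	/* MSG_PEEK copies the data out but leaves it on the receive queue. */
	if (recv(sv[1], peeked, sizeof(peeked), MSG_PEEK) != (ssize_t)sizeof(payload))
		goto out;

	/* A normal read must therefore return the same bytes again. */
	if (recv(sv[1], consumed, sizeof(consumed), 0) != (ssize_t)sizeof(payload))
		goto out;

	if (memcmp(peeked, consumed, sizeof(payload)) == 0)
		ret = 0;
out:
	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

If the flag were silently ignored, the peek would consume the data and the second read would not return the same bytes.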
      
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: skmsg, improve sk_msg_used_element to work in cork context · 8734a162
      John Fastabend authored
      
      
      Currently sk_msg_used_element is only called in zerocopy context where
      cork is not possible, and if this case happens we fall back to copy
      mode. However, the helper is more useful if it works in all contexts.
      
      This patch resolves the case where end == head, which indicates either
      a full or an empty ring, yet the helper always reports an empty ring.
      To fix this, add a test for the full ring case to avoid reporting that
      a full ring has 0 elements. This additional functionality will be used
      in the next patches from recvmsg context, where end == head with a full
      ring is a valid case.
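The full-versus-empty ambiguity can be sketched as follows. This is a simplified userspace model, not the kernel helper: the struct, the `RING_SLOTS` constant, and the use of a non-zero byte count to tell a full ring from an empty one are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SLOTS 8u /* hypothetical ring capacity */

struct ring {
	uint32_t head; /* index of the first used slot */
	uint32_t end;  /* index one past the last used slot */
	uint32_t size; /* total bytes held; assumed non-zero when data exists */
};

/* end == head is ambiguous; outstanding bytes mean the ring is full. */
bool ring_full(const struct ring *r)
{
	return r->end == r->head && r->size != 0;
}

/* Number of used slots, handling the full ring and index wrap-around. */
uint32_t ring_used(const struct ring *r)
{
	if (ring_full(r))
		return RING_SLOTS;
	return r->end >= r->head ? r->end - r->head
				 : r->end + (RING_SLOTS - r->head);
}
```

Without the `ring_full()` test, the first case above would fall through to `end - head` and report 0 elements for a full ring.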
      
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: sockmap, fix skmsg recvmsg handler to track size correctly · 3f4c3127
      John Fastabend authored
      
      
      When converting sockmap to the new skmsg generic data structures we
      missed that the recvmsg handler did not correctly use sg.size and
      instead was using the individual elements' lengths. As a result, if a
      sock is closed with outstanding data we omit the call to
      sk_mem_uncharge() and can get the warning below.
      
      [   66.728282] WARNING: CPU: 6 PID: 5783 at net/core/stream.c:206 sk_stream_kill_queues+0x1fa/0x210
      
      To fix this, correct the redirect handler to transfer the size along
      with the scatterlist, and also decrement the size from the recvmsg
      handler. Now when a sock is closed the remaining 'size' will be
      decremented with sk_mem_uncharge().
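The accounting invariant described above can be modelled with a toy sketch. All types and function names here are hypothetical and stand in for the kernel structures; the only point is that the per-message size and the socket's charged bytes must move together, so that close leaves nothing charged.

```c
#include <stddef.h>

/* Toy stand-ins for the socket memory accounting and the queued message. */
struct sock_acct { size_t charged; };
struct msg { size_t size; };

/* Redirect: the size travels with the data and is charged to the socket. */
void msg_redirect(struct sock_acct *sk, struct msg *m, size_t bytes)
{
	m->size += bytes;
	sk->charged += bytes;
}

/* Receive: consume up to 'want' bytes, decrementing size and the charge. */
size_t msg_recv(struct sock_acct *sk, struct msg *m, size_t want)
{
	size_t n = want < m->size ? want : m->size;

	m->size -= n;
	sk->charged -= n;
	return n;
}

/* Close: uncharge whatever is still outstanding. */
void msg_close(struct sock_acct *sk, struct msg *m)
{
	sk->charged -= m->size;
	m->size = 0;
}
```

If `msg_recv()` tracked only element lengths and never touched `size`, `msg_close()` would uncharge the wrong amount, which is the kind of mismatch the warning reports.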
      
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • Merge branch 'nfp-improve-bpf-offload' · 9032c10e
      Alexei Starovoitov authored
      
      
      Jakub Kicinski says:
      
      ====================
      This set adds checks to make sure offload behaviour is correct.
      First, when atomic counters are used, we must make sure the map
      does not already contain data we did not prepare for holding
      atomics.
      
      The second patch double-checks vNIC capabilities for program offload
      in case the program is shared by multiple vNICs with different
      constraints.
      ====================
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • nfp: bpf: double check vNIC capabilities after object sharing · 44b6fed0
      Jakub Kicinski authored
      
      
      The program translation stage checks that the program can be offloaded
      to the netdev which was passed during the load (bpf_attr->prog_ifindex).
      After program sharing was introduced, however, the netdev on which
      the program is loaded can theoretically be different, and therefore
      we should recheck the program size and max stack size at load time.
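A load-time recheck of this kind could look roughly like the sketch below. The struct layouts, field names, and the choice of -EOPNOTSUPP as the error code are assumptions for illustration and are not taken from the driver.

```c
#include <errno.h>

/* Hypothetical per-vNIC limits. */
struct vnic_caps {
	unsigned int max_insns; /* largest program the vNIC accepts */
	unsigned int max_stack; /* largest stack the vNIC supports, in bytes */
};

/* Hypothetical summary of an already translated program. */
struct offload_prog {
	unsigned int n_insns;
	unsigned int stack_depth;
};

/* Recheck against the vNIC the program is actually being loaded on,
 * which may differ from the one it was translated for.
 */
int prog_fits_vnic(const struct offload_prog *p, const struct vnic_caps *c)
{
	if (p->stack_depth > c->max_stack)
		return -EOPNOTSUPP;
	if (p->n_insns > c->max_insns)
		return -EOPNOTSUPP;
	return 0;
}
```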
      
      This was found by code inspection; AFAIK all vNICs have identical
      caps today.
      
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • nfp: bpf: protect against mis-initializing atomic counters · 527db74b
      Jakub Kicinski authored
      
      
      Atomic operations on the NFP are currently always in big endian.
      The driver keeps track of regions of memory storing atomic values
      and byte swaps them accordingly.  There are corner cases where
      the map values may be initialized before the driver knows they
      are used as atomic counters.  This can happen either when the
      datapath is performing the update and the stack contents are
      unknown, or when the map is updated before the program which will
      use it for atomic values is loaded.
      
      To avoid the situation where a user initializes the value to 0 1 2 3
      and then, after loading a program which uses the word as an atomic
      counter, starts reading 3 2 1 0, only allow atomic counters to be
      initialized to endian-neutral values.
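An "endian-neutral" value is one that reads the same after a byte swap. A minimal sketch of such a check on 32-bit words is shown below; the function names are hypothetical and the word size is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit word. */
uint32_t swap32(uint32_t w)
{
	return (w >> 24) | ((w >> 8) & 0x0000ff00u) |
	       ((w << 8) & 0x00ff0000u) | (w << 24);
}

/* A word is endian-neutral if byte swapping does not change it. */
bool endian_neutral32(uint32_t w)
{
	return w == swap32(w);
}

/* Accept an initial counter value only if every word is endian-neutral. */
bool counter_init_ok(const uint32_t *words, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!endian_neutral32(words[i]))
			return false;
	return true;
}
```

Values such as 0 or 0xffffffff pass, while a word holding the bytes 0 1 2 3 does not, because it would read back as 3 2 1 0 once treated as a big-endian atomic.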
      
      For updates from the datapath the stack information may not be
      as precise, so just allow initializing such values to 0.
      
      Example code which would break:
      struct bpf_map_def SEC("maps") rxcnt = {
      	.type = BPF_MAP_TYPE_HASH,
      	.key_size = sizeof(__u32),
      	.value_size = sizeof(__u64),
      	.max_entries = 1,
      };
      
      int xdp_prog1()
      {
      	__u64 nonzeroval = 3;
      	__u32 key = 0;
      	__u64 *value;
      
      	value = bpf_map_lookup_elem(&rxcnt, &key);
      	if (!value)
      		bpf_map_update_elem(&rxcnt, &key, &nonzeroval, BPF_ANY);
      	else
      		__sync_fetch_and_add(value, 1);
      
      	return XDP_PASS;
      }
      
      $ offload bpftool map dump
      key: 00 00 00 00 value: 00 00 00 03 00 00 00 00
      
      should be:
      
      $ offload bpftool map dump
      key: 00 00 00 00 value: 03 00 00 00 00 00 00 00
      
      Reported-by: David Beckett <david.beckett@netronome.com>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • libbpf: Per-symbol visibility for DSO · ab9e0848
      Andrey Ignatov authored
      
      
      Make global symbols in libbpf DSO hidden by default with
      -fvisibility=hidden and export symbols that are part of the ABI
      explicitly with __attribute__((visibility("default"))).
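The pattern can be sketched as below for GCC or Clang. The macro and function names (`MYLIB_API`, `MYLIB_HIDDEN`, `mylib_*`) are hypothetical and chosen for illustration. The helper is marked hidden explicitly so the example behaves the same without the compiler flag; with -fvisibility=hidden that annotation would be implied.

```c
/* Export marker for functions that are part of the library ABI. */
#define MYLIB_API __attribute__((visibility("default")))

/* Explicit form of what -fvisibility=hidden applies by default. */
#define MYLIB_HIDDEN __attribute__((visibility("hidden")))

/* Internal helper: usable across the library's own object files,
 * but not exported from the resulting shared object.
 */
MYLIB_HIDDEN int mylib_clamp(int v, int lo, int hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/* Public entry point: explicitly exported. */
MYLIB_API int mylib_add_clamped(int a, int b)
{
	return mylib_clamp(a + b, 0, 100);
}
```

Building this into a shared object and listing its dynamic symbols should show `mylib_add_clamped` but not `mylib_clamp`.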
      
      This is common practice that should prevent accidentally exporting
      a symbol that is not supposed to be part of the ABI, which in turn
      improves the experience of both libbpf developers and users. See [1]
      for more details.
      
      Export control becomes more important since more and more projects use
      libbpf.
      
      The patch doesn't export a bunch of netlink related functions since as
      agreed in [2] they'll be reworked. That doesn't break bpftool since
      bpftool links libbpf statically.
      
      [1] https://www.akkadia.org/drepper/dsohowto.pdf (2.2 Export Control)
      [2] https://www.mail-archive.com/netdev@vger.kernel.org/msg251434.html
      
      
      
      Signed-off-by: Andrey Ignatov <rdna@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  2. Oct 16, 2018