  1. Nov 03, 2019
      uaccess: Add non-pagefault user-space write function · 1d1585ca
      Daniel Borkmann authored
      Commit 3d708182 ("uaccess: Add non-pagefault user-space read functions")
      missed adding a probe write function; therefore, factor out a probe_write_common()
      helper with most of the logic of probe_kernel_write(), except setting KERNEL_DS, and
      add a new probe_user_write() helper so it can be used from the BPF side.
      
      Again, on some archs the user address space and kernel address space can
      co-exist and overlap; in such a case, setting KERNEL_DS would mean that
      the given address is treated as being in the kernel address space.
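
      For orientation, the factoring might look roughly like this (a sketch of
      the shape, not the verbatim kernel code; the helpers live in mm/maccess.c):

      static long probe_write_common(void __user *dst, const void *src, size_t size)
      {
              long ret;

              pagefault_disable();
              ret = __copy_to_user_inatomic(dst, src, size);
              pagefault_enable();

              return ret ? -EFAULT : 0;
      }

      long probe_kernel_write(void *dst, const void *src, size_t size)
      {
              mm_segment_t old_fs = get_fs();
              long ret;

              set_fs(KERNEL_DS);      /* treat dst as a kernel address */
              ret = probe_write_common((__force void __user *)dst, src, size);
              set_fs(old_fs);

              return ret;
      }

      long probe_user_write(void __user *dst, const void *src, size_t size)
      {
              /* no KERNEL_DS: dst is checked against the user address space */
              return probe_write_common(dst, src, size);
      }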
      
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Link: https://lore.kernel.org/bpf/9df2542e68141bfa3addde631441ee45503856a8.1572649915.git.daniel@iogearbox.net
      Merge branch 'map-pinning' · e1cb7d2d
      Alexei Starovoitov authored
      
      
      Toke Høiland-Jørgensen says:
      
      ====================
      This series adds support to libbpf for reading 'pinning' settings from BTF-based
      map definitions. It introduces a new open option which can set the pinning path;
      if no path is set, /sys/fs/bpf is used as the default. Callers can customise the
      pinning between open and load by setting the pin path per map, and still get the
      automatic reuse feature.
      
      The semantics of the pinning are similar to the iproute2 "PIN_GLOBAL" setting,
      and the eventual goal is to move the iproute2 implementation to be based on
      libbpf and the functions introduced in this series.
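
      As an illustration, a BTF-based map definition opting into pinning might
      look like this (a sketch; the map name and types are made up, and the
      LIBBPF_PIN_BY_NAME enum comes with this series):

      struct {
              __uint(type, BPF_MAP_TYPE_HASH);
              __uint(max_entries, 16);
              __type(key, __u32);
              __type(value, __u64);
              __uint(pinning, LIBBPF_PIN_BY_NAME);    /* pinned at <pin path>/my_counts */
      } my_counts SEC(".maps");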
      
      Changelog:
      
      v6:
        - Fix leak of struct bpf_object in selftest
        - Make struct bpf_map arg const in bpf_map__is_pinned() and bpf_map__get_pin_path()
      
      v5:
        - Don't pin maps with pinning set, but with a value of LIBBPF_PIN_NONE
        - Add a few more selftests:
          - Should not pin map with pinning set, but value LIBBPF_PIN_NONE
          - Should fail to load a map with an invalid pinning value
          - Should fail to re-use maps with parameter mismatch
        - Alphabetise libbpf.map
        - Whitespace and typo fixes
      
      v4:
        - Don't check key_type_id and value_type_id when checking for map reuse
          compatibility.
        - Move building of map->pin_path into init_user_btf_map()
        - Get rid of 'pinning' attribute in struct bpf_map
        - Make sure we also create parent directory on auto-pin (new patch 3).
        - Abort the selftest on error instead of attempting to continue.
        - Support unpinning all pinned maps with bpf_object__unpin_maps(obj, NULL)
        - Support pinning at map->pin_path with bpf_object__pin_maps(obj, NULL)
        - Make re-pinning a map at the same path a noop
        - Rename the open option to pin_root_path
        - Add a bunch more self-tests for pin_maps(NULL) and unpin_maps(NULL)
        - Fix a couple of smaller nits
      
      v3:
        - Drop bpf_object__pin_maps_opts() and just use an open option to customise
          the pin path; also don't touch bpf_object__{un,}pin_maps()
        - Integrate pinning and reuse into bpf_object__create_maps() instead of having
          multiple loops through the map structure
        - Make errors in map reuse and pinning fatal to the load procedure
        - Add selftest to exercise pinning feature
        - Rebase series to latest bpf-next
      
      v2:
        - Drop patch that adds mounting of bpffs
        - Only support a single value of the pinning attribute
        - Add patch to fixup error handling in reuse_fd()
        - Implement the full automatic pinning and map reuse logic on load
      ====================
      
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      selftests: Add tests for automatic map pinning · 2f4a32cc
      Toke Høiland-Jørgensen authored
      
      
      This adds a new BPF selftest to exercise the new automatic map pinning
      code.
      
      Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/157269298209.394725.15420085139296213182.stgit@toke.dk
      libbpf: Add auto-pinning of maps when loading BPF objects · 57a00f41
      Toke Høiland-Jørgensen authored
      
      
      This adds support to libbpf for setting map pinning information as part of
      the BTF map declaration, to get automatic map pinning (and reuse) on load.
      The pinning type currently only supports a single PIN_BY_NAME mode, where
      each map will be pinned by its name in a path that can be overridden, but
      defaults to /sys/fs/bpf.
      
      Since auto-pinning only does something if any maps actually have a
      'pinning' BTF attribute set, we default the new option to enabled, on the
      assumption that seamless pinning is what most callers want.
      
      When a map has a pin_path set at load time, libbpf will compare it with the
      map pinned at that location (if any), and if the attributes match, will
      reuse that map instead of creating a new one. If no existing map is found,
      the newly created map will instead be pinned at that location.
      
      Programs wanting to customise the pinning can override the pinning paths
      using bpf_map__set_pin_path() before calling bpf_object__load() (including
      setting it to NULL to disable pinning of a particular map).
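
      From the caller side, the flow might look like this (a sketch; file and
      map names are made up, error handling omitted):

      DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts,
              .pin_root_path = "/sys/fs/bpf/myapp",   /* instead of /sys/fs/bpf */
      );
      struct bpf_object *obj = bpf_object__open_file("prog.o", &opts);

      /* customise between open and load; NULL disables pinning for a map */
      struct bpf_map *map = bpf_object__find_map_by_name(obj, "my_counts");
      bpf_map__set_pin_path(map, "/sys/fs/bpf/myapp/other_name");

      bpf_object__load(obj);  /* reuses a compatible pinned map, else pins the new one */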
      
      Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/157269298092.394725.3966306029218559681.stgit@toke.dk
      libbpf: Move directory creation into _pin() functions · 196f8487
      Toke Høiland-Jørgensen authored
      
      
      The existing pin_*() functions all try to create the parent directory
      before pinning. Move this check into the per-object _pin() functions
      instead. This ensures consistent behaviour when auto-pinning is
      added (which doesn't go through the top-level pin_maps() function), at the
      cost of a few more calls to mkdir().
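
      The directory creation itself is the usual mkdir-the-parent dance; a
      sketch (illustrative only, not the exact libbpf code):

      #include <errno.h>
      #include <libgen.h>
      #include <stdlib.h>
      #include <string.h>
      #include <sys/stat.h>

      /* ensure the parent directory of a pin path exists */
      static int make_parent_dir(const char *path)
      {
              char *copy = strdup(path);
              int err = 0;

              if (!copy)
                      return -ENOMEM;
              /* dirname() may modify its argument, hence the copy */
              if (mkdir(dirname(copy), 0700) && errno != EEXIST)
                      err = -errno;
              free(copy);
              return err;
      }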
      
      Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/157269297985.394725.5882630952992598610.stgit@toke.dk
      libbpf: Store map pin path and status in struct bpf_map · 4580b25f
      Toke Høiland-Jørgensen authored
      
      
      Support storing and setting a pin path in struct bpf_map, which can be used
      for automatic pinning. Also store the pin status so we can avoid attempts
      to re-pin a map that has already been pinned (or reused from a previous
      pinning).
      
      The behaviour of bpf_object__{un,}pin_maps() is changed so that if it is
      called with a NULL path argument (which was previously illegal), it will
      (un)pin only those maps that have a pin_path set.
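
      In other words, a caller can now do the following (a sketch; the wrapper
      name is made up):

      #include <bpf/libbpf.h>

      /* sketch: a NULL path now means "use each map's own pin_path" */
      static int pin_marked_maps(struct bpf_object *obj)
      {
              /* pins every map that has a pin_path set; others are skipped */
              return bpf_object__pin_maps(obj, NULL);
      }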
      
      Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/157269297876.394725.14782206533681896279.stgit@toke.dk
      libbpf: Fix error handling in bpf_map__reuse_fd() · d1b4574a
      Toke Høiland-Jørgensen authored
      
      
      bpf_map__reuse_fd() was calling close() in the error path before returning
      an error value based on errno. However, close() can change errno, which can
      lead to misleading error messages. Instead, explicitly store
      errno in the err variable before each goto.
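
      The pattern being applied is the standard one (an illustrative sketch,
      not the actual bpf_map__reuse_fd() code):

      #include <errno.h>
      #include <unistd.h>

      static int dup_or_fail(int fd)
      {
              int err, new_fd;

              new_fd = dup(fd);
              if (new_fd < 0) {
                      err = -errno;   /* capture before anything can clobber it */
                      close(fd);      /* may overwrite errno; err is already safe */
                      return err;
              }
              return new_fd;
      }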
      
      Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/157269297769.394725.12634985106772698611.stgit@toke.dk
  2. Nov 02, 2019
      Merge branch 'bpf-xskmap-perf-improvements' · 78db77fa
      Daniel Borkmann authored
      Björn Töpel says:
      
      ====================
      This set consists of three patches from Maciej and myself which optimize
      the XSKMAP lookups. In the first patch, the sockets are moved to be
      stored at the tail of struct xsk_map. In the second patch, Maciej
      implements map_gen_lookup() for the XSKMAP. The third patch, introduced
      in this revision, moves various XSKMAP functions to permit the compiler
      to do more aggressive inlining.
      
      Based on the XDP program from tools/lib/bpf/xsk.c where
      bpf_map_lookup_elem() is explicitly called, this work yields a 5%
      improvement for xdpsock's rxdrop scenario. The last patch alone yields a 2%
      improvement.
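
      For context, the C equivalent of that built-in XDP program is roughly
      the following (a sketch; map sizing is arbitrary):

      struct bpf_map_def SEC("maps") xsks_map = {
              .type = BPF_MAP_TYPE_XSKMAP,
              .key_size = sizeof(int),
              .value_size = sizeof(int),
              .max_entries = 64,
      };

      SEC("xdp")
      int xdp_sock_prog(struct xdp_md *ctx)
      {
              int index = ctx->rx_queue_index;

              /* the explicit lookup below is what this series speeds up */
              if (bpf_map_lookup_elem(&xsks_map, &index))
                      return bpf_redirect_map(&xsks_map, index, 0);

              return XDP_PASS;
      }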
      
      Jonathan's Acked-by for patches 1 and 2 was carried over. Note that the
      overflow checks are done in the bpf_map_area_alloc() and
      bpf_map_charge_init() functions, which was fixed in commit
      ff1c08e1 ("bpf: Change size to u64 for bpf_map_{area_alloc, charge_init}()").
      
        [1] https://patchwork.ozlabs.org/patch/1186170/
      
      v1->v2: * Change size/cost to size_t and use {struct, array}_size
                where appropriate. (Jakub)
      v2->v3: * Proper commit message for patch 2.
      v3->v4: * Change size_t to u64 to handle 32-bit overflows. (Jakub)
              * Introduced patch 3.
      v4->v5: * Use BPF_SIZEOF size, instead of BPF_DW, for correct
                pointer-sized loads. (Daniel)
      ====================
      
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      xsk: Restructure/inline XSKMAP lookup/redirect/flush · d817991c
      Björn Töpel authored
      
      
      In this commit the XSKMAP entry lookup function used by the XDP
      redirect code is moved from the xskmap.c file to the xdp_sock.h
      header, so the lookup can be inlined from, e.g., the
      bpf_xdp_redirect_map() function.
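
      The header-resident lookup is small; roughly (a sketch, assuming the
      flexible-array layout from the first patch):

      static inline struct xdp_sock *__xsk_map_lookup_elem(struct bpf_map *map,
                                                           u32 key)
      {
              struct xsk_map *m = container_of(map, struct xsk_map, map);

              if (key >= map->max_entries)
                      return NULL;

              return READ_ONCE(m->xsk_map[key]);      /* RCU-protected slot */
      }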
      
      Further, __xsk_map_redirect() and __xsk_map_flush() are moved to
      xsk.c, which lets the compiler inline the xsk_rcv() and xsk_flush()
      functions.
      
      Finally, all the XDP socket functions were moved from linux/bpf.h to
      net/xdp_sock.h, where most of the XDP socket functions live anyway.
      
      This yields a ~2% performance boost for the xdpsock "rx_drop"
      scenario.
      
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20191101110346.15004-4-bjorn.topel@gmail.com
      bpf: Implement map_gen_lookup() callback for XSKMAP · e65650f2
      Maciej Fijalkowski authored
      
      
      Inline xsk_map_lookup_elem() by implementing the map_gen_lookup()
      callback. This results in the BPF instructions being emitted in place of
      the bpf_map_lookup_elem() helper call, and in better performance of BPF
      programs.
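
      The callback returns a short instruction sequence that the verifier
      patches in at the call site; a sketch of the idea (register choices and
      offsets are illustrative, details may differ from the actual patch):

      static u32 xsk_map_gen_lookup(struct bpf_map *map, struct bpf_insn *insn_buf)
      {
              const int ret = BPF_REG_0, mp = BPF_REG_1, index = BPF_REG_2;
              struct bpf_insn *insn = insn_buf;

              *insn++ = BPF_LDX_MEM(BPF_W, ret, index, 0);              /* ret = *key */
              *insn++ = BPF_JMP_IMM(BPF_JGE, ret, map->max_entries, 5); /* bounds check */
              *insn++ = BPF_ALU64_IMM(BPF_LSH, ret, ilog2(sizeof(struct xdp_sock *)));
              *insn++ = BPF_ALU64_IMM(BPF_ADD, mp, offsetof(struct xsk_map, xsk_map));
              *insn++ = BPF_ALU64_REG(BPF_ADD, ret, mp);
              *insn++ = BPF_LDX_MEM(BPF_SIZEOF(struct xdp_sock *), ret, ret, 0);
              *insn++ = BPF_JMP_IMM(BPF_JA, 0, 0, 1);
              *insn++ = BPF_MOV64_IMM(ret, 0);                          /* NULL on miss */
              return insn - insn_buf;
      }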
      
      Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Link: https://lore.kernel.org/bpf/20191101110346.15004-3-bjorn.topel@gmail.com
      xsk: Store struct xdp_sock as a flexible array member of the XSKMAP · 64fe8c06
      Björn Töpel authored
      
      
      Prior to this commit, the XDP socket instances were stored in a
      separately allocated array of the XSKMAP. Now, we store the sockets
      as a flexible array member, in a fashion similar to the arraymap. Doing
      so, we do less pointer chasing in the lookup.
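
      I.e., the socket slots now sit at the tail of the map structure itself
      (a sketch of the layout):

      struct xsk_map {
              struct bpf_map map;
              struct list_head __percpu *flush_list;
              spinlock_t lock;
              struct xdp_sock *xsk_map[];     /* flexible array member */
      };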
      
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Link: https://lore.kernel.org/bpf/20191101110346.15004-2-bjorn.topel@gmail.com
  3. Oct 31, 2019
      Merge branch 'bpf-cleanup-btf-raw-tp' · 06087114
      Daniel Borkmann authored
      
      
      Alexei Starovoitov says:
      
      ====================
      v1->v2: addressed Andrii's feedback
      
      When BTF-enabled raw_tp programs were introduced, the plan was to follow
      up with BTF-enabled kprobe and kretprobe, reusing the PROG_RAW_TRACEPOINT
      and PROG_KPROBE types. But k[ret]probe expects pt_regs, while the
      BTF-enabled program ctx will be the same as raw_tp's.
      kretprobe is indistinguishable from kprobe, while a BTF-enabled
      kretprobe will have access to the retval and a kprobe will not.
      Hence the PROG_KPROBE type is not reusable, and reusing
      PROG_RAW_TRACEPOINT no longer fits well.
      Hence introduce an 'umbrella' prog type BPF_PROG_TYPE_TRACING
      that will cover different BTF-enabled tracing attach points.
      The changes make the libbpf side cleaner as well.
      check_attach_btf_id() is cleaner too.
      ====================
      
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      libbpf: Add support for prog_tracing · 12a8654b
      Alexei Starovoitov authored
      
      
      Clean up libbpf's expected_attach_type == attach_btf_id hack
      and introduce BPF_PROG_TYPE_TRACING.
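
      With this in place, a BTF-typed raw_tp program can be written as follows
      (a sketch, assuming the "tp_btf/" section prefix wired up here; the
      argument struct mirrors the tracepoint's signature):

      struct sk_buff;

      struct trace_kfree_skb {
              struct sk_buff *skb;
              void *location;
      };

      SEC("tp_btf/kfree_skb")
      int trace_kfree_skb(struct trace_kfree_skb *ctx)
      {
              /* ctx->skb is a typed pointer thanks to in-kernel BTF */
              return 0;
      }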
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/20191030223212.953010-3-ast@kernel.org
      bpf: Replace prog_raw_tp+btf_id with prog_tracing · f1b9509c
      Alexei Starovoitov authored
      
      
      The bpf program type raw_tp together with 'expected_attach_type'
      was the most appropriate api to indicate BTF-enabled raw_tp programs.
      But during development it became apparent that 'expected_attach_type'
      cannot be used and a new 'attach_btf_id' field had to be introduced,
      which means that the information is duplicated in two fields, one of
      which is ignored.
      Clean it up by introducing a new program type where both
      'expected_attach_type' and 'attach_btf_id' fields have
      specific meaning.
      In the future 'expected_attach_type' will be extended
      with other attach points that have similar semantics to raw_tp.
      This patch replaces BTF-enabled BPF_PROG_TYPE_RAW_TRACEPOINT with
      prog_type = BPF_PROG_TYPE_TRACING
      expected_attach_type = BPF_TRACE_RAW_TP
      attach_btf_id = btf_id of raw tracepoint inside the kernel
      Future patches will add
      expected_attach_type = BPF_TRACE_FENTRY or BPF_TRACE_FEXIT
      where programs have the same input context and the same helpers,
      but different attach points.
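
      At the syscall level, loading such a program then amounts to the
      following (a sketch; btf_id resolution and the rest of the load
      attributes are elided):

      __u32 btf_id = 0;       /* resolved from the kernel's BTF beforehand */
      union bpf_attr attr = {};

      attr.prog_type = BPF_PROG_TYPE_TRACING;
      attr.expected_attach_type = BPF_TRACE_RAW_TP;
      attr.attach_btf_id = btf_id;    /* BTF id of the raw tracepoint's func */
      /* plus insns, license, etc., then bpf(BPF_PROG_LOAD, &attr, sizeof(attr)) */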
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/20191030223212.953010-2-ast@kernel.org
      bpf: Fix bpf jit kallsym access · af91acbc
      Alexei Starovoitov authored
      Jiri reported a crash when the JIT is on, but net.core.bpf_jit_kallsyms is off.
      bpf_prog_kallsyms_find() was skipping addr->bpf_prog resolution
      logic in oops and stack traces. That's incorrect.
      It should only skip addr->name resolution for 'cat /proc/kallsyms'.
      That's what bpf_jit_kallsyms and bpf_jit_harden protect.
      
      Fixes: 3dec541b ("bpf: Add support for BTF pointers to x86 JIT")
      Reported-by: Jiri Olsa <jolsa@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20191030233019.1187404-1-ast@kernel.org
  4. Oct 27, 2019
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · 5b7fe93d
      David S. Miller authored
      
      
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf-next 2019-10-27
      
      The following pull-request contains BPF updates for your *net-next* tree.
      
      We've added 52 non-merge commits during the last 11 day(s) which contain
      a total of 65 files changed, 2604 insertions(+), 1100 deletions(-).
      
      The main changes are:
      
       1) Revolutionize BPF tracing by using in-kernel BTF to type check BPF
          assembly code. The work here teaches the BPF verifier to recognize
          kfree_skb()'s first argument as 'struct sk_buff *' in tracepoints
          such that verifier allows direct use of bpf_skb_event_output() helper
          used in tc BPF et al (w/o probing memory access) that dumps skb data
          into perf ring buffer. Also add direct loads to probe memory in order
          to speed up/replace bpf_probe_read() calls, from Alexei Starovoitov.
      
       2) Big batch of changes to improve libbpf and BPF kselftests. Besides
          others: generalization of libbpf's CO-RE relocation support to now
          also include field existence relocations, revamp the BPF kselftest
          Makefile to add test runner concept allowing to exercise various
          ways to build BPF programs, and teach bpf_object__open() and friends
          to automatically derive BPF program type/expected attach type from
          section names to ease their use, from Andrii Nakryiko.
      
       3) Fix deadlock in stackmap's build-id lookup on rq_lock(), from Song Liu.
      
       4) Allow to read BTF as raw data from bpftool. Most notable use case
          is to dump /sys/kernel/btf/vmlinux through this, from Jiri Olsa.
      
       5) Use bpf_redirect_map() helper in libbpf's AF_XDP helper prog which
          manages to improve "rx_drop" performance by ~4%, from Björn Töpel.
      
       6) Fix to restore the flow dissector after the reattach BPF test, and also
          fix error handling in bpf_helper_defs.h generation, from Jakub Sitnicki.
      
       7) Improve verifier's BTF ctx access for use outside of raw_tp, from
          Martin KaFai Lau.
      
       8) Improve documentation for AF_XDP with new sections and to reflect
          latest features, from Magnus Karlsson.
      
       9) Add back 'version' section parsing to libbpf for old kernels, from
          John Fastabend.
      
      10) Fix strncat bounds error in libbpf's libbpf_prog_type_by_name(),
          from KP Singh.
      
      11) Turn on -mattr=+alu32 in LLVM by default for BPF kselftests in order
          to improve insn coverage for built BPF progs, from Yonghong Song.
      
      12) Misc minor cleanups and fixes, from various others.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      tc-testing: list required kernel options for act_ct action · b9512485
      Roman Mashak authored
      
      
      Updated config with required kernel options for the conntrack TC action,
      so that tdc can run the tests.
      
      Signed-off-by: Roman Mashak <mrv@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · 4b1f5dda
      David S. Miller authored
      
      
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS updates for net-next
      
      The following patchset contains Netfilter/IPVS updates for net-next,
      more specifically:
      
      * Updates for ipset:
      
      1) Coding style fix for ipset comment extension, from Jeremy Sowden.
      
      2) De-inline many functions in ipset, from Jeremy Sowden.
      
      3) Move ipset function definition from header to source file.
      
      4) Move ip_set_put_flags() to source, export it as a symbol, remove
         inline.
      
      5) Move range_to_mask() to the source file where this is used.
      
      6) Move ip_set_get_ip_port() to the source file where this is used.
      
      * IPVS selftests and netns improvements:
      
      7) Two patches to speedup ipvs netns dismantle, from Haishuang Yan.
      
      8) Three patches to add selftest script for ipvs, also from
         Haishuang Yan.
      
      * Conntrack updates and new nf_hook_slow_list() function:
      
      9) Document ct ecache extension, from Florian Westphal.
      
      10) Skip ct extensions from ctnetlink dump, from Florian.
      
      11) Free ct extension immediately, from Florian.
      
      12) Skip access to the ecache extension from nf_ct_deliver_cached_events();
          this is not correct, as reported by Syzbot.
      
      13) Add and use nf_hook_slow_list(), from Florian.
      
      * Flowtable infrastructure updates:
      
      14) Move priority to nf_flowtable definition.
      
      15) Dynamic allocation of per-device hooks in flowtables.
      
      16) Allow to include netdevice only once in flowtable definitions.
      
      17) Rise maximum number of devices per flowtable.
      
      * Netfilter hardware offload infrastructure updates:
      
      18) Add nft_flow_block_chain() helper function.
      
      19) Pass callback list to nft_setup_cb_call().
      
      20) Add nft_flow_cls_offload_setup() helper function.
      
      21) Remove rules for the unregistered device via netdevice event.
      
      22) Support for multiple devices in a basechain definition at the
          ingress hook.
      
      23) Add nft_chain_offload_cmd() helper function.

      24) Add nft_flow_block_offload_init() helper function.

      25) Rewind in case of failing to bind multiple devices to a hook.

      26) Typo in IPv6 tproxy module description, from Norman Rasmussen.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Merge branch 'net-aquantia-ptp-followup-fixes' · 64fe8e97
      David S. Miller authored
      
      
      Igor Russkikh says:
      
      ====================
      net: aquantia: ptp followup fixes
      
      Here are fixes for two sparse warnings; the third patch is a fix for the
      missing scaled_ppm_to_ppb. Eventually I reworked this to exclude the ptp
      module from the build. Please consider it instead of this patch:
      https://patchwork.ozlabs.org/patch/1184171/
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      net: aquantia: disable ptp object build if no config · 7873ee26
      Igor Russkikh authored
      
      
      Disable the aq_ptp module build, using inline stubs, when
      CONFIG_PTP_1588_CLOCK is not declared.

      This reduces the module size and removes unnecessary code.
      
      Reported-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      net: aquantia: fix warnings on endianness · 5eeb6c3c
      Igor Russkikh authored
      Fixes to remove sparse warnings:
      sparse: cast to restricted __be64
      
      Fixes: 04a18399 ("net: aquantia: implement data PTP datapath")
      Reported-by: kbuild test robot <lkp@intel.com>
      Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      net: aquantia: fix var initialization warning · bb1eded1
      Igor Russkikh authored
      Found by sparse: a useless local initialization with zero.
      
      Fixes: 94ad9455 ("net: aquantia: add PTP rings infrastructure")
      Reported-by: kbuild test robot <lkp@intel.com>
      Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. Oct 26, 2019
      netfilter: nf_tables_offload: unbind if multi-device binding fails · 671312e1
      Pablo Neira Ayuso authored
      nft_flow_block_chain() needs to unbind in case of error when performing
      the multi-device binding.
      
      Fixes: d54725cd ("netfilter: nf_tables: support for multiple devices per netdev hook")
      Reported-by: wenxu <wenxu@ucloud.cn>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      netfilter: nf_tables_offload: add nft_flow_block_offload_init() · 75ceaf86
      Pablo Neira Ayuso authored
      
      
      This patch adds the nft_flow_block_offload_init() helper function to
      initialize the flow_block_offload object.
      
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      netfilter: nf_tables_offload: add nft_chain_offload_cmd() · 6df5490f
      Pablo Neira Ayuso authored
      
      
      This patch adds the nft_chain_offload_cmd() helper function.
      
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      netfilter: ecache: don't look for ecache extension on dying/unconfirmed conntracks · ad88b7a6
      Florian Westphal authored
      
      
      syzbot reported the following splat:
      BUG: KASAN: use-after-free in __nf_ct_ext_exist
      include/net/netfilter/nf_conntrack_extend.h:53 [inline]
      BUG: KASAN: use-after-free in nf_ct_deliver_cached_events+0x5c3/0x6d0
      net/netfilter/nf_conntrack_ecache.c:205
      nf_conntrack_confirm include/net/netfilter/nf_conntrack_core.h:65 [inline]
      nf_confirm+0x3d8/0x4d0 net/netfilter/nf_conntrack_proto.c:154
      [..]
      
      While there is no reproducer yet, the syzbot report contains one
      interesting bit of information:
      
      Freed by task 27585:
      [..]
       kfree+0x10a/0x2c0 mm/slab.c:3757
       nf_ct_ext_destroy+0x2ab/0x2e0 net/netfilter/nf_conntrack_extend.c:38
       nf_conntrack_free+0x8f/0xe0 net/netfilter/nf_conntrack_core.c:1418
       destroy_conntrack+0x1a2/0x270 net/netfilter/nf_conntrack_core.c:626
       nf_conntrack_put include/linux/netfilter/nf_conntrack_common.h:31 [inline]
       nf_ct_resolve_clash net/netfilter/nf_conntrack_core.c:915 [inline]
       ^^^^^^^^^^^^^^^^^^^
       __nf_conntrack_confirm+0x21ca/0x2830 net/netfilter/nf_conntrack_core.c:1038
       nf_conntrack_confirm include/net/netfilter/nf_conntrack_core.h:63 [inline]
       nf_confirm+0x3e7/0x4d0 net/netfilter/nf_conntrack_proto.c:154
      
      This is what's happening:
      
      1. a conntrack entry is about to be confirmed (added to hash table).
      2. a clash with existing entry is detected.
      3. nf_ct_resolve_clash() puts skb->nfct (the "losing" entry).
      4. this entry now has a refcount of 0 and is freed to SLAB_TYPESAFE_BY_RCU
         kmem cache.
      
      skb->nfct has been replaced by the one found in the hash.
      The problem is that nf_conntrack_confirm() uses the old ct:
      
      static inline int nf_conntrack_confirm(struct sk_buff *skb)
      {
       struct nf_conn *ct = (struct nf_conn *)skb_nfct(skb);
       int ret = NF_ACCEPT;
      
        if (ct) {
          if (!nf_ct_is_confirmed(ct))
             ret = __nf_conntrack_confirm(skb);
          if (likely(ret == NF_ACCEPT))
      	nf_ct_deliver_cached_events(ct); /* This ct has refcount 0! */
        }
        return ret;
      }
      
      As of "netfilter: conntrack: free extension area immediately", we can't
      access conntrack extensions in this case.
      
      To fix this, make sure we check for the dying bit before attempting
      to get the ecache extension.
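
      A check along those lines (a sketch, not necessarily the final patch):

      void nf_ct_deliver_cached_events(struct nf_conn *ct)
      {
              /* unconfirmed or dying: the extension area may be gone already */
              if (!nf_ct_is_confirmed(ct) || nf_ct_is_dying(ct))
                      return;

              /* ... look up the ecache extension and deliver events ... */
      }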
      
      Reported-by: syzbot+c7aabc9fe93e7f3637ba@syzkaller.appspotmail.com
      Fixes: 2ad9d774 ("netfilter: conntrack: free extension area immediately")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Merge branch 'ionic-updates' · 0629d245
      David S. Miller authored
      
      
      Shannon Nelson says:
      
      ====================
      ionic updates
      
      These are a few of the driver updates we've been working on internally.
      These clean up a few mismatched struct comments, add checking for dead
      firmware, fix an initialization bug, and change the Rx buffer management.
      
      These are based on net-next v5.4-rc3-709-g985fd98ab5cc.
      
      v2: clear napi->skb in the error case in ionic_rx_frags()
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ionic: update driver version · 63ad1cd6
      Shannon Nelson authored
      
      
      Signed-off-by: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ionic: implement support for rx sgl · 08f2e4b2
      Shannon Nelson authored
      
      
      Even out Rx performance across MTU sizes by changing from full
      skb allocations to page-based frag allocations.  The device
      supports a form of scatter-gather in the Rx path, so we can
      set up a number of pages for each descriptor, all of which are
      easier to alloc and pass around than the standard kzalloc'd
      buffer.  An skb is wrapped around the pages while processing
      the received packets, and pages are recycled as needed, or
      left alone if they weren't used in the Rx.
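
      The general shape of such a receive path, in terms of the stock kernel
      helpers (an illustrative sketch of the common pattern, not the ionic
      code itself):

      static struct sk_buff *rx_frags_to_skb(struct napi_struct *napi,
                                             struct page **pages, int nfrags,
                                             int frag_len)
      {
              struct sk_buff *skb = napi_get_frags(napi);    /* small head, no copy */
              int i;

              if (!skb)
                      return NULL;

              for (i = 0; i < nfrags; i++)    /* attach each rx page as a frag */
                      skb_add_rx_frag(skb, i, pages[i], 0, frag_len, PAGE_SIZE);

              return skb;
      }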
      
      Signed-off-by: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>