Skip to content
  1. Dec 14, 2019
  2. Dec 13, 2019
  3. Dec 12, 2019
    • Daniel Borkmann's avatar
      bpf, x86, arm64: Enable jit by default when not built as always-on · 81c22041
      Daniel Borkmann authored
      
      
      After Spectre 2 fix via 290af866 ("bpf: introduce BPF_JIT_ALWAYS_ON
      config") most major distros use BPF_JIT_ALWAYS_ON configuration these days
      which compiles out the BPF interpreter entirely and always enables the
      JIT. Also given recent fix in e1608f3f ("bpf: Avoid setting bpf insns
      pages read-only when prog is jited"), we additionally avoid fragmenting
      the direct map for the BPF insns pages sitting in the general data heap
      since they are not used during execution. Latter is only needed when run
      through the interpreter.
      
      Since both x86 and arm64 JITs have seen a lot of exposure over the years,
      are generally most up to date and maintained, there is more downside in
      !BPF_JIT_ALWAYS_ON configurations to have the interpreter enabled by default
      rather than the JIT. Add a ARCH_WANT_DEFAULT_BPF_JIT config which archs can
      use to set the bpf_jit_{enable,kallsyms} to 1. Back in the days the
      bpf_jit_kallsyms knob was set to 0 by default since major distros still
      had /proc/kallsyms addresses exposed to unprivileged user space which is
      not the case anymore. Hence both knobs are set via BPF_JIT_DEFAULT_ON which
      is set to 'y' in case of BPF_JIT_ALWAYS_ON or ARCH_WANT_DEFAULT_BPF_JIT.
      
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarWill Deacon <will@kernel.org>
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/f78ad24795c2966efcc2ee19025fa3459f622185.1575903816.git.daniel@iogearbox.net
      81c22041
    • Daniel Borkmann's avatar
      bpf: Emit audit messages upon successful prog load and unload · bae141f5
      Daniel Borkmann authored
      
      
      Allow for audit messages to be emitted upon BPF program load and
      unload for having a timeline of events. The load itself is in
      syscall context, so additional info about the process initiating
      the BPF prog creation can be logged and later directly correlated
      to the unload event.
      
      The only info really needed from BPF side is the globally unique
      prog ID where then audit user space tooling can query / dump all
      info needed about the specific BPF program right upon load event
      and enrich the record, thus these changes needed here can be kept
      small and non-intrusive to the core.
      
      Raw example output:
      
        # auditctl -D
        # auditctl -a always,exit -F arch=x86_64 -S bpf
        # ausearch --start recent -m 1334
        ...
        ----
        time->Wed Nov 27 16:04:13 2019
        type=PROCTITLE msg=audit(1574867053.120:84664): proctitle="./bpf"
        type=SYSCALL msg=audit(1574867053.120:84664): arch=c000003e syscall=321   \
          success=yes exit=3 a0=5 a1=7ffea484fbe0 a2=70 a3=0 items=0 ppid=7477    \
          pid=12698 auid=1001 uid=1001 gid=1001 euid=1001 suid=1001 fsuid=1001    \
          egid=1001 sgid=1001 fsgid=1001 tty=pts2 ses=4 comm="bpf"                \
          exe="/home/jolsa/auditd/audit-testsuite/tests/bpf/bpf"                  \
          subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null)
        type=UNKNOWN[1334] msg=audit(1574867053.120:84664): prog-id=76 op=LOAD
        ----
        time->Wed Nov 27 16:04:13 2019
        type=UNKNOWN[1334] msg=audit(1574867053.120:84665): prog-id=76 op=UNLOAD
        ...
      
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Co-developed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarPaul Moore <paul@paul-moore.com>
      Link: https://lore.kernel.org/bpf/20191206214934.11319-1-jolsa@kernel.org
      bae141f5
  4. Dec 11, 2019
  5. Dec 10, 2019
  6. Dec 09, 2019
    • Jason A. Donenfeld's avatar
      net: WireGuard secure network tunnel · e7096c13
      Jason A. Donenfeld authored
      WireGuard is a layer 3 secure networking tunnel made specifically for
      the kernel, that aims to be much simpler and easier to audit than IPsec.
      Extensive documentation and description of the protocol and
      considerations, along with formal proofs of the cryptography, are
      available at:
      
        * https://www.wireguard.com/
        * https://www.wireguard.com/papers/wireguard.pdf
      
      
      
      This commit implements WireGuard as a simple network device driver,
      accessible in the usual RTNL way used by virtual network drivers. It
      makes use of the udp_tunnel APIs, GRO, GSO, NAPI, and the usual set of
      networking subsystem APIs. It has a somewhat novel multicore queueing
      system designed for maximum throughput and minimal latency of encryption
      operations, but it is implemented modestly using workqueues and NAPI.
      Configuration is done via generic Netlink, and following a review from
      the Netlink maintainer a year ago, several high profile userspace tools
      have already implemented the API.
      
      This commit also comes with several different tests, both in-kernel
      tests and out-of-kernel tests based on network namespaces, taking profit
      of the fact that sockets used by WireGuard intentionally stay in the
      namespace the WireGuard interface was originally created, exactly like
      the semantics of userspace tun devices. See wireguard.com/netns/ for
      pictures and examples.
      
      The source code is fairly short, but rather than combining everything
      into a single file, WireGuard is developed as cleanly separable files,
      making auditing and comprehension easier. Things are laid out as
      follows:
      
        * noise.[ch], cookie.[ch], messages.h: These implement the bulk of the
          cryptographic aspects of the protocol, and are mostly data-only in
          nature, taking in buffers of bytes and spitting out buffers of
          bytes. They also handle reference counting for their various shared
          pieces of data, like keys and key lists.
      
        * ratelimiter.[ch]: Used as an integral part of cookie.[ch] for
          ratelimiting certain types of cryptographic operations in accordance
          with particular WireGuard semantics.
      
        * allowedips.[ch], peerlookup.[ch]: The main lookup structures of
          WireGuard, the former being trie-like with particular semantics, an
          integral part of the design of the protocol, and the latter just
          being nice helper functions around the various hashtables we use.
      
        * device.[ch]: Implementation of functions for the netdevice and for
          rtnl, responsible for maintaining the life of a given interface and
          wiring it up to the rest of WireGuard.
      
        * peer.[ch]: Each interface has a list of peers, with helper functions
          available here for creation, destruction, and reference counting.
      
        * socket.[ch]: Implementation of functions related to udp_socket and
          the general set of kernel socket APIs, for sending and receiving
          ciphertext UDP packets, and taking care of WireGuard-specific sticky
          socket routing semantics for the automatic roaming.
      
        * netlink.[ch]: Userspace API entry point for configuring WireGuard
          peers and devices. The API has been implemented by several userspace
          tools and network management utility, and the WireGuard project
          distributes the basic wg(8) tool.
      
        * queueing.[ch]: Shared function on the rx and tx path for handling
          the various queues used in the multicore algorithms.
      
        * send.c: Handles encrypting outgoing packets in parallel on
          multiple cores, before sending them in order on a single core, via
          workqueues and ring buffers. Also handles sending handshake and cookie
          messages as part of the protocol, in parallel.
      
        * receive.c: Handles decrypting incoming packets in parallel on
          multiple cores, before passing them off in order to be ingested via
          the rest of the networking subsystem with GRO via the typical NAPI
          poll function. Also handles receiving handshake and cookie messages
          as part of the protocol, in parallel.
      
        * timers.[ch]: Uses the timer wheel to implement protocol particular
          event timeouts, and gives a set of very simple event-driven entry
          point functions for callers.
      
        * main.c, version.h: Initialization and deinitialization of the module.
      
        * selftest/*.h: Runtime unit tests for some of the most security
          sensitive functions.
      
        * tools/testing/selftests/wireguard/netns.sh: Aforementioned testing
          script using network namespaces.
      
      This commit aims to be as self-contained as possible, implementing
      WireGuard as a standalone module not needing much special handling or
      coordination from the network subsystem. I expect for future
      optimizations to the network stack to positively improve WireGuard, and
      vice-versa, but for the time being, this exists as intentionally
      standalone.
      
      We introduce a menu option for CONFIG_WIREGUARD, as well as providing a
      verbose debug log and self-tests via CONFIG_WIREGUARD_DEBUG.
      
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: linux-crypto@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e7096c13
    • Linus Torvalds's avatar
      Linux 5.5-rc1 · e42617b8
      Linus Torvalds authored
      e42617b8