Commit 15175336 authored by Christoph Hellwig's avatar Christoph Hellwig Committed by Alexei Starovoitov
Browse files

bpf, docs: Better document the legacy packet access instruction



Use consistent terminology and structured RST elements to better document
these two oddball instructions.

Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220131183638.3934982-4-hch@lst.de
parent 63d8c242
Loading
Loading
Loading
Loading
+32 −22
Original line number Diff line number Diff line
@@ -213,8 +213,8 @@ The mode modifier is one of:
  mode modifier  value  description
  =============  =====  ====================================
  BPF_IMM        0x00   used for 64-bit mov
  BPF_ABS        0x20   legacy BPF packet access
  BPF_IND        0x40   legacy BPF packet access
  BPF_ABS        0x20   legacy BPF packet access (absolute)
  BPF_IND        0x40   legacy BPF packet access (indirect)
  BPF_MEM        0x60   regular load and store operations
  BPF_ATOMIC     0xc0   atomic operations
  =============  =====  ====================================
@@ -294,29 +294,39 @@ eBPF has one 16-byte instruction: ``BPF_LD | BPF_DW | BPF_IMM`` which consists
of two consecutive ``struct bpf_insn`` 8-byte blocks and interpreted as single
instruction that loads 64-bit immediate value into a dst_reg.

Packet access instructions
--------------------------
Legacy BPF Packet access instructions
-------------------------------------

eBPF has two non-generic instructions: (BPF_ABS | <size> | BPF_LD) and
(BPF_IND | <size> | BPF_LD) which are used to access packet data.
eBPF has special instructions for access to packet data that have been
carried over from classic BPF to retain the performance of legacy socket
filters running in the eBPF interpreter.

They had to be carried over from classic BPF to have strong performance of
socket filters running in eBPF interpreter. These instructions can only
be used when interpreter context is a pointer to ``struct sk_buff`` and
have seven implicit operands. Register R6 is an implicit input that must
contain pointer to sk_buff. Register R0 is an implicit output which contains
the data fetched from the packet. Registers R1-R5 are scratch registers
and must not be used to store the data across BPF_ABS | BPF_LD or
BPF_IND | BPF_LD instructions.
The instructions come in two forms: ``BPF_ABS | <size> | BPF_LD`` and
``BPF_IND | <size> | BPF_LD``.

These instructions have implicit program exit condition as well. When
eBPF program is trying to access the data beyond the packet boundary,
the interpreter will abort the execution of the program. JIT compilers
therefore must preserve this property. src_reg and imm32 fields are
explicit inputs to these instructions.
These instructions are used to access packet data and can only be used when
the program context is a pointer to networking packet.  ``BPF_ABS``
accesses packet data at an absolute offset specified by the immediate data
and ``BPF_IND`` access packet data at an offset that includes the value of
a register in addition to the immediate data.

For example, BPF_IND | BPF_W | BPF_LD means::
These instructions have seven implicit operands:

  R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src_reg + imm32))
 * Register R6 is an implicit input that must contain pointer to a
   struct sk_buff.
 * Register R0 is an implicit output which contains the data fetched from
   the packet.
 * Registers R1-R5 are scratch registers that are clobbered after a call to
   ``BPF_ABS | BPF_LD`` or ``BPF_IND`` | BPF_LD instructions.

These instructions have an implicit program exit condition as well. When an
eBPF program is trying to access the data beyond the packet boundary, the
program execution will be aborted.

``BPF_ABS | BPF_W | BPF_LD`` means::

and R1 - R5 are clobbered.
  R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + imm32))

``BPF_IND | BPF_W | BPF_LD`` means::

  R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src_reg + imm32))