  1. May 12, 2021
  2. May 11, 2021
    • bpf: Prevent writable memory-mapping of read-only ringbuf pages · 04ea3086
      Andrii Nakryiko authored
      Only the very first page of the BPF ringbuf, which contains the consumer
      position counter, is supposed to be mapped as writable by user space. The
      producer position is read-only and can be modified only by kernel code. BPF
      ringbuf data pages are read-only as well and are not meant to be modified
      by user code, in order to maintain the integrity of per-record headers.

      This patch allows only the consumer position page to be mapped as writable;
      everything else is restricted to read-only. remap_vmalloc_range()
      internally adds VM_DONTEXPAND, so established memory mappings can't be
      extended, which prevents any future violations through mremap().
      
      Fixes: 457f4436 ("bpf: Implement BPF ring buffer and verifier support for it")
      Reported-by: Ryota Shiga (Flatt Security)
      Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
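The mapping rule described in the commit above can be sketched as a small user-space model. This is not the kernel code: `model_mmap_req`, `model_ringbuf_mmap_check()` and `MODEL_PAGE_SIZE` are hypothetical names invented for illustration, and clearing `may_write` on read-only mappings is an assumption about how "restricted to read-only" might be enforced.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical page size for this user-space model. */
#define MODEL_PAGE_SIZE 4096UL

/* Hypothetical stand-in for the parts of a mapping request that matter here. */
struct model_mmap_req {
	unsigned long pgoff;   /* first ring buffer page covered by the mapping */
	unsigned long length;  /* length of the mapping in bytes */
	bool want_write;       /* caller asked for a writable mapping */
	bool may_write;        /* mapping could later be made writable */
};

/*
 * Returns 0 if the request is acceptable, -EPERM otherwise.
 * A writable mapping is accepted only when it covers exactly the first
 * page (the consumer position page). Any other mapping is accepted as
 * read-only and, by assumption, loses the option to become writable.
 */
static int model_ringbuf_mmap_check(struct model_mmap_req *req)
{
	if (req->want_write) {
		if (req->pgoff != 0 || req->length != MODEL_PAGE_SIZE)
			return -EPERM;
	} else {
		req->may_write = false;
	}
	return 0;
}
```

In this model, a writable request for page 0 with a length of one page passes, while a writable request that starts at a later page or spans more than one page is rejected.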
    • bpf, ringbuf: Deny reserve of buffers larger than ringbuf · 4b81cceb
      Thadeu Lima de Souza Cascardo authored
      A BPF program might try to reserve a buffer larger than the ringbuf size.
      If the consumer position is far ahead of the producer position, such a
      reservation would succeed, allowing the BPF program to read or write
      outside the area allocated for the ringbuf.
      
      Reported-by: Ryota Shiga (Flatt Security)
      Fixes: 457f4436 ("bpf: Implement BPF ring buffer and verifier support for it")
      Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
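To illustrate why an oversized reservation can slip through when the consumer position is ahead of the producer, here is a simplified user-space model. The struct, the function name and the `check_size` switch are hypothetical, record headers are ignored, and the free-space test is an assumed simplification rather than the kernel's exact logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified ring buffer state. */
struct model_ringbuf {
	uint64_t size;          /* data area size in bytes */
	uint64_t consumer_pos;  /* advanced by user space */
	uint64_t producer_pos;  /* advanced by the producer side */
};

/*
 * Returns true if `len` bytes may be reserved in this model.
 * With check_size == false the function models the behaviour before the
 * fix; with check_size == true it applies the added guard first.
 */
static bool model_can_reserve(const struct model_ringbuf *rb, uint64_t len,
			      bool check_size)
{
	uint64_t new_prod;

	/* Guard: a single record can never be larger than the buffer. */
	if (check_size && len > rb->size)
		return false;

	new_prod = rb->producer_pos + len;

	/*
	 * Free-space test on unsigned positions. If consumer_pos is ahead of
	 * producer_pos, the difference can be small even for a huge len.
	 */
	return new_prod - rb->consumer_pos <= rb->size;
}
```

For example, with a 4096-byte buffer, a producer position of 0 and a consumer position of 1 MiB, a 1 MiB request passes the free-space test alone but is rejected once the size guard is enabled.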
    • bpf: Fix alu32 const subreg bound tracking on bitwise operations · 049c4e13
      Daniel Borkmann authored
      Fix a bug in the verifier's scalar32_min_max_*() functions which leads to
      incorrect tracking of 32 bit bounds for the simulation of and/or/xor bitops.
      When both the src & dst subregs are known constants, the assumption is
      that scalar_min_max_*() will take care of updating the bounds correctly.
      However, this is not the case. For example, consider a register R2 which
      has a tnum mask of 0xffffffff00000000, meaning the lower 32 bits are a known
      constant, in this case of value 0x00000001. R2 is then and'ed with a
      register R3 which is a 64 bit known constant, here 0x100000002.
      
      What can be seen in line '10:' of the "Before fix" log below is that the
      32 bit bounds reach an invalid state where {u,s}32_min_value >
      {u,s}32_max_value. The reason is that scalar32_min_max_*() delegates 32 bit
      bounds updates to scalar_min_max_*(); however, that really only takes place
      when both the 64 bit src & dst registers are known constants. Given
      scalar32_min_max_*() is intended to be designed as closely as possible to
      scalar_min_max_*(), update the 32 bit bounds in this situation through
      __mark_reg32_known(), which will set all {u,s}32_{min,max}_value to the
      correct constant, which is 0x00000000 after the fix (given 0x00000001 &
      0x00000002 in 32 bit space). This is possible given var32_off already holds
      the final value, as dst_reg->var_off is updated before calling
      scalar32_min_max_*().
      
      Before fix, invalid tracking of R2:
      
        [...]
        9: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=-9223372036854775807 (0x8000000000000001),smax_value=9223372032559808513 (0x7fffffff00000001),umin_value=1,umax_value=0xffffffff00000001,var_off=(0x1; 0xffffffff00000000),s32_min_value=1,s32_max_value=1,u32_min_value=1,u32_max_value=1) R3_w=inv4294967298 R10=fp0
        9: (5f) r2 &= r3
        10: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=0,smax_value=4294967296 (0x100000000),umin_value=0,umax_value=0x100000000,var_off=(0x0; 0x100000000),s32_min_value=1,s32_max_value=0,u32_min_value=1,u32_max_value=0) R3_w=inv4294967298 R10=fp0
        [...]
      
      After fix, correct tracking of R2:
      
        [...]
        9: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=-9223372036854775807 (0x8000000000000001),smax_value=9223372032559808513 (0x7fffffff00000001),umin_value=1,umax_value=0xffffffff00000001,var_off=(0x1; 0xffffffff00000000),s32_min_value=1,s32_max_value=1,u32_min_value=1,u32_max_value=1) R3_w=inv4294967298 R10=fp0
        9: (5f) r2 &= r3
        10: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=0,smax_value=4294967296 (0x100000000),umin_value=0,umax_value=0x100000000,var_off=(0x0; 0x100000000),s32_min_value=0,s32_max_value=0,u32_min_value=0,u32_max_value=0) R3_w=inv4294967298 R10=fp0
        [...]
      
      Fixes: 3f50f132 ("bpf: Verifier, do explicit ALU32 bounds tracking")
      Fixes: 2921c90d ("bpf: Fix a verifier failure with xor")
      Reported-by: Manfred Paul <(@_manfp)>
      Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
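The arithmetic behind the example in the commit above can be checked with a short user-space sketch. The helper and struct names are hypothetical; the code only evaluates the concrete AND for two possible values of R2's unknown upper half and does not model the verifier's bounds tracking itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a 64-bit result and its lower 32-bit subregister. */
struct model_and_result {
	uint64_t full;    /* 64-bit result of dst & src */
	uint32_t subreg;  /* lower 32 bits of that result */
};

/* Computes dst & src and extracts the lower 32 bits of the result. */
static struct model_and_result model_and(uint64_t dst, uint64_t src)
{
	struct model_and_result r;

	r.full = dst & src;
	r.subreg = (uint32_t)r.full;
	return r;
}

/* A bounds pair is only meaningful when min does not exceed max. */
static int model_bounds_valid(uint32_t min, uint32_t max)
{
	return min <= max;
}
```

With R3 = 0x100000002 and the lower half of R2 fixed at 0x00000001, the lower 32 bits of the result are 0 for any upper half of R2, which matches the 32 bit bounds of 0 in the "After fix" log. The pair min=1, max=0 from the "Before fix" log fails `model_bounds_valid()`.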
  3. May 07, 2021
  4. May 06, 2021
  5. May 04, 2021
  6. May 03, 2021
    • bpf: Fix leakage of uninitialized bpf stack under speculation · 801c6058
      Daniel Borkmann authored
      The currently implemented mechanisms to mitigate data disclosure under
      speculation mainly address stack and map value oob access from the
      speculative domain. However, Piotr discovered that uninitialized BPF
      stack is not protected yet, and thus old data from the kernel stack,
      potentially including addresses of kernel structures, could still be
      extracted from that 512-byte window. The BPF stack is special
      compared to map values since it's not zero initialized for every
      program invocation, whereas map values /are/ zero initialized upon
      their initial allocation and thus cannot leak any prior data in either
      domain. In the non-speculative domain, the verifier ensures that every
      stack slot read must have a prior stack slot write by the BPF program
      to avoid such a data leak.
      
      However, this is not enough: for example, when the pointer arithmetic
      operation moves the stack pointer from the last valid stack offset to
      the first valid offset, the sanitation logic allows for any intermediate
      offsets during speculative execution, which could then be used to
      extract any restricted stack content via side-channel.
      
      Given that the use of unknown but bounded scalars is generally forbidden
      for unprivileged stack pointer arithmetic, we can simply turn the
      register-based arithmetic operation into an immediate-based arithmetic
      operation without the need for masking. This also has the benefit
      of reducing the number of instructions needed for the operation. After
      the work in 7fedb63a ("bpf: Tighten speculative pointer arithmetic
      mask"), aux->alu_limit already holds the final immediate value for
      the offset register with the known scalar. Thus, a simple mov of the
      immediate into the AX register, with AX then used as the source for the
      original instruction, is sufficient and possible in this case.
      
      Reported-by: Piotr Krysiuk <piotras@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Piotr Krysiuk <piotras@gmail.com>
      Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
      Reviewed-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Fix masking negation logic upon negative dst register · b9b34ddb
      Daniel Borkmann authored
      The negation logic for the case where the off_reg is sitting in the
      dst register is not correct, given that we then cannot just invert the
      add to a sub or vice versa. As a fix, perform the final bitwise and-op
      unconditionally into AX from the off_reg, then move the pointer from
      the src to dst, and finally use AX as the source for the original
      pointer arithmetic operation such that the inversion yields a correct
      result. The single non-AX mov in between is possible given constant
      blinding retains it, as it's not an immediate based operation.
      
      Fixes: 979d63d5 ("bpf: prevent out of bounds speculation on pointer arithmetic")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Piotr Krysiuk <piotras@gmail.com>
      Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
      Reviewed-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
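The ordering described in the commit above (and-op into AX, move the pointer into dst, then use AX as the source) can be modelled in user space roughly as follows. Only the final and, mov and arithmetic steps come from the commit text; the preceding sub/or/neg/shift steps that build the mask are an assumption about how such a branch-free mask may be formed, and all function names are hypothetical. The model uses unsigned 64-bit arithmetic throughout so that every step is well defined in C.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Branch-free masking model: for the small values exercised here, returns
 * `off` when 0 <= off <= limit and 0 otherwise.
 */
static uint64_t model_mask_offset(uint64_t off, uint32_t limit)
{
	uint64_t ax = limit;   /* ax = limit */

	ax -= off;             /* top bit set if off > limit */
	ax |= off;             /* top bit also set if off is "negative" */
	ax = 0 - ax;           /* negate: top bit now set only for valid, non-zero input */
	ax = 0 - (ax >> 63);   /* spread the top bit: all ones or all zeros */
	ax &= off;             /* final and-op into ax from the offset */
	return ax;
}

/*
 * Models `dst = off; dst += ptr` with a negative offset: the magnitude is
 * masked into ax, the pointer is moved into dst, and ax is then used as
 * the source of the inverted (sub) operation.
 */
static uint64_t model_ptr_add_neg_off(uint64_t ptr, int64_t off, uint32_t limit)
{
	uint64_t magnitude = 0 - (uint64_t)off;  /* off is negative here */
	uint64_t ax = model_mask_offset(magnitude, limit);
	uint64_t dst = ptr;                      /* mov dst, src (the pointer) */

	dst -= ax;                               /* inverted op, ax as source */
	return dst;
}
```

In this model, an offset of -8 with a limit of 16 moves the pointer back by 8, while an offset of -64 is masked to 0 and leaves the pointer unchanged.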
  7. May 01, 2021
  8. Apr 30, 2021