  1. May 01, 2020
    • mptcp: move option parsing into mptcp_incoming_options() · cfde141e
      Paolo Abeni authored
      The mptcp_options_received structure carries several per-packet
      flags (mp_capable, mp_join, etc.). Such fields must
      be cleared on each packet, even on dropped ones or on packets
      not carrying any MPTCP options, but the current mptcp
      code clears them only on TCP option reset.
      
      In several races/corner cases we end up with stray bits in
      incoming options, leading to WARN_ON splats, e.g.:
      
      [  171.164906] Bad mapping: ssn=32714 map_seq=1 map_data_len=32713
      [  171.165006] WARNING: CPU: 1 PID: 5026 at net/mptcp/subflow.c:533 warn_bad_map (linux-mptcp/net/mptcp/subflow.c:533 linux-mptcp/net/mptcp/subflow.c:531)
      [  171.167632] Modules linked in: ip6_vti ip_vti ip_gre ipip sit tunnel4 ip_tunnel geneve ip6_udp_tunnel udp_tunnel macsec macvtap tap ipvlan macvlan 8021q garp mrp xfrm_interface veth netdevsim nlmon dummy team bonding vcan bridge stp llc ip6_gre gre ip6_tunnel tunnel6 tun binfmt_misc intel_rapl_msr intel_rapl_common rfkill kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel joydev virtio_balloon pcspkr i2c_piix4 sunrpc ip_tables xfs libcrc32c crc32c_intel serio_raw virtio_console ata_generic virtio_blk virtio_net net_failover failover ata_piix libata
      [  171.199464] CPU: 1 PID: 5026 Comm: repro Not tainted 5.7.0-rc1.mptcp_f227fdf5d388+ #95
      [  171.200886] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
      [  171.202546] RIP: 0010:warn_bad_map (linux-mptcp/net/mptcp/subflow.c:533 linux-mptcp/net/mptcp/subflow.c:531)
      [  171.206537] Code: c1 ea 03 0f b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 04 84 d2 75 1d 8b 55 3c 44 89 e6 48 c7 c7 20 51 13 95 e8 37 8b 22 fe <0f> 0b 48 83 c4 08 5b 5d 41 5c c3 89 4c 24 04 e8 db d6 94 fe 8b 4c
      [  171.220473] RSP: 0018:ffffc90000150560 EFLAGS: 00010282
      [  171.221639] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      [  171.223108] RDX: 0000000000000000 RSI: 0000000000000008 RDI: fffff5200002a09e
      [  171.224388] RBP: ffff8880aa6e3c00 R08: 0000000000000001 R09: fffffbfff2ec9955
      [  171.225706] R10: ffffffff9764caa7 R11: fffffbfff2ec9954 R12: 0000000000007fca
      [  171.227211] R13: ffff8881066f4a7f R14: ffff8880aa6e3c00 R15: 0000000000000020
      [  171.228460] FS:  00007f8623719740(0000) GS:ffff88810be00000(0000) knlGS:0000000000000000
      [  171.230065] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  171.231303] CR2: 00007ffdab190a50 CR3: 00000001038ea006 CR4: 0000000000160ee0
      [  171.232586] Call Trace:
      [  171.233109]  <IRQ>
      [  171.233531] get_mapping_status (linux-mptcp/net/mptcp/subflow.c:691)
      [  171.234371] mptcp_subflow_data_available (linux-mptcp/net/mptcp/subflow.c:736 linux-mptcp/net/mptcp/subflow.c:832)
      [  171.238181] subflow_state_change (linux-mptcp/net/mptcp/subflow.c:1085 (discriminator 1))
      [  171.239066] tcp_fin (linux-mptcp/net/ipv4/tcp_input.c:4217)
      [  171.240123] tcp_data_queue (linux-mptcp/./include/linux/compiler.h:199 linux-mptcp/net/ipv4/tcp_input.c:4822)
      [  171.245083] tcp_rcv_established (linux-mptcp/./include/linux/skbuff.h:1785 linux-mptcp/./include/net/tcp.h:1774 linux-mptcp/./include/net/tcp.h:1847 linux-mptcp/net/ipv4/tcp_input.c:5238 linux-mptcp/net/ipv4/tcp_input.c:5730)
      [  171.254089] tcp_v4_rcv (linux-mptcp/./include/linux/spinlock.h:393 linux-mptcp/net/ipv4/tcp_ipv4.c:2009)
      [  171.258969] ip_protocol_deliver_rcu (linux-mptcp/net/ipv4/ip_input.c:204 (discriminator 1))
      [  171.260214] ip_local_deliver_finish (linux-mptcp/./include/linux/rcupdate.h:651 linux-mptcp/net/ipv4/ip_input.c:232)
      [  171.261389] ip_local_deliver (linux-mptcp/./include/linux/netfilter.h:307 linux-mptcp/./include/linux/netfilter.h:301 linux-mptcp/net/ipv4/ip_input.c:252)
      [  171.265884] ip_rcv (linux-mptcp/./include/linux/netfilter.h:307 linux-mptcp/./include/linux/netfilter.h:301 linux-mptcp/net/ipv4/ip_input.c:539)
      [  171.273666] process_backlog (linux-mptcp/./include/linux/rcupdate.h:651 linux-mptcp/net/core/dev.c:6135)
      [  171.275328] net_rx_action (linux-mptcp/net/core/dev.c:6572 linux-mptcp/net/core/dev.c:6640)
      [  171.280472] __do_softirq (linux-mptcp/./arch/x86/include/asm/jump_label.h:25 linux-mptcp/./include/linux/jump_label.h:200 linux-mptcp/./include/trace/events/irq.h:142 linux-mptcp/kernel/softirq.c:293)
      [  171.281379] do_softirq_own_stack (linux-mptcp/arch/x86/entry/entry_64.S:1083)
      [  171.282358]  </IRQ>
      
      We could address the issue by explicitly clearing the relevant fields
      in several places - tcp_parse_option, tcp_fast_parse_options,
      possibly others.
      
      Instead we move the MPTCP option parsing into the already existing
      mptcp ingress hook, so that we need to clear the fields in a single
      place.
      
      This allows us to drop an MPTCP hook from the TCP code and
      to remove the quite large mptcp_options_received from the tcp_sock
      struct. On the flip side, MPTCP sockets will traverse the
      option space twice (in tcp_parse_option() and in
      mptcp_incoming_options()). That looks acceptable: we already
      do that for syn and 3rd ack packets, plain TCP sockets will
      benefit from it, and even MPTCP sockets will experience better
      code locality, reducing the jumps between TCP and MPTCP code.
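
The single-place clearing described above can be illustrated with a small user-space sketch. This is not the kernel patch: `demo_options_received`, `demo_incoming_options()` and the `DEMO_*` constants are hypothetical names, and the structure is a heavily reduced stand-in for mptcp_options_received. The only protocol facts assumed are that MPTCP uses TCP option kind 30 and that the option subtype lives in the upper four bits of the third byte.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced stand-in for mptcp_options_received. */
struct demo_options_received {
	bool mp_capable;
	bool mp_join;
	bool dss;
};

#define DEMO_TCPOPT_EOL   0
#define DEMO_TCPOPT_NOP   1
#define DEMO_TCPOPT_MPTCP 30	/* TCP option kind used by MPTCP */

/*
 * Walk the TCP option space of one packet. The per-packet flags are
 * cleared unconditionally at the single entry point, so a packet with
 * no MPTCP option can never inherit bits from an earlier packet.
 */
void demo_incoming_options(const uint8_t *opts, size_t len,
			   struct demo_options_received *rx)
{
	size_t i = 0;

	memset(rx, 0, sizeof(*rx));

	while (i < len) {
		uint8_t kind = opts[i];
		uint8_t olen;

		if (kind == DEMO_TCPOPT_EOL)
			break;
		if (kind == DEMO_TCPOPT_NOP) {
			i++;
			continue;
		}
		if (i + 1 >= len)
			break;
		olen = opts[i + 1];
		if (olen < 2 || i + olen > len)
			break;	/* malformed option, stop parsing */

		if (kind == DEMO_TCPOPT_MPTCP && olen >= 3) {
			switch (opts[i + 2] >> 4) {	/* MPTCP subtype */
			case 0: rx->mp_capable = true; break;
			case 1: rx->mp_join = true; break;
			case 2: rx->dss = true; break;
			}
		}
		i += olen;
	}
}
```

Calling `demo_incoming_options()` first on a packet carrying an MP_JOIN option and then on a plain packet leaves every flag false after the second call, which is the property the commit relies on.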
      
      v1 -> v2:
       - rebased on current '-net' tree
      
      Fixes: 648ef4b8 ("mptcp: Implement MPTCP receive path")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mptcp: consolidate synack processing. · 263e1201
      Paolo Abeni authored
      
      
      Currently the MPTCP code uses 2 hooks to process syn-ack
      packets, mptcp_rcv_synsent() and the sk_rx_dst_set()
      callback.
      
      We can drop the first, moving the relevant code into the
      latter, reducing the hooking into the TCP code. This is
      also needed by the next patch.
      
      v1 -> v2:
       - use local tcp sock ptr instead of casting the sk variable
         several times - DaveM
      
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. Apr 30, 2020
  3. Apr 29, 2020
    • net/x25: Fix null-ptr-deref in x25_disconnect · 8999dc89
      YueHaibing authored
      
      
      We should check for NULL before calling x25_neigh_put() in
      x25_disconnect(), otherwise it may cause a null-ptr-deref like this:
      
       #include <sys/socket.h>
       #include <unistd.h>
       #include <linux/x25.h>
      
       int main(void) {
          int sck_x25;
          /* Create an X.25 socket and close it without ever connecting. */
          sck_x25 = socket(AF_X25, SOCK_SEQPACKET, 0);
          close(sck_x25);
          return 0;
       }
      
      BUG: kernel NULL pointer dereference, address: 00000000000000d8
      CPU: 0 PID: 4817 Comm: t2 Not tainted 5.7.0-rc3+ #159
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-
      RIP: 0010:x25_disconnect+0x91/0xe0
      Call Trace:
       x25_release+0x18a/0x1b0
       __sock_release+0x3d/0xc0
       sock_close+0x13/0x20
       __fput+0x107/0x270
       ____fput+0x9/0x10
       task_work_run+0x6d/0xb0
       exit_to_usermode_loop+0x102/0x110
       do_syscall_64+0x23c/0x260
       entry_SYSCALL_64_after_hwframe+0x49/0xb3
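
The guard this commit describes can be sketched in user space as follows. This is an illustration under assumptions, not the actual patch: `demo_neigh`, `demo_sock`, `demo_neigh_put()` and `demo_disconnect()` are hypothetical stand-ins for the kernel structures and for x25_neigh_put()/x25_disconnect(), and locking is left out entirely.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the neighbour and socket structures. */
struct demo_neigh {
	int refcnt;
};

struct demo_sock {
	struct demo_neigh *neighbour;	/* NULL if never connected */
};

/* Drop one reference; dereferences nb, so nb must not be NULL. */
static void demo_neigh_put(struct demo_neigh *nb)
{
	nb->refcnt--;
}

/*
 * A socket that was created and closed without connecting has no
 * neighbour, so the reference is dropped only when one is present.
 */
void demo_disconnect(struct demo_sock *sk)
{
	if (sk->neighbour) {
		demo_neigh_put(sk->neighbour);
		sk->neighbour = NULL;
	}
}
```

With the check in place, `demo_disconnect()` on a never-connected socket is a no-op instead of a NULL dereference, which mirrors the socket()/close() reproducer above.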
      
      Reported-by: <syzbot+6db548b615e5aeefdce2@syzkaller.appspotmail.com>
      Fixes: 4becb7ee ("net/x25: Fix x25_neigh refcnt leak when x25 disconnect")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ena: Fix build warning in ena_xdp_set() · caec6619
      Gavin Shan authored
      
      
      This fixes the following build warning in ena_xdp_set(), which is
      observed on aarch64 with 64KB page size.
      
         In file included from ./include/net/inet_sock.h:19,
            from ./include/net/ip.h:27,
            from drivers/net/ethernet/amazon/ena/ena_netdev.c:46:
         drivers/net/ethernet/amazon/ena/ena_netdev.c: In function         \
         ‘ena_xdp_set’:                                                    \
         drivers/net/ethernet/amazon/ena/ena_netdev.c:557:6: warning:      \
         format ‘%lu’                                                      \
         expects argument of type ‘long unsigned int’, but argument 4      \
         has type ‘int’                                                    \
         [-Wformat=] "Failed to set xdp program, the current MTU (%d) is   \
         larger than the maximum allowed MTU (%lu) while xdp is on",
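
The commit message does not say how the mismatch is resolved. As a sketch only, these are two common ways to make a format string agree with an `int` argument; `demo_format_mtu_error()` and `demo_format_mtu_error_ul()` are hypothetical helpers, and neither is claimed to be what the driver does.

```c
#include <stdio.h>

/*
 * Option 1: the value is an int, so use %d for it.
 * Returns the number of characters snprintf() would have written.
 */
int demo_format_mtu_error(char *buf, size_t size, int cur_mtu, int max_mtu)
{
	return snprintf(buf, size,
			"Failed to set xdp program, the current MTU (%d) is "
			"larger than the maximum allowed MTU (%d) while xdp is on",
			cur_mtu, max_mtu);
}

/*
 * Option 2: keep %lu and cast explicitly, so the argument type matches
 * the conversion regardless of how the expression is typed on a given
 * configuration.
 */
int demo_format_mtu_error_ul(char *buf, size_t size, int cur_mtu, int max_mtu)
{
	return snprintf(buf, size,
			"Failed to set xdp program, the current MTU (%d) is "
			"larger than the maximum allowed MTU (%lu) while xdp is on",
			cur_mtu, (unsigned long)max_mtu);
}
```

Both helpers produce the same text for non-negative values. The explicit cast is the more defensive choice when the type of the expression depends on build-time constants such as the page size.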
      
      Signed-off-by: Gavin Shan <gshan@redhat.com>
      Acked-by: Shay Agroskin <shayagr@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. Apr 28, 2020
  5. Apr 26, 2020
  6. Apr 25, 2020
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · ab51cac0
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix memory leak in netfilter flowtable, from Roi Dayan.
      
       2) Ref-count leaks in netrom and tipc, from Xiyu Yang.
      
       3) Fix warning when mptcp socket is never accepted before close, from
          Florian Westphal.
      
       4) Missed locking in ovs_ct_exit(), from Tonghao Zhang.
      
       5) Fix large delays during PTP synchronization in cxgb4, from Rahul
          Lakkireddy.
      
       6) team_mode_get() can hang, from Taehee Yoo.
      
       7) Need to use kvzalloc() when allocating fw tracer in mlx5 driver,
          from Niklas Schnelle.
      
       8) Fix handling of bpf XADD on BTF memory, from Jann Horn.
      
       9) Fix BPF_STX/BPF_B encoding in x86 bpf jit, from Luke Nelson.
      
      10) Missing queue memory release in iwlwifi pcie code, from Johannes
          Berg.
      
      11) Fix NULL deref in macvlan device event, from Taehee Yoo.
      
      12) Initialize lan87xx phy correctly, from Yuiko Oshino.
      
      13) Fix looping between VRF and XFRM lookups, from David Ahern.
      
      14) etf packet scheduler assumes all sockets are full sockets, which is
          not necessarily true. From Eric Dumazet.
      
      15) Fix mptcp data_fin handling in RX path, from Paolo Abeni.
      
      16) fib_select_default() needs to handle nexthop objects, from David
          Ahern.
      
      17) Use GFP_ATOMIC under spinlock in mac80211_hwsim, from Wei Yongjun.
      
      18) vxlan and geneve use wrong nlattr array, from Sabrina Dubroca.
      
      19) Correct rx/tx stats in bcmgenet driver, from Doug Berger.
      
      20) BPF_LDX zero-extension is encoded improperly in x86_32 bpf jit, fix
          from Luke Nelson.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (100 commits)
        selftests/bpf: Fix a couple of broken test_btf cases
        tools/runqslower: Ensure own vmlinux.h is picked up first
        bpf: Make bpf_link_fops static
        bpftool: Respect the -d option in struct_ops cmd
        selftests/bpf: Add test for freplace program with expected_attach_type
        bpf: Propagate expected_attach_type when verifying freplace programs
        bpf: Fix leak in LINK_UPDATE and enforce empty old_prog_fd
        bpf, x86_32: Fix logic error in BPF_LDX zero-extension
        bpf, x86_32: Fix clobbering of dst for BPF_JSET
        bpf, x86_32: Fix incorrect encoding in BPF_LDX zero-extension
        bpf: Fix reStructuredText markup
        net: systemport: suppress warnings on failed Rx SKB allocations
        net: bcmgenet: suppress warnings on failed Rx SKB allocations
        macsec: avoid to set wrong mtu
        mac80211: sta_info: Add lockdep condition for RCU list usage
        mac80211: populate debugfs only after cfg80211 init
        net: bcmgenet: correct per TX/RX ring statistics
        net: meth: remove spurious copyright text
        net: phy: bcm84881: clear settings on link down
        chcr: Fix CPU hard lockup
        ...
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 167ff131
      David S. Miller authored
      
      
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf 2020-04-24
      
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 17 non-merge commits during the last 5 day(s) which contain
      a total of 19 files changed, 203 insertions(+), 85 deletions(-).
      
      The main changes are:
      
      1) link_update fix, from Andrii.
      
      2) libbpf get_xdp_id fix, from David.
      
      3) xadd verifier fix, from Jann.
      
      4) x86-32 JIT fixes, from Luke and Wang.
      
      5) test_btf fix, from Stanislav.
      
      6) freplace verifier fix, from Toke.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests/bpf: Fix a couple of broken test_btf cases · e1cebd84
      Stanislav Fomichev authored
      Commit 51c39bb1 ("bpf: Introduce function-by-function verification")
      introduced the function linkage flag and changed the error message from
      "vlen != 0" to "Invalid func linkage", which broke some fake BPF programs.
      
      Adjust the test accordingly.
      
      AFAICT, the programs don't really need any arguments and only look
      at BTF for maps, so let's drop the args altogether.
      
      Before:
      BTF raw test[103] (func (Non zero vlen)): do_test_raw:3703:FAIL expected
      err_str:vlen != 0
      magic: 0xeb9f
      version: 1
      flags: 0x0
      hdr_len: 24
      type_off: 0
      type_len: 72
      str_off: 72
      str_len: 10
      btf_total_size: 106
      [1] INT (anon) size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
      [2] INT (anon) size=4 bits_offset=0 nr_bits=32 encoding=(none)
      [3] FUNC_PROTO (anon) return=0 args=(1 a, 2 b)
      [4] FUNC func type_id=3 Invalid func linkage
      
      BTF libbpf test[1] (test_btf_haskv.o): libbpf: load bpf program failed:
      Invalid argument
      libbpf: -- BEGIN DUMP LOG ---
      libbpf:
      Validating test_long_fname_2() func#1...
      Arg#0 type PTR in test_long_fname_2() is not supported yet.
      processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0
      peak_states 0 mark_read 0
      
      libbpf: -- END LOG --
      libbpf: failed to load program 'dummy_tracepoint'
      libbpf: failed to load object 'test_btf_haskv.o'
      do_test_file:4201:FAIL bpf_object__load: -4007
      BTF libbpf test[2] (test_btf_newkv.o): libbpf: load bpf program failed:
      Invalid argument
      libbpf: -- BEGIN DUMP LOG ---
      libbpf:
      Validating test_long_fname_2() func#1...
      Arg#0 type PTR in test_long_fname_2() is not supported yet.
      processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0
      peak_states 0 mark_read 0
      
      libbpf: -- END LOG --
      libbpf: failed to load program 'dummy_tracepoint'
      libbpf: failed to load object 'test_btf_newkv.o'
      do_test_file:4201:FAIL bpf_object__load: -4007
      BTF libbpf test[3] (test_btf_nokv.o): libbpf: load bpf program failed:
      Invalid argument
      libbpf: -- BEGIN DUMP LOG ---
      libbpf:
      Validating test_long_fname_2() func#1...
      Arg#0 type PTR in test_long_fname_2() is not supported yet.
      processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0
      peak_states 0 mark_read 0
      
      libbpf: -- END LOG --
      libbpf: failed to load program 'dummy_tracepoint'
      libbpf: failed to load object 'test_btf_nokv.o'
      do_test_file:4201:FAIL bpf_object__load: -4007
      
      Fixes: 51c39bb1 ("bpf: Introduce function-by-function verification")
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200422003753.124921-1-sdf@google.com