  1. Nov 09, 2019
  2. Nov 08, 2019
    • tipc: eliminate checking netns if node established · d408bef4
      Hoang Le authored

      Currently, we scan over all network namespaces at each received
      discovery message in order to check if the sending peer might be
      present in a host-local namespace.
      
      This is unnecessary since we can assume that a peer will not change its
      location during an established session.
      
      We now improve the condition for this testing so that we don't perform
      any redundant scans.
      
      Fixes: f73b1281 ("tipc: improve throughput between nodes in netns")
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: add a READ_ONCE() in skb_peek_tail() · f8cc62ca
      Eric Dumazet authored

      skb_peek_tail() can be used without protection of a lock,
      as spotted by KCSAN [1]
      
      In order to avoid load-tearing, add a READ_ONCE()
      
      Note that the corresponding WRITE_ONCE() are already there.
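
As a rough illustration of the pattern described above, here is a userspace sketch of a lockless tail peek. The `READ_ONCE()`/`WRITE_ONCE()` macros are simplified stand-ins for the kernel ones, and `struct node`, `struct queue`, `peek_tail()` and `queue_tail()` are hypothetical names, not the kernel's `sk_buff` code.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel macros: a volatile access makes the
 * compiler emit exactly one full-width load or store (GCC/Clang extension). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

struct node {
    struct node *next;
    struct node *prev;
};

/* Circular list with a sentinel head (hypothetical, not sk_buff_head). */
struct queue {
    struct node head;
};

static void queue_init(struct queue *q)
{
    q->head.next = &q->head;
    q->head.prev = &q->head;
}

/* Lockless peek: load the tail pointer once, then test only that copy. */
static struct node *peek_tail(struct queue *q)
{
    struct node *tail = READ_ONCE(q->head.prev);

    return tail == &q->head ? NULL : tail;
}

/* Writer side (assumed to run under the queue lock) publishes the new
 * pointers with WRITE_ONCE() so that a concurrent peek sees whole values. */
static void queue_tail(struct queue *q, struct node *n)
{
    struct node *last = q->head.prev;

    n->next = &q->head;
    n->prev = last;
    WRITE_ONCE(last->next, n);
    WRITE_ONCE(q->head.prev, n);
}
```

Without the single annotated load, the compiler would be free to read the tail pointer twice (once for the comparison, once for the return value) and the two reads could disagree.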
      
      [1]
      BUG: KCSAN: data-race in sk_wait_data / skb_queue_tail
      
      read to 0xffff8880b36a4118 of 8 bytes by task 20426 on cpu 1:
       skb_peek_tail include/linux/skbuff.h:1784 [inline]
       sk_wait_data+0x15b/0x250 net/core/sock.c:2477
       kcm_wait_data+0x112/0x1f0 net/kcm/kcmsock.c:1103
       kcm_recvmsg+0xac/0x320 net/kcm/kcmsock.c:1130
       sock_recvmsg_nosec net/socket.c:871 [inline]
       sock_recvmsg net/socket.c:889 [inline]
       sock_recvmsg+0x92/0xb0 net/socket.c:885
       ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480
       do_recvmmsg+0x19a/0x5c0 net/socket.c:2601
       __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680
       __do_sys_recvmmsg net/socket.c:2703 [inline]
       __se_sys_recvmmsg net/socket.c:2696 [inline]
       __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      write to 0xffff8880b36a4118 of 8 bytes by task 451 on cpu 0:
       __skb_insert include/linux/skbuff.h:1852 [inline]
       __skb_queue_before include/linux/skbuff.h:1958 [inline]
       __skb_queue_tail include/linux/skbuff.h:1991 [inline]
       skb_queue_tail+0x7e/0xc0 net/core/skbuff.c:3145
       kcm_queue_rcv_skb+0x202/0x310 net/kcm/kcmsock.c:206
       kcm_rcv_strparser+0x74/0x4b0 net/kcm/kcmsock.c:370
       __strp_recv+0x348/0xf50 net/strparser/strparser.c:309
       strp_recv+0x84/0xa0 net/strparser/strparser.c:343
       tcp_read_sock+0x174/0x5c0 net/ipv4/tcp.c:1639
       strp_read_sock+0xd4/0x140 net/strparser/strparser.c:366
       do_strp_work net/strparser/strparser.c:414 [inline]
       strp_work+0x9a/0xe0 net/strparser/strparser.c:423
       process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
       worker_thread+0xa0/0x800 kernel/workqueue.c:2415
       kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 451 Comm: kworker/u4:3 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: kstrp strp_work
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: add annotations on hh->hh_len lockless accesses · c305c6ae
      Eric Dumazet authored

      KCSAN reported a data-race [1]
      
      While we can use READ_ONCE() on the read sides,
      we need to make sure hh->hh_len is written last.
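
The "write the length last" idea can be sketched in userspace with C11 atomics. This is an analogy under stated assumptions, not the kernel change itself: `struct hdr_cache` and its helpers are hypothetical, and a release store paired with an acquire load is used here in place of the kernel's own primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

#define HDR_MAX 32

/* Hypothetical stand-in for a cached hardware header. */
struct hdr_cache {
    unsigned char data[HDR_MAX];
    atomic_uint   len;            /* 0 means "not filled in yet" */
};

static void hdr_cache_init(struct hdr_cache *hc)
{
    memset(hc->data, 0, sizeof(hc->data));
    atomic_init(&hc->len, 0);
}

/* Writer: fill the payload first, then publish the length last.
 * The release store orders the memcpy() before the length update. */
static void hdr_cache_fill(struct hdr_cache *hc, const void *src, unsigned int n)
{
    if (n == 0 || n > HDR_MAX)
        return;
    memcpy(hc->data, src, n);
    atomic_store_explicit(&hc->len, n, memory_order_release);
}

/* Reader: a single acquire load of the length decides whether the
 * payload may be used. Returns the number of bytes copied, or 0. */
static unsigned int hdr_cache_copy(struct hdr_cache *hc, void *dst)
{
    unsigned int n = atomic_load_explicit(&hc->len, memory_order_acquire);

    if (n)
        memcpy(dst, hc->data, n);
    return n;
}
```

A reader that observes a non-zero length is then guaranteed to see the header bytes written before it.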
      
      [1]
      
      BUG: KCSAN: data-race in eth_header_cache / neigh_resolve_output
      
      write to 0xffff8880b9dedcb8 of 4 bytes by task 29760 on cpu 0:
       eth_header_cache+0xa9/0xd0 net/ethernet/eth.c:247
       neigh_hh_init net/core/neighbour.c:1463 [inline]
       neigh_resolve_output net/core/neighbour.c:1480 [inline]
       neigh_resolve_output+0x415/0x470 net/core/neighbour.c:1470
       neigh_output include/net/neighbour.h:511 [inline]
       ip6_finish_output2+0x7a2/0xec0 net/ipv6/ip6_output.c:116
       __ip6_finish_output net/ipv6/ip6_output.c:142 [inline]
       __ip6_finish_output+0x2d7/0x330 net/ipv6/ip6_output.c:127
       ip6_finish_output+0x41/0x160 net/ipv6/ip6_output.c:152
       NF_HOOK_COND include/linux/netfilter.h:294 [inline]
       ip6_output+0xf2/0x280 net/ipv6/ip6_output.c:175
       dst_output include/net/dst.h:436 [inline]
       NF_HOOK include/linux/netfilter.h:305 [inline]
       ndisc_send_skb+0x459/0x5f0 net/ipv6/ndisc.c:505
       ndisc_send_ns+0x207/0x430 net/ipv6/ndisc.c:647
       rt6_probe_deferred+0x98/0xf0 net/ipv6/route.c:615
       process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
       worker_thread+0xa0/0x800 kernel/workqueue.c:2415
       kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
      
      read to 0xffff8880b9dedcb8 of 4 bytes by task 29572 on cpu 1:
       neigh_resolve_output net/core/neighbour.c:1479 [inline]
       neigh_resolve_output+0x113/0x470 net/core/neighbour.c:1470
       neigh_output include/net/neighbour.h:511 [inline]
       ip6_finish_output2+0x7a2/0xec0 net/ipv6/ip6_output.c:116
       __ip6_finish_output net/ipv6/ip6_output.c:142 [inline]
       __ip6_finish_output+0x2d7/0x330 net/ipv6/ip6_output.c:127
       ip6_finish_output+0x41/0x160 net/ipv6/ip6_output.c:152
       NF_HOOK_COND include/linux/netfilter.h:294 [inline]
       ip6_output+0xf2/0x280 net/ipv6/ip6_output.c:175
       dst_output include/net/dst.h:436 [inline]
       NF_HOOK include/linux/netfilter.h:305 [inline]
       ndisc_send_skb+0x459/0x5f0 net/ipv6/ndisc.c:505
       ndisc_send_ns+0x207/0x430 net/ipv6/ndisc.c:647
       rt6_probe_deferred+0x98/0xf0 net/ipv6/route.c:615
       process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
       worker_thread+0xa0/0x800 kernel/workqueue.c:2415
       kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 29572 Comm: kworker/1:4 Not tainted 5.4.0-rc6+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: events rt6_probe_deferred
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'u64_stats_t' · 9dfd8714
      David S. Miller authored

      Eric Dumazet says:
      
      ====================
      net: introduce u64_stats_t
      
      KCSAN found a data-race in per-cpu u64 stats accounting.
      
      (The stack traces are included in the 8th patch:
       tun: switch to u64_stats_t)

      This patch series first consolidates code in five patches.
      Then the last three patches resolve the data race.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: use u64_stats_t in struct pcpu_lstats · fd2f4737
      Eric Dumazet authored

      In order to fix the data-race found by KCSAN, we
      can use the new u64_stats_t type and its accessors instead
      of plain u64 fields. This will still generate optimal code
      for both 32 and 64 bit platforms.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tun: switch to u64_stats_t · 5260dd3e
      Eric Dumazet authored

      In order to fix this data-race found by KCSAN [1],
      switch to u64_stats_t helpers. They provide all
      the needed annotations, without adding extra cost.
      
      [1]
      BUG: KCSAN: data-race in tun_get_user / tun_net_get_stats64
      
      read to 0xffffe8ffffd8aca8 of 8 bytes by task 4882 on cpu 0:
       tun_net_get_stats64+0x9b/0x230 drivers/net/tun.c:1171
       dev_get_stats+0x89/0x1e0 net/core/dev.c:9103
       rtnl_fill_stats+0x56/0x370 net/core/rtnetlink.c:1177
       rtnl_fill_ifinfo+0xd3b/0x2100 net/core/rtnetlink.c:1667
       rtmsg_ifinfo_build_skb+0xb0/0x150 net/core/rtnetlink.c:3472
       rtmsg_ifinfo_event.part.0+0x4e/0xb0 net/core/rtnetlink.c:3504
       rtmsg_ifinfo_event net/core/rtnetlink.c:3515 [inline]
       rtmsg_ifinfo+0x85/0x90 net/core/rtnetlink.c:3513
       __dev_notify_flags+0x18b/0x200 net/core/dev.c:7649
       dev_change_flags+0xb8/0xe0 net/core/dev.c:7691
       dev_ifsioc+0x201/0x6a0 net/core/dev_ioctl.c:237
       dev_ioctl+0x149/0x660 net/core/dev_ioctl.c:489
       sock_do_ioctl+0xdb/0x230 net/socket.c:1061
       sock_ioctl+0x3a3/0x5e0 net/socket.c:1189
       vfs_ioctl fs/ioctl.c:46 [inline]
       file_ioctl fs/ioctl.c:509 [inline]
       do_vfs_ioctl+0x991/0xc60 fs/ioctl.c:696
      
      write to 0xffffe8ffffd8aca8 of 8 bytes by task 4883 on cpu 1:
       tun_get_user+0x1d94/0x2ba0 drivers/net/tun.c:2002
       tun_chr_write_iter+0x79/0xd0 drivers/net/tun.c:2022
       call_write_iter include/linux/fs.h:1895 [inline]
       new_sync_write+0x388/0x4a0 fs/read_write.c:483
       __vfs_write+0xb1/0xc0 fs/read_write.c:496
       __kernel_write+0xb8/0x240 fs/read_write.c:515
       write_pipe_buf+0xb6/0xf0 fs/splice.c:794
       splice_from_pipe_feed fs/splice.c:500 [inline]
       __splice_from_pipe+0x248/0x480 fs/splice.c:624
       splice_from_pipe+0xbb/0x100 fs/splice.c:659
       default_file_splice_write+0x45/0x90 fs/splice.c:806
       do_splice_from fs/splice.c:848 [inline]
       direct_splice_actor+0xa0/0xc0 fs/splice.c:1020
       splice_direct_to_actor+0x215/0x510 fs/splice.c:975
       do_splice_direct+0x161/0x1e0 fs/splice.c:1063
       do_sendfile+0x384/0x7f0 fs/read_write.c:1464
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 4883 Comm: syz-executor.1 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • u64_stats: provide u64_stats_t type · 316580b6
      Eric Dumazet authored

      On 64bit arches, struct u64_stats_sync is empty and provides
      no help against load/store tearing.
      
      Using READ_ONCE()/WRITE_ONCE() would be needed.
      
      But the update side would be slightly more expensive.
      
      local64_t was defined so that we could use regular adds
      in a manner which is atomic wrt IRQs.
      
      However the u64_stats infra means we do not have to use
      local64_t on 32bit arches since the syncp provides the needed
      protection.
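
The core idea, an opaque counter type that can only be touched through tear-free accessors, can be sketched in userspace as below. This is a hedged illustration: `stats64_t` and its helpers are hypothetical names, and relaxed C11 atomics are substituted for the kernel's `local64_t`, so the generated code and cost are not the same as in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Wrapping the counter in a struct stops callers from using plain
 * loads and stores on it by accident (hypothetical type name). */
typedef struct {
    _Atomic uint64_t v;
} stats64_t;

static inline void stats64_init(stats64_t *p)
{
    atomic_init(&p->v, 0);
}

/* Tear-free read, usable from any thread. */
static inline uint64_t stats64_read(stats64_t *p)
{
    return atomic_load_explicit(&p->v, memory_order_relaxed);
}

/* Update path. Assumes a single writer per counter (as with a per-CPU
 * statistic), so a relaxed load followed by a relaxed store suffices. */
static inline void stats64_add(stats64_t *p, uint64_t val)
{
    uint64_t cur = atomic_load_explicit(&p->v, memory_order_relaxed);

    atomic_store_explicit(&p->v, cur + val, memory_order_relaxed);
}

static inline void stats64_inc(stats64_t *p)
{
    stats64_add(p, 1);
}
```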
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dummy: use standard dev_lstats_add() and dev_lstats_read() · 4a43b1f9
      Eric Dumazet authored

      This driver can simply use the common infrastructure instead
      of duplicating it.
      
      This cleanup will ease u64_stats_t adoption in a single location.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vsockmon: use standard dev_lstats_add() and dev_lstats_read() · 4f77eb09
      Eric Dumazet authored

      This cleanup will ease u64_stats_t adoption in a single location.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • veth: use standard dev_lstats_add() and dev_lstats_read() · b4fba476
      Eric Dumazet authored

      This cleanup will ease u64_stats_t adoption in a single location.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: nlmon: use standard dev_lstats_add() and dev_lstats_read() · 3ed91226
      Eric Dumazet authored

      No need to hand-code the exact same functions.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: provide dev_lstats_add() helper · dd5382a0
      Eric Dumazet authored

      Many network drivers need it and hand-coded the same function.
      
      In order to ease u64_stats_t adoption, it is time to factorize.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: provide dev_lstats_read() helper · de7d5084
      Eric Dumazet authored

      Many network drivers use hand-coded implementation of the same thing,
      let's factorize things so that u64_stats_t adoption is done once.
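
The shape of the code being factored out can be sketched roughly as follows: each CPU bumps its own packet and byte counters, and a reader sums over all CPUs. This is a simplified userspace model with hypothetical names (`struct fake_dev`, `lstats_add()`, `lstats_read()`, a fixed CPU count); the synchronization the kernel helpers need is left out.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4   /* hypothetical fixed CPU count */

/* One packets/bytes pair per CPU. */
struct lstats {
    uint64_t packets;
    uint64_t bytes;
};

/* Hypothetical stand-in for a net device with per-CPU stats. */
struct fake_dev {
    struct lstats lstats[NR_CPUS_SKETCH];
};

/* Update path: touch only the counters of the CPU doing the work. */
static void lstats_add(struct fake_dev *dev, unsigned int cpu, unsigned int len)
{
    struct lstats *s = &dev->lstats[cpu];

    s->bytes += len;
    s->packets++;
}

/* Read path: fold every CPU's counters into one total. */
static void lstats_read(const struct fake_dev *dev,
                        uint64_t *packets, uint64_t *bytes)
{
    *packets = 0;
    *bytes = 0;
    for (unsigned int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
        *packets += dev->lstats[cpu].packets;
        *bytes += dev->lstats[cpu].bytes;
    }
}
```

Keeping both helpers in one place means a later change to the counter type only has to be made once, rather than in every driver.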
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-Demote-MTU-change-prints-to-debug' · 0f030bdb
      David S. Miller authored

      Florian Fainelli says:
      
      ====================
      net: Demote MTU change prints to debug
      
      This patch series demotes the MTU change prints in several drivers to
      debug level; they could otherwise spam the kernel console when a test
      does nothing but cycle through MTU values. The Intel drivers were also
      not particularly consistent in how they printed the same message, so
      now they are.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: qcom/emac: Demote MTU change print to debug · 54093866
      Florian Fainelli authored

      Changing the MTU can be a frequent operation, and it is already clear
      whether an MTU change succeeded, so demote the prints to debug prints.
      
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Acked-by: Timur Tabi <timur@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: intel: Demote MTU change prints to debug · 12299132
      Florian Fainelli authored

      Changing a network device MTU can be a fairly frequent operation, and
      failure to change the MTU is reflected to user-space properly, both by
      an appropriate message as well as by looking at whether the device's MTU
      matches the configuration.
      
      Demote the prints to debug prints by using netdev_dbg(), making all
      Intel wired LAN drivers consistent, since they used a mixture of PCI
      device and network device prints before.
      
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ethernet: ti: cpts: use ktime_get_real_ns helper · 693bd8b7
      Ivan Khoronzhuk authored

      Switch to the shorter helper for reading the real-time clock in ns.
      
      Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'aquantia-next' · a9ae1683
      David S. Miller authored

      Igor Russkikh says:
      
      ====================
      Aquantia Marvell atlantic driver updates 11-2019
      
      Here is a set of new features and updates for the atlantic driver.

      Shortlist:
      - I am adding ethtool private flags for various loopback test modes.
      - Nikita is doing some work on power management, implementing the new
        PM API. He also did some checkpatch style cleanup of older driver parts.
      - I'm also adding new UDP GSO offload support and flags for loopback
        activation.
      - We are now Marvell, so I am changing email addresses on the
        maintainers list.

      v2: styling, correct ip6 handling in udpgso
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: atlantic: change email domains to Marvell · 362cabda
      Igor Russkikh authored

      Aquantia is now part of Marvell, and the standalone aquantia.com
      domain will eventually be retired. Thus, change the maintainers file
      and some other references to the @marvell.com domain.
      
      Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: atlantic: implement UDP GSO offload · 822cd114
      Igor Russkikh authored

      Atlantic hardware supports UDP hardware segmentation offload.
      This lets the user pass one large contiguous buffer of data,
      which is then split automatically into multiple UDP packets
      of the specified size.
      
      Bulk sending of large UDP streams lowers CPU usage and increases
      bandwidth.
      
      We took measurements both with the udpgso_bench_tx test tool and with a
      modified iperf3 (4 streams, multithreaded, 200-byte packet size)
      over an AQC<->AQC 10G link. Flow control was disabled to keep the RX
      side from affecting the measurements.
      
      No UDP GSO:
      	iperf3 -c 10.0.1.2 -u -b0 -l 200 -P4 --multithread
      UDP GSO:
      	iperf3 -c 10.0.1.2 -u -b0 -l 12600 --udp-lso 200 -P4 --multithread
      
      Mode          CPU   iperf speed    Line speed   Packets per second
      -------------------------------------------------------------
      NO UDP GSO    350%   3.07 Gbps      3.8 Gbps     1,919,419
      SW UDP GSO    200%   5.55 Gbps      6.4 Gbps     3,286,144
      HW UDP GSO    90%    6.80 Gbps      8.4 Gbps     4,273,117
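
From the application side, UDP GSO is requested per socket with the Linux `UDP_SEGMENT` socket option. The sketch below shows that call plus the segment arithmetic for the sizes used in the iperf3 runs above (a 12600-byte send with 200-byte segments). The helper names `udp_enable_gso()` and `udp_gso_segments()` are hypothetical, and whether the split happens in software or on the NIC depends on the driver and its offload settings.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

#ifndef UDP_SEGMENT
#define UDP_SEGMENT 103   /* value from linux/udp.h, for older libc headers */
#endif

/* Ask the kernel to split every send on this UDP socket into datagrams of
 * gso_size payload bytes. Returns 0 on success, or -1 with errno set
 * (for example on kernels without UDP GSO support). */
static int udp_enable_gso(int fd, uint16_t gso_size)
{
    int val = gso_size;

    return setsockopt(fd, IPPROTO_UDP, UDP_SEGMENT, &val, sizeof(val));
}

/* Number of datagrams that one send of len bytes is split into;
 * the last datagram may be shorter than gso_size. */
static size_t udp_gso_segments(size_t len, size_t gso_size)
{
    return (len + gso_size - 1) / gso_size;
}
```

With these sizes, one 12600-byte send becomes 63 datagrams of 200 bytes, so the application makes one system call where it would otherwise make 63.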
      
      Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: atlantic: update flow control logic · 8009bb19
      Nikita Danilov authored

      We now differentiate requested and negotiated flow control
      modes. Therefore `ethtool -A` now operates on the locally requested
      FC values, and the regular link settings show the negotiated FC
      settings.
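
To make the requested/negotiated distinction concrete, here is a sketch of the generic pause resolution rule from IEEE 802.3 autonegotiation: the negotiated mode is derived from what both ends advertise, so it can differ from what was requested locally. The constants and `fc_resolve()` are hypothetical names, and this is the generic rule rather than the atlantic driver's code, which may obtain the negotiated state differently.

```c
#include <assert.h>

/* Advertisement bits (hypothetical names). */
#define ADV_PAUSE      0x1u   /* symmetric pause */
#define ADV_ASYM_PAUSE 0x2u   /* asymmetric pause */

/* Resolved flow control directions, from the local point of view. */
#define FC_TX 0x1u            /* we may send pause frames */
#define FC_RX 0x2u            /* we honour received pause frames */

/* Combine the local (requested) and link partner advertisements into the
 * negotiated flow control mode for a full-duplex link. */
static unsigned int fc_resolve(unsigned int lcl, unsigned int rmt)
{
    unsigned int fc = 0;

    if (lcl & rmt & ADV_PAUSE) {
        fc = FC_TX | FC_RX;
    } else if (lcl & rmt & ADV_ASYM_PAUSE) {
        if (lcl & ADV_PAUSE)
            fc = FC_RX;
        else if (rmt & ADV_PAUSE)
            fc = FC_TX;
    }
    return fc;
}
```

For example, requesting symmetric pause against a partner that advertises nothing resolves to no flow control at all, which is why reporting the two values separately is useful.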
      
      Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
      Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: atlantic: stylistic renames · ddef5526
      Igor Russkikh authored

      We are trying to follow the name of the chip (atlantic), not the
      company, so replace some old names.
      
      Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
      Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: atlantic: code style cleanup · 7b0c342f
      Nikita Danilov authored

      That's a pure checkpatch walk through the code with no functional
      changes. Reverse Christmas tree, spacing, etc.
      
      Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
      Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: atlantic: loopback tests via private flags · ea4b4d7f
      Igor Russkikh authored

      Here we add a number of ethtool private flags
      to allow enabling various loopbacks on HW.

      That's useful for verification and bring-up work.
      
      Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>