  1. Nov 07, 2019
    • net: phy: at803x: fix Kconfig description · 4985dffc
      Michael Walle authored
      
      The name of the PHY is actually AR803x not AT803x. Additionally, add the
      name of the vendor and mention the AR8031 support.
      
      Signed-off-by: Michael Walle <michael@walle.cc>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: fix data-race in tcp_recvmsg() · a5a7daa5
      Eric Dumazet authored
      Reading tp->recvmsg_inq after socket lock is released
      raises a KCSAN warning [1]
      
      Replace has_tss & has_cmsg by cmsg_flags and make
      sure to not read tp->recvmsg_inq a second time.
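
The idea can be sketched in userspace C. This is only an illustration, not the kernel code: `struct fake_tcp_sock`, the flag names `CMSG_INQ` and `CMSG_TS`, and both helper functions are hypothetical. The shared field is sampled once into a local bitmask, and every later decision uses that local copy instead of re-reading the field.

```c
#include <stdbool.h>

/* Hypothetical flag bits collected into one local variable. */
enum { CMSG_INQ = 1, CMSG_TS = 2 };

/* Hypothetical stand-in for socket state shared with softirq context. */
struct fake_tcp_sock {
	unsigned char recvmsg_inq; /* may change once the lock is dropped */
};

/*
 * Sample the shared field exactly once (in real code this would happen
 * while the socket lock is still held) and fold it, together with the
 * timestamp indication, into a local bitmask.
 */
static int recv_snapshot_flags(const struct fake_tcp_sock *tp, bool has_tss)
{
	int cmsg_flags = tp->recvmsg_inq ? CMSG_INQ : 0;

	if (has_tss)
		cmsg_flags |= CMSG_TS;
	return cmsg_flags;
}

/* After the lock is released, only the local snapshot is consulted. */
static bool wants_inq_cmsg(int cmsg_flags)
{
	return (cmsg_flags & CMSG_INQ) != 0;
}
```

A later change to the shared field does not affect a snapshot that was already taken, which is the property the fix relies on.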
      
      [1]
      BUG: KCSAN: data-race in tcp_chrono_stop / tcp_recvmsg
      
      write to 0xffff888126adef24 of 2 bytes by interrupt on cpu 0:
       tcp_chrono_set net/ipv4/tcp_output.c:2309 [inline]
       tcp_chrono_stop+0x14c/0x280 net/ipv4/tcp_output.c:2338
       tcp_clean_rtx_queue net/ipv4/tcp_input.c:3165 [inline]
       tcp_ack+0x274f/0x3170 net/ipv4/tcp_input.c:3688
       tcp_rcv_established+0x37e/0xf50 net/ipv4/tcp_input.c:5696
       tcp_v4_do_rcv+0x381/0x4e0 net/ipv4/tcp_ipv4.c:1561
       tcp_v4_rcv+0x19dc/0x1bb0 net/ipv4/tcp_ipv4.c:1942
       ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204
       ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
       dst_input include/net/dst.h:442 [inline]
       ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124
       netif_receive_skb_internal+0x59/0x190 net/core/dev.c:5214
       napi_skb_finish net/core/dev.c:5677 [inline]
       napi_gro_receive+0x28f/0x330 net/core/dev.c:5710
      
      read to 0xffff888126adef25 of 1 bytes by task 7275 on cpu 1:
       tcp_recvmsg+0x77b/0x1a30 net/ipv4/tcp.c:2187
       inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838
       sock_recvmsg_nosec net/socket.c:871 [inline]
       sock_recvmsg net/socket.c:889 [inline]
       sock_recvmsg+0x92/0xb0 net/socket.c:885
       sock_read_iter+0x15f/0x1e0 net/socket.c:967
       call_read_iter include/linux/fs.h:1889 [inline]
       new_sync_read+0x389/0x4f0 fs/read_write.c:414
       __vfs_read+0xb1/0xc0 fs/read_write.c:427
       vfs_read fs/read_write.c:461 [inline]
       vfs_read+0x143/0x2c0 fs/read_write.c:446
       ksys_read+0xd5/0x1b0 fs/read_write.c:587
       __do_sys_read fs/read_write.c:597 [inline]
       __se_sys_read fs/read_write.c:595 [inline]
       __x64_sys_read+0x4c/0x60 fs/read_write.c:595
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 7275 Comm: sshd Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: b75eba76 ("tcp: send in-queue bytes in cmsg upon read")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: silence data-races on sk_backlog.tail · 9ed498c6
      Eric Dumazet authored
      
      sk->sk_backlog.tail might be read without holding the socket spinlock,
      we need to add proper READ_ONCE()/WRITE_ONCE() to silence the warnings.
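
A minimal userspace sketch of this annotation pattern follows. The macros are simplified stand-ins for the kernel's READ_ONCE()/WRITE_ONCE() (the real definitions are more elaborate), and `struct fake_backlog`, `backlog_add()` and `backlog_nonempty()` are hypothetical names used only for illustration.

```c
#include <stddef.h>

/*
 * Simplified stand-ins for the kernel macros: a volatile access makes the
 * compiler emit exactly one load or store. __typeof__ is a GCC/Clang
 * extension.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

struct fake_skb { struct fake_skb *next; };

/* Hypothetical miniature of a socket backlog queue. */
struct fake_backlog {
	struct fake_skb *head;
	struct fake_skb *tail;
};

/* Writer side: assumed to run with the queue lock held. */
static void backlog_add(struct fake_backlog *q, struct fake_skb *skb)
{
	skb->next = NULL;
	if (!q->tail)
		q->head = skb;
	else
		q->tail->next = skb;
	WRITE_ONCE(q->tail, skb); /* pairs with the lockless reader below */
}

/* Reader side: may run without the lock, so it loads tail exactly once. */
static int backlog_nonempty(struct fake_backlog *q)
{
	return READ_ONCE(q->tail) != NULL;
}
```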
      
      KCSAN reported:
      
      BUG: KCSAN: data-race in tcp_add_backlog / tcp_recvmsg
      
      write to 0xffff8881265109f8 of 8 bytes by interrupt on cpu 1:
       __sk_add_backlog include/net/sock.h:907 [inline]
       sk_add_backlog include/net/sock.h:938 [inline]
       tcp_add_backlog+0x476/0xce0 net/ipv4/tcp_ipv4.c:1759
       tcp_v4_rcv+0x1a70/0x1bd0 net/ipv4/tcp_ipv4.c:1947
       ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204
       ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
       dst_input include/net/dst.h:442 [inline]
       ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
       NF_HOOK include/linux/netfilter.h:305 [inline]
       NF_HOOK include/linux/netfilter.h:299 [inline]
       ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
       __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:4929
       __netif_receive_skb+0x37/0xf0 net/core/dev.c:5043
       netif_receive_skb_internal+0x59/0x190 net/core/dev.c:5133
       napi_skb_finish net/core/dev.c:5596 [inline]
       napi_gro_receive+0x28f/0x330 net/core/dev.c:5629
       receive_buf+0x284/0x30b0 drivers/net/virtio_net.c:1061
       virtnet_receive drivers/net/virtio_net.c:1323 [inline]
       virtnet_poll+0x436/0x7d0 drivers/net/virtio_net.c:1428
       napi_poll net/core/dev.c:6311 [inline]
       net_rx_action+0x3ae/0xa90 net/core/dev.c:6379
       __do_softirq+0x115/0x33f kernel/softirq.c:292
       invoke_softirq kernel/softirq.c:373 [inline]
       irq_exit+0xbb/0xe0 kernel/softirq.c:413
       exiting_irq arch/x86/include/asm/apic.h:536 [inline]
       do_IRQ+0xa6/0x180 arch/x86/kernel/irq.c:263
       ret_from_intr+0x0/0x19
       native_safe_halt+0xe/0x10 arch/x86/kernel/paravirt.c:71
       arch_cpu_idle+0x1f/0x30 arch/x86/kernel/process.c:571
       default_idle_call+0x1e/0x40 kernel/sched/idle.c:94
       cpuidle_idle_call kernel/sched/idle.c:154 [inline]
       do_idle+0x1af/0x280 kernel/sched/idle.c:263
       cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:355
       start_secondary+0x208/0x260 arch/x86/kernel/smpboot.c:264
       secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241
      
      read to 0xffff8881265109f8 of 8 bytes by task 8057 on cpu 0:
       tcp_recvmsg+0x46e/0x1b40 net/ipv4/tcp.c:2050
       inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838
       sock_recvmsg_nosec net/socket.c:871 [inline]
       sock_recvmsg net/socket.c:889 [inline]
       sock_recvmsg+0x92/0xb0 net/socket.c:885
       sock_read_iter+0x15f/0x1e0 net/socket.c:967
       call_read_iter include/linux/fs.h:1889 [inline]
       new_sync_read+0x389/0x4f0 fs/read_write.c:414
       __vfs_read+0xb1/0xc0 fs/read_write.c:427
       vfs_read fs/read_write.c:461 [inline]
       vfs_read+0x143/0x2c0 fs/read_write.c:446
       ksys_read+0xd5/0x1b0 fs/read_write.c:587
       __do_sys_read fs/read_write.c:597 [inline]
       __se_sys_read fs/read_write.c:595 [inline]
       __x64_sys_read+0x4c/0x60 fs/read_write.c:595
       do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 8057 Comm: syz-fuzzer Not tainted 5.4.0-rc6+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dpaa2-eth: fix an always true condition in dpaa2_mac_get_if_mode · 226df3ef
      Ioana Ciornei authored
      Convert the phy_mode() function to return the if_mode through an
      argument, similar to the new form of of_get_phy_mode().
      This will help with handling errors in a common manner and also will fix
      an always true condition.
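
The calling convention described here can be sketched as below. The names `enum if_mode` and `parse_if_mode()` are hypothetical and do not come from the driver. The status travels in the int return value and the mode is written through a pointer, so the caller never has to compare an enum (which may be unsigned) against zero.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical subset of interface modes, for illustration only. */
enum if_mode { IF_MODE_NA, IF_MODE_RGMII, IF_MODE_SGMII };

/*
 * Return 0 and store the mode through *if_mode on success, or a negative
 * errno on failure. *if_mode is left untouched when an error is returned.
 */
static int parse_if_mode(const char *name, enum if_mode *if_mode)
{
	if (!name || !if_mode)
		return -EINVAL;
	if (strcmp(name, "rgmii") == 0)
		*if_mode = IF_MODE_RGMII;
	else if (strcmp(name, "sgmii") == 0)
		*if_mode = IF_MODE_SGMII;
	else
		return -EINVAL;
	return 0;
}
```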
      
      Fixes: 0c65b2b9 ("net: of_get_phy_mode: Change API to solve int/unit warnings")
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: openvswitch: select vport upcall portid directly · 90ce9f23
      Tonghao Zhang authored
      
      The commit 69c51582ff786 ("dpif-netlink: don't allocate per
      thread netlink sockets"), in Open vSwitch ovs-vswitchd, has
      changed the number of allocated sockets to just one per port
      by moving the socket array from a per handler structure to
      a per datapath one. In the kernel datapath, a vport will have
      only one socket in most cases; if so, select it directly in
      the fast path.
      
      Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: axienet: Fix error return code in axienet_probe() · eb34e98b
      Wei Yongjun authored
      If getting the DMA memory resource fails, the error code is not
      set and 0 is returned. Fix it by removing the redundant check,
      since devm_ioremap_resource() will handle it.
      
      Fixes: 28ef9ebd ("net: axienet: make use of axistream-connected attribute optional")
      Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
      Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: aquantia: fix return value check in aq_ptp_init() · 1dcff44a
      Wei Yongjun authored
      
      Function ptp_clock_register() returns ERR_PTR() and never returns
      NULL. The NULL test should be removed.
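
For readers unfamiliar with the error-pointer convention, here is a userspace re-creation of the helpers. `fake_clock_register()` and `fake_init()` are hypothetical and only mimic a function that reports failure through ERR_PTR(), which is what the commit message says ptp_clock_register() does.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace re-creations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline bool IS_ERR(const void *ptr)
{
	/* Error codes live in the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical registration call that reports failure via ERR_PTR(). */
static int dummy_clock;
static void *fake_clock_register(bool fail)
{
	return fail ? ERR_PTR(-ENOMEM) : &dummy_clock;
}

/* Caller: one IS_ERR() test suffices; an extra NULL test would be dead code. */
static int fake_init(bool fail)
{
	void *clock = fake_clock_register(fail);

	if (IS_ERR(clock))
		return (int)PTR_ERR(clock);
	return 0;
}
```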
      
      Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
      Acked-by: Igor Russkikh <irusskikh@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ptp: ptp_clockmatrix: Fix missing unlock on error in idtcm_probe() · b97fa0b5
      Wei Yongjun authored
      Add the missing unlock before return from function idtcm_probe()
      in the error handling case.
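
One common way to make such a leak impossible is a single exit label that always drops the lock. The sketch below shows that pattern with a hypothetical `struct fake_lock` that merely counts how often it is held. It is not the driver's code; the commit itself describes adding the unlock before the return.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical lock that only tracks whether it is held. */
struct fake_lock { int held; };

static void fake_lock_acquire(struct fake_lock *l) { l->held++; }
static void fake_lock_release(struct fake_lock *l) { l->held--; }

/*
 * Probe-style function: every exit taken after the lock is acquired goes
 * through the same unlock, so the error path cannot leave it held.
 */
static int fake_probe(struct fake_lock *l, bool step_fails)
{
	int err = 0;

	fake_lock_acquire(l);
	if (step_fails) {
		err = -EIO;
		goto out_unlock;
	}
	/* ... further initialisation would go here ... */
out_unlock:
	fake_lock_release(l);
	return err;
}
```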
      
      Fixes: 3a6ba7dc ("ptp: Add a ptp clock driver for IDT ClockMatrix.")
      Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
      Reviewed-by: Vincent Cheng <vincent.cheng.xh@renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: eliminate the dummy packet in link synching · d0d605c5
      Tuong Lien authored
      When preparing tunnel packets for the link failover or synchronization,
      as for the safe algorithm, we added a dummy packet on the pair link but
      never sent it out. In the case of failover, the pair link will be reset
      anyway. But for link synching, it will always result in retransmission
      of the dummy packet after that.
      We have also observed that such a retransmission, at the early stage
      when a new node joins a large cluster, takes some time and is hard
      to complete, leading to repeated retransmit failures and a link
      reset.

      Since commit 4929a932 ("tipc: optimize link synching mechanism")
      already builds a dummy 'TUNNEL_PROTOCOL' message on the new link
      for the synchronization, there is no need for the dummy on the pair
      link. This commit skips it when the new mechanism is in place. In case
      nothing exists in the pair link's transmq, the link synching will just
      start and stop shortly on the peer side.
      
      The patch is backward compatible.
      
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Tested-by: Hoang Le <hoang.h.le@dektech.com.au>
      Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'lwtunnel-add-ip-and-ip6-options-setting-and-dumping' · 3924f72a
      David S. Miller authored
      
      Xin Long says:
      
      ====================
      lwtunnel: add ip and ip6 options setting and dumping
      
      With this patchset, users can configure options by ip route encap
      for geneve, vxlan and erspan lwtunnel, like:
      
        # ip r a 1.1.1.0/24 encap ip id 1 geneve class 0 type 0 \
          data "1212121234567890" dst 10.1.0.2 dev geneve1
      
        # ip r a 1.1.1.0/24 encap ip id 1 vxlan gbp 456 \
          dst 10.1.0.2 dev erspan1
      
        # ip r a 1.1.1.0/24 encap ip id 1 erspan ver 1 idx 123 \
          dst 10.1.0.2 dev erspan1
      
      The iproute side patch is attached in the reply to this mail.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • lwtunnel: add options setting and dumping for erspan · b0a21810
      Xin Long authored
      
      Based on the code framework built on the last patch, to
      support setting and dumping for erspan, we only need to
      add ip_tun_parse_opts_erspan() for .build_state and
      ip_tun_fill_encap_opts_erspan() for .fill_encap and
      if (tun_flags & TUNNEL_ERSPAN_OPT) for .get_encap_size.
      
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • lwtunnel: add options setting and dumping for vxlan · edf31cbb
      Xin Long authored
      
      Based on the code framework built on the last patch, to
      support setting and dumping for vxlan, we only need to
      add ip_tun_parse_opts_vxlan() for .build_state and
      ip_tun_fill_encap_opts_vxlan() for .fill_encap and
      if (tun_flags & TUNNEL_VXLAN_OPT) for .get_encap_size.
      
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • lwtunnel: add options setting and dumping for geneve · 4ece4778
      Xin Long authored
      
      To add options setting and dumping, .build_state(), .fill_encap() and
      .get_encap_size() in ip_tun_lwt_ops need to be extended:
      
      ip_tun_build_state():
        ip_tun_parse_opts():
          ip_tun_parse_opts_geneve()
      
      ip_tun_fill_encap_info():
        ip_tun_fill_encap_opts():
          ip_tun_fill_encap_opts_geneve()
      
      ip_tun_encap_nlsize():
        ip_tun_opts_nlsize():
          if (tun_flags & TUNNEL_GENEVE_OPT)

      ip_tun_parse_opts(), ip_tun_fill_encap_opts() and ip_tun_opts_nlsize()
      process LWTUNNEL_IP_OPTS.

      ip_tun_parse_opts_geneve(), ip_tun_fill_encap_opts_geneve() and
      if (tun_flags & TUNNEL_GENEVE_OPT) process LWTUNNEL_IP_OPTS_GENEVE.
      
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • lwtunnel: add options process for cmp_encap · 0eb8eb2f
      Xin Long authored
      When comparing two tun_info, dst_cache member should have been skipped,
      as dst_cache is a per cpu pointer and they are always different values
      even in two tun_info with the same keys.
      
      So this patch skips the dst_cache member and compares the key, mode and
      options_len only. For the future opts setting support, it also compares
      the options.
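
A field-by-field comparison of this kind might look as follows. The structures are hypothetical and heavily simplified compared with the kernel's tunnel info. What matters is that the per-instance `dst_cache` pointer is left out of the comparison.

```c
#include <string.h>

#define FAKE_MAX_OPTS 16

/* Hypothetical, simplified tunnel key (no padding between members). */
struct fake_tun_key { unsigned int tun_id; unsigned int dst; };

struct fake_tun_info {
	struct fake_tun_key key;
	void *dst_cache;              /* per-CPU pointer: differs per instance */
	unsigned char mode;
	unsigned char options_len;    /* assumed to be <= FAKE_MAX_OPTS */
	unsigned char options[FAKE_MAX_OPTS];
};

/* Return 0 when equal, non-zero otherwise; dst_cache is deliberately ignored. */
static int fake_tun_cmp(const struct fake_tun_info *a,
			const struct fake_tun_info *b)
{
	return memcmp(&a->key, &b->key, sizeof(a->key)) ||
	       a->mode != b->mode ||
	       a->options_len != b->options_len ||
	       memcmp(a->options, b->options, a->options_len);
}
```

A plain memcmp() over the whole structure would report a mismatch whenever the two `dst_cache` pointers differ, even with identical keys and options.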
      
      Fixes: 2d798499 ("lwtunnel: ip tunnel: fix multiple routes with different encap")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • lwtunnel: add options process for arp request · f52f11ec
      Xin Long authored
      Without options copied to the dst tun_info in iptunnel_metadata_reply()
      called by arp_process for handling arp_request, the generated arp_reply
      packet may be dropped or sent out with wrong options for some tunnels
      like erspan and vxlan, and the traffic will break.
      
      Fixes: 63d008a4 ("ipv4: send arp replies to the correct tunnel")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: reduce sensitive to retransmit failures · 426071f1
      Hoang Le authored
      In a huge cluster (e.g. >200 nodes), the flow
      gap -> retransmit packet -> acked takes time when a STATE_MSG is
      dropped or delayed because of heavy traffic. With the 1.5 sec tolerance
      value as the criterion, a link then fails easily around the 2nd or 3rd
      failed retransmission attempt.

      Instead of re-introducing the criterion of 99 failed retransmissions to
      fix the issue, we increase the failure detection timer to ten times the
      tolerance value.
      
      Fixes: 77cf8edb ("tipc: simplify stale link failure criteria")
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: update cluster capabilities if node deleted · 6708ef77
      Hoang Le authored
      
      There are two improvements when recalculating cluster capabilities:

      - When deleting a specific down node, we need to recalculate.
      - In tipc_node_cleanup(), we do not need to recalculate if the node
        still exists in the cluster.
      
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftest: net: add some traceroute tests · 3c28d99f
      Francesco Ruggeri authored
      
      Added the following traceroute tests.
      
      IPV6:
      Verify that in this scenario
      
             ------------------------ N2
              |                    |
            ------              ------  N3  ----
            | R1 |              | R2 |------|H2|
            ------              ------      ----
              |                    |
             ------------------------ N1
                       |
                      ----
                      |H1|
                      ----
      
      where H1's default route goes through R1 and R1's default route goes
      through R2 over N2, traceroute6 from H1 to H2 reports R2's address
      on N2 and not N1.
      
      IPV4:
      Verify that traceroute from H1 to H2 shows 1.0.1.1 in this scenario
      
                         1.0.3.1/24
      ---- 1.0.1.3/24    1.0.1.1/24 ---- 1.0.2.1/24    1.0.2.4/24 ----
      |H1|--------------------------|R1|--------------------------|H2|
      ----            N1            ----            N2            ----
      
      where net.ipv4.icmp_errors_use_inbound_ifaddr is set on R1 and
      1.0.3.1/24 and 1.0.1.1/24 are respectively R1's primary and secondary
      address on N1.
      
      v2: fixed some typos, and have bridge in R1 instead of R2 in IPV6 test.
      
      Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-various-KCSAN-inspired-fixes' · 3edcc568
      David S. Miller authored
      
      Eric Dumazet says:
      
      ====================
      net: various KCSAN inspired fixes
      
      This is a series of minor fixes, mostly dealing with
      lockless accesses to socket 'sk_ack_backlog', 'sk_max_ack_backlog'
      and neighbour 'confirmed' fields.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: annotate lockless accesses to sk->sk_max_ack_backlog · 099ecf59
      Eric Dumazet authored
      
      sk->sk_max_ack_backlog can be read without any lock being held
      at least in TCP/DCCP cases.
      
      We need to use READ_ONCE()/WRITE_ONCE() to avoid load/store tearing
      and/or potential KCSAN warnings.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: annotate lockless accesses to sk->sk_ack_backlog · 288efe86
      Eric Dumazet authored
      
      sk->sk_ack_backlog can be read without any lock being held.
      We need to use READ_ONCE()/WRITE_ONCE() to avoid load/store tearing
      and/or potential KCSAN warnings.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: use helpers to change sk_ack_backlog · 7976a11b
      Eric Dumazet authored
      
      Writers are holding a lock, but many readers do not.
      
      The following patch will add appropriate barriers in
      sk_acceptq_removed() and sk_acceptq_added().
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: avoid potential false sharing in neighbor related code · 25c7a6d1
      Eric Dumazet authored
      
      There are common instances of the following construct:
      
      	if (n->confirmed != now)
      		n->confirmed = now;
      
      A C compiler could legally remove the conditional.
      
      Use READ_ONCE()/WRITE_ONCE() to avoid this problem.
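
The annotated form of that construct can be sketched in userspace as below. The macros are simplified stand-ins for the kernel's, and `struct fake_neigh` and `neigh_confirm()` are hypothetical. The return value exists only so the behaviour can be observed in a test.

```c
/*
 * Simplified stand-ins for the kernel macros; __typeof__ is a GCC/Clang
 * extension.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical miniature of a neighbour entry. */
struct fake_neigh { unsigned long confirmed; };

/*
 * Store only when the value changes. The volatile accesses keep the
 * compiler from dropping the test and storing unconditionally, which
 * would dirty the cache line on every call.
 */
static int neigh_confirm(struct fake_neigh *n, unsigned long now)
{
	if (READ_ONCE(n->confirmed) != now) {
		WRITE_ONCE(n->confirmed, now);
		return 1; /* a store was issued */
	}
	return 0;         /* value already current, no store */
}
```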
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • inet_diag: use jiffies_delta_to_msecs() · 3828a93f
      Eric Dumazet authored
      
      Use jiffies_delta_to_msecs() to avoid reporting 'infinite'
      timeouts and to clean up the code.
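
The effect can be illustrated with a userspace sketch. It assumes that the helper clamps a negative tick delta to zero before converting, and it uses an assumed tick rate `FAKE_HZ`. All function names are hypothetical re-creations, not the kernel's implementation.

```c
#define FAKE_HZ 250UL /* assumed tick rate, for illustration only */

/* Convert ticks to milliseconds for the assumed tick rate. */
static unsigned int fake_jiffies_to_msecs(unsigned long j)
{
	return (unsigned int)(j * (1000UL / FAKE_HZ));
}

/*
 * Clamp a signed tick delta at zero before converting. Without the clamp,
 * a timer that has already expired (negative delta) would be converted as
 * a huge unsigned value and reported as a near-infinite timeout.
 */
static unsigned int fake_jiffies_delta_to_msecs(long delta)
{
	return fake_jiffies_to_msecs(delta > 0 ? (unsigned long)delta : 0UL);
}

/* Remaining time until 'expires', as seen at tick 'now'. */
static unsigned int remaining_msecs(unsigned long expires, unsigned long now)
{
	/* The unsigned difference is reinterpreted as a signed delta. */
	return fake_jiffies_delta_to_msecs((long)(expires - now));
}
```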
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: neigh: use long type to store jiffies delta · 9d027e3a
      Eric Dumazet authored
      A difference of two unsigned long needs long storage.
      
      Fixes: c7fb64db ("[NETLINK]: Neighbour table configuration and statistics via rtnetlink")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tc-testing: updated pedit TDC tests · 71c780f1
      Roman Mashak authored
      
      Added tests for u8/u32 clear value, u8/16 retain value, u16/32 invert value,
      u8/u16/u32 preserve value and test for negative offsets.
      
      Signed-off-by: Roman Mashak <mrv@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: devlink: undo changes at the end of resource_test · 462ef975
      Jakub Kicinski authored
      The netdevsim object is reused by all the tests, but the resource
      test puts it into a broken state (failed reload in a different
      namespace). Make sure it's fixed up at the end of that test,
      otherwise subsequent tests fail.
      
      Fixes: b74c37fd ("selftests: netdevsim: add tests for devlink reload with resources")
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • gianfar: Maximize Rx buffer size · a9b97286
      Claudiu Manoil authored
      
      Until now the size of a Rx buffer was artificially limited
      to 1536B (which happens to be the default, after reset, hardware
      value for a Rx buffer). This approach however leaves unused
      memory space for Rx packets, since the driver uses a paged
      allocation scheme that reserves half a page for each Rx skb.
      There's also the inconvenience that frames around 1536 bytes
      can get scattered if the limit is slightly exceeded. This limit
      can be exceeded even for standard MTU of 1500B traffic, for common
      cases like stacked VLANs, or DSA tags.
      To address these issues, let's just compute the buffer size
      starting from the upper limit of 2KB (half a page) and
      subtract the skb overhead and alignment restrictions.
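
The arithmetic described above can be sketched as follows. Every constant except the 2KB upper limit is an illustrative assumption (the real overhead depends on the kernel configuration), and the names are hypothetical rather than the driver's own.

```c
/* Upper limit from the commit message: half of a 4 KiB page. */
#define RXB_TRUESIZE    2048u
/* Assumed values, for illustration only. */
#define RXB_ALIGNMENT   64u   /* assumed buffer alignment */
#define SHINFO_OVERHEAD 320u  /* assumed aligned size of the skb shared info */

/* Round x down to a multiple of a (a must be non-zero). */
static unsigned int round_down_to(unsigned int x, unsigned int a)
{
	return x - (x % a);
}

/* Largest aligned buffer that still fits in half a page with the overhead. */
static unsigned int rx_buffer_size(void)
{
	unsigned int overhead = RXB_ALIGNMENT + SHINFO_OVERHEAD;

	return round_down_to(RXB_TRUESIZE - overhead, RXB_ALIGNMENT);
}
```

With these assumed numbers the result is 2048 - 384 = 1664 bytes, which already exceeds the previous 1536B limit.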
      
      Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ehea: replace with page_shift() in ehea_is_hugepage() · 9439bb0f
      Yunfeng Ye authored
      The function page_shift() is supported after the commit 94ad9338
      ("mm: introduce page_shift()").
      
      So replace with page_shift() in ehea_is_hugepage() for readability.
      
      Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: forcedeth: add xmit_more support · 5d8876e2
      Zhu Yanjun authored
      This change adds support for xmit_more based on the igb commit 6f19e12f
      ("igb: flush when in xmit_more mode and under descriptor pressure") and
      commit 6b16f9ee ("net: move skb->xmit_more hint to softnet data") that
      were made to igb to support this feature. The function netif_xmit_stopped()
      is called to check whether the transmit queue on the device is currently
      unable to send, to determine whether we must write the tail because we
      can add no further buffers.
      
      When normal packets and/or xmit_more packets fill up tx_desc, it is
      necessary to trigger NIC tx reg.
      
      Following the advice from David Miller and Jakub Kicinski, after the
      xmit_more feature is added, the following scenario will occur.
      
               |
         xmit_more packets
               |
         DMA_MAPPING
               |
         DMA_MAPPING error check
               |
         xmit_more packets already in HW xmit queue
               |
      
      In the above scenario, if a DMA_MAPPING error occurs, the xmit_more packets
      already in the HW xmit queue will also be dropped. This is different from the
      behavior before the xmit_more feature. So it is necessary to trigger the NIC HW
      tx reg in the above scenario.

      For non-xmit_more packets, the above scenario will not occur.
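
The batching rule described above can be modelled with a small userspace simulation. It is a sketch under stated assumptions, not the driver's code: `struct fake_tx_ring`, `kick_hw()` and `fake_xmit()` are hypothetical, and a "kick" stands for the write to the NIC tx register.

```c
#include <stdbool.h>

/* Hypothetical TX ring: counts queued descriptors and doorbell writes. */
struct fake_tx_ring {
	unsigned int queued;  /* descriptors placed but not yet announced */
	unsigned int kicks;   /* number of TX register (doorbell) writes */
	unsigned int dropped; /* packets dropped on mapping errors */
};

/* Announce all queued descriptors to the (simulated) hardware. */
static void kick_hw(struct fake_tx_ring *r)
{
	if (r->queued) {
		r->kicks++;
		r->queued = 0;
	}
}

/*
 * Queue one packet. 'more' mirrors the xmit_more hint, 'stopped' mirrors a
 * stopped TX queue, 'map_error' simulates a DMA mapping failure.
 */
static void fake_xmit(struct fake_tx_ring *r, bool more, bool stopped,
		      bool map_error)
{
	if (map_error) {
		r->dropped++;
		kick_hw(r); /* flush packets queued earlier in the burst */
		return;
	}
	r->queued++;
	if (!more || stopped)
		kick_hw(r);
}
```

Deferring the register write while more packets are expected saves doorbell writes, but any early exit, such as the mapping error path, must still flush what was queued before it.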
      
      Tested:
        - pktgen (xmit_more packets) SMP x86_64 ->
          Test command:
          ./pktgen_sample03_burst_single_flow.sh ... -b 8 -n 1000000
          Test results:
          Params:
          ...
          burst: 8
          ...
          Result: OK: 12194004(c12188996+d5007) usec, 1000001 (1500byte,0frags)
          82007pps 984Mb/sec (984084000bps) errors: 0
      
        - iperf (normal packets) SMP x86_64 ->
          Test command:
          Server: iperf -s
          Client: iperf -c serverip
          Result:
          TCP window size: 85.0 KByte (default)
          ------------------------------------------------------------
          [ ID] Interval       Transfer     Bandwidth
          [  3]  0.0-10.0 sec  1.10 GBytes   942 Mbits/sec
      
      CC: Joe Jin <joe.jin@oracle.com>
      CC: JUNXIAO_BI <junxiao.bi@oracle.com>
      Reported-and-tested-by: Nan san <nan.1986san@gmail.com>
      Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
      Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'netdevsim-fix-tests-and-netdevsim' · fb90ab6b
      David S. Miller authored
      
      Jakub Kicinski says:
      
      ====================
      netdevsim: fix tests and netdevsim
      
      The first patch fixes a merge which brought back some dead
      code. Next a tiny re-write of the main test using netdevsim
      aims to ease debugging.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: bpf: log direct file writes · acceca8d
      Jakub Kicinski authored
      
      Recent changes to netdevsim moved creating and destroying
      devices from netlink to sysfs. The sysfs writes have been
      implemented as direct writes, without shelling out. This
      is faster, but leaves no trace in the logs. Add explicit
      logs to make debugging possible.
      
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdevsim: drop code duplicated by a merge · bfcccfe7
      Jakub Kicinski authored
      Looks like the port adding loop makes a re-appearance on net-next
      after net was merged back into it (even though it doesn't feature
      in the merge diff).
      
      The ports are already added in nsim_dev_create() so when we try
      to add them again we get EEXIST, and see:

      netdevsim: probe of netdevsim0 failed with error -17

      in the logs. When we remove the loop again, nsim_dev_probe()
      and nsim_dev_remove() become wrappers around nsim_dev_create() and
      nsim_dev_destroy(). Remove this layer of indirection.
      
      Fixes: d31e9558 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. Nov 06, 2019