Skip to content
  1. Apr 11, 2018
    • Martin Schwidefsky's avatar
      s390: correct nospec auto detection init order · 6a3d1e81
      Martin Schwidefsky authored
      
      
      With CONFIG_EXPOLINE_AUTO=y the call of spectre_v2_auto_early() via
      early_initcall is done *after* the early_param functions. This
      overwrites any settings done with the nobp/no_spectre_v2/spectre_v2
      parameters. The code patching for the kernel is done after the
      evaluation of the early parameters but before the early_initcall
      is done. The end result is a kernel image that is patched correctly
      but the kernel modules are not.
      
      Make sure that the nospec auto detection function is called before the
      early parameters are evaluated and before the code patching is done.
      
      Fixes: 6e179d64 ("s390: add automatic detection of the spectre defense")
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      6a3d1e81
    • Harald Freudenberger's avatar
      s390/zcrypt: Support up to 256 crypto adapters. · af4a7227
      Harald Freudenberger authored
      
      
      There was an artificial restriction on the card/adapter id
      to only 6 bits but all the AP commands do support adapter
      ids with 8 bit. This patch removes this restriction to 64
      adapters and now up to 256 adapter can get addressed.
      
      Some of the ioctl calls work on the max number of cards
      possible (which was 64). These ioctls are now deprecated
      but still supported. All the defines, structs and ioctl
      interface declarations have been kept for compabibility.
      There are now new ioctls (and defines for these) with an
      additional '2' appended which provide the extended versions
      with 256 cards supported.
      
      Signed-off-by: default avatarHarald Freudenberger <freude@linux.vnet.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      af4a7227
  2. Apr 10, 2018
  3. Apr 09, 2018
    • haibinzhang(张海斌)'s avatar
      vhost-net: set packet weight of tx polling to 2 * vq size · a2ac9990
      haibinzhang(张海斌) authored
      
      
      handle_tx will delay rx for tens or even hundreds of milliseconds when tx busy
      polling udp packets with small length(e.g. 1byte udp payload), because setting
      VHOST_NET_WEIGHT takes into account only sent-bytes but no single packet length.
      
      Ping-Latencies shown below were tested between two Virtual Machines using
      netperf (UDP_STREAM, len=1), and then another machine pinged the client:
      
      vq size=256
      Packet-Weight   Ping-Latencies(millisecond)
                         min      avg       max
      Origin           3.319   18.489    57.303
      64               1.643    2.021     2.552
      128              1.825    2.600     3.224
      256              1.997    2.710     4.295
      512              1.860    3.171     4.631
      1024             2.002    4.173     9.056
      2048             2.257    5.650     9.688
      4096             2.093    8.508    15.943
      
      vq size=512
      Packet-Weight   Ping-Latencies(millisecond)
                         min      avg       max
      Origin           6.537   29.177    66.245
      64               2.798    3.614     4.403
      128              2.861    3.820     4.775
      256              3.008    4.018     4.807
      512              3.254    4.523     5.824
      1024             3.079    5.335     7.747
      2048             3.944    8.201    12.762
      4096             4.158   11.057    19.985
      
      Seems pretty consistent, a small dip at 2 VQ sizes.
      Ring size is a hint from device about a burst size it can tolerate. Based on
      benchmarks, set the weight to 2 * vq size.
      
      To evaluate this change, another tests were done using netperf(RR, TX) between
      two machines with Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz, and vq size was
      tweaked through qemu. Results shown below does not show obvious changes.
      
      vq size=256 TCP_RR                vq size=512 TCP_RR
      size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
         1/       1/  -7%/        -2%      1/       1/   0%/        -2%
         1/       4/  +1%/         0%      1/       4/  +1%/         0%
         1/       8/  +1%/        -2%      1/       8/   0%/        +1%
        64/       1/  -6%/         0%     64/       1/  +7%/        +3%
        64/       4/   0%/        +2%     64/       4/  -1%/        +1%
        64/       8/   0%/         0%     64/       8/  -1%/        -2%
       256/       1/  -3%/        -4%    256/       1/  -4%/        -2%
       256/       4/  +3%/        +4%    256/       4/  +1%/        +2%
       256/       8/  +2%/         0%    256/       8/  +1%/        -1%
      
      vq size=256 UDP_RR                vq size=512 UDP_RR
      size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
         1/       1/  -5%/        +1%      1/       1/  -3%/        -2%
         1/       4/  +4%/        +1%      1/       4/  -2%/        +2%
         1/       8/  -1%/        -1%      1/       8/  -1%/         0%
        64/       1/  -2%/        -3%     64/       1/  +1%/        +1%
        64/       4/  -5%/        -1%     64/       4/  +2%/         0%
        64/       8/   0%/        -1%     64/       8/  -2%/        +1%
       256/       1/  +7%/        +1%    256/       1/  -7%/         0%
       256/       4/  +1%/        +1%    256/       4/  -3%/        -4%
       256/       8/  +2%/        +2%    256/       8/  +1%/        +1%
      
      vq size=256 TCP_STREAM            vq size=512 TCP_STREAM
      size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
        64/       1/   0%/        -3%     64/       1/   0%/         0%
        64/       4/  +3%/        -1%     64/       4/  -2%/        +4%
        64/       8/  +9%/        -4%     64/       8/  -1%/        +2%
       256/       1/  +1%/        -4%    256/       1/  +1%/        +1%
       256/       4/  -1%/        -1%    256/       4/  -3%/         0%
       256/       8/  +7%/        +5%    256/       8/  -3%/         0%
       512/       1/  +1%/         0%    512/       1/  -1%/        -1%
       512/       4/  +1%/        -1%    512/       4/   0%/         0%
       512/       8/  +7%/        -5%    512/       8/  +6%/        -1%
      1024/       1/   0%/        -1%   1024/       1/   0%/        +1%
      1024/       4/  +3%/         0%   1024/       4/  +1%/         0%
      1024/       8/  +8%/        +5%   1024/       8/  -1%/         0%
      2048/       1/  +2%/        +2%   2048/       1/  -1%/         0%
      2048/       4/  +1%/         0%   2048/       4/   0%/        -1%
      2048/       8/  -2%/         0%   2048/       8/   5%/        -1%
      4096/       1/  -2%/         0%   4096/       1/  -2%/         0%
      4096/       4/  +2%/         0%   4096/       4/   0%/         0%
      4096/       8/  +9%/        -2%   4096/       8/  -5%/        -1%
      
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarHaibin Zhang <haibinzhang@tencent.com>
      Signed-off-by: default avatarYunfang Tai <yunfangtai@tencent.com>
      Signed-off-by: default avatarLidong Chen <lidongchen@tencent.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a2ac9990
    • Vadim Lomovtsev's avatar
      net: thunderx: rework mac addresses list to u64 array · 9b5c4dfb
      Vadim Lomovtsev authored
      
      
      It is too expensive to pass u64 values via linked list, instead
      allocate array for them by overall number of mac addresses from netdev.
      
      This eventually removes multiple kmalloc() calls, aviod memory
      fragmentation and allow to put single null check on kmalloc
      return value in order to prevent a potential null pointer dereference.
      
      Addresses-Coverity-ID: 1467429 ("Dereference null return value")
      Fixes: 37c3347e ("net: thunderx: add ndo_set_rx_mode callback implementation for VF")
      Reported-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarVadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9b5c4dfb
    • Eric Dumazet's avatar
      inetpeer: fix uninit-value in inet_getpeer · b6a37e5e
      Eric Dumazet authored
      
      
      syzbot/KMSAN reported that p->dtime was read while it was
      not yet initialized in :
      
      	delta = (__u32)jiffies - p->dtime;
      	if (delta < ttl || !refcount_dec_if_one(&p->refcnt))
      		gc_stack[i] = NULL;
      
      This is a false positive, because the inetpeer wont be erased
      from rb-tree if the refcount_dec_if_one(&p->refcnt) does not
      succeed. And this wont happen before first inet_putpeer() call
      for this inetpeer has been done, and ->dtime field is written
      exactly before the refcount_dec_and_test(&p->refcnt).
      
      The KMSAN report was :
      
      BUG: KMSAN: uninit-value in inet_peer_gc net/ipv4/inetpeer.c:163 [inline]
      BUG: KMSAN: uninit-value in inet_getpeer+0x1567/0x1e70 net/ipv4/inetpeer.c:228
      CPU: 0 PID: 9494 Comm: syz-executor5 Not tainted 4.16.0+ #82
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:53
       kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
       __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
       inet_peer_gc net/ipv4/inetpeer.c:163 [inline]
       inet_getpeer+0x1567/0x1e70 net/ipv4/inetpeer.c:228
       inet_getpeer_v4 include/net/inetpeer.h:110 [inline]
       icmpv4_xrlim_allow net/ipv4/icmp.c:330 [inline]
       icmp_send+0x2b44/0x3050 net/ipv4/icmp.c:725
       ip_options_compile+0x237c/0x29f0 net/ipv4/ip_options.c:472
       ip_rcv_options net/ipv4/ip_input.c:284 [inline]
       ip_rcv_finish+0xda8/0x16d0 net/ipv4/ip_input.c:365
       NF_HOOK include/linux/netfilter.h:288 [inline]
       ip_rcv+0x119d/0x16f0 net/ipv4/ip_input.c:493
       __netif_receive_skb_core+0x47cf/0x4a80 net/core/dev.c:4562
       __netif_receive_skb net/core/dev.c:4627 [inline]
       netif_receive_skb_internal+0x49d/0x630 net/core/dev.c:4701
       netif_receive_skb+0x230/0x240 net/core/dev.c:4725
       tun_rx_batched drivers/net/tun.c:1555 [inline]
       tun_get_user+0x6d88/0x7580 drivers/net/tun.c:1962
       tun_chr_write_iter+0x1d4/0x330 drivers/net/tun.c:1990
       do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776
       do_iter_write+0x30d/0xd40 fs/read_write.c:932
       vfs_writev fs/read_write.c:977 [inline]
       do_writev+0x3c9/0x830 fs/read_write.c:1012
       SYSC_writev+0x9b/0xb0 fs/read_write.c:1085
       SyS_writev+0x56/0x80 fs/read_write.c:1082
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      RIP: 0033:0x455111
      RSP: 002b:00007fae0365cba0 EFLAGS: 00000293 ORIG_RAX: 0000000000000014
      RAX: ffffffffffffffda RBX: 000000000000002e RCX: 0000000000455111
      RDX: 0000000000000001 RSI: 00007fae0365cbf0 RDI: 00000000000000fc
      RBP: 0000000020000040 R08: 00000000000000fc R09: 0000000000000000
      R10: 000000000000002e R11: 0000000000000293 R12: 00000000ffffffff
      R13: 0000000000000658 R14: 00000000006fc8e0 R15: 0000000000000000
      
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
       kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188
       kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314
       kmem_cache_alloc+0xaab/0xb90 mm/slub.c:2756
       inet_getpeer+0xed8/0x1e70 net/ipv4/inetpeer.c:210
       inet_getpeer_v4 include/net/inetpeer.h:110 [inline]
       ip4_frag_init+0x4d1/0x740 net/ipv4/ip_fragment.c:153
       inet_frag_alloc net/ipv4/inet_fragment.c:369 [inline]
       inet_frag_create net/ipv4/inet_fragment.c:385 [inline]
       inet_frag_find+0x7da/0x1610 net/ipv4/inet_fragment.c:418
       ip_find net/ipv4/ip_fragment.c:275 [inline]
       ip_defrag+0x448/0x67a0 net/ipv4/ip_fragment.c:676
       ip_check_defrag+0x775/0xda0 net/ipv4/ip_fragment.c:724
       packet_rcv_fanout+0x2a8/0x8d0 net/packet/af_packet.c:1447
       deliver_skb net/core/dev.c:1897 [inline]
       deliver_ptype_list_skb net/core/dev.c:1912 [inline]
       __netif_receive_skb_core+0x314a/0x4a80 net/core/dev.c:4545
       __netif_receive_skb net/core/dev.c:4627 [inline]
       netif_receive_skb_internal+0x49d/0x630 net/core/dev.c:4701
       netif_receive_skb+0x230/0x240 net/core/dev.c:4725
       tun_rx_batched drivers/net/tun.c:1555 [inline]
       tun_get_user+0x6d88/0x7580 drivers/net/tun.c:1962
       tun_chr_write_iter+0x1d4/0x330 drivers/net/tun.c:1990
       do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776
       do_iter_write+0x30d/0xd40 fs/read_write.c:932
       vfs_writev fs/read_write.c:977 [inline]
       do_writev+0x3c9/0x830 fs/read_write.c:1012
       SYSC_writev+0x9b/0xb0 fs/read_write.c:1085
       SyS_writev+0x56/0x80 fs/read_write.c:1082
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b6a37e5e
    • Russell King's avatar
    • Esben Haabendal's avatar
      dp83640: Ensure against premature access to PHY registers after reset · 76327a35
      Esben Haabendal authored
      
      
      The datasheet specifies a 3uS pause after performing a software
      reset. The default implementation of genphy_soft_reset() does not
      provide this, so implement soft_reset with the needed pause.
      
      Signed-off-by: default avatarEsben Haabendal <eha@deif.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      76327a35