  1. Feb 23, 2017
    • net/mlx4_en: Use __skb_fill_page_desc() · 7f0137e2
      Eric Dumazet authored
      Or we might miss the fact that a page was allocated from memory reserves.

      Fixes: dceeab0e ("mlx4: support __GFP_MEMALLOC for rx")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4_core: Use cq quota in SRIOV when creating completion EQs · 6ed63d84
      Jack Morgenstein authored
      When creating EQs to handle CQ completion events for the PF
      or for VFs, we create enough EQE entries to handle completions
      for the max number of CQs that can use that EQ.
      
      When SRIOV is activated, the max number of CQs a VF (or the PF) can
      obtain is its CQ quota (determined by the Hypervisor resource tracker).
      Therefore, when creating an EQ, the number of EQE entries that the VF
      should request for that EQ is the CQ quota value (and not the total
      number of CQs available in the FW).
      
      Under SRIOV, the PF must also use its CQ quota, because
      the resource tracker also controls how many CQs the PF can obtain.

      Using the FW total CQs instead of the CQ quota when creating EQs
      wasted MTT entries, because more EQEs were allocated than were needed.

      Fixes: 5a0d0a61 ("mlx4: Structures and init/teardown for VF resource quotas")
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Reported-by: Dexuan Cui <decui@microsoft.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
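
      The sizing rule described in this commit can be sketched as a small
      userspace model. This is an illustration in Python, not driver code; the
      function name and the example numbers are hypothetical.

```python
def completion_eq_cq_count(fw_total_cqs: int, cq_quota: int, sriov_active: bool) -> int:
    """Number of CQs a completion EQ is sized for (hypothetical helper)."""
    if sriov_active:
        # Under SRIOV the resource tracker caps VFs and the PF at their quota,
        # so sizing the EQ for more CQs than the quota only wastes entries.
        return cq_quota
    return fw_total_cqs


# Hypothetical numbers, chosen only for illustration.
FW_TOTAL_CQS = 65536
CQ_QUOTA = 4096

sized_for = completion_eq_cq_count(FW_TOTAL_CQS, CQ_QUOTA, sriov_active=True)
excess_if_fw_total_used = FW_TOTAL_CQS - sized_for
```

      With these assumed values, sizing by the FW total would provision for
      61440 more CQs than the function could ever obtain.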
    • net/mlx4_core: Fix VF overwrite of module param which disables DMFS on new probed PFs · 95f1ba9a
      Majd Dibbiny authored
      In the VF driver, module parameter mlx4_log_num_mgm_entry_size was
      mistakenly overwritten -- and in a manner which overrode the
      device-managed flow steering option encoded in the parameter.
      
      log_num_mgm_entry_size is a global module parameter which
      affects all ConnectX-3 PFs installed on that host.
      If a VF changes log_num_mgm_entry_size, this will affect all PFs
      which are probed subsequent to the change (by disabling DMFS for
      those PFs).
      
      Fixes: 3c439b55 ("mlx4_core: Allow choosing flow steering mode")
      Signed-off-by: Majd Dibbiny <majd@mellanox.com>
      Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4: Spoofcheck and zero MAC can't coexist · 745d8ae4
      Eugenia Emantayev authored
      Spoofcheck can't be enabled if VF MAC is zero.
      Vice versa, can't zero MAC if spoofcheck is on.

      Fixes: 8f7ba3ca ('net/mlx4: Add set VF mac address support')
      Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
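
      The two-way constraint can be modelled as a pair of guarded setters. This
      is a minimal Python sketch, not the mlx4 implementation; the class, the
      method names and the use of ValueError are assumptions made for
      illustration.

```python
ZERO_MAC = "00:00:00:00:00:00"


class VfConfig:
    """Hypothetical per-VF admin state holding a MAC and a spoofcheck flag."""

    def __init__(self) -> None:
        self.mac = ZERO_MAC
        self.spoofchk = False

    def set_spoofchk(self, enable: bool) -> None:
        # Spoofcheck cannot be enabled while the VF MAC is zero.
        if enable and self.mac == ZERO_MAC:
            raise ValueError("cannot enable spoofcheck while VF MAC is zero")
        self.spoofchk = enable

    def set_mac(self, mac: str) -> None:
        # The MAC cannot be zeroed while spoofcheck is on.
        if mac == ZERO_MAC and self.spoofchk:
            raise ValueError("cannot zero VF MAC while spoofcheck is on")
        self.mac = mac
```

      In this model either order of operations is rejected, so the two settings
      can never end up in the conflicting state.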
    • net/mlx4: Change ENOTSUPP to EOPNOTSUPP · 423b3aec
      Or Gerlitz authored

      As ENOTSUPP is specific to NFS, change the return error value to
      EOPNOTSUPP in various places in the mlx4 driver.

      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Suggested-by: Yotam Gigi <yotamg@mellanox.com>
      Reviewed-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
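
      One way to see why the distinction matters from userspace: EOPNOTSUPP is
      part of the errno set that userspace tools know about, while the
      kernel-internal ENOTSUPP is not. A short Python sketch, with a
      hypothetical function name:

```python
import errno
import os


def unsupported_operation() -> None:
    """Hypothetical operation that reports 'not supported' the portable way."""
    raise OSError(errno.EOPNOTSUPP, os.strerror(errno.EOPNOTSUPP))


try:
    unsupported_operation()
except OSError as exc:
    reported = exc.errno

# Python's errno module has no ENOTSUPP (double P), so a caller could not
# match that code by name.
has_enotsupp = hasattr(errno, "ENOTSUPP")
```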
    • uapi: fix linux/rds.h userspace compilation errors · c12f4d76
      Dmitry V. Levin authored
      Consistently use types from linux/types.h to fix the following
      linux/rds.h userspace compilation errors:
      
      /usr/include/linux/rds.h:198:2: error: unknown type name 'u8'
        u8 rx_traces;
      /usr/include/linux/rds.h:199:2: error: unknown type name 'u8'
        u8 rx_trace_pos[RDS_MSG_RX_DGRAM_TRACE_MAX];
      /usr/include/linux/rds.h:203:2: error: unknown type name 'u8'
        u8 rx_traces;
      /usr/include/linux/rds.h:204:2: error: unknown type name 'u8'
        u8 rx_trace_pos[RDS_MSG_RX_DGRAM_TRACE_MAX];
      /usr/include/linux/rds.h:205:2: error: unknown type name 'u64'
        u64 rx_trace[RDS_MSG_RX_DGRAM_TRACE_MAX];
      
      Fixes: 3289025a ("RDS: add receive message trace used by application")
      Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • uapi: fix linux/seg6.h and linux/seg6_iptunnel.h userspace compilation errors · ea3ebc73
      Dmitry V. Levin authored
      Include <linux/in6.h> in uapi/linux/seg6.h to fix the following
      linux/seg6.h userspace compilation error:
      
      /usr/include/linux/seg6.h:31:18: error: array type has incomplete element type 'struct in6_addr'
        struct in6_addr segments[0];
      
      Include <linux/seg6.h> in uapi/linux/seg6_iptunnel.h to fix
      the following linux/seg6_iptunnel.h userspace compilation error:
      
      /usr/include/linux/seg6_iptunnel.h:26:21: error: array type has incomplete element type 'struct ipv6_sr_hdr'
        struct ipv6_sr_hdr srh[0];
      
      Fixes: a50a05f4 ("ipv6: sr: add missing Kbuild export for header files")
      Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • lib: Remove string from parman config selection · 50ab3af1
      Jiri Pirko authored

      As reported by Geert, remove the string so the user does not see this
      config option. The option is explicitly selected only as a dependency of
      in-kernel users.

      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Fixes: 44091d29 ("lib: Introduce priority array area manager")
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • forcedeth: Remove return from a void function · ca92aea9
      Zhu Yanjun authored

      A void function does not need a trailing return statement.

      Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: fix spelling mistake: "proccessed" -> "processed" · bc1750f3
      Colin Ian King authored

      Trivial fix to a spelling mistake in a verbose log message.

      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • uapi: fix linux/llc.h userspace compilation error · 40df93be
      Dmitry V. Levin authored

      Include <linux/if.h> to fix the following linux/llc.h userspace
      compilation error:

      /usr/include/linux/llc.h:26:27: error: 'IFHWADDRLEN' undeclared here (not in a function)
        unsigned char   sllc_mac[IFHWADDRLEN];

      Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • uapi: fix linux/ip6_tunnel.h userspace compilation errors · 557d7acd
      Dmitry V. Levin authored

      Include <linux/if.h> and <linux/in6.h> to fix the following
      linux/ip6_tunnel.h userspace compilation errors:

      /usr/include/linux/ip6_tunnel.h:23:12: error: 'IFNAMSIZ' undeclared here (not in a function)
        char name[IFNAMSIZ]; /* name of tunnel device */
      /usr/include/linux/ip6_tunnel.h:30:18: error: field 'laddr' has incomplete type
        struct in6_addr laddr; /* local tunnel end-point address */

      Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mlx5-fixes' · 79873fb6
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      Mellanox mlx5e fixes for 4.11-rc1
      
      This series includes some important bug fixes for mlx5e driver.
      
      Three misc fixes:
      From Mohamad, a compilation fix for s390 systems.
      From me, a fix for driver unload when switchdev mode is on.
      From Tariq, a HW LRO frag size optimization for when build_skb is not used
      (striding RQ mode).

      Three CQE compression related fixes:
      Two fixes, from Tariq and me, to correctly set up CQE compression
      parameters on driver load and on arbitrary user modifications.
      The last patch fixes a very critical issue, originally reported
      by Tom, where the driver reported csum errors or even page ref issues
      when CQE compression is enabled and rapidly active.

      For your convenience this series was generated on top of net-next branch:
      005c3490 ('Revert "ath10k: Search SMBIOS for OEM board file extension"')
      
      for -stable:
      net/mlx5e: Register/unregister vport representors on interface (for kernel >= 4.9)
      net/mlx5e: Do not reduce LRO WQE size when not using build_skb (for kernel >= 4.9)
      net/mlx5e: Fix broken CQE compression initialization (for kernel >= 4.9)
      net/mlx5e: Update MPWQE stride size when modifying CQE compress state (for kernel >= 4.7)
      net/mlx5e: Fix wrong CQE decompression (for kernel >= 4.7)
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Fix wrong CQE decompression · 36154be4
      Tariq Toukan authored
      In cqe compression with striding RQ, the decompression of the CQE field
      wqe_counter was done with a wrong wraparound value.
      This caused handling cqes with a wrong pointer to wqe (rx descriptor)
      and creating SKBs with wrong data, pointing to wrong (and already consumed)
      strides/pages.
      
      In striding RQ, the CQE field wqe_counter holds the stride index
      rather than the WQE index. Hence, when decompressing a CQE,
      wqe_counter should have wrapped around at the number of strides
      in a single multi-packet WQE.

      We drop the wrap-around mask altogether in CQE decompression of striding
      RQ. It is not needed, because in such cases the CQE compression session
      would break on a different value of the wqe_id field, starting a new
      compression session.
      
      Tested:
       ethtool -K ethxx lro off/on
       ethtool --set-priv-flags ethxx rx_cqe_compress on
       super_netperf 16 {ipv4,ipv6} -t TCP_STREAM -m 50 -D
       verified no csum errors and no page refcount issues.
      
      Fixes: 7219ab34 ("net/mlx5e: CQE compression")
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Reported-by: Tom Herbert <tom@herbertland.com>
      Cc: kernel-team@fb.com
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
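
      The effect of a wrong wrap-around value can be illustrated with a toy
      model. The Python sketch below is not the mlx5e code: the function name is
      hypothetical, the numbers are made up, and it assumes, purely for
      illustration, that the wrong mask was the one derived from the number of
      WQEs in the RQ.

```python
def decompressed_wqe_counter(counter: int, striding_rq: bool, wq_size: int) -> int:
    """Toy model of the post-fix rule: no wrap-around mask for striding RQ."""
    if striding_rq:
        # In striding RQ the counter is a stride index inside one
        # multi-packet WQE, so masking by the WQE count would corrupt it.
        return counter
    # Non-striding case: wrap at the number of WQEs (assumed power of two).
    return counter & (wq_size - 1)


WQ_SIZE = 16          # hypothetical number of WQEs in the RQ
STRIDE_INDEX = 100    # hypothetical stride index within a multi-packet WQE

masked_by_mistake = STRIDE_INDEX & (WQ_SIZE - 1)
after_fix = decompressed_wqe_counter(STRIDE_INDEX, striding_rq=True, wq_size=WQ_SIZE)
```

      Under these assumptions the mistaken mask maps stride 100 to stride 4,
      which is the kind of wrong (and already consumed) stride the commit
      message describes.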
    • net/mlx5e: Update MPWQE stride size when modifying CQE compress state · 6dc4b54e
      Saeed Mahameed authored
      When the admin enables/disables cqe compression, updating
      mpwqe stride size is required:
          CQE compress ON  ==> stride size = 256B
          CQE compress OFF ==> stride size = 64B
      
      This is already done on driver load via mlx5e_set_rq_type_params, all we
      need is just to call it on arbitrary admin changes of cqe compression
      state via priv flags or when changing timestamping state
      (as it is mutually exclusive with cqe compression).
      
      This bug introduces no functional damage, it only makes cqe compression
      occur less often, since in ConnectX4-LX CQE compression is performed
      only on packets smaller than stride size.
      
      Tested:
       ethtool --set-priv-flags ethxx rx_cqe_compress on
       pktgen with  64 < pkt size < 256 and netperf TCP_STREAM (IPv4/IPv6)
       verify `ethtool -S ethxx | grep compress` are advancing more often
       (rapidly)
      
      Fixes: 7219ab34 ("net/mlx5e: CQE compression")
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Cc: kernel-team@fb.com
      Signed-off-by: David S. Miller <davem@davemloft.net>
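
      The relationship between compression state, stride size and how often
      compression can occur can be sketched as follows. The two stride sizes
      and the "smaller than stride size" rule come from the commit message; the
      Python function names are hypothetical.

```python
def mpwqe_stride_size(cqe_compress_on: bool) -> int:
    """Stride size in bytes for the given CQE compression state."""
    return 256 if cqe_compress_on else 64


def can_compress(pkt_size: int, stride_size: int) -> bool:
    """Compression applies only to packets smaller than the stride size."""
    return pkt_size < stride_size


# A 128-byte packet with compression switched on at runtime:
stale_stride = mpwqe_stride_size(False)   # stride size not updated
fresh_stride = mpwqe_stride_size(True)    # stride size updated with the flag
```

      With the stale 64B stride the 128-byte packet is not compressed; with the
      updated 256B stride it is, which matches the "occurs less often"
      symptom.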
    • net/mlx5e: Fix broken CQE compression initialization · b0d4660b
      Tariq Toukan authored
      Some RQ type parameters are derived from the CQE compression state flag,
      but that flag was initialized only after the RQ type parameters were
      set up. As a result, the RQ was loaded with a stride size smaller than
      the one we want when CQE compression is on.
      
      This bug introduces no functional damage, it only makes CQE compression
      occur less often, since in ConnectX4-LX CQE compression is performed
      only on packets smaller than stride size.
      
      Fix this by marking default status of CQE compression in PFLAG prior to
      calling mlx5e_set_rq_priv_params(), as it inits some fields based on it.
      
      Tested:
       load driver on systems where rx CQE compress will be on (MH)
       pktgen with  64 < pkt size < 256 and netperf TCP_STREAM (IPv4/IPv6)
       verify `ethtool -S ethxx | grep compress` are advancing more often
       (rapidly)
      
      Fixes: 2fc4bfb7 ("net/mlx5e: Dynamic RQ type infrastructure")
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Cc: kernel-team@fb.com
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Do not reduce LRO WQE size when not using build_skb · 4078e637
      Tariq Toukan authored
      When rq_type is Striding RQ, no room for SKB_RESERVE is needed,
      as SKB allocation is not done via build_skb.

      Fixes: e4b85508 ("net/mlx5e: Slightly reduce hardware LRO size")
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Register/unregister vport representors on interface attach/detach · 6f08a22c
      Saeed Mahameed authored
      Currently vport representors are added only on driver load and removed on
      driver unload.  Apparently we forgot to handle them when we added the
      seamless reset flow feature.  This left the representor netdevs
      alive and active with open HW resources on pci shutdown and on
      error reset flows.
      
      To overcome this we move their handling to interface attach/detach, so
      they would be cleaned up on shutdown and recreated on reset flows.
      
      Fixes: 26e59d80 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks")
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: s390 system compilation fix · 18bcf742
      Mohamad Haj Yahia authored
      Add the header includes needed for s390 arch compilation.

      Fixes: e586b3b0 ("net/mlx5: Ethernet Datapath files")
      Fixes: d605d668 ("net/mlx5e: Add support for ethtool self..")
      Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: account for ts offset only if tsecr not zero · eee2faab
      Alexey Kodanev authored
      We can get a SYN with zero tsecr; don't apply the offset in this case.

      Fixes: ee684b6f ("tcp: send packets with a socket timestamp")
      Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
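
      The rule in this commit amounts to a guarded subtraction on a 32-bit
      value. A minimal Python model follows; the function name is hypothetical
      and the 32-bit masking stands in for the kernel's u32 arithmetic.

```python
U32_MASK = 0xFFFFFFFF


def adjust_tsecr(rcv_tsecr: int, tsoffset: int) -> int:
    """Remove the per-connection timestamp offset from an echoed timestamp."""
    if rcv_tsecr == 0:
        # A zero tsecr (e.g. on a SYN) carries no echoed value, so there is
        # nothing to un-offset; leave it as zero.
        return 0
    return (rcv_tsecr - tsoffset) & U32_MASK
```

      Without the zero check, a zero tsecr would turn into an arbitrary
      non-zero value once the offset was subtracted.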
    • tcp: setup timestamp offset when write_seq already set · 00355fa5
      Alexey Kodanev authored
      When randomized TCP timestamp offsets are enabled (the default),
      a TCP client can still start new connections without them. Later,
      if the server does an active close and re-uses sockets in TIME-WAIT
      state, a new SYN from the client can be rejected by the PAWS check
      inside tcp_timewait_state_process(), because either tw_ts_recent or
      rcv_tsval doesn't really have an offset set.
      
      Here is how to reproduce it with LTP netstress tool:
          netstress -R 1 &
          netstress -H 127.0.0.1 -lr 1000000 -a1
      
          [...]
          < S  seq 1956977072 win 43690 TS val 295618 ecr 459956970
          > .  ack 1956911535 win 342 TS val 459967184 ecr 1547117608
          < R  seq 1956911535 win 0 length 0
      +1. < S  seq 1956977072 win 43690 TS val 296640 ecr 459956970
          > S. seq 657450664 ack 1956977073 win 43690 TS val 459968205 ecr 296640
      
      Fixes: 95a22cae ("tcp: randomize tcp timestamp offsets for each connection")
      Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
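
      Why a missing offset trips PAWS can be seen in a simplified model of the
      timestamp comparison. The Python sketch below covers only the
      signed 32-bit comparison and ignores the other conditions the kernel
      checks; the function name and all timestamp values are hypothetical.

```python
def paws_rejects(ts_recent: int, rcv_tsval: int) -> bool:
    """Simplified PAWS test: reject if rcv_tsval is older than ts_recent.

    The difference is interpreted as a signed 32-bit value so that
    timestamp wrap-around is handled.
    """
    diff = (ts_recent - rcv_tsval) & 0xFFFFFFFF
    if diff >= 0x80000000:
        diff -= 0x100000000
    return diff > 0


# Hypothetical values: the remembered timestamp includes a random offset,
# while the new SYN was sent without one.
TS_RECENT_WITH_OFFSET = 1_000_000_500
SYN_TSVAL_NO_OFFSET = 295_618
SYN_TSVAL_WITH_OFFSET = 1_000_000_600
```

      In this model the SYN without an offset looks far older than the
      remembered timestamp and is rejected, while a SYN carrying the same
      offset passes.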
    • net/dccp: fix use after free in tw_timer_handler() · ec7cb62d
      Andrey Ryabinin authored
      DCCP doesn't purge timewait sockets on network namespace shutdown.
      So, after a net namespace is destroyed, we could still have an active
      timer which will trigger a use-after-free in tw_timer_handler():
      
          BUG: KASAN: use-after-free in tw_timer_handler+0x4a/0xa0 at addr ffff88010e0d1e10
          Read of size 8 by task swapper/1/0
          Call Trace:
           __asan_load8+0x54/0x90
           tw_timer_handler+0x4a/0xa0
           call_timer_fn+0x127/0x480
           expire_timers+0x1db/0x2e0
           run_timer_softirq+0x12f/0x2a0
           __do_softirq+0x105/0x5b4
           irq_exit+0xdd/0xf0
           smp_apic_timer_interrupt+0x57/0x70
           apic_timer_interrupt+0x90/0xa0
      
          Object at ffff88010e0d1bc0, in cache net_namespace size: 6848
          Allocated:
           save_stack_trace+0x1b/0x20
           kasan_kmalloc+0xee/0x180
           kasan_slab_alloc+0x12/0x20
           kmem_cache_alloc+0x134/0x310
           copy_net_ns+0x8d/0x280
           create_new_namespaces+0x23f/0x340
           unshare_nsproxy_namespaces+0x75/0xf0
           SyS_unshare+0x299/0x4f0
           entry_SYSCALL_64_fastpath+0x18/0xad
          Freed:
           save_stack_trace+0x1b/0x20
           kasan_slab_free+0xae/0x180
           kmem_cache_free+0xb4/0x350
           net_drop_ns+0x3f/0x50
           cleanup_net+0x3df/0x450
           process_one_work+0x419/0xbb0
           worker_thread+0x92/0x850
           kthread+0x192/0x1e0
           ret_from_fork+0x2e/0x40
      
      Add .exit_batch hook to dccp_v4_ops()/dccp_v6_ops() which will purge
      timewait sockets on net namespace destruction and prevent above issue.
      
      Fixes: f2bf415c ("mib: add net to NET_ADD_STATS_BH")
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • uapi: fix linux/if.h userspace compilation errors · 2618be7d
      Dmitry V. Levin authored

      Include <sys/socket.h> (guarded by ifndef __KERNEL__) to fix
      the following linux/if.h userspace compilation errors:
      
      /usr/include/linux/if.h:234:19: error: field 'ifru_addr' has incomplete type
         struct sockaddr ifru_addr;
      /usr/include/linux/if.h:235:19: error: field 'ifru_dstaddr' has incomplete type
         struct sockaddr ifru_dstaddr;
      /usr/include/linux/if.h:236:19: error: field 'ifru_broadaddr' has incomplete type
         struct sockaddr ifru_broadaddr;
      /usr/include/linux/if.h:237:19: error: field 'ifru_netmask' has incomplete type
         struct sockaddr ifru_netmask;
      /usr/include/linux/if.h:238:20: error: field 'ifru_hwaddr' has incomplete type
         struct  sockaddr ifru_hwaddr;
      
      This also fixes userspace compilation of the following uapi headers:
        linux/atmbr2684.h
        linux/gsmmux.h
        linux/if_arp.h
        linux/if_bonding.h
        linux/if_frad.h
        linux/if_pppox.h
        linux/if_tunnel.h
        linux/netdevice.h
        linux/route.h
        linux/wireless.h
      
      As no uapi header provides a definition of struct sockaddr, inclusion
      of <sys/socket.h> seems to be the most conservative and the only safe
      fix available.
      
      All current users of <linux/if.h> are very likely to be including
      <sys/socket.h> already because the latter is the sole provider
      of struct sockaddr definition in libc, so adding a uapi header
      with a definition of struct sockaddr would create a potential
      conflict with <sys/socket.h>.
      
      Replacing struct sockaddr in the definition of struct ifreq with
      a different type would create a potential incompatibility with current
      users of struct ifreq who might rely on ifru_addr et al members being
      of type struct sockaddr.
      
      Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • l2tp: Avoid schedule while atomic in exit_net · 12d656af
      Ridge Kennedy authored
      While destroying a network namespace that contains an L2TP tunnel, a
      "BUG: scheduling while atomic" can be observed.
      
      Enabling lockdep shows that this is happening because l2tp_exit_net()
      is calling l2tp_tunnel_closeall() (via l2tp_tunnel_delete()) from
      within an RCU critical section.
      
      l2tp_exit_net() takes rcu_read_lock_bh()
        << list_for_each_entry_rcu() >>
        l2tp_tunnel_delete()
          l2tp_tunnel_closeall()
            __l2tp_session_unhash()
              synchronize_rcu() << Illegal inside RCU critical section >>
      
      BUG: sleeping function called from invalid context
      in_atomic(): 1, irqs_disabled(): 0, pid: 86, name: kworker/u16:2
      INFO: lockdep is turned off.
      CPU: 2 PID: 86 Comm: kworker/u16:2 Tainted: G        W  O    4.4.6-at1 #2
      Hardware name: Xen HVM domU, BIOS 4.6.1-xs125300 05/09/2016
      Workqueue: netns cleanup_net
       0000000000000000 ffff880202417b90 ffffffff812b0013 ffff880202410ac0
       ffffffff81870de8 ffff880202417bb8 ffffffff8107aee8 ffffffff81870de8
       0000000000000c51 0000000000000000 ffff880202417be0 ffffffff8107b024
      Call Trace:
       [<ffffffff812b0013>] dump_stack+0x85/0xc2
       [<ffffffff8107aee8>] ___might_sleep+0x148/0x240
       [<ffffffff8107b024>] __might_sleep+0x44/0x80
       [<ffffffff810b21bd>] synchronize_sched+0x2d/0xe0
       [<ffffffff8109be6d>] ? trace_hardirqs_on+0xd/0x10
       [<ffffffff8105c7bb>] ? __local_bh_enable_ip+0x6b/0xc0
       [<ffffffff816a1b00>] ? _raw_spin_unlock_bh+0x30/0x40
       [<ffffffff81667482>] __l2tp_session_unhash+0x172/0x220
       [<ffffffff81667397>] ? __l2tp_session_unhash+0x87/0x220
       [<ffffffff8166888b>] l2tp_tunnel_closeall+0x9b/0x140
       [<ffffffff81668c74>] l2tp_tunnel_delete+0x14/0x60
       [<ffffffff81668dd0>] l2tp_exit_net+0x110/0x270
       [<ffffffff81668d5c>] ? l2tp_exit_net+0x9c/0x270
       [<ffffffff815001c3>] ops_exit_list.isra.6+0x33/0x60
       [<ffffffff81501166>] cleanup_net+0x1b6/0x280
       ...
      
      This bug can easily be reproduced with a few steps:
      
       $ sudo unshare -n bash  # Create a shell in a new namespace
       # ip link set lo up
       # ip addr add 127.0.0.1 dev lo
       # ip l2tp add tunnel remote 127.0.0.1 local 127.0.0.1 tunnel_id 1 \
          peer_tunnel_id 1 udp_sport 50000 udp_dport 50000
       # ip l2tp add session name foo tunnel_id 1 session_id 1 \
          peer_session_id 1
       # ip link set foo up
       # exit  # Exit the shell, in turn exiting the namespace
       $ dmesg
       ...
       [942121.089216] BUG: scheduling while atomic: kworker/u16:3/13872/0x00000200
       ...
      
      To fix this, move the call to l2tp_tunnel_closeall() out of the RCU
      critical section, and instead call it from l2tp_tunnel_del_work(), which
      is running from the l2tp_wq workqueue.
      
      Fixes: 2b551c6e ("l2tp: close sessions before initiating tunnel delete")
      Signed-off-by: Ridge Kennedy <ridge.kennedy@alliedtelesis.co.nz>
      Acked-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
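
      The shape of this fix, doing only non-blocking queueing inside the
      critical section and running the blocking teardown from a worker, can be
      sketched in userspace. This Python model is an analogy only: the lock
      stands in for the RCU read-side section, the queue for l2tp_wq, and all
      names are hypothetical.

```python
import queue
import threading

work_queue = queue.Queue()      # stands in for the l2tp_wq workqueue
list_lock = threading.Lock()    # stands in for the RCU read-side section


def close_all_sessions(tunnel: dict) -> None:
    """Teardown step that may block in the real driver."""
    tunnel["sessions"].clear()


def tunnel_del_work(tunnel: dict) -> None:
    """Worker-side handler: the blocking call now lives here."""
    close_all_sessions(tunnel)
    tunnel["deleted"] = True


def exit_net(tunnels: list) -> None:
    """Walk the tunnel list and only queue deletions; never block here."""
    with list_lock:
        for tunnel in tunnels:
            work_queue.put(tunnel)


def run_pending_work() -> None:
    """Drain the queue outside the critical section."""
    while not work_queue.empty():
        tunnel_del_work(work_queue.get())
```

      Calling exit_net() leaves the tunnels untouched until run_pending_work()
      executes, mirroring how the session cleanup moves out of the atomic
      context.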
    • qlogic: netxen: constify bin_attribute structures · ff292458
      Bhumika Goyal authored

      Declare bin_attribute structures as const, as they are only passed as
      arguments to the functions device_remove_bin_file and
      device_create_bin_file. These function arguments are of type const, so
      bin_attribute structures having this property can be made const too.
      Done using Coccinelle:
      
      @r1 disable optional_qualifier @
      identifier i;
      position p;
      @@
      static struct bin_attribute i@p = {...};
      
      @ok1@
      identifier r1.i;
      position p,p1;
      @@
      (
      device_remove_bin_file(...,&i@p)
      |
      device_create_bin_file(..., &i@p1)
      )
      
      @bad@
      position p!={r1.p,ok1.p,ok1.p1};
      identifier r1.i;
      @@
      i@p
      
      @depends on !bad disable optional_qualifier@
      identifier r1.i;
      @@
      +const
      struct bin_attribute i;
      
      Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qlogic: qlcnic_sysfs: constify bin_attribute structures · 0ccea221
      Bhumika Goyal authored

      Declare bin_attribute structures as const, as they are only passed as
      arguments to the functions device_remove_bin_file and
      device_create_bin_file. These function arguments are of type const, so
      bin_attribute structures having this property can be made const too.
      Done using Coccinelle:
      
      @r1 disable optional_qualifier @
      identifier i;
      position p;
      @@
      static struct bin_attribute i@p = {...};
      
      @ok1@
      identifier r1.i;
      position p,p1;
      @@
      (
      device_remove_bin_file(...,&i@p)
      |
      device_create_bin_file(..., &i@p1)
      )
      
      @bad@
      position p!={r1.p,ok1.p,ok1.p1};
      identifier r1.i;
      @@
      i@p
      
      @depends on !bad disable optional_qualifier@
      identifier r1.i;
      @@
      +const
      struct bin_attribute i;
      
      Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: emac: add support for device-tree based PHY discovery and setup · a577ca6b
      Christian Lamparter authored

      This patch adds glue code that allows the EMAC driver to interface
      with the existing dt-supported PHYs in drivers/net/phy.

      Currently, the emac driver maintains a small library of
      supported PHYs in a private phy.c file located in the driver's
      directory.

      The support is limited mostly to single ethernet transceivers such as the
      CIS8201, BCM5248, ET1011C, Marvell 88E1111 and 88E1112, and AR8035.
      
      However, routers like the Netgear WNDR4700 and Cisco Meraki MX60(W)
      have a 5-port switch (AR8327N) attached to the EMAC. The switch chip
      is supported by the qca8k mdio driver, which uses the generic phy
      library. Another reason is that PHYLIB also supports the BCM54610,
      which was used for the Western Digital My Book Live.
      
      This will now also make EMAC select PHYLIB.
      
      Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · ca78d317
      Linus Torvalds authored
      Pull arm64 updates from Will Deacon:
       - Errata workarounds for Qualcomm's Falkor CPU
       - Qualcomm L2 Cache PMU driver
       - Qualcomm SMCCC firmware quirk
       - Support for DEBUG_VIRTUAL
       - CPU feature detection for userspace via MRS emulation
       - Preliminary work for the Statistical Profiling Extension
       - Misc cleanups and non-critical fixes
      
      * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (74 commits)
        arm64/kprobes: consistently handle MRS/MSR with XZR
        arm64: cpufeature: correctly handle MRS to XZR
        arm64: traps: correctly handle MRS/MSR with XZR
        arm64: ptrace: add XZR-safe regs accessors
        arm64: include asm/assembler.h in entry-ftrace.S
        arm64: fix warning about swapper_pg_dir overflow
        arm64: Work around Falkor erratum 1003
        arm64: head.S: Enable EL1 (host) access to SPE when entered at EL2
        arm64: arch_timer: document Hisilicon erratum 161010101
        arm64: use is_vmalloc_addr
      ...
    • Merge tag 'arc-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · a4ee7bac
      Linus Torvalds authored
      Pull ARC updates from Vineet Gupta:

       - Intc improvements [Yuriy]
      
       - VDK platform updates [Alexey]
      
      * tag 'arc-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        ARC: [plat-*] ARC_HAS_COH_CACHES no longer relevant
        ARCv2: intc: Delete useless comments in Device Trees
        ARCv2: IDU-intc: Delete deprecated parameters in Device Trees
        ARCv2: IDU-intc: mask all common interrupts by default
        ARCv2: IDU-intc: Use build registers for getting numbers of interrupts
        ARCv2: intc: Set default priority for all core interrupts
        ARCv2: intc: Use runtime value of irq count for setting up intc
        ARCv2: intc: Rework the build time irq count information
        ARC: [intc-*]: confine NR_CPU_IRQS to intc code
        ARCv2: intc: Use ARC_REG_STATUS32 for addressing STATUS32 reg
        arc: vdk: Add support of UIO
        arc: vdk: Add support of MMC controller
        arc: vdk: Disable halt on reset
      a4ee7bac
    • Linus Torvalds's avatar
      Merge tag 'powerpc-4.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 38705613
      Linus Torvalds authored
      Pull powerpc updates from Michael Ellerman:
       "Highlights include:
      
         - Support for direct mapped LPC on POWER9, giving Linux direct access
           to devices that may be on there such as a UART.
      
         - Memory hotplug support for the Power9 Radix MMU.
      
         - Add new AUX vectors describing the processor's cache geometry, to
           be used by glibc.
      
         - The ability for a guest to ask the hypervisor to resize the guest's
           hash table, and in addition support for doing so automatically when
           memory is hotplugged into/out-of the guest. This allows the hash
           table to be sized based on the current memory usage of the guest,
           rather than the maximum possible memory usage.
      
         - Implementation of optprobes (kprobe optimisation) for powerpc.
      
        In addition there's the topic branch shared with the KVM tree, which
        includes support for guests to use the Radix MMU on Power9.
      
        Thanks to:
          Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T, Anton
          Blanchard, Benjamin Herrenschmidt, Chris Packham, Daniel Axtens,
          Daniel Borkmann, David Gibson, Finn Thain, Gautham R. Shenoy, Gavin
          Shan, Greg Kurz, Joel Stanley, John Allen, Madhavan Srinivasan,
          Mahesh Salgaonkar, Markus Elfring, Michael Neuling, Nathan Fontenot,
          Naveen N. Rao, Nicholas Piggin, Paul Mackerras, Ravi Bangoria, Reza
          Arbab, Shailendra Singh, Vaibhav Jain, Wei Yongjun"
      
      * tag 'powerpc-4.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (129 commits)
        powerpc/mm/radix: Skip ptesync in pte update helpers
        powerpc/mm/radix: Use ptep_get_and_clear_full when clearing pte for full mm
        powerpc/mm/radix: Update pte update sequence for pte clear case
        powerpc/mm: Update PROTFAULT handling in the page fault path
        powerpc/xmon: Fix data-breakpoint
        powerpc/mm: Fix build break with BOOK3S_64=n and MEMORY_HOTPLUG=y
        powerpc/mm: Fix build break when CMA=n && SPAPR_TCE_IOMMU=y
        powerpc/mm: Fix build break with RADIX=y & HUGETLBFS=n
        powerpc/pseries: Fix typo in parameter description
        powerpc/kprobes: Remove kprobe_exceptions_notify()
        kprobes: Introduce weak variant of kprobe_exceptions_notify()
        powerpc/ftrace: Fix confusing help text for DISABLE_MPROFILE_KERNEL
        powerpc/powernv: Fix opal_exit tracepoint opcode
        powerpc: Add a prototype for mcount() so it can be versioned
        powerpc: Drop GPL from of_node_to_nid() export to match other arches
        powerpc/kprobes: Optimize kprobe in kretprobe_trampoline()
        powerpc/kprobes: Implement Optprobes
        powerpc/kprobes: Fixes for kprobe_lookup_name() on BE
        powerpc: Add helper to check if offset is within relative branch range
        powerpc/bpf: Introduce __PPC_SH64()
        ...
      38705613
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · ff47d8c0
      Linus Torvalds authored
      Pull s390 updates from Martin Schwidefsky:
      
       - New entropy generation for the pseudo random number generator.
      
       - Early boot printk output via sclp to help debug crashes on boot. This
         needs to be enabled with a kernel parameter.
      
       - Add proper no-execute support with a bit in the page table entry.
      
       - Bug fixes and cleanups.
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (65 commits)
        s390/syscall: fix single stepped system calls
        s390/zcrypt: make ap_bus explicitly non-modular
        s390/zcrypt: Removed unneeded debug feature directory creation.
        s390: add missing "do {} while (0)" loop constructs to multiline macros
        s390/mm: add cond_resched call to kernel page table dumper
        s390: get rid of MACHINE_HAS_PFMF and MACHINE_HAS_HPAGE
        s390/mm: make memory_block_size_bytes available for !MEMORY_HOTPLUG
        s390: replace ACCESS_ONCE with READ_ONCE
        s390: Audit and remove any remaining unnecessary uses of module.h
        s390: mm: Audit and remove any unnecessary uses of module.h
        s390: kernel: Audit and remove any unnecessary uses of module.h
        s390/kdump: Use "LINUX" ELF note name instead of "CORE"
        s390: add no-execute support
        s390: report new vector facilities
        s390: use correct input data address for setup_randomness
        s390/sclp: get rid of common response code handling
        s390/sclp: don't add new lines to each printed string
        s390/sclp: make early sclp code readable
        s390/sclp: disable early sclp code as soon as the base sclp driver is active
        s390/sclp: move early printk code to drivers
        ...
      ff47d8c0
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next · 3051bf36
      Linus Torvalds authored
      Pull networking updates from David Miller:
       "Highlights:
      
         1) Support TX_RING in AF_PACKET TPACKET_V3 mode, from Sowmini
            Varadhan.
      
         2) Simplify classifier state on sk_buff in order to shrink it a bit.
            From Willem de Bruijn.
      
         3) Introduce SIPHASH and its usage for secure sequence numbers and
            syncookies. From Jason A. Donenfeld.
      
         4) Reduce CPU usage for ICMP replies we are going to limit or
            suppress, from Jesper Dangaard Brouer.
      
         5) Introduce Shared Memory Communications socket layer, from Ursula
            Braun.
      
         6) Add RACK loss detection and allow it to actually trigger fast
            recovery instead of just assisting after other algorithms have
            triggered it. From Yuchung Cheng.
      
         7) Add xmit_more and BQL support to mvneta driver, from Simon Guinot.
      
         8) skb_cow_data avoidance in esp4 and esp6, from Steffen Klassert.
      
         9) Export MPLS packet stats via netlink, from Robert Shearman.
      
        10) Significantly improve inet port bind conflict handling, especially
            when an application is restarted and changes its setting of
            reuseport. From Josef Bacik.
      
        11) Implement TX batching in vhost_net, from Jason Wang.
      
        12) Extend the dummy device so that VF (virtual function) features,
            such as configuration, can be more easily tested. From Phil
            Sutter.
      
        13) Avoid two atomic ops per page on x86 in bnx2x driver, from Eric
            Dumazet.
      
        14) Add new bpf MAP, implementing a longest prefix match trie. From
            Daniel Mack.
      
        15) Packet sample offloading support in mlxsw driver, from Yotam Gigi.
      
        16) Add new aquantia driver, from David VomLehn.
      
        17) Add bpf tracepoints, from Daniel Borkmann.
      
        18) Add support for port mirroring to b53 and bcm_sf2 drivers, from
            Florian Fainelli.
      
        19) Remove custom busy polling in many drivers; it has been done in
            core networking since 4.5. From Eric Dumazet.
      
        20) Support XDP adjust_head in virtio_net, from John Fastabend.
      
        21) Fix several major holes in neighbour entry confirmation, from
            Julian Anastasov.
      
        22) Add XDP support to bnxt_en driver, from Michael Chan.
      
        23) VXLAN offloads for enic driver, from Govindarajulu Varadarajan.
      
        24) Add IPVTAP driver (IP-VLAN based tap driver) from Sainath Grandhi.
      
        25) Support GRO in IPSEC protocols, from Steffen Klassert"
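
Item 14 above mentions a bpf map implementing a longest prefix match trie. As a rough illustration of the lookup idea only, here is a minimal Python sketch of a binary trie keyed by IPv4 prefixes. It is not the kernel implementation; the class name `LpmTrie`, its methods, and the stored string values are all hypothetical and chosen for this example.

```python
import ipaddress
from typing import Optional


class LpmTrie:
    """Toy binary trie keyed by IPv4 prefixes (illustrative, not kernel code)."""

    def __init__(self) -> None:
        # Each node is [child_for_bit_0, child_for_bit_1, value_or_None].
        self.root = [None, None, None]

    def insert(self, cidr: str, value: str) -> None:
        """Store value under the prefix given in CIDR notation."""
        net = ipaddress.ip_network(cidr)
        addr = int(net.network_address)
        node = self.root
        # Walk one level per prefix bit, most significant bit first.
        for i in range(net.prefixlen):
            bit = (addr >> (31 - i)) & 1
            if node[bit] is None:
                node[bit] = [None, None, None]
            node = node[bit]
        node[2] = value

    def lookup(self, ip: str) -> Optional[str]:
        """Return the value of the longest stored prefix covering ip."""
        addr = int(ipaddress.ip_address(ip))
        node = self.root
        best = node[2]  # a /0 prefix, if present, matches everything
        for i in range(32):
            node = node[(addr >> (31 - i)) & 1]
            if node is None:
                break
            if node[2] is not None:
                best = node[2]  # deeper match means a longer prefix
        return best


trie = LpmTrie()
trie.insert("10.0.0.0/8", "coarse")
trie.insert("10.1.0.0/16", "fine")
```

With these two entries, an address inside 10.1.0.0/16 resolves to the more specific entry, an address elsewhere in 10.0.0.0/8 falls back to the shorter prefix, and an address outside both returns `None`.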
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1764 commits)
        Revert "ath10k: Search SMBIOS for OEM board file extension"
        net: socket: fix recvmmsg not returning error from sock_error
        bnxt_en: use eth_hw_addr_random()
        bpf: fix unlocking of jited image when module ronx not set
        arch: add ARCH_HAS_SET_MEMORY config
        net: napi_watchdog() can use napi_schedule_irqoff()
        tcp: Revert "tcp: tcp_probe: use spin_lock_bh()"
        net/hsr: use eth_hw_addr_random()
        net: mvpp2: enable building on 64-bit platforms
        net: mvpp2: switch to build_skb() in the RX path
        net: mvpp2: simplify MVPP2_PRS_RI_* definitions
        net: mvpp2: fix indentation of MVPP2_EXT_GLOBAL_CTRL_DEFAULT
        net: mvpp2: remove unused register definitions
        net: mvpp2: simplify mvpp2_bm_bufs_add()
        net: mvpp2: drop useless fields in mvpp2_bm_pool and related code
        net: mvpp2: remove unused 'tx_skb' field of 'struct mvpp2_tx_queue'
        net: mvpp2: release reference to txq_cpu[] entry after unmapping
        net: mvpp2: handle too large value in mvpp2_rx_time_coal_set()
        net: mvpp2: handle too large value handling in mvpp2_rx_pkts_coal_set()
        net: mvpp2: remove useless arguments in mvpp2_rx_{pkts, time}_coal_set
        ...
      3051bf36
    • Linus Torvalds's avatar
      Merge tag 'gcc-plugins-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 1e74a2eb
      Linus Torvalds authored
      Pull gcc-plugins updates from Kees Cook:
       "This includes infrastructure updates and the structleak plugin, which
        performs forced initialization of certain structures to avoid possible
        information exposures to userspace.
      
        Summary:
      
         - infrastructure updates (gcc-common.h)
      
         - introduce structleak plugin for forced initialization of some
           structures"
      
      * tag 'gcc-plugins-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        gcc-plugins: Add structleak for more stack initialization
        gcc-plugins: consolidate on PASS_INFO macro
        gcc-plugins: add PASS_INFO and build_const_char_string()
      1e74a2eb
  2. Feb 22, 2017
    • Linus Torvalds's avatar
      Merge tag 'rodata-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 7bb03382
      Linus Torvalds authored
      Pull rodata updates from Kees Cook:
       "This renames the (now inaccurate) DEBUG_RODATA and related
        SET_MODULE_RONX configs to the more sensible STRICT_KERNEL_RWX and
        STRICT_MODULE_RWX"
      
      * tag 'rodata-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        arch: Rename CONFIG_DEBUG_RODATA and CONFIG_DEBUG_MODULE_RONX
        arch: Move CONFIG_DEBUG_RODATA and CONFIG_SET_MODULE_RONX to be common
      7bb03382
    • Linus Torvalds's avatar
      Merge tag 'usercopy-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 4a0853bf
      Linus Torvalds authored
      Pull usercopy test updates from Kees Cook:
       "This improves the usercopy tests:
      
         - check zeroing on failed copy_from_user()/get_user() (caught bug on
           ARM)
      
         - adjust tests for SMAP/PAN (can't zero userspace memory on failure)"
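
The zeroing behaviour that the first test above checks can be modelled outside the kernel. The following Python sketch is a toy model under stated assumptions, not kernel code: `copy_from_user_model` and its `readable` parameter (how many source bytes can be read before a simulated fault) are hypothetical names. It assumes the contract being tested is that a failed copy returns the number of bytes not copied and leaves the uncopied tail of the destination zeroed.

```python
def copy_from_user_model(dst: bytearray, src: bytes, n: int, readable: int) -> int:
    """Toy model of a partially failing copy.

    Copies up to n bytes from src into dst, but pretends a fault occurs
    after `readable` bytes. Returns the number of bytes NOT copied and
    zeroes that uncopied tail of dst, so no stale data is left behind.
    """
    if n > len(dst):
        raise ValueError("destination buffer too small")
    copied = min(n, readable, len(src))
    dst[:copied] = src[:copied]
    # Zero the part of the destination that the copy never reached.
    dst[copied:n] = bytes(n - copied)
    return n - copied


buf = bytearray(b"\xff" * 8)  # stale contents that must not survive
left = copy_from_user_model(buf, b"abcdefgh", 8, readable=3)
```

A test in this style asserts both the return value and that every byte past the fault point reads back as zero rather than as the stale `0xff` pattern.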
      
      * tag 'usercopy-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        usercopy: Add tests for all get_user() sizes
        usercopy: Adjust tests to deal with SMAP/PAN
        usercopy: add testcases to check zeroing on failure
      4a0853bf
    • Linus Torvalds's avatar
      Merge tag 'pstore-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 6d1dd93e
      Linus Torvalds authored
      Pull pstore updates from Kees Cook:
       "Minor changes to pstore tree:
      
         - update MAINTAINERS with current git repo, add more files.
      
         - move prz allocation checks into the walker
      
         - initialize flags correctly (by accident spinlock was technically
           ok)"
      
      * tag 'pstore-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        MAINTAINERS: Adjust pstore git repo URI, add files
        pstore: Check for prz allocation in walker
        pstore: Correctly initialize spinlock and flags
      6d1dd93e
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · 81bbef23
      Linus Torvalds authored
      Pull HID fix from Jiri Kosina:
       "Regression fix for HID_RMI-driven synaptics touchpads in
        !CONFIG_HID_RMI cases"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
        HID: rmi: fallback to generic/multitouch if hid-rmi is not built
      81bbef23
    • Linus Torvalds's avatar
      Merge branch 'for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · 79f4d1d5
      Linus Torvalds authored
      Pull libata updates from Tejun Heo:
      
       - Bartlomiej added pata_falcon
      
       - Christoph is trying to remove use of static 4k buf.  It's still WIP
      
       - config cleanup around HAS_DMA
      
       - other fixes and driver-specific changes
      
      * 'for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: (29 commits)
        ata: pata_of_platform: using of_property_read_u32() helper
        pata_atiixp: Don't use unconnected secondary port on SB600/SB700
        libata-sff: Don't scan disabled ports when checking for legacy mode.
        pata_octeon_cf: remove unused local variables from octeon_cf_set_piomode()
        ahci: qoriq: added ls2088a platforms support
        ahci: qoriq: report error when ecc register address is missing in dts
        ahci: qoriq: added a condition to enable dma coherence
        Revert "libata: switch to dynamic allocation instead of ata_scsi_rbuf"
        ahci: imx: fix building without hwmon or thermal
        ata: add Atari Falcon PATA controller driver
        ata: pass queued command to ->sff_data_xfer method
        ata: allow subsystem to be used on m68k arch
        libata: switch to dynamic allocation instead of ata_scsi_rbuf
        libata: don't call ata_scsi_rbuf_fill for command without a response buffer
        libata: call ->scsi_done from ata_scsi_simulate
        libata: remove the done callback from ata_scsi_args
        libata: move struct ata_scsi_args to libata-scsi.c
        libata: avoid global response buffer in atapi_qc_complete
        libata-eh: Use switch() instead of sparse array for protocol strings
        ata: sata_mv: Convert to devm_ioremap_resource()
        ...
      79f4d1d5