  1. Apr 25, 2016
    • ixgbe: resolve shift of negative value warning · 3e973dc4
      Jacob Keller authored

      Make use of GENMASK instead of open coding the equivalent operation
      incorrectly.
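
For illustration only, here is a small Python model of what GENMASK(high, low) evaluates to. It is a sketch, not the driver change itself: the helper name genmask and the 64-bit BITS_PER_LONG value are assumptions made for this example. The point is that the mask is built from unsigned all-ones values, so no negative value is ever shifted.

```python
BITS_PER_LONG = 64  # assumption: model of a 64-bit kernel build


def genmask(high: int, low: int) -> int:
    """Model of GENMASK(high, low): bits low..high (inclusive) are set."""
    if not 0 <= low <= high < BITS_PER_LONG:
        raise ValueError("need 0 <= low <= high < BITS_PER_LONG")
    all_ones = (1 << BITS_PER_LONG) - 1  # stands in for ~0UL
    # (~0UL << low) & (~0UL >> (BITS_PER_LONG - 1 - high))
    return (all_ones << low) & all_ones & (all_ones >> (BITS_PER_LONG - 1 - high))


print(hex(genmask(7, 4)))   # 0xf0
print(hex(genmask(31, 0)))  # 0xffffffff
```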
      
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ixgbe: use BIT() macro · b4f47a48
      Jacob Keller authored

      Several areas of ixgbe were written before widespread usage of the
      BIT(n) macro. With the impending release of GCC 6 and its associated new
      warnings, some usages such as (1 << 31) have been noted within the ixgbe
      driver source. Fix these wholesale and prevent future issues by simply
      using the BIT() macro instead of hand-coded bit shifts.

      Also fix a few shifts that are shifting values into place by using the
      'u' suffix to indicate unsigned. It doesn't strictly matter in these
      cases because we're not shifting by too large a value, but these are all
      unsigned values and should be indicated as such.
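
As a rough illustration of why (1 << 31) draws a warning: in C the literal 1 is a signed int, and with a 32-bit int the result of the shift does not fit. The Python sketch below only models that arithmetic. The helpers bit and as_int32 are names invented for this example, and as_int32 shows what a two's-complement machine typically ends up with, not something the C standard guarantees.

```python
def bit(n: int) -> int:
    """Model of BIT(n), i.e. (1UL << n): an unsigned value with bit n set."""
    return 1 << n


def as_int32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & (1 << 31) else value


INT32_MAX = 2**31 - 1

print(bit(31) > INT32_MAX)  # True: bit 31 does not fit in a signed 32-bit int
print(as_int32(bit(31)))    # -2147483648
print(as_int32(bit(30)))    # 1073741824, still representable
```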
      
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ixgbe: Add work around for empty SFP+ cage crosstalk · 4319a797
      Don Skidmore authored

      It is possible on some systems that crosstalk could lead to link flap
      on empty SFP+ cages.  A new NVM bit was defined to let SW know it
      needs to implement the work around which consists of verifying that
      there is a module in the cage before acting on the LSC.
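
The decision described above can be sketched as a tiny predicate. This is only a model of the stated rule: the function and parameter names are hypothetical and do not come from the driver.

```python
def should_act_on_lsc(needs_crosstalk_fix: bool, sfp_module_present: bool) -> bool:
    """Decide whether a link status change (LSC) event should be acted on.

    needs_crosstalk_fix models the NVM bit that requests the workaround;
    sfp_module_present models the result of checking the SFP+ cage.
    Both names are assumptions made for this sketch.
    """
    if needs_crosstalk_fix and not sfp_module_present:
        # Empty cage on an affected system: treat the LSC as crosstalk noise.
        return False
    return True


print(should_act_on_lsc(True, False))   # False
print(should_act_on_lsc(True, True))    # True
print(should_act_on_lsc(False, False))  # True
```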
      
      Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ixgbe: Use correct FC setup function for x550em_a · a0254a70
      Mark Rustad authored

      Somehow the wrong fc_setup function was used for x550em_a, so
      correct that. Also set setup_link to NULL as its value is
      determined later, just like it is with X550EM_x.
      
      Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ixgbevf: add support for per-queue ethtool stats · a02a5a53
      Emil Tantilov authored

      Implement per-queue statistics for packets, bytes and busy poll
      specific counters.
      
      Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ixgbevf: refactor ethtool stats handling · d72d6c19
      Emil Tantilov authored

      This brings the logic closer to how we handle the stats in ixgbe and it
      sets us up for introducing per-queue stats.
      
      Use IXGBEVF_STAT and IXGBEVF_NETDEV_STAT for accessing the driver and
      netdev stats respectively. This way we don't have to calculate the
      stats based on register values which could lead to the counters not
      being initialized properly when the interface is down.
      
      IXGBEVF_QUEUE_STATS_LEN is set to include the number of queues.
      
      Also some defines were renamed to use the IXGBEVF prefix.
      
      Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ixgbe: Add register wait for slow links · 2f2219be
      Mark Rustad authored

      Use a new register to wait for previous register writes to complete
      before issuing a register read. This is needed when slower links
      are in use.
      
      Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ixgbe: make 'action' field in struct ixgbe_fdir_filter a u64 value · 2a9ed5d1
      Sridhar Samudrala authored

      This field is used to record the RX queue index for a redirect action
      passed via ring_cookie field in struct ethtool_rx_flow_spec which is
      a u64 value.
      
      For example, after adding a filter rule to redirect to a VF using ethtool:
        # echo 4 > /sys/class/net/p4p1/device/sriov_numvfs
        # ethtool -N p4p1 flow-type ip4 src-ip 192.168.0.1 action 0x100000000
      
      querying for the rule shows the Action as 'Direct to queue 0'
      
        # ethtool -n p4p1
        4 RX rings available
        Total 1 rules
      
        Filter: 2045
      	Rule Type: Raw IPv4
      	Src IP addr: 192.168.0.1 mask: 0.0.0.0
      	Dest IP addr: 0.0.0.0 mask: 255.255.255.255
      	TOS: 0x0 mask: 0xff
      	Protocol: 0 mask: 0xff
      	L4 bytes: 0x0 mask: 0xffffffff
      	VLAN EtherType: 0x0 mask: 0xffff
      	VLAN: 0x0 mask: 0xffff
      	User-defined: 0x0 mask: 0xffffffffffffffff
      	Action: Direct to queue 0
      
      With this fix, ethtool will report the right queue index even for VFs.
      	Action: Direct to queue 4294967296
      
      Here 4294967296 corresponds to 0x100000000.
      We need to update 'ethtool' to report the queue index as a hex value so
      that it is more user friendly and matches the 'action' value that
      is passed when adding the rule.
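
To make the numbers above concrete, the sketch below decodes a ring_cookie and shows what happens when it is stored in a field that is too narrow. The mask constants follow the ring_cookie layout in the kernel's ethtool UAPI header as I understand it (ring index in the low 32 bits, VF in the next 8 bits). The 16-bit width used for the old field is an assumption chosen for illustration, and the helper names are invented for this example.

```python
# ring_cookie layout (values assumed to match include/uapi/linux/ethtool.h)
ETHTOOL_RX_FLOW_SPEC_RING = 0x00000000FFFFFFFF
ETHTOOL_RX_FLOW_SPEC_RING_VF = 0x000000FF00000000
ETHTOOL_RX_FLOW_SPEC_RING_VF_OFF = 32


def decode_ring_cookie(cookie: int) -> tuple[int, int]:
    """Return (vf, ring) encoded in a 64-bit ring_cookie."""
    ring = cookie & ETHTOOL_RX_FLOW_SPEC_RING
    vf = (cookie & ETHTOOL_RX_FLOW_SPEC_RING_VF) >> ETHTOOL_RX_FLOW_SPEC_RING_VF_OFF
    return vf, ring


def truncate(value: int, bits: int) -> int:
    """Model storing value in an unsigned field that is `bits` wide."""
    return value & ((1 << bits) - 1)


cookie = 0x100000000
print(cookie)                      # 4294967296
print(decode_ring_cookie(cookie))  # (1, 0)
print(truncate(cookie, 16))        # 0: the VF part is lost in a narrow field
print(truncate(cookie, 64))        # 4294967296: preserved in a u64
```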
      
      Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ixgbe: fix default mac->ops.setup_link for X550EM · 4695886c
      Emil Tantilov authored

      X550EM_a/x did not have a default value for mac->ops.setup_link which
      was causing link issues for backplane devices.
      
      This patch sets mac->ops.setup_link to ixgbe_setup_mac_link_X540 for
      X550EM_a/x, which is also the default for X550. This will result in
      mac->ops.setup_link calling the link setup function for the respective
      PHY type in case we do not need a special function to deal with it.
      
      Reported-by: Ken Cox <jkc@redhat.com>
      Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ixgbe: set VLAN spoof checking unconditionally · d3dec7c7
      Emil Tantilov authored

      Previously the PF driver would only set VLAN spoof checking if
      the VF had created VLANs. This was done by setting and checking
      a counter (vlan_count) whenever a VLAN was created by the VF.
      However, it is possible for vlan_count to be != 0 while there are
      no VLANs assigned to the VF, because the count is incremented every
      time VLAN 0 is added on ifdown/up. This resulted in VLAN spoof
      checking always being set for those VFs.

      This patch cleans up the logic by unconditionally setting VLAN spoof
      checking based on how the VF is configured (via ip link set ethX vf Y
      spoofchk on/off). This change also resolves an issue where VLAN spoof
      checking could remain set even after the user disabled it: the driver
      enabled VLAN spoof checking every time a VLAN was added to the VF, but
      only allowed changes to the setting if vlan_count != 0.
      
      Also default_vf_vlan_id and vlans_enabled were removed from the
      vf_data_storage structure since they are not being used in the driver.
      
      Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • ixgbe: consolidate the configuration of spoof checking · 77f192af
      Emil Tantilov authored

      Consolidate the logic behind configuring spoof checking:
      
      Move the setting of the MAC, VLAN and Ethertype spoof checking into
      ixgbe_ndo_set_vf_spoofchk().
      
      Change ixgbe_set_mac_anti_spoofing() to set MAC spoofing per VF similar
      to the VLAN and Ethertype functions - this allows us to call the helper
      functions in ixgbe_ndo_set_vf_spoofchk() for all spoof check types and
      only disable MAC spoof checking when creating MACVLAN.
      
      Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • tcp-tso: do not split TSO packets at retransmit time · 10d3be56
      Eric Dumazet authored

      The Linux TCP stack painfully segments all TSO/GSO packets before retransmits.

      This was fine back in the days when TSO/GSO were emerging, with their
      bugs, but we believe the dark age is over.

      Keeping big packets in write queues, but also in stack traversal,
      has a lot of benefits:
       - Less memory overhead, because write queues have fewer skbs
       - Less cpu overhead at ACK processing.
       - Better SACK processing, as a lot of studies mentioned how
         awful Linux was at this ;)
       - Less cpu overhead to send the rtx packets
         (IP stack traversal, netfilter traversal, drivers...)
       - Better latencies in presence of losses.
       - Smaller spikes in fq like packet schedulers, as retransmits
         are not constrained by TCP Small Queues.
      
      1% packet loss is common today, and at 100Gbit speeds, this
      translates to ~80,000 losses per second.
      Losses are often correlated, and we see many retransmit events
      leading to trains of 1-MSS packets at a time when hosts are already
      under stress.
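
The ~80,000 figure can be reproduced with a back-of-the-envelope calculation. The 1500-byte packet size below is an assumption (the message does not state one), so treat the result as an order-of-magnitude check rather than an exact number.

```python
link_bits_per_second = 100 * 10**9  # 100Gbit link
packet_bytes = 1500                 # assumption: roughly one MTU-sized packet
loss_rate = 0.01                    # 1% packet loss

packets_per_second = link_bits_per_second / (packet_bytes * 8)
losses_per_second = packets_per_second * loss_rate

print(round(packets_per_second))  # 8333333
print(round(losses_per_second))   # 83333
```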
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: fix stale links after re-enabling bearer · 8cee83dd
      Parthasarathy Bhuvaragan authored

      Commit 42b18f60 ("tipc: refactor function tipc_link_timeout()")
      introduced a bug which prevents sending of probe messages during
      the link synchronization phase. This leads to hanging links if the
      bearer is disabled/enabled after links are up.
      
      In this commit, we send the probe messages correctly.
      
      Fixes: 42b18f60 ("tipc: refactor function tipc_link_timeout()")
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'tcp-tcstamp_ack-frag-coalesce' · 6a74c196
      David S. Miller authored

      Martin KaFai Lau says:
      
      ====================
      tcp: Handle txstamp_ack when fragmenting/coalescing skbs
      
      This patchset is to handle the txstamp-ack bit when
      fragmenting/coalescing skbs.
      
      The second patch depends on the recently posted series
      for the net branch:
      "tcp: Merge timestamp info when coalescing skbs"
      
      A BPF prog is used to kprobe to sock_queue_err_skb()
      and print out the value of serr->ee.ee_data.  The BPF
      prog (run-able from bcc) is attached here:
      
      BPF prog used for testing:
      ~~~~~
      
      from __future__ import print_function
      from bcc import BPF
      
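      bpf_text = """
      #include <uapi/linux/ptrace.h>
      #include <net/sock.h>
      #include <linux/skbuff.h>
      #include <linux/errqueue.h>

      int trace_err_skb(struct pt_regs *ctx)
      {
      	/* x86-64 calling convention: di = 1st argument, si = 2nd argument */
      	struct sk_buff *skb = (struct sk_buff *)ctx->si;
      	struct sock *sk = (struct sock *)ctx->di;
      	struct sock_exterr_skb *serr;
      	u32 ee_data = 0;

      	if (!sk || !skb)
      		return 0;

      	/* read ee_data from the extended error stored in the skb's cb area */
      	serr = SKB_EXT_ERR(skb);
      	bpf_probe_read(&ee_data, sizeof(ee_data), &serr->ee.ee_data);
      	bpf_trace_printk("ee_data:%u\\n", ee_data);

      	return 0;
      }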
      """
      
      b = BPF(text=bpf_text)
      b.attach_kprobe(event="sock_queue_err_skb", fn_name="trace_err_skb")
      print("Attached to kprobe")
      b.trace_print()
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: Merge txstamp_ack in tcp_skb_collapse_tstamp · 2de8023e
      Martin KaFai Lau authored

      When collapsing skbs, txstamp_ack also needs to be merged.
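
As a simplified model of the merge, the Python sketch below collapses two buffers and carries the timestamp state forward. The Skb class, its fields and the collapse helper are hypothetical stand-ins, not kernel structures; byte offsets are used instead of TCP sequence numbers, and the values mirror the first test that follows (two 730-byte writes, timestamp key 1459).

```python
from dataclasses import dataclass


@dataclass
class Skb:
    """Toy stand-in for an skb: a byte range plus timestamping state."""
    start: int
    end: int                   # exclusive
    has_tstamp: bool = False   # models "some tx timestamp was requested"
    tskey: int = 0             # timestamp key (offset of last byte covered)
    txstamp_ack: bool = False  # models the txstamp_ack bit


def collapse(skb: Skb, next_skb: Skb) -> Skb:
    """Merge next_skb into skb, keeping the timestamp request alive."""
    skb.end = next_skb.end
    if next_skb.has_tstamp:
        skb.has_tstamp = True
        skb.tskey = next_skb.tskey
        # The step this commit is about: do not lose the ACK timestamp request.
        skb.txstamp_ack = skb.txstamp_ack or next_skb.txstamp_ack
    return skb


first = Skb(start=0, end=730)
second = Skb(start=730, end=1460, has_tstamp=True, tskey=1459, txstamp_ack=True)
merged = collapse(first, second)
print(merged.tskey, merged.txstamp_ack)  # 1459 True
```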
      
      Retrans Collapse Test:
      ~~~~~~
      0.200 accept(3, ..., ...) = 4
      +0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
      
      0.200 write(4, ..., 730) = 730
      +0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0
      0.200 write(4, ..., 730) = 730
      +0 setsockopt(4, SOL_SOCKET, 37, [2176], 4) = 0
      0.200 write(4, ..., 11680) = 11680
      
      0.200 > P. 1:731(730) ack 1
      0.200 > P. 731:1461(730) ack 1
      0.200 > . 1461:8761(7300) ack 1
      0.200 > P. 8761:13141(4380) ack 1
      
      0.300 < . 1:1(0) ack 1 win 257 <sack 1461:2921,nop,nop>
      0.300 < . 1:1(0) ack 1 win 257 <sack 1461:4381,nop,nop>
      0.300 < . 1:1(0) ack 1 win 257 <sack 1461:5841,nop,nop>
      0.300 > P. 1:1461(1460) ack 1
      0.400 < . 1:1(0) ack 13141 win 257
      
      BPF Output Before:
      ~~~~~
      <No output due to missing SCM_TSTAMP_ACK timestamp>
      
      BPF Output After:
      ~~~~~
      <...>-2027  [007] d.s.    79.765921: : ee_data:1459
      
      Sacks Collapse Test:
      ~~~~~
      0.200 accept(3, ..., ...) = 4
      +0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
      
      0.200 write(4, ..., 1460) = 1460
      +0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0
      0.200 write(4, ..., 13140) = 13140
      +0 setsockopt(4, SOL_SOCKET, 37, [2176], 4) = 0
      
      0.200 > P. 1:1461(1460) ack 1
      0.200 > . 1461:8761(7300) ack 1
      0.200 > P. 8761:14601(5840) ack 1
      
      0.300 < . 1:1(0) ack 1 win 257 <sack 1461:14601,nop,nop>
      0.300 > P. 1:1461(1460) ack 1
      0.400 < . 1:1(0) ack 14601 win 257
      
      BPF Output Before:
      ~~~~~
      <No output due to missing SCM_TSTAMP_ACK timestamp>
      
      BPF Output After:
      ~~~~~
      <...>-2049  [007] d.s.    89.185538: : ee_data:14599
      
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Tested-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: Carry txstamp_ack in tcp_fragment_tstamp · b51e13fa
      Martin KaFai Lau authored

      When a TCP skb is sliced into two smaller skbs (e.g. in
      tcp_fragment() and tso_fragment()), the txstamp_ack bit is not
      carried to the newly created skb when it is needed there.
      As a result, a timestamping event (SCM_TSTAMP_ACK) will
      be missing from the sk->sk_error_queue.
      
      This patch carries this bit to the new skb2
      in tcp_fragment_tstamp().
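
A simplified model of the idea: when a buffer is split and the timestamp key falls into the second half, the timestamp state, including txstamp_ack, has to move to the new buffer. The Skb class and fragment helper below are hypothetical and use byte offsets rather than TCP sequence numbers; the sizes match the packetdrill script in this message (one 14600-byte write split at 7300, key 14599).

```python
from dataclasses import dataclass


@dataclass
class Skb:
    """Toy stand-in for an skb: a byte range plus timestamping state."""
    start: int
    end: int                   # exclusive
    has_tstamp: bool = False   # models "some tx timestamp was requested"
    tskey: int = 0             # timestamp key (offset of last byte covered)
    txstamp_ack: bool = False  # models the txstamp_ack bit


def fragment(skb: Skb, split_at: int) -> Skb:
    """Split skb at split_at and return the new second half (skb2)."""
    skb2 = Skb(start=split_at, end=skb.end)
    skb.end = split_at
    if skb.has_tstamp and skb.tskey >= skb2.start:
        # The timestamped byte now lives in skb2, so move the request there.
        skb2.has_tstamp, skb.has_tstamp = True, False
        skb2.tskey, skb.tskey = skb.tskey, skb2.tskey
        # The step this commit is about: carry txstamp_ack along as well.
        skb2.txstamp_ack, skb.txstamp_ack = skb.txstamp_ack, False
    return skb2


skb = Skb(start=0, end=14600, has_tstamp=True, tskey=14599, txstamp_ack=True)
skb2 = fragment(skb, 7300)
print(skb.txstamp_ack, skb2.txstamp_ack, skb2.tskey)  # False True 14599
```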
      
      BPF Output Before:
      ~~~~~~
      <No output due to missing SCM_TSTAMP_ACK timestamp>
      
      BPF Output After:
      ~~~~~~
      <...>-2050  [000] d.s.   100.928763: : ee_data:14599
      
      Packetdrill Script:
      ~~~~~~
      +0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
      +0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
      +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      +0 bind(3, ..., ...) = 0
      +0 listen(3, 1) = 0
      
      0.100 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7>
      0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
      0.200 < . 1:1(0) ack 1 win 257
      0.200 accept(3, ..., ...) = 4
      +0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
      
      +0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0
      0.200 write(4, ..., 14600) = 14600
      +0 setsockopt(4, SOL_SOCKET, 37, [2176], 4) = 0
      
      0.200 > . 1:7301(7300) ack 1
      0.200 > P. 7301:14601(7300) ack 1
      
      0.300 < . 1:1(0) ack 14601 win 257
      
      0.300 close(4) = 0
      0.300 > F. 14601:14601(0) ack 1
      0.400 < F. 1:1(0) ack 16062 win 257
      0.400 > . 14602:14602(0) ack 2
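
For readers decoding the raw numbers in the script: option 37 at SOL_SOCKET is SO_TIMESTAMPING on most architectures, and its argument is a bitmask of SOF_TIMESTAMPING_* flags. The sketch below decodes the two values used above. The flag values are taken from the kernel's net_tstamp.h as I understand it, and the decode helper is a name invented for this example.

```python
# SOF_TIMESTAMPING_* bits (assumed to match include/uapi/linux/net_tstamp.h)
SOF_TIMESTAMPING_FLAGS = {
    "SOF_TIMESTAMPING_TX_HARDWARE": 1 << 0,
    "SOF_TIMESTAMPING_TX_SOFTWARE": 1 << 1,
    "SOF_TIMESTAMPING_RX_HARDWARE": 1 << 2,
    "SOF_TIMESTAMPING_RX_SOFTWARE": 1 << 3,
    "SOF_TIMESTAMPING_SOFTWARE": 1 << 4,
    "SOF_TIMESTAMPING_SYS_HARDWARE": 1 << 5,
    "SOF_TIMESTAMPING_RAW_HARDWARE": 1 << 6,
    "SOF_TIMESTAMPING_OPT_ID": 1 << 7,
    "SOF_TIMESTAMPING_TX_SCHED": 1 << 8,
    "SOF_TIMESTAMPING_TX_ACK": 1 << 9,
    "SOF_TIMESTAMPING_OPT_CMSG": 1 << 10,
    "SOF_TIMESTAMPING_OPT_TSONLY": 1 << 11,
}


def decode(value: int) -> list[str]:
    """Return the names of the flags set in value, lowest bit first."""
    return [name for name, bit in SOF_TIMESTAMPING_FLAGS.items() if value & bit]


print(decode(2688))  # OPT_ID, TX_ACK, OPT_TSONLY
print(decode(2176))  # OPT_ID, OPT_TSONLY
```

Read this way, the script turns on ACK timestamping (2688) before the write under test and turns the TX_ACK bit off again (2176) afterwards.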
      
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Tested-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. Apr 24, 2016
  3. Apr 22, 2016