  1. Oct 04, 2014
  2. Oct 03, 2014
  3. Oct 02, 2014
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 50dddff3
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Don't halt the firmware in r8152 driver, from Hayes Wang.
      
       2) Handle full sized 802.1ad frames in bnx2 and tg3 drivers properly,
          from Vlad Yasevich.
      
       3) Don't sleep while holding tx_clean_lock in netxen driver, fix from
          Manish Chopra.
      
       4) Certain kinds of ipv6 routes can end up endlessly failing the route
          validation test, causing it to be re-looked up over and over again.
          This particularly kills input route caching in TCP sockets.  Fix
          from Hannes Frederic Sowa.
      
       5) netvsc_start_xmit() has a use-after-free access to skb->len, fix
          from K Y Srinivasan.
      
       6) Fix matching of inverted containers in ematch module, from Ignacy
          Gawędzki.
      
       7) Aggregation of GRO frames via SKB ->frag_list for linear skbs isn't
          handled properly, regression fix from Eric Dumazet.
      
       8) Don't test return value of ipv4_neigh_lookup(), which returns an
          error pointer, against NULL.  From WANG Cong.
      
       9) Fix an old regression where we mistakenly allow a double add of the
          same tunnel.  Fixes from Steffen Klassert.
      
      10) macvtap device delete and open can run in parallel and corrupt lists
          etc., fix from Vlad Yasevich.
      
      11) Fix build error with IPV6=m NETFILTER_XT_TARGET_TPROXY=y, from Pablo
          Neira Ayuso.
      
      12) rhashtable_destroy() triggers lockdep splats, fix also from Pablo.
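Item 8 above depends on the kernel's error-pointer convention: a function that reports failure through an encoded errno never returns NULL, so a NULL test silently misses the error. The following is a minimal user-space sketch of that convention. The `ERR_PTR`/`IS_ERR`/`PTR_ERR` helpers are simplified re-implementations, and `lookup_neigh()`, `use_neigh()` and the choice of `-ENOMEM` are hypothetical stand-ins, not the actual `ipv4_neigh_lookup()` code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified user-space imitations of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errnos. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct neighbour { int id; };

/* Hypothetical lookup: returns a valid pointer or an encoded errno, never NULL. */
static struct neighbour *lookup_neigh(struct neighbour *table, int found)
{
	if (!found)
		return ERR_PTR(-ENOMEM);
	return table;
}

/* Returns 0 on success or a negative errno on failure. */
static int use_neigh(struct neighbour *table, int found)
{
	struct neighbour *n = lookup_neigh(table, found);

	/* An "if (!n)" test would miss the failure: the error pointer is non-NULL. */
	if (IS_ERR(n))
		return (int)PTR_ERR(n);
	assert(n != NULL);
	return 0;
}
```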
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (32 commits)
        bna: Update Maintainer Email
        r8152: disable power cut for RTL8153
        r8152: remove clearing bp
        bnx2: Correctly receive full sized 802.1ad fragmes
        tg3: Allow for recieve of full-size 8021AD frames
        r8152: fix setting RTL8152_UNPLUG
        netxen: Fix bug in Tx completion path.
        netxen: Fix BUG "sleeping function called from invalid context"
        ipv6: remove rt6i_genid
        hyperv: Fix a bug in netvsc_start_xmit()
        net: stmmac: fix stmmac_pci_probe failed when CONFIG_HAVE_CLK is selected
        ematch: Fix matching of inverted containers.
        gro: fix aggregation for skb using frag_list
        neigh: check error pointer instead of NULL for ipv4_neigh_lookup()
        ip6_gre: Return an error when adding an existing tunnel.
        ip6_vti: Return an error when adding an existing tunnel.
        ip6_tunnel: Return an error when adding an existing tunnel.
        ip6gre: add a rtnl link alias for ip6gretap
        net/mlx4_core: Allow not to specify probe_vf in SRIOV IB mode
        r8152: fix the carrier off when autoresuming
        ...
      50dddff3
    • Rasesh Mody's avatar
      bna: Update Maintainer Email · 439e9575
      Rasesh Mody authored
      
      
      Update the maintainer email for BNA driver.
      
      Signed-off-by: Rasesh Mody <rasesh.mody@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      439e9575
    • Petri Gynther's avatar
      net: bcmgenet: fix bcmgenet_put_tx_csum() · bc23333b
      Petri Gynther authored
      
      
      bcmgenet_put_tx_csum() needs to return skb pointer back to the caller
      because it reallocates a new one in case of lack of skb headroom.
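The reasoning above applies to any helper that may replace the buffer it is given: the caller has to continue with the returned pointer, because the old one may have been freed. The following is a rough user-space sketch of that contract. `struct pkt` and `put_tx_header()` are hypothetical names for illustration and do not reproduce the driver's code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy packet: 'headroom' is the number of free bytes in front of the payload. */
struct pkt {
	size_t headroom;
	size_t len;
	unsigned char payload[64];
};

/*
 * Ensure at least 'need' bytes of headroom. If the packet is too tight,
 * a new packet is allocated, the old one is freed, and the NEW pointer is
 * returned. Returns NULL (with the old packet freed) on allocation failure.
 */
static struct pkt *put_tx_header(struct pkt *p, size_t need)
{
	struct pkt *n;

	assert(p != NULL);
	if (p->headroom >= need)
		return p;              /* enough room: same pointer comes back */

	n = malloc(sizeof(*n));
	if (!n) {
		free(p);
		return NULL;
	}
	memcpy(n, p, sizeof(*n));      /* carry the payload over */
	n->headroom = need;
	free(p);                       /* the caller's old pointer is now stale */
	return n;
}
```

A caller would write `p = put_tx_header(p, need);` and check for NULL, rather than ignoring the return value.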
      
      Signed-off-by: Petri Gynther <pgynther@google.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bc23333b
    • Alexei Starovoitov's avatar
      net: pktgen: packet bursting via skb->xmit_more · 38b2cf29
      Alexei Starovoitov authored
      
      
      This patch demonstrates the effect of delaying update of HW tailptr.
      (based on earlier patch by Jesper)
      
      burst=1 is the default. It sends one packet with xmit_more=false
      burst=2 sends one packet with xmit_more=true and
              2nd copy of the same packet with xmit_more=false
      burst=3 sends two copies of the same packet with xmit_more=true and
              3rd copy with xmit_more=false
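The burst semantics listed above can be sketched as a small simulation: every packet in a burst except the last is sent with xmit_more=true, so the (simulated) hardware tail pointer is written once per burst. The `struct txq`, `xmit_one()` and `send_burst()` names are hypothetical and only model the counting, not pktgen or any driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts queued packets and how often the simulated tail pointer is written. */
struct txq {
	unsigned int queued;
	unsigned int tail_writes;
};

/* Queue one packet; only kick the "hardware" when no more packets follow. */
static void xmit_one(struct txq *q, bool xmit_more)
{
	q->queued++;
	if (!xmit_more)
		q->tail_writes++;
}

/* Send 'burst' copies: all but the last carry xmit_more=true. */
static void send_burst(struct txq *q, unsigned int burst)
{
	assert(burst >= 1);
	for (unsigned int i = 0; i < burst; i++)
		xmit_one(q, i + 1 < burst);
}
```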
      
      Performance with ixgbe (usec 30):
      burst=1  tx:9.2 Mpps
      burst=2  tx:13.5 Mpps
      burst=3  tx:14.5 Mpps full 10G line rate
      
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      38b2cf29
    • Florian Fainelli's avatar
      net: bridge: add a br_set_state helper function · 775dd692
      Florian Fainelli authored
      
      
      In preparation for being able to propagate port states to e.g: notifiers
      or other kernel parts, do not manipulate the port state directly, but
      instead use a helper function which will allow us to do a bit more than
      just setting the state.
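The idea of routing every state change through one helper can be sketched as follows. This is an illustration under assumptions: `struct port`, `port_set_state()` and the `state_changes` counter are hypothetical, with the counter standing in for whatever extra work (such as notifying other subsystems) a later patch might attach to the helper.

```c
#include <assert.h>

enum port_state {
	PORT_DISABLED,
	PORT_LISTENING,
	PORT_LEARNING,
	PORT_FORWARDING,
	PORT_BLOCKING
};

struct port {
	enum port_state state;
	unsigned int state_changes;   /* placeholder for "a bit more" work */
};

/* Single entry point for state changes, so extra behaviour has one home. */
static void port_set_state(struct port *p, enum port_state state)
{
	assert(p);
	if (p->state != state)
		p->state_changes++;   /* e.g. a notifier call could go here */
	p->state = state;
}
```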
      
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      775dd692
    • WANG Cong's avatar
      net_sched: avoid calling tcf_unbind_filter() in call_rcu callback · a0efb80c
      WANG Cong authored
      
      
      This fixes the following crash:
      
      [   63.976822] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      [   63.980094] CPU: 1 PID: 15 Comm: ksoftirqd/1 Not tainted 3.17.0-rc6+ #648
      [   63.980094] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [   63.980094] task: ffff880117dea690 ti: ffff880117dfc000 task.ti: ffff880117dfc000
      [   63.980094] RIP: 0010:[<ffffffff817e6d07>]  [<ffffffff817e6d07>] u32_destroy_key+0x27/0x6d
      [   63.980094] RSP: 0018:ffff880117dffcc0  EFLAGS: 00010202
      [   63.980094] RAX: ffff880117dea690 RBX: ffff8800d02e0820 RCX: 0000000000000000
      [   63.980094] RDX: 0000000000000001 RSI: 0000000000000002 RDI: 6b6b6b6b6b6b6b6b
      [   63.980094] RBP: ffff880117dffcd0 R08: 0000000000000000 R09: 0000000000000000
      [   63.980094] R10: 00006c0900006ba8 R11: 00006ba100006b9d R12: 0000000000000001
      [   63.980094] R13: ffff8800d02e0898 R14: ffffffff817e6d4d R15: ffff880117387a30
      [   63.980094] FS:  0000000000000000(0000) GS:ffff88011a800000(0000) knlGS:0000000000000000
      [   63.980094] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [   63.980094] CR2: 00007f07e6732fed CR3: 000000011665b000 CR4: 00000000000006e0
      [   63.980094] Stack:
      [   63.980094]  ffff88011a9cd300 ffffffff82051ac0 ffff880117dffce0 ffffffff817e6d68
      [   63.980094]  ffff880117dffd70 ffffffff810cb4c7 ffffffff810cb3cd ffff880117dfffd8
      [   63.980094]  ffff880117dea690 ffff880117dea690 ffff880117dfffd8 000000000000000a
      [   63.980094] Call Trace:
      [   63.980094]  [<ffffffff817e6d68>] u32_delete_key_freepf_rcu+0x1b/0x1d
      [   63.980094]  [<ffffffff810cb4c7>] rcu_process_callbacks+0x3bb/0x691
      [   63.980094]  [<ffffffff810cb3cd>] ? rcu_process_callbacks+0x2c1/0x691
      [   63.980094]  [<ffffffff817e6d4d>] ? u32_destroy_key+0x6d/0x6d
      [   63.980094]  [<ffffffff810780a4>] __do_softirq+0x142/0x323
      [   63.980094]  [<ffffffff810782a8>] run_ksoftirqd+0x23/0x53
      [   63.980094]  [<ffffffff81092126>] smpboot_thread_fn+0x203/0x221
      [   63.980094]  [<ffffffff81091f23>] ? smpboot_unpark_thread+0x33/0x33
      [   63.980094]  [<ffffffff8108e44d>] kthread+0xc9/0xd1
      [   63.980094]  [<ffffffff819e00ea>] ? do_wait_for_common+0xf8/0x125
      [   63.980094]  [<ffffffff8108e384>] ? __kthread_parkme+0x61/0x61
      [   63.980094]  [<ffffffff819e43ec>] ret_from_fork+0x7c/0xb0
      [   63.980094]  [<ffffffff8108e384>] ? __kthread_parkme+0x61/0x61
      
      tp could also be freed in a call_rcu callback; the order is not guaranteed.
      
      John Fastabend says:
      
      ====================
      It's worth noting why this is safe. Any running schedulers will either
      read the valid class field or it will be zeroed.
      
      All schedulers today when the class is 0 do a lookup using the
      same call used by the tcf_exts_bind(). So even if we have a running
      classifier hit the null class pointer it will do a lookup and get
      to the same result. This is particularly fragile at the moment because
      the only way to verify this is to audit the schedulers call sites.
      ====================
      
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a0efb80c
    • WANG Cong's avatar
      net_sched: fix another crash in cls_tcindex · 6e056569
      WANG Cong authored
      
      
      This patch fixes the following crash:
      
      [  166.670795] BUG: unable to handle kernel NULL pointer dereference at           (null)
      [  166.674230] IP: [<ffffffff814b739f>] __list_del_entry+0x5c/0x98
      [  166.674230] PGD d0ea5067 PUD ce7fc067 PMD 0
      [  166.674230] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      [  166.674230] CPU: 1 PID: 775 Comm: tc Not tainted 3.17.0-rc6+ #642
      [  166.674230] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [  166.674230] task: ffff8800d03c4d20 ti: ffff8800cae7c000 task.ti: ffff8800cae7c000
      [  166.674230] RIP: 0010:[<ffffffff814b739f>]  [<ffffffff814b739f>] __list_del_entry+0x5c/0x98
      [  166.674230] RSP: 0018:ffff8800cae7f7d0  EFLAGS: 00010207
      [  166.674230] RAX: 0000000000000000 RBX: ffff8800cba8d700 RCX: ffff8800cba8d700
      [  166.674230] RDX: 0000000000000000 RSI: dead000000200200 RDI: ffff8800cba8d700
      [  166.674230] RBP: ffff8800cae7f7d0 R08: 0000000000000001 R09: 0000000000000001
      [  166.674230] R10: 0000000000000000 R11: 000000000000859a R12: ffffffffffffffe8
      [  166.674230] R13: ffff8800cba8c5b8 R14: 0000000000000001 R15: ffff8800cba8d700
      [  166.674230] FS:  00007fdb5f04a740(0000) GS:ffff88011a800000(0000) knlGS:0000000000000000
      [  166.674230] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  166.674230] CR2: 0000000000000000 CR3: 00000000cf929000 CR4: 00000000000006e0
      [  166.674230] Stack:
      [  166.674230]  ffff8800cae7f7e8 ffffffff814b73e8 ffff8800cba8d6e8 ffff8800cae7f828
      [  166.674230]  ffffffff817caeec 0000000000000046 ffff8800cba8c5b0 ffff8800cba8c5b8
      [  166.674230]  0000000000000000 0000000000000001 ffff8800cf8e33e8 ffff8800cae7f848
      [  166.674230] Call Trace:
      [  166.674230]  [<ffffffff814b73e8>] list_del+0xd/0x2b
      [  166.674230]  [<ffffffff817caeec>] tcf_action_destroy+0x4c/0x71
      [  166.674230]  [<ffffffff817ca0ce>] tcf_exts_destroy+0x20/0x2d
      [  166.674230]  [<ffffffff817ec2b5>] tcindex_delete+0x196/0x1b7
      
      struct list_head cannot simply be copied; we must always initialize it.
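Why a plain struct copy is unsafe here: an empty list head points to itself, so a byte-for-byte copy ends up pointing at the original head rather than at the copy. The following is a minimal user-space sketch. `struct list_head`, `INIT_LIST_HEAD()` and `list_empty()` are simplified re-implementations of the kernel helpers, while `struct exts` and `exts_copy()` are hypothetical.

```c
#include <assert.h>

struct list_head {
	struct list_head *next, *prev;
};

/* An empty list head points to itself in both directions. */
static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Hypothetical container with an embedded list head. */
struct exts {
	int handle;
	struct list_head actions;
};

/* Copy the plain fields, but give the copy its own, freshly initialised list. */
static void exts_copy(struct exts *dst, const struct exts *src)
{
	*dst = *src;                   /* dst->actions now points into src */
	INIT_LIST_HEAD(&dst->actions); /* make it a valid empty list again */
}
```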
      
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6e056569
    • David S. Miller's avatar
      Merge branch 'udp_gso' · 25e379c4
      David S. Miller authored
      
      
      Tom Herbert says:
      
      ====================
      udp: Generalize GSO for UDP tunnels
      
      This patch set generalizes the UDP tunnel segmentation functions so
      that they can work with various protocol encapsulations. The primary
      change is to set the inner_protocol field in the skbuff when creating
      the encapsulated packet, and then in skb_udp_tunnel_segment this data
      is used to determine the function for segmenting the encapsulated
      packet. The inner_protocol field is overloaded to take either an
      Ethertype or IP protocol.
      
      The inner_protocol is set on transmit using skb_set_inner_ipproto or
      skb_set_inner_protocol functions. VXLAN and IP tunnels (for fou GSO)
      were modified to call these.
      
      Notes:
        - GSO for GRE/UDP where GRE checksum is enabled does not work.
          Handling this will require some special case code.
        - Software GSO now supports many varieties of encapsulation with
          SKB_GSO_UDP_TUNNEL{_CSUM}. We still need a mechanism to query
          for device support of particular combinations (I intend to
          add ndo_gso_check for that).
        - MPLS seems to be the only previous user of inner_protocol. I don't
          believe these patches can affect that. For supporting GSO with
          MPLS over UDP, the inner_protocol should be set using the
          helper functions in this patch.
        - GSO for L2TP/UDP should also be straightforward now.
      
      v2:
        - Respin for Eric's restructuring of skbuff.
      
      Tested GRE, IPIP, and SIT over fou as well as VXLAN. This was
      done using 200 TCP_STREAMs in netperf.
      
       GRE
          IPv4, FOU, UDP checksum enabled
            TCP_STREAM TSO enabled on tun interface
              14.04% TX CPU utilization
              13.17% RX CPU utilization
              9211 Mbps
            TCP_STREAM TSO disabled on tun interface
              27.82% TX CPU utilization
              25.41% RX CPU utilization
              9336 Mbps
          IPv4, FOU, UDP checksum disabled
            TCP_STREAM TSO enabled on tun interface
              13.14% TX CPU utilization
              23.18% RX CPU utilization
              9277 Mbps
            TCP_STREAM TSO disabled on tun interface
              30.00% TX CPU utilization
              31.28% RX CPU utilization
              9327 Mbps
      
        IPIP
          FOU, UDP checksum enabled
            TCP_STREAM TSO enabled on tun interface
              15.28% TX CPU utilization
              13.92% RX CPU utilization
              9342 Mbps
            TCP_STREAM TSO disabled on tun interface
              27.82% TX CPU utilization
              25.41% RX CPU utilization
              9336 Mbps
          FOU, UDP checksum disabled
            TCP_STREAM TSO enabled on tun interface
              15.08% TX CPU utilization
              24.64% RX CPU utilization
              9226 Mbps
            TCP_STREAM TSO disabled on tun interface
              30.00% TX CPU utilization
              31.28% RX CPU utilization
              9327 Mbps
      
        SIT
          FOU, UDP checksum enabled
            TCP_STREAM TSO enabled on tun interface
              14.47% TX CPU utilization
              14.58% RX CPU utilization
              9106 Mbps
            TCP_STREAM TSO disabled on tun interface
              31.82% TX CPU utilization
              30.82% RX CPU utilization
              9204 Mbps
          FOU, UDP checksum disabled
            TCP_STREAM TSO enabled on tun interface
              15.70% TX CPU utilization
              27.93% RX CPU utilization
              9097 Mbps
            TCP_STREAM TSO disabled on tun interface
              33.48% TX CPU utilization
              37.36% RX CPU utilization
              9197 Mbps
      
         VXLAN
            TCP_STREAM TSO enabled on tun interface
              16.42% TX CPU utilization
              23.66% RX CPU utilization
              9081 Mbps
            TCP_STREAM TSO disabled on tun interface
              30.32% TX CPU utilization
              30.55% RX CPU utilization
              9185 Mbps
      
         Baseline (no encap, TSO and LRO enabled)
            TCP_STREAM
              11.85% TX CPU utilization
              15.13% RX CPU utilization
              9452 Mbps
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      25e379c4
    • Tom Herbert's avatar
      vxlan: Set inner protocol before transmit · 996c9fd1
      Tom Herbert authored
      
      
      Call skb_set_inner_protocol to set inner Ethernet protocol to
      ETH_P_TEB before transmit. This is needed for GSO with UDP tunnels.
      
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      996c9fd1
    • Tom Herbert's avatar
      gre: Set inner protocol in v4 and v6 GRE transmit · 54bc9bac
      Tom Herbert authored
      
      
      Call skb_set_inner_protocol to set the inner Ethernet protocol to the
      protocol being encapsulated by GRE before tunnel_xmit. This is
      needed for GSO if UDP encapsulation (fou) is being done.
      
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      54bc9bac
    • Tom Herbert's avatar
      ipip: Set inner IP protocol in ipip · 077c5a09
      Tom Herbert authored
      
      
      Call skb_set_inner_ipproto to set inner IP protocol to IPPROTO_IPV4
      before tunnel_xmit. This is needed if UDP encapsulation (fou) is
      being done.
      
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      077c5a09
    • Tom Herbert's avatar
      sit: Set inner IP protocol in sit · 469471cd
      Tom Herbert authored
      
      
      Call skb_set_inner_ipproto to set inner IP protocol to IPPROTO_IPV6
      before tunnel_xmit. This is needed if UDP encapsulation (fou) is
      being done.
      
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      469471cd
    • Tom Herbert's avatar
      udp: Generalize skb_udp_segment · 8bce6d7d
      Tom Herbert authored
      
      
      skb_udp_segment is the function called from udp4_ufo_fragment to
      segment a UDP tunnel packet. This function currently assumes
      segmentation is transparent Ethernet bridging (i.e. VXLAN
      encapsulation). This patch generalizes the function to
      operate on either Ethertype or IP protocol.
      
      The inner_protocol field must be set to the protocol of the inner
      header. This can now be either an Ethertype or an IP protocol
      (in a union). A new flag in the skbuff indicates which type is
      effective. skb_set_inner_protocol and skb_set_inner_ipproto
      helper functions were added to set the inner_protocol. These
      functions are called from the point where the tunnel encapsulation
      is occurring.
      
      When skb_udp_tunnel_segment is called, the function to segment the
      inner packet is selected based on the inner IP or Ethertype. In the
      case of an IP protocol encapsulation, the function is derived from
      inet[6]_offloads. In the case of Ethertype, skb->protocol is
      set to the inner_protocol and skb_mac_gso_segment is called. (GRE
      currently does this, but it might be possible to look up the protocol
      in offload_base and call the appropriate segmentation function
      directly).
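The dispatch described above (one field holding either an Ethertype or an IP protocol, plus a flag saying which) can be sketched as a tagged union. This is a user-space illustration only: `struct pkt`, `set_inner_protocol()`, `set_inner_ipproto()` and `pick_segmenter()` are hypothetical names that mirror the roles of the helpers named in the commit, not their actual definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Which member of the union below is currently valid. */
enum inner_type { INNER_ETHERTYPE, INNER_IPPROTO };

struct pkt {
	enum inner_type inner_type;
	union {
		uint16_t ethertype;   /* e.g. 0x6558 for transparent Ethernet bridging */
		uint8_t ipproto;      /* e.g. 41 for IPv6-in-IP */
	} inner;
};

/* Set on the transmit path, at the point where encapsulation happens. */
static void set_inner_protocol(struct pkt *p, uint16_t ethertype)
{
	p->inner_type = INNER_ETHERTYPE;
	p->inner.ethertype = ethertype;
}

static void set_inner_ipproto(struct pkt *p, uint8_t ipproto)
{
	p->inner_type = INNER_IPPROTO;
	p->inner.ipproto = ipproto;
}

/* The two segmentation routes the commit message distinguishes. */
enum seg_path { SEG_MAC_GSO, SEG_INET_OFFLOAD };

/* Choose how to segment the inner packet from the recorded type. */
static enum seg_path pick_segmenter(const struct pkt *p)
{
	switch (p->inner_type) {
	case INNER_IPPROTO:
		return SEG_INET_OFFLOAD;   /* look up by IP protocol number */
	case INNER_ETHERTYPE:
		return SEG_MAC_GSO;        /* treat inner data as an Ethernet frame */
	}
	assert(0 && "unknown inner type");
	return SEG_MAC_GSO;
}
```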
      
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8bce6d7d
    • David S. Miller's avatar
      Merge branch 'bpf-next' · f44d61cd
      David S. Miller authored
      
      
      Alexei Starovoitov says:
      
      ====================
      bpf: add search pruning optimization and tests
      
      patch #1 commit log explains why eBPF verifier has to examine some
      instructions multiple times and describes the search pruning optimization
      that improves verification speed for branchy programs and allows more
      complex programs to be verified successfully.
      This patch completes the core verifier logic.
      
      patch #2 adds more verifier tests related to branches and search pruning
      
      I'm still working on Andy's 'bitmask for stack slots' suggestion. It will be
      done on top of this patch.
      
      The current verifier algorithm is brute force depth first search with
      state pruning. If anyone can come up with another algorithm that demonstrates
      better results, we'll replace the algorithm without affecting user space.
      
      Note verifier doesn't guarantee that all possible valid programs are accepted.
      Overly complex programs may still be rejected.
      Verifier improvements/optimizations will guarantee that if a program
      was passing verification in the past, it will still be passing.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f44d61cd
    • Alexei Starovoitov's avatar
      bpf: add tests to verifier testsuite · fd10c2ef
      Alexei Starovoitov authored
      
      
      add 4 extra tests to cover jump verification better
      
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fd10c2ef
    • Alexei Starovoitov's avatar
      bpf: add search pruning optimization to verifier · f1bca824
      Alexei Starovoitov authored
      
      
      consider C program represented in eBPF:
      int filter(int arg)
      {
          int a, b, c, *ptr;
      
          if (arg == 1)
              ptr = &a;
          else if (arg == 2)
              ptr = &b;
          else
              ptr = &c;
      
          *ptr = 0;
          return 0;
      }
      eBPF verifier has to follow all possible paths through the program
      to recognize that '*ptr = 0' instruction would be safe to execute
      in all situations.
      It does so by picking a path towards the end and observing changes
      to registers and stack at every insn until it reaches bpf_exit.
      Then it comes back to one of the previous branches and goes towards
      the end again with potentially different values in registers.
      When a program has a lot of branches, the number of possible combinations
      of branches is huge, so the verifier has a hard limit of walking no more
      than 32k instructions. This limit can be reached and complex (but valid)
      programs could be rejected. Therefore it's important to recognize equivalent
      verifier states to prune this depth first search.
      
      Basic idea can be illustrated by the program (where .. are some eBPF insns):
          1: ..
          2: if (rX == rY) goto 4
          3: ..
          4: ..
          5: ..
          6: bpf_exit
      In the first pass towards bpf_exit the verifier will walk insns: 1, 2, 3, 4, 5, 6
      Since insn#2 is a branch the verifier will remember its state in verifier stack
      to come back to it later.
      Since insn#4 is marked as 'branch target', the verifier will remember its state
      in explored_states[4] linked list.
      Once it reaches insn#6 successfully it will pop the state recorded at insn#2 and
      will continue.
      Without search pruning optimization verifier would have to walk 4, 5, 6 again,
      effectively simulating execution of insns 1, 2, 4, 5, 6
      With search pruning it will check whether state at #4 after jumping from #2
      is equivalent to one recorded in explored_states[4] during first pass.
      If there is an equivalent state, verifier can prune the search at #4 and declare
      this path to be safe as well.
      In other words two states at #4 are equivalent if execution of 1, 2, 3, 4 insns
      and 1, 2, 4 insns produces equivalent registers and stack.
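The walk described above can be sketched as a toy depth-first search. This is a heavily simplified model under stated assumptions: the whole "verifier state" is a single integer, jumps only go forward, the program ends in an exit instruction, and `verify()`, `struct insn` and the opcodes are hypothetical names, not the kernel verifier's. It counts how many instructions are walked with and without pruning.

```c
#include <assert.h>
#include <stdbool.h>

enum op { OP_NOP, OP_SET, OP_JCOND, OP_EXIT };

/* arg: new state value for OP_SET, jump target index for OP_JCOND. */
struct insn { enum op op; int arg; };

#define MAX_INSNS  16
#define MAX_STATES 8

/* Walk every path of 'prog' and return how many instructions were walked. */
static int verify(const struct insn *prog, int n, bool prune)
{
	int explored[MAX_INSNS][MAX_STATES];   /* states seen at branch targets */
	int nexplored[MAX_INSNS] = { 0 };
	bool is_target[MAX_INSNS] = { false };
	struct { int pc, state; } stack[MAX_INSNS];
	int sp = 0, walked = 0;

	assert(n > 0 && n <= MAX_INSNS && prog[n - 1].op == OP_EXIT);
	for (int i = 0; i < n; i++) {
		if (prog[i].op == OP_JCOND) {
			assert(prog[i].arg > i && prog[i].arg < n); /* forward only */
			is_target[prog[i].arg] = true;
		}
	}

	stack[sp].pc = 0;
	stack[sp].state = 0;
	sp++;
	while (sp > 0) {
		int pc, state;

		sp--;
		pc = stack[sp].pc;
		state = stack[sp].state;
		for (;;) {
			assert(pc < n);
			if (prune && is_target[pc]) {
				bool seen = false;

				for (int k = 0; k < nexplored[pc]; k++)
					if (explored[pc][k] == state)
						seen = true;
				if (seen)
					break;  /* equivalent state already verified */
				assert(nexplored[pc] < MAX_STATES);
				explored[pc][nexplored[pc]++] = state;
			}
			walked++;
			if (prog[pc].op == OP_EXIT)
				break;
			if (prog[pc].op == OP_SET)
				state = prog[pc].arg;
			if (prog[pc].op == OP_JCOND) {
				/* Remember the taken branch; explore it later. */
				assert(sp < MAX_INSNS);
				stack[sp].pc = prog[pc].arg;
				stack[sp].state = state;
				sp++;
			}
			pc++;
		}
	}
	return walked;
}
```

For the six-instruction example in the text (a branch at insn 2 targeting insn 4), this model walks nine instructions without pruning and six with it, provided insn 3 leaves the state unchanged; if insn 3 alters the state, the two states at insn 4 differ and nothing is pruned.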
      
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f1bca824
    • Nimrod Andy's avatar
      net: fec: implement rx_copybreak to improve rx performance · 1b7bde6d
      Nimrod Andy authored
      
      
      - Copy short frames and keep the buffers mapped, re-allocate skb instead of
        memory copy for long frames.
      - Add support for setting/getting rx_copybreak using generic ethtool tunable
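The copybreak policy in the first bullet can be sketched in user space as follows. This is only an illustration under assumptions: `struct rx_slot`, `rx_frame()` and `RX_BUF_SIZE` are hypothetical, plain heap buffers stand in for DMA-mapped skbs, and whether the real driver compares with `<` or `<=` is not stated in the commit message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define RX_BUF_SIZE 2048

/* One receive ring slot; 'buf' stands in for a DMA-mapped receive buffer. */
struct rx_slot {
	unsigned char *buf;
};

/*
 * Hand one received frame of 'len' bytes to the caller (who frees it).
 * Short frames are copied so the ring buffer stays in place ("mapped");
 * long frames are passed up as-is and the slot gets a fresh buffer.
 * Returns NULL on allocation failure, leaving the slot untouched.
 */
static unsigned char *rx_frame(struct rx_slot *slot, size_t len,
			       size_t copybreak, int *reused)
{
	unsigned char *out, *fresh;

	assert(len <= RX_BUF_SIZE);
	if (len < copybreak) {
		out = malloc(len ? len : 1);
		if (!out)
			return NULL;
		memcpy(out, slot->buf, len);
		*reused = 1;           /* ring buffer is reused for the next frame */
		return out;
	}

	fresh = malloc(RX_BUF_SIZE);
	if (!fresh)
		return NULL;
	out = slot->buf;               /* no copy: give the full buffer away */
	slot->buf = fresh;
	*reused = 0;
	return out;
}
```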
      
      Changes V3:
      * Following Eric Dumazet's suggestion, remove the copybreak module parameter
        and keep only the ethtool API support for rx_copybreak.
      
      Changes V2:
      * Implements rx_copybreak
      * Rx_copybreak provides module parameter to change this value
      * Add tunable_ops support for rx_copybreak
      
      Signed-off-by: Fugang Duan <B38611@freescale.com>
      Signed-off-by: Frank Li <Frank.Li@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1b7bde6d
    • Eric Dumazet's avatar
      net: avoid one atomic operation in skb_clone() · ce1a4ea3
      Eric Dumazet authored
      
      
      Fast clone cloning can actually avoid an atomic_inc(), if we
      guarantee prior clone_ref value is 1.
      
      This requires a change to kfree_skbmem(), to perform the
      atomic_dec_and_test() on clone_ref before setting fclone to
      SKB_FCLONE_UNAVAILABLE.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ce1a4ea3
    • Fabian Frederick's avatar
      net/dccp/ccid.c: add __init to ccid_activate · e500f488
      Fabian Frederick authored
      
      
      ccid_activate is only called by __init ccid_initialize_builtins in same module.
      
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e500f488
    • Fabian Frederick's avatar
      net/dccp/proto.c: add __init to dccp_mib_init · 0c5b8a46
      Fabian Frederick authored
      
      
      dccp_mib_init is only called by __init dccp_init in same module.
      
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0c5b8a46
    • David S. Miller's avatar
      Merge branch 'r8152' · 07544764
      David S. Miller authored
      
      
      Hayes Wang says:
      
      ====================
      r8152: patches about firmware
      
      These patches fix issues that occur when firmware is already present.

      On a multi-OS system, the firmware may have been loaded by the
      driver of another OS, and the Linux driver can interfere
      with it.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      07544764
    • hayeswang's avatar
      r8152: disable power cut for RTL8153 · 49be1723
      hayeswang authored
      
      
      The firmware would be cleared when power cut is enabled for
      RTL8153.
      
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      49be1723
    • hayeswang's avatar
      r8152: remove clearing bp · 204c8704
      hayeswang authored
      
      
      xxx_clear_bp() is used to halt the firmware. It is only necessary
      when loading new firmware. Besides, depending on the version of
      the current firmware, halting it directly may cause problems.
      Finally, halting the firmware renders the firmware code
      useless, so the bugs that the firmware fixes would reappear.
      
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      204c8704
    • Vlad Yasevich's avatar
      bnx2: Correctly receive full sized 802.1ad fragmes · 1b0ecb28
      Vlad Yasevich authored
      
      
      This driver, similar to tg3, has a check that will
      cause full sized 802.1ad frames to be dropped.  The
      frame will be larger than the standard mtu due to the
      presence of a vlan header that has not been stripped.
      The driver should not drop this frame and should process
      it just like it does for 802.1q.
      
      CC: Sony Chacko <sony.chacko@qlogic.com>
      CC: Dept-HSGLinuxNICDev@qlogic.com
      Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1b0ecb28
    • Vlad Yasevich's avatar
      tg3: Allow for recieve of full-size 8021AD frames · 7d3083ee
      Vlad Yasevich authored
      
      
      When receiving a vlan-tagged frame that still contains
      a vlan header, the length of the packet will be greater
      than MTU+ETH_HLEN since it will account for the extra
      vlan header.  TG3 handles this case for 802.1Q,
      but not for 802.1ad.  As a result, full sized 802.1ad
      frames get dropped by the card.

      Add a check for 802.1ad protocol when receiving full
      sized frames.
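The length check described above can be sketched as a small predicate. This is an assumption-laden illustration rather than the driver's code: `rx_frame_too_long()` is a hypothetical name, and the constants are the well-known Ethernet header length and the 802.1Q/802.1ad EtherType values, defined locally so the example is self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETH_HLEN      14       /* Ethernet header length in bytes */
#define ETH_P_8021Q   0x8100   /* 802.1Q VLAN tag EtherType */
#define ETH_P_8021AD  0x88A8   /* 802.1ad service VLAN tag EtherType */

/*
 * Decide whether a received frame should be dropped as oversized.
 * A frame longer than MTU + ETH_HLEN is tolerated when it still carries
 * an unstripped VLAN header, for 802.1Q and for 802.1ad alike.
 */
static bool rx_frame_too_long(unsigned int len, unsigned int mtu,
			      uint16_t proto)
{
	assert(mtu > 0);
	if (len <= mtu + ETH_HLEN)
		return false;
	return proto != ETH_P_8021Q && proto != ETH_P_8021AD;
}
```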
      
      Suggested-by: Prashant Sreedharan <prashant@broadcom.com>
      CC: Prashant Sreedharan <prashant@broadcom.com>
      CC: Michael Chan <mchan@broadcom.com>
      Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7d3083ee
    • Florian Westphal's avatar
      r8169: add support for Byte Queue Limits · 1e918876
      Florian Westphal authored
      
      
      tested on RTL8168d/8111d model using 'super_netperf 40' with TCP/UDP_STREAM.
      
      Output of
      while true; do
          for n in inflight limit; do
                echo -n $n\ ; cat $n;
          done;
          sleep 1;
      done
      
      during netperf run, 100mbit peer:
      
      inflight 0
      limit 3028
      inflight 6056
      limit 4542
      
      [ trimmed output for brevity, no limit/inflight changes during
        test steady-state ]
      
      limit 4542
      inflight 3028
      limit 6122
      inflight 0
      limit 6122
      [ changed cable to 1gbit peer, restart netperf ]
      inflight 37850
      limit 36336
      inflight 33308
      limit 31794
      inflight 33308
      limit 31794
      inflight 27252
      limit 25738
      [ again, no changes during test ]
      inflight 27252
      limit 25738
      inflight 0
      limit 28766
      [ change cable to 100mbit peer, restart netperf ]
      limit 28766
      inflight 27370
      limit 28766
      inflight 4542
      limit 5990
      inflight 6056
      limit 4542
      [ .. ]
      inflight 6056
      limit 4542
      inflight 0
      
      [end of test]
      
      Cc: Francois Romieu <romieu@fr.zoreil.com>
      Cc: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1e918876
    • Eric Dumazet's avatar
      net: cleanup and document skb fclone layout · d0bf4a9e
      Eric Dumazet authored
      
      
      Let's use a proper structure to clearly document and implement
      skb fast clones.

      Then we can experiment with alternative layouts more easily.

      This patch adds a new skb_fclone_busy() helper, used by tcp and xfrm,
      to stop leaking implementation details.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d0bf4a9e
    • Yuchung Cheng's avatar
      tcp: abort orphan sockets stalling on zero window probes · b248230c
      Yuchung Cheng authored
      Currently we have two different policies for orphan sockets
      that repeatedly stall on zero window ACKs. If a socket gets
      a zero window ACK when it is transmitting data, the RTO is
      used to probe the window. The socket is aborted after roughly
      tcp_orphan_retries() retries (as in tcp_write_timeout()).
      
      But if the socket was idle when it received the zero window ACK,
      and later wants to send more data, we use the probe timer to
      probe the window. If the receiver always returns zero window ACKs,
      icsk_probes keeps getting reset in tcp_ack() and the orphan socket
      can stall forever until the system reaches the orphan limit (as
      commented in tcp_probe_timer()). This opens up a simple attack
      to create lots of hanging orphan sockets to burn the memory
      and the CPU, as demonstrated in the recent netdev post "TCP
      connection will hang in FIN_WAIT1 after closing if zero window is
      advertised." http://www.spinics.net/lists/netdev/msg296539.html
      
      
      
      This patch follows the design in RTO-based probe: we abort an orphan
      socket stalling on zero window when the probe timer reaches both
      the maximum backoff and the maximum RTO. For example, a 100ms RTT
      connection will time out after roughly 153 seconds (0.3 + 0.6 +
      ... + 76.8) if the receiver keeps the window shut. If the orphan
      socket passes this check, but the system already has too many orphans
      (as in tcp_out_of_resources()), we still abort it but we'll also
      send an RST packet as the connection may still be active.
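The 153-second figure can be reproduced by summing the doubling probe intervals. The sketch below assumes a first interval of 300 ms and a largest interval of 76800 ms, both read off the example in the message. The function name and its parameters are illustrative only and do not mirror kernel code.

```c
#include <assert.h>
#include <limits.h>

/*
 * Sum the probe intervals first_ms, 2*first_ms, 4*first_ms, ... up to
 * and including the largest one that does not exceed cap_ms.
 * Hypothetical helper for the arithmetic only, not kernel logic.
 */
unsigned long probe_abort_time_ms(unsigned long first_ms, unsigned long cap_ms)
{
	unsigned long total = 0;
	unsigned long interval;

	/* Guard against a zero step and against the doubling wrapping around. */
	assert(first_ms > 0 && cap_ms <= ULONG_MAX / 2);

	for (interval = first_ms; interval <= cap_ms; interval *= 2)
		total += interval;
	return total;
}
```

With `first_ms = 300` and `cap_ms = 76800` the nine intervals add up to 153300 ms, which matches the "roughly 153 seconds" quoted in the message.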
      
      In addition, we change TCP_USER_TIMEOUT to cover (live or dead)
      sockets stalled on zero-window probes. This changes the semantics
      of TCP_USER_TIMEOUT slightly because it previously applied only
      when the socket had pending transmission.
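For context, this is how an application would typically opt in to TCP_USER_TIMEOUT on Linux. The sketch uses the standard `setsockopt()` call with the option value given in milliseconds. The wrapper name is an assumption made for illustration, and nothing here is taken from the patch itself.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Set TCP_USER_TIMEOUT on a TCP socket. timeout_ms is in milliseconds;
 * 0 restores the system default. Returns 0 on success, -1 on error
 * (errno is set by setsockopt). Hypothetical wrapper name.
 */
int set_user_timeout(int fd, unsigned int timeout_ms)
{
	return setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
			  &timeout_ms, sizeof(timeout_ms));
}
```

Per the message, a socket configured this way is also covered by the timeout while it is stalled on zero-window probes, not only while it has data pending transmission.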
      
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Reported-by: Andrey Dmitrov <andrey.dmitrov@oktetlabs.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b248230c
    • Linus Torvalds's avatar
      Merge branch 'for-3.17' of git://linux-nfs.org/~bfields/linux · a44f8672
      Linus Torvalds authored
      Pull nfsd bugfix from Bruce Fields:
       "This fixes a data corruption bug introduced by the v3.16 xdr encoding
        rewrite.  I haven't managed to reproduce it myself yet, but it's
        apparently not hard to hit given the right workload"
      
      * 'for-3.17' of git://linux-nfs.org/~bfields/linux:
        nfsd4: fix corruption of NFSv4 read data
      a44f8672
    • Fabian Frederick's avatar
      cipso: add __init to cipso_v4_cache_init · cb57659a
      Fabian Frederick authored
      
      
      cipso_v4_cache_init is only called by __init cipso_v4_init
      
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cb57659a
    • Fabian Frederick's avatar
      inet: frags: add __init to ip4_frags_ctl_register · 57a02c39
      Fabian Frederick authored
      
      
      ip4_frags_ctl_register is only called by __init ipfrag_init
      
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      57a02c39
    • Fabian Frederick's avatar
      tcp: add __init to tcp_init_mem · 47d7a88c
      Fabian Frederick authored
      
      
      tcp_init_mem is only called by __init tcp_init.
      
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      47d7a88c
    • Chun-Hao Lin's avatar
      r8169:call "rtl8168_driver_start" "rtl8168_driver_stop" only when hardware dash function is enabled · ee7a1beb
      Chun-Hao Lin authored
      
      
      These two functions are used to inform the DASH firmware that the driver
      is being brought up or brought down. So call these two functions only when
      the hardware DASH function is enabled.
      
      Signed-off-by: Chun-Hao Lin <hau@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ee7a1beb
    • Chun-Hao Lin's avatar
      r8169:modify the behavior of function "rtl8168_oob_notify" · 2a9b4d96
      Chun-Hao Lin authored
      
      
      In function "rtl8168_oob_notify", use function "rtl_eri_write" to access
      ERI register 0xe8, instead of using the MAC registers "ERIDR" and "ERIAR"
      to access it.
      
      To use function "rtl_eri_write" in function "rtl8168_oob_notify", the
      "rtl8168_oob_notify" related functions need to move below the function
      "rtl_eri_write".
      
      Signed-off-by: Chun-Hao Lin <hau@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2a9b4d96
    • Chun-Hao Lin's avatar
      r8169:change the name of function "r8168dp_check_dash" to "r8168_check_dash" · 2f8c040c
      Chun-Hao Lin authored
      
      
      The DASH function is supported not only by RTL8168DP but also by RTL8168EP.
      So change the name of function "r8168dp_check_dash" to "r8168_check_dash".
      
      Signed-off-by: Chun-Hao Lin <hau@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2f8c040c
    • Chun-Hao Lin's avatar
      r8169:change the name of function"rtl_w1w0_eri" · 706123d0
      Chun-Hao Lin authored
      
      
      Change the name of function "rtl_w1w0_eri" to "rtl_w0w1_eri".
      
      In this function, the local variable "val" is updated by "write zeros then
      write ones". See the code below:
      
      (val & ~m) | p
      
      In this patch, change the function name from "xx_w1w0_xx" to "xx_w0w1_xx".
      The new function name better matches its behavior.
      
      Signed-off-by: Chun-Hao Lin <hau@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      706123d0
    • Chun-Hao Lin's avatar
      r8169:for function "rtl_w1w0_phy" change its name and behavior · 76564428
      Chun-Hao Lin authored
      
      
      Change the function name from "rtl_w1w0_phy" to "rtl_w0w1_phy",
      and change its behavior from "write ones then write zeros" to
      "write zeros then write ones".
      
      In the Realtek internal driver, bitwise operations are almost always "write
      zeros then write ones". To make it easier to port hardware parameters from
      the Realtek internal driver to the Linux kernel driver "r8169", we would
      like to change this function's behavior and its name.
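The two orderings give the same result unless a bit appears in both masks, in which case the operation applied last wins. The sketch below shows both as plain functions. The names `w1w0` and `w0w1` and the parameter names `p` (bits to set) and `m` (bits to clear) are assumptions modelled on the commit text, and the functions are not the driver's actual register accessors.

```c
#include <stdint.h>

/* "write ones then write zeros": set the bits in p, then clear the bits in m. */
uint32_t w1w0(uint32_t val, uint32_t p, uint32_t m)
{
	return (val | p) & ~m;
}

/* "write zeros then write ones": clear the bits in m, then set the bits in p. */
uint32_t w0w1(uint32_t val, uint32_t p, uint32_t m)
{
	return (val & ~m) | p;
}
```

For `val = 0x00f0`, `p = 0x0003`, `m = 0x0001`, bit 0 is in both masks: `w1w0` returns `0x00f2` (the clear wins) while `w0w1` returns `0x00f3` (the set wins). When `p` and `m` share no bits, both functions return the same value, so only callers passing overlapping masks see a behavior change.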
      
      Signed-off-by: Chun-Hao Lin <hau@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      76564428