  1. May 18, 2015
    • Merge branch 'tcp_mem_pressure' · 5d48ef3e
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      tcp: better handling of memory pressure
      
      When testing commit 790ba456 ("tcp: set SOCK_NOSPACE under memory
      pressure") using edge triggered epoll applications, I found various
      issues under memory pressure and thousands of active sockets.
      
      This patch series is a first round to solve these issues, in send
      and receive paths. There are probably other fixes needed, but
      with this series, my tests now all succeed.
      
      v2: fix typo in "allow one skb to be received per socket under memory pressure",
      as spotted by Jason Baron.
      ====================
      
      Acked-by: Jason Baron <jbaron@akamai.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: halves tcp_mem[] limits · b66e91cc
      Eric Dumazet authored
      
      
      Allowing tcp to use ~19% of physical memory is way too much,
      and allowed bugs to be hidden. Add to this that some drivers use a full
      page per incoming frame, so real cost can be twice the advertised one.
      
      Reduce tcp_mem by 50% as a first step to sanity.
      
      tcp_mem[0,1,2] defaults are now 4.68%, 6.25%, 9.37% of physical memory.
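
The percentages above can be reproduced with a few lines of arithmetic. The sketch below is a user-space illustration, not the kernel's code: the helper name `tcp_mem_defaults`, the divisor of 16, and the 128-page floor are assumptions chosen so that the three thresholds come out at 3/64, 1/16 and 3/32 of the available pages, which matches the 4.68%, 6.25% and 9.37% quoted in the message.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: derive the three tcp_mem thresholds (in pages)
 * from the number of pages available for buffers.
 * Assumptions: limit = pages / 16, with a floor of 128 pages.
 */
static void tcp_mem_defaults(unsigned long free_pages, unsigned long mem[3])
{
	unsigned long limit = free_pages / 16;  /* 1/16 = 6.25% */

	assert(mem != NULL);
	if (limit < 128)
		limit = 128;                    /* assumed minimum */

	mem[0] = limit / 4 * 3;                 /* 3/64 = 4.6875% */
	mem[1] = limit;                         /* pressure threshold */
	mem[2] = mem[0] * 2;                    /* 3/32 = 9.375% */
}
```

With 4 KiB pages, a machine with 2^20 buffer pages (4 GiB) would get thresholds of 49152, 65536 and 98304 pages under these assumptions.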
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: allow one skb to be received per socket under memory pressure · 76dfa608
      Eric Dumazet authored
      
      
      While testing tight tcp_mem settings, I found tcp sessions could be
      stuck because we do not allow even one skb to be received on them.
      
      By allowing one skb to be received, we introduce fairness and
      eventually force memory hogs to release their allocation.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: fix behavior for epoll edge trigger · 8e4d980a
      Eric Dumazet authored
      Under memory pressure, tcp_sendmsg() can fail to queue a packet
      while no packet is present in write queue. If we return -EAGAIN
      with no packet in write queue, no ACK packet will ever come
      to raise EPOLLOUT.
      
      We need to allow one skb per TCP socket, and make sure that
      tcp sockets can release their forward allocations under pressure.
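
To see why this matters from user space, here is a minimal sketch of the usual edge-triggered write pattern. It is an illustration only: the helper names `set_nonblocking`, `arm_epollout_et` and `send_until_eagain` are hypothetical, and the code uses plain `write()` on any non-blocking descriptor rather than anything TCP-specific.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/types.h>
#include <unistd.h>

/* Edge-triggered epoll is normally used with non-blocking descriptors. */
static int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Register fd for edge-triggered writability notifications. */
static int arm_epollout_et(int epfd, int fd)
{
	struct epoll_event ev = { .events = EPOLLOUT | EPOLLET, .data.fd = fd };

	return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/*
 * Write until everything is sent or the kernel reports EAGAIN.
 * Returns the number of bytes written, or -1 on a hard error.
 * *need_epollout is set to 1 when the caller has to wait for the
 * next EPOLLOUT edge before retrying.
 */
static ssize_t send_until_eagain(int fd, const char *buf, size_t len,
				 int *need_epollout)
{
	size_t done = 0;

	assert(buf != NULL && need_epollout != NULL);
	*need_epollout = 0;
	while (done < len) {
		ssize_t n = write(fd, buf + done, len - done);

		if (n > 0) {
			done += (size_t)n;
			continue;
		}
		if (n < 0 && errno == EINTR)
			continue;               /* interrupted, retry */
		if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
			*need_epollout = 1;     /* wait for the next edge */
			break;
		}
		return -1;                      /* hard error or zero-length write */
	}
	return (ssize_t)done;
}
```

An application written this way stops writing on EAGAIN and relies on a later EPOLLOUT edge to resume. If EAGAIN is returned while the write queue is empty, nothing is in flight to be acknowledged, so that edge never arrives and the connection stalls.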
      
      This is a followup to commit 790ba456 ("tcp: set SOCK_NOSPACE
      under memory pressure")
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: introduce tcp_under_memory_pressure() · b8da51eb
      Eric Dumazet authored
      
      
      Introduce an optimized version of sk_under_memory_pressure()
      for TCP. Our intent is to use it in fast paths.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: rename sk_forced_wmem_schedule() to sk_forced_mem_schedule() · a6c5ea4c
      Eric Dumazet authored
      
      
      We plan to use sk_forced_wmem_schedule() in input path as well,
      so make it non static and rename it.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: fix sk_mem_reclaim_partial() · 1a24e04e
      Eric Dumazet authored
      
      
      The goal of sk_mem_reclaim_partial() is to ensure each socket keeps
      one SK_MEM_QUANTUM of forward allocation. This is needed both for
      performance and for better handling of memory pressure situations in
      follow up patches.
      
      SK_MEM_QUANTUM is currently a page, but might be reduced to 4096 bytes
      as some arches have 64KB pages.
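
One way to picture a partial reclaim is the toy model below. It is a user-space sketch under stated assumptions, not the kernel implementation: `struct toy_sock`, `toy_reclaim_partial` and the 4096-byte `QUANTUM` are hypothetical, and the model assumes memory is handed back only in whole quanta, so a socket is left with between 1 byte and one full quantum of forward allocation.

```c
#include <assert.h>

#define QUANTUM 4096L   /* assumed quantum size, standing in for SK_MEM_QUANTUM */

struct toy_sock {
	long forward_alloc;     /* bytes reserved for the socket but not yet used */
};

/*
 * Give back every whole quantum beyond the first one.
 * Returns the number of quanta returned to the shared pool.
 */
static long toy_reclaim_partial(struct toy_sock *sk)
{
	long quanta;

	assert(sk->forward_alloc >= 0);
	if (sk->forward_alloc <= QUANTUM)
		return 0;               /* at most one quantum held: keep it */

	/* The "- 1" keeps an exact multiple of QUANTUM from dropping to zero. */
	quanta = (sk->forward_alloc - 1) / QUANTUM;
	sk->forward_alloc -= quanta * QUANTUM;
	return quanta;
}
```

In this model a socket holding three quanta gives back two and keeps one, while a socket holding exactly one quantum is left alone.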
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-packet: fix null pointer exception in rollover mode · 4633c9e0
      Willem de Bruijn authored
      Rollover can be enabled as flag or mode. Allocate state in both cases.
      This solves a NULL pointer exception in fanout_demux_rollover on
      referencing po->rollover if using mode rollover.
      
      Also make sure that in rollover mode each silo is tried (contrary
      to rollover flag, where the main socket is excluded after an initial
      try_self).
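
The difference between "mode" and "flag" shows up in how user space builds the PACKET_FANOUT option value. The sketch below assumes the encoding described in packet(7), with the group id in the low 16 bits and the mode, optionally ORed with flags, in the high 16 bits. The helper name `fanout_arg` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <linux/if_packet.h>

/*
 * Hypothetical helper: pack a PACKET_FANOUT option value.
 * Low 16 bits: group id. High 16 bits: mode | flags.
 */
static uint32_t fanout_arg(uint16_t group_id, uint16_t mode, uint16_t flags)
{
	/* Modes are small integers and flags are high bits, so they never overlap. */
	assert((mode & flags) == 0);
	return (uint32_t)group_id | ((uint32_t)(mode | flags) << 16);
}
```

Rollover as a mode would be `fanout_arg(id, PACKET_FANOUT_ROLLOVER, 0)`, while rollover as a flag on top of hash balancing would be `fanout_arg(id, PACKET_FANOUT_HASH, PACKET_FANOUT_FLAG_ROLLOVER)`. The result is then passed to `setsockopt()` at level `SOL_PACKET` with option `PACKET_FANOUT`, which requires a packet socket and the privileges to open one.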
      
      Tested:
        Passes tools/testing/net/psock_fanout.c, which tests both modes and
        flag. My previous tests were limited to bench_rollover, which only
        stresses the flag. The test now completes safely. It still gives an
        error for mode rollover, because it does not expect the new headroom
        (ROOM_NORMAL) requirement. I will send a separate patch to the test.
      
      Fixes: 0648ab70 ("packet: rollover prepare: per-socket state")
      
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      
      ----
      
      I should have run this test and caught this before submission, of
      course. Apologies for the oversight.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: fix two sparse errors · c91d4606
      Eric Dumazet authored
      First one in __skb_checksum_validate_complete() fixes the following
      (and other callers)
      
      make C=2 CF=-D__CHECK_ENDIAN__ net/ipv4/tcp_ipv4.o
        CHECK   net/ipv4/tcp_ipv4.c
      include/linux/skbuff.h:3052:24: warning: incorrect type in return expression (different base types)
      include/linux/skbuff.h:3052:24:    expected restricted __sum16
      include/linux/skbuff.h:3052:24:    got int
      
      The second one fixes gso_make_checksum():
      
        CHECK   net/ipv4/gre_offload.c
      include/linux/skbuff.h:3360:14: warning: incorrect type in assignment (different base types)
      include/linux/skbuff.h:3360:14:    expected unsigned short [unsigned] [usertype] csum
      include/linux/skbuff.h:3360:14:    got restricted __sum16
      include/linux/skbuff.h:3365:16: warning: incorrect type in return expression (different base types)
      include/linux/skbuff.h:3365:16:    expected restricted __sum16
      include/linux/skbuff.h:3365:16:    got unsigned short [unsigned] [usertype] csum
      
      Fixes: 5a212329 ("net: Support for csum_bad in skbuff")
      Fixes: 7e2b10c1 ("net: Support for multiple checksums with gso")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      CC: Tom Herbert <tom@herbertland.com>
      Acked-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netfilter: synproxy: fix sparse errors · ba6d0564
      Eric Dumazet authored
      
      
      Fix verbose sparse errors:
      
      make C=2 CF=-D__CHECK_ENDIAN__ net/ipv4/netfilter/ipt_SYNPROXY.o
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipip: fix one sparse error · 252a8fbe
      Eric Dumazet authored
      make C=2 CF=-D__CHECK_ENDIAN__ net/ipv4/ipip.o
        CHECK   net/ipv4/ipip.c
      net/ipv4/ipip.c:254:27: warning: incorrect type in assignment (different base types)
      net/ipv4/ipip.c:254:27:    expected restricted __be32 [addressable] [usertype] o_key
      net/ipv4/ipip.c:254:27:    got restricted __be16 [addressable] [usertype] i_flags
      
      Fixes: 3b7b514f ("ipip: fix a regression in ioctl")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: fix sparse error in csum_replace4() · d53a2aa3
      Eric Dumazet authored
      make C=2 CF=-D__CHECK_ENDIAN__ net/ipv4/netfilter/nf_nat_l3proto_ipv4.o
        CHECK   net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
      include/net/checksum.h:125:64: warning: incorrect type in argument 2 (different base types)
      include/net/checksum.h:125:64:    expected restricted __wsum [usertype] addend
      include/net/checksum.h:125:64:    got restricted __be32 [usertype] from
      include/net/checksum.h:125:71: warning: incorrect type in argument 2 (different base types)
      include/net/checksum.h:125:71:    expected restricted __wsum [usertype] addend
      include/net/checksum.h:125:71:    got restricted __be32 [usertype] to
      include/net/checksum.h:125:64: warning: incorrect type in argument 2 (different base types)
      include/net/checksum.h:125:64:    expected restricted __wsum [usertype] addend
      include/net/checksum.h:125:64:    got restricted __be32 [usertype] from
      include/net/checksum.h:125:71: warning: incorrect type in argument 2 (different base types)
      include/net/checksum.h:125:71:    expected restricted __wsum [usertype] addend
      include/net/checksum.h:125:71:    got restricted __be32 [usertype] to
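
The warnings above concern type annotations only; the arithmetic behind csum_replace4() is an incremental update of a 16-bit Internet checksum when a 32-bit field changes. As a rough user-space illustration of that arithmetic, the sketch below uses equation 3 of RFC 1624 with plain integer types. It is not the kernel's code: the names `fold16`, `csum_full` and `csum_update32` are hypothetical, and the `__wsum` / `__be32` annotations that sparse checks are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit one's-complement accumulator down to 16 bits. */
static uint16_t fold16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Internet checksum over an even number of bytes, read as big-endian words. */
static uint16_t csum_full(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(len % 2 == 0);
	for (i = 0; i < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	return (uint16_t)~fold16(sum);
}

/*
 * Incremental update after a 32-bit field changes from 'from' to 'to'
 * (both in host order here), following RFC 1624 eqn. 3:
 *     HC' = ~(~HC + ~m + m')
 * applied to the two 16-bit halves of the field.
 */
static uint16_t csum_update32(uint16_t check, uint32_t from, uint32_t to)
{
	uint32_t sum = (uint16_t)~check;

	sum += (uint16_t)~(from >> 16);
	sum += (uint16_t)~(from & 0xffff);
	sum += to >> 16;
	sum += to & 0xffff;
	return (uint16_t)~fold16(sum);
}
```

Recomputing the checksum from scratch and updating it incrementally should agree for ordinary data, which makes `csum_full()` a convenient cross-check for `csum_update32()`.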
      
      Fixes: 4565af0d ("net: optimise csum_replace4()")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. May 17, 2015
  3. May 16, 2015
  4. May 15, 2015