  1. Jan 04, 2017
    • tipc: reduce risk of user starvation during link congestion · 365ad353
      Jon Paul Maloy authored
      
      
      The socket code currently handles link congestion by either blocking
      and trying to send again when the congestion has abated, or just
      returning -EAGAIN to the user and letting the caller retry later.
      
      This mechanism is prone to starvation, because the wakeup algorithm is
      non-atomic. Between the moment the link issues a wakeup signal and the
      moment the socket wakes up and re-attempts sending, other senders may
      have come in and occupied the free buffer space in the link. This in
      turn may force a socket to make many send attempts before it
      succeeds. In extremely loaded systems we have observed latencies of
      several seconds before a low-priority socket is able to send out a
      message.
      
      In this commit, we simplify this mechanism and reduce the risk of the
      described scenario happening. When a send is attempted over a
      congested link, we now let the message be added to the link's backlog
      queue anyway, thus permitting an oversubscription of one message per
      source socket. We still create a wakeup item and return an error code,
      hence instructing the sender to block or stop sending. Only when enough
      space has been freed up in the link's backlog queue do we issue a
      wakeup event that allows the sender to continue with the next message,
      if any.
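
      The scheme described above can be illustrated with a small user-space
      model. This is only a sketch, not the kernel code: the class names,
      the ELINKCONG value, and the rule of waking one sender per free
      backlog slot are assumptions made for illustration.

```python
from collections import deque

ELINKCONG = 1  # hypothetical stand-in for the link's congestion code


class Socket:
    """Toy sender that stops sending while it is marked congested."""

    def __init__(self, name):
        self.name = name
        self.congested = False

    def send(self, link, msg):
        if self.congested:
            # The caller would block here, or get -EAGAIN and retry later.
            return False
        # Even if the link reports congestion, the message has been queued,
        # so the socket can consider it sent.
        link.xmit(self, msg)
        return True


class Link:
    """Toy link with a soft backlog limit."""

    def __init__(self, limit):
        self.limit = limit
        self.backlog = deque()
        self.wakeups = deque()  # congested sockets, in arrival order

    def xmit(self, sock, msg):
        congested = len(self.backlog) >= self.limit
        # The message is accepted either way: at most one message of
        # oversubscription per source socket.
        self.backlog.append((sock.name, msg))
        if congested:
            self.wakeups.append(sock)  # the "wakeup item"
            sock.congested = True
            return ELINKCONG
        return 0

    def drain(self, count):
        """Transmit up to `count` messages, then wake senders if space allows."""
        for _ in range(min(count, len(self.backlog))):
            self.backlog.popleft()
        # Assumption: wake one waiting sender per free backlog slot.
        free = self.limit - len(self.backlog)
        for _ in range(min(max(free, 0), len(self.wakeups))):
            self.wakeups.popleft().congested = False
```

      In this model a congested socket never has more than one message
      above the limit in the backlog, and it is only released once the
      backlog has actually dropped below the limit, so a late-arriving
      sender cannot take the freed space ahead of it.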
      
      The fact that a socket now can consider a message sent even when the
      link returns a congestion code means that the sending socket code can
      be simplified. Also, since this is a good opportunity to get rid of the
      obsolete 'mtu change' condition in the three socket send functions, we
      now choose to refactor those functions completely.
      
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: modify struct tipc_plist to be more versatile · 4d8642d8
      Jon Paul Maloy authored
      
      
      During multicast reception we currently use a simple linked list with
      push/pop semantics to store port numbers.
      
      We now see a need for a more generic list for storing values of type
      u32. We therefore make some modifications to this list, while replacing
      the prefix 'tipc_plist_' with 'u32_'. We also add a couple of new
      functions which will be used in the next commits.
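
      As a rough illustration of what a generic u32 list with push/pop
      semantics might offer, here is a minimal user-space sketch. The class
      and method names are hypothetical and do not mirror the kernel
      functions; LIFO ordering and the find/delete helpers are assumptions.

```python
U32_MAX = 0xFFFFFFFF


class U32List:
    """Toy list of unsigned 32-bit values with push/pop semantics."""

    def __init__(self):
        self._items = []

    def push(self, value):
        # Reject anything that would not fit in a u32.
        if not 0 <= value <= U32_MAX:
            raise ValueError("value does not fit in u32")
        self._items.append(value)

    def pop(self):
        """Remove and return the most recently pushed value, or None if empty."""
        return self._items.pop() if self._items else None

    def find(self, value):
        """Return True if the value is present."""
        return value in self._items

    def delete(self, value):
        """Remove one occurrence of the value; return True if it was found."""
        if value in self._items:
            self._items.remove(value)
            return True
        return False

    def __len__(self):
        return len(self._items)
```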
      
      Acked-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: unify tipc_wait_for_sndpkt() and tipc_wait_for_sndmsg() functions · 8c44e1af
      Jon Paul Maloy authored
      
      
      The functions tipc_wait_for_sndpkt() and tipc_wait_for_sndmsg() are very
      similar. The latter function is also called from two locations, and
      there will be more in the coming commits, which will all need to test
      different conditions.

      Instead of making yet another duplicate of the function, we now
      introduce a new macro tipc_wait_for_cond() where the wakeup condition
      can be stated as an argument to the call. This macro replaces all
      current and future uses of the two functions, which can now be
      eliminated.
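
      The idea of passing the wakeup condition as an argument can be
      sketched in user space as follows. This is not the kernel macro: the
      function name wait_for_cond, the use of a condition variable, and the
      0 / -EAGAIN return convention are assumptions for illustration.

```python
import errno
import threading


def wait_for_cond(cond_var, predicate, timeout):
    """Block until predicate() is true or `timeout` seconds have elapsed.

    Returns 0 if the condition became true, or -errno.EAGAIN on timeout
    (an assumed convention for this sketch).
    """
    with cond_var:
        # Condition.wait_for re-evaluates the predicate after every
        # notification and returns its final value.
        ok = cond_var.wait_for(predicate, timeout=timeout)
    return 0 if ok else -errno.EAGAIN
```

      Each call site then supplies its own condition, for example
      wait_for_cond(cv, lambda: not state["congested"], 1.0), instead of
      requiring a separate wait function per condition.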
      
      Acked-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'TPACKET_V3-TX_RING-support' · aa276dd7
      David S. Miller authored
      
      
      Sowmini Varadhan says:
      
      ====================
      TPACKET_V3 TX_RING support
      
      This patch series allows an application to use a single PF_PACKET
      descriptor and leverage the best implementations of TX_RING
      and RX_RING that exist today.
      
      Patch 1 adds the kernel/Documentation changes for TX_RING
      support and patch 2 adds the associated test case in selftests.
      
      Changes since v2: additional sanity checks for setsockopt
      input for TX_RING/TPACKET_V3. Refactored psock_tpacket.c
      test code to avoid code duplication from V2.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tools: test case for TPACKET_V3/TX_RING support · fe878cad
      Sowmini Varadhan authored
      
      
      Add a test case and sample code for (TPACKET_V3, PACKET_TX_RING)
      
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • af_packet: TX_RING support for TPACKET_V3 · 7f953ab2
      Sowmini Varadhan authored
      
      
      Although TPACKET_V3 Rx has some benefits over TPACKET_V2 Rx, *_v3
      does not currently have TX_RING support. As a result an application
      that wants the best performance for Tx and Rx (e.g. to handle
      request/response transactions) ends up needing two sockets, one with
      *_v2 for Tx and another with *_v3 for Rx.
      
      This patch enables TPACKET_V2 compatible Tx features in TPACKET_V3
      so that an application can use a single descriptor to get the benefits
      of _v3 RX_RING and _v2 TX_RING. An application may do a block-send by
      first filling up multiple frames in the Tx ring and then triggering a
      transmit. This patch only supports fixed size Tx frames for TPACKET_V3,
      and requires tp_next_offset to be zero.
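
      The block-send pattern (fill several frames, then trigger one
      transmit) can be modelled in user space as below. This is a toy model
      and does not touch a real PF_PACKET socket: the TxRing class and its
      methods are hypothetical, the TP_STATUS_* values are copied from
      linux/if_packet.h, and marking a frame with a non-zero next offset as
      TP_STATUS_WRONG_FORMAT is an assumption about how a violation might
      be reported.

```python
# Frame status values as defined in linux/if_packet.h.
TP_STATUS_AVAILABLE = 0
TP_STATUS_SEND_REQUEST = 1
TP_STATUS_WRONG_FORMAT = 4


class TxRing:
    """Toy model of a fixed-size-frame Tx ring."""

    def __init__(self, frame_count, frame_size):
        self.frame_size = frame_size
        self.frames = [
            {"status": TP_STATUS_AVAILABLE, "next_offset": 0, "data": b""}
            for _ in range(frame_count)
        ]
        self.head = 0  # next frame the application fills
        self.tail = 0  # next frame the transmit path examines

    def fill(self, payload, next_offset=0):
        """Application side: place a payload in the next free frame."""
        frame = self.frames[self.head]
        if frame["status"] != TP_STATUS_AVAILABLE:
            return False  # ring is full
        if len(payload) > self.frame_size:
            raise ValueError("payload larger than the fixed frame size")
        frame["data"] = payload
        frame["next_offset"] = next_offset
        frame["status"] = TP_STATUS_SEND_REQUEST
        self.head = (self.head + 1) % len(self.frames)
        return True

    def flush(self):
        """Transmit trigger: send every consecutive frame marked for sending."""
        sent = []
        while self.frames[self.tail]["status"] == TP_STATUS_SEND_REQUEST:
            frame = self.frames[self.tail]
            if frame["next_offset"] != 0:
                # Only fixed-size frames are modelled, so reject this frame.
                frame["status"] = TP_STATUS_WRONG_FORMAT
                break
            sent.append(frame["data"])
            frame["status"] = TP_STATUS_AVAILABLE
            self.tail = (self.tail + 1) % len(self.frames)
        return sent
```

      Calling fill() three times and then flush() once returns the three
      payloads in order, which mirrors queueing several frames before a
      single transmit trigger.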
      
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. Jan 03, 2017