Skip to content
  1. Apr 16, 2016
  2. Apr 15, 2016
    • Paolo Abeni's avatar
      tun: use per cpu variables for stats accounting · 608b9977
      Paolo Abeni authored
      
      
      Currently the tun device accounting uses dev->stats without applying any
      kind of protection, regardless that accounting happens in preemptible
      process context.
      This patch move the tun stats to a per cpu data structure, and protect
      the updates with  u64_stats_update_begin()/u64_stats_update_end() or
      this_cpu_inc according to the stat type. The per cpu stats are
      aggregated by the newly added ndo_get_stats64 ops.
      
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      608b9977
    • David S. Miller's avatar
      Merge branch 'bpf-ARG_PTR_TO_RAW_STACK' · 548aacdd
      David S. Miller authored
      
      
      Merge branch 'bpf-ARG_PTR_TO_RAW_STACK'
      
      Daniel Borkmann says:
      
      ====================
      BPF updates
      
      This series adds a new verifier argument type called
      ARG_PTR_TO_RAW_STACK and converts related helpers to make
      use of it. Basic idea is that we can save init of stack
      memory when the helper function is guaranteed to fully
      fill out the passed buffer in every path. Series also adds
      test cases and converts samples. For more details, please
      see individual patches.
      ====================
      
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      548aacdd
    • Daniel Borkmann's avatar
      bpf, samples: add test cases for raw stack · 3f2050e2
      Daniel Borkmann authored
      
      
      This adds test cases mostly around ARG_PTR_TO_RAW_STACK to check the
      verifier behaviour.
      
        [...]
        #84 raw_stack: no skb_load_bytes OK
        #85 raw_stack: skb_load_bytes, no init OK
        #86 raw_stack: skb_load_bytes, init OK
        #87 raw_stack: skb_load_bytes, spilled regs around bounds OK
        #88 raw_stack: skb_load_bytes, spilled regs corruption OK
        #89 raw_stack: skb_load_bytes, spilled regs corruption 2 OK
        #90 raw_stack: skb_load_bytes, spilled regs + data OK
        #91 raw_stack: skb_load_bytes, invalid access 1 OK
        #92 raw_stack: skb_load_bytes, invalid access 2 OK
        #93 raw_stack: skb_load_bytes, invalid access 3 OK
        #94 raw_stack: skb_load_bytes, invalid access 4 OK
        #95 raw_stack: skb_load_bytes, invalid access 5 OK
        #96 raw_stack: skb_load_bytes, invalid access 6 OK
        #97 raw_stack: skb_load_bytes, large access OK
        Summary: 98 PASSED, 0 FAILED
      
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3f2050e2
    • Daniel Borkmann's avatar
      bpf, samples: don't zero data when not needed · 02413cab
      Daniel Borkmann authored
      
      
      Remove the zero initialization in the sample programs where appropriate.
      Note that this is an optimization which is now possible, old programs
      still doing the zero initialization are just fine as well. Also, make
      sure we don't have padding issues when we don't memset() the entire
      struct anymore.
      
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      02413cab
    • Daniel Borkmann's avatar
      bpf: convert relevant helper args to ARG_PTR_TO_RAW_STACK · 074f528e
      Daniel Borkmann authored
      
      
      This patch converts all helpers that can use ARG_PTR_TO_RAW_STACK as argument
      type. For tc programs this is bpf_skb_load_bytes(), bpf_skb_get_tunnel_key(),
      bpf_skb_get_tunnel_opt(). For tracing, this optimizes bpf_get_current_comm()
      and bpf_probe_read(). The check in bpf_skb_load_bytes() for MAX_BPF_STACK can
      also be removed since the verifier already makes sure we stay within bounds
      on stack buffers.
      
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      074f528e
    • Daniel Borkmann's avatar
      bpf, verifier: add ARG_PTR_TO_RAW_STACK type · 435faee1
      Daniel Borkmann authored
      
      
      When passing buffers from eBPF stack space into a helper function, we have
      ARG_PTR_TO_STACK argument type for helpers available. The verifier makes sure
      that such buffers are initialized, within boundaries, etc.
      
      However, the downside with this is that we have a couple of helper functions
      such as bpf_skb_load_bytes() that fill out the passed buffer in the expected
      success case anyway, so zero initializing them prior to the helper call is
      unneeded/wasted instructions in the eBPF program that can be avoided.
      
      Therefore, add a new helper function argument type called ARG_PTR_TO_RAW_STACK.
      The idea is to skip the STACK_MISC check in check_stack_boundary() and color
      the related stack slots as STACK_MISC after we checked all call arguments.
      
      Helper functions using ARG_PTR_TO_RAW_STACK must make sure that every path of
      the helper function will fill the provided buffer area, so that we cannot leak
      any uninitialized stack memory. This f.e. means that error paths need to
      memset() the buffers, but the expected fast-path doesn't have to do this
      anymore.
      
      Since there's no such helper needing more than at most one ARG_PTR_TO_RAW_STACK
      argument, we can keep it simple and don't need to check for multiple areas.
      Should in future such a use-case really appear, we have check_raw_mode() that
      will make sure we implement support for it first.
      
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      435faee1
    • Daniel Borkmann's avatar
      bpf, verifier: add bpf_call_arg_meta for passing meta data · 33ff9823
      Daniel Borkmann authored
      
      
      Currently, when the verifier checks calls in check_call() function, we
      call check_func_arg() for all 5 arguments e.g. to make sure expected types
      are correct. In some cases, we collect meta data (here: map pointer) to
      perform additional checks such as checking stack boundary on key/value
      sizes for subsequent arguments. As we're going to extend the meta data,
      add a generic struct bpf_call_arg_meta that we can use for passing into
      check_func_arg().
      
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      33ff9823
    • Marcelo Ricardo Leitner's avatar
      sctp: add support for RPS and RFS · 486bdee0
      Marcelo Ricardo Leitner authored
      
      
      This patch adds what's missing to properly support RPS and RFS on SCTP,
      as some of it is already implemented in common calls.
      
      Having support for RPS and RFS allows better scaling specially because
      not all NICs support hashing SCTP headers.
      
      Save the hash right when we dequeue a skb from inqueue so we do it only
      once per skb instead of per chunk. New sockets will then inherit the
      hash through sctp_copy_sock().
      
      Signed-off-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      486bdee0
    • Eric Dumazet's avatar
      net: validate_xmit_skb() changes · d21fd63e
      Eric Dumazet authored
      
      
      skbs given to validate_xmit_skb() should not have a next
      pointer anymore.
      
      Also if a packet is dropped, increment dev->tx_dropped
      __dev_queue_xmit() no longer has to change tx_dropped in this case.
      
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d21fd63e
    • Weongyo Jeong's avatar
      packet: uses kfree_skb() for errors. · da37845f
      Weongyo Jeong authored
      
      
      consume_skb() isn't for error cases that kfree_skb() is more proper
      one.  At this patch, it fixed tpacket_rcv() and packet_rcv() to be
      consistent for error or non-error cases letting perf trace its event
      properly.
      
      Signed-off-by: default avatarWeongyo Jeong <weongyo.linux@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      da37845f
    • Parthasarathy Bhuvaragan's avatar
      tipc: fix a race condition leading to subscriber refcnt bug · 333f7962
      Parthasarathy Bhuvaragan authored
      
      
      Until now, the requests sent to topology server are queued
      to a workqueue by the generic server framework.
      These messages are processed by worker threads and trigger the
      registered callbacks.
      To reduce latency on uniprocessor systems, explicit rescheduling
      is performed using cond_resched() after MAX_RECV_MSG_COUNT(25)
      messages.
      
      This implementation on SMP systems leads to an subscriber refcnt
      error as described below:
      When a worker thread yields by calling cond_resched() in a SMP
      system, a new worker is created on another CPU to process the
      pending workitem. Sometimes the sleeping thread wakes up before
      the new thread finishes execution.
      This breaks the assumption on ordering and being single threaded.
      The fault is more frequent when MAX_RECV_MSG_COUNT is lowered.
      
      If the first thread was processing subscription create and the
      second thread processing close(), the close request will free
      the subscriber and the create request oops as follows:
      
      [31.224137] WARNING: CPU: 2 PID: 266 at include/linux/kref.h:46 tipc_subscrb_rcv_cb+0x317/0x380         [tipc]
      [31.228143] CPU: 2 PID: 266 Comm: kworker/u8:1 Not tainted 4.5.0+ #97
      [31.228377] Workqueue: tipc_rcv tipc_recv_work [tipc]
      [...]
      [31.228377] Call Trace:
      [31.228377]  [<ffffffff812fbb6b>] dump_stack+0x4d/0x72
      [31.228377]  [<ffffffff8105a311>] __warn+0xd1/0xf0
      [31.228377]  [<ffffffff8105a3fd>] warn_slowpath_null+0x1d/0x20
      [31.228377]  [<ffffffffa0098067>] tipc_subscrb_rcv_cb+0x317/0x380 [tipc]
      [31.228377]  [<ffffffffa00a4984>] tipc_receive_from_sock+0xd4/0x130 [tipc]
      [31.228377]  [<ffffffffa00a439b>] tipc_recv_work+0x2b/0x50 [tipc]
      [31.228377]  [<ffffffff81071925>] process_one_work+0x145/0x3d0
      [31.246554] ---[ end trace c3882c9baa05a4fd ]---
      [31.248327] BUG: spinlock bad magic on CPU#2, kworker/u8:1/266
      [31.249119] BUG: unable to handle kernel NULL pointer dereference at 0000000000000428
      [31.249323] IP: [<ffffffff81099d0c>] spin_dump+0x5c/0xe0
      [31.249323] PGD 0
      [31.249323] Oops: 0000 [#1] SMP
      
      In this commit, we
      - rename tipc_conn_shutdown() to tipc_conn_release().
      - move connection release callback execution from tipc_close_conn()
        to a new function tipc_sock_release(), which is executed before
        we free the connection.
      Thus we release the subscriber during connection release procedure
      rather than connection shutdown procedure.
      
      Signed-off-by: default avatarParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Acked-by: default avatarYing Xue <ying.xue@windriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      333f7962