  1. Oct 14, 2013
    • netfilter: nf_tables: convert built-in tables/chains to chain types · 9370761c
      Pablo Neira Ayuso authored
      
      
      This patch converts built-in tables/chains to chain types, which
      allows you to deploy customized table and chain configurations from
      userspace.
      
      After this patch, you have to specify the chain type when
      creating a new chain:
      
       add chain ip filter output { type filter hook input priority 0; }
                                    ^^^^ ------
      
      The existing chain types after this patch are: filter, route and
      nat. Note that tables are just containers of chains with no specific
      semantics, which is a significant change with respect to iptables.
      
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nft_payload: add optimized payload implementation for small loads · c29b72e0
      Patrick McHardy authored
      
      
      Add an optimized payload expression implementation for small (up to 4 bytes)
      aligned data loads from the linear packet area.
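
The idea of a small-load fast path can be sketched roughly as follows. This is an illustration only, not the kernel implementation: `struct pkt` and `load_small()` are hypothetical names, and the sketch assumes the fast path is limited to 1, 2 or 4 bytes at an offset aligned to the load size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a packet: only the linear area is modelled. */
struct pkt {
    const uint8_t *data;  /* start of the linear area */
    size_t len;           /* number of bytes in the linear area */
};

/*
 * Load 1, 2 or 4 bytes from an offset aligned to the load size.
 * Returns 0 on success, -1 if the request does not qualify for the
 * fast path (a caller would then fall back to a generic copy).
 */
int load_small(const struct pkt *p, size_t off, size_t n, uint32_t *reg)
{
    assert(p != NULL && reg != NULL);

    if (n != 1 && n != 2 && n != 4)
        return -1;                    /* size not handled here */
    if (off % n != 0)
        return -1;                    /* unaligned offset */
    if (off > p->len || n > p->len - off)
        return -1;                    /* outside the linear area */

    *reg = 0;
    memcpy(reg, p->data + off, n);    /* small fixed-size copy */
    return 0;
}
```

Anything that fails these checks would take the generic path, so the fast path never needs to deal with non-linear data or odd sizes.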
      
      This patch also includes Patrick McHardy's original patch entitled
      (nf_tables: inline nft_payload_fast_eval() into main evaluation loop).
      
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: add optimized data comparison for small values · cb7dbfd0
      Patrick McHardy authored
      
      
      Add an optimized version of nft_data_cmp() that only handles values of up
      to 4 bytes in length.
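
A minimal sketch of what a comparison restricted to small values can look like: the constant and a mask are precomputed once, so each evaluation is a single mask-and-compare on a 32-bit word. The names `cmp_fast`, `small_mask()`, `cmp_fast_init()` and `cmp_fast_match()` are hypothetical, and the sketch treats the register as a host-order integer, leaving byte-order handling aside.

```c
#include <assert.h>
#include <stdint.h>

/* Precomputed operand for a comparison of 1..4 bytes (hypothetical layout). */
struct cmp_fast {
    uint32_t data;  /* constant to compare against, already masked */
    uint32_t mask;  /* covers the low `len` bytes */
};

/* Mask covering the low `len` bytes, len in 1..4. */
uint32_t small_mask(unsigned len)
{
    assert(len >= 1 && len <= 4);
    return len == 4 ? 0xffffffffu : (1u << (len * 8)) - 1u;
}

/* Done once, when the rule is set up. */
struct cmp_fast cmp_fast_init(uint32_t value, unsigned len)
{
    struct cmp_fast c;

    c.mask = small_mask(len);
    c.data = value & c.mask;
    return c;
}

/* Done per packet: returns 1 if the masked register equals the constant. */
int cmp_fast_match(const struct cmp_fast *c, uint32_t reg)
{
    return (reg & c->mask) == c->data;
}
```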
      
      This patch includes Patrick McHardy's original patch entitled (nf_tables:
      inline nft_cmp_fast_eval() into main evaluation loop).
      
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: expression ops overloading · ef1f7df9
      Patrick McHardy authored
      
      
      Split the expression ops into two parts and support overloading of
      the runtime expression ops based on the requested function through
      a ->select_ops() callback.
      
      This can be used to provide optimized implementations, for instance
      for loading small aligned amounts of data from the packet or inlining
      frequently used operations into the main evaluation loop.
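
The split described above can be sketched as follows. Only the `select_ops` callback name comes from the commit message. The struct layouts, `expr_params`, `expr_resolve_ops()` and the selection criteria are assumptions made for illustration and do not reproduce the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Runtime half: what the evaluation loop would call (reduced to a name). */
struct expr_ops {
    const char *name;
};

/* Hypothetical parameters parsed from a userspace request. */
struct expr_params {
    size_t offset;
    size_t len;
};

/* Type half: registered once, may choose the runtime ops per request. */
struct expr_type {
    const struct expr_ops *(*select_ops)(const struct expr_params *p);
    const struct expr_ops *ops;  /* used when select_ops is NULL */
};

const struct expr_ops payload_ops      = { "payload" };
const struct expr_ops payload_fast_ops = { "payload_fast" };

/* Pick the optimized ops for small aligned loads, the generic ops otherwise. */
const struct expr_ops *payload_select_ops(const struct expr_params *p)
{
    int small = p->len == 1 || p->len == 2 || p->len == 4;

    if (small && p->offset % p->len == 0)
        return &payload_fast_ops;
    return &payload_ops;
}

/* Resolve the runtime ops for one expression instance. */
const struct expr_ops *expr_resolve_ops(const struct expr_type *t,
                                        const struct expr_params *p)
{
    assert(t->select_ops != NULL || t->ops != NULL);
    return t->select_ops ? t->select_ops(p) : t->ops;
}
```

The decision is taken once, when the expression is instantiated, so the per-packet path pays nothing for the choice.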
      
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: add netlink set API · 20a69341
      Patrick McHardy authored
      
      
      This patch adds the new netlink API for maintaining nf_tables sets
      independently of the ruleset. The API supports the following operations:
      
      - creation of sets
      - deletion of sets
      - querying of specific sets
      - dumping of all sets
      
      - addition of set elements
      - removal of set elements
      - dumping of all set elements
      
      Sets are identified by name; each table defines its own namespace.
      The name of a set may be allocated automatically. This is mostly useful
      in combination with the NFT_SET_ANONYMOUS flag, which destroys a set
      automatically once the last reference has been released.
      
      Sets can be marked constant, meaning they're not allowed to change while
      linked to a rule. This allows lockless operation for set types that
      would otherwise require locking.
      
      Additionally, if the implementation supports it, sets can (as before) be
      used as maps, associating a data value with each key (or range), by
      specifying the NFT_SET_MAP flag and can be used for interval queries by
      specifying the NFT_SET_INTERVAL flag.
      
      Set elements are added and removed incrementally. All element operations
      support batching, reducing netlink message and set lookup overhead.
      
      The old "set" and "hash" expressions are replaced by a generic "lookup"
      expression, which binds to the specified set. Userspace is no longer
      aware of the actual set implementation used by the kernel; all
      configuration options are generic.
      
      Currently the implementation selection logic is largely missing and the
      kernel will simply use the first registered implementation supporting the
      requested operation. Eventually, the plan is to have userspace supply a
      description of the data characteristics and select the implementation
      based on expected performance and memory use.
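
The "first registered implementation supporting the requested operation" rule can be sketched as a linear scan over feature masks. The flag values, `struct set_impl` and `select_set_impl()` below are hypothetical and only illustrate the selection rule described in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative feature bits; the values are not taken from the kernel headers. */
enum {
    SET_F_INTERVAL = 1u << 0,
    SET_F_MAP      = 1u << 1
};

/* One registered set implementation and the features it supports. */
struct set_impl {
    const char *name;
    unsigned features;
};

/*
 * Return the first implementation, in registration order, that supports
 * every requested feature, or NULL if none does.
 */
const struct set_impl *select_set_impl(const struct set_impl *impls, size_t n,
                                       unsigned requested)
{
    size_t i;

    assert(impls != NULL || n == 0);
    for (i = 0; i < n; i++)
        if ((impls[i].features & requested) == requested)
            return &impls[i];
    return NULL;
}
```

Because registration order decides ties, a scheme like this cannot weigh performance or memory use, which is the gap the planned userspace-supplied description is meant to close.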
      
      This patch includes the new 'lookup' expression, which looks up matching
      elements in the set.
      
      This patch includes kernel-doc descriptions for this set API and it
      also includes the following fixes:
      
      From Patrick McHardy:
      * netfilter: nf_tables: fix set element data type in dumps
      * netfilter: nf_tables: fix indentation of struct nft_set_elem comments
      * netfilter: nf_tables: fix oops in nft_validate_data_load()
      * netfilter: nf_tables: fix oops while listing sets of built-in tables
      * netfilter: nf_tables: destroy anonymous sets immediately if binding fails
      * netfilter: nf_tables: propagate context to set iter callback
      * netfilter: nf_tables: add loop detection
      
      From Pablo Neira Ayuso:
      * netfilter: nf_tables: allow to dump all existing sets
      * netfilter: nf_tables: fix wrong type for flags variable in newelem
      
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: add nftables · 96518518
      Patrick McHardy authored
      
      
      This patch adds nftables, which is the intended successor of iptables.
      This packet filtering framework reuses the existing netfilter hooks,
      the connection tracking system, the NAT subsystem, the transparent
      proxying engine, the logging infrastructure and the userspace packet
      queueing facilities.
      
      In a nutshell, nftables provides a pseudo-state machine with four
      general-purpose registers of 128 bits each and one special-purpose
      register that stores verdicts. This pseudo-machine comes with an
      extensible instruction set, a.k.a. "expressions" in nftables jargon.
      The expressions included in this patch provide the basic functionality:
      
      * bitwise: to perform bitwise operations.
      * byteorder: to convert between host and network endianness.
      * cmp: to compare data with the content of the registers.
      * counter: to enable counters on rules.
      * ct: to store conntrack keys into registers.
      * exthdr: to match IPv6 extension headers.
      * immediate: to load data into registers.
      * limit: to limit matching based on packet rate.
      * log: to log packets.
      * meta: to match metainformation that usually comes with the skbuff.
      * nat: to perform Network Address Translation.
      * payload: to fetch data from the packet payload and store it into
        registers.
      * reject (IPv4 only): to explicitly close a connection, e.g. TCP RST.
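
To make the register-machine description concrete, here is a toy sketch of such a pseudo-machine: four 128-bit data registers, one verdict register, and a rule expressed as a sequence of expressions that is evaluated until the verdict changes. All type and function names are hypothetical, the verdict set is simplified, and the payload load performs no packet bounds checking, so this is not the nf_tables core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NREGS    4
#define REG_SIZE 16  /* 128 bits */

enum verdict { V_CONTINUE, V_BREAK, V_ACCEPT, V_DROP };

struct regs {
    enum verdict verdict;          /* special-purpose verdict register */
    uint8_t data[NREGS][REG_SIZE]; /* general-purpose registers */
};

struct expr {
    void (*eval)(const struct expr *e, struct regs *r, const uint8_t *pkt);
    unsigned reg;           /* register operated on */
    unsigned offset, len;   /* payload location, or compare length */
    uint8_t imm[REG_SIZE];  /* immediate data for cmp */
    enum verdict v;         /* verdict to set */
};

/* payload: copy packet bytes into a register (no bounds check on pkt). */
void eval_payload(const struct expr *e, struct regs *r, const uint8_t *pkt)
{
    assert(e->reg < NREGS && e->len <= REG_SIZE);
    memcpy(r->data[e->reg], pkt + e->offset, e->len);
}

/* cmp: stop evaluating this rule when the register does not match. */
void eval_cmp(const struct expr *e, struct regs *r, const uint8_t *pkt)
{
    (void)pkt;
    assert(e->reg < NREGS && e->len <= REG_SIZE);
    if (memcmp(r->data[e->reg], e->imm, e->len) != 0)
        r->verdict = V_BREAK;
}

/* immediate verdict: write the verdict register. */
void eval_verdict(const struct expr *e, struct regs *r, const uint8_t *pkt)
{
    (void)pkt;
    r->verdict = e->v;
}

/* Run the expressions of one rule in order until the verdict changes. */
enum verdict run_rule(const struct expr *exprs, size_t n, const uint8_t *pkt)
{
    struct regs r;
    size_t i;

    memset(&r, 0, sizeof(r));
    r.verdict = V_CONTINUE;
    for (i = 0; i < n && r.verdict == V_CONTINUE; i++)
        exprs[i].eval(&exprs[i], &r, pkt);
    return r.verdict;
}
```

A rule such as "drop when byte 9 of the packet equals 6" then becomes three expressions: a payload load into register 0, a cmp against the immediate 6, and a verdict.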
      
      Using this instruction-set, the userspace utility 'nft' can transform
      the rules expressed in human-readable text representation (using a
      new syntax, inspired by tcpdump) to nftables bytecode.
      
      nftables also inherits the table, chain and rule objects from
      iptables, but in a more configurable way, and it also includes the
      original datatype-agnostic set infrastructure with mapping support.
      This set infrastructure is enhanced in the follow-up patch (netfilter:
      nf_tables: add netlink set API).
      
      This patch includes the following components:
      
      * the netlink API: net/netfilter/nf_tables_api.c and
        include/uapi/netfilter/nf_tables.h
      * the packet filter core: net/netfilter/nf_tables_core.c
      * the expressions (described above): net/netfilter/nft_*.c
      * the filter tables for ARP, IPv4, IPv6 and bridge:
        net/ipv4/netfilter/nf_tables_ipv4.c
        net/ipv6/netfilter/nf_tables_ipv6.c
        net/ipv4/netfilter/nf_tables_arp.c
        net/bridge/netfilter/nf_tables_bridge.c
      * the NAT table (IPv4 only):
        net/ipv4/netfilter/nf_table_nat_ipv4.c
      * the route table (similar to mangle):
        net/ipv4/netfilter/nf_table_route_ipv4.c
        net/ipv6/netfilter/nf_table_route_ipv6.c
      * internal definitions under:
        include/net/netfilter/nf_tables.h
        include/net/netfilter/nf_tables_core.h
      * a skeleton expression:
        net/netfilter/nft_expr_template.c
      * the preliminary implementation of the meta target:
        net/netfilter/nft_meta_target.c
      
      It also includes a change in struct nf_hook_ops that adds a new
      pointer for hook-private data, which is used to store the per-chain
      rule list.
      
      This patch is based on the patch from Patrick McHardy, merged with the
      accumulated cleanups, fixes and small enhancements made to the nftables
      code since 2009, which are:
      
      From Patrick McHardy:
      * nf_tables: adjust netlink handler function signatures
      * nf_tables: only retry table lookup after successful table module load
      * nf_tables: fix event notification echo and avoid unnecessary messages
      * nft_ct: add l3proto support
      * nf_tables: pass expression context to nft_validate_data_load()
      * nf_tables: remove redundant definition
      * nft_ct: fix maxattr initialization
      * nf_tables: fix invalid event type in nf_tables_getrule()
      * nf_tables: simplify nft_data_init() usage
      * nf_tables: build in more core modules
      * nf_tables: fix double lookup expression unregistation
      * nf_tables: move expression initialization to nf_tables_core.c
      * nf_tables: build in payload module
      * nf_tables: use NFPROTO constants
      * nf_tables: rename pid variables to portid
      * nf_tables: save 48 bits per rule
      * nf_tables: introduce chain rename
      * nf_tables: check for duplicate names on chain rename
      * nf_tables: remove ability to specify handles for new rules
      * nf_tables: return error for rule change request
      * nf_tables: return error for NLM_F_REPLACE without rule handle
      * nf_tables: include NLM_F_APPEND/NLM_F_REPLACE flags in rule notification
      * nf_tables: fix NLM_F_MULTI usage in netlink notifications
      * nf_tables: include NLM_F_APPEND in rule dumps
      
      From Pablo Neira Ayuso:
      * nf_tables: fix stack overflow in nf_tables_newrule
      * nf_tables: nft_ct: fix compilation warning
      * nf_tables: nft_ct: fix crash with invalid packets
      * nft_log: group and qthreshold are 2^16
      * nf_tables: nft_meta: fix socket uid,gid handling
      * nft_counter: allow to restore counters
      * nf_tables: fix module autoload
      * nf_tables: allow to remove all rules placed in one chain
      * nf_tables: use 64-bits rule handle instead of 16-bits
      * nf_tables: fix chain after rule deletion
      * nf_tables: improve deletion performance
      * nf_tables: add missing code in route chain type
      * nf_tables: rise maximum number of expressions from 12 to 128
      * nf_tables: don't delete table if in use
      * nf_tables: fix basechain release
      
      From Tomasz Bursztyka:
      * nf_tables: Add support for changing users chain's name
      * nf_tables: Change chain's name to be fixed sized
      * nf_tables: Add support for replacing a rule by another one
      * nf_tables: Update uapi nftables netlink header documentation
      
      From Florian Westphal:
      * nft_log: group is u16, snaplen u32
      
      From Phil Oester:
      * nf_tables: operational limit match
      
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_nat: move alloc_null_binding to nf_nat_core.c · f59cb045
      Pablo Neira Ayuso authored
      
      
      Similar to nat_decode_session, alloc_null_binding is needed for both
      ip_tables and nf_tables, so move it to nf_nat_core.c. This change
      is required by nf_tables.
      
      This is an adapted version of the original patch from Patrick McHardy.
      
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: pass hook ops to hookfn · 795aa6ef
      Patrick McHardy authored
      
      
      Pass the hook ops to the hookfn to allow for generic hook
      functions. This change is required by nf_tables.
      
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  2. Oct 12, 2013
    • tcp: tcp_transmit_skb() optimizations · ccdbb6e9
      Eric Dumazet authored
      
      
      1) We need to take a timestamp only for skbs that should be cloned.
      
      Other skbs are not in the write queue and no RTT estimation is done on them.
      
      2) The unlikely() hint is wrong for receivers (they send pure ACKs).
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: MF Nowlan <fitz@cs.yale.edu>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge · 29b67c39
      David S. Miller authored
      Included changes:
      - update emails for A. Quartulli and M. Lindner in MAINTAINERS
      - switch to the next on-the-wire protocol version
      - introduce the T(ype) V(ersion) L(ength) V(alue) framework
      - adjust the existing components to make them use the new TVLV code
      - make the TT component use CRC32 instead of CRC16
      - totally remove the VIS functionality (has been moved to userspace)
      - reorder packet types and flags
      - add static checks on packet format
      - remove __packed from batadv_ogm_packet
  3. Oct 11, 2013
  4. Oct 10, 2013