  1. Oct 24, 2017
  2. Oct 23, 2017
    • batman-adv: use inline kernel-doc for uapi constants · 40b16b9b
      Sven Eckelmann authored
      
      
      The enums of constants for netlink tend to become rather large over time.
      Documenting them is easier when the kernel-doc is actually next to the
      constant and not in a different block above the enum.
      
      Also inline kernel-doc allows multi-paragraph description. This could be
      required to better document the netlink command types and the expected
      return values.
      
      Signed-off-by: Sven Eckelmann <sven@narfation.org>
      Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
      40b16b9b
    • net: core: rtnetlink: use BUG_ON instead of if condition followed by BUG · 058c8d59
      Gustavo A. R. Silva authored
      
      
      Use BUG_ON instead of if condition followed by BUG in do_setlink.
      
      This issue was detected with the help of Coccinelle.
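
The transformation described above can be sketched in userspace. The `BUG()` and `BUG_ON()` macros below are stand-ins that count hits instead of halting, and `check_before()` / `check_after()` are hypothetical names; none of this is the actual do_setlink code.

```c
#include <stddef.h>

/* Userspace stand-ins (assumptions): the kernel's BUG() stops execution;
 * here it only counts hits so the two forms can be compared. */
static int bug_hits;
#define BUG()        do { bug_hits++; } while (0)
#define BUG_ON(cond) do { if (cond) BUG(); } while (0)

/* Before: open-coded condition followed by BUG(). */
static void check_before(const void *ptr)
{
	if (!ptr)
		BUG();
}

/* After: the same check expressed with BUG_ON(). */
static void check_after(const void *ptr)
{
	BUG_ON(!ptr);
}
```

Both helpers react to the same inputs in the same way; the second form is only shorter.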
      
      Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      058c8d59
    • net: systemport: Guard against unmapped TX ring · e83b1715
      Florian Fainelli authored
      
      
      Because SYSTEMPORT is a (semi) normal network device, the stack may attempt to
      queue packets on it outside of the DSA slave transmit path.  When that happens,
      the DSA layer has not had a chance to tag packets with the appropriate per-port
      and per-queue information. If we then don't have a port 0 queue 0 available
      (e.g., on boards where this does not exist), we will hit a NULL pointer
      dereference in bcm_sysport_select_queue().
      
      Guard against such cases by testing for the TX ring validity.
      
      Fixes: 84ff33eeb23d ("net: systemport: Establish DSA network device queue mapping")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e83b1715
    • Merge branch 'mlxsw-Add-support-for-non-equal-cost-multi-path' · fbd15f48
      David S. Miller authored
      
      
      Jiri Pirko says:
      
      ====================
      mlxsw: Add support for non-equal-cost multi-path
      
      Ido says:
      
      In the device, nexthops are stored as adjacency entries in an array
      called the KVD linear (KVDL). When a multi-path route is hit the
      packet's headers are hashed and then converted to an index into KVDL
      based on the adjacency group's size and base index.
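
As a rough illustration of the lookup just described, the sketch below turns a header hash into a KVDL index. The struct, the function name and the use of a modulo to fold the hash into the group are all assumptions; the device may reduce the hash differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of an adjacency group inside the KVDL array. */
struct adj_group {
	uint32_t base_index; /* first KVDL entry of the group */
	uint32_t size;       /* number of entries in the group */
};

/* Assumption: the hash is folded into the group with a modulo. */
static uint32_t kvdl_index(const struct adj_group *grp, uint32_t hash)
{
	assert(grp->size > 0);
	return grp->base_index + hash % grp->size;
}
```

With a base index of 65536 and a size of 512, every hash value lands in the range 65536..66047.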
      
      Up until now the driver ignored the `weight` parameter for multi-path
      routes and allocated only one adjacency entry for each nexthop with a
      limit of 32 nexthops in a group. This set makes the driver take the
      `weight` parameter into account when allocating adjacency entries.
      
      First patch teaches dpipe to show the size of the adjacency group, so
      that users will be able to determine the actual weight of each nexthop.
      The second patch refactors the KVDL allocator, making it more receptive
      towards the addition of another partition later in the set.
      
      Patches 3-5 introduce small changes towards the actual change in the
      sixth patch that populates the adjacency entries according to their
      relative weight.
      
      Last two patches finally add another partition to the KVDL, which allows
      us to allocate more than 32 entries per-group and thus support more
      nexthops and also provide higher accuracy with regards to the requested
      weights.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fbd15f48
    • mlxsw: spectrum: Add another partition to KVD linear · 330e2cc6
      Ido Schimmel authored
      
      
      The KVD linear is currently partitioned into two partitions. One for
      single entries and another for groups of 32 entries.
      
      Add another partition consisting of groups of 512 entries which will
      allow us to more accurately represent the nexthop weights in non-equal
      cost multi-path routing.
      
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      330e2cc6
    • mlxsw: spectrum: Increase number of linear entries · f11fbaf8
      Ido Schimmel authored
      
      
      The memory region where adjacency entries (nexthops) are stored is
      called the KVD linear and is configured during initialization with a
      size of 64K.
      
      Extend this area with 32K more entries, that will be partitioned into 64
      groups of 0.5K entries, thereby allowing us to support weighted nexthops
      with high accuracy.
      
      Change the ratio between both types of hash entries, so as to prevent
      reduction in the number of double hash entries, which are used for IPv6
      neighbours and routes with a prefix length greater than 64.
      
      Note that the user will be able to control all these sizes once the
      devlink resource manager is introduced.
      
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f11fbaf8
    • mlxsw: spectrum_router: Populate adjacency entries according to weights · eb789980
      Ido Schimmel authored
      
      
      Up until now the driver assumed all the nexthops have an equal weight
      and wrote each to a single adjacency entry.
      
      This patch takes the `weight` parameter into account and populates the
      adjacency group according to the relative weight of each nexthop.
      
      Specifically, the weights of all the nexthops that should be offloaded
      are first normalized and then used to calculate the upper adjacency
      index of each nexthop. This is done according to the hash-threshold
      algorithm used by the kernel for IPv4 multi-path routing.
      
      Adjacency groups are currently limited to 32 entries which limits the
      weights that can be used, but follow-up patches will introduce groups of
      512 entries.
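
The normalization and upper-index calculation described above might look roughly like the following userspace sketch. The function names are hypothetical, normalization is assumed to mean dividing by the greatest common divisor, and the group is assumed to be at least as large as the sum of the normalized weights; the driver code may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t gcd_u32(uint32_t a, uint32_t b)
{
	while (b) {
		uint32_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/* Divide all weights by their greatest common divisor, in place.
 * Returns the sum of the normalized weights. */
static uint32_t normalize_weights(uint32_t *weights, int n)
{
	uint32_t g = 0, sum = 0;
	int i;

	for (i = 0; i < n; i++)
		g = gcd_u32(g, weights[i]);
	assert(g > 0); /* at least one non-zero weight is required */
	for (i = 0; i < n; i++) {
		weights[i] /= g;
		sum += weights[i];
	}
	return sum;
}

/* Fill upper[i] with the last adjacency index owned by nexthop i in a
 * group of group_size entries, rounding each split to the nearest entry.
 * The weights array is normalized in place as a side effect. */
static void fill_upper_bounds(uint32_t *weights, int n,
			      uint32_t group_size, uint32_t *upper)
{
	uint32_t sum = normalize_weights(weights, n);
	uint64_t cumulative = 0;
	int i;

	assert(group_size >= sum); /* assumption stated in the lead-in */
	for (i = 0; i < n; i++) {
		cumulative += weights[i];
		upper[i] = (uint32_t)((cumulative * group_size + sum / 2) / sum) - 1;
	}
}
```

For weights 255 and 1 in a group of 512 entries this yields upper indices 509 and 511, so the first nexthop owns indices 0..509 (510 entries) and the second owns 510..511 (2 entries).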
      
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eb789980
    • mlxsw: spectrum_router: Prepare for large adjacency groups · 425a08c6
      Ido Schimmel authored
      
      
      The device has certain restrictions regarding the size of an adjacency
      group.
      
      Have the router determine the size of the adjacency group according to
      available KVDL allocation sizes and these restrictions.
      
      This was not needed until now since only allocations of up to 32 entries
      were supported and these are all valid sizes for an adjacency group.
      
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      425a08c6
    • mlxsw: spectrum_router: Store weight in nexthop struct · 408bd946
      Ido Schimmel authored
      
      
      As the first step towards non-equal-cost multi-path support, store each
      nexthop's weight.
      
      For IPv6 nexthops always set the weight to 1, as it only supports ECMP.
      
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      408bd946
    • mlxsw: spectrum: Add ability to query KVDL allocation size · d672aec4
      Ido Schimmel authored
      
      
      The current KVDL allocation API allows the user to specify the requested
      number of entries, but the user has no way of knowing how many entries
      were actually allocated.
      
      This works because existing users (e.g., router) request the exact
      number they end up using. With the introduction of large adjacency
      groups, this will change, as the router will have the ability to choose
      from several allocation sizes, where larger allocations provide higher
      accuracy with respect to requested weights and better resilience against
      nexthop failures.
      
      One option is to have the router try several allocations of descending
      size until one succeeds, but a better way is to simply allow it to query
      the actual allocation size and then size its request accordingly.
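
A query of this kind could be sketched as below. The function name and the table are hypothetical; the sizes 1, 32 and 512 are the partition allocation sizes mentioned elsewhere in this series, and picking the smallest size that fits is an assumption.

```c
#include <stddef.h>

/* Hypothetical allocation sizes, one per KVDL partition, ascending. */
static const unsigned int alloc_sizes[] = { 1, 32, 512 };

/* Return the size that would actually be allocated for a request of
 * entry_count entries, or 0 if no partition is large enough. */
static unsigned int kvdl_alloc_size_query(unsigned int entry_count)
{
	size_t i;

	for (i = 0; i < sizeof(alloc_sizes) / sizeof(alloc_sizes[0]); i++)
		if (entry_count <= alloc_sizes[i])
			return alloc_sizes[i];
	return 0;
}
```

A caller can ask for the size first and then request exactly that many entries, instead of trying allocations of descending size until one succeeds.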
      
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d672aec4
    • mlxsw: spectrum: Better represent KVDL partitions · a875a2ee
      Ido Schimmel authored
      
      
      The KVD linear (KVDL) allocator currently consists of a very large
      bitmap that reflects the KVDL's usage. The boundaries of each partition
      as well as their allocation size are represented using defines.
      
      This representation requires us to patch all the functions that act on a
      partition whenever the partitioning scheme is changed. In addition, it
      does not enable the dynamic configuration of the KVDL using the
      up-coming resource manager.
      
      Add objects to represent these partitions as well as the accompanying
      code that acts on them to perform allocations and de-allocations.
      
      In the following patches, this will allow us to easily add another
      partition as well as new operations to act on these partitions.
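
One way to picture such a partition object is the sketch below. All names are hypothetical, and a plain array of flags stands in for the bitmap to keep the example short; it is not the driver's data structure.

```c
#include <stdbool.h>
#include <stdint.h>

#define PART_MAX_ALLOCS 64 /* hypothetical cap for this sketch */

/* Hypothetical partition object: a range of KVDL indices handed out in
 * fixed-size chunks, with one usage flag per chunk. */
struct kvdl_part {
	uint32_t start_index; /* first KVDL index of the partition */
	uint32_t alloc_size;  /* entries per allocation */
	uint32_t num_allocs;  /* chunks in the partition (<= PART_MAX_ALLOCS) */
	bool used[PART_MAX_ALLOCS];
};

/* Allocate one chunk; returns 0 and stores its first index, or -1 if full. */
static int kvdl_part_alloc(struct kvdl_part *part, uint32_t *p_index)
{
	uint32_t i;

	for (i = 0; i < part->num_allocs; i++) {
		if (!part->used[i]) {
			part->used[i] = true;
			*p_index = part->start_index + i * part->alloc_size;
			return 0;
		}
	}
	return -1;
}

/* Release the chunk that starts at index. */
static void kvdl_part_free(struct kvdl_part *part, uint32_t index)
{
	part->used[(index - part->start_index) / part->alloc_size] = false;
}
```

Because the boundaries and the allocation size live in the object rather than in defines, adding a partition only means adding another instance.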
      
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a875a2ee
    • mlxsw: spectrum_dpipe: Add adjacency group size · e69cd9d7
      Ido Schimmel authored
      
      
      The adjacency group size is part of the match on the adjacency group and
      should therefore be exposed using dpipe.
      
      When non-equal-cost multi-path support is introduced, the group's
      size will help users understand the exact number of adjacency entries
      each nexthop occupies, as a nexthop will no longer correspond to a
      single entry.
      
      The output for a multi-path route with two nexthops, one with weight 255
      and the second 1 will be:
      
      Example:
      
      $ devlink dpipe table dump pci/0000:01:00.0 name mlxsw_adj
      pci/0000:01:00.0:
        index 0
        match_value:
          type field_exact header mlxsw_meta field adj_index value 65536
          type field_exact header mlxsw_meta field adj_size value 512
          type field_exact header mlxsw_meta field adj_hash_index value 0
        action_value:
          type field_modify header ethernet field destination mac value e4:1d:2d:a5:f3:64
          type field_modify header mlxsw_meta field erif_port mapping ifindex mapping_value 3 value 1
      
        index 1
        match_value:
          type field_exact header mlxsw_meta field adj_index value 65536
          type field_exact header mlxsw_meta field adj_size value 512
          type field_exact header mlxsw_meta field adj_hash_index value 510
        action_value:
          type field_modify header ethernet field destination mac value e4:1d:2d:a5:f3:65
          type field_modify header mlxsw_meta field erif_port mapping ifindex mapping_value 4 value 2
      
      Thus, the first nexthop occupies 510 adjacency entries and the second 2,
      which leads to a ratio of 255 to 1.
      
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e69cd9d7
    • Merge branch 'bcm_sf2-Add-support-for-IPv6-CFP-rules' · bc9db417
      David S. Miller authored
      
      
      Florian Fainelli says:
      
      ====================
      net: dsa: bcm_sf2: Add support for IPv6 CFP rules
      
      This patch series adds support for matching IPv6 addresses to the existing CFP
      support code. Because IPv6 addresses are four times bigger than IPv4, we can
      no longer fit them in a single slice, so we need to chain two in order to have
      a complete match. This makes us require a second bitmap tracking unique rules
      so we don't overpopulate the TCAM.
      
      Finally, because the code had to be re-organized, it became a lot easier to
      support arbitrary prefix/mask lengths, so the last two patches do just that.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bc9db417
    • net: dsa: bcm_sf2: Allow matching arbitrary IPv6 masks/lengths · dd8eff68
      Florian Fainelli authored
      
      
      There is no reason why we should limit ourselves to matching only
      full IPv6 addresses (/128), the same logic applies between the DATA and
      MASK ports, so just make it more configurable to accept both.
      
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dd8eff68
    • net: dsa: bcm_sf2: Allow matching arbitrary IPv4 mask lengths · bc3fc44c
      Florian Fainelli authored
      
      
      There is no reason why we should limit ourselves to matching only full
      IPv4 addresses (/32), the same logic applies between the DATA and MASK
      ports, so just make it more configurable to accept both.
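
The idea of matching a prefix instead of a full address can be sketched as a data/mask comparison. The helper names are hypothetical and the code only models the comparison, not how the DATA and MASK ports are programmed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a host-order IPv4 netmask from a prefix length (0..32). */
static uint32_t ipv4_prefix_to_mask(unsigned int prefix_len)
{
	assert(prefix_len <= 32);
	/* Shifting a 32-bit value by 32 is undefined, so handle /0 apart. */
	return prefix_len ? ~(uint32_t)0 << (32 - prefix_len) : 0;
}

/* TCAM-style compare: only bits set in mask take part in the match. */
static bool ipv4_masked_match(uint32_t addr, uint32_t data, uint32_t mask)
{
	return (addr & mask) == (data & mask);
}
```

A /32 mask reproduces the old full-address match, while shorter prefixes let one rule cover a whole subnet.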
      
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bc3fc44c
    • net: dsa: bcm_sf2: Add support for IPv6 CFP rules · ba0696c2
      Florian Fainelli authored
      
      
      Inserting IPv6 CFP rules complicates the code a little bit in that we
      need to insert two rules side by side and chain them to match a full
      IPv6 tuple (src, dst IPv6 + port + protocol).
      
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ba0696c2
    • net: dsa: bcm_sf2: Simplify bcm_sf2_cfp_rule_get_all() · 4daa70cf
      Florian Fainelli authored
      
      
      There is no need to do a HW search of the TCAMs which is something slow
      and expensive. Since we already maintain a bitmask of active CFP rules,
      just iterate over those, starting from bit 1 (after the reserved entry)
      to get a count and index position to store the rule later on.
      
      As a result we can remove the code in bcm_sf2_cfp_rule_get() which acted
      on the "search" argument, and remove that argument.
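
Walking the software bitmask instead of searching the hardware could look like the following userspace sketch. The helper names are hypothetical, and the bit test is written out by hand rather than using the kernel's bitmap helpers.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Return 1 if the given bit is set in the bitmap, 0 otherwise. */
static int test_bit_ul(const unsigned long *map, size_t bit)
{
	return (map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}

/* Collect indices of active rules, skipping reserved bit 0.
 * Returns the number of rules; fills locs[] up to max_locs entries. */
static size_t collect_active_rules(const unsigned long *used, size_t nbits,
				   unsigned int *locs, size_t max_locs)
{
	size_t bit, count = 0;

	for (bit = 1; bit < nbits; bit++) {
		if (!test_bit_ul(used, bit))
			continue;
		if (count < max_locs)
			locs[count] = (unsigned int)bit;
		count++;
	}
	return count;
}
```

The loop yields both the rule count and the index of each rule in a single pass over memory, with no TCAM access.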
      
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4daa70cf
    • net: dsa: bcm_sf2: Make UDF slices more configurable · 5d80bcbb
      Florian Fainelli authored
      
      
      In preparation for introducing IPv6 rules support, make the
      cfp_udf_layout more flexible and match more accurately how the HW is
      designed: we have 3 + 1 slices per protocol, but we may not be using all
      of them and we are relative to a particular base offset (slice A for
      IPv4 for instance). Also populate the slice number that should be used
      (slice 1 for IPv4) based on the lookup function.
      
      Finally, we introduce two helper functions: udf_upper_bits() and
      udf_lower_bits() to help setting the UDF_n_* valid bits based on the
      number of UDFs valid within a slice. Update the IPv4 rule setting to
      make use of it to be more robust wrt. change in number of User Defined
      Fields being programmed.
      
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5d80bcbb
    • net: dsa: bcm_sf2: Move IPv4 CFP processing to specific functions · 33061458
      Florian Fainelli authored
      
      
      Move the processing of IPv4 rules into specific functions, allowing us
      to clearly identify which parts are generic and which ones are not. Also
      create a specific function to insert a rule into the action and policer
      RAMs as those tend to be fairly generic.
      
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      33061458
    • net: dsa: bcm_sf2: Use existing shift/masks · 39cdd349
      Florian Fainelli authored
      
      
      Instead of open coding the shift for the IP protocol, IP fragment bit
      etc. define and/or use existing constants to that end.
      
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      39cdd349
    • isdn/gigaset: Provide cardstate context for bas timer callbacks · 33ad61d0
      Kees Cook authored
      
      
      While the work callback uses the urb to find cardstate from bas_cardstate,
      this may not be valid for timer callbacks. Instead, introduce a direct
      pointer back to the cardstate from bas_cardstate for use in timer
      callbacks.
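
The back-pointer approach can be illustrated with a small userspace sketch. The struct layouts, the `timer_ctrl` member and `timer_to_cardstate()` are simplified stand-ins chosen for illustration, not the driver's definitions.

```c
#include <stddef.h>

/* Minimal userspace stand-ins (assumptions), not the kernel definitions. */
struct timer_list { int placeholder; };
struct cardstate { int minor_index; };

struct bas_cardstate {
	struct cardstate *cs;         /* direct back-pointer to the owner */
	struct timer_list timer_ctrl; /* hypothetical timer member */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* A timer callback only receives the timer; recover the enclosing
 * bas_cardstate first, then follow the back-pointer. */
static struct cardstate *timer_to_cardstate(struct timer_list *t)
{
	struct bas_cardstate *ucs =
		container_of(t, struct bas_cardstate, timer_ctrl);

	return ucs->cs;
}
```

The callback no longer depends on any urb state being valid at the time the timer fires.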
      
      Reported-by: Paul Bolle <pebolle@tiscali.nl>
      Fixes: 4cfea08e ("isdn/gigaset: Convert timers to use timer_setup()")
      Cc: Paul Bolle <pebolle@tiscali.nl>
      Cc: Karsten Keil <isdn@linux-pingi.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Johan Hovold <johan@kernel.org>
      Cc: gigaset307x-common@lists.sourceforge.net
      Cc: netdev@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      33ad61d0
    • selftests/bpf: fix broken build of test_maps · e27afb84
      Alexei Starovoitov authored
      fix multiple build errors and warnings
      
      1.
      test_maps.c: In function ‘test_map_rdonly’:
      test_maps.c:1051:30: error: ‘BPF_F_RDONLY’ undeclared (first use in this function)
              MAP_SIZE, map_flags | BPF_F_RDONLY);
      
      2.
      test_maps.c:1048:6: warning: unused variable ‘i’ [-Wunused-variable]
        int i, fd, key = 0, value = 0;
      
      3.
      test_maps.c:1087:2: error: called object is not a function or function pointer
        assert(bpf_map_lookup_elem(fd, &key, &value) == -1 && errno == EPERM);
      
      4.
      ./bpf_helpers.h:72:11: error: use of undeclared identifier 'BPF_FUNC_getsockopt'
              (void *) BPF_FUNC_getsockopt;
      
      Fixes: e043325b ("bpf: Add tests for eBPF file mode")
      Fixes: 6e71b04a ("bpf: Add file mode configuration into bpf maps")
      Fixes: cd86d1fd ("bpf: Adding helper function bpf_getsockops")
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e27afb84
  3. Oct 22, 2017
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · f8ddadc4
      David S. Miller authored
      
      
      There were quite a few overlapping sets of changes here.
      
      Daniel's bug fix for off-by-ones in the new BPF branch instructions,
      along with the added allowances for "data_end > ptr + x" forms
      collided with the metadata additions.
      
      Along with those three changes came verifier test cases, which in
      their final form I tried to group together properly.  If I had just
      trimmed GIT's conflict tags as-is, this would have split up the
      meta tests unnecessarily.
      
      In the socketmap code, a set of preemption disabling changes
      overlapped with the rename of bpf_compute_data_end() to
      bpf_compute_data_pointers().
      
      Changes were made to the mv88e6060.c driver set addr method
      which got removed in net-next.
      
      The hyperv transport socket layer had a locking change in 'net'
      which overlapped with a change of socket state macro usage
      in 'net-next'.
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f8ddadc4
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · b5ac3beb
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "A little more than usual this time around. Been travelling, so that is
        part of it.
      
        Anyways, here are the highlights:
      
         1) Deal with memcontrol races wrt. listener dismantle, from Eric
            Dumazet.
      
         2) Handle page allocation failures properly in nfp driver, from Jakub
            Kicinski.
      
         3) Fix memory leaks in macsec, from Sabrina Dubroca.
      
         4) Fix crashes in pppol2tp_session_ioctl(), from Guillaume Nault.
      
         5) Several fixes in bnxt_en driver, including preventing potential
            NVRAM parameter corruption from Michael Chan.
      
         6) Fix for KRACK attacks in wireless, from Johannes Berg.
      
         7) rtnetlink event generation fixes from Xin Long.
      
         8) Deadlock in mlxsw driver, from Ido Schimmel.
      
         9) Disallow arithmetic operations on context pointers in bpf, from
            Jakub Kicinski.
      
        10) Missing sock_owned_by_user() check in sctp_icmp_redirect(), from
            Xin Long.
      
        11) Only TCP is supported for sockmap, make that explicit with a
            check, from John Fastabend.
      
        12) Fix IP options state races in DCCP and TCP, from Eric Dumazet.
      
        13) Fix panic in packet_getsockopt(), also from Eric Dumazet.
      
        14) Add missing locking in hv_sock layer, from Dexuan Cui.
      
        15) Various aquantia bug fixes, including several statistics handling
            cures. From Igor Russkikh et al.
      
        16) Fix arithmetic overflow in devmap code, from John Fastabend.
      
        17) Fix busted socket memory accounting when we get a fault in the tcp
            zero copy paths. From Willem de Bruijn.
      
        18) Don't leave opt->tot_len uninitialized in ipv6, from Eric Dumazet"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
        stmmac: Don't access tx_q->dirty_tx before netif_tx_lock
        ipv6: flowlabel: do not leave opt->tot_len with garbage
        of_mdio: Fix broken PHY IRQ in case of probe deferral
        textsearch: fix typos in library helpers
        rxrpc: Don't release call mutex on error pointer
        net: stmmac: Prevent infinite loop in get_rx_timestamp_status()
        net: stmmac: Fix stmmac_get_rx_hwtstamp()
        net: stmmac: Add missing call to dev_kfree_skb()
        mlxsw: spectrum_router: Configure TIGCR on init
        mlxsw: reg: Add Tunneling IPinIP General Configuration Register
        net: ethtool: remove error check for legacy setting transceiver type
        soreuseport: fix initialization race
        net: bridge: fix returning of vlan range op errors
        sock: correct sk_wmem_queued accounting on efault in tcp zerocopy
        bpf: add test cases to bpf selftests to cover all access tests
        bpf: fix pattern matches for direct packet access
        bpf: fix off by one for range markings with L{T, E} patterns
        bpf: devmap fix arithmetic overflow in bitmap_size calculation
        net: aquantia: Bad udp rate on default interrupt coalescing
        net: aquantia: Enable coalescing management via ethtool interface
        ...
      b5ac3beb
    • stmmac: Don't access tx_q->dirty_tx before netif_tx_lock · 8d5f4b07
      Bernd Edlinger authored
      This is a possible cause of several hard-to-reproduce
      problems on my ARMv7-SMP test system.
      
      In recent kernels the symptoms are imprecise external aborts,
      and in older kernels various kinds of network stalls and
      unexpected page allocation failures.
      
      My testing indicates that the trouble started between v4.5 and v4.6
      and prevails up to v4.14.
      
      Using the dirty_tx before acquiring the spin lock is clearly
      wrong and was first introduced with v4.6.
      
      Fixes: e3ad57c9 ("stmmac: review RX/TX ring management")
      
      Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8d5f4b07