  1. Feb 11, 2017
    • Merge branch 'mlxsw-identical-routes-handling' · b4c4ebcf
      David S. Miller authored
      
      
      Jiri Pirko says:
      
      ====================
      mlxsw: Identical routes handling
      
      Ido says:
      
      The kernel can store several FIB aliases that share the same prefix and
      length. These aliases can differ in other parameters such as TOS and
      metric, which are taken into account during lookup.
      
      Offloading devices might not have the same flexibility, allowing only a
      single route with the same prefix and length to be reflected. mlxsw is
      one such device.
      
      This patchset aims to correctly handle this situation in the mlxsw
      driver. The first four patches introduce small changes in the IPv4 FIB
      code, so that listeners of the FIB notification chain will be able to
      correctly handle identical routes.
      
      The last three patches build on top of previous work and introduce the
      necessary changes in the mlxsw driver. The biggest change is the
      introduction of a FIB node, where identical routes are chained, instead
      of relying on primitive reference counting. This is explained in detail
      in the fifth patch.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b4c4ebcf
    • mlxsw: spectrum_router: Add support for route replace · 599cf8f9
      Ido Schimmel authored
      
      
      Upon the reception of an ENTRY_REPLACE notification, resolve the FIB
      node corresponding to the prefix and length and insert the new route
      before the first matching entry.
      
      Since the notification also signals the deletion of the replaced route,
      delete it from the driver's cache.
      
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      599cf8f9
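The replace handling described in this commit can be sketched in user-space C. This is only an illustrative model, not the driver's code: `struct cache_entry`, `cache_entry_matches()` and `cache_replace()` are hypothetical names, and "matching" is assumed to mean equal table ID, TOS and priority.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the entry list hanging off one FIB node. */
struct cache_entry {
	uint32_t tb_id; /* table ID */
	uint8_t tos;
	uint32_t prio;  /* priority (metric) */
	struct cache_entry *next;
};

/* Assumption: entries "match" when table ID, TOS and priority are equal. */
bool cache_entry_matches(const struct cache_entry *a,
			 const struct cache_entry *b)
{
	return a->tb_id == b->tb_id && a->tos == b->tos && a->prio == b->prio;
}

/*
 * Link new_entry in front of the first matching entry, then unlink that
 * entry. Returns the replaced entry so the caller can release it, or NULL
 * (list left untouched) when nothing matched.
 */
struct cache_entry *cache_replace(struct cache_entry **head,
				  struct cache_entry *new_entry)
{
	struct cache_entry **link = head;
	struct cache_entry *replaced;

	assert(head && new_entry);
	while (*link && !cache_entry_matches(*link, new_entry))
		link = &(*link)->next;

	replaced = *link;
	if (!replaced)
		return NULL;

	new_entry->next = replaced->next; /* new entry takes the old position */
	*link = new_entry;
	replaced->next = NULL;
	return replaced;
}
```

If the replaced entry was the head of the list, the new head is the one that would have to be programmed into the device.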
    • mlxsw: spectrum_router: Add support for route append · 4283bce5
      Ido Schimmel authored
      
      
      When a new route is appended, it's placed after existing routes sharing
      the same parameters (prefix, length, table ID, TOS and priority).
      
      While the device supports only one route with the same prefix and length
      in a single table, it's important to correctly place the appended route
      in the driver's cache, because when a route is deleted, the next one is
      programmed into the device.
      
      Following the reception of an ENTRY_APPEND notification, resolve the
      FIB node corresponding to the prefix and length and correctly place the
      new entry in its entry list.
      
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4283bce5
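A minimal sketch of the append placement, again as a user-space model rather than the driver's implementation. The names `struct route`, `route_sorts_before()`, `route_append()` and `route_delete()` are hypothetical, and the sort direction (higher table ID first, higher TOS first, lower priority first) is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct route {
	uint32_t tb_id;
	uint8_t tos;
	uint32_t prio;
	struct route *next;
};

/* Assumed order: table ID descending, TOS descending, priority ascending. */
bool route_sorts_before(const struct route *a, const struct route *b)
{
	if (a->tb_id != b->tb_id)
		return a->tb_id > b->tb_id;
	if (a->tos != b->tos)
		return a->tos > b->tos;
	return a->prio < b->prio;
}

/* Place new_route after all existing routes with the same parameters. */
void route_append(struct route **head, struct route *new_route)
{
	struct route **link = head;

	assert(head && new_route);
	/* Pass every entry the new route does not sort before, equals included. */
	while (*link && !route_sorts_before(new_route, *link))
		link = &(*link)->next;
	new_route->next = *link;
	*link = new_route;
}

/* Unlink victim; returns the new head, i.e. the route to program next. */
struct route *route_delete(struct route **head, struct route *victim)
{
	struct route **link = head;

	assert(head && victim);
	while (*link && *link != victim)
		link = &(*link)->next;
	if (*link) {
		*link = victim->next;
		victim->next = NULL;
	}
	return *head;
}
```

Keeping the cache in this order is what lets the delete path simply take the new head as the route to program into the device.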
    • mlxsw: spectrum_router: Correctly handle identical routes · 9aecce1c
      Ido Schimmel authored
      
      
      In the device, routes are indexed in a routing table based on the prefix
      and its length. This is in contrast to the kernel's FIB where several
      FIB aliases can exist with these parameters being identical. In such
      cases, the routes will be sorted by table ID (LOCAL first, then MAIN),
      TOS and finally priority (metric).
      
      During lookup, these routes will be evaluated in order. In case the
      packet's TOS field is non-zero and a FIB alias with a matching TOS is
      found, then it's selected. Otherwise, the lookup defaults to the route
      with TOS 0 (if it exists). However, if the requested scope is narrower
      than the one found, then the lookup continues.
      
      To best reflect the kernel's datapath we should take the above into
      account. Given a prefix and its length, the reflected route will always
      be the first one in the FIB alias list. However, if the route has a
      non-zero TOS then its action will be converted to trap instead of
      forward, since we currently don't support TOS-based routing. If this
      turns out to be a real issue, we can add support for that using
      policy-based switching.
      
      The route's scope can be effectively ignored, as any packet being routed
      by the device would have been looked up using the widest scope (UNIVERSE).

      To achieve that, we need to make two changes. First, we need to create
      another struct (FIB node) that will hold the list of FIB entries sharing
      the same prefix and length. This struct will be hashed using these two
      parameters.

      Second, we need to change the route reflection to match the above
      logic, so that the first FIB entry in the list will be programmed into
      the device while the rest will remain in the driver's cache in case of
      subsequent changes.
      
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9aecce1c
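The two changes described in this commit (a node hashed by prefix and length, and reflection of only the first entry) might look roughly like the following user-space sketch. All `fnode_*` names, the bucket count and the hash function are hypothetical, and the sort direction within a node is an assumption; the real driver structures are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FNODE_BUCKETS 16

enum fnode_action { FNODE_ACTION_FORWARD, FNODE_ACTION_TRAP };

struct fnode_entry {
	uint32_t tb_id; /* LOCAL is 255 and MAIN is 254, so higher sorts first */
	uint8_t tos;
	uint32_t prio;  /* priority (metric) */
	struct fnode_entry *next;
};

struct fnode {
	uint32_t prefix;
	uint8_t prefix_len;
	struct fnode_entry *entries; /* sorted; the head is the reflected route */
	struct fnode *hash_next;
};

struct fnode_table {
	struct fnode *buckets[FNODE_BUCKETS];
};

/* Nodes are keyed by prefix and prefix length only. */
unsigned int fnode_hash(uint32_t prefix, uint8_t prefix_len)
{
	return (prefix * 31u + prefix_len) % FNODE_BUCKETS;
}

void fnode_table_add(struct fnode_table *t, struct fnode *node)
{
	unsigned int b = fnode_hash(node->prefix, node->prefix_len);

	node->hash_next = t->buckets[b];
	t->buckets[b] = node;
}

struct fnode *fnode_table_lookup(const struct fnode_table *t,
				 uint32_t prefix, uint8_t prefix_len)
{
	struct fnode *node = t->buckets[fnode_hash(prefix, prefix_len)];

	while (node && (node->prefix != prefix || node->prefix_len != prefix_len))
		node = node->hash_next;
	return node;
}

/* Assumed order: table ID descending, TOS descending, priority ascending. */
bool fnode_entry_before(const struct fnode_entry *a,
			const struct fnode_entry *b)
{
	if (a->tb_id != b->tb_id)
		return a->tb_id > b->tb_id;
	if (a->tos != b->tos)
		return a->tos > b->tos;
	return a->prio < b->prio;
}

void fnode_insert(struct fnode *node, struct fnode_entry *e)
{
	struct fnode_entry **link;

	assert(node && e);
	link = &node->entries;
	while (*link && fnode_entry_before(*link, e))
		link = &(*link)->next;
	e->next = *link;
	*link = e;
}

/* Only the first entry is reflected; a non-zero TOS turns it into a trap. */
bool fnode_reflected_action(const struct fnode *node,
			    enum fnode_action *action)
{
	if (!node->entries)
		return false;
	*action = node->entries->tos ? FNODE_ACTION_TRAP : FNODE_ACTION_FORWARD;
	return true;
}
```

Scope is deliberately absent from the entry in this model, following the commit's reasoning that routed packets are looked up with the widest scope.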
    • ipv4: fib: Add events for FIB replace and append · 2f3a5272
      Ido Schimmel authored
      
      
      The FIB notification chain currently uses the NLM_F_{REPLACE,APPEND}
      flags to signal routes being replaced or appended.
      
      Instead of using netlink flags for in-kernel notifications we can simply
      introduce two new events in the FIB notification chain. This has the
      added advantage of making the API cleaner, thereby making it clear that
      these events should be supported by listeners of the notification chain.
      
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      CC: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2f3a5272
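To illustrate the difference between flag-based and event-based signalling, here is a small user-space sketch. The `evt_*` names are hypothetical and do not reproduce the kernel's notifier API; the two flag values mirror `NLM_F_REPLACE` and `NLM_F_APPEND` from `<linux/netlink.h>`. Checking the replace flag before the append flag is an arbitrary choice made for this sketch.

```c
#include <assert.h>

/* Values mirror NLM_F_REPLACE and NLM_F_APPEND from <linux/netlink.h>. */
#define EVT_NLM_F_REPLACE 0x100u
#define EVT_NLM_F_APPEND  0x800u

/* Hypothetical event set with dedicated replace and append events. */
enum evt_type {
	EVT_ENTRY_REPLACE,
	EVT_ENTRY_APPEND,
	EVT_ENTRY_ADD,
	EVT_ENTRY_DEL
};

/* What a listener previously had to infer from the netlink flags. */
enum evt_type evt_from_nlflags(unsigned int nlflags)
{
	if (nlflags & EVT_NLM_F_REPLACE)
		return EVT_ENTRY_REPLACE;
	if (nlflags & EVT_NLM_F_APPEND)
		return EVT_ENTRY_APPEND;
	return EVT_ENTRY_ADD;
}

struct evt_listener {
	unsigned int replaced, appended, added, deleted;
};

/* Returns 0 when handled and -1 for an event this listener does not know. */
int evt_listener_notify(struct evt_listener *l, enum evt_type type)
{
	assert(l);
	switch (type) {
	case EVT_ENTRY_REPLACE:
		l->replaced++;
		return 0;
	case EVT_ENTRY_APPEND:
		l->appended++;
		return 0;
	case EVT_ENTRY_ADD:
		l->added++;
		return 0;
	case EVT_ENTRY_DEL:
		l->deleted++;
		return 0;
	}
	return -1;
}
```

With dedicated events, each case a listener must support is visible in the `switch`, rather than hidden behind a flag test inside the add handler.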
    • ipv4: fib: Send notification before deleting FIB alias · 5b7d616d
      Ido Schimmel authored
      
      
      When a FIB alias is replaced following NLM_F_REPLACE, the ENTRY_ADD
      notification is sent after the reference on the previous FIB info was
      dropped. This is problematic, as potential listeners might need to
      access the previous FIB info in their notification blocks.
      
      Solve this by sending the notification prior to the deletion of the
      replaced FIB alias. This is consistent with ENTRY_DEL notifications.
      
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      CC: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5b7d616d
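The ordering fix can be modelled in a few lines of user-space C. The sketch below is an assumption-laden illustration, not the kernel code: `repl_*` names are hypothetical, and a `released` flag stands in for actually freeing the object so that the ordering can be observed safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct repl_info {
	int refcnt;
	bool released; /* stands in for the object having been freed */
	int nh_id;     /* some field a listener may want to read */
};

struct repl_alias {
	struct repl_info *info;
};

/* Listener callback; it may inspect the alias that is being replaced. */
typedef void (*repl_notify_fn)(const struct repl_alias *old_alias,
			       const struct repl_alias *new_alias,
			       void *priv);

void repl_info_put(struct repl_info *info)
{
	assert(info && info->refcnt > 0);
	if (--info->refcnt == 0)
		info->released = true;
}

/* Replace *slot with new_alias: notify first, drop the old reference after. */
void repl_alias_replace(struct repl_alias **slot, struct repl_alias *new_alias,
			repl_notify_fn notify, void *priv)
{
	struct repl_alias *old_alias;

	assert(slot && *slot && new_alias && notify);
	old_alias = *slot;
	notify(old_alias, new_alias, priv); /* old info is still referenced */
	*slot = new_alias;
	repl_info_put(old_alias->info);
}

/* Example listener that records what it could see of the old alias. */
struct repl_probe {
	bool old_was_live;
	int old_nh_id;
};

void repl_probe_notify(const struct repl_alias *old_alias,
		       const struct repl_alias *new_alias, void *priv)
{
	struct repl_probe *probe = priv;

	(void)new_alias;
	probe->old_was_live = !old_alias->info->released;
	probe->old_nh_id = old_alias->info->nh_id;
}
```

Swapping the `notify()` and `repl_info_put()` calls would reproduce the problem described in the commit: the listener would observe an already released object.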
    • ipv4: fib: Send deletion notification with actual FIB alias type · 42d5aa76
      Ido Schimmel authored
      
      
      When a FIB alias is removed, a notification is sent using the type
      passed from user space (which can be RTN_UNSPEC) instead of the actual
      type of the removed alias. This is problematic for listeners of the FIB
      notification chain, as several FIB aliases can exist whose parameters
      all match except for the type.
      
      Solve this by passing the actual type of the removed FIB alias.
      
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      CC: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      42d5aa76
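A rough user-space sketch of the fix: the deletion request may carry an unspecified type, but the notification reports the type of the alias actually removed. The `del_*` names and the array-based storage are hypothetical; the enum values mirror `RTN_UNSPEC`, `RTN_UNICAST` and `RTN_LOCAL` from `<linux/rtnetlink.h>`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values mirror RTN_UNSPEC, RTN_UNICAST and RTN_LOCAL. */
enum del_rtn_type {
	DEL_RTN_UNSPEC = 0,
	DEL_RTN_UNICAST = 1,
	DEL_RTN_LOCAL = 2
};

struct del_alias {
	uint32_t tb_id;
	uint8_t tos;
	uint8_t type;
	bool present;
};

/*
 * Remove the first alias matching the request. A request type of
 * DEL_RTN_UNSPEC matches any alias type. On success the index of the
 * removed alias is returned and *notified_type receives the alias's own
 * type, which is what a deletion notification should carry.
 * Returns -1 when nothing matched.
 */
int del_alias_remove(struct del_alias *aliases, size_t n, uint32_t tb_id,
		     uint8_t tos, uint8_t req_type, uint8_t *notified_type)
{
	size_t i;

	assert((aliases || n == 0) && notified_type);
	for (i = 0; i < n; i++) {
		struct del_alias *fa = &aliases[i];

		if (!fa->present || fa->tb_id != tb_id || fa->tos != tos)
			continue;
		if (req_type != DEL_RTN_UNSPEC && fa->type != req_type)
			continue;
		*notified_type = fa->type; /* actual type, not req_type */
		fa->present = false;
		return (int)i;
	}
	return -1;
}
```

A listener that keys its cache on the route type can then find the right entry even when user space asked for the deletion without specifying one.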
    • ipv4: fib: Only flush FIB aliases belonging to currently flushed table · 58e3bdd5
      Ido Schimmel authored
      
      
      If the MAIN table is flushed while its trie is shared with the LOCAL
      table, we might flush FIB aliases belonging to the latter. This can
      lead to FIB_ENTRY_DEL notifications being sent with the wrong table
      ID.
      
      The above doesn't affect current listeners, as the table ID is ignored
      during entry deletion, but this will change later in the patchset.
      
      When flushing a particular table, skip any aliases belonging to a
      different one.
      
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      CC: Alexander Duyck <alexander.h.duyck@intel.com>
      CC: Patrick McHardy <kaber@trash.net>
      Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      58e3bdd5
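The table check during a flush can be sketched as follows. This is a simplified user-space model under stated assumptions: the shared trie is reduced to a flat array, a `dead` flag marks aliases eligible for flushing, and the `flush_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct flush_alias {
	uint32_t tb_id; /* table the alias belongs to */
	bool dead;      /* eligible for flushing */
	bool present;   /* still linked in the shared structure */
};

/*
 * Flush dead aliases of table tb_id from storage shared with another
 * table. Aliases of a different table are skipped, so any deletion
 * notification emitted here would carry the flushed table's ID.
 * Returns the number of aliases removed.
 */
size_t flush_table(struct flush_alias *aliases, size_t n, uint32_t tb_id)
{
	size_t flushed = 0;
	size_t i;

	assert(aliases || n == 0);
	for (i = 0; i < n; i++) {
		struct flush_alias *fa = &aliases[i];

		if (!fa->present || !fa->dead)
			continue;
		if (fa->tb_id != tb_id)
			continue; /* belongs to the table sharing the storage */
		fa->present = false;
		flushed++;
	}
	return flushed;
}
```

Without the `tb_id` comparison, flushing table 254 (MAIN) in this model would also remove dead aliases of table 255 (LOCAL).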
  2. Feb 10, 2017