  1. Jul 26, 2016
  2. Jul 25, 2016
    • Merge branch 'mlxsw-port-mirroring' · bc0c419e
      David S. Miller authored
      
      
      Jiri Pirko says:
      
      ====================
      mlxsw: implement port mirroring offload
      
      This patchset introduces the tc matchall classifier and its offload
      to Spectrum hardware. In combination with the mirred action, the defined
      port mirroring setup is offloaded by the mlxsw/spectrum driver.
      
      The commands used for creating mirror ports:
      
      tc qdisc  add dev eth25 handle ffff: ingress
      tc filter add dev eth25 parent ffff:            \
              matchall skip_sw                        \
              action mirred egress mirror             \
              dev eth27
      
      tc qdisc add dev eth25 handle 1: root prio
      tc filter add dev eth25 parent 1:               \
              matchall skip_sw                        \
              action mirred egress mirror             \
              dev eth27
      
      These patches contain:
       - Resource query implementation
       - Hardware port mirroring support for spectrum.
       - Definition of the matchall traffic classifier.
       - General support for hw-offloading for that classifier.
       - Specific spectrum implementation for matchall offloading.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Add support in matchall mirror TC offloading · 763b4b70
      Yotam Gigi authored
      
      
      This patch offloads port mirroring directives to hw using the matchall TC
      with action mirror. It includes both the implementation of the
      ndo_setup_tc function for the spectrum driver and the spectrum hardware
      offload configuration code.
      
      The hardware offload code consists of two new functions that add and
      remove a pair of mirror ports. This is done using the MPAT, MPAR and SBIB
      registers:
       - A new Switch-Port Analyzer (SPAN) entry is added using MPAT to the 'to'
         port.
       - The 'to' port is bound to the SPAN entry using MPAR register.
       - In case of egress SPAN, the 'to' port gets a new internal shared
         buffer using SBIB register.
      
      In addition, a new database was added to the mlxsw_sp struct to store all
      the SPAN entries and their bound ports list. The number of supported SPAN
      entries is determined by resource query.
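
      The SPAN bookkeeping described above can be pictured with a small
      user-space model. This is only a sketch under stated assumptions: the
      names span_table, span_entry_get and span_entry_put, the fixed table
      size and the reference-counting scheme are hypothetical and are not
      taken from the driver, and the register writes (MPAT, MPAR, SBIB) are
      reduced to comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: the real driver learns this limit from the resource query. */
#define SPAN_MAX_ENTRIES 3

struct span_entry {
	bool used;
	int to_port;     /* analyzer ('to') port of this entry */
	int bound_ports; /* number of mirrored ports bound to this entry */
};

struct span_table {
	struct span_entry entries[SPAN_MAX_ENTRIES];
};

/* Find or create the entry for to_port; returns its index, or -1 if full. */
static int span_entry_get(struct span_table *t, int to_port)
{
	int free_idx = -1;

	assert(t != NULL);
	for (int i = 0; i < SPAN_MAX_ENTRIES; i++) {
		if (t->entries[i].used && t->entries[i].to_port == to_port) {
			t->entries[i].bound_ports++;
			return i;
		}
		if (!t->entries[i].used && free_idx < 0)
			free_idx = i;
	}
	if (free_idx < 0)
		return -1;
	/* Real hardware: this is where the SPAN entry would be written. */
	t->entries[free_idx].used = true;
	t->entries[free_idx].to_port = to_port;
	t->entries[free_idx].bound_ports = 1;
	return free_idx;
}

/* Drop one binding; release the entry when the last bound port is gone. */
static void span_entry_put(struct span_table *t, int idx)
{
	assert(t != NULL && idx >= 0 && idx < SPAN_MAX_ENTRIES);
	assert(t->entries[idx].used && t->entries[idx].bound_ports > 0);
	if (--t->entries[idx].bound_ports == 0)
		t->entries[idx].used = false;
}
```

      In this model, two mirrored ports that share the same analyzer port
      reuse one entry, and a request beyond the table size fails instead of
      overwriting an existing entry.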
      
      Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: act_mirred: Add helper inlines to access tcf_mirred info. · 56a20680
      Yotam Gigi authored
      
      
      The helper function is_tcf_mirred_mirror helps determine whether an action
      struct is of type mirred and is configured to be of type mirror.
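
      The check performed by such a helper can be illustrated with mock
      types. The sketch below is not the kernel code: struct mock_action,
      the two enums and mock_is_mirred_mirror are hypothetical stand-ins
      that only model the two conditions named above (the action is a
      mirred action, and it is configured as mirror rather than redirect).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's action types and constants. */
enum mock_act_type { MOCK_ACT_GACT, MOCK_ACT_MIRRED };
enum mock_mirred_eaction { MOCK_EGRESS_REDIR, MOCK_EGRESS_MIRROR };

struct mock_action {
	enum mock_act_type type;
	enum mock_mirred_eaction eaction; /* meaningful only for MOCK_ACT_MIRRED */
};

/* True only if the action is of type mirred AND is set to mirror. */
static inline bool mock_is_mirred_mirror(const struct mock_action *a)
{
	assert(a != NULL);
	return a->type == MOCK_ACT_MIRRED && a->eaction == MOCK_EGRESS_MIRROR;
}
```

      A driver-side caller would use a predicate of this shape to decide
      whether a given action can be offloaded as port mirroring.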
      
      Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: reg: Add the Monitoring Port Analyzer register · 23019054
      Yotam Gigi authored
      
      
      The MPAR register is used to bind ports to a SPAN entry (which was
      created using MPAT register) and thus mirror their traffic (ingress /
      egress) to a different port.
      
      Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: reg: Add Monitoring Port Analyzer Table register · 43a46856
      Yotam Gigi authored
      
      
      The MPAT register is used to query and configure the Switch Port Analyzer
      (SPAN) table. This register is used to configure a port as a mirror output
      port; after that, a mirrored input port can be bound using the MPAR
      register.
      
      Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: reg: Add Shared Buffer Internal Buffer register · 51ae8cc6
      Yotam Gigi authored
      
      
      The SBIB register configures a per-port buffer for internal use. This
      register is used to configure an egress mirror buffer on the egress port
      which does the mirroring.
      
      Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: Add match-all classifier hw offloading. · b87f7936
      Yotam Gigi authored
      
      
      Following the work that has been done on offloading classifiers like u32
      and flower, hw offloading of the match-all classifier is now possible if
      the interface supports tc offloading.
      
      To control the offloading, two tc flags have been introduced: skip_sw and
      skip_hw. Typical usage:
      
      tc filter add dev eth25 parent ffff: 	\
      	matchall skip_sw		\
      	action mirred egress mirror	\
      	dev eth27
      
      Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: introduce Match-all classifier · bf3994d2
      Jiri Pirko authored
      
      
      The matchall classifier matches every packet and allows the user to apply
      actions on it. This filter is very useful in use cases where every packet
      should be matched. For example, packet mirroring (SPAN) can be set up very
      easily using this filter.
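
      The idea of a classifier that inspects nothing and always matches can
      be sketched in a few lines of user-space C. Everything here is
      illustrative: struct packet, struct matchall_filter, matchall_classify
      and count_mirrored are hypothetical names, not the kernel's classifier
      interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct packet {
	const unsigned char *data;
	size_t len;
};

/* An action receives every packet the classifier matched. */
typedef void (*action_fn)(const struct packet *pkt, void *ctx);

struct matchall_filter {
	action_fn action;
	void *ctx;
};

/* No header fields are inspected: every packet is a match. */
static bool matchall_classify(const struct matchall_filter *f,
			      const struct packet *pkt)
{
	assert(f != NULL && pkt != NULL);
	if (f->action)
		f->action(pkt, f->ctx);
	return true;
}

/* Example action: count the packets that would be mirrored. */
static void count_mirrored(const struct packet *pkt, void *ctx)
{
	(void)pkt;
	(*(unsigned long *)ctx)++;
}
```

      Attaching count_mirrored to a filter and passing any packets through
      matchall_classify increments the counter once per packet, regardless
      of packet contents.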
      
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: pci: Add max span resources to resources query · ded821c8
      Nogah Frankel authored
      
      
      Add max span resources to resources query.
      
      Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: pci: Add resources query implementation. · 57d316ba
      Nogah Frankel authored
      
      
      Add resources query implementation. If the query is available, ask the HW
      for its built-in resources instead of keeping them as constants in the code.
      
      Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cdc_ether: Improve ZTE MF823/831/910 handling · bfe9b9d2
      Kristian Evensen authored
      
      
      The firmware in several ZTE devices (at least the MF823/831/910
      modems/mifis) uses OS fingerprinting to determine which type of device to
      export. In addition, these devices export a REST API which can be used to
      control the type of device. So far, on Linux, the devices have been seen as
      RNDIS or CDC Ether.
      
      When CDC Ether is used, devices of the same type are, as with RNDIS,
      exported with the same, bogus random MAC address. In addition, the devices
      (at least on all firmware revisions I have found) use the bogus MAC when
      sending traffic routed from external networks. And as a final feature, the
      devices sometimes export the link state incorrectly. There are also
      references online to several other ZTE devices displaying this behavior,
      with several different PIDs and MAC addresses.
      
      This patch tries to improve the handling of ZTE devices by doing the
      following:
      
      * Create a new driver_info-struct that is used by ZTE devices that do not
      have an explicit entry in the product table. This struct is the same as the
      default cdc_ether driver info, but a new bind- and an rx_fixup-function
      have been added.
      
      * In the new bind function, we check if we have read a random MAC from the
      device. If we have, then we generate a new random MAC address. This will
      ensure that all devices get a unique MAC.
      
      * The rx_fixup-function replaces the destination MAC address in the skb
      with that of the device. I have not seen a revision of these devices that
      behaves correctly (i.e., sets the right destination MAC), so I chose not to
      do any comparison with for example the known, bogus addresses.
      
      * The MF823/MF831/MF910 sometimes export cdc carrier on twice on connect
      (the correct behavior is off then on). Work around this by manually setting
      carrier to off if an on-notification is received and the NOCARRIER-bit is
      not set.
      
      This change will affect all devices, but it should take care of similar
      mistakes made by other manufacturers. I tried to think of/look/test for
      problems/regressions that could be introduced by this behavior, but could
      not find any. However, my familiarity with this code path is not that
      great, so there could be something I have overlooked.
      
      I have tested this patch with multiple revisions of all three devices, and
      they behave as expected. In other words, they all got a valid, random MAC,
      the correct operational state and I can receive/send traffic without
      problems. I also tested with some other cdc_ether devices I have and did
      not find any problems/regressions caused by the two general changes.
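
      Two of the workarounds described above, the destination MAC rewrite
      on receive and the carrier toggle, can be sketched as stand-alone
      helpers. This is a user-space illustration only: fixup_dst_mac,
      must_force_carrier_off and MAC_LEN are hypothetical names, and a raw
      byte buffer stands in for the skb.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAC_LEN 6 /* length of an Ethernet MAC address in bytes */

/*
 * Overwrite the destination MAC, which occupies the first six bytes of an
 * Ethernet frame, with the device's own address. Returns false if the
 * buffer is too short to hold a destination address.
 */
static bool fixup_dst_mac(unsigned char *frame, size_t len,
			  const unsigned char dev_mac[MAC_LEN])
{
	assert(dev_mac != NULL);
	if (frame == NULL || len < MAC_LEN)
		return false;
	memcpy(frame, dev_mac, MAC_LEN);
	return true;
}

/*
 * An "on" notification that arrives while the carrier is already on
 * (no-carrier bit not set) means the "off" was skipped, so the carrier
 * should be forced off before it is turned on again.
 */
static bool must_force_carrier_off(bool carrier_is_on, bool notification_is_on)
{
	return notification_is_on && carrier_is_on;
}
```

      Only the destination address is touched; the source address and the
      rest of the frame are left as received.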
      
      v3->v4:
      * Forgot to remove unused variables, sorry about that (thanks David
      Miller).
      
      v2->v3:
      * I had forgotten to remove the random MAC generation from usbnet_cdc_bind()
      (thanks Oliver).
      * Rework logic in the ZTE bind-function a bit.
      
      v1->v2:
      * Only generate random MAC for ZTE devices (thanks Oliver Neukum).
      * Set random MAC and do RX fixup for all ZTE devices that do not have a
      product-entry, as the bogus MAC has been seen on devices with several
      different PIDs/MAC addresses. In other words, it seems to be the default
      behavior of ZTE CDC Ether devices (thanks Lars Melin).
      
      Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
      Acked-by: Oliver Neukum <oneukum@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · c42d7121
      David S. Miller authored
      
      
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS updates for net-next
      
      The following patchset contains Netfilter/IPVS updates for net-next,
      they are:
      
      1) Count pre-established connections as active in "least connection"
         schedulers to avoid overloading backend servers on peak demands,
         from Michal Kubecek via Simon Horman.
      
      2) Address a race condition when resizing the conntrack table by caching
         the bucket size when fully iterating over the hashtable in these
         three possible scenarios: 1) dump via /proc/net/nf_conntrack,
         2) unlinking userspace helper and 3) unlinking custom conntrack timeout.
         From Liping Zhang.
      
      3) Revisit early_drop() path to perform lockless traversal on conntrack
         eviction under stress, use del_timer() as synchronization point to
         avoid two CPUs evicting the same entry, from Florian Westphal.
      
      4) Move NAT hlist_head to nf_conn object, this simplifies the existing
         NAT extension and it doesn't increase size since recent patches to
         align nf_conn, from Florian.
      
      5) Use rhashtable for the by-source NAT hashtable, also from Florian.
      
      6) Don't allow --physdev-is-out from OUTPUT chain, just as
         --physdev-out is not allowed either, from Hangbin Liu.
      
      7) Automagically set on nf_conntrack counters if the user tries to
         match ct bytes/packets from nftables, from Liping Zhang.
      
      8) Remove possible_net_t fields in nf_tables set objects since we just
         simply pass the net pointer to the backend set type implementations.
      
      9) Fix possible off-by-one in h323, from Toby DiPasquale.
      
      10) early_drop() may be called from the ctnetlink path, so we must hold
          the rcu read side lock from there too, this amends Florian's patch #3
          coming in this batch, from Liping Zhang.
      
      11) Use binary search to validate jump offset in x_tables, this
          addresses the O(n!) validation that was introduced recently to
          resolve security issues with unprivileged namespaces, from Florian.
      
      12) Fix reference leak to connlabel in error path of nft_ct, from Zhang.
      
      13) Three updates for nft_log: Fix log prefix leak in error path. Bail
          out on loglevel larger than debug in nft_log and set on the new
          NF_LOG_F_COPY_LEN flag when snaplen is specified. Again from Zhang.
      
      14) Allow to filter rule dumps in nf_tables based on table and chain
          names.
      
      15) Simplify connlabel to always use 128 bits to store labels and
          get rid of unused function in xt_connlabel, from Florian.
      
      16) Replace set_expect_timeout() by mod_timer() from the h323 conntrack
          helper, by Gao Feng.
      
      17) Put back x_tables module reference in nft_compat on error, from
          Liping Zhang.
      
      18) Add a reference count to the x_tables extensions cache in
          nft_compat, so we can remove them when unused and avoid a crash
          if the extensions are rmmod, again from Zhang.
      ====================
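
      Item 11 above relies on binary search to check a jump offset against
      the set of valid rule offsets. A minimal sketch of that technique,
      assuming the valid offsets have already been collected into a sorted
      array, could look as follows; offset_is_valid and its parameters are
      hypothetical and do not reflect the x_tables data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if target is one of the known rule offsets.
 * offsets must be sorted in ascending order; each lookup is O(log n).
 */
static bool offset_is_valid(const unsigned int *offsets, size_t n,
			    unsigned int target)
{
	size_t lo = 0, hi = n; /* search the half-open range [lo, hi) */

	assert(offsets != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (offsets[mid] == target)
			return true;
		if (offsets[mid] < target)
			lo = mid + 1;
		else
			hi = mid;
	}
	return false;
}
```

      With the offsets sorted once up front, validating every jump costs a
      logarithmic lookup instead of a walk over all rules.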
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. Jul 24, 2016
  4. Jul 23, 2016
    • netfilter: nft_compat: fix crash when related match/target module is removed · 4b512e1c
      Liping Zhang authored
      
      
      We "cache" the loaded match/target modules and reuse them, but when the
      modules are removed, we still point to them. Then we may end up with
      invalid memory references when using iptables-compat to add rules later.
      
      Entering the following commands reproduces the kernel crash:
        # iptables-compat -A INPUT -j LOG
        # iptables-compat -D INPUT -j LOG
        # rmmod xt_LOG
        # iptables-compat -A INPUT -j LOG
        BUG: unable to handle kernel paging request at ffffffffa05a9010
        IP: [<ffffffff813f783e>] strcmp+0xe/0x30
        Call Trace:
        [<ffffffffa05acc43>] nft_target_select_ops+0x83/0x1f0 [nft_compat]
        [<ffffffffa058a177>] nf_tables_expr_parse+0x147/0x1f0 [nf_tables]
        [<ffffffffa058e541>] nf_tables_newrule+0x301/0x810 [nf_tables]
        [<ffffffff8141ca00>] ? nla_parse+0x20/0x100
        [<ffffffffa057fa8f>] nfnetlink_rcv+0x33f/0x53d [nfnetlink]
        [<ffffffffa057f94b>] ? nfnetlink_rcv+0x1fb/0x53d [nfnetlink]
        [<ffffffff817116b8>] netlink_unicast+0x178/0x220
        [<ffffffff81711a5b>] netlink_sendmsg+0x2fb/0x3a0
        [<ffffffff816b7fc8>] sock_sendmsg+0x38/0x50
        [<ffffffff816b8a7e>] ___sys_sendmsg+0x28e/0x2a0
        [<ffffffff816bcb7e>] ? release_sock+0x1e/0xb0
        [<ffffffff81804ac5>] ? _raw_spin_unlock_bh+0x35/0x40
        [<ffffffff816bcbe2>] ? release_sock+0x82/0xb0
        [<ffffffff816b93d4>] __sys_sendmsg+0x54/0x90
        [<ffffffff816b9422>] SyS_sendmsg+0x12/0x20
        [<ffffffff81805172>] entry_SYSCALL_64_fastpath+0x1a/0xa9
      
      So when nobody uses the related match/target module, there's no need to
      "cache" it. nft_[match|target]_release are no longer needed either, so
      remove them.
      
      Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nft_compat: put back match/target module if init fail · 2bf4fade
      Liping Zhang authored
      
      
      If the user specifies an invalid NFTA_MATCH_INFO/NFTA_TARGET_INFO attr
      or a memory allocation fails, we should call module_put on the related match
      or target. Otherwise, we cannot remove the module even if nobody uses it.
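
      The pattern of dropping a module reference on every failure path can
      be modelled in user space. The sketch below is an assumption-laden
      illustration: struct mock_module, mock_module_get, mock_module_put
      and mock_expr_init are hypothetical, and a plain counter stands in
      for the kernel's module reference count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct mock_module {
	int refcnt; /* stand-in for the module reference count */
};

static void mock_module_get(struct mock_module *m)
{
	m->refcnt++;
}

static void mock_module_put(struct mock_module *m)
{
	assert(m->refcnt > 0);
	m->refcnt--;
}

/*
 * Take a reference, then validate the input and allocate private data.
 * Returns 0 on success (reference kept) or -1 on failure, in which case
 * the reference is put back so the module can still be removed.
 */
static int mock_expr_init(struct mock_module *m, const void *info,
			  size_t info_len, void **priv)
{
	mock_module_get(m);
	if (info == NULL || info_len == 0)
		goto err; /* invalid attribute */
	*priv = malloc(info_len);
	if (*priv == NULL)
		goto err; /* memory allocation failed */
	return 0;
err:
	mock_module_put(m);
	return -1;
}
```

      After a failed mock_expr_init the counter is back at its previous
      value, which is the property the fix restores for the real module.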
      
      Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: h323: Use mod_timer instead of set_expect_timeout · 96d1327a
      Gao Feng authored
      
      
      Simplify the code without any side effect. set_expect_timeout is
      used to modify the timer's expiry time. It tries to delete the timer and
      add it again, so we can use mod_timer directly.
      
      Signed-off-by: Gao Feng <fgao@ikuai8.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 107df032
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix memory leak in nftables, from Liping Zhang.
      
       2) Need to check result of vlan_insert_tag() in batman-adv otherwise we
          risk NULL skb derefs, from Sven Eckelmann.
      
       3) Check for dev_alloc_skb() failures in cfg80211, from Gregory
          Greenman.
      
       4) Handle properly when we have ppp_unregister_channel() happening in
          parallel with ppp_connect_channel(), from WANG Cong.
      
       5) Fix DCCP deadlock, from Eric Dumazet.
      
       6) Bail out properly in UDP if sk_filter() truncates the packet to be
          smaller than even the space that the protocol headers need.  From
          Michal Kubecek.
      
       7) Similarly for rose, dccp, and sctp, from Willem de Bruijn.
      
       8) Make TCP challenge ACKs less predictable, from Eric Dumazet.
      
       9) Fix infinite loop in bgmac_dma_tx_add() from Florian Fainelli.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
        packet: propagate sock_cmsg_send() error
        net/mlx5e: Fix del vxlan port command buffer memset
        packet: fix second argument of sock_tx_timestamp()
        net: switchdev: change ageing_time type to clock_t
        Update maintainer for EHEA driver.
        net/mlx4_en: Add resilience in low memory systems
        net/mlx4_en: Move filters cleanup to a proper location
        sctp: load transport header after sk_filter
        net/sched/sch_htb: clamp xstats tokens to fit into 32-bit int
        net: cavium: liquidio: Avoid dma_unmap_single on uninitialized ndata
        net: nb8800: Fix SKB leak in nb8800_receive()
        et131x: Fix logical vs bitwise check in et131x_tx_timeout()
        vlan: use a valid default mtu value for vlan over macsec
        net: bgmac: Fix infinite loop in bgmac_dma_tx_add()
        mlxsw: spectrum: Prevent invalid ingress buffer mapping
        mlxsw: spectrum: Prevent overwrite of DCB capability fields
        mlxsw: spectrum: Don't emit errors when PFC is disabled
        mlxsw: spectrum: Indicate support for autonegotiation
        mlxsw: spectrum: Force link training according to admin state
        r8152: add MODULE_VERSION
        ...
    • Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs · 88083e98
      Linus Torvalds authored
      Pull overlayfs fixes from Miklos Szeredi:
       "This contains a fix for a potential crash/corruption issue and another
        where the suid/sgid bits weren't cleared on write"
      
      * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
        ovl: verify upper dentry in ovl_remove_and_whiteout()
        ovl: Copy up underlying inode's ->i_mode to overlay inode
        ovl: handle ATTR_KILL*
    • Merge branch 'akpm' (patches from Andrew) · b1386ced
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "Five fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        pps: do not crash when failed to register
        tools/vm/slabinfo: fix an unintentional printf
        testing/radix-tree: fix a macro expansion bug
        radix-tree: fix radix_tree_iter_retry() for tagged iterators.
        mm: memcontrol: fix cgroup creation failure after many small jobs
    • Merge tag 'drm-fixes-for-v4.7-rc8-intel-kbl' of git://people.freedesktop.org/~airlied/linux · d15ae814
      Linus Torvalds authored
      Pull intel kabylake drm fixes from Dave Airlie:
       "As mentioned Intel has gathered all the Kabylake fixes from -next,
        which we've enabled in 4.7 for the first time, these are pretty much
        limited in scope to only affect kabylake, which is hw that isn't
        shipping yet.  So I'm mostly okay with it going in now.
      
        If we don't land this, it might be a good idea to disable kabylake
        support in 4.7 before we ship"
      
      * tag 'drm-fixes-for-v4.7-rc8-intel-kbl' of git://people.freedesktop.org/~airlied/linux: (28 commits)
        drm/i915/kbl: Introduce the first official DMC for Kabylake.
        drm/i915: Introduce Kabypoint PCH for Kabylake H/DT.
        drm/i915/gen9: implement WaConextSwitchWithConcurrentTLBInvalidate
        drm/i915/gen9: Add WaFbcHighMemBwCorruptionAvoidance
        drm/i195/fbc: Add WaFbcNukeOnHostModify
        drm/i915/gen9: Add WaFbcWakeMemOn
        drm/i915/gen9: Add WaFbcTurnOffFbcWatermark
        drm/i915/kbl: Add WaClearSlmSpaceAtContextSwitch
        drm/i915/gen9: Add WaEnableChickenDCPR
        drm/i915/kbl: Add WaDisableSbeCacheDispatchPortSharing
        drm/i915/kbl: Add WaDisableGafsUnitClkGating
        drm/i915/kbl: Add WaForGAMHang
        drm/i915: Add WaInsertDummyPushConstP for bxt and kbl
        drm/i915/kbl: Add WaDisableDynamicCreditSharing
        drm/i915/kbl: Add WaDisableGamClockGating
        drm/i915/gen9: Enable must set chicken bits in config0 reg
        drm/i915/kbl: Add WaDisableLSQCROPERFforOCL
        drm/i915/kbl: Add WaDisableSDEUnitClockGating
        drm/i915/kbl: Add WaDisableFenceDestinationToSLM for A0
        drm/i915/kbl: Add WaEnableGapsTsvCreditFix
        ...