  1. Dec 09, 2016
  2. Dec 08, 2016
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · 5fccd64a
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS updates for net-next
      
      The following patchset contains a large Netfilter update for net-next,
      to summarise:
      
      1) Add support for stateful objects. This series provides a nf_tables
         native alternative to the extended accounting infrastructure for
         nf_tables. Two initial stateful objects are supported: counters and
         quotas. Objects are identified by a user-defined name; you can fetch
         and reset them anytime. You can also use maps to allow fast lookups
         using any arbitrary key combination. More info at:
      
         http://marc.info/?l=netfilter-devel&m=148029128323837&w=2
      
      
      
      2) On-demand registration of nf_conntrack and defrag hooks per netns.
         Register nf_conntrack hooks if we have a stateful ruleset, i.e.
         state-based filtering or NAT. The new nf_conntrack_default_on sysctl
         enables this for newly created network namespaces. Default behaviour
         is not modified. Patches from Florian Westphal.
      
      3) Allocate 4k chunks and then use these for x_tables counter allocation
         requests. This improves ruleset load time and also datapath ruleset
         evaluation. Patches from Florian Westphal.
      
      4) Add support for ebpf to the existing x_tables bpf extension.
         From Willem de Bruijn.
      
      5) Update layer 4 checksum if any of the pseudoheader fields is updated.
         This provides a limited form of 1:1 stateless NAT that makes sense in
         specific scenarios, e.g. load balancing.
      
      6) Add support to flush sets in nf_tables. This series comes with a new
         set->ops->deactivate_one() indirection given that we have to walk
         over the list of set elements, then deactivate them one by one.
         The existing set->ops->deactivate() performs an element lookup that
         we don't need.
      
      7) Two patches to avoid cloning packets, thus speed up packet forwarding
         via nft_fwd from ingress. From Florian Westphal.
      
      8) Two IPVS patches via Simon Horman: Decrement ttl in all modes to
         prevent infinite loops, patch from Dwip Banerjee. And one minor
         refactoring from Gao Feng.
      
      9) Revisit recent log support for nf_tables netdev families: One patch
         to ensure that we correctly handle non-ethernet packets. Another
         patch to add missing logger definition for netdev. Patches from
         Liping Zhang.
      
      10) Three patches for nft_fib, one to address insufficient register
          initialization and another to solve incorrect (although harmless)
          byteswap operation. Moreover update xt_rpfilter and nft_fib to match
          lbcast packets with zeronet as source, e.g. DHCP Discover packets
          (0.0.0.0 -> 255.255.255.255). Also from Liping Zhang.
      
      11) Built-in DCCP, SCTP and UDPlite conntrack and NAT support, from
          Davide Caratti. While DCCP is rather hopeless lately, and UDPlite has
          been broken in many-cast mode for some little time, let's give them a
          chance by placing them at the same level as other existing protocols.
          Thus, users don't explicitly have to modprobe support for this and
          NAT rules work for them. Some people point to the lack of support in
          SOHO Linux-based routers that makes deployment of new protocols harder.
          I guess other middleboxes out there on the Internet are also to blame.
          Anyway, let's see if this has any impact in the medium term.
      
      12) Skip software SCTP checksum calculation if the NIC comes
          with SCTP checksum offload support. From Davide Caratti.
      
      13) Initial core factoring to prepare conversion to hook array. Three
          patches from Aaron Conole.
      
      14) Gao Feng made a wrong conversion to switch in the xt_multiport
          extension in a patch coming in the previous batch. Fix it in this
          batch.
      
      15) Get vmalloc call in sync with kmalloc flags to avoid a warning
          and likely OOM killer intervention from x_tables. From Marcelo
          Ricardo Leitner.
      
      16) Update Arturo Borrero's email address in all source code headers.
      ====================
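
Item 5 above relies on updating a layer 4 checksum when a pseudoheader field changes. The following is a minimal sketch of the incremental update from RFC 1624 (HC' = ~(~HC + ~m + m')), not the nf_tables implementation. The packet words and the rewritten address are hypothetical values chosen for illustration.

```python
def fold16(value):
    """Fold carries back into the low 16 bits (one's-complement addition)."""
    while value >> 16:
        value = (value & 0xFFFF) + (value >> 16)
    return value


def checksum(words):
    """Full Internet checksum over a list of 16-bit words."""
    return ~fold16(sum(words)) & 0xFFFF


def update_checksum(old_csum, old_word, new_word):
    """Incremental update per RFC 1624: HC' = ~(~HC + ~m + m')."""
    total = (~old_csum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
    return ~fold16(total) & 0xFFFF


# Hypothetical pseudo-header and payload words. The first two words stand
# in for an IPv4 source address that a stateless NAT rule rewrites.
words = [0xC0A8, 0x0001, 0x0A00, 0x0002, 0x0011, 0x0014, 0x1234, 0x5678]
before = checksum(words)

# Rewrite 192.168.0.1 -> 10.1.2.3, one 16-bit word at a time, without
# touching the rest of the packet.
after = update_checksum(before, words[0], 0x0A01)
after = update_checksum(after, words[1], 0x0203)
words[0], words[1] = 0x0A01, 0x0203
```

The incrementally updated value matches a full recomputation over the rewritten words, which is why the rest of the packet does not need to be read again.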
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · 63c36c40
      David S. Miller authored
      
      
      Jeff Kirsher says:
      
      ====================
      40GbE Intel Wired LAN Driver Updates 2016-12-07
      
      This series contains updates to i40e and i40evf only.
      
      Filip modifies i40e to log link speed changes and when the link is
      brought up and down.
      
      Mitch replaces i40e_txd_use_count() with a new function which is slightly
      faster and better documented so the dim witted can better follow the
      code.  He also fixes the locking of the service task so that it is
      actually done in the service task and not in the scheduling function
      which calls the service task.
      
      Jacob, being the busy little beaver he is, provides most of the changes,
      starting by restoring a workaround that is still needed in some
      configurations, specifically the Ethernet Controller XL710 for 40GbE
      QSFP+.  He removes duplicate code and simplifies the i40e_vsi_add_vlan()
      and i40e_vsi_kill_vlan() functions, and removes detection of PTP frames
      over L4 (UDP) on the XL710 MAC, since there was a product decision to
      defeature it.  He fixes a previous refactor of active filters which
      caused issues in the accounting of active_filters.  The remaining work
      was done in the VLAN filters to improve readability and simplify code as
      much as possible to reduce inconsistencies.
      
      Alex fixes foul budget accounting in core code by returning actual
      work done, capped to budget-1.
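
The budget rule described above can be sketched as follows. This is an illustration of the stated behaviour only, not the i40e poll routine; the function name and the `clean_complete` flag are hypothetical.

```python
def napi_poll_return(work_done, budget, clean_complete):
    """Value a poll routine reports back to the core (illustrative model)."""
    if not clean_complete:
        # Assumption: more work is pending, so the whole budget is reported
        # and polling continues.
        return budget
    # Polling is finished: report the actual work done, but never the full
    # budget, which would signal that polling should continue.
    return min(work_done, budget - 1)
```

With a budget of 64, a finished poll that processed 64 packets reports 63, while one that processed 10 reports 10.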
      
      Henry fixes the "ethtool -p" function for 1G BaseT PHYs.
      
      Carolyn adds support for 25G devices for i40e and i40evf.
      
      Michal adds functions to apply the correct access method for external PHYs
      which could use Clause22 or Clause45 depending on the PHY.
      
      v2: dropped last patch from previous series, since changes are needed based
          on feedback from Sergei Shtylyov
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dummy: expend mtu range for dummy device · 25e3e84b
      Zhang Shengju authored
      After commit 61e84623 ("net: centralize net_device min/max MTU checking"),
      the mtu range for dummy device becomes [68, 1500].
      
      This patch extends it to [0, 65535].
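
A minimal sketch of a centralized range check of this kind, assuming that a `max_mtu` of 0 means "no upper bound" and that rejection is reported as `-EINVAL`. The function name is hypothetical, and this is a model of the idea rather than the code in dev_set_mtu().

```python
EINVAL = 22


def check_mtu(new_mtu, min_mtu, max_mtu):
    """Return 0 if new_mtu lies in the device's range, -EINVAL otherwise."""
    if new_mtu < 0 or new_mtu < min_mtu:
        return -EINVAL
    # Assumption: max_mtu == 0 disables the upper-bound check.
    if max_mtu > 0 and new_mtu > max_mtu:
        return -EINVAL
    return 0


# Ranges taken from the commit message above.
DEFAULT_RANGE = (68, 1500)
DUMMY_RANGE = (0, 65535)

jumbo_on_default = check_mtu(9000, *DEFAULT_RANGE)  # rejected
jumbo_on_dummy = check_mtu(9000, *DUMMY_RANGE)      # accepted
```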
      
      Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nlmon: use core MTU range checking in nlmon driver · e82621e3
      Zhang Shengju authored
      Since commit 61e84623 ("net: centralize net_device min/max MTU checking"),
      mtu range is checked at dev_set_mtu().

      This patch adds min_mtu for nlmon device and removes the unnecessary
      ndo_change_mtu() function.
      
      Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • driver: macvlan: Remove the rcu member of macvlan_port · a1f5315c
      Gao Feng authored
      
      
      When freeing macvlan_port in macvlan_port_destroy, it is safe to free it
      directly because netdev_rx_handler_unregister enforces one grace
      period. So it is unnecessary to use kfree_rcu for macvlan_port.
      
      Signed-off-by: Gao Feng <fgao@ikuai8.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • driver: ipvlan: Free ipvl_port directly with kfree instead of kfree_rcu · 48140a21
      Gao Feng authored
      
      
      There are two functions which free the ipvl_port now. The first
      is ipvlan_port_create. It frees the ipvl_port in the error handler,
      so it can kfree it directly. The second is ipvlan_port_destroy. It
      first invokes netdev_rx_handler_unregister, which enforces one grace
      period via synchronize_net, so it can also kfree the ipvl_port
      directly and safely.
      
      So it is unnecessary to use kfree_rcu to free ipvl_port.
      
      Signed-off-by: Gao Feng <fgao@ikuai8.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: fix loading of BPF_MAXINSNS sized programs · ef0915ca
      Daniel Borkmann authored
      The general assumption is that a single program can hold up to
      BPF_MAXINSNS, that is, 4096 instructions. That is the case with cBPF,
      and the limit was carried over to eBPF. When recently testing digest, I
      noticed that it's actually not possible to feed 4096 instructions
      via bpf(2).
      
      The check for > BPF_MAXINSNS was added back then to bpf_check() in
      cbd35700 ("bpf: verifier (add ability to receive verification log)").
      However, 09756af4 ("bpf: expand BPF syscall with program load/unload")
      added yet another check that comes before that into bpf_prog_load(),
      but this time bails out already in case of >= BPF_MAXINSNS.
      
      Fix it up and perform the check early in bpf_prog_load(), so we can drop
      the second one in bpf_check(). It makes sense, because also a 0 insn
      program is useless and we don't want to waste any resources doing work
      up to bpf_check() point. The existing bpf(2) man page documents E2BIG
      as the official error for such cases, so just stick with it as well.
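
The off-by-one described above can be modelled in a few lines. This is a sketch of the two boundary conditions as the message states them, not the kernel code; the function names are hypothetical, and only the accept/reject decision is modelled (the message says rejection is reported as E2BIG).

```python
BPF_MAXINSNS = 4096


def accepted_before_fix(insn_cnt):
    """Early check as described: bails out for insn_cnt >= BPF_MAXINSNS."""
    return not (insn_cnt >= BPF_MAXINSNS)


def accepted_after_fix(insn_cnt):
    """Single early check: reject empty programs and those over the limit."""
    return not (insn_cnt == 0 or insn_cnt > BPF_MAXINSNS)


# A program of exactly BPF_MAXINSNS instructions is the case that changes.
full_size_before = accepted_before_fix(BPF_MAXINSNS)
full_size_after = accepted_after_fix(BPF_MAXINSNS)
```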
      
      Fixes: 09756af4 ("bpf: expand BPF syscall with program load/unload")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: do not call phy_ethtool_ksettings_set from atomic context · 90364fea
      Niklas Cassel authored
      
      
      From what I can tell, spin_lock(&priv->lock) is not needed, since the
      phy_ethtool_ksettings_set call is not given the priv struct.

      phy_start_aneg takes the phydev->lock. Calls to phy_adjust_link
      from phy_state_machine also take the phydev->lock.
      
      [   13.718319] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:97
      [   13.726717] in_atomic(): 1, irqs_disabled(): 0, pid: 1307, name: ethtool
      [   13.742115] Hardware name: Axis ARTPEC-6 Platform
      [   13.746829] [<80110568>] (unwind_backtrace) from [<8010c2bc>] (show_stack+0x18/0x1c)
      [   13.754575] [<8010c2bc>] (show_stack) from [<80433484>] (dump_stack+0x80/0xa0)
      [   13.761801] [<80433484>] (dump_stack) from [<80145428>] (___might_sleep+0x108/0x170)
      [   13.769554] [<80145428>] (___might_sleep) from [<806c9b50>] (mutex_lock+0x24/0x44)
      [   13.777128] [<806c9b50>] (mutex_lock) from [<8050cbc0>] (phy_start_aneg+0x1c/0x13c)
      [   13.784783] [<8050cbc0>] (phy_start_aneg) from [<8050d338>] (phy_ethtool_ksettings_set+0x98/0xd0)
      [   13.793656] [<8050d338>] (phy_ethtool_ksettings_set) from [<80517adc>] (stmmac_ethtool_set_link_ksettings+0xa0/0xb4)
      [   13.804184] [<80517adc>] (stmmac_ethtool_set_link_ksettings) from [<805c5138>] (ethtool_set_settings+0xd4/0x13c)
      [   13.814358] [<805c5138>] (ethtool_set_settings) from [<805c9718>] (dev_ethtool+0x13c4/0x211c)
      [   13.822882] [<805c9718>] (dev_ethtool) from [<805dc7c0>] (dev_ioctl+0x480/0x8e0)
      [   13.830291] [<805dc7c0>] (dev_ioctl) from [<80260e34>] (do_vfs_ioctl+0x94/0xa00)
      [   13.837699] [<80260e34>] (do_vfs_ioctl) from [<802617dc>] (SyS_ioctl+0x3c/0x60)
      [   13.845011] [<802617dc>] (SyS_ioctl) from [<801088bc>] (__sys_trace_return+0x0/0x10)
      
      Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ti-cpts-update-and-fixes' · 64e8de58
      David S. Miller authored
      Grygorii Strashko says:
      
      ====================
      net: ethernet: ti: cpts: update and fixes
      
      This is a preparation series intended to clean up and optimize the TI CPTS
      driver to facilitate further integration with other TI SoCs like Keystone 2.
      
      Changes in v5:
      - fixed copy paste error in cpts_release
      - reworked cc.mult/shift and cc_mult initialization
      
      Changes in v4:
      - fixed build error in patch
        "net: ethernet: ti: cpts: clean up event list if event pool is empty"
      - rebased on top of net-next
      
      Changes in v3:
      - patches reordered: fixes and small updates moved first
      - added comments in code about cpts->cc_mult
      - conversion range (maxsec) limited to 10sec
      
      Changes in v2:
      - patch "net: ethernet: ti: cpts: rework initialization/deinitialization"
        was split on 4 patches
      - applied comments from Richard Cochran
      - dropped patch
        "net: ethernet: ti: cpts: add return value to tx and rx timestamp funcitons"
      - new patches added:
        "net: ethernet: ti: cpts: drop excessive writes to CTRL and INT_EN regs"
        and "clocksource: export the clocks_calc_mult_shift to use by timestamp code"
      
      Links to previous versions:
      v4: https://lkml.org/lkml/2016/12/6/496
      v3: https://www.spinics.net/lists/devicetree/msg153474.html
      v2: http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1282034.html
      v1: http://www.spinics.net/lists/linux-omap/msg131925.html
      
      
      ====================
      
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: cpts: fix overflow check period · 20138cf9
      Grygorii Strashko authored
      
      
      The CPTS driver uses an 8 sec period for overflow checking, on the
      assumption that the CPTS refclk will not exceed 500MHz. But that's not
      true on some TI platforms (Keystone 2). As a result, it is possible that
      the CPTS counter will overflow more than once between two readings.

      Hence, fix it by selecting the overflow check period dynamically as
      max_sec_before_overflow/2, where
       max_sec_before_overflow = max_counter_val / refclk_freq.
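
The formula above can be worked through numerically. This is a sketch only: the 32-bit counter width is an assumption (it is consistent with the 8 sec / 500MHz figures in the message), the function name is hypothetical, and the 1 GHz clock is an illustrative value rather than a specific platform's refclk.

```python
def overflow_check_period_sec(refclk_hz, counter_bits=32):
    """Half of the time the free-running counter takes to wrap."""
    max_counter_val = (1 << counter_bits) - 1
    max_sec_before_overflow = max_counter_val / refclk_hz
    return max_sec_before_overflow / 2


# At 500 MHz a 32-bit counter wraps after roughly 8.6 s, so a fixed 8 s
# check is just enough. At a hypothetical 1 GHz it wraps after roughly
# 4.3 s, so a fixed 8 s check would miss a wrap.
period_500mhz = overflow_check_period_sec(500_000_000)
period_1ghz = overflow_check_period_sec(1_000_000_000)
```

Checking at half the wrap time guarantees at least one reading per counter wrap for any refclk frequency.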
      
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: cpts: calc mult and shift from refclk freq · 88f0f0b0
      Grygorii Strashko authored
      The cyclecounter mult and shift values can be calculated based on the
      CPTS refclk frequency, and the timekeeping framework provides the
      required algorithms and APIs.

      Hence, calculate mult and shift based on the CPTS refclk frequency if
      both the cpts_clock_shift and cpts_clock_mult properties are not provided
      in DT (the basis of the calculation algorithm is borrowed from
      __clocksource_update_freq_scale(), commit 7d2f944a ("clocksource:
      Provide a generic mult/shift factor calculation")). After this change
      the cpts_clock_shift and cpts_clock_mult DT properties become optional.
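
As a rough illustration of what a mult/shift calculation does, here is a Python sketch modelled on the generic kernel helper clocks_calc_mult_shift(). It is not the CPTS driver code: the 250 MHz refclk is a hypothetical value, the 10 s conversion range is borrowed from the cover letter, and corner cases where no shift fits are not handled the same way as in the kernel.

```python
NSEC_PER_SEC = 1_000_000_000


def calc_mult_shift(from_hz, to_hz, maxsec):
    """Find (mult, shift) so that (cycles * mult) >> shift converts units."""
    # Reduce the usable bits so that maxsec worth of cycles multiplied by
    # mult still fits in a 64-bit intermediate.
    sftacc = 32
    tmp = (maxsec * from_hz) >> 32
    while tmp:
        tmp >>= 1
        sftacc -= 1

    # Take the largest shift (best accuracy) whose rounded mult fits.
    for shift in range(32, 0, -1):
        mult = ((to_hz << shift) + from_hz // 2) // from_hz
        if (mult >> sftacc) == 0:
            break
    return mult, shift


def cycles_to_ns(cycles, mult, shift):
    """Cyclecounter-style conversion from raw cycles to nanoseconds."""
    return (cycles * mult) >> shift


# Hypothetical 250 MHz refclk, 10 s conversion range.
mult, shift = calc_mult_shift(250_000_000, NSEC_PER_SEC, 10)
ns = cycles_to_ns(1000, mult, shift)  # 1000 cycles at 250 MHz
```

At 250 MHz each cycle is 4 ns, so 1000 cycles convert to 4000 ns with the computed pair.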
      
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • clocksource: export the clocks_calc_mult_shift to use by timestamp code · 5304121a
      Murali Karicheri authored
      
      
      The CPSW CPTS driver is capable of timestamping tx/rx packets and
      needs to know the mult and shift factors for converting timestamps from
      raw values to nanoseconds (ptp clock). Currently these mult and shift
      factors are calculated manually and provided through DT, which makes it
      very hard to support a large number of platforms, especially if the CPTS
      refclk is not the same across boards and depends on efuse settings
      (Keystone 2 platforms). Hence, export clocks_calc_mult_shift() to allow
      drivers like CPSW CPTS (and other ptp drivers) to benefit from automatic
      calculation of mult and shift factors.
      
      Cc: John Stultz <john.stultz@linaro.org>
      Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: cpts: move dt props parsing to cpts driver · 4a88fb95
      Grygorii Strashko authored
      
      
      Move DT properties parsing into the CPTS driver to simplify the CPSW
      code and future porting of the CPTS driver to other SoCs
      (like Keystone 2). With this change it will not be necessary
      to add the same DT parsing code to the Keystone 2 NETCP driver.
      
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: cpts: rework initialization/deinitialization · 8a2c9a5a
      Grygorii Strashko authored
      
      
      The current implementation of CPTS initialization and deinitialization
      (represented by cpts_register/unregister()) does too much static
      initialization from .ndo_open(), which is more reasonably done once at
      probe time, and also requires the caller to allocate memory for struct
      cpts, which is in general internal to the CPTS driver.

      This patch splits CPTS initialization and deinitialization into two parts:

      - static initialization, cpts_create()/cpts_release(), which is expected
      to be executed when the parent driver is probed/removed;

      - dynamic part, cpts_register/unregister(), which is expected to be
      executed when the network device is opened/closed.

      As a result, the current code of the CPTS parent driver, CPSW, will be
      simplified (and it will also simplify adding support for Keystone 2
      devices in the future), and more initialization errors will be caught
      earlier. In addition, this change allows cleaning up cpts.h for the case
      when CPTS is disabled.
      
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: cpts: drop excessive writes to CTRL and INT_EN regs · 2a79df3e
      Grygorii Strashko authored
      
      
      The CPTS module and IRQs are always enabled when CPTS is registered,
      before starting the overflow check work, and disabled during
      deregistration, when the overflow check work has already been canceled.
      So it is not necessary to (re)enable the CPTS module and IRQs in
      cpts_overflow_check().
      
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: cpts: clean up event list if event pool is empty · e4439fa8
      WingMan Kwok authored
      
      
      When a CPTS user does not exit gracefully by disabling cpts
      timestamping and leaving a joined multicast group, the system
      continues to receive and timestamp the ptp packets, which eventually
      occupy all the event list entries.  When this happens, the added code
      tries to remove some list entries which have expired.
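
The behaviour described above can be sketched with a toy model. The class, its fields, and the integer timestamps are all hypothetical; this illustrates "purge expired entries when the pool is empty", not the driver's data structures.

```python
class EventPool:
    """Toy model of a fixed pool of event entries with an expiry time."""

    def __init__(self, size, timeout):
        self.free = size      # entries still available in the pool
        self.events = []      # expiry timestamps of queued events
        self.timeout = timeout

    def purge_expired(self, now):
        """Return expired entries to the pool; report how many were freed."""
        kept = [expiry for expiry in self.events if expiry > now]
        removed = len(self.events) - len(kept)
        self.events = kept
        self.free += removed
        return removed

    def add_event(self, now):
        """Queue an event, purging expired ones first if the pool is empty."""
        if self.free == 0 and self.purge_expired(now) == 0:
            return False  # nothing has expired yet: the event is dropped
        self.free -= 1
        self.events.append(now + self.timeout)
        return True


pool = EventPool(size=2, timeout=10)
results = [pool.add_event(0), pool.add_event(1),
           pool.add_event(5),    # pool empty, nothing expired yet
           pool.add_event(11)]   # both earlier events have expired
```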
      
      Signed-off-by: WingMan Kwok <w-kwok2@ti.com>
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: cpts: disable cpts when unregistered · 8fcd6891
      Grygorii Strashko authored
      
      
      CPTS is currently left enabled after unregistration.
      Hence, disable it in cpts_unregister().
      
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: cpts: fix registration order · 6c691405
      Grygorii Strashko authored
      
      
      The ptp clock is registered before the spinlock that protects it is
      initialized, and before the timecounter and cyclecounter are initialized
      in cpts_register().

      So, ensure that the ptp clock is registered last, after everything
      else is done.
      
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: cpts: fix unbalanced clk api usage in cpts_register/unregister · fd123a94
      Grygorii Strashko authored
      
      
      There are two issues with the TI CPTS code which are reproducible when a
      TI CPSW ethX device goes through a few up/down iterations:
      - the cpts refclk prepare counter is incremented after each
      up/down iteration;
      - devm_clk_get(dev, "cpts") is called many times.

      Hence, fix these issues by using clk_disable_unprepare() in
      cpts_clk_release() and skipping devm_clk_get() if the cpts refclk has
      already been acquired.
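
The prepare-counter imbalance can be illustrated with a toy reference-count model. This assumes the pre-fix release path only disabled the clock without unpreparing it, which is an inference from the fix described above; the `Clk` class and `cycle_up_down` helper are hypothetical and only mimic the counting behaviour of the clk API.

```python
class Clk:
    """Toy clock with separate prepare and enable reference counts."""

    def __init__(self):
        self.prepare_count = 0
        self.enable_count = 0

    def prepare_enable(self):
        self.prepare_count += 1
        self.enable_count += 1

    def disable(self):
        self.enable_count -= 1

    def disable_unprepare(self):
        self.enable_count -= 1
        self.prepare_count -= 1


def cycle_up_down(clk, release, iterations):
    """Bring the interface up and down repeatedly; return final counts."""
    for _ in range(iterations):
        clk.prepare_enable()   # interface up
        release(clk)           # interface down
    return clk.prepare_count, clk.enable_count


# Assumed pre-fix behaviour: release only disables, so prepares pile up.
leaky = cycle_up_down(Clk(), Clk.disable, 3)
# Post-fix behaviour: release undoes both halves of prepare_enable().
balanced = cycle_up_down(Clk(), Clk.disable_unprepare, 3)
```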
      
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: cpsw: minimize direct access to struct cpts · b63ba58e
      Grygorii Strashko authored
      
      
      This will provide more flexibility in changing CPTS internals and is also
      required for further changes.
      
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: allow cpts to be built separately · c8395d4e
      Grygorii Strashko authored
      
      
      The TI CPTS IP is used as part of the TI OMAP CPSW driver, but it's also
      present as part of NETCP on TI Keystone 2 SoCs. So it is necessary
      to enable building CPTS for both of these drivers, and this can be
      achieved by allowing CPTS to be built separately.

      Hence, allow cpts to be built separately and convert it to be
      a module, as both the CPSW and NETCP drivers can be built as modules.
      
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: cpts: switch to readl/writel_relaxed() · 391fd6ca
      Grygorii Strashko authored
      
      
      Switch to the readl/writel_relaxed() APIs, because these are the
      recommended APIs and the CPTS IP is reused on Keystone 2 SoCs,
      where LE/BE modes are supported.
      
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. Dec 07, 2016