Commits · 913e89a44e99d11f73ee6fe98452ed719b557f50 · Mirrors / github.com / raspberrypi_linux

May 04, 2019

mlxsw: Bump firmware version to 13.2000.1122 · 913e89a4

Ido Schimmel authored May 02, 2019



The new version supports two features that are required by upcoming
changes in the driver:

* Querying of new resources allowing port split into two ports on
Spectrum-2 systems

* Querying of number of gearboxes on supported systems such as SN3800

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

913e89a4

tipc: fix missing Name entries due to half-failover · c0b14a08

Tuong Lien authored May 02, 2019

A TIPC link can temporarily fall into a "half-established" state, in which
only one of the link endpoints is ESTABLISHED and starts to send traffic
and PROTOCOL messages, whereas the other link endpoint is not up (e.g. the
network interface goes down immediately after the endpoint receives
ACTIVATE_MSG).

This is a normal situation and will be settled because the link
endpoint will be eventually brought down after the link tolerance time.

However, the situation becomes worse when the second link is
established before the first link endpoint goes down.
For example:

   1. Both links <1A-2A>, <1B-2B> down
   2. Link endpoint 2A up, but 1A still down (e.g. due to network
      disturbance, wrong session, etc.)
   3. Link <1B-2B> up
   4. Link endpoint 2A down (e.g. due to link tolerance timeout)
   5. Node B starts failover onto link <1B-2B>

   ==> Node A never starts link failover.

When the "half-failover" situation happens, two consequences have been
observed:

a) Peer link/node gets stuck in FAILINGOVER state;
b) Traffic or user messages that the peer node is trying to fail over onto
the second link can be partially or completely dropped by this node.

Consequence a) was actually solved by commit c140eb16 ("tipc: fix
failover problem"), but that commit didn't cover b). This is because
the tunnel link endpoint has never been prepared for a failover, so
'l->drop_point' (and other data) is not set correctly. When a
TUNNEL_MSG from the peer node arrives on the link, the message can be
either dropped (treated as a duplicate message) or processed, depending
on the inner message's seqno and the current 'l->drop_point' value.
At this early stage, the traffic messages from the peer are likely to be
NAME_DISTRIBUTOR messages, which means some name table entries will be
missing on this node forever!

The commit resolves the issue by starting the FAILOVER process on this
node as well. Another benefit from this solution is that we ensure the
link will not be re-established until the failover ends.

Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

c0b14a08

net: phy: improve resuming from hibernation · f24098f8

Heiner Kallweit authored May 01, 2019



I got an interesting report [0] that after resuming from hibernation
the link runs at 100Mbps instead of 1Gbps. The reason is that another OS
was used whilst Linux was hibernated, and this OS slows the link down
for WoL. Therefore, when resuming, we shouldn't expect that what
the PHY advertises is what it advertised when hibernating.
The easiest way to handle this is to remove state PHY_RESUMING and
instead always go via PHY_UP, which configures the PHY advertisement.

[0] https://bugzilla.kernel.org/show_bug.cgi?id=202851

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

f24098f8

net: phy: improve pause handling · 22c0ef6b

Heiner Kallweit authored May 01, 2019



When probing the phy device we set sym and asym pause in the "supported"
bitmap (unless the PHY tells us otherwise). However we don't know yet
whether the MAC supports pause. Simply copying phy->supported to
phy->advertising will trigger advertising pause, and that's not
what we want. Therefore add phy_advertise_supported() that copies all
modes but doesn't touch the pause bits.

In phy_support_(a)sym_pause we shouldn't set any bits in the supported
bitmap because we may set a bit the PHY intentionally disabled.
Effective pause support should be the AND-combined PHY and MAC pause
capabilities. If the MAC supports everything, then it's only relevant
what the PHY supports. If MAC supports sym pause only, then we have to
clear the asym bit in phydev->supported.
Copy the pause flags only and don't touch the modes, because a driver
may have intentionally removed a mode from phydev->advertising.
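
The bit handling described above can be sketched in plain C. This is only
an illustration under stated assumptions: the DEMO_* bit positions and the
demo_* function names are hypothetical, and a single 32-bit word stands in
for the kernel's link-mode bitmaps. It is not the phylib implementation.

```c
#include <stdint.h>

/* Hypothetical bit positions; the kernel uses link-mode bitmaps instead. */
#define DEMO_MODE_100FULL	(1u << 0)
#define DEMO_MODE_1000FULL	(1u << 1)
#define DEMO_PAUSE		(1u << 2)	/* symmetric pause */
#define DEMO_ASYM_PAUSE		(1u << 3)	/* asymmetric pause */
#define DEMO_PAUSE_MASK		(DEMO_PAUSE | DEMO_ASYM_PAUSE)

/*
 * Copy all supported modes into the advertising word, but keep whatever
 * pause bits the advertising word already had.
 */
static uint32_t demo_advertise_supported(uint32_t supported,
					 uint32_t advertising)
{
	return (supported & ~DEMO_PAUSE_MASK) |
	       (advertising & DEMO_PAUSE_MASK);
}

/*
 * Effective pause support is PHY capability AND MAC capability.
 * This only ever clears pause bits; it never sets one the PHY lacks.
 */
static uint32_t demo_limit_pause_to_mac(uint32_t supported,
					uint32_t mac_pause)
{
	return supported & (~DEMO_PAUSE_MASK | (mac_pause & DEMO_PAUSE_MASK));
}

/*
 * Copy only the pause flags from supported into advertising and leave
 * the advertised modes untouched.
 */
static uint32_t demo_advertise_pause(uint32_t supported,
				     uint32_t advertising)
{
	return (advertising & ~DEMO_PAUSE_MASK) |
	       (supported & DEMO_PAUSE_MASK);
}
```

With a MAC that supports symmetric pause only, demo_limit_pause_to_mac()
clears the asym bit and leaves every link mode as it was.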

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

22c0ef6b

net: sched: cls_u32: use struct_size() helper · e512fcf0

Gustavo A. R. Silva authored May 01, 2019



Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes, in particular in the
context in which this code is being used.

So, replace code of the following form:

sizeof(*s) + s->nkeys*sizeof(struct tc_u32_key)

with:

struct_size(s, keys, s->nkeys)

This code was detected with the help of Coccinelle.
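
For readers outside the kernel tree, the idea can be approximated in
userspace. The sketch below is an assumption-laden illustration, not the
kernel macro: demo_key, demo_sel, demo_flex_size() and DEMO_STRUCT_SIZE()
are hypothetical names, and the overflow checks rely on the GCC/Clang
__builtin_*_overflow builtins. It saturates to SIZE_MAX on overflow so
that a following allocation fails instead of being undersized.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct tc_u32_key; only its size matters. */
struct demo_key {
	uint32_t mask;
	uint32_t val;
	int32_t off;
	int32_t offmask;
};

/* Hypothetical selector with a trailing flexible array member. */
struct demo_sel {
	uint8_t nkeys;
	struct demo_key keys[];
};

/*
 * Compute header + n * elem, returning SIZE_MAX if either the
 * multiplication or the addition would overflow size_t.
 */
static size_t demo_flex_size(size_t header, size_t elem, size_t n)
{
	size_t bytes;

	if (__builtin_mul_overflow(elem, n, &bytes) ||
	    __builtin_add_overflow(bytes, header, &bytes))
		return SIZE_MAX;
	return bytes;
}

/* The element type is taken from the member itself, not spelled by hand. */
#define DEMO_STRUCT_SIZE(p, member, n) \
	demo_flex_size(sizeof(*(p)), sizeof((p)->member[0]), (n))
```

Because the element size is derived from the member, a later change to
the array's element type cannot silently desynchronize the size
calculation, which is the kind of type mistake the commit refers to.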

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e512fcf0

net: add a generic tracepoint for TX queue timeout · 141b6b2a

Cong Wang authored May 01, 2019



Although devlink health report does a nice job of reporting TX
timeouts and other NIC errors, it requires driver support, and
currently only mlx5 has implemented it.
Until other drivers catch up, it is useful to have a generic
tracepoint to monitor this kind of TX timeout. We have been
suffering from TX timeouts with different drivers, and we plan to
start monitoring them with rasdaemon, which just needs a new tracepoint.

Sample output:

  ksoftirqd/1-16    [001] ..s2   144.043173: net_dev_xmit_timeout: dev=ens3 driver=e1000 queue=0

Cc: Eran Ben Elisha <eranbe@mellanox.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

141b6b2a

Merge tag 'mlx5-updates-2019-04-30' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · f3f050a4

David S. Miller authored May 04, 2019



Saeed Mahameed says:

====================
mlx5-updates-2019-04-30

mlx5 misc updates:

1) Bodong Wang and Parav Pandit (6):
   - Remove unused mlx5_query_nic_vport_vlans
   - vport macros refactoring
   - Fix vport access in E-Switch
   - Use atomic rep state to serialize state change

2) Eli Britstein (2):
   - prio tag mode support, added ACLs and replace TC vlan pop with
     vlan 0 rewrite when prio tag mode is enabled.

3) Erez Alfasi (2):
   - ethtool: Add SFF-8436 and SFF-8636 max EEPROM length definitions
   - mlx5e: ethtool, Add support for EEPROM high pages query

4) Masahiro Yamada (1):
   - remove meaningless CFLAGS_tracepoint.o

5) Maxim Mikityanskiy (1):
   - Put the common XDP code into a function

6) Tariq Toukan (2):
   - Turn on HW tunnel offload in all TIRs

7) Vlad Buslov (1):
   - Return error when trying to insert existing flower filter
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

f3f050a4

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · 18af9626

David S. Miller authored May 04, 2019



Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2019-05-02

This series contains updates to the ice driver only.

Anirudh introduces the framework to store queue specific information in
the VSI queue contexts.  This will allow future changes to update the
structure to hold queue specific information.

Akeem adds an additional check so that if there is no queue to disable
when attempting to disable a queue, the driver returns a configuration
error without acquiring the lock.  Also fixes an issue with non-trusted
VFs being able to add more than the permitted number of VLANs.

Bruce removes unreachable code and updates the function to return void
since it would never return anything but success.

Brett provides most of the changes in the series, starting with reducing
the scope of the error variable used and improving the debug message if
we fail to configure the receive queue.  Updates the driver to use a
macro instead of repeating the same 'for' loop throughout the driver,
which helps with readability.  Fixes an issue where users were led to
believe they could set the rx-usecs-high value, yet changes to this value
would not stick because support for changing it was not yet implemented,
so implements the missing code to change the value.  Removes an
unnecessary wait when disabling queues.  Improves a wasteful addition
operation in our hot path by adding a member to the ice_q_vector
structure, which stores the calculated vector hardware index, along with
the necessary changes to use it.  Refactors the link event flow to make
it cleaner and clearer.

Maciej updates the array index when stopping transmit rings, so that we
process every ring in the VSI, not just the rings in a given transmit
class.

Paul adds support for setting 52 byte RSS hash keys.

Md Fahad cleaned up a runtime change to the PFINT_OICR_ENA register,
since the interrupt handlers will handle resetting the bit, if
necessary.

Tony adds a missing PHY type, which was causing a warning message about
an unrecognized PHY.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

18af9626

wimax/i2400m: use struct_size() helper · 70bb13a5

Gustavo A. R. Silva authored Apr 30, 2019



Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes, in particular in the
context in which this code is being used.

So, replace code of the following form:

sizeof(*tx_msg) + le16_to_cpu(tx_msg->num_pls) * sizeof(tx_msg->pld[0]);

with:

struct_size(tx_msg, pld, le16_to_cpu(tx_msg->num_pls));

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

70bb13a5

Merge branch 'net-hns3-enhance-capabilities-for-fibre-port' · 504159c3

David S. Miller authored May 04, 2019



Jian Shen says:

====================
net: hns3: enhance capabilities for fibre port

This patchset adds capabilities for fibre ports, including
multiple media type identification, autoneg,
changing port speed and FEC encoding.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

504159c3

net: hns3: add support for FEC encoding control · 7e6ec914

Jian Shen authored May 03, 2019



This patch adds support for FEC encoding control. The user can change
the FEC mode with the command ethtool --set-fec, and get the FEC mode
with the command ethtool --show-fec. The FEC capability changes with the
port speed. If autoneg is on, the user-configured FEC mode will be
overwritten by the autoneg result.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7e6ec914

net: hns3: add autoneg and change speed support for fibre port · 22f48e24

Jian Shen authored May 03, 2019



Previously, our driver only supported autoneg or changing the port
speed through phydev. This patch adds support for fibre ports: the
driver gets the media speed capability and autoneg capability from
firmware. If the media supports multiple speeds, the user can change
the port speed with the command "ethtool -s <devname> speed xxxx autoneg
off duplex full". If autoneg is on, the user configuration may be
overwritten by the autoneg result.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

22f48e24

net: hns3: add support for multiple media type · 88d10bd6

Jian Shen authored May 03, 2019



Previously, we could only identify copper and fiber types, and the
supported link modes in the port information always showed the
SR type. This patch adds support for multiple media types,
including SR, LR, CR and KR. The driver needs to query the media type
from firmware periodically and update the port information.

The new port information looks like this:
Settings for eth0:
        Supported ports: [ FIBRE ]
        Supported link modes:   25000baseCR/Full
                                25000baseSR/Full
                                1000baseX/Full
                                10000baseCR/Full
                                10000baseSR/Full
                                10000baseLR/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: No
        Supported FEC modes: None BaseR
        Advertised link modes:  Not reported
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: 10000Mb/s
        Duplex: Full
        Port: FIBRE
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: off
        Current message level: 0x00000036 (54)
                               probe link ifdown ifup
        Link detected: yes

To remain compatible with old firmware, which only supports
SFP speed, we keep using the same query command and keep
the former logic.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

88d10bd6

usbnet: ipheth: Remove unnecessary NULL pointer check · e28441e2

Guenter Roeck authored Apr 30, 2019



ipheth_carrier_set() is called from two locations. In
ipheth_carrier_check_work(), its parameter 'dev' is set with
container_of(work, ...) and can not be NULL. In ipheth_open(),
dev is extracted from netdev_priv(net) and dereferenced before
the call to ipheth_carrier_set(). The NULL pointer check of dev
in ipheth_carrier_set() is therefore unnecessary and can be removed.

Cc: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

e28441e2

net: dsa: mv88e6xxx: Pass interrupt number in platform data · a27415de

Andrew Lunn authored May 01, 2019



Allow an interrupt number to be passed in the platform data. The
driver will then use it if not zero, otherwise it will poll for
interrupts.
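
The selection rule is small enough to show directly. This is a minimal
sketch, assuming a hypothetical demo_pdata structure and
demo_pick_irq_mode() helper; the real platform data layout and the
driver's polling setup are not shown in this commit message.

```c
#include <stddef.h>

/* Hypothetical platform data; irq == 0 means "no interrupt line given". */
struct demo_pdata {
	int irq;
};

enum demo_irq_mode {
	DEMO_IRQ_POLLED,	/* fall back to polling for interrupts */
	DEMO_IRQ_LINE,		/* use the interrupt number that was passed in */
};

/* Use the supplied interrupt if it is non-zero, otherwise poll. */
static enum demo_irq_mode demo_pick_irq_mode(const struct demo_pdata *pdata)
{
	if (pdata != NULL && pdata->irq != 0)
		return DEMO_IRQ_LINE;
	return DEMO_IRQ_POLLED;
}
```

Treating zero as "not provided" keeps zero-initialized platform data
working unchanged, since it simply selects polling.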

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

a27415de

Merge branch 'mv88e6xxx-Disable-ports-to-save-power' · 3b3600ff

David S. Miller authored May 03, 2019



Andrew Lunn says:

====================
mv88e6xxx: Disable ports to save power

Save some power by disabling ports. The first patch fully disables a
port when it is runtime disabled. The second disables any ports which
are not used at all.

Depending on configuration strapping, this can lower the temperature
of an idle switch a few degrees.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

3b3600ff

net: dsa :mv88e6xxx: Disable unused ports · 100a9b9d

Andrew Lunn authored May 01, 2019



If the NO_CPU strap is set, the switch starts in 'dumb hub' mode, with
all ports enabled. Ports which are then actively used are reconfigured
as required when the driver starts. However unused ports are left
alone. Change this to disable them, and turn off any SERDES
interface. This could save some power and so reduce the temperature a
bit.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

100a9b9d

net: dsa: mv88e6xxx: Set STP disable state in port_disable · 4a0eb731

Andrew Lunn authored May 01, 2019



When requested to disable a port, set the port STP state to disabled.
This fully disables the port and should save some power.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4a0eb731

Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · 2ce1aef9

David S. Miller authored May 03, 2019



Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2019-05-03

This series contains updates to the i40e driver only.

Carolyn changes the driver behavior to now disable the VF after one MDD
event instead of allowing a couple of MDD events before doing the reset.

Aleksandr changes the driver to only report an error when a VF tries to
remove VLAN when a port VLAN is configured, unless it is VLAN 0.  Also
extends the LLDP support to be able to keep the current LLDP state
persistent across a power cycle.

Maciej fixes the checksum calculation due to firmware changes, which
requires the driver to perform a double shadow RAM dump in some cases.

Adam adds advertising support for 40GBase_LR4, 40GBase_CR4 and fibre in
the driver.

Jake cleans up a check that is not needed and was producing a warning in
GCC 8.

Harshitha fixes a misleading message by ensuring that a success message
is only printed on the host side when the promiscuous mode change has
been successful.

Stefan Assmann adds the vendor id and device id to the dmesg log entry
during probe to help with bug reports when lspci output may not be
available.

Alice and Piotr add recovery mode support in the i40e driver, which is
needed for migrating from a structured to a flat firmware image.

v2: Removed patch 1 "i40e: replace switch-statement to speed-up
    retpoline-enabled builds" from the series since it is no longer
    needed.  Also updated the last patch in the series that introduces
    recovery mode support, to include a more detailed patch description
    and removed code not intended for the upstream kernel.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

2ce1aef9

i40e: Introduce recovery mode support · 4ff0ee1a

Alice Michael authored May 02, 2019



This patch introduces "recovery mode" to the i40e driver. It is
part of a new Any2Any idea of upgrading the firmware. In this
approach, it is required for the driver to have support for
"transition firmware", that is used for migrating from structured
to flat firmware image. In this new, very basic mode, i40e driver
must be able to handle particular IOCTL calls from the NVM Update
Tool and run a small set of AQ commands.

These additional AQ commands are part of the interface used by
the NVMUpdate tool.  The NVMUpdate tool contains all of the
necessary logic to reference these new AQ commands.  The end user
experience remains the same: users still run the NVMUpdate tool to
update the NVM contents.

Signed-off-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: Piotr Marczak <piotr.marczak@intel.com>
Tested-by: Don Buchholz <donald.buchholz@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

4ff0ee1a

i40e: print PCI vendor and device ID during probe · a121644c

Stefan Assmann authored Mar 12, 2019



Printing each device's PCI vendor and device ID has the advantage of
easily revealing what hardware we're dealing with exactly. It's no
longer necessary to match the PCI bus information to the lspci output.

Helps with bug reports where no lspci output is available.

Output before
i40e 0000:08:00.0: fw 6.1.49420 api 1.7 nvm 6.80 0x80003c64 1.2007.0
and after
i40e 0000:08:00.0: fw 6.1.49420 api 1.7 nvm 6.80 0x80003c64 1.2007.0 [8086:1572] [8086:0004]

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

a121644c

i40e: fix misleading message about promisc setting on un-trusted VF · 1e846827

Harshitha Ramamurthy authored Feb 28, 2019



A refactor of the i40e_vc_config_promiscuous_mode_msg function moved
the check for un-trusted VF into another function. We have to lie to
an un-trusted VF that its request to set promiscuous mode is
successful even when it is not because we don't want the VF to find
out its trust status this way. With the refactor, we were running into
a case where even though we were not setting promiscuous mode for an
un-trusted VF, we still printed a misleading message that it was
successful.

This patch fixes that by ensuring that a success message is printed
on the host side only when the promiscuous mode change has been
successful.

Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

1e846827

i40e: update version number · d1fc90a9

Alice Michael authored Feb 28, 2019



Just bumping the version number appropriately.

Signed-off-by: Alice Michael <alice.michael@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

d1fc90a9

i40e: remove out-of-range comparisons in i40e_validate_cloud_filter · a01e5f22

Jacob Keller authored Feb 28, 2019



The function i40e_validate_cloud_filter checks that the destination and
source port numbers are valid by attempting to ensure that the number is
non-zero and no larger than 0xFFFF. However, the types for the dst_port
and src_port variables are __be16 which by definition cannot be larger
than 0xFFFF.

Since these values cannot be larger than 2 bytes, the check to see if
they exceed 0xFFFF is meaningless.

One might consider these checks as some sort of defensive coding, in
case the type was later changed. However, these checks also byte-swap
the value before comparison using be16_to_cpu, which will truncate the
values to 16 bits anyway. Additionally, changing the type would require
updating the opcodes to support new data layout of these virtchnl
commands.

Remove the check to silence the -Wtype-limits warning that was added to
GCC 8.
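
The argument can be checked exhaustively in userspace. The following is a
sketch under assumptions: demo_be16_to_cpu() is a portable stand-in for
be16_to_cpu(), and the other demo_* helpers are hypothetical names, not
driver code. It shows that no 16-bit input ever produces a value above
0xFFFF, so only the non-zero test carries information.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Interpret a 16-bit value whose bytes are stored in big-endian order.
 * Portable stand-in for be16_to_cpu(); the result is again 16 bits wide.
 */
static uint16_t demo_be16_to_cpu(uint16_t be)
{
	const uint8_t *b = (const uint8_t *)&be;

	return (uint16_t)((b[0] << 8) | b[1]);
}

/* Largest value the byte-swapped port can take over all 16-bit inputs. */
static uint32_t demo_max_swapped_port(void)
{
	uint32_t max = 0;

	for (uint32_t v = 0; v <= 0xFFFF; v++) {
		uint32_t port = demo_be16_to_cpu((uint16_t)v);

		if (port > max)
			max = port;
	}
	return max;
}

/* The only check that remains meaningful: the port must be non-zero. */
static bool demo_port_valid(uint16_t port_be)
{
	return port_be != 0;
}
```

Since demo_max_swapped_port() cannot exceed 0xFFFF, a comparison of the
form "port > 0xFFFF" is always false, which is what -Wtype-limits reports.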

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

a01e5f22

i40e: Further implementation of LLDP · c65e78f8

Aleksandr Loktionov authored Feb 28, 2019



This code implements driver code changes necessary for LLDP
Agent support. Modify i40e_aq_start_lldp() and
i40e_aq_stop_lldp() by adding a parameter that indicates whether the
LLDP state should be persistent across power cycles.

Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

c65e78f8

i40e: Report advertised link modes on 40GBase_LR4, CR4 and fibre · b3212f35

Adam Ludkiewicz authored Feb 28, 2019



Add assignments for advertising 40GBase_LR4, 40GBase_CR4 and fibre

Signed-off-by: Adam Ludkiewicz <adam.ludkiewicz@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

b3212f35

i40e: ShadowRAM checksum calculation change · 226436dc

Maciej Paczkowski authored Feb 28, 2019



Due to changes in FW the SW is required to perform double SR dump in
some cases.

The implementation adds two new steps to the NVM checksum update function:
* recalculate checksum and check if checksum in NVM is correct
* if checksum in NVM is not correct then update it again
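
The two steps above can be sketched as a verify-and-retry loop body. This
is a toy model under loud assumptions: demo_nvm, its additive checksum and
the simulated firmware side effect are all hypothetical and chosen only to
make the control flow observable. The real i40e checksum and Shadow RAM
access are different.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_WORDS 4

/* Toy NVM image; fw_rewrites_left simulates firmware changing a word
 * as a side effect of a checksum write. */
struct demo_nvm {
	uint16_t words[DEMO_WORDS];
	uint16_t stored_csum;
	int fw_rewrites_left;
	int writes;
};

/* Hypothetical checksum: plain 16-bit sum of all words. */
static uint16_t demo_calc_csum(const struct demo_nvm *nvm)
{
	uint16_t sum = 0;

	for (size_t i = 0; i < DEMO_WORDS; i++)
		sum = (uint16_t)(sum + nvm->words[i]);
	return sum;
}

/* Write the checksum; the simulated firmware may then alter the image. */
static void demo_write_csum(struct demo_nvm *nvm)
{
	nvm->stored_csum = demo_calc_csum(nvm);
	nvm->writes++;
	if (nvm->fw_rewrites_left > 0) {
		nvm->fw_rewrites_left--;
		nvm->words[0]++;
	}
}

/* Update, recalculate and compare, then update once more on mismatch. */
static int demo_update_csum(struct demo_nvm *nvm)
{
	demo_write_csum(nvm);
	if (demo_calc_csum(nvm) != nvm->stored_csum)
		demo_write_csum(nvm);
	return demo_calc_csum(nvm) == nvm->stored_csum ? 0 : -1;
}
```

In this model a second write happens only when the first one left the
stored checksum stale, so the common case still costs a single update.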

Signed-off-by: Maciej Paczkowski <maciej.paczkowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

226436dc

i40e: remove error msg when vf with port vlan tries to remove vlan 0 · 5a189f15

Aleksandr Loktionov authored Feb 28, 2019



A VF's attempt to delete VLAN 0 when a port VLAN is configured is
harmless; in this case the PF driver just does nothing.  If the VF tries
to remove other VLANs when a port VLAN is configured, it will still
produce an error as before.

Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

5a189f15

i40e: change behavior on PF in response to MDD event · a1df906c

Carolyn Wyborny authored Feb 28, 2019



TX MDD events reported on the PF are the result of the
PF misconfiguring a descriptor and not because of "bad actions"
by anything else.  No need to reset now because if it
results in a Tx hang, the Tx hang check will take care of it.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

a1df906c

i40e: Fix for allowing too many MDD events on VF · a7da7f16

Carolyn Wyborny authored Feb 28, 2019



This patch changes the driver behavior when detecting a VF MDD event.
It now disables the VF after one event, which indicates a hw detected
problem in the VF.  Before this change, the PF would allow a couple of
events before doing the reset.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

a7da7f16

May 03, 2019

Merge branch 'NXP-SJA1105-DSA-driver' · 8ef988b9

David S. Miller authored May 03, 2019



Vladimir Oltean says:

====================
NXP SJA1105 DSA driver

This patchset adds a DSA driver for the SPI-controlled NXP SJA1105
switch.  Due to the hardware's unfriendliness, most of its state needs
to be shadowed in kernel memory by the driver. To support this and keep
a decent amount of cleanliness in the code, a new generic API for
converting between CPU-accessible ("unpacked") structures and
hardware-accessible ("packed") structures is proposed and used.
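
To make the packed/unpacked distinction concrete, here is a simplified
sketch of moving one field between a CPU-side integer and a byte buffer.
It is an illustration only: demo_pack() and demo_unpack() are hypothetical
helpers with their own signatures, not the API proposed in this series,
and they assume a big-endian buffer where logical bit 0 is the least
significant bit of the last byte and fields are at most 64 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Store val into bits [startbit:endbit] of buf (startbit >= endbit).
 * Bits outside the field are left untouched.
 */
static void demo_pack(uint8_t *buf, size_t len, int startbit, int endbit,
		      uint64_t val)
{
	for (int bit = endbit; bit <= startbit; bit++) {
		size_t byte = len - 1 - (size_t)(bit / 8);
		uint8_t mask = (uint8_t)(1u << (bit % 8));

		if ((val >> (bit - endbit)) & 1)
			buf[byte] |= mask;
		else
			buf[byte] &= (uint8_t)~mask;
	}
}

/* Read bits [startbit:endbit] of buf back into a CPU-side integer. */
static uint64_t demo_unpack(const uint8_t *buf, size_t len, int startbit,
			    int endbit)
{
	uint64_t val = 0;

	for (int bit = startbit; bit >= endbit; bit--) {
		size_t byte = len - 1 - (size_t)(bit / 8);

		val = (val << 1) | ((buf[byte] >> (bit % 8)) & 1u);
	}
	return val;
}
```

For example, packing 0xAB into bits 11..4 of a zeroed 4-byte buffer
yields the bytes 00 00 0A B0, and unpacking the same range returns 0xAB.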

The driver is GPL-2.0 licensed. The source code files which are licensed
as BSD-3-Clause are hardware support files and derivative of the
userspace NXP sja1105-tool program, which is BSD-3-Clause licensed.

TODO items:
* Add support for traffic.
* Add full support for the P/Q/R/S series. The patches were mostly
  tested on a first-generation T device.
* Add timestamping support and PTP clock manipulation.
* Figure out how the tc-taprio hardware offload that was just proposed
  by Vinicius can be used to configure the switch's time-aware scheduler.
* Rework link state callbacks to use phylink once the SGMII port
  is supported.

Changes in v5:
1. Removed trailing empty lines at the end of files.
2. Moved the lib/packing.c file under a CONFIG_PACKING option instead of
   having it always built-in. The module is GPL licensed, which applies
   to its distribution in binary form, but the code is dual-licensed
   which means it can be used in projects with other licenses as well.
3. Made SJA1105 driver select CONFIG_PACKING and CONFIG_CRC32.

v4 patchset can be found at:
https://lwn.net/Articles/787077/

Changes in v4:
1. Previous patchset was broken apart, and for the moment the driver is
   configuring the switch as unmanaged. Support for regular and management
   traffic, as well as for PTP timestamping, will be submitted once the
   basic driver is accepted. Some core DSA patches were also broken out
   of the series, and are a dependency for this series:
   https://patchwork.ozlabs.org/project/netdev/list/?series=105069
2. Addressed Jiri Pirko's feedback about too generic function and macro
   naming.
3. Re-introduced ETH_P_DSA_8021Q.

v3 patchset can be found at:
https://lkml.org/lkml/2019/4/12/978

Changes in v3:
1. Removed the patch for a dedicated Ethertype to use with 802.1Q DSA
   tagging
2. Changed the SJA1105 switch tagging protocol sysfs label from
   "sja1105" to "8021q" to denote to users such as tcpdump that the
   structure is more generic.
3. Respun previous patch "net: dsa: Allow drivers to modulate between
   presence and absence of tagging". Current equivalent patch is called
   "net: dsa: Allow drivers to filter packets they can decode source
   port from" and at least allows reception of management traffic during
   the time when switch tagging is not enabled.
4. Added DSA-level fixes for the bridge core not unsetting
   vlan_filtering when ports leave. The global VLAN filtering is treated
   as a special case. Made the mt7530 driver use this. This patch
   benefits the SJA1105 because otherwise traffic in standalone mode
   would no longer work after removing the ports from a vlan_filtering
   bridge, since the driver and the hardware would be in an inconsistent
   state.
5. Restructured the documentation as rst. This depends upon the recently
   submitted "[PATCH net-next] Documentation: net: dsa: transition to
   the rst format": https://patchwork.ozlabs.org/patch/1084658/.

v2 patchset can be found at:
https://www.spinics.net/lists/netdev/msg563454.html

Changes in v2:
1. Device ID is no longer auto-detected but enforced based on explicit DT
   compatible string. This helps with stricter checking of DT bindings.
2. Group all device-specific operations into a sja1105_info structure and
   avoid using the IS_ET() and IS_PQRS() macros at runtime as much as possible.
3. Added more verbiage to commit messages and documentation.
4. Treat the case where RGMII internal delays are requested through DT bindings
   and return error.
5. Miscellaneous cosmetic cleanup in sja1105_clocking.c
6. Not advertising link features that are not supported, such as pause frames
   and the half duplex modes.
7. Fixed a mistake in previous patchset where the switch tagging was not
   actually enabled (lost during a rebase). This brought up another uncaught
   issue where switching at runtime between tagging and no-tagging was not
   supported by DSA. Fixed up the mistake in "net: dsa: sja1105: Add support
   for traffic through standalone ports", and added the new patch "net: dsa:
   Allow drivers to modulate between presence and absence of tagging" to
   address the other issue.
8. Added a workaround for switch resets cutting a frame in the middle of
   transmission, which would throw off some link partners.
9. Changed the TPID from ETH_P_EDSA (0xDADA) to a newly introduced one:
   ETH_P_DSA_8021Q (0xDADB). Uncovered another mistake in the previous patchset
   with a missing ntohs(), which was not caught because 0xDADA is
   endian-agnostic.
10. Made NET_DSA_TAG_8021Q select VLAN_8021Q
11. Renamed __dsa_port_vlan_add to dsa_port_vid_add and not to
    dsa_port_vlan_add_trans, as suggested, because the corresponding _del function
    does not have a transactional phase and the naming is more uniform this way.
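
The "endian-agnostic" remark in item 9 can be verified with a byte swap.
This is a minimal sketch with hypothetical helper names (demo_swab16,
demo_swap_is_visible): a value whose two bytes are equal reads the same in
either byte order, so a missing ntohs() goes unnoticed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Swap the two bytes of a 16-bit value. */
static uint16_t demo_swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* A missing byte swap is only observable if swapping changes the value. */
static bool demo_swap_is_visible(uint16_t tpid)
{
	return demo_swab16(tpid) != tpid;
}
```

0xDADA swaps to itself, while 0xDADB swaps to 0xDBDA, so the new TPID
exposes a forgotten conversion on little-endian hosts.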

v1 patchset can be found at:
https://www.spinics.net/lists/netdev/msg561589.html

Changes from RFC:
1. Removed the packing code for the static configuration tables that were
   not currently used
2. Removed the code for unpacking a static configuration structure from
   a memory buffer (not used)
3. Completely removed the SGMII stubs, since the configuration is not
   complete anyway.
4. Moved some code from the SJA1105 introduction commit into the patch
   that used it.
5. Made the code for checking global VLAN filtering generic and made b53
   driver use it.
6. Made mt7530 driver use the new generic dp->vlan_filtering
7. Fixed check for stringset in .get_sset_count
8. Minor cleanup in sja1105_clocking.c
9. Fixed a confusing typo in DSA

RFC can be found at:
https://www.mail-archive.com/netdev@vger.kernel.org/msg291717.html
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

8ef988b9

dt-bindings: net: dsa: Add documentation for NXP SJA1105 driver · 013fe01d

Vladimir Oltean authored May 02, 2019



Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

013fe01d

Documentation: net: dsa: Add details about NXP SJA1105 driver · 47592097

Vladimir Oltean authored May 02, 2019



Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

47592097

net: dsa: sja1105: Reject unsupported link modes for AN · ad9f299a

Vladimir Oltean authored May 02, 2019



Ethernet flow control:

The switch MAC does not consume, nor does it emit pause frames. It
simply forwards them as any other Ethernet frame (and since the DMAC is,
per IEEE spec, 01-80-C2-00-00-01, it means they are filtered as
link-local traffic and forwarded to the CPU, which can't do anything
useful with them).
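
As a rough sketch of the link-local classification mentioned above, the
check below matches the IEEE reserved range 01-80-C2-00-00-00 through
01-80-C2-00-00-0F. The function name is hypothetical, and the exact
filter and mask the switch hardware uses are not stated in this message.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the destination MAC lies in 01-80-C2-00-00-0X, the range
 * that includes the pause frame address 01-80-C2-00-00-01. */
int is_link_local_dmac(const uint8_t addr[6])
{
	static const uint8_t prefix[5] = { 0x01, 0x80, 0xc2, 0x00, 0x00 };

	return memcmp(addr, prefix, sizeof(prefix)) == 0 &&
	       (addr[5] & 0xf0) == 0;
}
```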

Duplex:

There is no duplex setting in the SJA1105 MAC. It is known to forward
traffic at line rate on the same port in both directions. Therefore it
must be that it only supports full duplex.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ad9f299a

net: dsa: sja1105: Prevent PHY jabbering during switch reset · 1a4c6940

Vladimir Oltean authored May 02, 2019



Resetting the switch at runtime is currently done while changing the
vlan_filtering setting (due to the required TPID change).

But reset is asynchronous with packet egress, and the switch core will
not wait for egress to finish before carrying on with the reset
operation.

As a result, a connected PHY such as the BCM5464 would see an
unterminated Ethernet frame and start to jabber (repeat the last seen
Ethernet symbols - jabber is by definition an oversized Ethernet frame
with bad FCS). This behavior is strange in itself, but it also causes
the MACs of some link partners (such as the FRDM-LS1012A) to completely
lock up.

So as a remedy, when a switch reset is required, simply inhibit Tx on
all ports and wait long enough for any frame still left in the egress
queue to be flushed (not even the Tx inhibit command is instantaneous).
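
This message does not say how long the wait is. As a back-of-the-envelope
sketch only, the time needed to put one frame on the wire can be estimated
from its size and the port speed. The function name and the choice of
rounding up are assumptions, not the driver's actual logic.

```c
#include <stdint.h>

/* Estimate, in microseconds (rounded up), how long one frame of
 * frame_bytes takes to transmit at speed_mbps. Preamble and inter-frame
 * gap are deliberately ignored in this sketch. */
uint32_t frame_tx_time_us(uint32_t frame_bytes, uint32_t speed_mbps)
{
	uint32_t bits = frame_bytes * 8;

	return (bits + speed_mbps - 1) / speed_mbps;
}
```

For a 1518-byte frame this gives 1215 us at 10 Mbps and 13 us at
1000 Mbps, so the slowest port speed dominates the wait.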

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

1a4c6940

net: dsa: sja1105: Add support for configuring address ageing time · 8456721d

Vladimir Oltean authored May 02, 2019



If STP is active, this setting is applied on bridged ports each time an
Ethernet link is established (topology changes).

Since the setting is global to the switch and a reset is required to
change it, the reset is skipped if the callback requests the value that
the hardware is already programmed with.
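
A minimal sketch of that guard, using a hypothetical state structure and
function name (the real driver's data structures are not shown in this
message):

```c
#include <stdint.h>

/* Hypothetical model of the switch: the currently programmed ageing time
 * and a counter of how many resets were performed. */
struct switch_state {
	uint32_t ageing_time;
	unsigned int resets;
};

/* Returns 1 if a reset was needed, 0 if the request was a no-op. */
int set_ageing_time(struct switch_state *sw, uint32_t new_ageing_time)
{
	if (sw->ageing_time == new_ageing_time)
		return 0; /* already programmed, avoid a disruptive reset */

	sw->ageing_time = new_ageing_time;
	sw->resets++; /* stands in for the actual switch reset */
	return 1;
}
```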

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

8456721d

net: dsa: sja1105: Add support for ethtool port counters · 52c34e6e

Vladimir Oltean authored May 02, 2019



Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

52c34e6e

net: dsa: sja1105: Add support for VLAN operations · 6666cebc

Vladimir Oltean authored May 02, 2019



VLAN filtering cannot be properly disabled in SJA1105. So in order to
emulate the "no VLAN awareness" behavior (not dropping traffic that is
tagged with a VID that isn't configured on the port), we need to hack
another switch feature: programmable TPID (which is 0x8100 for 802.1Q).
We are reprogramming the TPID to a bogus value which leaves the switch
thinking that all traffic is untagged, and therefore accepts it.

Under a vlan_filtering bridge, the proper TPID of ETH_P_8021Q is
installed again, and the switch starts identifying 802.1Q-tagged
traffic.
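
The TPID choice described above could be sketched as follows. The
function name is hypothetical. The bogus value used here is
ETH_P_DSA_8021Q (0xDADB), as named in the series changelog, but this
commit message itself only says "a bogus value", so treat that as an
assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define ETH_P_8021Q     0x8100 /* standard 802.1Q TPID */
#define ETH_P_DSA_8021Q 0xDADB /* assumed bogus TPID for the unaware mode */

/* Pick the TPID the switch should look for: the real one when the bridge
 * is VLAN-aware, a bogus one so that all traffic looks untagged otherwise. */
uint16_t tpid_for_mode(bool vlan_filtering)
{
	return vlan_filtering ? ETH_P_8021Q : ETH_P_DSA_8021Q;
}
```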

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6666cebc

ether: Add dedicated Ethertype for pseudo-802.1Q DSA tagging · bf5bc3ce

Vladimir Oltean authored May 02, 2019



There are two possible uses so far:

- Switch devices that don't support a native insertion/extraction header
  on the CPU port may still enjoy the benefits of port isolation with a
  custom VLAN tag.

  For this, they need to have a customizable TPID in hardware and a new
  Ethertype to distinguish between real 802.1Q traffic and the private
  tags used for port separation.

- Switches that don't support the deactivation of VLAN awareness, but
  still want to have a mode in which they accept all traffic, including
  frames that are tagged with a VLAN not configured on their ports, may
  use this as a fake to trick the hardware into thinking that the TPID
  for VLAN is something other than 0x8100.

What follows after the ETH_P_DSA_8021Q EtherType is a regular VLAN
header (TCI). However, no existing EtherType can be used for this
purpose, because each of them already has a well-defined meaning:
ETH_P_8021AD, ETH_P_QINQ1, ETH_P_QINQ2 and ETH_P_QINQ3 expect that
another follow-up VLAN tag is present, which is not the case here.
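
To make the layout concrete, here is a sketch that decodes such a tag:
a 16-bit EtherType of 0xDADB (the value given for ETH_P_DSA_8021Q in the
series changelog) followed by a standard 802.1Q TCI. The structure and
function names are hypothetical.

```c
#include <stdint.h>

#define ETH_P_DSA_8021Q 0xDADB

/* Fields of a standard 802.1Q TCI. */
struct vlan_tci {
	uint8_t pcp;  /* priority code point, 3 bits */
	uint8_t dei;  /* drop eligible indicator, 1 bit */
	uint16_t vid; /* VLAN ID, 12 bits */
};

/* tag points at 4 bytes in network byte order: TPID then TCI.
 * Returns 0 on success, -1 if the TPID is not ETH_P_DSA_8021Q. */
int parse_dsa_8021q(const uint8_t *tag, struct vlan_tci *out)
{
	uint16_t tpid = (uint16_t)((tag[0] << 8) | tag[1]);
	uint16_t tci = (uint16_t)((tag[2] << 8) | tag[3]);

	if (tpid != ETH_P_DSA_8021Q)
		return -1;

	out->pcp = (uint8_t)(tci >> 13);
	out->dei = (uint8_t)((tci >> 12) & 0x1);
	out->vid = (uint16_t)(tci & 0x0fff);
	return 0;
}
```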

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

bf5bc3ce

net: dsa: sja1105: Error out if RGMII delays are requested in DT · f5b8631c

Vladimir Oltean authored May 02, 2019

Documentation/devicetree/bindings/net/ethernet.txt is confusing because
it says what the MAC should not do, but not what it *should* do:

* "rgmii-rxid" (RGMII with internal RX delay provided by the PHY, the MAC
should not add an RX delay in this case)

The gap in semantics is threefold:
1. Is it illegal for the MAC to apply the Rx internal delay by itself,
and simplify the phy_mode (mask off "rgmii-rxid" into "rgmii") before
passing it to of_phy_connect? The documentation would suggest yes.
2. For "rgmii-rxid", while the situation with the Rx clock skew is more
or less clear (needs to be added by the PHY), what should the MAC
driver do about the Tx delays? Is it an implicit wild card for the
MAC to apply delays in the Tx direction if it can? What if those were
already added as serpentine PCB traces, how could that be made more
obvious through DT bindings so that the MAC doesn't attempt to add
them twice and again potentially break the link?
3. If the interface is a fixed-link and therefore the PHY object is
fixed (a purely software entity that obviously cannot add clock
skew), what is the meaning of the above property?

So an interpretation of the RGMII bindings was chosen that hopefully
does not contradict their intention but also makes them more practical
to apply. The SJA1105 driver acts upon "rgmii-*id" phy-mode bindings
only if the port is in the PHY role (either explicitly, or because it is
a fixed-link). Otherwise it always leaves the duty of setting up delays
to the PHY driver.

The error behavior that this patch adds is required on SJA1105E/T where
the MAC really cannot apply internal delays. If the other end of the
fixed-link cannot apply RGMII delays either (this would be specified
through its own DT bindings), then the situation requires PCB delays.

For SJA1105P/Q/R/S, this is however hardware supported and the error is
thus only temporary. I created a stub function pointer for configuring
delays per-port on RXC and TXC, and will implement it when I have access
to a board with this hardware setup.

Meanwhile do not allow the user to select an invalid configuration.
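
The interpretation above can be summarized as a small decision function.
This is only a sketch: the enum and function names are hypothetical, and
the mac_can_delay flag models the hardware capability in general, whereas
with this patch the delay setup is still a stub, so every case where the
MAC would have to add delays ends in an error.

```c
#include <stdbool.h>

enum rgmii_mode { RGMII, RGMII_ID, RGMII_RXID, RGMII_TXID };

enum delay_action {
	DELAYS_NONE,        /* plain "rgmii": nobody is asked to add delays */
	DELAYS_LEFT_TO_PHY, /* port in MAC role: the PHY driver handles it */
	DELAYS_BY_MAC,      /* port in PHY role and hardware can add delays */
	DELAYS_ERROR,       /* port in PHY role but delays cannot be applied */
};

/* phy_role is true if the port acts as the PHY, either explicitly or
 * because it is a fixed-link. */
enum delay_action rgmii_delay_action(enum rgmii_mode mode, bool phy_role,
				     bool mac_can_delay)
{
	if (mode == RGMII)
		return DELAYS_NONE;
	if (!phy_role)
		return DELAYS_LEFT_TO_PHY;
	return mac_can_delay ? DELAYS_BY_MAC : DELAYS_ERROR;
}
```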

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

f5b8631c