Commit f3c6e128 authored by David S. Miller

Merge branch 'ethtool-mac-merge'

Vladimir Oltean says:

====================
ethtool support for IEEE 802.3 MAC Merge layer

Change log
----------

v3->v4:
- add missing opening bracket in ocelot_port_mm_irq()
- moved cfg.verify_time range checking so that it actually takes place
  for the updated rather than old value
v3 at:
https://patchwork.kernel.org/project/netdevbpf/cover/20230117085947.2176464-1-vladimir.oltean@nxp.com/

v2->v3:
- made get_mm return int instead of void
- deleted ETHTOOL_A_MM_SUPPORTED
- renamed ETHTOOL_A_MM_ADD_FRAG_SIZE to ETHTOOL_A_MM_TX_MIN_FRAG_SIZE
- introduced ETHTOOL_A_MM_RX_MIN_FRAG_SIZE
- cleaned up documentation
- rebased on top of PLCA changes
- renamed ETHTOOL_STATS_SRC_* to ETHTOOL_MAC_STATS_SRC_*
v2 at:
https://patchwork.kernel.org/project/netdevbpf/cover/20230111161706.1465242-1-vladimir.oltean@nxp.com/

v1->v2:
I've decided to focus just on the MAC Merge layer for now, which is why
I am able to submit this patch set as non-RFC.
v1 (RFC) at:
https://patchwork.kernel.org/project/netdevbpf/cover/20220816222920.1952936-1-vladimir.oltean@nxp.com/

What is being introduced
------------------------

TL;DR: a MAC Merge layer as defined by IEEE 802.3-2018, clause 99
(interspersing of express traffic). This is controlled through ethtool
netlink (ETHTOOL_MSG_MM_GET, ETHTOOL_MSG_MM_SET). The raw ethtool
commands are posted here:
https://patchwork.kernel.org/project/netdevbpf/cover/20230111153638.1454687-1-vladimir.oltean@nxp.com/

The MAC Merge layer has its own statistics counters
(ethtool --include-statistics --show-mm swp0) as well as two member
MACs, the statistics of which can be queried individually, through a new
ethtool netlink attribute, corresponding to:

$ ethtool -I --show-pause eno2 --src aggregate
$ ethtool -S eno2 --groups eth-mac eth-phy eth-ctrl rmon -- --src pmac

The core properties of the MAC Merge layer are described in great detail
in patches 02/12 and 03/12. They can be viewed in "make htmldocs" format.

Devices for which the API is supported
--------------------------------------

I decided to start with the Ethernet switch on NXP LS1028A (Felix)
because of the smaller patch set. I also have support for the ENETC
controller pending.

I would like to get confirmation that the UAPI being proposed here will
not restrict any use cases known by other hardware vendors.

Why is support for preemptible traffic classes not here?
--------------------------------------------------------

There is legitimate concern whether the 802.1Q portion of the standard
(which traffic classes go to the eMAC and which to the pMAC) should be
modeled in Linux using tc or using another UAPI. I think that is
stalling the entire series, but should be discussed separately instead.
Removing FP adminStatus support makes me confident enough to submit this
patch set without an RFC tag (meaning: I wouldn't mind if it was merged
as is).

What is submitted here is sufficient for an LLDP daemon to do its job.
I've patched openlldp to advertise and configure frame preemption:
https://github.com/vladimiroltean/openlldp/tree/frame-preemption-v3



In case someone wants to try it out, here are some commands I've used.

 # Configure the interfaces to receive and transmit LLDP Data Units
 lldptool -L -i eno0 adminStatus=rxtx
 lldptool -L -i swp0 adminStatus=rxtx
 # Enable the transmission of certain TLVs on switch's interface
 lldptool -T -i eno0 -V addEthCap enableTx=yes
 lldptool -T -i swp0 -V addEthCap enableTx=yes
 # Query LLDP statistics on switch's interface
 lldptool -S -i swp0
 # Query the received neighbor TLVs
 lldptool -i swp0 -t -n -V addEthCap
 Additional Ethernet Capabilities TLV
         Preemption capability supported
         Preemption capability enabled
         Preemption capability active
         Additional fragment size: 60 octets

So using this patch set, lldpad will be able to advertise and configure
frame preemption, but still, no data packet will be sent as preemptible
over the link, because there is no UAPI to control which traffic classes
are sent as preemptible and which as express.
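
For readers wondering where the "60 octets" in the output above comes from, here is a minimal sketch of the IEEE 802.3 clause 99 relation between addFragSize (0..3) and the minimum non-final fragment size, 64 * (1 + addFragSize) - 4 octets. The helper and macro names are hypothetical and are not taken from this patch set; reading the lldptool value as the addFragSize = 0 case is an assumption.

```c
#include <stdint.h>

/* Hypothetical local constants: minimum Ethernet frame size including
 * the FCS, and the FCS length itself, both in octets.
 */
#define EX_MIN_FRAME_WITH_FCS	64u
#define EX_FCS_LEN		4u

/* Hypothetical helper: convert the 2-bit addFragSize value (0..3) into
 * the minimum non-final fragment size in octets.
 */
static uint32_t ex_mm_add_frag_size_to_min(uint32_t add_frag_size)
{
	return EX_MIN_FRAME_WITH_FCS * (1u + add_frag_size) - EX_FCS_LEN;
}

/* Hypothetical reverse mapping: returns 0 and fills *add_frag_size when
 * min_frag_size is one of the four encodable values, -1 otherwise.
 */
static int ex_mm_min_frag_size_to_add(uint32_t min_frag_size,
				      uint32_t *add_frag_size)
{
	uint32_t add;

	for (add = 0; add <= 3; add++) {
		if (ex_mm_add_frag_size_to_min(add) == min_frag_size) {
			*add_frag_size = add;
			return 0;
		}
	}

	return -1;
}
```

Under this relation the four encodable sizes are 60, 124, 188 and 252 octets, so a TX minimum fragment size outside that set cannot be represented as addFragSize without rounding.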

Preemptable or preemptible?
---------------------------

IEEE 802.3 uses "preemptable" throughout. IEEE 802.1Q uses "preemptible"
throughout. Because the definition of "preemptible" falls under 802.1Q's
jurisdiction and 802.3 just references it, I went with the 802.1Q naming
even where supporting an 802.3 feature. Also, checkpatch agrees with this.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
parents 40e0b090 6505b680
+107 −0
@@ -223,6 +223,8 @@ Userspace to kernel:
  ``ETHTOOL_MSG_PSE_SET``               set PSE parameters
  ``ETHTOOL_MSG_PSE_GET``               get PSE parameters
  ``ETHTOOL_MSG_RSS_GET``               get RSS settings
  ``ETHTOOL_MSG_MM_GET``                get MAC merge layer state
  ``ETHTOOL_MSG_MM_SET``                set MAC merge layer parameters
  ===================================== =================================

Kernel to userspace:
@@ -265,6 +267,7 @@ Kernel to userspace:
  ``ETHTOOL_MSG_MODULE_GET_REPLY``         transceiver module parameters
  ``ETHTOOL_MSG_PSE_GET_REPLY``            PSE parameters
  ``ETHTOOL_MSG_RSS_GET_REPLY``            RSS settings
  ``ETHTOOL_MSG_MM_GET_REPLY``             MAC merge layer status
  ======================================== =================================

``GET`` requests are sent by userspace applications to retrieve device
@@ -1089,8 +1092,18 @@ Request contents:

  =====================================  ======  ==========================
  ``ETHTOOL_A_PAUSE_HEADER``             nested  request header
  ``ETHTOOL_A_PAUSE_STATS_SRC``          u32     source of statistics
  =====================================  ======  ==========================

``ETHTOOL_A_PAUSE_STATS_SRC`` is optional. It takes values from:

.. kernel-doc:: include/uapi/linux/ethtool.h
    :identifiers: ethtool_mac_stats_src

If absent from the request, stats will be provided with
an ``ETHTOOL_A_PAUSE_STATS_SRC`` attribute in the response equal to
``ETHTOOL_MAC_STATS_SRC_AGGREGATE``.
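
The defaulting rule above can be sketched as follows. This is an illustration only, not the kernel's implementation: the enum is a local stand-in for ``enum ethtool_mac_stats_src`` (the enumerator order is an assumption), and the request structure, the helper name and the rejection of out-of-range values are hypothetical.

```c
#include <stdbool.h>

/* Local stand-in for enum ethtool_mac_stats_src; order is assumed. */
enum ex_mac_stats_src {
	EX_MAC_STATS_SRC_AGGREGATE,
	EX_MAC_STATS_SRC_EMAC,
	EX_MAC_STATS_SRC_PMAC,
};

/* Hypothetical parsed request: has_src records whether the optional
 * *_STATS_SRC attribute was present at all.
 */
struct ex_stats_request {
	bool has_src;
	unsigned int src;
};

/* Resolve the statistics source that the reply will carry.
 * Returns 0 on success, -1 if the requested value is unknown
 * (rejecting unknown values is an assumption of this sketch).
 */
static int ex_resolve_stats_src(const struct ex_stats_request *req,
				enum ex_mac_stats_src *out)
{
	if (!req->has_src) {
		/* Attribute absent: report aggregate statistics. */
		*out = EX_MAC_STATS_SRC_AGGREGATE;
		return 0;
	}

	if (req->src > EX_MAC_STATS_SRC_PMAC)
		return -1;

	*out = (enum ex_mac_stats_src)req->src;
	return 0;
}
```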

Kernel response contents:

  =====================================  ======  ==========================
@@ -1505,6 +1518,7 @@ Request contents:

  =======================================  ======  ==========================
  ``ETHTOOL_A_STATS_HEADER``               nested  request header
  ``ETHTOOL_A_STATS_SRC``                  u32     source of statistics
  ``ETHTOOL_A_STATS_GROUPS``               bitset  requested groups of stats
  =======================================  ======  ==========================

@@ -1513,6 +1527,8 @@ Kernel response contents:
 +-----------------------------------+--------+--------------------------------+
 | ``ETHTOOL_A_STATS_HEADER``        | nested | reply header                   |
 +-----------------------------------+--------+--------------------------------+
 | ``ETHTOOL_A_STATS_SRC``           | u32    | source of statistics           |
 +-----------------------------------+--------+--------------------------------+
 | ``ETHTOOL_A_STATS_GRP``           | nested | one or more group of stats     |
 +-+---------------------------------+--------+--------------------------------+
 | | ``ETHTOOL_A_STATS_GRP_ID``      | u32    | group ID - ``ETHTOOL_STATS_*`` |
@@ -1574,6 +1590,11 @@ Low and high bounds are inclusive, for example:
 etherStatsPkts512to1023Octets 512  1023
 ============================= ==== ====

``ETHTOOL_A_STATS_SRC`` is optional. Similar to ``PAUSE_GET``, it takes values
from ``enum ethtool_mac_stats_src``. If absent from the request, stats will be
provided with an ``ETHTOOL_A_STATS_SRC`` attribute in the response equal to
``ETHTOOL_MAC_STATS_SRC_AGGREGATE``.

PHC_VCLOCKS_GET
===============

@@ -1868,6 +1889,90 @@ When set, the ``ETHTOOL_A_PLCA_STATUS`` attribute indicates whether the node is
detecting the presence of the BEACON on the network. This flag is
corresponding to ``IEEE 802.3cg-2019`` 30.16.1.1.2 aPLCAStatus.

MM_GET
======

Retrieve 802.3 MAC Merge parameters.

Request contents:

  ====================================  ======  ==========================
  ``ETHTOOL_A_MM_HEADER``               nested  request header
  ====================================  ======  ==========================

Kernel response contents:

  =================================  ======  ===================================
  ``ETHTOOL_A_MM_HEADER``            nested  reply header
  ``ETHTOOL_A_MM_PMAC_ENABLED``      bool    set if RX of preemptible and SMD-V
                                             frames is enabled
  ``ETHTOOL_A_MM_TX_ENABLED``        bool    set if TX of preemptible frames is
                                             administratively enabled (might be
                                             inactive if verification failed)
  ``ETHTOOL_A_MM_TX_ACTIVE``         bool    set if TX of preemptible frames is
                                             operationally enabled
  ``ETHTOOL_A_MM_TX_MIN_FRAG_SIZE``  u32     minimum size of transmitted
                                             non-final fragments, in octets
  ``ETHTOOL_A_MM_RX_MIN_FRAG_SIZE``  u32     minimum size of received non-final
                                             fragments, in octets
  ``ETHTOOL_A_MM_VERIFY_ENABLED``    bool    set if TX of SMD-V frames is
                                             administratively enabled
  ``ETHTOOL_A_MM_VERIFY_STATUS``     u8      state of the verification function
  ``ETHTOOL_A_MM_VERIFY_TIME``       u32     delay between verification attempts
  ``ETHTOOL_A_MM_MAX_VERIFY_TIME``   u32     maximum verification interval
                                             supported by device
  ``ETHTOOL_A_MM_STATS``             nested  IEEE 802.3-2018 subclause 30.14.1
                                             oMACMergeEntity statistics counters
  =================================  ======  ===================================

The attributes are populated by the device driver through the following
structure:

.. kernel-doc:: include/linux/ethtool.h
    :identifiers: ethtool_mm_state

The ``ETHTOOL_A_MM_VERIFY_STATUS`` will report one of the values from

.. kernel-doc:: include/uapi/linux/ethtool.h
    :identifiers: ethtool_mm_verify_status

If ``ETHTOOL_A_MM_VERIFY_ENABLED`` was passed as false in the ``MM_SET``
command, ``ETHTOOL_A_MM_VERIFY_STATUS`` will report either
``ETHTOOL_MM_VERIFY_STATUS_INITIAL`` or ``ETHTOOL_MM_VERIFY_STATUS_DISABLED``,
otherwise it should report one of the other states.
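
As a rough illustration of the rule above, the following sketch checks whether a reported verification status is consistent with the administrative ``ETHTOOL_A_MM_VERIFY_ENABLED`` setting. The enum is a local stand-in for ``enum ethtool_mm_verify_status`` (enumerator order assumed), and the helper is hypothetical; it is not part of the ethtool core or of any driver.

```c
#include <stdbool.h>

/* Local stand-in for enum ethtool_mm_verify_status; order is assumed. */
enum ex_mm_verify_status {
	EX_MM_VERIFY_STATUS_UNKNOWN,
	EX_MM_VERIFY_STATUS_INITIAL,
	EX_MM_VERIFY_STATUS_VERIFYING,
	EX_MM_VERIFY_STATUS_SUCCEEDED,
	EX_MM_VERIFY_STATUS_FAILED,
	EX_MM_VERIFY_STATUS_DISABLED,
};

/* Hypothetical consistency check: with verification disabled, only
 * INITIAL or DISABLED are expected; with verification enabled, one of
 * the remaining states is expected.
 */
static bool ex_mm_verify_status_consistent(bool verify_enabled,
					   enum ex_mm_verify_status status)
{
	bool idle = status == EX_MM_VERIFY_STATUS_INITIAL ||
		    status == EX_MM_VERIFY_STATUS_DISABLED;

	return verify_enabled ? !idle : idle;
}
```

A check like this could be useful in a selftest or in user space tooling that sanity-checks a driver's ``MM_GET`` reply.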

It is recommended that drivers start with the pMAC disabled, and enable it upon
user space request. It is also recommended that user space does not depend upon
the default values from ``ETHTOOL_MSG_MM_GET`` requests.

``ETHTOOL_A_MM_STATS`` are reported if ``ETHTOOL_FLAG_STATS`` was set in
``ETHTOOL_A_HEADER_FLAGS``. The attribute will be empty if driver did not
report any statistics. Drivers fill in the statistics in the following
structure:

.. kernel-doc:: include/linux/ethtool.h
    :identifiers: ethtool_mm_stats

MM_SET
======

Modifies the configuration of the 802.3 MAC Merge layer.

Request contents:

  =================================  ======  ==========================
  ``ETHTOOL_A_MM_VERIFY_TIME``       u32     see MM_GET description
  ``ETHTOOL_A_MM_VERIFY_ENABLED``    bool    see MM_GET description
  ``ETHTOOL_A_MM_TX_ENABLED``        bool    see MM_GET description
  ``ETHTOOL_A_MM_PMAC_ENABLED``      bool    see MM_GET description
  ``ETHTOOL_A_MM_TX_MIN_FRAG_SIZE``  u32     see MM_GET description
  =================================  ======  ==========================

The attributes are propagated to the driver through the following structure:

.. kernel-doc:: include/linux/ethtool.h
    :identifiers: ethtool_mm_cfg
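
A minimal sketch of how a ``MM_SET`` request might be folded into the current configuration before validation, so that the verify time range check applies to the updated value rather than the old one. All structure and function names here are hypothetical mirrors for illustration; the real logic lives in the ethtool netlink core and may differ in detail.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the configuration handed to the driver. */
struct ex_mm_cfg {
	uint32_t verify_time;
	bool verify_enabled;
	bool tx_enabled;
	bool pmac_enabled;
	uint32_t tx_min_frag_size;
};

/* Hypothetical parsed request: a NULL pointer means the corresponding
 * attribute was absent and the current value must be kept.
 */
struct ex_mm_request {
	const uint32_t *verify_time;
	const bool *verify_enabled;
	const bool *tx_enabled;
	const bool *pmac_enabled;
	const uint32_t *tx_min_frag_size;
};

/* Merge the request over the current configuration, then validate the
 * merged result. On error, *cfg is left untouched.
 */
static int ex_mm_apply_request(struct ex_mm_cfg *cfg,
			       const struct ex_mm_request *req,
			       uint32_t max_verify_time)
{
	struct ex_mm_cfg updated = *cfg;

	if (req->verify_time)
		updated.verify_time = *req->verify_time;
	if (req->verify_enabled)
		updated.verify_enabled = *req->verify_enabled;
	if (req->tx_enabled)
		updated.tx_enabled = *req->tx_enabled;
	if (req->pmac_enabled)
		updated.pmac_enabled = *req->pmac_enabled;
	if (req->tx_min_frag_size)
		updated.tx_min_frag_size = *req->tx_min_frag_size;

	/* Range check on the updated value, not the old one. */
	if (updated.verify_time > max_verify_time)
		return -ERANGE;

	*cfg = updated;
	return 0;
}
```

Validating a merged copy and committing it only on success keeps the stored configuration unchanged when a request is rejected.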

Request translation
===================

@@ -1972,4 +2077,6 @@ are netlink only.
  n/a                                 ``ETHTOOL_MSG_PLCA_GET_CFG``
  n/a                                 ``ETHTOOL_MSG_PLCA_SET_CFG``
  n/a                                 ``ETHTOOL_MSG_PLCA_GET_STATUS``
  n/a                                 ``ETHTOOL_MSG_MM_GET``
  n/a                                 ``ETHTOOL_MSG_MM_SET``
  =================================== =====================================
+1 −0
@@ -171,6 +171,7 @@ statistics are supported in the following commands:

  - `ETHTOOL_MSG_PAUSE_GET`
  - `ETHTOOL_MSG_FEC_GET`
  - `ETHTOOL_MSG_MM_GET`

debugfs
-------
+28 −0
@@ -2024,6 +2024,31 @@ static int felix_port_del_dscp_prio(struct dsa_switch *ds, int port, u8 dscp,
	return ocelot_port_del_dscp_prio(ocelot, port, dscp, prio);
}

static int felix_get_mm(struct dsa_switch *ds, int port,
			struct ethtool_mm_state *state)
{
	struct ocelot *ocelot = ds->priv;

	return ocelot_port_get_mm(ocelot, port, state);
}

static int felix_set_mm(struct dsa_switch *ds, int port,
			struct ethtool_mm_cfg *cfg,
			struct netlink_ext_ack *extack)
{
	struct ocelot *ocelot = ds->priv;

	return ocelot_port_set_mm(ocelot, port, cfg, extack);
}

static void felix_get_mm_stats(struct dsa_switch *ds, int port,
			       struct ethtool_mm_stats *stats)
{
	struct ocelot *ocelot = ds->priv;

	ocelot_port_get_mm_stats(ocelot, port, stats);
}

const struct dsa_switch_ops felix_switch_ops = {
	.get_tag_protocol		= felix_get_tag_protocol,
	.change_tag_protocol		= felix_change_tag_protocol,
@@ -2031,6 +2056,9 @@ const struct dsa_switch_ops felix_switch_ops = {
	.setup				= felix_setup,
	.teardown			= felix_teardown,
	.set_ageing_time		= felix_set_ageing_time,
	.get_mm				= felix_get_mm,
	.set_mm				= felix_set_mm,
	.get_mm_stats			= felix_get_mm_stats,
	.get_stats64			= felix_get_stats64,
	.get_pause_stats		= felix_get_pause_stats,
	.get_rmon_stats			= felix_get_rmon_stats,
+49 −8
@@ -6,6 +6,7 @@
#include <soc/mscc/ocelot_qsys.h>
#include <soc/mscc/ocelot_vcap.h>
#include <soc/mscc/ocelot_ana.h>
#include <soc/mscc/ocelot_dev.h>
#include <soc/mscc/ocelot_ptp.h>
#include <soc/mscc/ocelot_sys.h>
#include <net/tc_act/tc_gate.h>
@@ -318,6 +319,29 @@ static const u32 vsc9959_sys_regmap[] = {
	REG(SYS_COUNT_RX_GREEN_PRIO_5,		0x0000a4),
	REG(SYS_COUNT_RX_GREEN_PRIO_6,		0x0000a8),
	REG(SYS_COUNT_RX_GREEN_PRIO_7,		0x0000ac),
	REG(SYS_COUNT_RX_ASSEMBLY_ERRS,		0x0000b0),
	REG(SYS_COUNT_RX_SMD_ERRS,		0x0000b4),
	REG(SYS_COUNT_RX_ASSEMBLY_OK,		0x0000b8),
	REG(SYS_COUNT_RX_MERGE_FRAGMENTS,	0x0000bc),
	REG(SYS_COUNT_RX_PMAC_OCTETS,		0x0000c0),
	REG(SYS_COUNT_RX_PMAC_UNICAST,		0x0000c4),
	REG(SYS_COUNT_RX_PMAC_MULTICAST,	0x0000c8),
	REG(SYS_COUNT_RX_PMAC_BROADCAST,	0x0000cc),
	REG(SYS_COUNT_RX_PMAC_SHORTS,		0x0000d0),
	REG(SYS_COUNT_RX_PMAC_FRAGMENTS,	0x0000d4),
	REG(SYS_COUNT_RX_PMAC_JABBERS,		0x0000d8),
	REG(SYS_COUNT_RX_PMAC_CRC_ALIGN_ERRS,	0x0000dc),
	REG(SYS_COUNT_RX_PMAC_SYM_ERRS,		0x0000e0),
	REG(SYS_COUNT_RX_PMAC_64,		0x0000e4),
	REG(SYS_COUNT_RX_PMAC_65_127,		0x0000e8),
	REG(SYS_COUNT_RX_PMAC_128_255,		0x0000ec),
	REG(SYS_COUNT_RX_PMAC_256_511,		0x0000f0),
	REG(SYS_COUNT_RX_PMAC_512_1023,		0x0000f4),
	REG(SYS_COUNT_RX_PMAC_1024_1526,	0x0000f8),
	REG(SYS_COUNT_RX_PMAC_1527_MAX,		0x0000fc),
	REG(SYS_COUNT_RX_PMAC_PAUSE,		0x000100),
	REG(SYS_COUNT_RX_PMAC_CONTROL,		0x000104),
	REG(SYS_COUNT_RX_PMAC_LONGS,		0x000108),
	REG(SYS_COUNT_TX_OCTETS,		0x000200),
	REG(SYS_COUNT_TX_UNICAST,		0x000204),
	REG(SYS_COUNT_TX_MULTICAST,		0x000208),
@@ -349,6 +373,20 @@ static const u32 vsc9959_sys_regmap[] = {
	REG(SYS_COUNT_TX_GREEN_PRIO_6,		0x000270),
	REG(SYS_COUNT_TX_GREEN_PRIO_7,		0x000274),
	REG(SYS_COUNT_TX_AGED,			0x000278),
	REG(SYS_COUNT_TX_MM_HOLD,		0x00027c),
	REG(SYS_COUNT_TX_MERGE_FRAGMENTS,	0x000280),
	REG(SYS_COUNT_TX_PMAC_OCTETS,		0x000284),
	REG(SYS_COUNT_TX_PMAC_UNICAST,		0x000288),
	REG(SYS_COUNT_TX_PMAC_MULTICAST,	0x00028c),
	REG(SYS_COUNT_TX_PMAC_BROADCAST,	0x000290),
	REG(SYS_COUNT_TX_PMAC_PAUSE,		0x000294),
	REG(SYS_COUNT_TX_PMAC_64,		0x000298),
	REG(SYS_COUNT_TX_PMAC_65_127,		0x00029c),
	REG(SYS_COUNT_TX_PMAC_128_255,		0x0002a0),
	REG(SYS_COUNT_TX_PMAC_256_511,		0x0002a4),
	REG(SYS_COUNT_TX_PMAC_512_1023,		0x0002a8),
	REG(SYS_COUNT_TX_PMAC_1024_1526,	0x0002ac),
	REG(SYS_COUNT_TX_PMAC_1527_MAX,		0x0002b0),
	REG(SYS_COUNT_DROP_LOCAL,		0x000400),
	REG(SYS_COUNT_DROP_TAIL,		0x000404),
	REG(SYS_COUNT_DROP_YELLOW_PRIO_0,	0x000408),
@@ -439,6 +477,9 @@ static const u32 vsc9959_dev_gmii_regmap[] = {
	REG(DEV_MAC_FC_MAC_LOW_CFG,		0x3c),
	REG(DEV_MAC_FC_MAC_HIGH_CFG,		0x40),
	REG(DEV_MAC_STICKY,			0x44),
	REG(DEV_MM_ENABLE_CONFIG,		0x48),
	REG(DEV_MM_VERIF_CONFIG,		0x4C),
	REG(DEV_MM_STATUS,			0x50),
	REG_RESERVED(PCS1G_CFG),
	REG_RESERVED(PCS1G_MODE_CFG),
	REG_RESERVED(PCS1G_SD_CFG),
@@ -2562,20 +2603,19 @@ static const struct felix_info felix_info_vsc9959 = {
	.tas_guard_bands_update	= vsc9959_tas_guard_bands_update,
};

/* The INTB interrupt is shared between PTP TX timestamp availability
 * notification and MAC Merge status change on each port.
 */
static irqreturn_t felix_irq_handler(int irq, void *data)
{
	struct ocelot *ocelot = (struct ocelot *)data;

	/* The INTB interrupt is used for both PTP TX timestamp interrupt
	 * and preemption status change interrupt on each port.
	 *
	 * - Get txtstamp if have
	 * - TODO: handle preemption. Without handling it, driver may get
	 *   interrupt storm.
	 */
	int port;

	ocelot_get_txtstamp(ocelot);

	for (port = 0; port < ocelot->num_phys_ports; port++)
		ocelot_port_mm_irq(ocelot, port);

	return IRQ_HANDLED;
}

@@ -2623,6 +2663,7 @@ static int felix_pci_probe(struct pci_dev *pdev,
	}

	ocelot->ptp = 1;
	ocelot->mm_supported = true;

	ds = kzalloc(sizeof(struct dsa_switch), GFP_KERNEL);
	if (!ds) {
+1 −0
@@ -5,6 +5,7 @@ mscc_ocelot_switch_lib-y := \
	ocelot_devlink.o \
	ocelot_flower.o \
	ocelot_io.o \
	ocelot_mm.o \
	ocelot_police.o \
	ocelot_ptp.o \
	ocelot_stats.o \