Merge branch 'dsa-changes-for-multiple-cpu-ports-part-4' (e8b9f0da)

Documentation/networking/dsa/configuration.rst

+96 −0

Original line number	Diff line number	Diff line
		@@ -49,6 +49,9 @@ In this documentation the following Ethernet interfaces are used:
		eth0
		the master interface

		eth1
		another master interface

		lan1
		a slave interface

		@@ -360,3 +363,96 @@ the ``self`` flag) has been removed. This results in the following changes:

		Script writers are therefore encouraged to use the ``master static`` set of
		flags when working with bridge FDB entries on DSA switch interfaces.

		Affinity of user ports to CPU ports
		-----------------------------------

		Typically, DSA switches are attached to the host via a single Ethernet
		interface, but in cases where the switch chip is discrete, the hardware design
		may permit the use of 2 or more ports connected to the host, for an increase in
		termination throughput.

		DSA can make use of multiple CPU ports in two ways. First, it is possible to
		statically assign the termination traffic associated with a certain user port
		to be processed by a certain CPU port. This way, user space can implement
		custom policies of static load balancing between user ports, by spreading the
		affinities according to the available CPU ports.

		Secondly, it is possible to perform load balancing between CPU ports on a per
		packet basis, rather than statically assigning user ports to CPU ports.
		This can be achieved by placing the DSA masters under a LAG interface (bonding
		or team). DSA monitors this operation and creates a mirror of this software LAG
		on the CPU ports facing the physical DSA masters that constitute the LAG slave
		devices.

		To make use of multiple CPU ports, the firmware (device tree) description of
		the switch must mark all the links between CPU ports and their DSA masters
		using the ``ethernet`` reference/phandle. At startup, only a single CPU port
		and DSA master will be used - the numerically first port from the firmware
		description which has an ``ethernet`` property. It is up to the user to
		configure the system for the switch to use other masters.

		DSA uses the ``rtnl_link_ops`` mechanism (with a "dsa" ``kind``) to allow
		changing the DSA master of a user port. The ``IFLA_DSA_MASTER`` u32 netlink
		attribute contains the ifindex of the master device that handles each slave
		device. The DSA master must be a valid candidate based on firmware node
		information, or a LAG interface which contains only slaves which are valid
		candidates.
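As a rough illustration of the paragraph above, the attribute nesting that a ``kind`` "dsa" link change carries can be sketched in plain user-space C. This is only a sketch under stated assumptions: the `DEMO_` constants mirror the values of `IFLA_LINKINFO`, `IFLA_INFO_KIND`, `IFLA_INFO_DATA` and `IFLA_DSA_MASTER` as I understand them from `linux/if_link.h` and should be checked against your headers, the helper names are hypothetical, and the enclosing netlink message header and `ifinfomsg` are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed attribute type values (redefined so the sketch builds without
 * kernel headers); verify against linux/if_link.h before relying on them. */
#define DEMO_IFLA_LINKINFO	18
#define DEMO_IFLA_INFO_KIND	1
#define DEMO_IFLA_INFO_DATA	2
#define DEMO_IFLA_DSA_MASTER	1

#define DEMO_NLA_HDRLEN		4u
#define DEMO_NLA_ALIGN(len)	(((len) + 3u) & ~3u)

/* Append one attribute (4-byte header + payload, padded to 4 bytes) at
 * offset off and return the new offset. No bounds checking is done. */
static size_t demo_put_attr(uint8_t *buf, size_t off, uint16_t type,
			    const void *data, uint16_t len)
{
	uint16_t nla_len = (uint16_t)(DEMO_NLA_HDRLEN + len);

	memset(buf + off, 0, DEMO_NLA_ALIGN(nla_len));
	memcpy(buf + off, &nla_len, sizeof(nla_len));
	memcpy(buf + off + 2, &type, sizeof(type));
	if (len)
		memcpy(buf + off + DEMO_NLA_HDRLEN, data, len);
	return off + DEMO_NLA_ALIGN(nla_len);
}

/* Close a nested attribute by patching its length field. */
static void demo_end_nest(uint8_t *buf, size_t nest_off, size_t end_off)
{
	uint16_t nla_len = (uint16_t)(end_off - nest_off);

	memcpy(buf + nest_off, &nla_len, sizeof(nla_len));
}

/* Build: IFLA_LINKINFO { IFLA_INFO_KIND "dsa",
 *                        IFLA_INFO_DATA { IFLA_DSA_MASTER = ifindex } }
 * Returns the number of bytes written to buf. */
static size_t demo_build_dsa_linkinfo(uint8_t *buf, uint32_t master_ifindex)
{
	size_t linkinfo = 0, data, off;

	off = demo_put_attr(buf, linkinfo, DEMO_IFLA_LINKINFO, NULL, 0);
	off = demo_put_attr(buf, off, DEMO_IFLA_INFO_KIND, "dsa", 3);
	data = off;
	off = demo_put_attr(buf, off, DEMO_IFLA_INFO_DATA, NULL, 0);
	off = demo_put_attr(buf, off, DEMO_IFLA_DSA_MASTER,
			    &master_ifindex, sizeof(master_ifindex));
	demo_end_nest(buf, data, off);
	demo_end_nest(buf, linkinfo, off);
	return off;
}
```

Calling `demo_build_dsa_linkinfo(buf, ifindex_of_eth1)` on a sufficiently large buffer yields the attribute payload that corresponds to `ip link set swp0 type dsa master eth1`; in practice iproute2 or a netlink library would build this for you.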

		Using iproute2, the following manipulations are possible:

		.. code-block:: sh

		  # See the DSA master in current use
		  ip -d link show dev swp0
		      (...)
		      dsa master eth0

		  # Static CPU port distribution
		  ip link set swp0 type dsa master eth1
		  ip link set swp1 type dsa master eth0
		  ip link set swp2 type dsa master eth1
		  ip link set swp3 type dsa master eth0

		  # CPU ports in LAG, using explicit assignment of the DSA master
		  ip link add bond0 type bond mode balance-xor && ip link set bond0 up
		  ip link set eth1 down && ip link set eth1 master bond0
		  ip link set swp0 type dsa master bond0
		  ip link set swp1 type dsa master bond0
		  ip link set swp2 type dsa master bond0
		  ip link set swp3 type dsa master bond0
		  ip link set eth0 down && ip link set eth0 master bond0
		  ip -d link show dev swp0
		      (...)
		      dsa master bond0

		  # CPU ports in LAG, relying on implicit migration of the DSA master
		  ip link add bond0 type bond mode balance-xor && ip link set bond0 up
		  ip link set eth0 down && ip link set eth0 master bond0
		  ip link set eth1 down && ip link set eth1 master bond0
		  ip -d link show dev swp0
		      (...)
		      dsa master bond0

		Notice that in the case of CPU ports under a LAG, the use of the
		``IFLA_DSA_MASTER`` netlink attribute is not strictly needed, but rather, DSA
		reacts to the ``IFLA_MASTER`` attribute change of its present master (``eth0``)
		and migrates all user ports to the new upper of ``eth0``, ``bond0``. Similarly,
		when ``bond0`` is destroyed using ``RTM_DELLINK``, DSA migrates the user ports
		that were assigned to this interface to the first physical DSA master which is
		eligible, based on the firmware description (it effectively reverts to the
		startup configuration).

		In a setup with more than 2 physical CPU ports, it is therefore possible to mix
		static user to CPU port assignment with LAG between DSA masters. It is not
		possible to statically assign a user port towards a DSA master that has any
		upper interfaces (this includes LAG devices - the master must always be the LAG
		in this case).

		Live changing of the DSA master (and thus CPU port) affinity of a user port is
		permitted, in order to allow dynamic redistribution in response to traffic.

		Physical DSA masters are allowed to join and leave a LAG interface used as a
		DSA master at any time; however, DSA will reject a LAG interface as a valid
		candidate for being a DSA master unless it has at least one physical DSA master
		as a slave device.
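The eligibility rules stated in this section can be condensed into a small user-space model. The sketch below is only an illustration of the documented rules, not the kernel's actual checks; `struct demo_netdev`, its fields and `demo_master_is_valid()` are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a network device. */
struct demo_netdev {
	bool is_lag;			/* bonding or team device */
	bool is_physical_dsa_master;	/* linked to a CPU port via ``ethernet`` */
	bool has_upper;			/* enslaved to some upper interface */
	const struct demo_netdev *const *lowers; /* LAG slave devices */
	size_t n_lowers;
};

/* Model of the documented rules for assigning a user port to a master:
 * - a master that has an upper interface cannot be targeted directly;
 * - a physical device must be a valid candidate per the firmware
 *   description;
 * - a LAG needs at least one slave, and all slaves must be physical
 *   DSA masters. */
static bool demo_master_is_valid(const struct demo_netdev *dev)
{
	size_t i;

	if (dev->has_upper)
		return false;

	if (!dev->is_lag)
		return dev->is_physical_dsa_master;

	if (dev->n_lowers == 0)
		return false;

	for (i = 0; i < dev->n_lowers; i++)
		if (!dev->lowers[i]->is_physical_dsa_master)
			return false;

	return true;
}
```

With `eth0` enslaved to `bond0`, the model rejects `eth0` and accepts `bond0`, which matches the statement that the master must always be the LAG in that case.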

Documentation/networking/dsa/dsa.rst

+32 −6

		@@ -303,6 +303,20 @@ These frames are then queued for transmission using the master network device
		Ethernet switch will be able to process these incoming frames from the
		management interface and deliver them to the physical switch port.

		When using multiple CPU ports, it is possible to stack a LAG (bonding/team)
		device between the DSA slave devices and the physical DSA masters. The LAG
		device is thus also a DSA master, but the LAG slave devices continue to be DSA
		masters as well (just with no user port assigned to them; this is needed for
		recovery in case the LAG DSA master disappears). Thus, the data path of the LAG
		DSA master is used asymmetrically. On RX, the ``ETH_P_XDSA`` handler, which
		calls ``dsa_switch_rcv()``, is invoked early (on the physical DSA master;
		LAG slave). Therefore, the RX data path of the LAG DSA master is not used.
		On the other hand, TX takes place linearly: ``dsa_slave_xmit`` calls
		``dsa_enqueue_skb``, which calls ``dev_queue_xmit`` towards the LAG DSA master.
		The latter calls ``dev_queue_xmit`` towards one physical DSA master or the
		other, and in both cases, the packet exits the system through a hardware path
		towards the switch.

		Graphical representation
		------------------------

		@@ -629,6 +643,24 @@ Switch configuration
		PHY cannot be found. In this case, probing of the DSA switch continues
		without that particular port.

		- ``port_change_master``: method through which the affinity (association used
		  for traffic termination purposes) between a user port and a CPU port can be
		  changed. By default all user ports from a tree are assigned to the first
		  available CPU port that makes sense for them (most of the time this means
		  the user ports of a tree are all assigned to the same CPU port, except for H
		  topologies as described in commit 2c0b03258b8b). The ``port`` argument
		  represents the index of the user port, and the ``master`` argument represents
		  the new DSA master ``net_device``. The CPU port associated with the new
		  master can be retrieved by looking at ``struct dsa_port *cpu_dp =
		  master->dsa_ptr``. Additionally, the master can also be a LAG device where
		  all the slave devices are physical DSA masters. LAG DSA masters also have a
		  valid ``master->dsa_ptr`` pointer, however this is not unique, but rather a
		  duplicate of the first physical DSA master's (LAG slave) ``dsa_ptr``. In case
		  of a LAG DSA master, a further call to ``port_lag_join`` will be emitted
		  separately for the physical CPU ports associated with the physical DSA
		  masters, requesting them to create a hardware LAG associated with the LAG
		  interface.
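The shape of a ``port_change_master`` implementation can be sketched with user-space stand-ins for the kernel types. Everything prefixed with `demo_` is hypothetical and exists only so the sketch compiles outside the kernel; the real callback operates on ``struct dsa_switch`` and ``struct net_device`` and may take further arguments that are omitted here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_NUM_PORTS	8

/* Minimal stand-ins for struct dsa_port, struct net_device and the
 * driver's private switch state. */
struct demo_dsa_port {
	int index;			/* CPU port number on the switch */
};

struct demo_net_device {
	struct demo_dsa_port *dsa_ptr;	/* set on DSA masters, LAG included */
};

struct demo_switch {
	int cpu_port_of[DEMO_NUM_PORTS]; /* user port -> CPU port index */
};

/* Change the CPU port affinity of user port 'port' to the CPU port
 * behind 'master'. For a LAG master, dsa_ptr duplicates the first
 * physical master's pointer, so the same lookup applies. */
static int demo_port_change_master(struct demo_switch *ds, int port,
				   struct demo_net_device *master)
{
	struct demo_dsa_port *cpu_dp = master->dsa_ptr;

	if (port < 0 || port >= DEMO_NUM_PORTS || !cpu_dp)
		return -EINVAL;

	/* A real driver would reprogram the hardware here so that traffic
	 * terminated from 'port' is delivered to cpu_dp->index. */
	ds->cpu_port_of[port] = cpu_dp->index;
	return 0;
}
```

The sketch covers only the affinity lookup; the hardware LAG itself would be set up by the separate ``port_lag_join`` calls described above.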

		PHY devices and link management
		-------------------------------

		@@ -1095,9 +1127,3 @@ capable hardware, but does not enforce a strict switch device driver model. On
		the other hand, DSA enforces a fairly strict device driver model, and deals with
		most of the switch-specific details. At some point we should envision a merger between these
		two subsystems and get the best of both worlds.

		-Other hanging fruits
		---------------------
		-
		-- allowing more than one CPU/management interface:
		-  http://comments.gmane.org/gmane.linux.network/365657

drivers/net/dsa/bcm_sf2.c

+2 −2

		@@ -983,7 +983,7 @@ static int bcm_sf2_sw_resume(struct dsa_switch *ds)
		 static void bcm_sf2_sw_get_wol(struct dsa_switch *ds, int port,
		 			       struct ethtool_wolinfo *wol)
		 {
		-	struct net_device *p = dsa_to_port(ds, port)->cpu_dp->master;
		+	struct net_device *p = dsa_port_to_master(dsa_to_port(ds, port));
		 	struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
		 	struct ethtool_wolinfo pwol = { };

		@@ -1007,7 +1007,7 @@ static void bcm_sf2_sw_get_wol(struct dsa_switch *ds, int port,
		 static int bcm_sf2_sw_set_wol(struct dsa_switch *ds, int port,
		 			      struct ethtool_wolinfo *wol)
		 {
		-	struct net_device *p = dsa_to_port(ds, port)->cpu_dp->master;
		+	struct net_device *p = dsa_port_to_master(dsa_to_port(ds, port));
		 	struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
		 	s8 cpu_port = dsa_to_port(ds, port)->cpu_dp->index;
		 	struct ethtool_wolinfo pwol = { };

drivers/net/dsa/bcm_sf2_cfp.c

+2 −2

		@@ -1102,7 +1102,7 @@ static int bcm_sf2_cfp_rule_get_all(struct bcm_sf2_priv *priv,
		 int bcm_sf2_get_rxnfc(struct dsa_switch *ds, int port,
		 		      struct ethtool_rxnfc *nfc, u32 *rule_locs)
		 {
		-	struct net_device *p = dsa_to_port(ds, port)->cpu_dp->master;
		+	struct net_device *p = dsa_port_to_master(dsa_to_port(ds, port));
		 	struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
		 	int ret = 0;

		@@ -1145,7 +1145,7 @@ int bcm_sf2_get_rxnfc(struct dsa_switch *ds, int port,
		 int bcm_sf2_set_rxnfc(struct dsa_switch *ds, int port,
		 		      struct ethtool_rxnfc *nfc)
		 {
		-	struct net_device *p = dsa_to_port(ds, port)->cpu_dp->master;
		+	struct net_device *p = dsa_port_to_master(dsa_to_port(ds, port));
		 	struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
		 	int ret = 0;

drivers/net/dsa/lan9303-core.c

+2 −2

		@@ -1092,7 +1092,7 @@ static int lan9303_port_enable(struct dsa_switch *ds, int port,
		 	if (!dsa_port_is_user(dp))
		 		return 0;

		-	vlan_vid_add(dp->cpu_dp->master, htons(ETH_P_8021Q), port);
		+	vlan_vid_add(dsa_port_to_master(dp), htons(ETH_P_8021Q), port);

		 	return lan9303_enable_processing_port(chip, port);
		 }
		@@ -1105,7 +1105,7 @@ static void lan9303_port_disable(struct dsa_switch *ds, int port)
		 	if (!dsa_port_is_user(dp))
		 		return;

		-	vlan_vid_del(dp->cpu_dp->master, htons(ETH_P_8021Q), port);
		+	vlan_vid_del(dsa_port_to_master(dp), htons(ETH_P_8021Q), port);

		 	lan9303_disable_processing_port(chip, port);
		 	lan9303_phy_write(ds, chip->phy_addr_base + port, MII_BMCR, BMCR_PDOWN);