summaryrefslogtreecommitdiff
path: root/drivers/net/dsa
AgeCommit message (Collapse)Author
2022-02-27net: dsa: sja1105: enforce FDB isolationVladimir Oltean
For sja1105, to enforce FDB isolation simply means to turn on Independent VLAN Learning unconditionally, and to remap VLAN-unaware FDB and MDB entries towards the private VLAN allocated by tag_8021q for each bridge. Standalone ports each have their own standalone tag_8021q VLAN. No learning happens in that VLAN due to: - learning being disabled on standalone user ports - learning being disabled on the CPU port (we use assisted_learning_on_cpu_port which only installs bridge FDBs) VLAN-aware ports learn FDB entries with the bridge VLANs. VLAN-unaware bridge ports learn with the tag_8021q VLAN for bridging. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-27net: dsa: pass extack to .port_bridge_join driver methodsVladimir Oltean
As FDB isolation cannot be enforced between VLAN-aware bridges in lack of hardware assistance like extra FID bits, it seems plausible that many DSA switches cannot do it. Therefore, they need to reject configurations with multiple VLAN-aware bridges from the two code paths that can transition towards that state: - joining a VLAN-aware bridge - toggling VLAN awareness on an existing bridge The .port_vlan_filtering method already propagates the netlink extack to the driver, let's propagate it from .port_bridge_join too, to make sure that the driver can use the same function for both. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-27net: dsa: request drivers to perform FDB isolationVladimir Oltean
For DSA, to encourage drivers to perform FDB isolation simply means to track which bridge does each FDB and MDB entry belong to. It then becomes the driver responsibility to use something that makes the FDB entry from one bridge not match the FDB lookup of ports from other bridges. The top-level functions where the bridge is determined are: - dsa_port_fdb_{add,del} - dsa_port_host_fdb_{add,del} - dsa_port_mdb_{add,del} - dsa_port_host_mdb_{add,del} aka the pre-crosschip-notifier functions. Changing the API to pass a reference to a bridge is not superfluous, and looking at the passed bridge argument is not the same as having the driver look at dsa_to_port(ds, port)->bridge from the ->port_fdb_add() method. DSA installs FDB and MDB entries on shared (CPU and DSA) ports as well, and those do not have any dp->bridge information to retrieve, because they are not in any bridge - they are merely the pipes that serve the user ports that are in one or multiple bridges. The struct dsa_bridge associated with each FDB/MDB entry is encapsulated in a larger "struct dsa_db" database. Although only databases associated to bridges are notified for now, this API will be the starting point for implementing IFF_UNICAST_FLT in DSA. There, the idea is to install FDB entries on the CPU port which belong to the corresponding user port's port database. These are supposed to match only when the port is standalone. It is better to introduce the API in its expected final form than to introduce it for bridges first, then to have to change drivers which may have made one or more assumptions. Drivers can use the provided bridge.num, but they can also use a different numbering scheme that is more convenient. DSA must perform refcounting on the CPU and DSA ports by also taking into account the bridge number. So if two bridges request the same local address, DSA must notify the driver twice, once for each bridge. In fact, if the driver supports FDB isolation, DSA must perform refcounting per bridge, but if the driver doesn't, DSA must refcount host addresses across all bridges, otherwise it would be telling the driver to delete an FDB entry for a bridge and the driver would delete it for all bridges. So introduce a bool fdb_isolation in drivers which would make all bridge databases passed to the cross-chip notifier have the same number (0). This makes dsa_mac_addr_find() -> dsa_db_equal() say that all bridge databases are the same database - which is essentially the legacy behavior. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-27net: dsa: tag_8021q: rename dsa_8021q_bridge_tx_fwd_offload_vidVladimir Oltean
The dsa_8021q_bridge_tx_fwd_offload_vid is no longer used just for bridge TX forwarding offload, it is the private VLAN reserved for VLAN-unaware bridging in a way that is compatible with FDB isolation. So just rename it dsa_tag_8021q_bridge_vid. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-27net: dsa: tag_8021q: merge RX and TX VLANsVladimir Oltean
In the old Shared VLAN Learning mode of operation that tag_8021q previously used for forwarding, we needed to have distinct concepts for an RX and a TX VLAN. An RX VLAN could be installed on all ports that were members of a given bridge, so that autonomous forwarding could still work, while a TX VLAN was dedicated for precise packet steering, so it just contained the CPU port and one egress port. Now that tag_8021q uses Independent VLAN Learning and imprecise RX/TX all over, those lines have been blurred and we no longer have the need to do precise TX towards a port that is in a bridge. As for standalone ports, it is fine to use the same VLAN ID for both RX and TX. This patch changes the tag_8021q format by shifting the VLAN range it reserves, and halving it. Previously, our DIR bits were encoding the VLAN direction (RX/TX) and were set to either 1 or 2. This meant that tag_8021q reserved 2K VLANs, or 50% of the available range. Change the DIR bits to a hardcoded value of 3 now, which makes tag_8021q reserve only 1K VLANs, and a different range now (the last 1K). This is done so that we leave the old format in place in case we need to return to it. In terms of code, the vid_is_dsa_8021q_rxvlan and vid_is_dsa_8021q_txvlan functions go away. Any vid_is_dsa_8021q is both a TX and an RX VLAN, and they are no longer distinct. For example, felix which did different things for different VLAN types, now needs to handle the RX and the TX logic for the same VLAN. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-27net: dsa: felix: delete workarounds present due to SVL tag_8021q bridgingVladimir Oltean
The felix driver, which also has a tagging protocol implementation based on tag_8021q, does not care about adding the RX VLAN that is pvid on one port on the other ports that are in the same bridge with it. It simply doesn't need that, because in its implementation, the RX VLAN that is pvid of a port is only used to install a TCAM rule that pushes that VLAN ID towards the CPU port. Now that tag_8021q no longer performs Shared VLAN Learning based forwarding, the RX VLANs are actually segregated into two types: standalone VLANs and VLAN-unaware bridging VLANs. Since you actually have to call dsa_tag_8021q_bridge_join() to get a bridging VLAN from tag_8021q, and felix does not do that because it doesn't need it, it means that it only gets standalone port VLANs from tag_8021q. Which is perfect because this means it can drop its workarounds that avoid the VLANs it does not need. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-27net: dsa: tag_8021q: replace the SVL bridging with VLAN-unaware IVL bridgingVladimir Oltean
For VLAN-unaware bridging, tag_8021q uses something perhaps a bit too tied with the sja1105 switch: each port uses the same pvid which is also used for standalone operation (a unique one from which the source port and device ID can be retrieved when packets from that port are forwarded to the CPU). Since each port has a unique pvid when performing autonomous forwarding, the switch must be configured for Shared VLAN Learning (SVL) such that the VLAN ID itself is ignored when performing FDB lookups. Without SVL, packets would always be flooded, since FDB lookup in the source port's VLAN would never find any entry. First of all, to make tag_8021q more palatable to switches which might not support Shared VLAN Learning, let's just use a common VLAN for all ports that are under the same bridge. Secondly, using Shared VLAN Learning means that FDB isolation can never be enforced. But if all ports under the same VLAN-unaware bridge share the same VLAN ID, it can. The disadvantage is that the CPU port can no longer perform precise source port identification for these packets. But at least we have a mechanism which has proven to be adequate for that situation: imprecise RX (dsa_find_designated_bridge_port_by_vid), which is what we use for termination on VLAN-aware bridges. The VLAN ID that VLAN-unaware bridges will use with tag_8021q is the same one as we were previously using for imprecise TX (bridge TX forwarding offload). It is already allocated, it is just a matter of using it. Note that because now all ports under the same bridge share the same VLAN, the complexity of performing a tag_8021q bridge join decreases dramatically. We no longer have to install the RX VLAN of a newly joining port into the port membership of the existing bridge ports. The newly joining port just becomes a member of the VLAN corresponding to that bridge, and the other ports are already members of it from when they joined the bridge themselves. So forwarding works properly. This means that we can unhook dsa_tag_8021q_bridge_{join,leave} from the cross-chip notifier level dsa_switch_bridge_{join,leave}. We can put these calls directly into the sja1105 driver. With this new mode of operation, a port controlled by tag_8021q can have two pvids whereas before it could only have one. The pvid for standalone operation is different from the pvid used for VLAN-unaware bridging. This is done, again, so that FDB isolation can be enforced. Let tag_8021q manage this by deleting the standalone pvid when a port joins a bridge, and restoring it when it leaves it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-26net: dsa: ocelot: mark as non-legacyRussell King (Oracle)
The ocelot DSA driver does not make use of the speed, duplex, pause or advertisement in its phylink_mac_config() implementation, so it can be marked as a non-legacy driver. Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-26net: dsa: ocelot: convert to mac_select_pcs()Russell King (Oracle)
Convert the PCS selection to use mac_select_pcs, which allows the PCS to perform any validation it needs, and removes the need to set the PCS in the mac_config() callback, delving into the higher DSA levels to do so. Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-26net: dsa: ocelot: remove interface checksRussell King (Oracle)
When the supported interfaces bitmap is populated, phylink will itself check that the interface mode is present in this bitmap. Drivers no longer need to perform this check themselves. Remove these checks. Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-26net: dsa: ocelot: populate supported_interfacesRussell King (Oracle)
Populate the supported interfaces bitmap for the Ocelot DSA switches. Since all sub-drivers only support a single interface mode, defined by ocelot_port->phy_mode, we can handle this in the main driver code without reference to the sub-driver. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25net: dsa: qca8k: return with -EINVAL on invalid portColin Ian King
Currently an invalid port throws a WARN_ON warning however invalid uninitialized values in reg and cpu_port_index are being used later on. Fix this by returning -EINVAL for an invalid port value. Addresses clang-scan warnings: drivers/net/dsa/qca8k.c:1981:3: warning: 2nd function call argument is an uninitialized value [core.CallAndMessage] drivers/net/dsa/qca8k.c:1999:9: warning: 2nd function call argument is an uninitialized value [core.CallAndMessage] Fixes: 7544b3ff745b ("net: dsa: qca8k: move pcs configuration") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20220224220557.147075-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-25net: dsa: sja1105: support switching between SGMII and 2500BASE-XRussell King (Oracle)
Vladimir Oltean suggests that sja1105 can support switching between SGMII and 2500BASE-X modes. Augment sja1105_phylink_get_caps() to fill in both interface modes if they can be supported. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25net: dsa: sja1105: convert to phylink_generic_validate()Russell King (Oracle)
Populate the MAC capabilities for the SJA1105 DSA switch using the same decision making which sja1105_phylink_validate() uses. Remove the now obsolete sja1105_phylink_validate() implementation to allow DSA to use phylink_generic_validate() for this switch driver. As noted by Vladimir, this fixes an inconsequential bug which allowed gigabit and lower interface modes to be indicated when operating in 2500base-X mode. Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25net: dsa: sja1105: mark as non-legacyRussell King (Oracle)
The sja1105 DSA driver does not have a phylink_mac_config() method implementation, it is safe to mark this as a non-legacy driver. Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25net: dsa: sja1105: use .mac_select_pcs() interfaceRussell King (Oracle)
Convert the PCS selection to use mac_select_pcs, which allows the PCS to perform any validation it needs, and removes the need to set the PCS in the mac_config() callback, delving into the higher DSA levels to do so. Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25net: dsa: sja1105: remove interface checksRussell King (Oracle)
When the supported interfaces bitmap is populated, phylink will itself check that the interface mode is present in this bitmap. Drivers no longer need to perform this check themselves. Remove these checks. Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25net: dsa: sja1105: populate supported_interfacesRussell King (Oracle)
Populate the supported interfaces bitmap for the SJA1105 DSA switch. This switch only supports a static model of configuration, so we restrict the interface modes to the configured setting. Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Vladimir Oltean <vladimir. │ Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-24net: dsa: felix: support FDB entries on offloaded LAG interfacesVladimir Oltean
This adds the logic in the Felix DSA driver and Ocelot switch library. For Ocelot switches, the DEST_IDX that is the output of the MAC table lookup is a logical port (equal to physical port, if no LAG is used, or a dynamically allocated number otherwise). The allocation we have in place for LAG IDs is different from DSA's, so we can't use that: - DSA allocates a continuous range of LAG IDs starting from 1 - Ocelot appears to require that physical ports and LAG IDs are in the same space of [0, num_phys_ports), and additionally, ports that aren't in a LAG must have physical port id == logical port id The implication is that an FDB entry towards a LAG might need to be deleted and reinstalled when the LAG ID changes. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24net: dsa: create a dsa_lag structureVladimir Oltean
The main purpose of this change is to create a data structure for a LAG as seen by DSA. This is similar to what we have for bridging - we pass a copy of this structure by value to ->port_lag_join and ->port_lag_leave. For now we keep the lag_dev, id and a reference count in it. Future patches will add a list of FDB entries for the LAG (these also need to be refcounted to work properly). The LAG structure is created using dsa_port_lag_create() and destroyed using dsa_port_lag_destroy(), just like we have for bridging. Because now, the dsa_lag itself is refcounted, we can simplify dsa_lag_map() and dsa_lag_unmap(). These functions need to keep a LAG in the dst->lags array only as long as at least one port uses it. The refcounting logic inside those functions can be removed now - they are called only when we should perform the operation. dsa_lag_dev() is renamed to dsa_lag_by_id() and now returns the dsa_lag structure instead of the lag_dev net_device. dsa_lag_foreach_port() now takes the dsa_lag structure as argument. dst->lags holds an array of dsa_lag structures. dsa_lag_map() now also saves the dsa_lag->id value, so that linear walking of dst->lags in drivers using dsa_lag_id() is no longer necessary. They can just look at lag.id. dsa_port_lag_id_get() is a helper, similar to dsa_port_bridge_num_get(), which can be used by drivers to get the LAG ID assigned by DSA to a given port. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24net: dsa: mv88e6xxx: use dsa_switch_for_each_port in mv88e6xxx_lag_sync_masksVladimir Oltean
Make the intent of the code more clear by using the dedicated helper for iterating over the ports of a switch. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24net: dsa: make LAG IDs one-basedVladimir Oltean
The DSA LAG API will be changed to become more similar with the bridge data structures, where struct dsa_bridge holds an unsigned int num, which is generated by DSA and is one-based. We have a similar thing going with the DSA LAG, except that isn't stored anywhere, it is calculated dynamically by dsa_lag_id() by iterating through dst->lags. The idea of encoding an invalid (or not requested) LAG ID as zero for the purpose of simplifying checks in drivers means that the LAG IDs passed by DSA to drivers need to be one-based too. So back-and-forth conversion is needed when indexing the dst->lags array, as well as in drivers which assume a zero-based index. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24net: dsa: qca8k: rename references to "lag" as "lag_dev"Vladimir Oltean
In preparation of converting struct net_device *dp->lag_dev into a struct dsa_lag *dp->lag, we need to rename, for consistency purposes, all occurrences of the "lag" variable in qca8k to "lag_dev". Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24net: dsa: mv88e6xxx: rename references to "lag" as "lag_dev"Vladimir Oltean
In preparation of converting struct net_device *dp->lag_dev into a struct dsa_lag *dp->lag, we need to rename, for consistency purposes, all occurrences of the "lag" variable in mv88e6xxx to "lag_dev". Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
tools/testing/selftests/net/mptcp/mptcp_join.sh 34aa6e3bccd8 ("selftests: mptcp: add ip mptcp wrappers") 857898eb4b28 ("selftests: mptcp: add missing join check") 6ef84b1517e0 ("selftests: mptcp: more robust signal race test") https://lore.kernel.org/all/20220221131842.468893-1-broonie@kernel.org/ drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c fb7e76ea3f3b6 ("net/mlx5e: TC, Skip redundant ct clear actions") c63741b426e11 ("net/mlx5e: Fix MPLSoUDP encap to use MPLS action information") 09bf97923224f ("net/mlx5e: TC, Move pedit_headers_action to parse_attr") 84ba8062e383 ("net/mlx5e: Test CT and SAMPLE on flow attr") efe6f961cd2e ("net/mlx5e: CT, Don't set flow flag CT for ct clear flow") 3b49a7edec1d ("net/mlx5e: TC, Reject rules with multiple CT actions") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-23net: dsa: mv88e6xxx: Add support for bridge port locked modeHans Schultz
Supporting bridge ports in locked mode using the drop on lock feature in Marvell mv88e6xxx switchcores is described in the '88E6096/88E6097/88E6097F Datasheet', sections 4.4.6, 4.4.7 and 5.1.2.1 (Drop on Lock). This feature is implemented here facilitated by the locked port flag. Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-23net: dsa: realtek: rtl8365mb: serialize indirect PHY register accessAlvin Šipraga
Realtek switches in the rtl8365mb family can access the PHY registers of the internal PHYs via the switch registers. This method is called indirect access. At a high level, the indirect PHY register access method involves reading and writing some special switch registers in a particular sequence. This works for both SMI and MDIO connected switches. Currently the rtl8365mb driver does not take any care to serialize the aforementioned access to the switch registers. In particular, it is permitted for other driver code to access other switch registers while the indirect PHY register access is ongoing. Locking is only done at the regmap level. This, however, is a bug: concurrent register access, even to unrelated switch registers, risks corrupting the PHY register value read back via the indirect access method described above. Arınç reported that the switch sometimes returns nonsense data when reading the PHY registers. In particular, a value of 0 causes the kernel's PHY subsystem to think that the link is down, but since most reads return correct data, the link then flip-flops between up and down over a period of time. The aforementioned bug can be readily observed by: 1. Enabling ftrace events for regmap and mdio 2. Polling BSMR PHY register for a connected port; it should always read the same (e.g. 0x79ed) 3. Wait for step 2 to give a different value Example command for step 2: while true; do phytool read swp2/2/0x01; done On my i.MX8MM, the above steps will yield a bogus value for the BSMR PHY register within a matter of seconds. The interleaved register access it then evident in the trace log: kworker/3:4-70 [003] ....... 1927.139849: regmap_reg_write: ethernet-switch reg=1004 val=bd phytool-16816 [002] ....... 1927.139979: regmap_reg_read: ethernet-switch reg=1f01 val=0 kworker/3:4-70 [003] ....... 1927.140381: regmap_reg_read: ethernet-switch reg=1005 val=0 phytool-16816 [002] ....... 1927.140468: regmap_reg_read: ethernet-switch reg=1d15 val=a69 kworker/3:4-70 [003] ....... 1927.140864: regmap_reg_read: ethernet-switch reg=1003 val=0 phytool-16816 [002] ....... 1927.140955: regmap_reg_write: ethernet-switch reg=1f02 val=2041 kworker/3:4-70 [003] ....... 1927.141390: regmap_reg_read: ethernet-switch reg=1002 val=0 phytool-16816 [002] ....... 1927.141479: regmap_reg_write: ethernet-switch reg=1f00 val=1 kworker/3:4-70 [003] ....... 1927.142311: regmap_reg_write: ethernet-switch reg=1004 val=be phytool-16816 [002] ....... 1927.142410: regmap_reg_read: ethernet-switch reg=1f01 val=0 kworker/3:4-70 [003] ....... 1927.142534: regmap_reg_read: ethernet-switch reg=1005 val=0 phytool-16816 [002] ....... 1927.142618: regmap_reg_read: ethernet-switch reg=1f04 val=0 phytool-16816 [002] ....... 1927.142641: mdio_access: SMI-0 read phy:0x02 reg:0x01 val:0x0000 <- ?! kworker/3:4-70 [003] ....... 1927.143037: regmap_reg_read: ethernet-switch reg=1001 val=0 kworker/3:4-70 [003] ....... 1927.143133: regmap_reg_read: ethernet-switch reg=1000 val=2d89 kworker/3:4-70 [003] ....... 1927.143213: regmap_reg_write: ethernet-switch reg=1004 val=be kworker/3:4-70 [003] ....... 1927.143291: regmap_reg_read: ethernet-switch reg=1005 val=0 kworker/3:4-70 [003] ....... 1927.143368: regmap_reg_read: ethernet-switch reg=1003 val=0 kworker/3:4-70 [003] ....... 1927.143443: regmap_reg_read: ethernet-switch reg=1002 val=6 The kworker here is polling MIB counters for stats, as evidenced by the register 0x1004 that we are writing to (RTL8365MB_MIB_ADDRESS_REG). This polling is performed every 3 seconds, but is just one example of such unsynchronized access. In Arınç's case, the driver was not using the switch IRQ, so the PHY subsystem was itself doing polling analogous to phytool in the above example. A test module was created [see second Link] to simulate such spurious switch register accesses while performing indirect PHY register reads and writes. Realtek was also consulted to confirm whether this is a known issue or not. The conclusion of these lines of inquiry is as follows: 1. Reading of PHY registers via indirect access will be aborted if, after executing the read operation (via a write to the INDIRECT_ACCESS_CTRL_REG), any register is accessed, other than INDIRECT_ACCESS_STATUS_REG. 2. The PHY register indirect read is only complete when INDIRECT_ACCESS_STATUS_REG reads zero. 3. The INDIRECT_ACCESS_DATA_REG, which is read to get the result of the PHY read, will contain the result of the last successful read operation. If there was spurious register access and the indirect read was aborted, then this register is not guaranteed to hold anything meaningful and the PHY read will silently fail. 4. PHY writes do not appear to be affected by this mechanism. 5. Other similar access routines, such as for MIB counters, although similar to the PHY indirect access method, are actually table access. Table access is not affected by spurious reads or writes of other registers. However, concurrent table access is not allowed. Currently this is protected via mib_lock, so there is nothing to fix. The above statements are corroborated both via the test module and through consultation with Realtek. In particular, Realtek states that this is simply a property of the hardware design and is not a hardware bug. To fix this problem, one must guard against regmap access while the PHY indirect register read is executing. Fix this by using the newly introduced "nolock" regmap in all PHY-related functions, and by aquiring the regmap mutex at the top level of the PHY register access callbacks. Although no issue has been observed with PHY register _writes_, this change also serializes the indirect access method there. This is done purely as a matter of convenience and for reasons of symmetry. Fixes: 4af2950c50c8 ("net: dsa: realtek-smi: add rtl8365mb subdriver for RTL8365MB-VC") Link: https://lore.kernel.org/netdev/CAJq09z5FCgG-+jVT7uxh1a-0CiiFsoKoHYsAWJtiKwv7LXKofQ@mail.gmail.com/ Link: https://lore.kernel.org/netdev/871qzwjmtv.fsf@bang-olufsen.dk/ Reported-by: Arınç ÜNAL <arinc.unal@arinc9.com> Reported-by: Luiz Angelo Daros de Luca <luizluca@gmail.com> Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-23net: dsa: realtek: allow subdrivers to externally lock regmapAlvin Šipraga
Currently there is no way for Realtek DSA subdrivers to serialize consecutive regmap accesses. In preparation for a bugfix relating to indirect PHY register access - which involves a series of regmap reads and writes - add a facility for subdrivers to serialize their regmap access. Specifically, a mutex is added to the driver private data structure and the standard regmap is initialized with custom lock/unlock ops which use this mutex. Then, a "nolock" variant of the regmap is added, which is functionally equivalent to the existing regmap except that regmap locking is disabled. Functions that wish to serialize a sequence of regmap accesses may then lock the newly introduced driver-owned mutex before using the nolock regmap. Doing things this way means that subdriver code that doesn't care about serialized register access - i.e. the vast majority of code - needn't worry about synchronizing register access with an external lock: it can just continue to use the original regmap. Another advantage of this design is that, while regmaps with locking disabled do not expose a debugfs interface for obvious reasons, there still exists the original regmap which does expose this interface. This interface remains safe to use even combined with driver codepaths that use the nolock regmap, because said codepaths will use the same mutex to synchronize access. With respect to disadvantages, it can be argued that having near-duplicate regmaps is confusing. However, the naming is rather explicit, and examples will abound. Finally, while we are at it, rename realtek_smi_mdio_regmap_config to realtek_smi_regmap_config. This makes it consistent with the naming realtek_mdio_regmap_config in realtek-mdio.c. Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-22net: dsa: microchip: ksz9477: reduce polling interval for statisticsOleksij Rempel
30 seconds is too long interval especially if it used with ip -s l. Reduce polling interval to 5 sec. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20220221084129.3660124-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-22net: dsa: b53: mark as non-legacyRussell King (Oracle)
The B53 driver does not make use of the speed, duplex, pause or advertisement in its phylink_mac_config() implementation, so it can be marked as a non-legacy driver. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-22net: dsa: b53: switch to using phylink_generic_validate()Russell King (Oracle)
Switch the Broadcom b53 driver to using the phylink_generic_validate() implementation by removing its own .phylink_validate method and associated code. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-22net: dsa: b53: drop use of phylink_helper_basex_speed()Russell King (Oracle)
Now that we have a better method to select SFP interface modes, we no longer need to use phylink_helper_basex_speed() in a driver's validation function. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-22net: dsa: b53: populate supported_interfaces and mac_capabilitiesRussell King (Oracle)
Populate the supported interfaces and MAC capabilities for the Broadcom B53 DSA switches in preparation to using these for the generic validation functionality. The interface modes are derived from: - b53_serdes_phylink_validate() - SRAB mux configuration Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-22net: dsa: b53: clean up if() condition to be more readableRussell King (Oracle)
I've stared at this if() statement for a while trying to work out if it really does correspond with the comment above, and it does seem to. However, let's make it more readable and phrase it in the same way as the comment. Also add a FIXME into the comment - we appear to deny Gigabit modes for 802.3z interface modes, but 802.3z interface modes only operate at gigabit and above. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19net: dsa: microchip: add ksz8563 to ksz9477 I2C driverAhmad Fatoum
The KSZ9477 SPI driver already has support for the KSZ8563. The same switch chip can also be managed via i2c and we have an KSZ9477 I2C driver, but that one lacks the relevant compatible entry. Add it. DT bindings already describe this compatible. Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19net: dsa: microchip: ksz9477: export HW stats over stats64 interfaceOleksij Rempel
Provide access to HW offloaded packets over stats64 interface. The rx/tx_bytes values needed some fixing since HW is accounting size of the Ethernet frame together with FCS. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19net: dsa: microchip: fix bridging with more than two member portsSvenning Sørensen
Commit b3612ccdf284 ("net: dsa: microchip: implement multi-bridge support") plugged a packet leak between ports that were members of different bridges. Unfortunately, this broke another use case, namely that of more than two ports that are members of the same bridge. After that commit, when a port is added to a bridge, hardware bridging between other member ports of that bridge will be cleared, preventing packet exchange between them. Fix by ensuring that the Port VLAN Membership bitmap includes any existing ports in the bridge, not just the port being added. Fixes: b3612ccdf284 ("net: dsa: microchip: implement multi-bridge support") Signed-off-by: Svenning Sørensen <sss@secomea.com> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18net: dsa: qca8k: mark as non-legacyRussell King (Oracle)
The qca8k driver does not make use of the speed, duplex, pause or advertisement in its phylink_mac_config() implementation, so it can be marked as a non-legacy driver. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18net: dsa: qca8k: move pcs configurationRussell King (Oracle)
Move the PCS configuration to qca8k_pcs_config(). Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18net: dsa: qca8k: convert to use phylink_pcsRussell King (Oracle)
Convert the qca8k driver to use the phylink_pcs support to talk to the SGMII PCS. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18net: dsa: qca8k: move qca8k_phylink_mac_link_state()Russell King (Oracle)
Move qca8k_phylink_mac_link_state() to separate the code movement from code changes. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18net: dsa: qca8k: move qca8k_setup()Russell King (Oracle)
Move qca8k_setup() to be later in the file to avoid needing prototypes for called functions. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17net: dsa: lan9303: add VLAN IDs to master deviceMans Rullgard
If the master device does VLAN filtering, the IDs used by the switch must be added for any frames to be received. Do this in the port_enable() function, and remove them in port_disable(). Fixes: a1292595e006 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303") Signed-off-by: Mans Rullgard <mans@mansr.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20220216204818.28746-1-mans@mansr.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17net: dsa: felix: update destinations of existing traps with ocelot-8021qVladimir Oltean
Historically, the felix DSA driver has installed special traps such that PTP over L2 works with the ocelot-8021q tagging protocol; commit 0a6f17c6ae21 ("net: dsa: tag_ocelot_8021q: add support for PTP timestamping") has the details. Then the ocelot switch library also gained more comprehensive support for PTP traps through commit 96ca08c05838 ("net: mscc: ocelot: set up traps for PTP packets"). Right now, PTP over L2 works using ocelot-8021q via the traps it has set for itself, but nothing else does. Consolidating the two code blocks would make ocelot-8021q gain support for PTP over L4 and tc-flower traps, and at the same time avoid some code and TCAM duplication. The traps are similar in intent, but different in execution, so some explanation is required. The traps set up by felix_setup_mmio_filtering() are VCAP IS1 filters, which have a PAG that chains them to a VCAP IS2 filter, and the IS2 is where the 'trap' action resides. The traps set up by ocelot_trap_add(), on the other hand, have a single filter, in VCAP IS2. The reason for chaining VCAP IS1 and IS2 in Felix was to ensure that the hardcoded traps take precedence and cannot be overridden by the Ocelot switch library. So in principle, the PTP traps needed for ocelot-8021q in the Felix driver can rely on ocelot_trap_add(), but the filters need to be patched to account for a quirk that LS1028A has: the quirk_no_xtr_irq described in commit 0a6f17c6ae21 ("net: dsa: tag_ocelot_8021q: add support for PTP timestamping"). Live-patching is done by iterating through the trap list every time we know it has been updated, and transforming a trap into a redirect + CPU copy if ocelot-8021q is in use. Making the DSA ocelot-8021q tagger work with the Ocelot traps means we can eliminate the dedicated OCELOT_VCAP_IS1_TAG_8021Q_PTP_MMIO and OCELOT_VCAP_IS2_TAG_8021Q_PTP_MMIO cookies. To minimize the patch delta, OCELOT_VCAP_IS2_MRP_TRAP takes the place of OCELOT_VCAP_IS2_TAG_8021Q_PTP_MMIO (the alternative would have been to left-shift all cookie numbers by 1). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17net: dsa: felix: remove dead code in felix_setup_mmio_filtering()Vladimir Oltean
There has been some controversy related to the sanity check that a CPU port exists, and commit e8b1d7698038 ("net: dsa: felix: Fix memory leak in felix_setup_mmio_filtering") even "corrected" an apparent memory leak as static analysis tools see it. However, the check is completely dead code, since the earliest point at which felix_setup_mmio_filtering() can be called is: felix_pci_probe -> dsa_register_switch -> dsa_switch_probe -> dsa_tree_setup -> dsa_tree_setup_cpu_ports -> dsa_tree_setup_default_cpu -> contains the "DSA: tree %d has no CPU port\n" check -> dsa_tree_setup_master -> dsa_master_setup -> sysfs_create_group(&dev->dev.kobj, &dsa_group); -> makes tagging_store() callable -> dsa_tree_change_tag_proto -> dsa_tree_notify -> dsa_switch_event -> dsa_switch_change_tag_proto -> ds->ops->change_tag_protocol -> felix_change_tag_protocol -> felix_set_tag_protocol -> felix_setup_tag_8021q -> felix_setup_mmio_filtering -> breaks at first CPU port So probing would have failed earlier if there wasn't any CPU port defined. To avoid all confusion, delete the dead code and replace it with a comment. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17net: dsa: felix: use DSA port iteration helpersVladimir Oltean
Use the helpers that avoid the quadratic complexity associated with calling dsa_to_port() indirectly: dsa_is_unused_port(), dsa_is_cpu_port(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17net: mscc: ocelot: consolidate cookie allocation for private VCAP rulesVladimir Oltean
Every use case that needed VCAP filters (in order: DSA tag_8021q, MRP, PTP traps) has hardcoded filter identifiers that worked well enough for that use case alone. But when two or more of those use cases would be used together, some of those identifiers would overlap, leading to breakage. Add definitions for each cookie and centralize them in ocelot_vcap.h, such that the overlaps are more obvious. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16net: dsa: lantiq_gswip: fix use after free in gswip_remove()Alexey Khoroshilov
of_node_put(priv->ds->slave_mii_bus->dev.of_node) should be done before mdiobus_free(priv->ds->slave_mii_bus). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Fixes: 0d120dfb5d67 ("net: dsa: lantiq_gswip: don't use devres for mdiobus") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/1644921768-26477-1-git-send-email-khoroshilov@ispras.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-14net: dsa: mv88e6xxx: Fix validation of built-in PHYs on 6095/6097Tobias Waldekranz
These chips have 8 built-in FE PHYs and 3 SERDES interfaces that can run at 1G. With the blamed commit, the built-in PHYs could no longer be connected to, using an MII PHY interface mode. Create a separate .phylink_get_caps callback for these chips, which takes the FE/GE split into consideration. Fixes: 2ee84cfefb1e ("net: dsa: mv88e6xxx: convert to phylink_generic_validate()") Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20220213185154.3262207-1-tobias@waldekranz.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>