diff options
510 files changed, 28310 insertions, 6745 deletions
diff --git a/Documentation/core-api/xarray.rst b/Documentation/core-api/xarray.rst index fcedc5349ace..640934b6f7b4 100644 --- a/Documentation/core-api/xarray.rst +++ b/Documentation/core-api/xarray.rst @@ -25,10 +25,6 @@ good performance with large indices. If your index can be larger than ``ULONG_MAX`` then the XArray is not the data type for you. The most important user of the XArray is the page cache. -Each non-``NULL`` entry in the array has three bits associated with -it called marks. Each mark may be set or cleared independently of -the others. You can iterate over entries which are marked. - Normal pointers may be stored in the XArray directly. They must be 4-byte aligned, which is true for any pointer returned from kmalloc() and alloc_page(). It isn't true for arbitrary user-space pointers, @@ -41,12 +37,11 @@ When you retrieve an entry from the XArray, you can check whether it is a value entry by calling xa_is_value(), and convert it back to an integer by calling xa_to_value(). -Some users want to store tagged pointers instead of using the marks -described above. They can call xa_tag_pointer() to create an -entry with a tag, xa_untag_pointer() to turn a tagged entry -back into an untagged pointer and xa_pointer_tag() to retrieve -the tag of an entry. Tagged pointers use the same bits that are used -to distinguish value entries from normal pointers, so each user must +Some users want to tag the pointers they store in the XArray. You can +call xa_tag_pointer() to create an entry with a tag, xa_untag_pointer() +to turn a tagged entry back into an untagged pointer and xa_pointer_tag() +to retrieve the tag of an entry. Tagged pointers use the same bits that +are used to distinguish value entries from normal pointers, so you must decide whether they want to store value entries or tagged pointers in any particular XArray. @@ -56,10 +51,9 @@ conflict with value entries or internal entries. An unusual feature of the XArray is the ability to create entries which occupy a range of indices. Once stored to, looking up any index in the range will return the same entry as looking up any other index in -the range. Setting a mark on one index will set it on all of them. -Storing to any index will store to all of them. Multi-index entries can -be explicitly split into smaller entries, or storing ``NULL`` into any -entry will cause the XArray to forget about the range. +the range. Storing to any index will store to all of them. Multi-index +entries can be explicitly split into smaller entries, or storing ``NULL`` +into any entry will cause the XArray to forget about the range. Normal API ========== @@ -87,17 +81,11 @@ If you want to only store a new entry to an index if the current entry at that index is ``NULL``, you can use xa_insert() which returns ``-EBUSY`` if the entry is not empty. -You can enquire whether a mark is set on an entry by using -xa_get_mark(). If the entry is not ``NULL``, you can set a mark -on it by using xa_set_mark() and remove the mark from an entry by -calling xa_clear_mark(). You can ask whether any entry in the -XArray has a particular mark set by calling xa_marked(). - You can copy entries out of the XArray into a plain array by calling -xa_extract(). Or you can iterate over the present entries in -the XArray by calling xa_for_each(). You may prefer to use -xa_find() or xa_find_after() to move to the next present -entry in the XArray. +xa_extract(). Or you can iterate over the present entries in the XArray +by calling xa_for_each(), xa_for_each_start() or xa_for_each_range(). +You may prefer to use xa_find() or xa_find_after() to move to the next +present entry in the XArray. Calling xa_store_range() stores the same entry in a range of indices. If you do this, some of the other operations will behave @@ -124,6 +112,31 @@ xa_destroy(). If the XArray entries are pointers, you may wish to free the entries first. You can do this by iterating over all present entries in the XArray using the xa_for_each() iterator. +Search Marks +------------ + +Each entry in the array has three bits associated with it called marks. +Each mark may be set or cleared independently of the others. You can +iterate over marked entries by using the xa_for_each_marked() iterator. + +You can enquire whether a mark is set on an entry by using +xa_get_mark(). If the entry is not ``NULL``, you can set a mark on it +by using xa_set_mark() and remove the mark from an entry by calling +xa_clear_mark(). You can ask whether any entry in the XArray has a +particular mark set by calling xa_marked(). Erasing an entry from the +XArray causes all marks associated with that entry to be cleared. + +Setting or clearing a mark on any index of a multi-index entry will +affect all indices covered by that entry. Querying the mark on any +index will return the same result. + +There is no way to iterate over entries which are not marked; the data +structure does not allow this to be implemented efficiently. There are +not currently iterators to search for logical combinations of bits (eg +iterate over all entries which have both ``XA_MARK_1`` and ``XA_MARK_2`` +set, or iterate over all entries which have ``XA_MARK_0`` or ``XA_MARK_2`` +set). It would be possible to add these if a user arises. + Allocating XArrays ------------------ @@ -180,6 +193,8 @@ No lock needed: Takes RCU read lock: * xa_load() * xa_for_each() + * xa_for_each_start() + * xa_for_each_range() * xa_find() * xa_find_after() * xa_extract() @@ -419,10 +434,9 @@ you last processed. If you have interrupts disabled while iterating, then it is good manners to pause the iteration and reenable interrupts every ``XA_CHECK_SCHED`` entries. -The xas_get_mark(), xas_set_mark() and -xas_clear_mark() functions require the xa_state cursor to have -been moved to the appropriate location in the xarray; they will do -nothing if you have called xas_pause() or xas_set() +The xas_get_mark(), xas_set_mark() and xas_clear_mark() functions require +the xa_state cursor to have been moved to the appropriate location in the +XArray; they will do nothing if you have called xas_pause() or xas_set() immediately before. You can call xas_set_update() to have a callback function diff --git a/Documentation/devicetree/bindings/net/broadcom-bluetooth.txt b/Documentation/devicetree/bindings/net/broadcom-bluetooth.txt index b5eadee4a9a7..dd258674633c 100644 --- a/Documentation/devicetree/bindings/net/broadcom-bluetooth.txt +++ b/Documentation/devicetree/bindings/net/broadcom-bluetooth.txt @@ -11,6 +11,7 @@ Required properties: - compatible: should contain one of the following: * "brcm,bcm20702a1" + * "brcm,bcm4329-bt" * "brcm,bcm4330-bt" * "brcm,bcm43438-bt" * "brcm,bcm4345c5" @@ -22,7 +23,9 @@ Optional properties: - max-speed: see Documentation/devicetree/bindings/serial/slave-device.txt - shutdown-gpios: GPIO specifier, used to enable the BT module - device-wakeup-gpios: GPIO specifier, used to wakeup the controller - - host-wakeup-gpios: GPIO specifier, used to wakeup the host processor + - host-wakeup-gpios: GPIO specifier, used to wakeup the host processor. + deprecated, replaced by interrupts and + "host-wakeup" interrupt-names - clocks: 1 or 2 clocks as defined in clock-names below, in that order - clock-names: names for clock inputs, matching the clocks given - "extclk": deprecated, replaced by "txco" @@ -36,7 +39,8 @@ Optional properties: - pcm-frame-type: short, long - pcm-sync-mode: slave, master - pcm-clock-mode: slave, master - + - interrupts: must be one, used to wakeup the host processor + - interrupt-names: must be "host-wakeup" Example: diff --git a/Documentation/devicetree/bindings/net/fsl-fman.txt b/Documentation/devicetree/bindings/net/fsl-fman.txt index 299c0dcd67db..250f8d8cdce4 100644 --- a/Documentation/devicetree/bindings/net/fsl-fman.txt +++ b/Documentation/devicetree/bindings/net/fsl-fman.txt @@ -403,6 +403,19 @@ PROPERTIES The settings and programming routines for internal/external MDIO are different. Must be included for internal MDIO. +- fsl,erratum-a011043 + Usage: optional + Value type: <boolean> + Definition: Indicates the presence of the A011043 erratum + describing that the MDIO_CFG[MDIO_RD_ER] bit may be falsely + set when reading internal PCS registers. MDIO reads to + internal PCS registers may result in having the + MDIO_CFG[MDIO_RD_ER] bit set, even when there is no error and + read data (MDIO_DATA[MDIO_DATA]) is correct. + Software may get false read error when reading internal + PCS registers through MDIO. As a workaround, all internal + MDIO accesses should ignore the MDIO_CFG[MDIO_RD_ER] bit. + For internal PHY device on internal mdio bus, a PHY node should be created. See the definition of the PHY node in booting-without-of.txt for an example of how to define a PHY (Internal PHY has no interrupt line). diff --git a/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.txt b/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.txt index 017128394a3e..616c87746d6f 100644 --- a/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.txt +++ b/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.txt @@ -50,7 +50,7 @@ Optional properties: entry in clock-names. - clock-names: Should contain the clock names "wifi_wcss_cmd", "wifi_wcss_ref", "wifi_wcss_rtc" for "qcom,ipq4019-wifi" compatible target and - "cxo_ref_clk_pin" for "qcom,wcn3990-wifi" + "cxo_ref_clk_pin" and optionally "qdss" for "qcom,wcn3990-wifi" compatible target. - qcom,msi_addr: MSI interrupt address. - qcom,msi_base: Base value to add before writing MSI data into @@ -88,6 +88,9 @@ Optional properties: of the host capability QMI request - qcom,xo-cal-data: xo cal offset to be configured in xo trim register. +- qcom,msa-fixed-perm: Boolean context flag to disable SCM call for statically + mapped msa region. + Example (to supply PCI based wifi block details): In this example, the node is defined as child node of the PCI controller. @@ -185,4 +188,5 @@ wifi@18000000 { vdd-3.3-ch0-supply = <&vreg_l25a_3p3>; memory-region = <&wifi_msa_mem>; iommus = <&apps_smmu 0x0040 0x1>; + qcom,msa-fixed-perm; }; diff --git a/Documentation/networking/device_drivers/microsoft/netvsc.txt b/Documentation/networking/device_drivers/microsoft/netvsc.txt index 3bfa635bbbd5..cd63556b27a0 100644 --- a/Documentation/networking/device_drivers/microsoft/netvsc.txt +++ b/Documentation/networking/device_drivers/microsoft/netvsc.txt @@ -82,3 +82,24 @@ Features contain one or more packets. The send buffer is an optimization, the driver will use slower method to handle very large packets or if the send buffer area is exhausted. + + XDP support + ----------- + XDP (eXpress Data Path) is a feature that runs eBPF bytecode at the early + stage when packets arrive at a NIC card. The goal is to increase performance + for packet processing, reducing the overhead of SKB allocation and other + upper network layers. + + hv_netvsc supports XDP in native mode, and transparently sets the XDP + program on the associated VF NIC as well. + + Setting / unsetting XDP program on synthetic NIC (netvsc) propagates to + VF NIC automatically. Setting / unsetting XDP program on VF NIC directly + is not recommended, also not propagated to synthetic NIC, and may be + overwritten by setting of synthetic NIC. + + XDP program cannot run with LRO (RSC) enabled, so you need to disable LRO + before running XDP: + ethtool -K eth0 lro off + + XDP_REDIRECT action is not yet supported. diff --git a/Documentation/networking/devlink/bnxt.rst b/Documentation/networking/devlink/bnxt.rst index 79e746d22a53..82ef9ec46707 100644 --- a/Documentation/networking/devlink/bnxt.rst +++ b/Documentation/networking/devlink/bnxt.rst @@ -39,3 +39,36 @@ parameters. - Generic Routing Encapsulation (GRE) version check will be enabled in the device. If disabled, the device will skip the version check for incoming packets. + +Info versions +============= + +The ``bnxt_en`` driver reports the following versions + +.. list-table:: devlink info versions implemented + :widths: 5 5 90 + + * - Name + - Type + - Description + * - ``asic.id`` + - fixed + - ASIC design identifier + * - ``asic.rev`` + - fixed + - ASIC design revision + * - ``fw.psid`` + - stored, running + - Firmware parameter set version of the board + * - ``fw`` + - stored, running + - Overall board firmware version + * - ``fw.app`` + - stored, running + - Data path firmware version + * - ``fw.mgmt`` + - stored, running + - Management firmware version + * - ``fw.roce`` + - stored, running + - RoCE management firmware version diff --git a/Documentation/networking/devlink/devlink-info.rst b/Documentation/networking/devlink/devlink-info.rst index 0385f15028b1..70981dd1b981 100644 --- a/Documentation/networking/devlink/devlink-info.rst +++ b/Documentation/networking/devlink/devlink-info.rst @@ -92,3 +92,9 @@ fw.psid ------- Unique identifier of the firmware parameter set. + +fw.roce +------- + +RoCE firmware version which is responsible for handling roce +management. diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst index c60afba69e3c..f1f868479ceb 100644 --- a/Documentation/networking/ethtool-netlink.rst +++ b/Documentation/networking/ethtool-netlink.rst @@ -185,18 +185,26 @@ Userspace to kernel: ``ETHTOOL_MSG_LINKMODES_GET`` get link modes info ``ETHTOOL_MSG_LINKMODES_SET`` set link modes info ``ETHTOOL_MSG_LINKSTATE_GET`` get link state + ``ETHTOOL_MSG_DEBUG_GET`` get debugging settings + ``ETHTOOL_MSG_DEBUG_SET`` set debugging settings + ``ETHTOOL_MSG_WOL_GET`` get wake-on-lan settings + ``ETHTOOL_MSG_WOL_SET`` set wake-on-lan settings ===================================== ================================ Kernel to userspace: - ===================================== ================================ + ===================================== ================================= ``ETHTOOL_MSG_STRSET_GET_REPLY`` string set contents ``ETHTOOL_MSG_LINKINFO_GET_REPLY`` link settings ``ETHTOOL_MSG_LINKINFO_NTF`` link settings notification ``ETHTOOL_MSG_LINKMODES_GET_REPLY`` link modes info ``ETHTOOL_MSG_LINKMODES_NTF`` link modes notification ``ETHTOOL_MSG_LINKSTATE_GET_REPLY`` link state info - ===================================== ================================ + ``ETHTOOL_MSG_DEBUG_GET_REPLY`` debugging settings + ``ETHTOOL_MSG_DEBUG_NTF`` debugging settings notification + ``ETHTOOL_MSG_WOL_GET_REPLY`` wake-on-lan settings + ``ETHTOOL_MSG_WOL_NTF`` wake-on-lan settings notification + ===================================== ================================= ``GET`` requests are sent by userspace applications to retrieve device information. They usually do not contain any message specific attributes. @@ -423,6 +431,96 @@ define their own handler. devices supporting the request). +DEBUG_GET +========= + +Requests debugging settings of a device. At the moment, only message mask is +provided. + +Request contents: + + ==================================== ====== ========================== + ``ETHTOOL_A_DEBUG_HEADER`` nested request header + ==================================== ====== ========================== + +Kernel response contents: + + ==================================== ====== ========================== + ``ETHTOOL_A_DEBUG_HEADER`` nested reply header + ``ETHTOOL_A_DEBUG_MSGMASK`` bitset message mask + ==================================== ====== ========================== + +The message mask (``ETHTOOL_A_DEBUG_MSGMASK``) is equal to message level as +provided by ``ETHTOOL_GMSGLVL`` and set by ``ETHTOOL_SMSGLVL`` in ioctl +interface. While it is called message level there for historical reasons, most +drivers and almost all newer drivers use it as a mask of enabled message +classes (represented by ``NETIF_MSG_*`` constants); therefore netlink +interface follows its actual use in practice. + +``DEBUG_GET`` allows dump requests (kernel returns reply messages for all +devices supporting the request). + + +DEBUG_SET +========= + +Set or update debugging settings of a device. At the moment, only message mask +is supported. + +Request contents: + + ==================================== ====== ========================== + ``ETHTOOL_A_DEBUG_HEADER`` nested request header + ``ETHTOOL_A_DEBUG_MSGMASK`` bitset message mask + ==================================== ====== ========================== + +``ETHTOOL_A_DEBUG_MSGMASK`` bit set allows setting or modifying mask of +enabled debugging message types for the device. + + +WOL_GET +======= + +Query device wake-on-lan settings. Unlike most "GET" type requests, +``ETHTOOL_MSG_WOL_GET`` requires (netns) ``CAP_NET_ADMIN`` privileges as it +(potentially) provides SecureOn(tm) password which is confidential. + +Request contents: + + ==================================== ====== ========================== + ``ETHTOOL_A_WOL_HEADER`` nested request header + ==================================== ====== ========================== + +Kernel response contents: + + ==================================== ====== ========================== + ``ETHTOOL_A_WOL_HEADER`` nested reply header + ``ETHTOOL_A_WOL_MODES`` bitset mask of enabled WoL modes + ``ETHTOOL_A_WOL_SOPASS`` binary SecureOn(tm) password + ==================================== ====== ========================== + +In reply, ``ETHTOOL_A_WOL_MODES`` mask consists of modes supported by the +device, value of modes which are enabled. ``ETHTOOL_A_WOL_SOPASS`` is only +included in reply if ``WAKE_MAGICSECURE`` mode is supported. + + +WOL_SET +======= + +Set or update wake-on-lan settings. + +Request contents: + + ==================================== ====== ========================== + ``ETHTOOL_A_WOL_HEADER`` nested request header + ``ETHTOOL_A_WOL_MODES`` bitset enabled WoL modes + ``ETHTOOL_A_WOL_SOPASS`` binary SecureOn(tm) password + ==================================== ====== ========================== + +``ETHTOOL_A_WOL_SOPASS`` is only allowed for devices supporting +``WAKE_MAGICSECURE`` mode. + + Request translation =================== @@ -439,10 +537,10 @@ have their netlink replacement yet. ``ETHTOOL_MSG_LINKMODES_SET`` ``ETHTOOL_GDRVINFO`` n/a ``ETHTOOL_GREGS`` n/a - ``ETHTOOL_GWOL`` n/a - ``ETHTOOL_SWOL`` n/a - ``ETHTOOL_GMSGLVL`` n/a - ``ETHTOOL_SMSGLVL`` n/a + ``ETHTOOL_GWOL`` ``ETHTOOL_MSG_WOL_GET`` + ``ETHTOOL_SWOL`` ``ETHTOOL_MSG_WOL_SET`` + ``ETHTOOL_GMSGLVL`` ``ETHTOOL_MSG_DEBUG_GET`` + ``ETHTOOL_SMSGLVL`` ``ETHTOOL_MSG_DEBUG_SET`` ``ETHTOOL_NWAY_RST`` n/a ``ETHTOOL_GLINK`` ``ETHTOOL_MSG_LINKSTATE_GET`` ``ETHTOOL_GEEPROM`` n/a diff --git a/MAINTAINERS b/MAINTAINERS index 702382b89c37..31084bdb1639 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6198,6 +6198,7 @@ ETHERNET PHY LIBRARY M: Andrew Lunn <andrew@lunn.ch> M: Florian Fainelli <f.fainelli@gmail.com> M: Heiner Kallweit <hkallweit1@gmail.com> +R: Russell King <linux@armlinux.org.uk> L: netdev@vger.kernel.org S: Maintained F: Documentation/ABI/testing/sysfs-class-net-phydev @@ -8570,7 +8571,7 @@ S: Maintained F: drivers/platform/x86/intel-vbtn.c INTEL WIRELESS 3945ABG/BG, 4965AGN (iwlegacy) -M: Stanislaw Gruszka <sgruszka@redhat.com> +M: Stanislaw Gruszka <stf_xl@wp.pl> L: linux-wireless@vger.kernel.org S: Supported F: drivers/net/wireless/intel/iwlegacy/ @@ -9960,7 +9961,6 @@ F: drivers/net/ethernet/marvell/mvneta.* MARVELL MWIFIEX WIRELESS DRIVER M: Amitkumar Karwar <amitkarwar@gmail.com> -M: Nishant Sarmukadam <nishants@marvell.com> M: Ganapathi Bhat <ganapathi.bhat@nxp.com> M: Xinming Hu <huxinming820@gmail.com> L: linux-wireless@vger.kernel.org @@ -11500,6 +11500,7 @@ F: drivers/net/dsa/ NETWORKING [GENERAL] M: "David S. Miller" <davem@davemloft.net> +M: Jakub Kicinski <kuba@kernel.org> L: netdev@vger.kernel.org W: http://www.linuxfoundation.org/en/Net Q: http://patchwork.ozlabs.org/project/netdev/list/ @@ -11583,6 +11584,8 @@ W: https://github.com/multipath-tcp/mptcp_net-next/wiki B: https://github.com/multipath-tcp/mptcp_net-next/issues S: Maintained F: include/net/mptcp.h +F: net/mptcp/ +F: tools/testing/selftests/net/mptcp/ NETWORKING [TCP] M: Eric Dumazet <edumazet@google.com> @@ -13838,7 +13841,7 @@ S: Maintained F: arch/mips/ralink RALINK RT2X00 WIRELESS LAN DRIVER -M: Stanislaw Gruszka <sgruszka@redhat.com> +M: Stanislaw Gruszka <stf_xl@wp.pl> M: Helmut Schaa <helmut.schaa@googlemail.com> L: linux-wireless@vger.kernel.org S: Maintained @@ -16620,7 +16623,7 @@ F: kernel/time/ntp.c F: tools/testing/selftests/timers/ TIPC NETWORK LAYER -M: Jon Maloy <jon.maloy@ericsson.com> +M: Jon Maloy <jmaloy@redhat.com> M: Ying Xue <ying.xue@windriver.com> L: netdev@vger.kernel.org (core kernel code) L: tipc-discussion@lists.sourceforge.net (user apps, general discussion) @@ -2,7 +2,7 @@ VERSION = 5 PATCHLEVEL = 5 SUBLEVEL = 0 -EXTRAVERSION = -rc6 +EXTRAVERSION = -rc7 NAME = Kleptomaniac Octopus # *DOCUMENTATION* diff --git a/arch/arm/boot/dts/am335x-boneblack-common.dtsi b/arch/arm/boot/dts/am335x-boneblack-common.dtsi index 7ad079861efd..91f93bc89716 100644 --- a/arch/arm/boot/dts/am335x-boneblack-common.dtsi +++ b/arch/arm/boot/dts/am335x-boneblack-common.dtsi @@ -131,6 +131,11 @@ }; / { + memory@80000000 { + device_type = "memory"; + reg = <0x80000000 0x20000000>; /* 512 MB */ + }; + clk_mcasp0_fixed: clk_mcasp0_fixed { #clock-cells = <0>; compatible = "fixed-clock"; diff --git a/arch/arm/boot/dts/am43x-epos-evm.dts b/arch/arm/boot/dts/am43x-epos-evm.dts index 078cb473fa7d..a6fbc088daa8 100644 --- a/arch/arm/boot/dts/am43x-epos-evm.dts +++ b/arch/arm/boot/dts/am43x-epos-evm.dts @@ -848,6 +848,7 @@ pinctrl-names = "default", "sleep"; pinctrl-0 = <&spi0_pins_default>; pinctrl-1 = <&spi0_pins_sleep>; + ti,pindir-d0-out-d1-in = <1>; }; &spi1 { @@ -855,6 +856,7 @@ pinctrl-names = "default", "sleep"; pinctrl-0 = <&spi1_pins_default>; pinctrl-1 = <&spi1_pins_sleep>; + ti,pindir-d0-out-d1-in = <1>; }; &usb2_phy1 { diff --git a/arch/arm/kernel/hyp-stub.S b/arch/arm/kernel/hyp-stub.S index ae5020302de4..6607fa817bba 100644 --- a/arch/arm/kernel/hyp-stub.S +++ b/arch/arm/kernel/hyp-stub.S @@ -146,10 +146,9 @@ ARM_BE8(orr r7, r7, #(1 << 25)) @ HSCTLR.EE #if !defined(ZIMAGE) && defined(CONFIG_ARM_ARCH_TIMER) @ make CNTP_* and CNTPCT accessible from PL1 mrc p15, 0, r7, c0, c1, 1 @ ID_PFR1 - lsr r7, #16 - and r7, #0xf - cmp r7, #1 - bne 1f + ubfx r7, r7, #16, #4 + teq r7, #0 + beq 1f mrc p15, 4, r7, c14, c1, 0 @ CNTHCTL orr r7, r7, #3 @ PL1PCEN | PL1PCTEN mcr p15, 4, r7, c14, c1, 0 @ CNTHCTL diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 1ec34e16ed65..e2a412113359 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -455,11 +455,7 @@ config PPC_TRANSACTIONAL_MEM config PPC_UV bool "Ultravisor support" depends on KVM_BOOK3S_HV_POSSIBLE - select ZONE_DEVICE - select DEV_PAGEMAP_OPS - select DEVICE_PRIVATE - select MEMORY_HOTPLUG - select MEMORY_HOTREMOVE + depends on DEVICE_PRIVATE default n help This option paravirtualizes the kernel to run in POWER platforms that diff --git a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-0-best-effort.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-0-best-effort.dtsi index e1a961f05dcd..baa0c503e741 100644 --- a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-0-best-effort.dtsi +++ b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-0-best-effort.dtsi @@ -63,6 +63,7 @@ fman@400000 { #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe1000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy0: ethernet-phy@0 { reg = <0x0>; diff --git a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-0.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-0.dtsi index c288f3c6c637..93095600e808 100644 --- a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-0.dtsi +++ b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-0.dtsi @@ -60,6 +60,7 @@ fman@400000 { #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xf1000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy6: ethernet-phy@0 { reg = <0x0>; diff --git a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-1-best-effort.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-1-best-effort.dtsi index 94f3e7175012..ff4bd38f0645 100644 --- a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-1-best-effort.dtsi +++ b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-1-best-effort.dtsi @@ -63,6 +63,7 @@ fman@400000 { #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe3000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy1: ethernet-phy@0 { reg = <0x0>; diff --git a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-1.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-1.dtsi index 94a76982d214..1fa38ed6f59e 100644 --- a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-1.dtsi +++ b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-10g-1.dtsi @@ -60,6 +60,7 @@ fman@400000 { #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xf3000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy7: ethernet-phy@0 { reg = <0x0>; diff --git a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-0.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-0.dtsi index b5ff5f71c6b8..a8cc9780c0c4 100644 --- a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-0.dtsi +++ b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-0.dtsi @@ -59,6 +59,7 @@ fman@400000 { #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe1000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy0: ethernet-phy@0 { reg = <0x0>; diff --git a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-1.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-1.dtsi index ee44182c6348..8b8bd70c9382 100644 --- a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-1.dtsi +++ b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-1.dtsi @@ -59,6 +59,7 @@ fman@400000 { #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe3000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy1: ethernet-phy@0 { reg = <0x0>; diff --git a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-2.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-2.dtsi index f05f0d775039..619c880b54d8 100644 --- a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-2.dtsi +++ b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-2.dtsi @@ -59,6 +59,7 @@ fman@400000 { #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe5000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy2: ethernet-phy@0 { reg = <0x0>; diff --git a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-3.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-3.dtsi index a9114ec51075..d7ebb73a400d 100644 --- a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-3.dtsi +++ b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-3.dtsi @@ -59,6 +59,7 @@ fman@400000 { #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe7000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy3: ethernet-phy@0 { reg = <0x0>; diff --git a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-4.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-4.dtsi index 44dd00ac7367..b151d696a069 100644 --- a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-4.dtsi +++ b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-4.dtsi @@ -59,6 +59,7 @@ fman@400000 { #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe9000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy4: ethernet-phy@0 { reg = <0x0>; diff --git a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-5.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-5.dtsi index 5b1b84b58602..adc0ae0013a3 100644 --- a/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-5.dtsi +++ b/arch/powerpc/boot/dts/fsl/qoriq-fman3-0-1g-5.dtsi @@ -59,6 +59,7 @@ fman@400000 { #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xeb000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy5: ethernet-phy@0 { reg = <0x0>; diff --git a/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-10g-0.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-10g-0.dtsi index 0e1daaef9e74..435047e0e250 100644 --- a/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-10g-0.dtsi +++ b/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-10g-0.dtsi @@ -60,6 +60,7 @@ fman@500000 { #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xf1000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy14: ethernet-phy@0 { reg = <0x0>; diff --git a/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-10g-1.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-10g-1.dtsi index 68c5ef779266..c098657cca0a 100644 --- a/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-10g-1.dtsi +++ b/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-10g-1.dtsi @@ -60,6 +60,7 @@ fman@500000 { #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xf3000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy15: ethernet-phy@0 { reg = <0x0>; diff --git a/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-0.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-0.dtsi index 605363cc1117..9d06824815f3 100644 --- a/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-0.dtsi +++ b/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-0.dtsi @@ -59,6 +59,7 @@ fman@500000 { #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe1000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy8: ethernet-phy@0 { reg = <0x0>; diff --git a/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-1.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-1.dtsi index 1955dfa13634..70e947730c4b 100644 --- a/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-1.dtsi +++ b/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-1.dtsi @@ -59,6 +59,7 @@ fman@500000 { #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe3000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy9: ethernet-phy@0 { reg = <0x0>; diff --git a/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-2.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-2.dtsi index 2c1476454ee0..ad96e6529595 100644 --- a/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-2.dtsi +++ b/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-2.dtsi @@ -59,6 +59,7 @@ fman@500000 { #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe5000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy10: ethernet-phy@0 { reg = <0x0>; diff --git a/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-3.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-3.dtsi index b8b541ff5fb0..034bc4b71f7a 100644 --- a/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-3.dtsi +++ b/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-3.dtsi @@ -59,6 +59,7 @@ fman@500000 { #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe7000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy11: ethernet-phy@0 { reg = <0x0>; diff --git a/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-4.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-4.dtsi index 4b2cfddd1b15..93ca23d82b39 100644 --- a/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-4.dtsi +++ b/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-4.dtsi @@ -59,6 +59,7 @@ fman@500000 { #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xe9000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy12: ethernet-phy@0 { reg = <0x0>; diff --git a/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-5.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-5.dtsi index 0a52ddf7cc17..23b3117a2fd2 100644 --- a/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-5.dtsi +++ b/arch/powerpc/boot/dts/fsl/qoriq-fman3-1-1g-5.dtsi @@ -59,6 +59,7 @@ fman@500000 { #size-cells = <0>; compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio"; reg = <0xeb000 0x1000>; + fsl,erratum-a011043; /* must ignore read errors */ pcsphy13: ethernet-phy@0 { reg = <0x0>; diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h index 15b75005bc34..3fa1b962dc27 100644 --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h @@ -600,8 +600,11 @@ extern void slb_set_size(u16 size); * */ #define MAX_USER_CONTEXT ((ASM_CONST(1) << CONTEXT_BITS) - 2) + +// The + 2 accounts for INVALID_REGION and 1 more to avoid overlap with kernel #define MIN_USER_CONTEXT (MAX_KERNEL_CTX_CNT + MAX_VMALLOC_CTX_CNT + \ - MAX_IO_CTX_CNT + MAX_VMEMMAP_CTX_CNT) + MAX_IO_CTX_CNT + MAX_VMEMMAP_CTX_CNT + 2) + /* * For platforms that support on 65bit VA we limit the context bits */ diff --git a/arch/powerpc/include/asm/xive-regs.h b/arch/powerpc/include/asm/xive-regs.h index f2dfcd50a2d3..33aee7490cbb 100644 --- a/arch/powerpc/include/asm/xive-regs.h +++ b/arch/powerpc/include/asm/xive-regs.h @@ -39,6 +39,7 @@ #define XIVE_ESB_VAL_P 0x2 #define XIVE_ESB_VAL_Q 0x1 +#define XIVE_ESB_INVALID 0xFF /* * Thread Management (aka "TM") registers diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c index f5fadbd2533a..9651ca061828 100644 --- a/arch/powerpc/sysdev/xive/common.c +++ b/arch/powerpc/sysdev/xive/common.c @@ -972,12 +972,21 @@ static int xive_get_irqchip_state(struct irq_data *data, enum irqchip_irq_state which, bool *state) { struct xive_irq_data *xd = irq_data_get_irq_handler_data(data); + u8 pq; switch (which) { case IRQCHIP_STATE_ACTIVE: - *state = !xd->stale_p && - (xd->saved_p || - !!(xive_esb_read(xd, XIVE_ESB_GET) & XIVE_ESB_VAL_P)); + pq = xive_esb_read(xd, XIVE_ESB_GET); + + /* + * The esb value being all 1's means we couldn't get + * the PQ state of the interrupt through mmio. It may + * happen, for example when querying a PHB interrupt + * while the PHB is in an error state. We consider the + * interrupt to be inactive in that case. + */ + *state = (pq != XIVE_ESB_INVALID) && !xd->stale_p && + (xd->saved_p || !!(pq & XIVE_ESB_VAL_P)); return 0; default: return -EINVAL; diff --git a/drivers/atm/firestream.c b/drivers/atm/firestream.c index aad00d2b28f5..cc87004d5e2d 100644 --- a/drivers/atm/firestream.c +++ b/drivers/atm/firestream.c @@ -912,6 +912,7 @@ static int fs_open(struct atm_vcc *atm_vcc) } if (!to) { printk ("No more free channels for FS50..\n"); + kfree(vcc); return -EBUSY; } vcc->channo = dev->channo; @@ -922,6 +923,7 @@ static int fs_open(struct atm_vcc *atm_vcc) if (((DO_DIRECTION(rxtp) && dev->atm_vccs[vcc->channo])) || ( DO_DIRECTION(txtp) && test_bit (vcc->channo, dev->tx_inuse))) { printk ("Channel is in use for FS155.\n"); + kfree(vcc); return -EBUSY; } } @@ -935,6 +937,7 @@ static int fs_open(struct atm_vcc *atm_vcc) tc, sizeof (struct fs_transmit_config)); if (!tc) { fs_dprintk (FS_DEBUG_OPEN, "fs: can't alloc transmit_config.\n"); + kfree(vcc); return -ENOMEM; } diff --git a/drivers/bluetooth/btbcm.c b/drivers/bluetooth/btbcm.c index 0795a49edfae..1f498f358f60 100644 --- a/drivers/bluetooth/btbcm.c +++ b/drivers/bluetooth/btbcm.c @@ -36,6 +36,7 @@ int btbcm_check_bdaddr(struct hci_dev *hdev) HCI_INIT_TIMEOUT); if (IS_ERR(skb)) { int err = PTR_ERR(skb); + bt_dev_err(hdev, "BCM: Reading device address failed (%d)", err); return err; } @@ -223,6 +224,7 @@ static int btbcm_reset(struct hci_dev *hdev) skb = __hci_cmd_sync(hdev, HCI_OP_RESET, 0, NULL, HCI_INIT_TIMEOUT); if (IS_ERR(skb)) { int err = PTR_ERR(skb); + bt_dev_err(hdev, "BCM: Reset failed (%d)", err); return err; } diff --git a/drivers/bluetooth/btbcm.h b/drivers/bluetooth/btbcm.h index 3c7dd0765837..014ef847a486 100644 --- a/drivers/bluetooth/btbcm.h +++ b/drivers/bluetooth/btbcm.h @@ -78,13 +78,13 @@ static inline int btbcm_set_bdaddr(struct hci_dev *hdev, const bdaddr_t *bdaddr) return -EOPNOTSUPP; } -int btbcm_read_pcm_int_params(struct hci_dev *hdev, +static inline int btbcm_read_pcm_int_params(struct hci_dev *hdev, struct bcm_set_pcm_int_params *params) { return -EOPNOTSUPP; } -int btbcm_write_pcm_int_params(struct hci_dev *hdev, +static inline int btbcm_write_pcm_int_params(struct hci_dev *hdev, const struct bcm_set_pcm_int_params *params) { return -EOPNOTSUPP; diff --git a/drivers/bluetooth/btrtl.c b/drivers/bluetooth/btrtl.c index f838537f9f89..577cfa3329db 100644 --- a/drivers/bluetooth/btrtl.c +++ b/drivers/bluetooth/btrtl.c @@ -370,11 +370,11 @@ static int rtlbt_parse_firmware(struct hci_dev *hdev, * the end. */ len = patch_length; - buf = kmemdup(btrtl_dev->fw_data + patch_offset, patch_length, - GFP_KERNEL); + buf = kvmalloc(patch_length, GFP_KERNEL); if (!buf) return -ENOMEM; + memcpy(buf, btrtl_dev->fw_data + patch_offset, patch_length - 4); memcpy(buf + patch_length - 4, &epatch_info->fw_version, 4); *_buf = buf; @@ -460,8 +460,10 @@ static int rtl_load_file(struct hci_dev *hdev, const char *name, u8 **buff) if (ret < 0) return ret; ret = fw->size; - *buff = kmemdup(fw->data, ret, GFP_KERNEL); - if (!*buff) + *buff = kvmalloc(fw->size, GFP_KERNEL); + if (*buff) + memcpy(*buff, fw->data, ret); + else ret = -ENOMEM; release_firmware(fw); @@ -499,14 +501,14 @@ static int btrtl_setup_rtl8723b(struct hci_dev *hdev, goto out; if (btrtl_dev->cfg_len > 0) { - tbuff = kzalloc(ret + btrtl_dev->cfg_len, GFP_KERNEL); + tbuff = kvzalloc(ret + btrtl_dev->cfg_len, GFP_KERNEL); if (!tbuff) { ret = -ENOMEM; goto out; } memcpy(tbuff, fw_data, ret); - kfree(fw_data); + kvfree(fw_data); memcpy(tbuff + ret, btrtl_dev->cfg_data, btrtl_dev->cfg_len); ret += btrtl_dev->cfg_len; @@ -519,14 +521,14 @@ static int btrtl_setup_rtl8723b(struct hci_dev *hdev, ret = rtl_download_firmware(hdev, fw_data, ret); out: - kfree(fw_data); + kvfree(fw_data); return ret; } void btrtl_free(struct btrtl_device_info *btrtl_dev) { - kfree(btrtl_dev->fw_data); - kfree(btrtl_dev->cfg_data); + kvfree(btrtl_dev->fw_data); + kvfree(btrtl_dev->cfg_data); kfree(btrtl_dev); } EXPORT_SYMBOL_GPL(btrtl_free); diff --git a/drivers/bluetooth/btsdio.c b/drivers/bluetooth/btsdio.c index fd9571d5fdac..199e8f7d426d 100644 --- a/drivers/bluetooth/btsdio.c +++ b/drivers/bluetooth/btsdio.c @@ -145,11 +145,20 @@ static int btsdio_rx_packet(struct btsdio_data *data) data->hdev->stat.byte_rx += len; - hci_skb_pkt_type(skb) = hdr[3]; - - err = hci_recv_frame(data->hdev, skb); - if (err < 0) - return err; + switch (hdr[3]) { + case HCI_EVENT_PKT: + case HCI_ACLDATA_PKT: + case HCI_SCODATA_PKT: + case HCI_ISODATA_PKT: + hci_skb_pkt_type(skb) = hdr[3]; + err = hci_recv_frame(data->hdev, skb); + if (err < 0) + return err; + break; + default: + kfree_skb(skb); + return -EINVAL; + } sdio_writeb(data->func, 0x00, REG_PC_RRT, NULL); diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index 0eaeca0a64fb..f5924f3e8b8d 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -266,6 +266,7 @@ static const struct usb_device_id blacklist_table[] = { { USB_DEVICE(0x04ca, 0x3015), .driver_info = BTUSB_QCA_ROME }, { USB_DEVICE(0x04ca, 0x3016), .driver_info = BTUSB_QCA_ROME }, { USB_DEVICE(0x04ca, 0x301a), .driver_info = BTUSB_QCA_ROME }, + { USB_DEVICE(0x04ca, 0x3021), .driver_info = BTUSB_QCA_ROME }, { USB_DEVICE(0x13d3, 0x3491), .driver_info = BTUSB_QCA_ROME }, { USB_DEVICE(0x13d3, 0x3496), .driver_info = BTUSB_QCA_ROME }, { USB_DEVICE(0x13d3, 0x3501), .driver_info = BTUSB_QCA_ROME }, diff --git a/drivers/bluetooth/hci_bcm.c b/drivers/bluetooth/hci_bcm.c index f8f5c593a05c..b236cb11c0dc 100644 --- a/drivers/bluetooth/hci_bcm.c +++ b/drivers/bluetooth/hci_bcm.c @@ -13,6 +13,7 @@ #include <linux/module.h> #include <linux/acpi.h> #include <linux/of.h> +#include <linux/of_irq.h> #include <linux/property.h> #include <linux/platform_data/x86/apple.h> #include <linux/platform_device.h> @@ -53,6 +54,7 @@ */ struct bcm_device_data { bool no_early_set_baudrate; + bool drive_rts_on_open; }; /** @@ -122,6 +124,7 @@ struct bcm_device { bool is_suspended; #endif bool no_early_set_baudrate; + bool drive_rts_on_open; u8 pcm_int_params[5]; }; @@ -456,7 +459,9 @@ static int bcm_open(struct hci_uart *hu) out: if (bcm->dev) { - hci_uart_set_flow_control(hu, true); + if (bcm->dev->drive_rts_on_open) + hci_uart_set_flow_control(hu, true); + hu->init_speed = bcm->dev->init_speed; /* If oper_speed is set, ldisc/serdev will set the baudrate @@ -466,7 +471,10 @@ out: hu->oper_speed = bcm->dev->oper_speed; err = bcm_gpio_set_power(bcm->dev, true); - hci_uart_set_flow_control(hu, false); + + if (bcm->dev->drive_rts_on_open) + hci_uart_set_flow_control(hu, false); + if (err) goto err_unset_hu; } @@ -1144,6 +1152,8 @@ static int bcm_of_probe(struct bcm_device *bdev) device_property_read_u32(bdev->dev, "max-speed", &bdev->oper_speed); device_property_read_u8_array(bdev->dev, "brcm,bt-pcm-int-params", bdev->pcm_int_params, 5); + bdev->irq = of_irq_get_byname(bdev->dev->of_node, "host-wakeup"); + return 0; } @@ -1447,8 +1457,10 @@ static int bcm_serdev_probe(struct serdev_device *serdev) dev_err(&serdev->dev, "Failed to power down\n"); data = device_get_match_data(bcmdev->dev); - if (data) + if (data) { bcmdev->no_early_set_baudrate = data->no_early_set_baudrate; + bcmdev->drive_rts_on_open = data->drive_rts_on_open; + } return hci_uart_register_device(&bcmdev->serdev_hu, &bcm_proto); } @@ -1465,11 +1477,16 @@ static struct bcm_device_data bcm4354_device_data = { .no_early_set_baudrate = true, }; +static struct bcm_device_data bcm43438_device_data = { + .drive_rts_on_open = true, +}; + static const struct of_device_id bcm_bluetooth_of_match[] = { { .compatible = "brcm,bcm20702a1" }, + { .compatible = "brcm,bcm4329-bt" }, { .compatible = "brcm,bcm4345c5" }, { .compatible = "brcm,bcm4330-bt" }, - { .compatible = "brcm,bcm43438-bt" }, + { .compatible = "brcm,bcm43438-bt", .data = &bcm43438_device_data }, { .compatible = "brcm,bcm43540-bt", .data = &bcm4354_device_data }, { .compatible = "brcm,bcm4335a0" }, { }, diff --git a/drivers/bluetooth/hci_h4.c b/drivers/bluetooth/hci_h4.c index 19ba52005009..6dc1fbeb564b 100644 --- a/drivers/bluetooth/hci_h4.c +++ b/drivers/bluetooth/hci_h4.c @@ -103,6 +103,7 @@ static const struct h4_recv_pkt h4_recv_pkts[] = { { H4_RECV_ACL, .recv = hci_recv_frame }, { H4_RECV_SCO, .recv = hci_recv_frame }, { H4_RECV_EVENT, .recv = hci_recv_frame }, + { H4_RECV_ISO, .recv = hci_recv_frame }, }; /* Recv data */ diff --git a/drivers/bluetooth/hci_h5.c b/drivers/bluetooth/hci_h5.c index dacf297baf59..0b14547482a7 100644 --- a/drivers/bluetooth/hci_h5.c +++ b/drivers/bluetooth/hci_h5.c @@ -385,6 +385,7 @@ static void h5_complete_rx_pkt(struct hci_uart *hu) case HCI_EVENT_PKT: case HCI_ACLDATA_PKT: case HCI_SCODATA_PKT: + case HCI_ISODATA_PKT: hci_skb_pkt_type(h5->rx_skb) = H5_HDR_PKT_TYPE(hdr); /* Remove Three-wire header */ @@ -594,6 +595,7 @@ static int h5_enqueue(struct hci_uart *hu, struct sk_buff *skb) break; case HCI_SCODATA_PKT: + case HCI_ISODATA_PKT: skb_queue_tail(&h5->unrel, skb); break; @@ -636,6 +638,7 @@ static bool valid_packet_type(u8 type) case HCI_ACLDATA_PKT: case HCI_COMMAND_PKT: case HCI_SCODATA_PKT: + case HCI_ISODATA_PKT: case HCI_3WIRE_LINK_PKT: case HCI_3WIRE_ACK_PKT: return true; diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c index f10bdf8e1fc5..d6e0c99ee5eb 100644 --- a/drivers/bluetooth/hci_qca.c +++ b/drivers/bluetooth/hci_qca.c @@ -20,6 +20,7 @@ #include <linux/completion.h> #include <linux/debugfs.h> #include <linux/delay.h> +#include <linux/devcoredump.h> #include <linux/device.h> #include <linux/gpio/consumer.h> #include <linux/mod_devicetable.h> @@ -46,6 +47,7 @@ #define IBS_BTSOC_TX_IDLE_TIMEOUT_MS 40 #define IBS_HOST_TX_IDLE_TIMEOUT_MS 2000 #define CMD_TRANS_TIMEOUT_MS 100 +#define MEMDUMP_TIMEOUT_MS 8000 /* susclk rate */ #define SUSCLK_RATE_32KHZ 32768 @@ -53,12 +55,24 @@ /* Controller debug log header */ #define QCA_DEBUG_HANDLE 0x2EDC +/* max retry count when init fails */ +#define MAX_INIT_RETRIES 3 + +/* Controller dump header */ +#define QCA_SSR_DUMP_HANDLE 0x0108 +#define QCA_DUMP_PACKET_SIZE 255 +#define QCA_LAST_SEQUENCE_NUM 0xFFFF +#define QCA_CRASHBYTE_PACKET_LEN 1096 +#define QCA_MEMDUMP_BYTE 0xFB + enum qca_flags { QCA_IBS_ENABLED, QCA_DROP_VENDOR_EVENT, QCA_SUSPENDING, + QCA_MEMDUMP_COLLECTION }; + /* HCI_IBS transmit side sleep protocol states */ enum tx_ibs_states { HCI_IBS_TX_ASLEEP, @@ -81,11 +95,40 @@ enum hci_ibs_clock_state_vote { HCI_IBS_RX_VOTE_CLOCK_OFF, }; +/* Controller memory dump states */ +enum qca_memdump_states { + QCA_MEMDUMP_IDLE, + QCA_MEMDUMP_COLLECTING, + QCA_MEMDUMP_COLLECTED, + QCA_MEMDUMP_TIMEOUT, +}; + +struct qca_memdump_data { + char *memdump_buf_head; + char *memdump_buf_tail; + u32 current_seq_no; + u32 received_dump; +}; + +struct qca_memdump_event_hdr { + __u8 evt; + __u8 plen; + __u16 opcode; + __u16 seq_no; + __u8 reserved; +} __packed; + + +struct qca_dump_size { + u32 dump_size; +} __packed; + struct qca_data { struct hci_uart *hu; struct sk_buff *rx_skb; struct sk_buff_head txq; struct sk_buff_head tx_wait_q; /* HCI_IBS wait queue */ + struct sk_buff_head rx_memdump_q; /* Memdump wait queue */ spinlock_t hci_ibs_lock; /* HCI_IBS state lock */ u8 tx_ibs_state; /* HCI_IBS transmit side power state*/ u8 rx_ibs_state; /* HCI_IBS receive side power state */ @@ -95,14 +138,18 @@ struct qca_data { u32 tx_idle_delay; struct timer_list wake_retrans_timer; u32 wake_retrans; + struct timer_list memdump_timer; struct workqueue_struct *workqueue; struct work_struct ws_awake_rx; struct work_struct ws_awake_device; struct work_struct ws_rx_vote_off; struct work_struct ws_tx_vote_off; + struct work_struct ctrl_memdump_evt; + struct qca_memdump_data *qca_memdump; unsigned long flags; struct completion drop_ev_comp; wait_queue_head_t suspend_wait_q; + enum qca_memdump_states memdump_state; /* For debugging purpose */ u64 ibs_sent_wacks; @@ -167,6 +214,7 @@ static int qca_regulator_enable(struct qca_serdev *qcadev); static void qca_regulator_disable(struct qca_serdev *qcadev); static void qca_power_shutdown(struct hci_uart *hu); static int qca_power_off(struct hci_dev *hdev); +static void qca_controller_memdump(struct work_struct *work); static enum qca_btsoc_type qca_soc_type(struct hci_uart *hu) { @@ -474,12 +522,28 @@ static void hci_ibs_wake_retrans_timeout(struct timer_list *t) hci_uart_tx_wakeup(hu); } +static void hci_memdump_timeout(struct timer_list *t) +{ + struct qca_data *qca = from_timer(qca, t, tx_idle_timer); + struct hci_uart *hu = qca->hu; + struct qca_memdump_data *qca_memdump = qca->qca_memdump; + char *memdump_buf = qca_memdump->memdump_buf_tail; + + bt_dev_err(hu->hdev, "clearing allocated memory due to memdump timeout"); + /* Inject hw error event to reset the device and driver. */ + hci_reset_dev(hu->hdev); + vfree(memdump_buf); + kfree(qca_memdump); + qca->memdump_state = QCA_MEMDUMP_TIMEOUT; + del_timer(&qca->memdump_timer); + cancel_work_sync(&qca->ctrl_memdump_evt); +} + /* Initialize protocol */ static int qca_open(struct hci_uart *hu) { struct qca_serdev *qcadev; struct qca_data *qca; - int ret; BT_DBG("hu %p qca_open", hu); @@ -492,6 +556,7 @@ static int qca_open(struct hci_uart *hu) skb_queue_head_init(&qca->txq); skb_queue_head_init(&qca->tx_wait_q); + skb_queue_head_init(&qca->rx_memdump_q); spin_lock_init(&qca->hci_ibs_lock); qca->workqueue = alloc_ordered_workqueue("qca_wq", 0); if (!qca->workqueue) { @@ -504,7 +569,7 @@ static int qca_open(struct hci_uart *hu) INIT_WORK(&qca->ws_awake_device, qca_wq_awake_device); INIT_WORK(&qca->ws_rx_vote_off, qca_wq_serial_rx_clock_vote_off); INIT_WORK(&qca->ws_tx_vote_off, qca_wq_serial_tx_clock_vote_off); - + INIT_WORK(&qca->ctrl_memdump_evt, qca_controller_memdump); init_waitqueue_head(&qca->suspend_wait_q); qca->hu = hu; @@ -519,23 +584,10 @@ static int qca_open(struct hci_uart *hu) hu->priv = qca; if (hu->serdev) { - qcadev = serdev_device_get_drvdata(hu->serdev); - if (!qca_is_wcn399x(qcadev->btsoc_type)) { - gpiod_set_value_cansleep(qcadev->bt_en, 1); - /* Controller needs time to bootup. */ - msleep(150); - } else { + if (qca_is_wcn399x(qcadev->btsoc_type)) { hu->init_speed = qcadev->init_speed; hu->oper_speed = qcadev->oper_speed; - ret = qca_regulator_enable(qcadev); - if (ret) { - destroy_workqueue(qca->workqueue); - kfree_skb(qca->rx_skb); - hu->priv = NULL; - kfree(qca); - return ret; - } } } @@ -544,6 +596,7 @@ static int qca_open(struct hci_uart *hu) timer_setup(&qca->tx_idle_timer, hci_ibs_tx_idle_timeout, 0); qca->tx_idle_delay = IBS_HOST_TX_IDLE_TIMEOUT_MS; + timer_setup(&qca->memdump_timer, hci_memdump_timeout, 0); BT_DBG("HCI_UART_QCA open, tx_idle_delay=%u, wake_retrans=%u", qca->tx_idle_delay, qca->wake_retrans); @@ -613,7 +666,6 @@ static int qca_flush(struct hci_uart *hu) /* Close protocol */ static int qca_close(struct hci_uart *hu) { - struct qca_serdev *qcadev; struct qca_data *qca = hu->priv; BT_DBG("hu %p qca close", hu); @@ -622,19 +674,14 @@ static int qca_close(struct hci_uart *hu) skb_queue_purge(&qca->tx_wait_q); skb_queue_purge(&qca->txq); + skb_queue_purge(&qca->rx_memdump_q); del_timer(&qca->tx_idle_timer); del_timer(&qca->wake_retrans_timer); + del_timer(&qca->memdump_timer); destroy_workqueue(qca->workqueue); qca->hu = NULL; - if (hu->serdev) { - qcadev = serdev_device_get_drvdata(hu->serdev); - if (qca_is_wcn399x(qcadev->btsoc_type)) - qca_power_shutdown(hu); - else - gpiod_set_value_cansleep(qcadev->bt_en, 0); - - } + qca_power_shutdown(hu); kfree_skb(qca->rx_skb); @@ -900,6 +947,125 @@ static int qca_recv_acl_data(struct hci_dev *hdev, struct sk_buff *skb) return hci_recv_frame(hdev, skb); } +static void qca_controller_memdump(struct work_struct *work) +{ + struct qca_data *qca = container_of(work, struct qca_data, + ctrl_memdump_evt); + struct hci_uart *hu = qca->hu; + struct sk_buff *skb; + struct qca_memdump_event_hdr *cmd_hdr; + struct qca_memdump_data *qca_memdump = qca->qca_memdump; + struct qca_dump_size *dump; + char *memdump_buf; + char nullBuff[QCA_DUMP_PACKET_SIZE] = { 0 }; + u16 seq_no; + u32 dump_size; + + while ((skb = skb_dequeue(&qca->rx_memdump_q))) { + + if (!qca_memdump) { + qca_memdump = kzalloc(sizeof(struct qca_memdump_data), + GFP_ATOMIC); + if (!qca_memdump) + return; + + qca->qca_memdump = qca_memdump; + } + + qca->memdump_state = QCA_MEMDUMP_COLLECTING; + cmd_hdr = (void *) skb->data; + seq_no = __le16_to_cpu(cmd_hdr->seq_no); + skb_pull(skb, sizeof(struct qca_memdump_event_hdr)); + + if (!seq_no) { + + /* This is the first frame of memdump packet from + * the controller, Disable IBS to recevie dump + * with out any interruption, ideally time required for + * the controller to send the dump is 8 seconds. let us + * start timer to handle this asynchronous activity. + */ + clear_bit(QCA_IBS_ENABLED, &qca->flags); + set_bit(QCA_MEMDUMP_COLLECTION, &qca->flags); + dump = (void *) skb->data; + dump_size = __le32_to_cpu(dump->dump_size); + if (!(dump_size)) { + bt_dev_err(hu->hdev, "Rx invalid memdump size"); + kfree_skb(skb); + return; + } + + bt_dev_info(hu->hdev, "QCA collecting dump of size:%u", + dump_size); + mod_timer(&qca->memdump_timer, (jiffies + + msecs_to_jiffies(MEMDUMP_TIMEOUT_MS))); + + skb_pull(skb, sizeof(dump_size)); + memdump_buf = vmalloc(dump_size); + qca_memdump->memdump_buf_head = memdump_buf; + qca_memdump->memdump_buf_tail = memdump_buf; + } + + memdump_buf = qca_memdump->memdump_buf_tail; + + /* If sequence no 0 is missed then there is no point in + * accepting the other sequences. + */ + if (!memdump_buf) { + bt_dev_err(hu->hdev, "QCA: Discarding other packets"); + kfree(qca_memdump); + kfree_skb(skb); + qca->qca_memdump = NULL; + return; + } + + /* There could be chance of missing some packets from + * the controller. In such cases let us store the dummy + * packets in the buffer. + */ + while ((seq_no > qca_memdump->current_seq_no + 1) && + seq_no != QCA_LAST_SEQUENCE_NUM) { + bt_dev_err(hu->hdev, "QCA controller missed packet:%d", + qca_memdump->current_seq_no); + memcpy(memdump_buf, nullBuff, QCA_DUMP_PACKET_SIZE); + memdump_buf = memdump_buf + QCA_DUMP_PACKET_SIZE; + qca_memdump->received_dump += QCA_DUMP_PACKET_SIZE; + qca_memdump->current_seq_no++; + } + + memcpy(memdump_buf, (unsigned char *) skb->data, skb->len); + memdump_buf = memdump_buf + skb->len; + qca_memdump->memdump_buf_tail = memdump_buf; + qca_memdump->current_seq_no = seq_no + 1; + qca_memdump->received_dump += skb->len; + qca->qca_memdump = qca_memdump; + kfree_skb(skb); + if (seq_no == QCA_LAST_SEQUENCE_NUM) { + bt_dev_info(hu->hdev, "QCA writing crash dump of size %d bytes", + qca_memdump->received_dump); + memdump_buf = qca_memdump->memdump_buf_head; + dev_coredumpv(&hu->serdev->dev, memdump_buf, + qca_memdump->received_dump, GFP_KERNEL); + del_timer(&qca->memdump_timer); + kfree(qca->qca_memdump); + qca->qca_memdump = NULL; + qca->memdump_state = QCA_MEMDUMP_COLLECTED; + } + } + +} + +int qca_controller_memdump_event(struct hci_dev *hdev, struct sk_buff *skb) +{ + struct hci_uart *hu = hci_get_drvdata(hdev); + struct qca_data *qca = hu->priv; + + skb_queue_tail(&qca->rx_memdump_q, skb); + queue_work(qca->workqueue, &qca->ctrl_memdump_evt); + + return 0; +} + static int qca_recv_event(struct hci_dev *hdev, struct sk_buff *skb) { struct hci_uart *hu = hci_get_drvdata(hdev); @@ -925,6 +1091,14 @@ static int qca_recv_event(struct hci_dev *hdev, struct sk_buff *skb) return 0; } + /* We receive chip memory dump as an event packet, With a dedicated + * handler followed by a hardware error event. When this event is + * received we store dump into a file before closing hci. This + * dump will help in triaging the issues. + */ + if ((skb->data[0] == HCI_VENDOR_PKT) && + (get_unaligned_be16(skb->data + 2) == QCA_SSR_DUMP_HANDLE)) + return qca_controller_memdump_event(hdev, skb); return hci_recv_frame(hdev, skb); } @@ -1203,6 +1377,91 @@ error: return ret; } +static int qca_send_crashbuffer(struct hci_uart *hu) +{ + struct qca_data *qca = hu->priv; + struct sk_buff *skb; + + skb = bt_skb_alloc(QCA_CRASHBYTE_PACKET_LEN, GFP_KERNEL); + if (!skb) { + bt_dev_err(hu->hdev, "Failed to allocate memory for skb packet"); + return -ENOMEM; + } + + /* We forcefully crash the controller, by sending 0xfb byte for + * 1024 times. We also might have chance of losing data, To be + * on safer side we send 1096 bytes to the SoC. + */ + memset(skb_put(skb, QCA_CRASHBYTE_PACKET_LEN), QCA_MEMDUMP_BYTE, + QCA_CRASHBYTE_PACKET_LEN); + hci_skb_pkt_type(skb) = HCI_COMMAND_PKT; + bt_dev_info(hu->hdev, "crash the soc to collect controller dump"); + skb_queue_tail(&qca->txq, skb); + hci_uart_tx_wakeup(hu); + + return 0; +} + +static void qca_wait_for_dump_collection(struct hci_dev *hdev) +{ + struct hci_uart *hu = hci_get_drvdata(hdev); + struct qca_data *qca = hu->priv; + struct qca_memdump_data *qca_memdump = qca->qca_memdump; + char *memdump_buf = NULL; + + wait_on_bit_timeout(&qca->flags, QCA_MEMDUMP_COLLECTION, + TASK_UNINTERRUPTIBLE, MEMDUMP_TIMEOUT_MS); + + clear_bit(QCA_MEMDUMP_COLLECTION, &qca->flags); + if (qca->memdump_state == QCA_MEMDUMP_IDLE) { + bt_dev_err(hu->hdev, "Clearing the buffers due to timeout"); + if (qca_memdump) + memdump_buf = qca_memdump->memdump_buf_tail; + vfree(memdump_buf); + kfree(qca_memdump); + qca->memdump_state = QCA_MEMDUMP_TIMEOUT; + del_timer(&qca->memdump_timer); + cancel_work_sync(&qca->ctrl_memdump_evt); + } +} + +static void qca_hw_error(struct hci_dev *hdev, u8 code) +{ + struct hci_uart *hu = hci_get_drvdata(hdev); + struct qca_data *qca = hu->priv; + + bt_dev_info(hdev, "mem_dump_status: %d", qca->memdump_state); + + if (qca->memdump_state == QCA_MEMDUMP_IDLE) { + /* If hardware error event received for other than QCA + * soc memory dump event, then we need to crash the SOC + * and wait here for 8 seconds to get the dump packets. + * This will block main thread to be on hold until we + * collect dump. + */ + set_bit(QCA_MEMDUMP_COLLECTION, &qca->flags); + qca_send_crashbuffer(hu); + qca_wait_for_dump_collection(hdev); + } else if (qca->memdump_state == QCA_MEMDUMP_COLLECTING) { + /* Let us wait here until memory dump collected or + * memory dump timer expired. + */ + bt_dev_info(hdev, "waiting for dump to complete"); + qca_wait_for_dump_collection(hdev); + } +} + +static void qca_cmd_timeout(struct hci_dev *hdev) +{ + struct hci_uart *hu = hci_get_drvdata(hdev); + struct qca_data *qca = hu->priv; + + if (qca->memdump_state == QCA_MEMDUMP_IDLE) + qca_send_crashbuffer(hu); + else + bt_dev_info(hdev, "Dump collection is in process"); +} + static int qca_wcn3990_init(struct hci_uart *hu) { struct qca_serdev *qcadev; @@ -1253,11 +1512,37 @@ static int qca_wcn3990_init(struct hci_uart *hu) return 0; } +static int qca_power_on(struct hci_dev *hdev) +{ + struct hci_uart *hu = hci_get_drvdata(hdev); + enum qca_btsoc_type soc_type = qca_soc_type(hu); + struct qca_serdev *qcadev; + int ret = 0; + + /* Non-serdev device usually is powered by external power + * and don't need additional action in driver for power on + */ + if (!hu->serdev) + return 0; + + if (qca_is_wcn399x(soc_type)) { + ret = qca_wcn3990_init(hu); + } else { + qcadev = serdev_device_get_drvdata(hu->serdev); + gpiod_set_value_cansleep(qcadev->bt_en, 1); + /* Controller needs time to bootup. */ + msleep(150); + } + + return ret; +} + static int qca_setup(struct hci_uart *hu) { struct hci_dev *hdev = hu->hdev; struct qca_data *qca = hu->priv; unsigned int speed, qca_baudrate = QCA_BAUDRATE_115200; + unsigned int retries = 0; enum qca_btsoc_type soc_type = qca_soc_type(hu); const char *firmware_name = qca_get_firmware_name(hu); int ret; @@ -1275,24 +1560,21 @@ static int qca_setup(struct hci_uart *hu) */ set_bit(HCI_QUIRK_SIMULTANEOUS_DISCOVERY, &hdev->quirks); - if (qca_is_wcn399x(soc_type)) { - bt_dev_info(hdev, "setting up wcn3990"); + bt_dev_info(hdev, "setting up %s", + qca_is_wcn399x(soc_type) ? "wcn399x" : "ROME"); - /* Enable NON_PERSISTENT_SETUP QUIRK to ensure to execute - * setup for every hci up. - */ - set_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks); +retry: + ret = qca_power_on(hdev); + if (ret) + return ret; + + if (qca_is_wcn399x(soc_type)) { set_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks); - hu->hdev->shutdown = qca_power_off; - ret = qca_wcn3990_init(hu); - if (ret) - return ret; ret = qca_read_soc_version(hdev, &soc_ver, soc_type); if (ret) return ret; } else { - bt_dev_info(hdev, "ROME setup"); qca_set_speed(hu, QCA_INIT_SPEED); } @@ -1320,6 +1602,8 @@ static int qca_setup(struct hci_uart *hu) if (!ret) { set_bit(QCA_IBS_ENABLED, &qca->flags); qca_debugfs_init(hdev); + hu->hdev->hw_error = qca_hw_error; + hu->hdev->cmd_timeout = qca_cmd_timeout; } else if (ret == -ENOENT) { /* No patch/nvm-config found, run with original fw/config */ ret = 0; @@ -1329,6 +1613,20 @@ static int qca_setup(struct hci_uart *hu) * patch/nvm-config is found, so run with original fw/config. */ ret = 0; + } else { + if (retries < MAX_INIT_RETRIES) { + qca_power_shutdown(hu); + if (hu->serdev) { + serdev_device_close(hu->serdev); + ret = serdev_device_open(hu->serdev); + if (ret) { + bt_dev_err(hdev, "failed to open port"); + return ret; + } + } + retries++; + goto retry; + } } /* Setup bdaddr */ @@ -1393,6 +1691,7 @@ static void qca_power_shutdown(struct hci_uart *hu) struct qca_serdev *qcadev; struct qca_data *qca = hu->priv; unsigned long flags; + enum qca_btsoc_type soc_type = qca_soc_type(hu); qcadev = serdev_device_get_drvdata(hu->serdev); @@ -1405,20 +1704,36 @@ static void qca_power_shutdown(struct hci_uart *hu) qca_flush(hu); spin_unlock_irqrestore(&qca->hci_ibs_lock, flags); - host_set_baudrate(hu, 2400); - qca_send_power_pulse(hu, false); - qca_regulator_disable(qcadev); + hu->hdev->hw_error = NULL; + hu->hdev->cmd_timeout = NULL; + + /* Non-serdev device usually is powered by external power + * and don't need additional action in driver for power down + */ + if (!hu->serdev) + return; + + if (qca_is_wcn399x(soc_type)) { + host_set_baudrate(hu, 2400); + qca_send_power_pulse(hu, false); + qca_regulator_disable(qcadev); + } else { + gpiod_set_value_cansleep(qcadev->bt_en, 0); + } } static int qca_power_off(struct hci_dev *hdev) { struct hci_uart *hu = hci_get_drvdata(hdev); + struct qca_data *qca = hu->priv; - /* Perform pre shutdown command */ - qca_send_pre_shutdown_cmd(hdev); - - usleep_range(8000, 10000); + /* Stop sending shutdown command if soc crashes. */ + if (qca->memdump_state == QCA_MEMDUMP_IDLE) { + qca_send_pre_shutdown_cmd(hdev); + usleep_range(8000, 10000); + } + qca->memdump_state = QCA_MEMDUMP_IDLE; qca_power_shutdown(hu); return 0; } @@ -1493,6 +1808,7 @@ static int qca_init_regulators(struct qca_power *qca, static int qca_serdev_probe(struct serdev_device *serdev) { struct qca_serdev *qcadev; + struct hci_dev *hdev; const struct qca_vreg_data *data; int err; @@ -1501,7 +1817,7 @@ static int qca_serdev_probe(struct serdev_device *serdev) return -ENOMEM; qcadev->serdev_hu.serdev = serdev; - data = of_device_get_match_data(&serdev->dev); + data = device_get_match_data(&serdev->dev); serdev_device_set_drvdata(serdev, qcadev); device_property_read_string(&serdev->dev, "firmware-name", &qcadev->firmware_name); @@ -1518,7 +1834,7 @@ static int qca_serdev_probe(struct serdev_device *serdev) data->num_vregs); if (err) { BT_ERR("Failed to init regulators:%d", err); - goto out; + return err; } qcadev->bt_power->vregs_on = false; @@ -1531,7 +1847,7 @@ static int qca_serdev_probe(struct serdev_device *serdev) err = hci_uart_register_device(&qcadev->serdev_hu, &qca_proto); if (err) { BT_ERR("wcn3990 serdev registration failed"); - goto out; + return err; } } else { qcadev->btsoc_type = QCA_ROME; @@ -1557,12 +1873,18 @@ static int qca_serdev_probe(struct serdev_device *serdev) return err; err = hci_uart_register_device(&qcadev->serdev_hu, &qca_proto); - if (err) + if (err) { + BT_ERR("Rome serdev registration failed"); clk_disable_unprepare(qcadev->susclk); + return err; + } } -out: return err; + hdev = qcadev->serdev_hu.hdev; + set_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks); + hdev->shutdown = qca_power_off; + return 0; } static void qca_serdev_remove(struct serdev_device *serdev) diff --git a/drivers/bluetooth/hci_uart.h b/drivers/bluetooth/hci_uart.h index 6ab631101019..4e039d7a16f8 100644 --- a/drivers/bluetooth/hci_uart.h +++ b/drivers/bluetooth/hci_uart.h @@ -143,6 +143,13 @@ struct h4_recv_pkt { .lsize = 1, \ .maxlen = HCI_MAX_EVENT_SIZE +#define H4_RECV_ISO \ + .type = HCI_ISODATA_PKT, \ + .hlen = HCI_ISO_HDR_SIZE, \ + .loff = 2, \ + .lsize = 2, \ + .maxlen = HCI_MAX_FRAME_SIZE \ + struct sk_buff *h4_recv_buf(struct hci_dev *hdev, struct sk_buff *skb, const unsigned char *buffer, int count, const struct h4_recv_pkt *pkts, int pkts_count); diff --git a/drivers/bluetooth/hci_vhci.c b/drivers/bluetooth/hci_vhci.c index 65e41c1d760f..8ab26dec5f6e 100644 --- a/drivers/bluetooth/hci_vhci.c +++ b/drivers/bluetooth/hci_vhci.c @@ -178,6 +178,7 @@ static inline ssize_t vhci_get_user(struct vhci_data *data, case HCI_EVENT_PKT: case HCI_ACLDATA_PKT: case HCI_SCODATA_PKT: + case HCI_ISODATA_PKT: if (!data->hdev) { kfree_skb(skb); return -ENODEV; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 01a793a0cbf7..30a1e3ac21d6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -1004,7 +1004,7 @@ static const struct pci_device_id pciidlist[] = { {0x1002, 0x734F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI14}, /* Renoir */ - {0x1002, 0x1636, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RENOIR|AMD_IS_APU|AMD_EXP_HW_SUPPORT}, + {0x1002, 0x1636, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RENOIR|AMD_IS_APU}, /* Navi12 */ {0x1002, 0x7360, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI12|AMD_EXP_HW_SUPPORT}, diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c index 5a61a5596912..e6afe4faeca6 100644 --- a/drivers/gpu/drm/drm_dp_mst_topology.c +++ b/drivers/gpu/drm/drm_dp_mst_topology.c @@ -1916,73 +1916,90 @@ static u8 drm_dp_calculate_rad(struct drm_dp_mst_port *port, return parent_lct + 1; } -static int drm_dp_port_set_pdt(struct drm_dp_mst_port *port, u8 new_pdt) +static bool drm_dp_mst_is_dp_mst_end_device(u8 pdt, bool mcs) +{ + switch (pdt) { + case DP_PEER_DEVICE_DP_LEGACY_CONV: + case DP_PEER_DEVICE_SST_SINK: + return true; + case DP_PEER_DEVICE_MST_BRANCHING: + /* For sst branch device */ + if (!mcs) + return true; + + return false; + } + return true; +} + +static int +drm_dp_port_set_pdt(struct drm_dp_mst_port *port, u8 new_pdt, + bool new_mcs) { struct drm_dp_mst_topology_mgr *mgr = port->mgr; struct drm_dp_mst_branch *mstb; u8 rad[8], lct; int ret = 0; - if (port->pdt == new_pdt) + if (port->pdt == new_pdt && port->mcs == new_mcs) return 0; /* Teardown the old pdt, if there is one */ - switch (port->pdt) { - case DP_PEER_DEVICE_DP_LEGACY_CONV: - case DP_PEER_DEVICE_SST_SINK: - /* - * If the new PDT would also have an i2c bus, don't bother - * with reregistering it - */ - if (new_pdt == DP_PEER_DEVICE_DP_LEGACY_CONV || - new_pdt == DP_PEER_DEVICE_SST_SINK) { - port->pdt = new_pdt; - return 0; - } + if (port->pdt != DP_PEER_DEVICE_NONE) { + if (drm_dp_mst_is_dp_mst_end_device(port->pdt, port->mcs)) { + /* + * If the new PDT would also have an i2c bus, + * don't bother with reregistering it + */ + if (new_pdt != DP_PEER_DEVICE_NONE && + drm_dp_mst_is_dp_mst_end_device(new_pdt, new_mcs)) { + port->pdt = new_pdt; + port->mcs = new_mcs; + return 0; + } - /* remove i2c over sideband */ - drm_dp_mst_unregister_i2c_bus(&port->aux); - break; - case DP_PEER_DEVICE_MST_BRANCHING: - mutex_lock(&mgr->lock); - drm_dp_mst_topology_put_mstb(port->mstb); - port->mstb = NULL; - mutex_unlock(&mgr->lock); - break; + /* remove i2c over sideband */ + drm_dp_mst_unregister_i2c_bus(&port->aux); + } else { + mutex_lock(&mgr->lock); + drm_dp_mst_topology_put_mstb(port->mstb); + port->mstb = NULL; + mutex_unlock(&mgr->lock); + } } port->pdt = new_pdt; - switch (port->pdt) { - case DP_PEER_DEVICE_DP_LEGACY_CONV: - case DP_PEER_DEVICE_SST_SINK: - /* add i2c over sideband */ - ret = drm_dp_mst_register_i2c_bus(&port->aux); - break; + port->mcs = new_mcs; - case DP_PEER_DEVICE_MST_BRANCHING: - lct = drm_dp_calculate_rad(port, rad); - mstb = drm_dp_add_mst_branch_device(lct, rad); - if (!mstb) { - ret = -ENOMEM; - DRM_ERROR("Failed to create MSTB for port %p", port); - goto out; - } + if (port->pdt != DP_PEER_DEVICE_NONE) { + if (drm_dp_mst_is_dp_mst_end_device(port->pdt, port->mcs)) { + /* add i2c over sideband */ + ret = drm_dp_mst_register_i2c_bus(&port->aux); + } else { + lct = drm_dp_calculate_rad(port, rad); + mstb = drm_dp_add_mst_branch_device(lct, rad); + if (!mstb) { + ret = -ENOMEM; + DRM_ERROR("Failed to create MSTB for port %p", + port); + goto out; + } - mutex_lock(&mgr->lock); - port->mstb = mstb; - mstb->mgr = port->mgr; - mstb->port_parent = port; + mutex_lock(&mgr->lock); + port->mstb = mstb; + mstb->mgr = port->mgr; + mstb->port_parent = port; - /* - * Make sure this port's memory allocation stays - * around until its child MSTB releases it - */ - drm_dp_mst_get_port_malloc(port); - mutex_unlock(&mgr->lock); + /* + * Make sure this port's memory allocation stays + * around until its child MSTB releases it + */ + drm_dp_mst_get_port_malloc(port); + mutex_unlock(&mgr->lock); - /* And make sure we send a link address for this */ - ret = 1; - break; + /* And make sure we send a link address for this */ + ret = 1; + } } out: @@ -2135,9 +2152,8 @@ drm_dp_mst_port_add_connector(struct drm_dp_mst_branch *mstb, goto error; } - if ((port->pdt == DP_PEER_DEVICE_DP_LEGACY_CONV || - port->pdt == DP_PEER_DEVICE_SST_SINK) && - port->port_num >= DP_MST_LOGICAL_PORT_0) { + if (port->pdt != DP_PEER_DEVICE_NONE && + drm_dp_mst_is_dp_mst_end_device(port->pdt, port->mcs)) { port->cached_edid = drm_get_edid(port->connector, &port->aux.ddc); drm_connector_set_tile_property(port->connector); @@ -2201,6 +2217,7 @@ drm_dp_mst_handle_link_address_port(struct drm_dp_mst_branch *mstb, struct drm_dp_mst_port *port; int old_ddps = 0, ret; u8 new_pdt = DP_PEER_DEVICE_NONE; + bool new_mcs = 0; bool created = false, send_link_addr = false, changed = false; port = drm_dp_get_port(mstb, port_msg->port_number); @@ -2245,7 +2262,7 @@ drm_dp_mst_handle_link_address_port(struct drm_dp_mst_branch *mstb, port->input = port_msg->input_port; if (!port->input) new_pdt = port_msg->peer_device_type; - port->mcs = port_msg->mcs; + new_mcs = port_msg->mcs; port->ddps = port_msg->ddps; port->ldps = port_msg->legacy_device_plug_status; port->dpcd_rev = port_msg->dpcd_revision; @@ -2272,7 +2289,7 @@ drm_dp_mst_handle_link_address_port(struct drm_dp_mst_branch *mstb, } } - ret = drm_dp_port_set_pdt(port, new_pdt); + ret = drm_dp_port_set_pdt(port, new_pdt, new_mcs); if (ret == 1) { send_link_addr = true; } else if (ret < 0) { @@ -2286,7 +2303,8 @@ drm_dp_mst_handle_link_address_port(struct drm_dp_mst_branch *mstb, * we're coming out of suspend. In this case, always resend the link * address if there's an MSTB on this port */ - if (!created && port->pdt == DP_PEER_DEVICE_MST_BRANCHING) + if (!created && port->pdt == DP_PEER_DEVICE_MST_BRANCHING && + port->mcs) send_link_addr = true; if (port->connector) @@ -2323,6 +2341,7 @@ drm_dp_mst_handle_conn_stat(struct drm_dp_mst_branch *mstb, struct drm_dp_mst_port *port; int old_ddps, old_input, ret, i; u8 new_pdt; + bool new_mcs; bool dowork = false, create_connector = false; port = drm_dp_get_port(mstb, conn_stat->port_number); @@ -2354,7 +2373,6 @@ drm_dp_mst_handle_conn_stat(struct drm_dp_mst_branch *mstb, old_ddps = port->ddps; old_input = port->input; port->input = conn_stat->input_port; - port->mcs = conn_stat->message_capability_status; port->ldps = conn_stat->legacy_device_plug_status; port->ddps = conn_stat->displayport_device_plug_status; @@ -2367,8 +2385,8 @@ drm_dp_mst_handle_conn_stat(struct drm_dp_mst_branch *mstb, } new_pdt = port->input ? DP_PEER_DEVICE_NONE : conn_stat->peer_device_type; - - ret = drm_dp_port_set_pdt(port, new_pdt); + new_mcs = conn_stat->message_capability_status; + ret = drm_dp_port_set_pdt(port, new_pdt, new_mcs); if (ret == 1) { dowork = true; } else if (ret < 0) { @@ -3929,6 +3947,8 @@ drm_dp_mst_detect_port(struct drm_connector *connector, switch (port->pdt) { case DP_PEER_DEVICE_NONE: case DP_PEER_DEVICE_MST_BRANCHING: + if (!port->mcs) + ret = connector_status_connected; break; case DP_PEER_DEVICE_SST_SINK: @@ -4541,7 +4561,7 @@ drm_dp_delayed_destroy_port(struct drm_dp_mst_port *port) if (port->connector) port->mgr->cbs->destroy_connector(port->mgr, port->connector); - drm_dp_port_set_pdt(port, DP_PEER_DEVICE_NONE); + drm_dp_port_set_pdt(port, DP_PEER_DEVICE_NONE, port->mcs); drm_dp_mst_put_port_malloc(port); } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c b/drivers/gpu/drm/i915/gem/i915_gem_busy.c index 3d4f5775a4ba..25235ef630c1 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c @@ -9,16 +9,16 @@ #include "i915_gem_ioctls.h" #include "i915_gem_object.h" -static __always_inline u32 __busy_read_flag(u8 id) +static __always_inline u32 __busy_read_flag(u16 id) { - if (id == (u8)I915_ENGINE_CLASS_INVALID) + if (id == (u16)I915_ENGINE_CLASS_INVALID) return 0xffff0000u; GEM_BUG_ON(id >= 16); return 0x10000u << id; } -static __always_inline u32 __busy_write_id(u8 id) +static __always_inline u32 __busy_write_id(u16 id) { /* * The uABI guarantees an active writer is also amongst the read @@ -29,14 +29,14 @@ static __always_inline u32 __busy_write_id(u8 id) * last_read - hence we always set both read and write busy for * last_write. */ - if (id == (u8)I915_ENGINE_CLASS_INVALID) + if (id == (u16)I915_ENGINE_CLASS_INVALID) return 0xffffffffu; return (id + 1) | __busy_read_flag(id); } static __always_inline unsigned int -__busy_set_if_active(const struct dma_fence *fence, u32 (*flag)(u8 id)) +__busy_set_if_active(const struct dma_fence *fence, u32 (*flag)(u16 id)) { const struct i915_request *rq; @@ -57,7 +57,7 @@ __busy_set_if_active(const struct dma_fence *fence, u32 (*flag)(u8 id)) return 0; /* Beware type-expansion follies! */ - BUILD_BUG_ON(!typecheck(u8, rq->engine->uabi_class)); + BUILD_BUG_ON(!typecheck(u16, rq->engine->uabi_class)); return flag(rq->engine->uabi_class); } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c index 4c72d74d6576..0dbb44d30885 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c @@ -402,7 +402,7 @@ struct get_pages_work { static struct sg_table * __i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj, - struct page **pvec, int num_pages) + struct page **pvec, unsigned long num_pages) { unsigned int max_segment = i915_sg_segment_size(); struct sg_table *st; @@ -448,9 +448,10 @@ __i915_gem_userptr_get_pages_worker(struct work_struct *_work) { struct get_pages_work *work = container_of(_work, typeof(*work), work); struct drm_i915_gem_object *obj = work->obj; - const int npages = obj->base.size >> PAGE_SHIFT; + const unsigned long npages = obj->base.size >> PAGE_SHIFT; + unsigned long pinned; struct page **pvec; - int pinned, ret; + int ret; ret = -ENOMEM; pinned = 0; @@ -553,7 +554,7 @@ __i915_gem_userptr_get_pages_schedule(struct drm_i915_gem_object *obj) static int i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj) { - const int num_pages = obj->base.size >> PAGE_SHIFT; + const unsigned long num_pages = obj->base.size >> PAGE_SHIFT; struct mm_struct *mm = obj->userptr.mm->mm; struct page **pvec; struct sg_table *pages; diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 17f1f1441efc..2b446474e010 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -274,8 +274,8 @@ struct intel_engine_cs { u8 class; u8 instance; - u8 uabi_class; - u8 uabi_instance; + u16 uabi_class; + u16 uabi_instance; u32 uabi_capabilities; u32 context_size; diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index c083f516fd35..44727806dfd7 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1177,6 +1177,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt, pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2)); vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1))); do { + GEM_BUG_ON(iter->sg->length < I915_GTT_PAGE_SIZE); vaddr[gen8_pd_index(idx, 0)] = pte_encode | iter->dma; iter->dma += I915_GTT_PAGE_SIZE; @@ -1660,6 +1661,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm, vaddr = kmap_atomic_px(i915_pt_entry(pd, act_pt)); do { + GEM_BUG_ON(iter.sg->length < I915_GTT_PAGE_SIZE); vaddr[act_pte] = pte_encode | GEN6_PTE_ADDR_ENCODE(iter.dma); iter.dma += I915_GTT_PAGE_SIZE; diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c index f61364f7c471..88b431a267af 100644 --- a/drivers/gpu/drm/panfrost/panfrost_drv.c +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c @@ -78,8 +78,10 @@ static int panfrost_ioctl_get_param(struct drm_device *ddev, void *data, struct static int panfrost_ioctl_create_bo(struct drm_device *dev, void *data, struct drm_file *file) { + struct panfrost_file_priv *priv = file->driver_priv; struct panfrost_gem_object *bo; struct drm_panfrost_create_bo *args = data; + struct panfrost_gem_mapping *mapping; if (!args->size || args->pad || (args->flags & ~(PANFROST_BO_NOEXEC | PANFROST_BO_HEAP))) @@ -95,7 +97,14 @@ static int panfrost_ioctl_create_bo(struct drm_device *dev, void *data, if (IS_ERR(bo)) return PTR_ERR(bo); - args->offset = bo->node.start << PAGE_SHIFT; + mapping = panfrost_gem_mapping_get(bo, priv); + if (!mapping) { + drm_gem_object_put_unlocked(&bo->base.base); + return -EINVAL; + } + + args->offset = mapping->mmnode.start << PAGE_SHIFT; + panfrost_gem_mapping_put(mapping); return 0; } @@ -119,6 +128,11 @@ panfrost_lookup_bos(struct drm_device *dev, struct drm_panfrost_submit *args, struct panfrost_job *job) { + struct panfrost_file_priv *priv = file_priv->driver_priv; + struct panfrost_gem_object *bo; + unsigned int i; + int ret; + job->bo_count = args->bo_handle_count; if (!job->bo_count) @@ -130,9 +144,32 @@ panfrost_lookup_bos(struct drm_device *dev, if (!job->implicit_fences) return -ENOMEM; - return drm_gem_objects_lookup(file_priv, - (void __user *)(uintptr_t)args->bo_handles, - job->bo_count, &job->bos); + ret = drm_gem_objects_lookup(file_priv, + (void __user *)(uintptr_t)args->bo_handles, + job->bo_count, &job->bos); + if (ret) + return ret; + + job->mappings = kvmalloc_array(job->bo_count, + sizeof(struct panfrost_gem_mapping *), + GFP_KERNEL | __GFP_ZERO); + if (!job->mappings) + return -ENOMEM; + + for (i = 0; i < job->bo_count; i++) { + struct panfrost_gem_mapping *mapping; + + bo = to_panfrost_bo(job->bos[i]); + mapping = panfrost_gem_mapping_get(bo, priv); + if (!mapping) { + ret = -EINVAL; + break; + } + + job->mappings[i] = mapping; + } + + return ret; } /** @@ -320,7 +357,9 @@ out: static int panfrost_ioctl_get_bo_offset(struct drm_device *dev, void *data, struct drm_file *file_priv) { + struct panfrost_file_priv *priv = file_priv->driver_priv; struct drm_panfrost_get_bo_offset *args = data; + struct panfrost_gem_mapping *mapping; struct drm_gem_object *gem_obj; struct panfrost_gem_object *bo; @@ -331,18 +370,26 @@ static int panfrost_ioctl_get_bo_offset(struct drm_device *dev, void *data, } bo = to_panfrost_bo(gem_obj); - args->offset = bo->node.start << PAGE_SHIFT; - + mapping = panfrost_gem_mapping_get(bo, priv); drm_gem_object_put_unlocked(gem_obj); + + if (!mapping) + return -EINVAL; + + args->offset = mapping->mmnode.start << PAGE_SHIFT; + panfrost_gem_mapping_put(mapping); return 0; } static int panfrost_ioctl_madvise(struct drm_device *dev, void *data, struct drm_file *file_priv) { + struct panfrost_file_priv *priv = file_priv->driver_priv; struct drm_panfrost_madvise *args = data; struct panfrost_device *pfdev = dev->dev_private; struct drm_gem_object *gem_obj; + struct panfrost_gem_object *bo; + int ret = 0; gem_obj = drm_gem_object_lookup(file_priv, args->handle); if (!gem_obj) { @@ -350,22 +397,48 @@ static int panfrost_ioctl_madvise(struct drm_device *dev, void *data, return -ENOENT; } + bo = to_panfrost_bo(gem_obj); + mutex_lock(&pfdev->shrinker_lock); + mutex_lock(&bo->mappings.lock); + if (args->madv == PANFROST_MADV_DONTNEED) { + struct panfrost_gem_mapping *first; + + first = list_first_entry(&bo->mappings.list, + struct panfrost_gem_mapping, + node); + + /* + * If we want to mark the BO purgeable, there must be only one + * user: the caller FD. + * We could do something smarter and mark the BO purgeable only + * when all its users have marked it purgeable, but globally + * visible/shared BOs are likely to never be marked purgeable + * anyway, so let's not bother. + */ + if (!list_is_singular(&bo->mappings.list) || + WARN_ON_ONCE(first->mmu != &priv->mmu)) { + ret = -EINVAL; + goto out_unlock_mappings; + } + } + args->retained = drm_gem_shmem_madvise(gem_obj, args->madv); if (args->retained) { - struct panfrost_gem_object *bo = to_panfrost_bo(gem_obj); - if (args->madv == PANFROST_MADV_DONTNEED) list_add_tail(&bo->base.madv_list, &pfdev->shrinker_list); else if (args->madv == PANFROST_MADV_WILLNEED) list_del_init(&bo->base.madv_list); } + +out_unlock_mappings: + mutex_unlock(&bo->mappings.lock); mutex_unlock(&pfdev->shrinker_lock); drm_gem_object_put_unlocked(gem_obj); - return 0; + return ret; } int panfrost_unstable_ioctl_check(void) diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c b/drivers/gpu/drm/panfrost/panfrost_gem.c index fd766b1395fb..17b654e1eb94 100644 --- a/drivers/gpu/drm/panfrost/panfrost_gem.c +++ b/drivers/gpu/drm/panfrost/panfrost_gem.c @@ -29,6 +29,12 @@ static void panfrost_gem_free_object(struct drm_gem_object *obj) list_del_init(&bo->base.madv_list); mutex_unlock(&pfdev->shrinker_lock); + /* + * If we still have mappings attached to the BO, there's a problem in + * our refcounting. + */ + WARN_ON_ONCE(!list_empty(&bo->mappings.list)); + if (bo->sgts) { int i; int n_sgt = bo->base.base.size / SZ_2M; @@ -46,6 +52,69 @@ static void panfrost_gem_free_object(struct drm_gem_object *obj) drm_gem_shmem_free_object(obj); } +struct panfrost_gem_mapping * +panfrost_gem_mapping_get(struct panfrost_gem_object *bo, + struct panfrost_file_priv *priv) +{ + struct panfrost_gem_mapping *iter, *mapping = NULL; + + mutex_lock(&bo->mappings.lock); + list_for_each_entry(iter, &bo->mappings.list, node) { + if (iter->mmu == &priv->mmu) { + kref_get(&iter->refcount); + mapping = iter; + break; + } + } + mutex_unlock(&bo->mappings.lock); + + return mapping; +} + +static void +panfrost_gem_teardown_mapping(struct panfrost_gem_mapping *mapping) +{ + struct panfrost_file_priv *priv; + + if (mapping->active) + panfrost_mmu_unmap(mapping); + + priv = container_of(mapping->mmu, struct panfrost_file_priv, mmu); + spin_lock(&priv->mm_lock); + if (drm_mm_node_allocated(&mapping->mmnode)) + drm_mm_remove_node(&mapping->mmnode); + spin_unlock(&priv->mm_lock); +} + +static void panfrost_gem_mapping_release(struct kref *kref) +{ + struct panfrost_gem_mapping *mapping; + + mapping = container_of(kref, struct panfrost_gem_mapping, refcount); + + panfrost_gem_teardown_mapping(mapping); + drm_gem_object_put_unlocked(&mapping->obj->base.base); + kfree(mapping); +} + +void panfrost_gem_mapping_put(struct panfrost_gem_mapping *mapping) +{ + if (!mapping) + return; + + kref_put(&mapping->refcount, panfrost_gem_mapping_release); +} + +void panfrost_gem_teardown_mappings(struct panfrost_gem_object *bo) +{ + struct panfrost_gem_mapping *mapping; + + mutex_lock(&bo->mappings.lock); + list_for_each_entry(mapping, &bo->mappings.list, node) + panfrost_gem_teardown_mapping(mapping); + mutex_unlock(&bo->mappings.lock); +} + int panfrost_gem_open(struct drm_gem_object *obj, struct drm_file *file_priv) { int ret; @@ -54,6 +123,16 @@ int panfrost_gem_open(struct drm_gem_object *obj, struct drm_file *file_priv) struct panfrost_gem_object *bo = to_panfrost_bo(obj); unsigned long color = bo->noexec ? PANFROST_BO_NOEXEC : 0; struct panfrost_file_priv *priv = file_priv->driver_priv; + struct panfrost_gem_mapping *mapping; + + mapping = kzalloc(sizeof(*mapping), GFP_KERNEL); + if (!mapping) + return -ENOMEM; + + INIT_LIST_HEAD(&mapping->node); + kref_init(&mapping->refcount); + drm_gem_object_get(obj); + mapping->obj = bo; /* * Executable buffers cannot cross a 16MB boundary as the program @@ -66,37 +145,48 @@ int panfrost_gem_open(struct drm_gem_object *obj, struct drm_file *file_priv) else align = size >= SZ_2M ? SZ_2M >> PAGE_SHIFT : 0; - bo->mmu = &priv->mmu; + mapping->mmu = &priv->mmu; spin_lock(&priv->mm_lock); - ret = drm_mm_insert_node_generic(&priv->mm, &bo->node, + ret = drm_mm_insert_node_generic(&priv->mm, &mapping->mmnode, size >> PAGE_SHIFT, align, color, 0); spin_unlock(&priv->mm_lock); if (ret) - return ret; + goto err; if (!bo->is_heap) { - ret = panfrost_mmu_map(bo); - if (ret) { - spin_lock(&priv->mm_lock); - drm_mm_remove_node(&bo->node); - spin_unlock(&priv->mm_lock); - } + ret = panfrost_mmu_map(mapping); + if (ret) + goto err; } + + mutex_lock(&bo->mappings.lock); + WARN_ON(bo->base.madv != PANFROST_MADV_WILLNEED); + list_add_tail(&mapping->node, &bo->mappings.list); + mutex_unlock(&bo->mappings.lock); + +err: + if (ret) + panfrost_gem_mapping_put(mapping); return ret; } void panfrost_gem_close(struct drm_gem_object *obj, struct drm_file *file_priv) { - struct panfrost_gem_object *bo = to_panfrost_bo(obj); struct panfrost_file_priv *priv = file_priv->driver_priv; + struct panfrost_gem_object *bo = to_panfrost_bo(obj); + struct panfrost_gem_mapping *mapping = NULL, *iter; - if (bo->is_mapped) - panfrost_mmu_unmap(bo); + mutex_lock(&bo->mappings.lock); + list_for_each_entry(iter, &bo->mappings.list, node) { + if (iter->mmu == &priv->mmu) { + mapping = iter; + list_del(&iter->node); + break; + } + } + mutex_unlock(&bo->mappings.lock); - spin_lock(&priv->mm_lock); - if (drm_mm_node_allocated(&bo->node)) - drm_mm_remove_node(&bo->node); - spin_unlock(&priv->mm_lock); + panfrost_gem_mapping_put(mapping); } static int panfrost_gem_pin(struct drm_gem_object *obj) @@ -136,6 +226,8 @@ struct drm_gem_object *panfrost_gem_create_object(struct drm_device *dev, size_t if (!obj) return NULL; + INIT_LIST_HEAD(&obj->mappings.list); + mutex_init(&obj->mappings.lock); obj->base.base.funcs = &panfrost_gem_funcs; return &obj->base.base; diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.h b/drivers/gpu/drm/panfrost/panfrost_gem.h index 4b17e7308764..ca1bc9019600 100644 --- a/drivers/gpu/drm/panfrost/panfrost_gem.h +++ b/drivers/gpu/drm/panfrost/panfrost_gem.h @@ -13,23 +13,46 @@ struct panfrost_gem_object { struct drm_gem_shmem_object base; struct sg_table *sgts; - struct panfrost_mmu *mmu; - struct drm_mm_node node; - bool is_mapped :1; + /* + * Use a list for now. If searching a mapping ever becomes the + * bottleneck, we should consider using an RB-tree, or even better, + * let the core store drm_gem_object_mapping entries (where we + * could place driver specific data) instead of drm_gem_object ones + * in its drm_file->object_idr table. + * + * struct drm_gem_object_mapping { + * struct drm_gem_object *obj; + * void *driver_priv; + * }; + */ + struct { + struct list_head list; + struct mutex lock; + } mappings; + bool noexec :1; bool is_heap :1; }; +struct panfrost_gem_mapping { + struct list_head node; + struct kref refcount; + struct panfrost_gem_object *obj; + struct drm_mm_node mmnode; + struct panfrost_mmu *mmu; + bool active :1; +}; + static inline struct panfrost_gem_object *to_panfrost_bo(struct drm_gem_object *obj) { return container_of(to_drm_gem_shmem_obj(obj), struct panfrost_gem_object, base); } -static inline -struct panfrost_gem_object *drm_mm_node_to_panfrost_bo(struct drm_mm_node *node) +static inline struct panfrost_gem_mapping * +drm_mm_node_to_panfrost_mapping(struct drm_mm_node *node) { - return container_of(node, struct panfrost_gem_object, node); + return container_of(node, struct panfrost_gem_mapping, mmnode); } struct drm_gem_object *panfrost_gem_create_object(struct drm_device *dev, size_t size); @@ -49,6 +72,12 @@ int panfrost_gem_open(struct drm_gem_object *obj, struct drm_file *file_priv); void panfrost_gem_close(struct drm_gem_object *obj, struct drm_file *file_priv); +struct panfrost_gem_mapping * +panfrost_gem_mapping_get(struct panfrost_gem_object *bo, + struct panfrost_file_priv *priv); +void panfrost_gem_mapping_put(struct panfrost_gem_mapping *mapping); +void panfrost_gem_teardown_mappings(struct panfrost_gem_object *bo); + void panfrost_gem_shrinker_init(struct drm_device *dev); void panfrost_gem_shrinker_cleanup(struct drm_device *dev); diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c index 458f0fa68111..f5dd7b29bc95 100644 --- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c +++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c @@ -39,11 +39,12 @@ panfrost_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc static bool panfrost_gem_purge(struct drm_gem_object *obj) { struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj); + struct panfrost_gem_object *bo = to_panfrost_bo(obj); if (!mutex_trylock(&shmem->pages_lock)) return false; - panfrost_mmu_unmap(to_panfrost_bo(obj)); + panfrost_gem_teardown_mappings(bo); drm_gem_shmem_purge_locked(obj); mutex_unlock(&shmem->pages_lock); diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c index d411eb6c8eb9..e364ee00f3d0 100644 --- a/drivers/gpu/drm/panfrost/panfrost_job.c +++ b/drivers/gpu/drm/panfrost/panfrost_job.c @@ -268,9 +268,20 @@ static void panfrost_job_cleanup(struct kref *ref) dma_fence_put(job->done_fence); dma_fence_put(job->render_done_fence); - if (job->bos) { + if (job->mappings) { for (i = 0; i < job->bo_count; i++) + panfrost_gem_mapping_put(job->mappings[i]); + kvfree(job->mappings); + } + + if (job->bos) { + struct panfrost_gem_object *bo; + + for (i = 0; i < job->bo_count; i++) { + bo = to_panfrost_bo(job->bos[i]); drm_gem_object_put_unlocked(job->bos[i]); + } + kvfree(job->bos); } diff --git a/drivers/gpu/drm/panfrost/panfrost_job.h b/drivers/gpu/drm/panfrost/panfrost_job.h index 62454128a792..bbd3ba97ff67 100644 --- a/drivers/gpu/drm/panfrost/panfrost_job.h +++ b/drivers/gpu/drm/panfrost/panfrost_job.h @@ -32,6 +32,7 @@ struct panfrost_job { /* Exclusive fences we have taken from the BOs to wait for */ struct dma_fence **implicit_fences; + struct panfrost_gem_mapping **mappings; struct drm_gem_object **bos; u32 bo_count; diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c index a3ed64a1f15e..763cfca886a7 100644 --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c @@ -269,14 +269,15 @@ static int mmu_map_sg(struct panfrost_device *pfdev, struct panfrost_mmu *mmu, return 0; } -int panfrost_mmu_map(struct panfrost_gem_object *bo) +int panfrost_mmu_map(struct panfrost_gem_mapping *mapping) { + struct panfrost_gem_object *bo = mapping->obj; struct drm_gem_object *obj = &bo->base.base; struct panfrost_device *pfdev = to_panfrost_device(obj->dev); struct sg_table *sgt; int prot = IOMMU_READ | IOMMU_WRITE; - if (WARN_ON(bo->is_mapped)) + if (WARN_ON(mapping->active)) return 0; if (bo->noexec) @@ -286,25 +287,28 @@ int panfrost_mmu_map(struct panfrost_gem_object *bo) if (WARN_ON(IS_ERR(sgt))) return PTR_ERR(sgt); - mmu_map_sg(pfdev, bo->mmu, bo->node.start << PAGE_SHIFT, prot, sgt); - bo->is_mapped = true; + mmu_map_sg(pfdev, mapping->mmu, mapping->mmnode.start << PAGE_SHIFT, + prot, sgt); + mapping->active = true; return 0; } -void panfrost_mmu_unmap(struct panfrost_gem_object *bo) +void panfrost_mmu_unmap(struct panfrost_gem_mapping *mapping) { + struct panfrost_gem_object *bo = mapping->obj; struct drm_gem_object *obj = &bo->base.base; struct panfrost_device *pfdev = to_panfrost_device(obj->dev); - struct io_pgtable_ops *ops = bo->mmu->pgtbl_ops; - u64 iova = bo->node.start << PAGE_SHIFT; - size_t len = bo->node.size << PAGE_SHIFT; + struct io_pgtable_ops *ops = mapping->mmu->pgtbl_ops; + u64 iova = mapping->mmnode.start << PAGE_SHIFT; + size_t len = mapping->mmnode.size << PAGE_SHIFT; size_t unmapped_len = 0; - if (WARN_ON(!bo->is_mapped)) + if (WARN_ON(!mapping->active)) return; - dev_dbg(pfdev->dev, "unmap: as=%d, iova=%llx, len=%zx", bo->mmu->as, iova, len); + dev_dbg(pfdev->dev, "unmap: as=%d, iova=%llx, len=%zx", + mapping->mmu->as, iova, len); while (unmapped_len < len) { size_t unmapped_page; @@ -318,8 +322,9 @@ void panfrost_mmu_unmap(struct panfrost_gem_object *bo) unmapped_len += pgsize; } - panfrost_mmu_flush_range(pfdev, bo->mmu, bo->node.start << PAGE_SHIFT, len); - bo->is_mapped = false; + panfrost_mmu_flush_range(pfdev, mapping->mmu, + mapping->mmnode.start << PAGE_SHIFT, len); + mapping->active = false; } static void mmu_tlb_inv_context_s1(void *cookie) @@ -394,10 +399,10 @@ void panfrost_mmu_pgtable_free(struct panfrost_file_priv *priv) free_io_pgtable_ops(mmu->pgtbl_ops); } -static struct panfrost_gem_object * -addr_to_drm_mm_node(struct panfrost_device *pfdev, int as, u64 addr) +static struct panfrost_gem_mapping * +addr_to_mapping(struct panfrost_device *pfdev, int as, u64 addr) { - struct panfrost_gem_object *bo = NULL; + struct panfrost_gem_mapping *mapping = NULL; struct panfrost_file_priv *priv; struct drm_mm_node *node; u64 offset = addr >> PAGE_SHIFT; @@ -418,8 +423,9 @@ found_mmu: drm_mm_for_each_node(node, &priv->mm) { if (offset >= node->start && offset < (node->start + node->size)) { - bo = drm_mm_node_to_panfrost_bo(node); - drm_gem_object_get(&bo->base.base); + mapping = drm_mm_node_to_panfrost_mapping(node); + + kref_get(&mapping->refcount); break; } } @@ -427,7 +433,7 @@ found_mmu: spin_unlock(&priv->mm_lock); out: spin_unlock(&pfdev->as_lock); - return bo; + return mapping; } #define NUM_FAULT_PAGES (SZ_2M / PAGE_SIZE) @@ -436,28 +442,30 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as, u64 addr) { int ret, i; + struct panfrost_gem_mapping *bomapping; struct panfrost_gem_object *bo; struct address_space *mapping; pgoff_t page_offset; struct sg_table *sgt; struct page **pages; - bo = addr_to_drm_mm_node(pfdev, as, addr); - if (!bo) + bomapping = addr_to_mapping(pfdev, as, addr); + if (!bomapping) return -ENOENT; + bo = bomapping->obj; if (!bo->is_heap) { dev_WARN(pfdev->dev, "matching BO is not heap type (GPU VA = %llx)", - bo->node.start << PAGE_SHIFT); + bomapping->mmnode.start << PAGE_SHIFT); ret = -EINVAL; goto err_bo; } - WARN_ON(bo->mmu->as != as); + WARN_ON(bomapping->mmu->as != as); /* Assume 2MB alignment and size multiple */ addr &= ~((u64)SZ_2M - 1); page_offset = addr >> PAGE_SHIFT; - page_offset -= bo->node.start; + page_offset -= bomapping->mmnode.start; mutex_lock(&bo->base.pages_lock); @@ -509,13 +517,14 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as, goto err_map; } - mmu_map_sg(pfdev, bo->mmu, addr, IOMMU_WRITE | IOMMU_READ | IOMMU_NOEXEC, sgt); + mmu_map_sg(pfdev, bomapping->mmu, addr, + IOMMU_WRITE | IOMMU_READ | IOMMU_NOEXEC, sgt); - bo->is_mapped = true; + bomapping->active = true; dev_dbg(pfdev->dev, "mapped page fault @ AS%d %llx", as, addr); - drm_gem_object_put_unlocked(&bo->base.base); + panfrost_gem_mapping_put(bomapping); return 0; diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.h b/drivers/gpu/drm/panfrost/panfrost_mmu.h index 7c5b6775ae23..44fc2edf63ce 100644 --- a/drivers/gpu/drm/panfrost/panfrost_mmu.h +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.h @@ -4,12 +4,12 @@ #ifndef __PANFROST_MMU_H__ #define __PANFROST_MMU_H__ -struct panfrost_gem_object; +struct panfrost_gem_mapping; struct panfrost_file_priv; struct panfrost_mmu; -int panfrost_mmu_map(struct panfrost_gem_object *bo); -void panfrost_mmu_unmap(struct panfrost_gem_object *bo); +int panfrost_mmu_map(struct panfrost_gem_mapping *mapping); +void panfrost_mmu_unmap(struct panfrost_gem_mapping *mapping); int panfrost_mmu_init(struct panfrost_device *pfdev); void panfrost_mmu_fini(struct panfrost_device *pfdev); diff --git a/drivers/gpu/drm/panfrost/panfrost_perfcnt.c b/drivers/gpu/drm/panfrost/panfrost_perfcnt.c index 2c04e858c50a..684820448be3 100644 --- a/drivers/gpu/drm/panfrost/panfrost_perfcnt.c +++ b/drivers/gpu/drm/panfrost/panfrost_perfcnt.c @@ -25,7 +25,7 @@ #define V4_SHADERS_PER_COREGROUP 4 struct panfrost_perfcnt { - struct panfrost_gem_object *bo; + struct panfrost_gem_mapping *mapping; size_t bosize; void *buf; struct panfrost_file_priv *user; @@ -49,7 +49,7 @@ static int panfrost_perfcnt_dump_locked(struct panfrost_device *pfdev) int ret; reinit_completion(&pfdev->perfcnt->dump_comp); - gpuva = pfdev->perfcnt->bo->node.start << PAGE_SHIFT; + gpuva = pfdev->perfcnt->mapping->mmnode.start << PAGE_SHIFT; gpu_write(pfdev, GPU_PERFCNT_BASE_LO, gpuva); gpu_write(pfdev, GPU_PERFCNT_BASE_HI, gpuva >> 32); gpu_write(pfdev, GPU_INT_CLEAR, @@ -89,17 +89,22 @@ static int panfrost_perfcnt_enable_locked(struct panfrost_device *pfdev, if (IS_ERR(bo)) return PTR_ERR(bo); - perfcnt->bo = to_panfrost_bo(&bo->base); - /* Map the perfcnt buf in the address space attached to file_priv. */ - ret = panfrost_gem_open(&perfcnt->bo->base.base, file_priv); + ret = panfrost_gem_open(&bo->base, file_priv); if (ret) goto err_put_bo; + perfcnt->mapping = panfrost_gem_mapping_get(to_panfrost_bo(&bo->base), + user); + if (!perfcnt->mapping) { + ret = -EINVAL; + goto err_close_bo; + } + perfcnt->buf = drm_gem_shmem_vmap(&bo->base); if (IS_ERR(perfcnt->buf)) { ret = PTR_ERR(perfcnt->buf); - goto err_close_bo; + goto err_put_mapping; } /* @@ -154,12 +159,17 @@ static int panfrost_perfcnt_enable_locked(struct panfrost_device *pfdev, if (panfrost_has_hw_issue(pfdev, HW_ISSUE_8186)) gpu_write(pfdev, GPU_PRFCNT_TILER_EN, 0xffffffff); + /* The BO ref is retained by the mapping. */ + drm_gem_object_put_unlocked(&bo->base); + return 0; err_vunmap: - drm_gem_shmem_vunmap(&perfcnt->bo->base.base, perfcnt->buf); + drm_gem_shmem_vunmap(&bo->base, perfcnt->buf); +err_put_mapping: + panfrost_gem_mapping_put(perfcnt->mapping); err_close_bo: - panfrost_gem_close(&perfcnt->bo->base.base, file_priv); + panfrost_gem_close(&bo->base, file_priv); err_put_bo: drm_gem_object_put_unlocked(&bo->base); return ret; @@ -182,11 +192,11 @@ static int panfrost_perfcnt_disable_locked(struct panfrost_device *pfdev, GPU_PERFCNT_CFG_MODE(GPU_PERFCNT_CFG_MODE_OFF)); perfcnt->user = NULL; - drm_gem_shmem_vunmap(&perfcnt->bo->base.base, perfcnt->buf); + drm_gem_shmem_vunmap(&perfcnt->mapping->obj->base.base, perfcnt->buf); perfcnt->buf = NULL; - panfrost_gem_close(&perfcnt->bo->base.base, file_priv); - drm_gem_object_put_unlocked(&perfcnt->bo->base.base); - perfcnt->bo = NULL; + panfrost_gem_close(&perfcnt->mapping->obj->base.base, file_priv); + panfrost_gem_mapping_put(perfcnt->mapping); + perfcnt->mapping = NULL; pm_runtime_mark_last_busy(pfdev->dev); pm_runtime_put_autosuspend(pfdev->dev); diff --git a/drivers/hwmon/adt7475.c b/drivers/hwmon/adt7475.c index 6c64d50c9aae..01c2eeb02aa9 100644 --- a/drivers/hwmon/adt7475.c +++ b/drivers/hwmon/adt7475.c @@ -294,9 +294,10 @@ static inline u16 volt2reg(int channel, long volt, u8 bypass_attn) long reg; if (bypass_attn & (1 << channel)) - reg = (volt * 1024) / 2250; + reg = DIV_ROUND_CLOSEST(volt * 1024, 2250); else - reg = (volt * r[1] * 1024) / ((r[0] + r[1]) * 2250); + reg = DIV_ROUND_CLOSEST(volt * r[1] * 1024, + (r[0] + r[1]) * 2250); return clamp_val(reg, 0, 1023) & (0xff << 2); } diff --git a/drivers/hwmon/hwmon.c b/drivers/hwmon/hwmon.c index 1f3b30b085b9..d018b20089ec 100644 --- a/drivers/hwmon/hwmon.c +++ b/drivers/hwmon/hwmon.c @@ -51,6 +51,7 @@ struct hwmon_device_attribute { #define to_hwmon_attr(d) \ container_of(d, struct hwmon_device_attribute, dev_attr) +#define to_dev_attr(a) container_of(a, struct device_attribute, attr) /* * Thermal zone information @@ -58,7 +59,7 @@ struct hwmon_device_attribute { * also provides the sensor index. */ struct hwmon_thermal_data { - struct hwmon_device *hwdev; /* Reference to hwmon device */ + struct device *dev; /* Reference to hwmon device */ int index; /* sensor index */ }; @@ -95,9 +96,27 @@ static const struct attribute_group *hwmon_dev_attr_groups[] = { NULL }; +static void hwmon_free_attrs(struct attribute **attrs) +{ + int i; + + for (i = 0; attrs[i]; i++) { + struct device_attribute *dattr = to_dev_attr(attrs[i]); + struct hwmon_device_attribute *hattr = to_hwmon_attr(dattr); + + kfree(hattr); + } + kfree(attrs); +} + static void hwmon_dev_release(struct device *dev) { - kfree(to_hwmon_device(dev)); + struct hwmon_device *hwdev = to_hwmon_device(dev); + + if (hwdev->group.attrs) + hwmon_free_attrs(hwdev->group.attrs); + kfree(hwdev->groups); + kfree(hwdev); } static struct class hwmon_class = { @@ -119,11 +138,11 @@ static DEFINE_IDA(hwmon_ida); static int hwmon_thermal_get_temp(void *data, int *temp) { struct hwmon_thermal_data *tdata = data; - struct hwmon_device *hwdev = tdata->hwdev; + struct hwmon_device *hwdev = to_hwmon_device(tdata->dev); int ret; long t; - ret = hwdev->chip->ops->read(&hwdev->dev, hwmon_temp, hwmon_temp_input, + ret = hwdev->chip->ops->read(tdata->dev, hwmon_temp, hwmon_temp_input, tdata->index, &t); if (ret < 0) return ret; @@ -137,8 +156,7 @@ static const struct thermal_zone_of_device_ops hwmon_thermal_ops = { .get_temp = hwmon_thermal_get_temp, }; -static int hwmon_thermal_add_sensor(struct device *dev, - struct hwmon_device *hwdev, int index) +static int hwmon_thermal_add_sensor(struct device *dev, int index) { struct hwmon_thermal_data *tdata; struct thermal_zone_device *tzd; @@ -147,10 +165,10 @@ static int hwmon_thermal_add_sensor(struct device *dev, if (!tdata) return -ENOMEM; - tdata->hwdev = hwdev; + tdata->dev = dev; tdata->index = index; - tzd = devm_thermal_zone_of_sensor_register(&hwdev->dev, index, tdata, + tzd = devm_thermal_zone_of_sensor_register(dev, index, tdata, &hwmon_thermal_ops); /* * If CONFIG_THERMAL_OF is disabled, this returns -ENODEV, @@ -162,8 +180,7 @@ static int hwmon_thermal_add_sensor(struct device *dev, return 0; } #else -static int hwmon_thermal_add_sensor(struct device *dev, - struct hwmon_device *hwdev, int index) +static int hwmon_thermal_add_sensor(struct device *dev, int index) { return 0; } @@ -250,8 +267,7 @@ static bool is_string_attr(enum hwmon_sensor_types type, u32 attr) (type == hwmon_fan && attr == hwmon_fan_label); } -static struct attribute *hwmon_genattr(struct device *dev, - const void *drvdata, +static struct attribute *hwmon_genattr(const void *drvdata, enum hwmon_sensor_types type, u32 attr, int index, @@ -279,7 +295,7 @@ static struct attribute *hwmon_genattr(struct device *dev, if ((mode & 0222) && !ops->write) return ERR_PTR(-EINVAL); - hattr = devm_kzalloc(dev, sizeof(*hattr), GFP_KERNEL); + hattr = kzalloc(sizeof(*hattr), GFP_KERNEL); if (!hattr) return ERR_PTR(-ENOMEM); @@ -492,8 +508,7 @@ static int hwmon_num_channel_attrs(const struct hwmon_channel_info *info) return n; } -static int hwmon_genattrs(struct device *dev, - const void *drvdata, +static int hwmon_genattrs(const void *drvdata, struct attribute **attrs, const struct hwmon_ops *ops, const struct hwmon_channel_info *info) @@ -519,7 +534,7 @@ static int hwmon_genattrs(struct device *dev, attr_mask &= ~BIT(attr); if (attr >= template_size) return -EINVAL; - a = hwmon_genattr(dev, drvdata, info->type, attr, i, + a = hwmon_genattr(drvdata, info->type, attr, i, templates[attr], ops); if (IS_ERR(a)) { if (PTR_ERR(a) != -ENOENT) @@ -533,8 +548,7 @@ static int hwmon_genattrs(struct device *dev, } static struct attribute ** -__hwmon_create_attrs(struct device *dev, const void *drvdata, - const struct hwmon_chip_info *chip) +__hwmon_create_attrs(const void *drvdata, const struct hwmon_chip_info *chip) { int ret, i, aindex = 0, nattrs = 0; struct attribute **attrs; @@ -545,15 +559,17 @@ __hwmon_create_attrs(struct device *dev, const void *drvdata, if (nattrs == 0) return ERR_PTR(-EINVAL); - attrs = devm_kcalloc(dev, nattrs + 1, sizeof(*attrs), GFP_KERNEL); + attrs = kcalloc(nattrs + 1, sizeof(*attrs), GFP_KERNEL); if (!attrs) return ERR_PTR(-ENOMEM); for (i = 0; chip->info[i]; i++) { - ret = hwmon_genattrs(dev, drvdata, &attrs[aindex], chip->ops, + ret = hwmon_genattrs(drvdata, &attrs[aindex], chip->ops, chip->info[i]); - if (ret < 0) + if (ret < 0) { + hwmon_free_attrs(attrs); return ERR_PTR(ret); + } aindex += ret; } @@ -595,14 +611,13 @@ __hwmon_device_register(struct device *dev, const char *name, void *drvdata, for (i = 0; groups[i]; i++) ngroups++; - hwdev->groups = devm_kcalloc(dev, ngroups, sizeof(*groups), - GFP_KERNEL); + hwdev->groups = kcalloc(ngroups, sizeof(*groups), GFP_KERNEL); if (!hwdev->groups) { err = -ENOMEM; goto free_hwmon; } - attrs = __hwmon_create_attrs(dev, drvdata, chip); + attrs = __hwmon_create_attrs(drvdata, chip); if (IS_ERR(attrs)) { err = PTR_ERR(attrs); goto free_hwmon; @@ -647,8 +662,7 @@ __hwmon_device_register(struct device *dev, const char *name, void *drvdata, hwmon_temp_input, j)) continue; if (info[i]->config[j] & HWMON_T_INPUT) { - err = hwmon_thermal_add_sensor(dev, - hwdev, j); + err = hwmon_thermal_add_sensor(hdev, j); if (err) { device_unregister(hdev); /* @@ -667,7 +681,7 @@ __hwmon_device_register(struct device *dev, const char *name, void *drvdata, return hdev; free_hwmon: - kfree(hwdev); + hwmon_dev_release(hdev); ida_remove: ida_simple_remove(&hwmon_ida, id); return ERR_PTR(err); diff --git a/drivers/hwmon/nct7802.c b/drivers/hwmon/nct7802.c index f3dd2a17bd42..2e97e56c72c7 100644 --- a/drivers/hwmon/nct7802.c +++ b/drivers/hwmon/nct7802.c @@ -23,8 +23,8 @@ static const u8 REG_VOLTAGE[5] = { 0x09, 0x0a, 0x0c, 0x0d, 0x0e }; static const u8 REG_VOLTAGE_LIMIT_LSB[2][5] = { - { 0x40, 0x00, 0x42, 0x44, 0x46 }, - { 0x3f, 0x00, 0x41, 0x43, 0x45 }, + { 0x46, 0x00, 0x40, 0x42, 0x44 }, + { 0x45, 0x00, 0x3f, 0x41, 0x43 }, }; static const u8 REG_VOLTAGE_LIMIT_MSB[5] = { 0x48, 0x00, 0x47, 0x47, 0x48 }; @@ -58,6 +58,8 @@ static const u8 REG_VOLTAGE_LIMIT_MSB_SHIFT[2][5] = { struct nct7802_data { struct regmap *regmap; struct mutex access_lock; /* for multi-byte read and write operations */ + u8 in_status; + struct mutex in_alarm_lock; }; static ssize_t temp_type_show(struct device *dev, @@ -368,6 +370,66 @@ static ssize_t in_store(struct device *dev, struct device_attribute *attr, return err ? : count; } +static ssize_t in_alarm_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct sensor_device_attribute_2 *sattr = to_sensor_dev_attr_2(attr); + struct nct7802_data *data = dev_get_drvdata(dev); + int volt, min, max, ret; + unsigned int val; + + mutex_lock(&data->in_alarm_lock); + + /* + * The SMI Voltage status register is the only register giving a status + * for voltages. A bit is set for each input crossing a threshold, in + * both direction, but the "inside" or "outside" limits info is not + * available. Also this register is cleared on read. + * Note: this is not explicitly spelled out in the datasheet, but + * from experiment. + * To deal with this we use a status cache with one validity bit and + * one status bit for each input. Validity is cleared at startup and + * each time the register reports a change, and the status is processed + * by software based on current input value and limits. + */ + ret = regmap_read(data->regmap, 0x1e, &val); /* SMI Voltage status */ + if (ret < 0) + goto abort; + + /* invalidate cached status for all inputs crossing a threshold */ + data->in_status &= ~((val & 0x0f) << 4); + + /* if cached status for requested input is invalid, update it */ + if (!(data->in_status & (0x10 << sattr->index))) { + ret = nct7802_read_voltage(data, sattr->nr, 0); + if (ret < 0) + goto abort; + volt = ret; + + ret = nct7802_read_voltage(data, sattr->nr, 1); + if (ret < 0) + goto abort; + min = ret; + + ret = nct7802_read_voltage(data, sattr->nr, 2); + if (ret < 0) + goto abort; + max = ret; + + if (volt < min || volt > max) + data->in_status |= (1 << sattr->index); + else + data->in_status &= ~(1 << sattr->index); + + data->in_status |= 0x10 << sattr->index; + } + + ret = sprintf(buf, "%u\n", !!(data->in_status & (1 << sattr->index))); +abort: + mutex_unlock(&data->in_alarm_lock); + return ret; +} + static ssize_t temp_show(struct device *dev, struct device_attribute *attr, char *buf) { @@ -660,7 +722,7 @@ static const struct attribute_group nct7802_temp_group = { static SENSOR_DEVICE_ATTR_2_RO(in0_input, in, 0, 0); static SENSOR_DEVICE_ATTR_2_RW(in0_min, in, 0, 1); static SENSOR_DEVICE_ATTR_2_RW(in0_max, in, 0, 2); -static SENSOR_DEVICE_ATTR_2_RO(in0_alarm, alarm, 0x1e, 3); +static SENSOR_DEVICE_ATTR_2_RO(in0_alarm, in_alarm, 0, 3); static SENSOR_DEVICE_ATTR_2_RW(in0_beep, beep, 0x5a, 3); static SENSOR_DEVICE_ATTR_2_RO(in1_input, in, 1, 0); @@ -668,19 +730,19 @@ static SENSOR_DEVICE_ATTR_2_RO(in1_input, in, 1, 0); static SENSOR_DEVICE_ATTR_2_RO(in2_input, in, 2, 0); static SENSOR_DEVICE_ATTR_2_RW(in2_min, in, 2, 1); static SENSOR_DEVICE_ATTR_2_RW(in2_max, in, 2, 2); -static SENSOR_DEVICE_ATTR_2_RO(in2_alarm, alarm, 0x1e, 0); +static SENSOR_DEVICE_ATTR_2_RO(in2_alarm, in_alarm, 2, 0); static SENSOR_DEVICE_ATTR_2_RW(in2_beep, beep, 0x5a, 0); static SENSOR_DEVICE_ATTR_2_RO(in3_input, in, 3, 0); static SENSOR_DEVICE_ATTR_2_RW(in3_min, in, 3, 1); static SENSOR_DEVICE_ATTR_2_RW(in3_max, in, 3, 2); -static SENSOR_DEVICE_ATTR_2_RO(in3_alarm, alarm, 0x1e, 1); +static SENSOR_DEVICE_ATTR_2_RO(in3_alarm, in_alarm, 3, 1); static SENSOR_DEVICE_ATTR_2_RW(in3_beep, beep, 0x5a, 1); static SENSOR_DEVICE_ATTR_2_RO(in4_input, in, 4, 0); static SENSOR_DEVICE_ATTR_2_RW(in4_min, in, 4, 1); static SENSOR_DEVICE_ATTR_2_RW(in4_max, in, 4, 2); -static SENSOR_DEVICE_ATTR_2_RO(in4_alarm, alarm, 0x1e, 2); +static SENSOR_DEVICE_ATTR_2_RO(in4_alarm, in_alarm, 4, 2); static SENSOR_DEVICE_ATTR_2_RW(in4_beep, beep, 0x5a, 2); static struct attribute *nct7802_in_attrs[] = { @@ -1011,6 +1073,7 @@ static int nct7802_probe(struct i2c_client *client, return PTR_ERR(data->regmap); mutex_init(&data->access_lock); + mutex_init(&data->in_alarm_lock); ret = nct7802_init_chip(data); if (ret < 0) diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c index f918fca9ada3..cb6e3a5f509c 100644 --- a/drivers/input/evdev.c +++ b/drivers/input/evdev.c @@ -484,10 +484,7 @@ static int evdev_open(struct inode *inode, struct file *file) struct evdev_client *client; int error; - client = kzalloc(struct_size(client, buffer, bufsize), - GFP_KERNEL | __GFP_NOWARN); - if (!client) - client = vzalloc(struct_size(client, buffer, bufsize)); + client = kvzalloc(struct_size(client, buffer, bufsize), GFP_KERNEL); if (!client) return -ENOMEM; diff --git a/drivers/input/misc/keyspan_remote.c b/drivers/input/misc/keyspan_remote.c index 83368f1e7c4e..4650f4a94989 100644 --- a/drivers/input/misc/keyspan_remote.c +++ b/drivers/input/misc/keyspan_remote.c @@ -336,7 +336,8 @@ static int keyspan_setup(struct usb_device* dev) int retval = 0; retval = usb_control_msg(dev, usb_sndctrlpipe(dev, 0), - 0x11, 0x40, 0x5601, 0x0, NULL, 0, 0); + 0x11, 0x40, 0x5601, 0x0, NULL, 0, + USB_CTRL_SET_TIMEOUT); if (retval) { dev_dbg(&dev->dev, "%s - failed to set bit rate due to error: %d\n", __func__, retval); @@ -344,7 +345,8 @@ static int keyspan_setup(struct usb_device* dev) } retval = usb_control_msg(dev, usb_sndctrlpipe(dev, 0), - 0x44, 0x40, 0x0, 0x0, NULL, 0, 0); + 0x44, 0x40, 0x0, 0x0, NULL, 0, + USB_CTRL_SET_TIMEOUT); if (retval) { dev_dbg(&dev->dev, "%s - failed to set resume sensitivity due to error: %d\n", __func__, retval); @@ -352,7 +354,8 @@ static int keyspan_setup(struct usb_device* dev) } retval = usb_control_msg(dev, usb_sndctrlpipe(dev, 0), - 0x22, 0x40, 0x0, 0x0, NULL, 0, 0); + 0x22, 0x40, 0x0, 0x0, NULL, 0, + USB_CTRL_SET_TIMEOUT); if (retval) { dev_dbg(&dev->dev, "%s - failed to turn receive on due to error: %d\n", __func__, retval); diff --git a/drivers/input/misc/max77650-onkey.c b/drivers/input/misc/max77650-onkey.c index 4d875f2ac13d..ee55f22dbca5 100644 --- a/drivers/input/misc/max77650-onkey.c +++ b/drivers/input/misc/max77650-onkey.c @@ -108,9 +108,16 @@ static int max77650_onkey_probe(struct platform_device *pdev) return input_register_device(onkey->input); } +static const struct of_device_id max77650_onkey_of_match[] = { + { .compatible = "maxim,max77650-onkey" }, + { } +}; +MODULE_DEVICE_TABLE(of, max77650_onkey_of_match); + static struct platform_driver max77650_onkey_driver = { .driver = { .name = "max77650-onkey", + .of_match_table = max77650_onkey_of_match, }, .probe = max77650_onkey_probe, }; diff --git a/drivers/input/misc/pm8xxx-vibrator.c b/drivers/input/misc/pm8xxx-vibrator.c index ecd762f93732..53ad25eaf1a2 100644 --- a/drivers/input/misc/pm8xxx-vibrator.c +++ b/drivers/input/misc/pm8xxx-vibrator.c @@ -90,7 +90,7 @@ static int pm8xxx_vib_set(struct pm8xxx_vib *vib, bool on) if (regs->enable_mask) rc = regmap_update_bits(vib->regmap, regs->enable_addr, - on ? regs->enable_mask : 0, val); + regs->enable_mask, on ? ~0 : 0); return rc; } diff --git a/drivers/input/rmi4/rmi_f54.c b/drivers/input/rmi4/rmi_f54.c index 0bc01cfc2b51..6b23e679606e 100644 --- a/drivers/input/rmi4/rmi_f54.c +++ b/drivers/input/rmi4/rmi_f54.c @@ -24,6 +24,12 @@ #define F54_NUM_TX_OFFSET 1 #define F54_NUM_RX_OFFSET 0 +/* + * The smbus protocol can read only 32 bytes max at a time. + * But this should be fine for i2c/spi as well. + */ +#define F54_REPORT_DATA_SIZE 32 + /* F54 commands */ #define F54_GET_REPORT 1 #define F54_FORCE_CAL 2 @@ -526,6 +532,7 @@ static void rmi_f54_work(struct work_struct *work) int report_size; u8 command; int error; + int i; report_size = rmi_f54_get_report_size(f54); if (report_size == 0) { @@ -558,23 +565,27 @@ static void rmi_f54_work(struct work_struct *work) rmi_dbg(RMI_DEBUG_FN, &fn->dev, "Get report command completed, reading data\n"); - fifo[0] = 0; - fifo[1] = 0; - error = rmi_write_block(fn->rmi_dev, - fn->fd.data_base_addr + F54_FIFO_OFFSET, - fifo, sizeof(fifo)); - if (error) { - dev_err(&fn->dev, "Failed to set fifo start offset\n"); - goto abort; - } + for (i = 0; i < report_size; i += F54_REPORT_DATA_SIZE) { + int size = min(F54_REPORT_DATA_SIZE, report_size - i); + + fifo[0] = i & 0xff; + fifo[1] = i >> 8; + error = rmi_write_block(fn->rmi_dev, + fn->fd.data_base_addr + F54_FIFO_OFFSET, + fifo, sizeof(fifo)); + if (error) { + dev_err(&fn->dev, "Failed to set fifo start offset\n"); + goto abort; + } - error = rmi_read_block(fn->rmi_dev, fn->fd.data_base_addr + - F54_REPORT_DATA_OFFSET, f54->report_data, - report_size); - if (error) { - dev_err(&fn->dev, "%s: read [%d bytes] returned %d\n", - __func__, report_size, error); - goto abort; + error = rmi_read_block(fn->rmi_dev, fn->fd.data_base_addr + + F54_REPORT_DATA_OFFSET, + f54->report_data + i, size); + if (error) { + dev_err(&fn->dev, "%s: read [%d bytes] returned %d\n", + __func__, size, error); + goto abort; + } } abort: diff --git a/drivers/input/rmi4/rmi_smbus.c b/drivers/input/rmi4/rmi_smbus.c index b313c579914f..2407ea43de59 100644 --- a/drivers/input/rmi4/rmi_smbus.c +++ b/drivers/input/rmi4/rmi_smbus.c @@ -163,6 +163,7 @@ static int rmi_smb_write_block(struct rmi_transport_dev *xport, u16 rmiaddr, /* prepare to write next block of bytes */ cur_len -= SMB_MAX_COUNT; databuff += SMB_MAX_COUNT; + rmiaddr += SMB_MAX_COUNT; } exit: mutex_unlock(&rmi_smb->page_mutex); @@ -214,6 +215,7 @@ static int rmi_smb_read_block(struct rmi_transport_dev *xport, u16 rmiaddr, /* prepare to read next block of bytes */ cur_len -= SMB_MAX_COUNT; databuff += SMB_MAX_COUNT; + rmiaddr += SMB_MAX_COUNT; } retval = 0; diff --git a/drivers/input/tablet/aiptek.c b/drivers/input/tablet/aiptek.c index 2ca586fb914f..e08b0ef078e8 100644 --- a/drivers/input/tablet/aiptek.c +++ b/drivers/input/tablet/aiptek.c @@ -1713,7 +1713,7 @@ aiptek_probe(struct usb_interface *intf, const struct usb_device_id *id) aiptek->inputdev = inputdev; aiptek->intf = intf; - aiptek->ifnum = intf->altsetting[0].desc.bInterfaceNumber; + aiptek->ifnum = intf->cur_altsetting->desc.bInterfaceNumber; aiptek->inDelay = 0; aiptek->endDelay = 0; aiptek->previousJitterable = 0; @@ -1802,14 +1802,14 @@ aiptek_probe(struct usb_interface *intf, const struct usb_device_id *id) input_set_abs_params(inputdev, ABS_WHEEL, AIPTEK_WHEEL_MIN, AIPTEK_WHEEL_MAX - 1, 0, 0); /* Verify that a device really has an endpoint */ - if (intf->altsetting[0].desc.bNumEndpoints < 1) { + if (intf->cur_altsetting->desc.bNumEndpoints < 1) { dev_err(&intf->dev, "interface has %d endpoints, but must have minimum 1\n", - intf->altsetting[0].desc.bNumEndpoints); + intf->cur_altsetting->desc.bNumEndpoints); err = -EINVAL; goto fail3; } - endpoint = &intf->altsetting[0].endpoint[0].desc; + endpoint = &intf->cur_altsetting->endpoint[0].desc; /* Go set up our URB, which is called when the tablet receives * input. diff --git a/drivers/input/tablet/gtco.c b/drivers/input/tablet/gtco.c index 35031228a6d0..96d65575f75a 100644 --- a/drivers/input/tablet/gtco.c +++ b/drivers/input/tablet/gtco.c @@ -875,18 +875,14 @@ static int gtco_probe(struct usb_interface *usbinterface, } /* Sanity check that a device has an endpoint */ - if (usbinterface->altsetting[0].desc.bNumEndpoints < 1) { + if (usbinterface->cur_altsetting->desc.bNumEndpoints < 1) { dev_err(&usbinterface->dev, "Invalid number of endpoints\n"); error = -EINVAL; goto err_free_urb; } - /* - * The endpoint is always altsetting 0, we know this since we know - * this device only has one interrupt endpoint - */ - endpoint = &usbinterface->altsetting[0].endpoint[0].desc; + endpoint = &usbinterface->cur_altsetting->endpoint[0].desc; /* Some debug */ dev_dbg(&usbinterface->dev, "gtco # interfaces: %d\n", usbinterface->num_altsetting); @@ -896,7 +892,8 @@ static int gtco_probe(struct usb_interface *usbinterface, if (usb_endpoint_xfer_int(endpoint)) dev_dbg(&usbinterface->dev, "endpoint: we have interrupt endpoint\n"); - dev_dbg(&usbinterface->dev, "endpoint extra len:%d\n", usbinterface->altsetting[0].extralen); + dev_dbg(&usbinterface->dev, "interface extra len:%d\n", + usbinterface->cur_altsetting->extralen); /* * Find the HID descriptor so we can find out the size of the @@ -973,8 +970,6 @@ static int gtco_probe(struct usb_interface *usbinterface, input_dev->dev.parent = &usbinterface->dev; /* Setup the URB, it will be posted later on open of input device */ - endpoint = &usbinterface->altsetting[0].endpoint[0].desc; - usb_fill_int_urb(gtco->urbinfo, udev, usb_rcvintpipe(udev, diff --git a/drivers/input/tablet/pegasus_notetaker.c b/drivers/input/tablet/pegasus_notetaker.c index a1f3a0cb197e..38f087404f7a 100644 --- a/drivers/input/tablet/pegasus_notetaker.c +++ b/drivers/input/tablet/pegasus_notetaker.c @@ -275,7 +275,7 @@ static int pegasus_probe(struct usb_interface *intf, return -ENODEV; /* Sanity check that the device has an endpoint */ - if (intf->altsetting[0].desc.bNumEndpoints < 1) { + if (intf->cur_altsetting->desc.bNumEndpoints < 1) { dev_err(&intf->dev, "Invalid number of endpoints\n"); return -EINVAL; } diff --git a/drivers/input/touchscreen/sun4i-ts.c b/drivers/input/touchscreen/sun4i-ts.c index 0af0fe8c40d7..742a7e96c1b5 100644 --- a/drivers/input/touchscreen/sun4i-ts.c +++ b/drivers/input/touchscreen/sun4i-ts.c @@ -237,6 +237,7 @@ static int sun4i_ts_probe(struct platform_device *pdev) struct device *dev = &pdev->dev; struct device_node *np = dev->of_node; struct device *hwmon; + struct thermal_zone_device *thermal; int error; u32 reg; bool ts_attached; @@ -355,7 +356,10 @@ static int sun4i_ts_probe(struct platform_device *pdev) if (IS_ERR(hwmon)) return PTR_ERR(hwmon); - devm_thermal_zone_of_sensor_register(ts->dev, 0, ts, &sun4i_ts_tz_ops); + thermal = devm_thermal_zone_of_sensor_register(ts->dev, 0, ts, + &sun4i_ts_tz_ops); + if (IS_ERR(thermal)) + return PTR_ERR(thermal); writel(TEMP_IRQ_EN(1), ts->base + TP_INT_FIFOC); diff --git a/drivers/input/touchscreen/sur40.c b/drivers/input/touchscreen/sur40.c index 1dd47dda71cd..34d31c7ec8ba 100644 --- a/drivers/input/touchscreen/sur40.c +++ b/drivers/input/touchscreen/sur40.c @@ -661,7 +661,7 @@ static int sur40_probe(struct usb_interface *interface, int error; /* Check if we really have the right interface. */ - iface_desc = &interface->altsetting[0]; + iface_desc = interface->cur_altsetting; if (iface_desc->desc.bInterfaceClass != 0xFF) return -ENODEV; diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c index 568c52317757..483f7bc379fa 100644 --- a/drivers/iommu/amd_iommu_init.c +++ b/drivers/iommu/amd_iommu_init.c @@ -1655,27 +1655,39 @@ static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, static void init_iommu_perf_ctr(struct amd_iommu *iommu) { struct pci_dev *pdev = iommu->dev; - u64 val = 0xabcd, val2 = 0; + u64 val = 0xabcd, val2 = 0, save_reg = 0; if (!iommu_feature(iommu, FEATURE_PC)) return; amd_iommu_pc_present = true; + /* save the value to restore, if writable */ + if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false)) + goto pc_false; + /* Check if the performance counters can be written to */ if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) || (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) || - (val != val2)) { - pci_err(pdev, "Unable to write to IOMMU perf counter.\n"); - amd_iommu_pc_present = false; - return; - } + (val != val2)) + goto pc_false; + + /* restore */ + if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true)) + goto pc_false; pci_info(pdev, "IOMMU performance counters supported\n"); val = readl(iommu->mmio_base + MMIO_CNTR_CONF_OFFSET); iommu->max_banks = (u8) ((val >> 12) & 0x3f); iommu->max_counters = (u8) ((val >> 7) & 0xf); + + return; + +pc_false: + pci_err(pdev, "Unable to read/write to IOMMU perf counter.\n"); + amd_iommu_pc_present = false; + return; } static ssize_t amd_iommu_show_cap(struct device *dev, diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 1801f0aaf013..932267f49f9a 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5163,7 +5163,8 @@ static void dmar_remove_one_dev_info(struct device *dev) spin_lock_irqsave(&device_domain_lock, flags); info = dev->archdata.iommu; - if (info) + if (info && info != DEFER_DEVICE_DOMAIN_INFO + && info != DUMMY_DEVICE_DOMAIN_INFO) __dmar_remove_one_dev_info(info); spin_unlock_irqrestore(&device_domain_lock, flags); } diff --git a/drivers/leds/leds-as3645a.c b/drivers/leds/leds-as3645a.c index b7e0ae1af8fa..e8922fa03379 100644 --- a/drivers/leds/leds-as3645a.c +++ b/drivers/leds/leds-as3645a.c @@ -493,16 +493,17 @@ static int as3645a_parse_node(struct as3645a *flash, switch (id) { case AS_LED_FLASH: flash->flash_node = child; + fwnode_handle_get(child); break; case AS_LED_INDICATOR: flash->indicator_node = child; + fwnode_handle_get(child); break; default: dev_warn(&flash->client->dev, "unknown LED %u encountered, ignoring\n", id); break; } - fwnode_handle_get(child); } if (!flash->flash_node) { diff --git a/drivers/leds/leds-gpio.c b/drivers/leds/leds-gpio.c index a5c73f3d5f79..2bf74595610f 100644 --- a/drivers/leds/leds-gpio.c +++ b/drivers/leds/leds-gpio.c @@ -151,9 +151,14 @@ static struct gpio_leds_priv *gpio_leds_create(struct platform_device *pdev) struct gpio_led led = {}; const char *state = NULL; + /* + * Acquire gpiod from DT with uninitialized label, which + * will be updated after LED class device is registered, + * Only then the final LED name is known. + */ led.gpiod = devm_fwnode_get_gpiod_from_child(dev, NULL, child, GPIOD_ASIS, - led.name); + NULL); if (IS_ERR(led.gpiod)) { fwnode_handle_put(child); return ERR_CAST(led.gpiod); @@ -186,6 +191,9 @@ static struct gpio_leds_priv *gpio_leds_create(struct platform_device *pdev) fwnode_handle_put(child); return ERR_PTR(ret); } + /* Set gpiod label to match the corresponding LED name. */ + gpiod_set_consumer_name(led_dat->gpiod, + led_dat->cdev.dev->kobj.name); priv->num_leds++; } diff --git a/drivers/leds/leds-lm3532.c b/drivers/leds/leds-lm3532.c index 0507c6575c08..491268bb34a7 100644 --- a/drivers/leds/leds-lm3532.c +++ b/drivers/leds/leds-lm3532.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 // TI LM3532 LED driver // Copyright (C) 2019 Texas Instruments Incorporated - http://www.ti.com/ +// http://www.ti.com/lit/ds/symlink/lm3532.pdf #include <linux/i2c.h> #include <linux/leds.h> @@ -623,7 +624,7 @@ static int lm3532_parse_node(struct lm3532_data *priv) led->num_leds = fwnode_property_count_u32(child, "led-sources"); if (led->num_leds > LM3532_MAX_LED_STRINGS) { - dev_err(&priv->client->dev, "To many LED string defined\n"); + dev_err(&priv->client->dev, "Too many LED string defined\n"); continue; } diff --git a/drivers/leds/leds-max77650.c b/drivers/leds/leds-max77650.c index 4c2d0b3c6dad..a0d4b725c917 100644 --- a/drivers/leds/leds-max77650.c +++ b/drivers/leds/leds-max77650.c @@ -135,9 +135,16 @@ err_node_put: return rv; } +static const struct of_device_id max77650_led_of_match[] = { + { .compatible = "maxim,max77650-led" }, + { } +}; +MODULE_DEVICE_TABLE(of, max77650_led_of_match); + static struct platform_driver max77650_led_driver = { .driver = { .name = "max77650-led", + .of_match_table = max77650_led_of_match, }, .probe = max77650_led_probe, }; diff --git a/drivers/leds/leds-rb532.c b/drivers/leds/leds-rb532.c index db5af83f0cec..b6447c1721b4 100644 --- a/drivers/leds/leds-rb532.c +++ b/drivers/leds/leds-rb532.c @@ -21,7 +21,6 @@ static void rb532_led_set(struct led_classdev *cdev, { if (brightness) set_latch_u5(LO_ULED, 0); - else set_latch_u5(0, LO_ULED); } diff --git a/drivers/leds/trigger/ledtrig-pattern.c b/drivers/leds/trigger/ledtrig-pattern.c index 718729c89440..3abcafe46278 100644 --- a/drivers/leds/trigger/ledtrig-pattern.c +++ b/drivers/leds/trigger/ledtrig-pattern.c @@ -455,7 +455,7 @@ static void __exit pattern_trig_exit(void) module_init(pattern_trig_init); module_exit(pattern_trig_exit); -MODULE_AUTHOR("Raphael Teysseyre <rteysseyre@gmail.com"); -MODULE_AUTHOR("Baolin Wang <baolin.wang@linaro.org"); +MODULE_AUTHOR("Raphael Teysseyre <rteysseyre@gmail.com>"); +MODULE_AUTHOR("Baolin Wang <baolin.wang@linaro.org>"); MODULE_DESCRIPTION("LED Pattern trigger"); MODULE_LICENSE("GPL v2"); diff --git a/drivers/mmc/host/sdhci-tegra.c b/drivers/mmc/host/sdhci-tegra.c index 7bc950520fd9..403ac44a7378 100644 --- a/drivers/mmc/host/sdhci-tegra.c +++ b/drivers/mmc/host/sdhci-tegra.c @@ -386,7 +386,7 @@ static void tegra_sdhci_reset(struct sdhci_host *host, u8 mask) misc_ctrl |= SDHCI_MISC_CTRL_ENABLE_DDR50; if (soc_data->nvquirks & NVQUIRK_ENABLE_SDR104) misc_ctrl |= SDHCI_MISC_CTRL_ENABLE_SDR104; - if (soc_data->nvquirks & SDHCI_MISC_CTRL_ENABLE_SDR50) + if (soc_data->nvquirks & NVQUIRK_ENABLE_SDR50) clk_ctrl |= SDHCI_CLOCK_CTRL_SDR50_TUNING_OVERRIDE; } diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c index 1b1c26da3fe0..659a9459ace3 100644 --- a/drivers/mmc/host/sdhci.c +++ b/drivers/mmc/host/sdhci.c @@ -3913,11 +3913,13 @@ int sdhci_setup_host(struct sdhci_host *host) if (host->ops->get_min_clock) mmc->f_min = host->ops->get_min_clock(host); else if (host->version >= SDHCI_SPEC_300) { - if (host->clk_mul) { - mmc->f_min = (host->max_clk * host->clk_mul) / 1024; + if (host->clk_mul) max_clk = host->max_clk * host->clk_mul; - } else - mmc->f_min = host->max_clk / SDHCI_MAX_DIV_SPEC_300; + /* + * Divided Clock Mode minimum clock rate is always less than + * Programmable Clock Mode minimum clock rate. + */ + mmc->f_min = host->max_clk / SDHCI_MAX_DIV_SPEC_300; } else mmc->f_min = host->max_clk / SDHCI_MAX_DIV_SPEC_200; diff --git a/drivers/mmc/host/sdhci_am654.c b/drivers/mmc/host/sdhci_am654.c index b8e897e31e2e..b8fe94fd9525 100644 --- a/drivers/mmc/host/sdhci_am654.c +++ b/drivers/mmc/host/sdhci_am654.c @@ -240,6 +240,35 @@ static void sdhci_am654_write_b(struct sdhci_host *host, u8 val, int reg) writeb(val, host->ioaddr + reg); } +static int sdhci_am654_execute_tuning(struct mmc_host *mmc, u32 opcode) +{ + struct sdhci_host *host = mmc_priv(mmc); + int err = sdhci_execute_tuning(mmc, opcode); + + if (err) + return err; + /* + * Tuning data remains in the buffer after tuning. + * Do a command and data reset to get rid of it + */ + sdhci_reset(host, SDHCI_RESET_CMD | SDHCI_RESET_DATA); + + return 0; +} + +static u32 sdhci_am654_cqhci_irq(struct sdhci_host *host, u32 intmask) +{ + int cmd_error = 0; + int data_error = 0; + + if (!sdhci_cqe_irq(host, intmask, &cmd_error, &data_error)) + return intmask; + + cqhci_irq(host->mmc, intmask, cmd_error, data_error); + + return 0; +} + static struct sdhci_ops sdhci_am654_ops = { .get_max_clock = sdhci_pltfm_clk_get_max_clock, .get_timeout_clock = sdhci_pltfm_clk_get_max_clock, @@ -248,13 +277,13 @@ static struct sdhci_ops sdhci_am654_ops = { .set_power = sdhci_am654_set_power, .set_clock = sdhci_am654_set_clock, .write_b = sdhci_am654_write_b, + .irq = sdhci_am654_cqhci_irq, .reset = sdhci_reset, }; static const struct sdhci_pltfm_data sdhci_am654_pdata = { .ops = &sdhci_am654_ops, - .quirks = SDHCI_QUIRK_INVERTED_WRITE_PROTECT | - SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12, + .quirks = SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12, .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN, }; @@ -263,19 +292,6 @@ static const struct sdhci_am654_driver_data sdhci_am654_drvdata = { .flags = IOMUX_PRESENT | FREQSEL_2_BIT | STRBSEL_4_BIT | DLL_PRESENT, }; -static u32 sdhci_am654_cqhci_irq(struct sdhci_host *host, u32 intmask) -{ - int cmd_error = 0; - int data_error = 0; - - if (!sdhci_cqe_irq(host, intmask, &cmd_error, &data_error)) - return intmask; - - cqhci_irq(host->mmc, intmask, cmd_error, data_error); - - return 0; -} - static struct sdhci_ops sdhci_j721e_8bit_ops = { .get_max_clock = sdhci_pltfm_clk_get_max_clock, .get_timeout_clock = sdhci_pltfm_clk_get_max_clock, @@ -290,8 +306,7 @@ static struct sdhci_ops sdhci_j721e_8bit_ops = { static const struct sdhci_pltfm_data sdhci_j721e_8bit_pdata = { .ops = &sdhci_j721e_8bit_ops, - .quirks = SDHCI_QUIRK_INVERTED_WRITE_PROTECT | - SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12, + .quirks = SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12, .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN, }; @@ -314,8 +329,7 @@ static struct sdhci_ops sdhci_j721e_4bit_ops = { static const struct sdhci_pltfm_data sdhci_j721e_4bit_pdata = { .ops = &sdhci_j721e_4bit_ops, - .quirks = SDHCI_QUIRK_INVERTED_WRITE_PROTECT | - SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12, + .quirks = SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12, .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN, }; @@ -549,6 +563,8 @@ static int sdhci_am654_probe(struct platform_device *pdev) goto pm_runtime_put; } + host->mmc_host_ops.execute_tuning = sdhci_am654_execute_tuning; + ret = sdhci_am654_init(host); if (ret) goto pm_runtime_put; diff --git a/drivers/net/can/slcan.c b/drivers/net/can/slcan.c index 2e57122f02fb..2f5c287eac95 100644 --- a/drivers/net/can/slcan.c +++ b/drivers/net/can/slcan.c @@ -344,9 +344,16 @@ static void slcan_transmit(struct work_struct *work) */ static void slcan_write_wakeup(struct tty_struct *tty) { - struct slcan *sl = tty->disc_data; + struct slcan *sl; + + rcu_read_lock(); + sl = rcu_dereference(tty->disc_data); + if (!sl) + goto out; schedule_work(&sl->tx_work); +out: + rcu_read_unlock(); } /* Send a can_frame to a TTY queue. */ @@ -644,10 +651,11 @@ static void slcan_close(struct tty_struct *tty) return; spin_lock_bh(&sl->lock); - tty->disc_data = NULL; + rcu_assign_pointer(tty->disc_data, NULL); sl->tty = NULL; spin_unlock_bh(&sl->lock); + synchronize_rcu(); flush_work(&sl->tx_work); /* Flush network side */ diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c index e284b6753725..6aee2f0fc0db 100644 --- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c +++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c @@ -2020,7 +2020,7 @@ static int xgene_enet_probe(struct platform_device *pdev) int ret; ndev = alloc_etherdev_mqs(sizeof(struct xgene_enet_pdata), - XGENE_NUM_RX_RING, XGENE_NUM_TX_RING); + XGENE_NUM_TX_RING, XGENE_NUM_RX_RING); if (!ndev) return -ENOMEM; diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 198c69dceeef..483935b001c8 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -6998,7 +6998,6 @@ static int __bnxt_hwrm_func_qcaps(struct bnxt *bp) pf->fw_fid = le16_to_cpu(resp->fid); pf->port_id = le16_to_cpu(resp->port_id); - bp->dev->dev_port = pf->port_id; memcpy(pf->mac_addr, resp->mac_address, ETH_ALEN); pf->first_vf_id = le16_to_cpu(resp->first_vf_id); pf->max_vfs = le16_to_cpu(resp->max_vfs); @@ -7289,6 +7288,7 @@ static int bnxt_hwrm_ver_get(struct bnxt *bp) bp->hwrm_max_ext_req_len = HWRM_MAX_REQ_LEN; bp->chip_num = le16_to_cpu(resp->chip_num); + bp->chip_rev = resp->chip_rev; if (bp->chip_num == CHIP_NUM_58700 && !resp->chip_rev && !resp->chip_metal) bp->flags |= BNXT_FLAG_CHIP_NITRO_A0; @@ -9064,7 +9064,7 @@ static int bnxt_update_phy_setting(struct bnxt *bp) /* The last close may have shutdown the link, so need to call * PHY_CFG to bring it back up. */ - if (!netif_carrier_ok(bp->dev)) + if (!bp->link_info.link_up) update_link = true; if (!bnxt_eee_config_ok(bp)) @@ -10041,6 +10041,13 @@ static void bnxt_timer(struct timer_list *t) bnxt_queue_sp_work(bp); } +#ifdef CONFIG_RFS_ACCEL + if ((bp->flags & BNXT_FLAG_RFS) && bp->ntp_fltr_count) { + set_bit(BNXT_RX_NTP_FLTR_SP_EVENT, &bp->sp_event); + bnxt_queue_sp_work(bp); + } +#endif /*CONFIG_RFS_ACCEL*/ + if (bp->link_info.phy_retry) { if (time_after(jiffies, bp->link_info.phy_retry_expires)) { bp->link_info.phy_retry = false; @@ -10051,7 +10058,8 @@ static void bnxt_timer(struct timer_list *t) } } - if ((bp->flags & BNXT_FLAG_CHIP_P5) && netif_carrier_ok(dev)) { + if ((bp->flags & BNXT_FLAG_CHIP_P5) && !bp->chip_rev && + netif_carrier_ok(dev)) { set_bit(BNXT_RING_COAL_NOW_SP_EVENT, &bp->sp_event); bnxt_queue_sp_work(bp); } @@ -10569,7 +10577,7 @@ static void bnxt_set_dflt_rss_hash_type(struct bnxt *bp) VNIC_RSS_CFG_REQ_HASH_TYPE_TCP_IPV4 | VNIC_RSS_CFG_REQ_HASH_TYPE_IPV6 | VNIC_RSS_CFG_REQ_HASH_TYPE_TCP_IPV6; - if (BNXT_CHIP_P4(bp) && bp->hwrm_spec_code >= 0x10501) { + if (BNXT_CHIP_P4_PLUS(bp) && bp->hwrm_spec_code >= 0x10501) { bp->flags |= BNXT_FLAG_UDP_RSS_CAP; bp->rss_hash_cfg |= VNIC_RSS_CFG_REQ_HASH_TYPE_UDP_IPV4 | VNIC_RSS_CFG_REQ_HASH_TYPE_UDP_IPV6; @@ -11101,6 +11109,7 @@ static int bnxt_rx_flow_steer(struct net_device *dev, const struct sk_buff *skb, struct ethhdr *eth = (struct ethhdr *)skb_mac_header(skb); int rc = 0, idx, bit_id, l2_idx = 0; struct hlist_head *head; + u32 flags; if (!ether_addr_equal(dev->dev_addr, eth->h_dest)) { struct bnxt_vnic_info *vnic = &bp->vnic_info[0]; @@ -11140,8 +11149,9 @@ static int bnxt_rx_flow_steer(struct net_device *dev, const struct sk_buff *skb, rc = -EPROTONOSUPPORT; goto err_free; } - if ((fkeys->control.flags & FLOW_DIS_ENCAPSULATION) && - bp->hwrm_spec_code < 0x10601) { + flags = fkeys->control.flags; + if (((flags & FLOW_DIS_ENCAPSULATION) && + bp->hwrm_spec_code < 0x10601) || (flags & FLOW_DIS_IS_FRAGMENT)) { rc = -EPROTONOSUPPORT; goto err_free; } @@ -11378,8 +11388,8 @@ int bnxt_get_port_parent_id(struct net_device *dev, if (!BNXT_PF(bp) || !(bp->flags & BNXT_FLAG_DSN_VALID)) return -EOPNOTSUPP; - ppid->id_len = sizeof(bp->switch_id); - memcpy(ppid->id, bp->switch_id, ppid->id_len); + ppid->id_len = sizeof(bp->dsn); + memcpy(ppid->id, bp->dsn, ppid->id_len); return 0; } @@ -11435,9 +11445,9 @@ static void bnxt_remove_one(struct pci_dev *pdev) bnxt_sriov_disable(bp); bnxt_dl_fw_reporters_destroy(bp, true); - bnxt_dl_unregister(bp); pci_disable_pcie_error_reporting(pdev); unregister_netdev(dev); + bnxt_dl_unregister(bp); bnxt_shutdown_tc(bp); bnxt_cancel_sp_work(bp); bp->sp_event = 0; @@ -11471,6 +11481,9 @@ static int bnxt_probe_phy(struct bnxt *bp, bool fw_dflt) rc); return rc; } + if (!fw_dflt) + return 0; + rc = bnxt_update_link(bp, false); if (rc) { netdev_err(bp->dev, "Probe phy can't update link (rc: %x)\n", @@ -11484,9 +11497,6 @@ static int bnxt_probe_phy(struct bnxt *bp, bool fw_dflt) if (link_info->auto_link_speeds && !link_info->support_auto_speeds) link_info->support_auto_speeds = link_info->support_speeds; - if (!fw_dflt) - return 0; - bnxt_init_ethtool_link_settings(bp); return 0; } @@ -11860,7 +11870,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) if (BNXT_PF(bp)) { /* Read the adapter's DSN to use as the eswitch switch_id */ - bnxt_pcie_dsn_get(bp, bp->switch_id); + rc = bnxt_pcie_dsn_get(bp, bp->dsn); } /* MTU range: 60 - FW defined max */ @@ -11907,11 +11917,14 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) bnxt_init_tc(bp); } + bnxt_dl_register(bp); + rc = register_netdev(dev); if (rc) - goto init_err_cleanup_tc; + goto init_err_cleanup; - bnxt_dl_register(bp); + if (BNXT_PF(bp)) + devlink_port_type_eth_set(&bp->dl_port, bp->dev); bnxt_dl_fw_reporters_create(bp); netdev_info(dev, "%s found at mem %lx, node addr %pM\n", @@ -11921,7 +11934,8 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) return 0; -init_err_cleanup_tc: +init_err_cleanup: + bnxt_dl_unregister(bp); bnxt_shutdown_tc(bp); bnxt_clear_int_mode(bp); diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index f14335433a64..cabef0b4f5fb 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -1457,6 +1457,8 @@ struct bnxt { #define CHIP_NUM_58804 0xd804 #define CHIP_NUM_58808 0xd808 + u8 chip_rev; + #define BNXT_CHIP_NUM_5730X(chip_num) \ ((chip_num) >= CHIP_NUM_57301 && \ (chip_num) <= CHIP_NUM_57304) @@ -1846,7 +1848,7 @@ struct bnxt { enum devlink_eswitch_mode eswitch_mode; struct bnxt_vf_rep **vf_reps; /* array of vf-rep ptrs */ u16 *cfa_code_map; /* cfa_code -> vf_idx map */ - u8 switch_id[8]; + u8 dsn[8]; struct bnxt_tc_info *tc_info; struct list_head tc_indr_block_list; struct notifier_block tc_netdev_nb; diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c index 0c3d224637b9..eec0168330b7 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c @@ -21,6 +21,7 @@ bnxt_dl_flash_update(struct devlink *dl, const char *filename, const char *region, struct netlink_ext_ack *extack) { struct bnxt *bp = bnxt_get_bp_from_dl(dl); + int rc; if (region) return -EOPNOTSUPP; @@ -31,7 +32,18 @@ bnxt_dl_flash_update(struct devlink *dl, const char *filename, return -EPERM; } - return bnxt_flash_package_from_file(bp->dev, filename, 0); + devlink_flash_update_begin_notify(dl); + devlink_flash_update_status_notify(dl, "Preparing to flash", region, 0, + 0); + rc = bnxt_flash_package_from_file(bp->dev, filename, 0); + if (!rc) + devlink_flash_update_status_notify(dl, "Flashing done", region, + 0, 0); + else + devlink_flash_update_status_notify(dl, "Flashing failed", + region, 0, 0); + devlink_flash_update_end_notify(dl); + return rc; } static int bnxt_fw_reporter_diagnose(struct devlink_health_reporter *reporter, @@ -272,11 +284,15 @@ void bnxt_dl_health_recovery_done(struct bnxt *bp) devlink_health_reporter_recovery_done(hlth->fw_reset_reporter); } +static int bnxt_dl_info_get(struct devlink *dl, struct devlink_info_req *req, + struct netlink_ext_ack *extack); + static const struct devlink_ops bnxt_dl_ops = { #ifdef CONFIG_BNXT_SRIOV .eswitch_mode_set = bnxt_dl_eswitch_mode_set, .eswitch_mode_get = bnxt_dl_eswitch_mode_get, #endif /* CONFIG_BNXT_SRIOV */ + .info_get = bnxt_dl_info_get, .flash_update = bnxt_dl_flash_update, }; @@ -343,6 +359,136 @@ static void bnxt_copy_from_nvm_data(union devlink_param_value *dst, dst->vu8 = (u8)val32; } +static int bnxt_hwrm_get_nvm_cfg_ver(struct bnxt *bp, + union devlink_param_value *nvm_cfg_ver) +{ + struct hwrm_nvm_get_variable_input req = {0}; + union bnxt_nvm_data *data; + dma_addr_t data_dma_addr; + int rc; + + bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_NVM_GET_VARIABLE, -1, -1); + data = dma_alloc_coherent(&bp->pdev->dev, sizeof(*data), + &data_dma_addr, GFP_KERNEL); + if (!data) + return -ENOMEM; + + req.dest_data_addr = cpu_to_le64(data_dma_addr); + req.data_len = cpu_to_le16(BNXT_NVM_CFG_VER_BITS); + req.option_num = cpu_to_le16(NVM_OFF_NVM_CFG_VER); + + rc = hwrm_send_message_silent(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT); + if (!rc) + bnxt_copy_from_nvm_data(nvm_cfg_ver, data, + BNXT_NVM_CFG_VER_BITS, + BNXT_NVM_CFG_VER_BYTES); + + dma_free_coherent(&bp->pdev->dev, sizeof(*data), data, data_dma_addr); + return rc; +} + +static int bnxt_dl_info_get(struct devlink *dl, struct devlink_info_req *req, + struct netlink_ext_ack *extack) +{ + struct bnxt *bp = bnxt_get_bp_from_dl(dl); + union devlink_param_value nvm_cfg_ver; + struct hwrm_ver_get_output *ver_resp; + char mgmt_ver[FW_VER_STR_LEN]; + char roce_ver[FW_VER_STR_LEN]; + char fw_ver[FW_VER_STR_LEN]; + char buf[32]; + int rc; + + rc = devlink_info_driver_name_put(req, DRV_MODULE_NAME); + if (rc) + return rc; + + sprintf(buf, "%X", bp->chip_num); + rc = devlink_info_version_fixed_put(req, + DEVLINK_INFO_VERSION_GENERIC_ASIC_ID, buf); + if (rc) + return rc; + + ver_resp = &bp->ver_resp; + sprintf(buf, "%X", ver_resp->chip_rev); + rc = devlink_info_version_fixed_put(req, + DEVLINK_INFO_VERSION_GENERIC_ASIC_REV, buf); + if (rc) + return rc; + + if (BNXT_PF(bp)) { + sprintf(buf, "%02X-%02X-%02X-%02X-%02X-%02X-%02X-%02X", + bp->dsn[7], bp->dsn[6], bp->dsn[5], bp->dsn[4], + bp->dsn[3], bp->dsn[2], bp->dsn[1], bp->dsn[0]); + rc = devlink_info_serial_number_put(req, buf); + if (rc) + return rc; + } + + if (strlen(ver_resp->active_pkg_name)) { + rc = + devlink_info_version_running_put(req, + DEVLINK_INFO_VERSION_GENERIC_FW, + ver_resp->active_pkg_name); + if (rc) + return rc; + } + + if (BNXT_PF(bp) && !bnxt_hwrm_get_nvm_cfg_ver(bp, &nvm_cfg_ver)) { + u32 ver = nvm_cfg_ver.vu32; + + sprintf(buf, "%X.%X.%X", (ver >> 16) & 0xF, (ver >> 8) & 0xF, + ver & 0xF); + rc = devlink_info_version_running_put(req, + DEVLINK_INFO_VERSION_GENERIC_FW_PSID, buf); + if (rc) + return rc; + } + + if (ver_resp->flags & VER_GET_RESP_FLAGS_EXT_VER_AVAIL) { + snprintf(fw_ver, FW_VER_STR_LEN, "%d.%d.%d.%d", + ver_resp->hwrm_fw_major, ver_resp->hwrm_fw_minor, + ver_resp->hwrm_fw_build, ver_resp->hwrm_fw_patch); + + snprintf(mgmt_ver, FW_VER_STR_LEN, "%d.%d.%d.%d", + ver_resp->mgmt_fw_major, ver_resp->mgmt_fw_minor, + ver_resp->mgmt_fw_build, ver_resp->mgmt_fw_patch); + + snprintf(roce_ver, FW_VER_STR_LEN, "%d.%d.%d.%d", + ver_resp->roce_fw_major, ver_resp->roce_fw_minor, + ver_resp->roce_fw_build, ver_resp->roce_fw_patch); + } else { + snprintf(fw_ver, FW_VER_STR_LEN, "%d.%d.%d.%d", + ver_resp->hwrm_fw_maj_8b, ver_resp->hwrm_fw_min_8b, + ver_resp->hwrm_fw_bld_8b, ver_resp->hwrm_fw_rsvd_8b); + + snprintf(mgmt_ver, FW_VER_STR_LEN, "%d.%d.%d.%d", + ver_resp->mgmt_fw_maj_8b, ver_resp->mgmt_fw_min_8b, + ver_resp->mgmt_fw_bld_8b, ver_resp->mgmt_fw_rsvd_8b); + + snprintf(roce_ver, FW_VER_STR_LEN, "%d.%d.%d.%d", + ver_resp->roce_fw_maj_8b, ver_resp->roce_fw_min_8b, + ver_resp->roce_fw_bld_8b, ver_resp->roce_fw_rsvd_8b); + } + rc = devlink_info_version_running_put(req, + DEVLINK_INFO_VERSION_GENERIC_FW_APP, fw_ver); + if (rc) + return rc; + + if (!(bp->flags & BNXT_FLAG_CHIP_P5)) { + rc = devlink_info_version_running_put(req, + DEVLINK_INFO_VERSION_GENERIC_FW_MGMT, mgmt_ver); + if (rc) + return rc; + + rc = devlink_info_version_running_put(req, + DEVLINK_INFO_VERSION_GENERIC_FW_ROCE, roce_ver); + if (rc) + return rc; + } + return 0; +} + static int bnxt_hwrm_nvm_req(struct bnxt *bp, u32 param_id, void *msg, int msg_len, union devlink_param_value *val) { @@ -485,15 +631,48 @@ static const struct devlink_param bnxt_dl_params[] = { static const struct devlink_param bnxt_dl_port_params[] = { }; -int bnxt_dl_register(struct bnxt *bp) +static int bnxt_dl_params_register(struct bnxt *bp) { - struct devlink *dl; int rc; - if (bp->hwrm_spec_code < 0x10600) { - netdev_warn(bp->dev, "Firmware does not support NVM params"); - return -ENOTSUPP; + if (bp->hwrm_spec_code < 0x10600) + return 0; + + rc = devlink_params_register(bp->dl, bnxt_dl_params, + ARRAY_SIZE(bnxt_dl_params)); + if (rc) { + netdev_warn(bp->dev, "devlink_params_register failed. rc=%d", + rc); + return rc; + } + rc = devlink_port_params_register(&bp->dl_port, bnxt_dl_port_params, + ARRAY_SIZE(bnxt_dl_port_params)); + if (rc) { + netdev_err(bp->dev, "devlink_port_params_register failed"); + devlink_params_unregister(bp->dl, bnxt_dl_params, + ARRAY_SIZE(bnxt_dl_params)); + return rc; } + devlink_params_publish(bp->dl); + + return 0; +} + +static void bnxt_dl_params_unregister(struct bnxt *bp) +{ + if (bp->hwrm_spec_code < 0x10600) + return; + + devlink_params_unregister(bp->dl, bnxt_dl_params, + ARRAY_SIZE(bnxt_dl_params)); + devlink_port_params_unregister(&bp->dl_port, bnxt_dl_port_params, + ARRAY_SIZE(bnxt_dl_port_params)); +} + +int bnxt_dl_register(struct bnxt *bp) +{ + struct devlink *dl; + int rc; if (BNXT_PF(bp)) dl = devlink_alloc(&bnxt_dl_ops, sizeof(struct bnxt_dl)); @@ -520,40 +699,23 @@ int bnxt_dl_register(struct bnxt *bp) if (!BNXT_PF(bp)) return 0; - rc = devlink_params_register(dl, bnxt_dl_params, - ARRAY_SIZE(bnxt_dl_params)); - if (rc) { - netdev_warn(bp->dev, "devlink_params_register failed. rc=%d", - rc); - goto err_dl_unreg; - } - devlink_port_attrs_set(&bp->dl_port, DEVLINK_PORT_FLAVOUR_PHYSICAL, - bp->pf.port_id, false, 0, - bp->switch_id, sizeof(bp->switch_id)); + bp->pf.port_id, false, 0, bp->dsn, + sizeof(bp->dsn)); rc = devlink_port_register(dl, &bp->dl_port, bp->pf.port_id); if (rc) { netdev_err(bp->dev, "devlink_port_register failed"); - goto err_dl_param_unreg; + goto err_dl_unreg; } - devlink_port_type_eth_set(&bp->dl_port, bp->dev); - rc = devlink_port_params_register(&bp->dl_port, bnxt_dl_port_params, - ARRAY_SIZE(bnxt_dl_port_params)); - if (rc) { - netdev_err(bp->dev, "devlink_port_params_register failed"); + rc = bnxt_dl_params_register(bp); + if (rc) goto err_dl_port_unreg; - } - - devlink_params_publish(dl); return 0; err_dl_port_unreg: devlink_port_unregister(&bp->dl_port); -err_dl_param_unreg: - devlink_params_unregister(dl, bnxt_dl_params, - ARRAY_SIZE(bnxt_dl_params)); err_dl_unreg: devlink_unregister(dl); err_dl_free: @@ -570,12 +732,8 @@ void bnxt_dl_unregister(struct bnxt *bp) return; if (BNXT_PF(bp)) { - devlink_port_params_unregister(&bp->dl_port, - bnxt_dl_port_params, - ARRAY_SIZE(bnxt_dl_port_params)); + bnxt_dl_params_unregister(bp); devlink_port_unregister(&bp->dl_port); - devlink_params_unregister(dl, bnxt_dl_params, - ARRAY_SIZE(bnxt_dl_params)); } devlink_unregister(dl); devlink_free(dl); diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h index 08aaa4441c78..95f893f2a74d 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.h @@ -38,6 +38,10 @@ static inline void bnxt_link_bp_to_dl(struct bnxt *bp, struct devlink *dl) #define NVM_OFF_IGNORE_ARI 164 #define NVM_OFF_DIS_GRE_VER_CHECK 171 #define NVM_OFF_ENABLE_SRIOV 401 +#define NVM_OFF_NVM_CFG_VER 602 + +#define BNXT_NVM_CFG_VER_BITS 24 +#define BNXT_NVM_CFG_VER_BYTES 4 #define BNXT_MSIX_VEC_MAX 1280 #define BNXT_MSIX_VEC_MIN_MAX 128 diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c index 08d56ec7b68a..6171fa8b3677 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c @@ -1462,15 +1462,15 @@ static int bnxt_get_link_ksettings(struct net_device *dev, ethtool_link_ksettings_add_link_mode(lk_ksettings, advertising, Autoneg); base->autoneg = AUTONEG_ENABLE; - if (link_info->phy_link_status == BNXT_LINK_LINK) + base->duplex = DUPLEX_UNKNOWN; + if (link_info->phy_link_status == BNXT_LINK_LINK) { bnxt_fw_to_ethtool_lp_adv(link_info, lk_ksettings); + if (link_info->duplex & BNXT_LINK_DUPLEX_FULL) + base->duplex = DUPLEX_FULL; + else + base->duplex = DUPLEX_HALF; + } ethtool_speed = bnxt_fw_to_ethtool_speed(link_info->link_speed); - if (!netif_carrier_ok(dev)) - base->duplex = DUPLEX_UNKNOWN; - else if (link_info->duplex & BNXT_LINK_DUPLEX_FULL) - base->duplex = DUPLEX_FULL; - else - base->duplex = DUPLEX_HALF; } else { base->autoneg = AUTONEG_DISABLE; ethtool_speed = @@ -2707,7 +2707,7 @@ static int bnxt_disable_an_for_lpbk(struct bnxt *bp, return rc; fw_speed = PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_1GB; - if (netif_carrier_ok(bp->dev)) + if (bp->link_info.link_up) fw_speed = bp->link_info.link_speed; else if (fw_advertising & BNXT_LINK_SPEED_MSK_10GB) fw_speed = PORT_PHY_CFG_REQ_FORCE_LINK_SPEED_10GB; diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c index 013959285a61..e50a15397e11 100644 --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c @@ -2157,8 +2157,8 @@ static void bcmgenet_init_tx_ring(struct bcmgenet_priv *priv, DMA_END_ADDR); /* Initialize Tx NAPI */ - netif_napi_add(priv->dev, &ring->napi, bcmgenet_tx_poll, - NAPI_POLL_WEIGHT); + netif_tx_napi_add(priv->dev, &ring->napi, bcmgenet_tx_poll, + NAPI_POLL_WEIGHT); } /* Initialize a RDMA ring */ diff --git a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c index 58f89f6a040f..97ff8608f0ab 100644 --- a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c +++ b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c @@ -2448,6 +2448,8 @@ static int cxgb_extension_ioctl(struct net_device *dev, void __user *useraddr) if (!is_offload(adapter)) return -EOPNOTSUPP; + if (!capable(CAP_NET_ADMIN)) + return -EPERM; if (!(adapter->flags & FULL_INIT_DONE)) return -EIO; /* need the memory controllers */ if (copy_from_user(&t, useraddr, sizeof(t))) diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c index ee3aab563b1d..9d1f2f88b945 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c @@ -70,8 +70,7 @@ static void *seq_tab_start(struct seq_file *seq, loff_t *pos) static void *seq_tab_next(struct seq_file *seq, void *v, loff_t *pos) { v = seq_tab_get_idx(seq->private, *pos + 1); - if (v) - ++*pos; + ++(*pos); return v; } diff --git a/drivers/net/ethernet/chelsio/cxgb4/l2t.c b/drivers/net/ethernet/chelsio/cxgb4/l2t.c index e9e45006632d..1a16449e9deb 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/l2t.c +++ b/drivers/net/ethernet/chelsio/cxgb4/l2t.c @@ -678,8 +678,7 @@ static void *l2t_seq_start(struct seq_file *seq, loff_t *pos) static void *l2t_seq_next(struct seq_file *seq, void *v, loff_t *pos) { v = l2t_get_idx(seq, *pos); - if (v) - ++*pos; + ++(*pos); return v; } diff --git a/drivers/net/ethernet/freescale/fman/fman_memac.c b/drivers/net/ethernet/freescale/fman/fman_memac.c index 41c6fa200e74..e1901874c19f 100644 --- a/drivers/net/ethernet/freescale/fman/fman_memac.c +++ b/drivers/net/ethernet/freescale/fman/fman_memac.c @@ -110,7 +110,7 @@ do { \ /* Interface Mode Register (IF_MODE) */ #define IF_MODE_MASK 0x00000003 /* 30-31 Mask on i/f mode bits */ -#define IF_MODE_XGMII 0x00000000 /* 30-31 XGMII (10G) interface */ +#define IF_MODE_10G 0x00000000 /* 30-31 10G interface */ #define IF_MODE_GMII 0x00000002 /* 30-31 GMII (1G) interface */ #define IF_MODE_RGMII 0x00000004 #define IF_MODE_RGMII_AUTO 0x00008000 @@ -440,7 +440,7 @@ static int init(struct memac_regs __iomem *regs, struct memac_cfg *cfg, tmp = 0; switch (phy_if) { case PHY_INTERFACE_MODE_XGMII: - tmp |= IF_MODE_XGMII; + tmp |= IF_MODE_10G; break; default: tmp |= IF_MODE_GMII; diff --git a/drivers/net/ethernet/freescale/xgmac_mdio.c b/drivers/net/ethernet/freescale/xgmac_mdio.c index e03b30c60dcf..c82c85ef5fb3 100644 --- a/drivers/net/ethernet/freescale/xgmac_mdio.c +++ b/drivers/net/ethernet/freescale/xgmac_mdio.c @@ -49,6 +49,7 @@ struct tgec_mdio_controller { struct mdio_fsl_priv { struct tgec_mdio_controller __iomem *mdio_base; bool is_little_endian; + bool has_a011043; }; static u32 xgmac_read32(void __iomem *regs, @@ -226,7 +227,8 @@ static int xgmac_mdio_read(struct mii_bus *bus, int phy_id, int regnum) return ret; /* Return all Fs if nothing was there */ - if (xgmac_read32(®s->mdio_stat, endian) & MDIO_STAT_RD_ER) { + if ((xgmac_read32(®s->mdio_stat, endian) & MDIO_STAT_RD_ER) && + !priv->has_a011043) { dev_err(&bus->dev, "Error while reading PHY%d reg at %d.%hhu\n", phy_id, dev_addr, regnum); @@ -274,6 +276,9 @@ static int xgmac_mdio_probe(struct platform_device *pdev) priv->is_little_endian = of_property_read_bool(pdev->dev.of_node, "little-endian"); + priv->has_a011043 = of_property_read_bool(pdev->dev.of_node, + "fsl,erratum-a011043"); + ret = of_mdiobus_register(bus, np); if (ret) { dev_err(&pdev->dev, "cannot register MDIO bus\n"); diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c index d4055037af89..45b90eb11adb 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_common.c +++ b/drivers/net/ethernet/intel/i40e/i40e_common.c @@ -1113,7 +1113,7 @@ i40e_status i40e_read_pba_string(struct i40e_hw *hw, u8 *pba_num, */ pba_size--; if (pba_num_size < (((u32)pba_size * 2) + 1)) { - hw_dbg(hw, "Buffer to small for PBA data.\n"); + hw_dbg(hw, "Buffer too small for PBA data.\n"); return I40E_ERR_PARAM; } diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile index 7cb829132d28..59544b0fc086 100644 --- a/drivers/net/ethernet/intel/ice/Makefile +++ b/drivers/net/ethernet/intel/ice/Makefile @@ -17,7 +17,8 @@ ice-y := ice_main.o \ ice_lib.o \ ice_txrx_lib.o \ ice_txrx.o \ - ice_flex_pipe.o \ + ice_flex_pipe.o \ + ice_flow.o \ ice_ethtool.o ice-$(CONFIG_PCI_IOV) += ice_virtchnl_pf.o ice_sriov.o ice-$(CONFIG_DCB) += ice_dcb.o ice_dcb_nl.o ice_dcb_lib.o diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h index 5421fc413f94..4459bc564b11 100644 --- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h +++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h @@ -232,6 +232,13 @@ struct ice_aqc_get_sw_cfg_resp { */ #define ICE_AQC_RES_TYPE_VSI_LIST_REP 0x03 #define ICE_AQC_RES_TYPE_VSI_LIST_PRUNE 0x04 +#define ICE_AQC_RES_TYPE_HASH_PROF_BLDR_PROFID 0x60 +#define ICE_AQC_RES_TYPE_HASH_PROF_BLDR_TCAM 0x61 + +#define ICE_AQC_RES_TYPE_FLAG_SCAN_BOTTOM BIT(12) +#define ICE_AQC_RES_TYPE_FLAG_IGNORE_INDEX BIT(13) + +#define ICE_AQC_RES_TYPE_FLAG_DEDICATED 0x00 /* Allocate Resources command (indirect 0x0208) * Free Resources command (indirect 0x0209) @@ -1849,6 +1856,7 @@ enum ice_adminq_opc { /* package commands */ ice_aqc_opc_download_pkg = 0x0C40, + ice_aqc_opc_update_pkg = 0x0C42, ice_aqc_opc_get_pkg_info_list = 0x0C43, /* debug commands */ diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c index a03b4fdc01e6..0207e28c2682 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.c +++ b/drivers/net/ethernet/intel/ice/ice_common.c @@ -4,6 +4,7 @@ #include "ice_common.h" #include "ice_sched.h" #include "ice_adminq_cmd.h" +#include "ice_flow.h" #define ICE_PF_RESET_WAIT_COUNT 200 @@ -1497,6 +1498,114 @@ void ice_release_res(struct ice_hw *hw, enum ice_aq_res_ids res) } /** + * ice_aq_alloc_free_res - command to allocate/free resources + * @hw: pointer to the HW struct + * @num_entries: number of resource entries in buffer + * @buf: Indirect buffer to hold data parameters and response + * @buf_size: size of buffer for indirect commands + * @opc: pass in the command opcode + * @cd: pointer to command details structure or NULL + * + * Helper function to allocate/free resources using the admin queue commands + */ +enum ice_status +ice_aq_alloc_free_res(struct ice_hw *hw, u16 num_entries, + struct ice_aqc_alloc_free_res_elem *buf, u16 buf_size, + enum ice_adminq_opc opc, struct ice_sq_cd *cd) +{ + struct ice_aqc_alloc_free_res_cmd *cmd; + struct ice_aq_desc desc; + + cmd = &desc.params.sw_res_ctrl; + + if (!buf) + return ICE_ERR_PARAM; + + if (buf_size < (num_entries * sizeof(buf->elem[0]))) + return ICE_ERR_PARAM; + + ice_fill_dflt_direct_cmd_desc(&desc, opc); + + desc.flags |= cpu_to_le16(ICE_AQ_FLAG_RD); + + cmd->num_entries = cpu_to_le16(num_entries); + + return ice_aq_send_cmd(hw, &desc, buf, buf_size, cd); +} + +/** + * ice_alloc_hw_res - allocate resource + * @hw: pointer to the HW struct + * @type: type of resource + * @num: number of resources to allocate + * @btm: allocate from bottom + * @res: pointer to array that will receive the resources + */ +enum ice_status +ice_alloc_hw_res(struct ice_hw *hw, u16 type, u16 num, bool btm, u16 *res) +{ + struct ice_aqc_alloc_free_res_elem *buf; + enum ice_status status; + u16 buf_len; + + buf_len = struct_size(buf, elem, num - 1); + buf = kzalloc(buf_len, GFP_KERNEL); + if (!buf) + return ICE_ERR_NO_MEMORY; + + /* Prepare buffer to allocate resource. */ + buf->num_elems = cpu_to_le16(num); + buf->res_type = cpu_to_le16(type | ICE_AQC_RES_TYPE_FLAG_DEDICATED | + ICE_AQC_RES_TYPE_FLAG_IGNORE_INDEX); + if (btm) + buf->res_type |= cpu_to_le16(ICE_AQC_RES_TYPE_FLAG_SCAN_BOTTOM); + + status = ice_aq_alloc_free_res(hw, 1, buf, buf_len, + ice_aqc_opc_alloc_res, NULL); + if (status) + goto ice_alloc_res_exit; + + memcpy(res, buf->elem, sizeof(buf->elem) * num); + +ice_alloc_res_exit: + kfree(buf); + return status; +} + +/** + * ice_free_hw_res - free allocated HW resource + * @hw: pointer to the HW struct + * @type: type of resource to free + * @num: number of resources + * @res: pointer to array that contains the resources to free + */ +enum ice_status +ice_free_hw_res(struct ice_hw *hw, u16 type, u16 num, u16 *res) +{ + struct ice_aqc_alloc_free_res_elem *buf; + enum ice_status status; + u16 buf_len; + + buf_len = struct_size(buf, elem, num - 1); + buf = kzalloc(buf_len, GFP_KERNEL); + if (!buf) + return ICE_ERR_NO_MEMORY; + + /* Prepare buffer to free resource. */ + buf->num_elems = cpu_to_le16(num); + buf->res_type = cpu_to_le16(type); + memcpy(buf->elem, res, sizeof(buf->elem) * num); + + status = ice_aq_alloc_free_res(hw, num, buf, buf_len, + ice_aqc_opc_free_res, NULL); + if (status) + ice_debug(hw, ICE_DBG_SW, "CQ CMD Buffer:\n"); + + kfree(buf); + return status; +} + +/** * ice_get_num_per_func - determine number of resources per PF * @hw: pointer to the HW structure * @max: value to be evenly split between each PF @@ -3406,7 +3515,10 @@ enum ice_status ice_replay_vsi(struct ice_hw *hw, u16 vsi_handle) if (status) return status; } - + /* Replay per VSI all RSS configurations */ + status = ice_replay_rss_cfg(hw, vsi_handle); + if (status) + return status; /* Replay per VSI all filters */ status = ice_replay_vsi_all_fltr(hw, vsi_handle); return status; diff --git a/drivers/net/ethernet/intel/ice/ice_common.h b/drivers/net/ethernet/intel/ice/ice_common.h index b22aa561e253..b5c013fdaaf9 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.h +++ b/drivers/net/ethernet/intel/ice/ice_common.h @@ -34,10 +34,18 @@ enum ice_status ice_acquire_res(struct ice_hw *hw, enum ice_aq_res_ids res, enum ice_aq_res_access_type access, u32 timeout); void ice_release_res(struct ice_hw *hw, enum ice_aq_res_ids res); +enum ice_status +ice_alloc_hw_res(struct ice_hw *hw, u16 type, u16 num, bool btm, u16 *res); +enum ice_status +ice_free_hw_res(struct ice_hw *hw, u16 type, u16 num, u16 *res); enum ice_status ice_init_nvm(struct ice_hw *hw); enum ice_status ice_read_sr_buf(struct ice_hw *hw, u16 offset, u16 *words, u16 *data); enum ice_status +ice_aq_alloc_free_res(struct ice_hw *hw, u16 num_entries, + struct ice_aqc_alloc_free_res_elem *buf, u16 buf_size, + enum ice_adminq_opc opc, struct ice_sq_cd *cd); +enum ice_status ice_sq_send_cmd(struct ice_hw *hw, struct ice_ctl_q_info *cq, struct ice_aq_desc *desc, void *buf, u16 buf_size, struct ice_sq_cd *cd); diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c index f395457b728f..90c6a3ca20c9 100644 --- a/drivers/net/ethernet/intel/ice/ice_ethtool.c +++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c @@ -4,6 +4,7 @@ /* ethtool support for ice */ #include "ice.h" +#include "ice_flow.h" #include "ice_lib.h" #include "ice_dcb_lib.h" @@ -2534,6 +2535,243 @@ done: } /** + * ice_parse_hdrs - parses headers from RSS hash input + * @nfc: ethtool rxnfc command + * + * This function parses the rxnfc command and returns intended + * header types for RSS configuration + */ +static u32 ice_parse_hdrs(struct ethtool_rxnfc *nfc) +{ + u32 hdrs = ICE_FLOW_SEG_HDR_NONE; + + switch (nfc->flow_type) { + case TCP_V4_FLOW: + hdrs |= ICE_FLOW_SEG_HDR_TCP | ICE_FLOW_SEG_HDR_IPV4; + break; + case UDP_V4_FLOW: + hdrs |= ICE_FLOW_SEG_HDR_UDP | ICE_FLOW_SEG_HDR_IPV4; + break; + case SCTP_V4_FLOW: + hdrs |= ICE_FLOW_SEG_HDR_SCTP | ICE_FLOW_SEG_HDR_IPV4; + break; + case TCP_V6_FLOW: + hdrs |= ICE_FLOW_SEG_HDR_TCP | ICE_FLOW_SEG_HDR_IPV6; + break; + case UDP_V6_FLOW: + hdrs |= ICE_FLOW_SEG_HDR_UDP | ICE_FLOW_SEG_HDR_IPV6; + break; + case SCTP_V6_FLOW: + hdrs |= ICE_FLOW_SEG_HDR_SCTP | ICE_FLOW_SEG_HDR_IPV6; + break; + default: + break; + } + return hdrs; +} + +#define ICE_FLOW_HASH_FLD_IPV4_SA BIT_ULL(ICE_FLOW_FIELD_IDX_IPV4_SA) +#define ICE_FLOW_HASH_FLD_IPV6_SA BIT_ULL(ICE_FLOW_FIELD_IDX_IPV6_SA) +#define ICE_FLOW_HASH_FLD_IPV4_DA BIT_ULL(ICE_FLOW_FIELD_IDX_IPV4_DA) +#define ICE_FLOW_HASH_FLD_IPV6_DA BIT_ULL(ICE_FLOW_FIELD_IDX_IPV6_DA) +#define ICE_FLOW_HASH_FLD_TCP_SRC_PORT BIT_ULL(ICE_FLOW_FIELD_IDX_TCP_SRC_PORT) +#define ICE_FLOW_HASH_FLD_TCP_DST_PORT BIT_ULL(ICE_FLOW_FIELD_IDX_TCP_DST_PORT) +#define ICE_FLOW_HASH_FLD_UDP_SRC_PORT BIT_ULL(ICE_FLOW_FIELD_IDX_UDP_SRC_PORT) +#define ICE_FLOW_HASH_FLD_UDP_DST_PORT BIT_ULL(ICE_FLOW_FIELD_IDX_UDP_DST_PORT) +#define ICE_FLOW_HASH_FLD_SCTP_SRC_PORT \ + BIT_ULL(ICE_FLOW_FIELD_IDX_SCTP_SRC_PORT) +#define ICE_FLOW_HASH_FLD_SCTP_DST_PORT \ + BIT_ULL(ICE_FLOW_FIELD_IDX_SCTP_DST_PORT) + +/** + * ice_parse_hash_flds - parses hash fields from RSS hash input + * @nfc: ethtool rxnfc command + * + * This function parses the rxnfc command and returns intended + * hash fields for RSS configuration + */ +static u64 ice_parse_hash_flds(struct ethtool_rxnfc *nfc) +{ + u64 hfld = ICE_HASH_INVALID; + + if (nfc->data & RXH_IP_SRC || nfc->data & RXH_IP_DST) { + switch (nfc->flow_type) { + case TCP_V4_FLOW: + case UDP_V4_FLOW: + case SCTP_V4_FLOW: + if (nfc->data & RXH_IP_SRC) + hfld |= ICE_FLOW_HASH_FLD_IPV4_SA; + if (nfc->data & RXH_IP_DST) + hfld |= ICE_FLOW_HASH_FLD_IPV4_DA; + break; + case TCP_V6_FLOW: + case UDP_V6_FLOW: + case SCTP_V6_FLOW: + if (nfc->data & RXH_IP_SRC) + hfld |= ICE_FLOW_HASH_FLD_IPV6_SA; + if (nfc->data & RXH_IP_DST) + hfld |= ICE_FLOW_HASH_FLD_IPV6_DA; + break; + default: + break; + } + } + + if (nfc->data & RXH_L4_B_0_1 || nfc->data & RXH_L4_B_2_3) { + switch (nfc->flow_type) { + case TCP_V4_FLOW: + case TCP_V6_FLOW: + if (nfc->data & RXH_L4_B_0_1) + hfld |= ICE_FLOW_HASH_FLD_TCP_SRC_PORT; + if (nfc->data & RXH_L4_B_2_3) + hfld |= ICE_FLOW_HASH_FLD_TCP_DST_PORT; + break; + case UDP_V4_FLOW: + case UDP_V6_FLOW: + if (nfc->data & RXH_L4_B_0_1) + hfld |= ICE_FLOW_HASH_FLD_UDP_SRC_PORT; + if (nfc->data & RXH_L4_B_2_3) + hfld |= ICE_FLOW_HASH_FLD_UDP_DST_PORT; + break; + case SCTP_V4_FLOW: + case SCTP_V6_FLOW: + if (nfc->data & RXH_L4_B_0_1) + hfld |= ICE_FLOW_HASH_FLD_SCTP_SRC_PORT; + if (nfc->data & RXH_L4_B_2_3) + hfld |= ICE_FLOW_HASH_FLD_SCTP_DST_PORT; + break; + default: + break; + } + } + + return hfld; +} + +/** + * ice_set_rss_hash_opt - Enable/Disable flow types for RSS hash + * @vsi: the VSI being configured + * @nfc: ethtool rxnfc command + * + * Returns Success if the flow input set is supported. + */ +static int +ice_set_rss_hash_opt(struct ice_vsi *vsi, struct ethtool_rxnfc *nfc) +{ + struct ice_pf *pf = vsi->back; + enum ice_status status; + struct device *dev; + u64 hashed_flds; + u32 hdrs; + + dev = ice_pf_to_dev(pf); + if (ice_is_safe_mode(pf)) { + dev_dbg(dev, "Advanced RSS disabled. Package download failed, vsi num = %d\n", + vsi->vsi_num); + return -EINVAL; + } + + hashed_flds = ice_parse_hash_flds(nfc); + if (hashed_flds == ICE_HASH_INVALID) { + dev_dbg(dev, "Invalid hash fields, vsi num = %d\n", + vsi->vsi_num); + return -EINVAL; + } + + hdrs = ice_parse_hdrs(nfc); + if (hdrs == ICE_FLOW_SEG_HDR_NONE) { + dev_dbg(dev, "Header type is not valid, vsi num = %d\n", + vsi->vsi_num); + return -EINVAL; + } + + status = ice_add_rss_cfg(&pf->hw, vsi->idx, hashed_flds, hdrs); + if (status) { + dev_dbg(dev, "ice_add_rss_cfg failed, vsi num = %d, error = %d\n", + vsi->vsi_num, status); + return -EINVAL; + } + + return 0; +} + +/** + * ice_get_rss_hash_opt - Retrieve hash fields for a given flow-type + * @vsi: the VSI being configured + * @nfc: ethtool rxnfc command + */ +static void +ice_get_rss_hash_opt(struct ice_vsi *vsi, struct ethtool_rxnfc *nfc) +{ + struct ice_pf *pf = vsi->back; + struct device *dev; + u64 hash_flds; + u32 hdrs; + + dev = ice_pf_to_dev(pf); + + nfc->data = 0; + if (ice_is_safe_mode(pf)) { + dev_dbg(dev, "Advanced RSS disabled. Package download failed, vsi num = %d\n", + vsi->vsi_num); + return; + } + + hdrs = ice_parse_hdrs(nfc); + if (hdrs == ICE_FLOW_SEG_HDR_NONE) { + dev_dbg(dev, "Header type is not valid, vsi num = %d\n", + vsi->vsi_num); + return; + } + + hash_flds = ice_get_rss_cfg(&pf->hw, vsi->idx, hdrs); + if (hash_flds == ICE_HASH_INVALID) { + dev_dbg(dev, "No hash fields found for the given header type, vsi num = %d\n", + vsi->vsi_num); + return; + } + + if (hash_flds & ICE_FLOW_HASH_FLD_IPV4_SA || + hash_flds & ICE_FLOW_HASH_FLD_IPV6_SA) + nfc->data |= (u64)RXH_IP_SRC; + + if (hash_flds & ICE_FLOW_HASH_FLD_IPV4_DA || + hash_flds & ICE_FLOW_HASH_FLD_IPV6_DA) + nfc->data |= (u64)RXH_IP_DST; + + if (hash_flds & ICE_FLOW_HASH_FLD_TCP_SRC_PORT || + hash_flds & ICE_FLOW_HASH_FLD_UDP_SRC_PORT || + hash_flds & ICE_FLOW_HASH_FLD_SCTP_SRC_PORT) + nfc->data |= (u64)RXH_L4_B_0_1; + + if (hash_flds & ICE_FLOW_HASH_FLD_TCP_DST_PORT || + hash_flds & ICE_FLOW_HASH_FLD_UDP_DST_PORT || + hash_flds & ICE_FLOW_HASH_FLD_SCTP_DST_PORT) + nfc->data |= (u64)RXH_L4_B_2_3; +} + +/** + * ice_set_rxnfc - command to set Rx flow rules. + * @netdev: network interface device structure + * @cmd: ethtool rxnfc command + * + * Returns 0 for success and negative values for errors + */ +static int ice_set_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd) +{ + struct ice_netdev_priv *np = netdev_priv(netdev); + struct ice_vsi *vsi = np->vsi; + + switch (cmd->cmd) { + case ETHTOOL_SRXFH: + return ice_set_rss_hash_opt(vsi, cmd); + default: + break; + } + return -EOPNOTSUPP; +} + +/** * ice_get_rxnfc - command to get Rx flow classification rules * @netdev: network interface device structure * @cmd: ethtool rxnfc command @@ -2554,6 +2792,10 @@ ice_get_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd, cmd->data = vsi->rss_size; ret = 0; break; + case ETHTOOL_GRXFH: + ice_get_rss_hash_opt(vsi, cmd); + ret = 0; + break; default: break; } @@ -3857,6 +4099,7 @@ static const struct ethtool_ops ice_ethtool_ops = { .set_priv_flags = ice_set_priv_flags, .get_sset_count = ice_get_sset_count, .get_rxnfc = ice_get_rxnfc, + .set_rxnfc = ice_set_rxnfc, .get_ringparam = ice_get_ringparam, .set_ringparam = ice_set_ringparam, .nway_reset = ice_nway_reset, diff --git a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c index cbd53b586c36..99208946224c 100644 --- a/drivers/net/ethernet/intel/ice/ice_flex_pipe.c +++ b/drivers/net/ethernet/intel/ice/ice_flex_pipe.c @@ -3,6 +3,87 @@ #include "ice_common.h" #include "ice_flex_pipe.h" +#include "ice_flow.h" + +static const u32 ice_sect_lkup[ICE_BLK_COUNT][ICE_SECT_COUNT] = { + /* SWITCH */ + { + ICE_SID_XLT0_SW, + ICE_SID_XLT_KEY_BUILDER_SW, + ICE_SID_XLT1_SW, + ICE_SID_XLT2_SW, + ICE_SID_PROFID_TCAM_SW, + ICE_SID_PROFID_REDIR_SW, + ICE_SID_FLD_VEC_SW, + ICE_SID_CDID_KEY_BUILDER_SW, + ICE_SID_CDID_REDIR_SW + }, + + /* ACL */ + { + ICE_SID_XLT0_ACL, + ICE_SID_XLT_KEY_BUILDER_ACL, + ICE_SID_XLT1_ACL, + ICE_SID_XLT2_ACL, + ICE_SID_PROFID_TCAM_ACL, + ICE_SID_PROFID_REDIR_ACL, + ICE_SID_FLD_VEC_ACL, + ICE_SID_CDID_KEY_BUILDER_ACL, + ICE_SID_CDID_REDIR_ACL + }, + + /* FD */ + { + ICE_SID_XLT0_FD, + ICE_SID_XLT_KEY_BUILDER_FD, + ICE_SID_XLT1_FD, + ICE_SID_XLT2_FD, + ICE_SID_PROFID_TCAM_FD, + ICE_SID_PROFID_REDIR_FD, + ICE_SID_FLD_VEC_FD, + ICE_SID_CDID_KEY_BUILDER_FD, + ICE_SID_CDID_REDIR_FD + }, + + /* RSS */ + { + ICE_SID_XLT0_RSS, + ICE_SID_XLT_KEY_BUILDER_RSS, + ICE_SID_XLT1_RSS, + ICE_SID_XLT2_RSS, + ICE_SID_PROFID_TCAM_RSS, + ICE_SID_PROFID_REDIR_RSS, + ICE_SID_FLD_VEC_RSS, + ICE_SID_CDID_KEY_BUILDER_RSS, + ICE_SID_CDID_REDIR_RSS + }, + + /* PE */ + { + ICE_SID_XLT0_PE, + ICE_SID_XLT_KEY_BUILDER_PE, + ICE_SID_XLT1_PE, + ICE_SID_XLT2_PE, + ICE_SID_PROFID_TCAM_PE, + ICE_SID_PROFID_REDIR_PE, + ICE_SID_FLD_VEC_PE, + ICE_SID_CDID_KEY_BUILDER_PE, + ICE_SID_CDID_REDIR_PE + } +}; + +/** + * ice_sect_id - returns section ID + * @blk: block type + * @sect: section type + * + * This helper function returns the proper section ID given a block type and a + * section type. + */ +static u32 ice_sect_id(enum ice_block blk, enum ice_sect sect) +{ + return ice_sect_lkup[blk][sect]; +} /** * ice_pkg_val_buf @@ -158,6 +239,176 @@ ice_pkg_enum_section(struct ice_seg *ice_seg, struct ice_pkg_enum *state, return state->sect; } +/* Key creation */ + +#define ICE_DC_KEY 0x1 /* don't care */ +#define ICE_DC_KEYINV 0x1 +#define ICE_NM_KEY 0x0 /* never match */ +#define ICE_NM_KEYINV 0x0 +#define ICE_0_KEY 0x1 /* match 0 */ +#define ICE_0_KEYINV 0x0 +#define ICE_1_KEY 0x0 /* match 1 */ +#define ICE_1_KEYINV 0x1 + +/** + * ice_gen_key_word - generate 16-bits of a key/mask word + * @val: the value + * @valid: valid bits mask (change only the valid bits) + * @dont_care: don't care mask + * @nvr_mtch: never match mask + * @key: pointer to an array of where the resulting key portion + * @key_inv: pointer to an array of where the resulting key invert portion + * + * This function generates 16-bits from a 8-bit value, an 8-bit don't care mask + * and an 8-bit never match mask. The 16-bits of output are divided into 8 bits + * of key and 8 bits of key invert. + * + * '0' = b01, always match a 0 bit + * '1' = b10, always match a 1 bit + * '?' = b11, don't care bit (always matches) + * '~' = b00, never match bit + * + * Input: + * val: b0 1 0 1 0 1 + * dont_care: b0 0 1 1 0 0 + * never_mtch: b0 0 0 0 1 1 + * ------------------------------ + * Result: key: b01 10 11 11 00 00 + */ +static enum ice_status +ice_gen_key_word(u8 val, u8 valid, u8 dont_care, u8 nvr_mtch, u8 *key, + u8 *key_inv) +{ + u8 in_key = *key, in_key_inv = *key_inv; + u8 i; + + /* 'dont_care' and 'nvr_mtch' masks cannot overlap */ + if ((dont_care ^ nvr_mtch) != (dont_care | nvr_mtch)) + return ICE_ERR_CFG; + + *key = 0; + *key_inv = 0; + + /* encode the 8 bits into 8-bit key and 8-bit key invert */ + for (i = 0; i < 8; i++) { + *key >>= 1; + *key_inv >>= 1; + + if (!(valid & 0x1)) { /* change only valid bits */ + *key |= (in_key & 0x1) << 7; + *key_inv |= (in_key_inv & 0x1) << 7; + } else if (dont_care & 0x1) { /* don't care bit */ + *key |= ICE_DC_KEY << 7; + *key_inv |= ICE_DC_KEYINV << 7; + } else if (nvr_mtch & 0x1) { /* never match bit */ + *key |= ICE_NM_KEY << 7; + *key_inv |= ICE_NM_KEYINV << 7; + } else if (val & 0x01) { /* exact 1 match */ + *key |= ICE_1_KEY << 7; + *key_inv |= ICE_1_KEYINV << 7; + } else { /* exact 0 match */ + *key |= ICE_0_KEY << 7; + *key_inv |= ICE_0_KEYINV << 7; + } + + dont_care >>= 1; + nvr_mtch >>= 1; + valid >>= 1; + val >>= 1; + in_key >>= 1; + in_key_inv >>= 1; + } + + return 0; +} + +/** + * ice_bits_max_set - determine if the number of bits set is within a maximum + * @mask: pointer to the byte array which is the mask + * @size: the number of bytes in the mask + * @max: the max number of set bits + * + * This function determines if there are at most 'max' number of bits set in an + * array. Returns true if the number for bits set is <= max or will return false + * otherwise. + */ +static bool ice_bits_max_set(const u8 *mask, u16 size, u16 max) +{ + u16 count = 0; + u16 i; + + /* check each byte */ + for (i = 0; i < size; i++) { + /* if 0, go to next byte */ + if (!mask[i]) + continue; + + /* We know there is at least one set bit in this byte because of + * the above check; if we already have found 'max' number of + * bits set, then we can return failure now. + */ + if (count == max) + return false; + + /* count the bits in this byte, checking threshold */ + count += hweight8(mask[i]); + if (count > max) + return false; + } + + return true; +} + +/** + * ice_set_key - generate a variable sized key with multiples of 16-bits + * @key: pointer to where the key will be stored + * @size: the size of the complete key in bytes (must be even) + * @val: array of 8-bit values that makes up the value portion of the key + * @upd: array of 8-bit masks that determine what key portion to update + * @dc: array of 8-bit masks that make up the don't care mask + * @nm: array of 8-bit masks that make up the never match mask + * @off: the offset of the first byte in the key to update + * @len: the number of bytes in the key update + * + * This function generates a key from a value, a don't care mask and a never + * match mask. + * upd, dc, and nm are optional parameters, and can be NULL: + * upd == NULL --> udp mask is all 1's (update all bits) + * dc == NULL --> dc mask is all 0's (no don't care bits) + * nm == NULL --> nm mask is all 0's (no never match bits) + */ +static enum ice_status +ice_set_key(u8 *key, u16 size, u8 *val, u8 *upd, u8 *dc, u8 *nm, u16 off, + u16 len) +{ + u16 half_size; + u16 i; + + /* size must be a multiple of 2 bytes. */ + if (size % 2) + return ICE_ERR_CFG; + + half_size = size / 2; + if (off + len > half_size) + return ICE_ERR_CFG; + + /* Make sure at most one bit is set in the never match mask. Having more + * than one never match mask bit set will cause HW to consume excessive + * power otherwise; this is a power management efficiency check. + */ +#define ICE_NVR_MTCH_BITS_MAX 1 + if (nm && !ice_bits_max_set(nm, len, ICE_NVR_MTCH_BITS_MAX)) + return ICE_ERR_CFG; + + for (i = 0; i < len; i++) + if (ice_gen_key_word(val[i], upd ? upd[i] : 0xff, + dc ? dc[i] : 0, nm ? nm[i] : 0, + key + off + i, key + half_size + off + i)) + return ICE_ERR_CFG; + + return 0; +} + /** * ice_acquire_global_cfg_lock * @hw: pointer to the HW structure @@ -205,6 +456,31 @@ static void ice_release_global_cfg_lock(struct ice_hw *hw) } /** + * ice_acquire_change_lock + * @hw: pointer to the HW structure + * @access: access type (read or write) + * + * This function will request ownership of the change lock. + */ +static enum ice_status +ice_acquire_change_lock(struct ice_hw *hw, enum ice_aq_res_access_type access) +{ + return ice_acquire_res(hw, ICE_CHANGE_LOCK_RES_ID, access, + ICE_CHANGE_LOCK_TIMEOUT); +} + +/** + * ice_release_change_lock + * @hw: pointer to the HW structure + * + * This function will release the change lock using the proper Admin Command. + */ +static void ice_release_change_lock(struct ice_hw *hw) +{ + ice_release_res(hw, ICE_CHANGE_LOCK_RES_ID); +} + +/** * ice_aq_download_pkg * @hw: pointer to the hardware structure * @pkg_buf: the package buffer to transfer @@ -253,6 +529,54 @@ ice_aq_download_pkg(struct ice_hw *hw, struct ice_buf_hdr *pkg_buf, } /** + * ice_aq_update_pkg + * @hw: pointer to the hardware structure + * @pkg_buf: the package cmd buffer + * @buf_size: the size of the package cmd buffer + * @last_buf: last buffer indicator + * @error_offset: returns error offset + * @error_info: returns error information + * @cd: pointer to command details structure or NULL + * + * Update Package (0x0C42) + */ +static enum ice_status +ice_aq_update_pkg(struct ice_hw *hw, struct ice_buf_hdr *pkg_buf, u16 buf_size, + bool last_buf, u32 *error_offset, u32 *error_info, + struct ice_sq_cd *cd) +{ + struct ice_aqc_download_pkg *cmd; + struct ice_aq_desc desc; + enum ice_status status; + + if (error_offset) + *error_offset = 0; + if (error_info) + *error_info = 0; + + cmd = &desc.params.download_pkg; + ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_update_pkg); + desc.flags |= cpu_to_le16(ICE_AQ_FLAG_RD); + + if (last_buf) + cmd->flags |= ICE_AQC_DOWNLOAD_PKG_LAST_BUF; + + status = ice_aq_send_cmd(hw, &desc, pkg_buf, buf_size, cd); + if (status == ICE_ERR_AQ_ERROR) { + /* Read error from buffer only when the FW returned an error */ + struct ice_aqc_download_pkg_resp *resp; + + resp = (struct ice_aqc_download_pkg_resp *)pkg_buf; + if (error_offset) + *error_offset = le32_to_cpu(resp->error_offset); + if (error_info) + *error_info = le32_to_cpu(resp->error_info); + } + + return status; +} + +/** * ice_find_seg_in_pkg * @hw: pointer to the hardware structure * @seg_type: the segment type to search for (i.e., SEGMENT_TYPE_CPK) @@ -287,6 +611,44 @@ ice_find_seg_in_pkg(struct ice_hw *hw, u32 seg_type, } /** + * ice_update_pkg + * @hw: pointer to the hardware structure + * @bufs: pointer to an array of buffers + * @count: the number of buffers in the array + * + * Obtains change lock and updates package. + */ +static enum ice_status +ice_update_pkg(struct ice_hw *hw, struct ice_buf *bufs, u32 count) +{ + enum ice_status status; + u32 offset, info, i; + + status = ice_acquire_change_lock(hw, ICE_RES_WRITE); + if (status) + return status; + + for (i = 0; i < count; i++) { + struct ice_buf_hdr *bh = (struct ice_buf_hdr *)(bufs + i); + bool last = ((i + 1) == count); + + status = ice_aq_update_pkg(hw, bh, le16_to_cpu(bh->data_end), + last, &offset, &info, NULL); + + if (status) { + ice_debug(hw, ICE_DBG_PKG, + "Update pkg failed: err %d off %d inf %d\n", + status, offset, info); + break; + } + } + + ice_release_change_lock(hw); + + return status; +} + +/** * ice_dwnld_cfg_bufs * @hw: pointer to the hardware structure * @bufs: pointer to an array of buffers @@ -767,6 +1129,169 @@ enum ice_status ice_copy_and_init_pkg(struct ice_hw *hw, const u8 *buf, u32 len) return status; } +/** + * ice_pkg_buf_alloc + * @hw: pointer to the HW structure + * + * Allocates a package buffer and returns a pointer to the buffer header. + * Note: all package contents must be in Little Endian form. + */ +static struct ice_buf_build *ice_pkg_buf_alloc(struct ice_hw *hw) +{ + struct ice_buf_build *bld; + struct ice_buf_hdr *buf; + + bld = devm_kzalloc(ice_hw_to_dev(hw), sizeof(*bld), GFP_KERNEL); + if (!bld) + return NULL; + + buf = (struct ice_buf_hdr *)bld; + buf->data_end = cpu_to_le16(offsetof(struct ice_buf_hdr, + section_entry)); + return bld; +} + +/** + * ice_pkg_buf_free + * @hw: pointer to the HW structure + * @bld: pointer to pkg build (allocated by ice_pkg_buf_alloc()) + * + * Frees a package buffer + */ +static void ice_pkg_buf_free(struct ice_hw *hw, struct ice_buf_build *bld) +{ + devm_kfree(ice_hw_to_dev(hw), bld); +} + +/** + * ice_pkg_buf_reserve_section + * @bld: pointer to pkg build (allocated by ice_pkg_buf_alloc()) + * @count: the number of sections to reserve + * + * Reserves one or more section table entries in a package buffer. This routine + * can be called multiple times as long as they are made before calling + * ice_pkg_buf_alloc_section(). Once ice_pkg_buf_alloc_section() + * is called once, the number of sections that can be allocated will not be able + * to be increased; not using all reserved sections is fine, but this will + * result in some wasted space in the buffer. + * Note: all package contents must be in Little Endian form. + */ +static enum ice_status +ice_pkg_buf_reserve_section(struct ice_buf_build *bld, u16 count) +{ + struct ice_buf_hdr *buf; + u16 section_count; + u16 data_end; + + if (!bld) + return ICE_ERR_PARAM; + + buf = (struct ice_buf_hdr *)&bld->buf; + + /* already an active section, can't increase table size */ + section_count = le16_to_cpu(buf->section_count); + if (section_count > 0) + return ICE_ERR_CFG; + + if (bld->reserved_section_table_entries + count > ICE_MAX_S_COUNT) + return ICE_ERR_CFG; + bld->reserved_section_table_entries += count; + + data_end = le16_to_cpu(buf->data_end) + + (count * sizeof(buf->section_entry[0])); + buf->data_end = cpu_to_le16(data_end); + + return 0; +} + +/** + * ice_pkg_buf_alloc_section + * @bld: pointer to pkg build (allocated by ice_pkg_buf_alloc()) + * @type: the section type value + * @size: the size of the section to reserve (in bytes) + * + * Reserves memory in the buffer for a section's content and updates the + * buffers' status accordingly. This routine returns a pointer to the first + * byte of the section start within the buffer, which is used to fill in the + * section contents. + * Note: all package contents must be in Little Endian form. + */ +static void * +ice_pkg_buf_alloc_section(struct ice_buf_build *bld, u32 type, u16 size) +{ + struct ice_buf_hdr *buf; + u16 sect_count; + u16 data_end; + + if (!bld || !type || !size) + return NULL; + + buf = (struct ice_buf_hdr *)&bld->buf; + + /* check for enough space left in buffer */ + data_end = le16_to_cpu(buf->data_end); + + /* section start must align on 4 byte boundary */ + data_end = ALIGN(data_end, 4); + + if ((data_end + size) > ICE_MAX_S_DATA_END) + return NULL; + + /* check for more available section table entries */ + sect_count = le16_to_cpu(buf->section_count); + if (sect_count < bld->reserved_section_table_entries) { + void *section_ptr = ((u8 *)buf) + data_end; + + buf->section_entry[sect_count].offset = cpu_to_le16(data_end); + buf->section_entry[sect_count].size = cpu_to_le16(size); + buf->section_entry[sect_count].type = cpu_to_le32(type); + + data_end += size; + buf->data_end = cpu_to_le16(data_end); + + buf->section_count = cpu_to_le16(sect_count + 1); + return section_ptr; + } + + /* no free section table entries */ + return NULL; +} + +/** + * ice_pkg_buf_get_active_sections + * @bld: pointer to pkg build (allocated by ice_pkg_buf_alloc()) + * + * Returns the number of active sections. Before using the package buffer + * in an update package command, the caller should make sure that there is at + * least one active section - otherwise, the buffer is not legal and should + * not be used. + * Note: all package contents must be in Little Endian form. + */ +static u16 ice_pkg_buf_get_active_sections(struct ice_buf_build *bld) +{ + struct ice_buf_hdr *buf; + + if (!bld) + return 0; + + buf = (struct ice_buf_hdr *)&bld->buf; + return le16_to_cpu(buf->section_count); +} + +/** + * ice_pkg_buf + * @bld: pointer to pkg build (allocated by ice_pkg_buf_alloc()) + * + * Return a pointer to the buffer's header + */ +static struct ice_buf *ice_pkg_buf(struct ice_buf_build *bld) +{ + if (!bld) + return NULL; + + return &bld->buf; +} + /* PTG Management */ /** @@ -951,6 +1476,48 @@ enum ice_sid_all { ICE_SID_OFF_COUNT, }; +/* Characteristic handling */ + +/** + * ice_match_prop_lst - determine if properties of two lists match + * @list1: first properties list + * @list2: second properties list + * + * Count, cookies and the order must match in order to be considered equivalent. + */ +static bool +ice_match_prop_lst(struct list_head *list1, struct list_head *list2) +{ + struct ice_vsig_prof *tmp1; + struct ice_vsig_prof *tmp2; + u16 chk_count = 0; + u16 count = 0; + + /* compare counts */ + list_for_each_entry(tmp1, list1, list) + count++; + list_for_each_entry(tmp2, list2, list) + chk_count++; + if (!count || count != chk_count) + return false; + + tmp1 = list_first_entry(list1, struct ice_vsig_prof, list); + tmp2 = list_first_entry(list2, struct ice_vsig_prof, list); + + /* profile cookies must compare, and in the exact same order to take + * into account priority + */ + while (count--) { + if (tmp2->profile_cookie != tmp1->profile_cookie) + return false; + + tmp1 = list_next_entry(tmp1, list); + tmp2 = list_next_entry(tmp2, list); + } + + return true; +} + /* VSIG Management */ /** @@ -999,6 +1566,117 @@ static u16 ice_vsig_alloc_val(struct ice_hw *hw, enum ice_block blk, u16 vsig) } /** + * ice_vsig_alloc - Finds a free entry and allocates a new VSIG + * @hw: pointer to the hardware structure + * @blk: HW block + * + * This function will iterate through the VSIG list and mark the first + * unused entry for the new VSIG entry as used and return that value. + */ +static u16 ice_vsig_alloc(struct ice_hw *hw, enum ice_block blk) +{ + u16 i; + + for (i = 1; i < ICE_MAX_VSIGS; i++) + if (!hw->blk[blk].xlt2.vsig_tbl[i].in_use) + return ice_vsig_alloc_val(hw, blk, i); + + return ICE_DEFAULT_VSIG; +} + +/** + * ice_find_dup_props_vsig - find VSI group with a specified set of properties + * @hw: pointer to the hardware structure + * @blk: HW block + * @chs: characteristic list + * @vsig: returns the VSIG with the matching profiles, if found + * + * Each VSIG is associated with a characteristic set; i.e. all VSIs under + * a group have the same characteristic set. To check if there exists a VSIG + * which has the same characteristics as the input characteristics; this + * function will iterate through the XLT2 list and return the VSIG that has a + * matching configuration. In order to make sure that priorities are accounted + * for, the list must match exactly, including the order in which the + * characteristics are listed. + */ +static enum ice_status +ice_find_dup_props_vsig(struct ice_hw *hw, enum ice_block blk, + struct list_head *chs, u16 *vsig) +{ + struct ice_xlt2 *xlt2 = &hw->blk[blk].xlt2; + u16 i; + + for (i = 0; i < xlt2->count; i++) + if (xlt2->vsig_tbl[i].in_use && + ice_match_prop_lst(chs, &xlt2->vsig_tbl[i].prop_lst)) { + *vsig = ICE_VSIG_VALUE(i, hw->pf_id); + return 0; + } + + return ICE_ERR_DOES_NOT_EXIST; +} + +/** + * ice_vsig_free - free VSI group + * @hw: pointer to the hardware structure + * @blk: HW block + * @vsig: VSIG to remove + * + * The function will remove all VSIs associated with the input VSIG and move + * them to the DEFAULT_VSIG and mark the VSIG available. + */ +static enum ice_status +ice_vsig_free(struct ice_hw *hw, enum ice_block blk, u16 vsig) +{ + struct ice_vsig_prof *dtmp, *del; + struct ice_vsig_vsi *vsi_cur; + u16 idx; + + idx = vsig & ICE_VSIG_IDX_M; + if (idx >= ICE_MAX_VSIGS) + return ICE_ERR_PARAM; + + if (!hw->blk[blk].xlt2.vsig_tbl[idx].in_use) + return ICE_ERR_DOES_NOT_EXIST; + + hw->blk[blk].xlt2.vsig_tbl[idx].in_use = false; + + vsi_cur = hw->blk[blk].xlt2.vsig_tbl[idx].first_vsi; + /* If the VSIG has at least 1 VSI then iterate through the + * list and remove the VSIs before deleting the group. + */ + if (vsi_cur) { + /* remove all vsis associated with this VSIG XLT2 entry */ + do { + struct ice_vsig_vsi *tmp = vsi_cur->next_vsi; + + vsi_cur->vsig = ICE_DEFAULT_VSIG; + vsi_cur->changed = 1; + vsi_cur->next_vsi = NULL; + vsi_cur = tmp; + } while (vsi_cur); + + /* NULL terminate head of VSI list */ + hw->blk[blk].xlt2.vsig_tbl[idx].first_vsi = NULL; + } + + /* free characteristic list */ + list_for_each_entry_safe(del, dtmp, + &hw->blk[blk].xlt2.vsig_tbl[idx].prop_lst, + list) { + list_del(&del->list); + devm_kfree(ice_hw_to_dev(hw), del); + } + + /* if VSIG characteristic list was cleared for reset + * re-initialize the list head + */ + INIT_LIST_HEAD(&hw->blk[blk].xlt2.vsig_tbl[idx].prop_lst); + + return 0; +} + +/** * ice_vsig_remove_vsi - remove VSI from VSIG * @hw: pointer to the hardware structure * @blk: HW block @@ -1117,6 +1795,215 @@ ice_vsig_add_mv_vsi(struct ice_hw *hw, enum ice_block blk, u16 vsi, u16 vsig) return 0; } +/** + * ice_find_prof_id - find profile ID for a given field vector + * @hw: pointer to the hardware structure + * @blk: HW block + * @fv: field vector to search for + * @prof_id: receives the profile ID + */ +static enum ice_status +ice_find_prof_id(struct ice_hw *hw, enum ice_block blk, + struct ice_fv_word *fv, u8 *prof_id) +{ + struct ice_es *es = &hw->blk[blk].es; + u16 off, i; + + for (i = 0; i < es->count; i++) { + off = i * es->fvw; + + if (memcmp(&es->t[off], fv, es->fvw * sizeof(*fv))) + continue; + + *prof_id = i; + return 0; + } + + return ICE_ERR_DOES_NOT_EXIST; +} + +/** + * ice_prof_id_rsrc_type - get profile ID resource type for a block type + * @blk: the block type + * @rsrc_type: pointer to variable to receive the resource type + */ +static bool ice_prof_id_rsrc_type(enum ice_block blk, u16 *rsrc_type) +{ + switch (blk) { + case ICE_BLK_RSS: + *rsrc_type = ICE_AQC_RES_TYPE_HASH_PROF_BLDR_PROFID; + break; + default: + return false; + } + return true; +} + +/** + * ice_tcam_ent_rsrc_type - get TCAM entry resource type for a block type + * @blk: the block type + * @rsrc_type: pointer to variable to receive the resource type + */ +static bool ice_tcam_ent_rsrc_type(enum ice_block blk, u16 *rsrc_type) +{ + switch (blk) { + case ICE_BLK_RSS: + *rsrc_type = ICE_AQC_RES_TYPE_HASH_PROF_BLDR_TCAM; + break; + default: + return false; + } + return true; +} + +/** + * ice_alloc_tcam_ent - allocate hardware TCAM entry + * @hw: pointer to the HW struct + * @blk: the block to allocate the TCAM for + * @tcam_idx: pointer to variable to receive the TCAM entry + * + * This function allocates a new entry in a Profile ID TCAM for a specific + * block. + */ +static enum ice_status +ice_alloc_tcam_ent(struct ice_hw *hw, enum ice_block blk, u16 *tcam_idx) +{ + u16 res_type; + + if (!ice_tcam_ent_rsrc_type(blk, &res_type)) + return ICE_ERR_PARAM; + + return ice_alloc_hw_res(hw, res_type, 1, true, tcam_idx); +} + +/** + * ice_free_tcam_ent - free hardware TCAM entry + * @hw: pointer to the HW struct + * @blk: the block from which to free the TCAM entry + * @tcam_idx: the TCAM entry to free + * + * This function frees an entry in a Profile ID TCAM for a specific block. + */ +static enum ice_status +ice_free_tcam_ent(struct ice_hw *hw, enum ice_block blk, u16 tcam_idx) +{ + u16 res_type; + + if (!ice_tcam_ent_rsrc_type(blk, &res_type)) + return ICE_ERR_PARAM; + + return ice_free_hw_res(hw, res_type, 1, &tcam_idx); +} + +/** + * ice_alloc_prof_id - allocate profile ID + * @hw: pointer to the HW struct + * @blk: the block to allocate the profile ID for + * @prof_id: pointer to variable to receive the profile ID + * + * This function allocates a new profile ID, which also corresponds to a Field + * Vector (Extraction Sequence) entry. + */ +static enum ice_status +ice_alloc_prof_id(struct ice_hw *hw, enum ice_block blk, u8 *prof_id) +{ + enum ice_status status; + u16 res_type; + u16 get_prof; + + if (!ice_prof_id_rsrc_type(blk, &res_type)) + return ICE_ERR_PARAM; + + status = ice_alloc_hw_res(hw, res_type, 1, false, &get_prof); + if (!status) + *prof_id = (u8)get_prof; + + return status; +} + +/** + * ice_free_prof_id - free profile ID + * @hw: pointer to the HW struct + * @blk: the block from which to free the profile ID + * @prof_id: the profile ID to free + * + * This function frees a profile ID, which also corresponds to a Field Vector. + */ +static enum ice_status +ice_free_prof_id(struct ice_hw *hw, enum ice_block blk, u8 prof_id) +{ + u16 tmp_prof_id = (u16)prof_id; + u16 res_type; + + if (!ice_prof_id_rsrc_type(blk, &res_type)) + return ICE_ERR_PARAM; + + return ice_free_hw_res(hw, res_type, 1, &tmp_prof_id); +} + +/** + * ice_prof_inc_ref - increment reference count for profile + * @hw: pointer to the HW struct + * @blk: the block from which to free the profile ID + * @prof_id: the profile ID for which to increment the reference count + */ +static enum ice_status +ice_prof_inc_ref(struct ice_hw *hw, enum ice_block blk, u8 prof_id) +{ + if (prof_id > hw->blk[blk].es.count) + return ICE_ERR_PARAM; + + hw->blk[blk].es.ref_count[prof_id]++; + + return 0; +} + +/** + * ice_write_es - write an extraction sequence to hardware + * @hw: pointer to the HW struct + * @blk: the block in which to write the extraction sequence + * @prof_id: the profile ID to write + * @fv: pointer to the extraction sequence to write - NULL to clear extraction + */ +static void +ice_write_es(struct ice_hw *hw, enum ice_block blk, u8 prof_id, + struct ice_fv_word *fv) +{ + u16 off; + + off = prof_id * hw->blk[blk].es.fvw; + if (!fv) { + memset(&hw->blk[blk].es.t[off], 0, + hw->blk[blk].es.fvw * sizeof(*fv)); + hw->blk[blk].es.written[prof_id] = false; + } else { + memcpy(&hw->blk[blk].es.t[off], fv, + hw->blk[blk].es.fvw * sizeof(*fv)); + } +} + +/** + * ice_prof_dec_ref - decrement reference count for profile + * @hw: pointer to the HW struct + * @blk: the block from which to free the profile ID + * @prof_id: the profile ID for which to decrement the reference count + */ +static enum ice_status +ice_prof_dec_ref(struct ice_hw *hw, enum ice_block blk, u8 prof_id) +{ + if (prof_id > hw->blk[blk].es.count) + return ICE_ERR_PARAM; + + if (hw->blk[blk].es.ref_count[prof_id] > 0) { + if (!--hw->blk[blk].es.ref_count[prof_id]) { + ice_write_es(hw, blk, prof_id, NULL); + return ice_free_prof_id(hw, blk, prof_id); + } + } + + return 0; +} + /* Block / table section IDs */ static const u32 ice_blk_sids[ICE_BLK_COUNT][ICE_SID_OFF_COUNT] = { /* SWITCH */ @@ -1374,16 +2261,85 @@ void ice_fill_blk_tbls(struct ice_hw *hw) } /** + * ice_free_prof_map - free profile map + * @hw: pointer to the hardware structure + * @blk_idx: HW block index + */ +static void ice_free_prof_map(struct ice_hw *hw, u8 blk_idx) +{ + struct ice_es *es = &hw->blk[blk_idx].es; + struct ice_prof_map *del, *tmp; + + mutex_lock(&es->prof_map_lock); + list_for_each_entry_safe(del, tmp, &es->prof_map, list) { + list_del(&del->list); + devm_kfree(ice_hw_to_dev(hw), del); + } + INIT_LIST_HEAD(&es->prof_map); + mutex_unlock(&es->prof_map_lock); +} + +/** + * ice_free_flow_profs - free flow profile entries + * @hw: pointer to the hardware structure + * @blk_idx: HW block index + */ +static void ice_free_flow_profs(struct ice_hw *hw, u8 blk_idx) +{ + struct ice_flow_prof *p, *tmp; + + mutex_lock(&hw->fl_profs_locks[blk_idx]); + list_for_each_entry_safe(p, tmp, &hw->fl_profs[blk_idx], l_entry) { + list_del(&p->l_entry); + devm_kfree(ice_hw_to_dev(hw), p); + } + mutex_unlock(&hw->fl_profs_locks[blk_idx]); + + /* if driver is in reset and tables are being cleared + * re-initialize the flow profile list heads + */ + INIT_LIST_HEAD(&hw->fl_profs[blk_idx]); +} + +/** + * ice_free_vsig_tbl - free complete VSIG table entries + * @hw: pointer to the hardware structure + * @blk: the HW block on which to free the VSIG table entries + */ +static void ice_free_vsig_tbl(struct ice_hw *hw, enum ice_block blk) +{ + u16 i; + + if (!hw->blk[blk].xlt2.vsig_tbl) + return; + + for (i = 1; i < ICE_MAX_VSIGS; i++) + if (hw->blk[blk].xlt2.vsig_tbl[i].in_use) + ice_vsig_free(hw, blk, i); +} + +/** * ice_free_hw_tbls - free hardware table memory * @hw: pointer to the hardware structure */ void ice_free_hw_tbls(struct ice_hw *hw) { + struct ice_rss_cfg *r, *rt; u8 i; for (i = 0; i < ICE_BLK_COUNT; i++) { - hw->blk[i].is_list_init = false; + if (hw->blk[i].is_list_init) { + struct ice_es *es = &hw->blk[i].es; + + ice_free_prof_map(hw, i); + mutex_destroy(&es->prof_map_lock); + ice_free_flow_profs(hw, i); + mutex_destroy(&hw->fl_profs_locks[i]); + + hw->blk[i].is_list_init = false; + } + ice_free_vsig_tbl(hw, (enum ice_block)i); devm_kfree(ice_hw_to_dev(hw), hw->blk[i].xlt1.ptypes); devm_kfree(ice_hw_to_dev(hw), hw->blk[i].xlt1.ptg_tbl); devm_kfree(ice_hw_to_dev(hw), hw->blk[i].xlt1.t); @@ -1397,10 +2353,26 @@ void ice_free_hw_tbls(struct ice_hw *hw) devm_kfree(ice_hw_to_dev(hw), hw->blk[i].es.written); } + list_for_each_entry_safe(r, rt, &hw->rss_list_head, l_entry) { + list_del(&r->l_entry); + devm_kfree(ice_hw_to_dev(hw), r); + } + mutex_destroy(&hw->rss_locks); memset(hw->blk, 0, sizeof(hw->blk)); } /** + * ice_init_flow_profs - init flow profile locks and list heads + * @hw: pointer to the hardware structure + * @blk_idx: HW block index + */ +static void ice_init_flow_profs(struct ice_hw *hw, u8 blk_idx) +{ + mutex_init(&hw->fl_profs_locks[blk_idx]); + INIT_LIST_HEAD(&hw->fl_profs[blk_idx]); +} + +/** * ice_clear_hw_tbls - clear HW tables and flow profiles * @hw: pointer to the hardware structure */ @@ -1415,6 +2387,13 @@ void ice_clear_hw_tbls(struct ice_hw *hw) struct ice_xlt2 *xlt2 = &hw->blk[i].xlt2; struct ice_es *es = &hw->blk[i].es; + if (hw->blk[i].is_list_init) { + ice_free_prof_map(hw, i); + ice_free_flow_profs(hw, i); + } + + ice_free_vsig_tbl(hw, (enum ice_block)i); + memset(xlt1->ptypes, 0, xlt1->count * sizeof(*xlt1->ptypes)); memset(xlt1->ptg_tbl, 0, ICE_MAX_PTGS * sizeof(*xlt1->ptg_tbl)); @@ -1443,6 +2422,8 @@ enum ice_status ice_init_hw_tbls(struct ice_hw *hw) { u8 i; + mutex_init(&hw->rss_locks); + INIT_LIST_HEAD(&hw->rss_list_head); for (i = 0; i < ICE_BLK_COUNT; i++) { struct ice_prof_redir *prof_redir = &hw->blk[i].prof_redir; struct ice_prof_tcam *prof = &hw->blk[i].prof; @@ -1454,6 +2435,9 @@ enum ice_status ice_init_hw_tbls(struct ice_hw *hw) if (hw->blk[i].is_list_init) continue; + ice_init_flow_profs(hw, i); + mutex_init(&es->prof_map_lock); + INIT_LIST_HEAD(&es->prof_map); hw->blk[i].is_list_init = true; hw->blk[i].overwrite = blk_sizes[i].overwrite; @@ -1547,3 +2531,1580 @@ err: ice_free_hw_tbls(hw); return ICE_ERR_NO_MEMORY; } + +/** + * ice_prof_gen_key - generate profile ID key + * @hw: pointer to the HW struct + * @blk: the block in which to write profile ID to + * @ptg: packet type group (PTG) portion of key + * @vsig: VSIG portion of key + * @cdid: CDID portion of key + * @flags: flag portion of key + * @vl_msk: valid mask + * @dc_msk: don't care mask + * @nm_msk: never match mask + * @key: output of profile ID key + */ +static enum ice_status +ice_prof_gen_key(struct ice_hw *hw, enum ice_block blk, u8 ptg, u16 vsig, + u8 cdid, u16 flags, u8 vl_msk[ICE_TCAM_KEY_VAL_SZ], + u8 dc_msk[ICE_TCAM_KEY_VAL_SZ], u8 nm_msk[ICE_TCAM_KEY_VAL_SZ], + u8 key[ICE_TCAM_KEY_SZ]) +{ + struct ice_prof_id_key inkey; + + inkey.xlt1 = ptg; + inkey.xlt2_cdid = cpu_to_le16(vsig); + inkey.flags = cpu_to_le16(flags); + + switch (hw->blk[blk].prof.cdid_bits) { + case 0: + break; + case 2: +#define ICE_CD_2_M 0xC000U +#define ICE_CD_2_S 14 + inkey.xlt2_cdid &= ~cpu_to_le16(ICE_CD_2_M); + inkey.xlt2_cdid |= cpu_to_le16(BIT(cdid) << ICE_CD_2_S); + break; + case 4: +#define ICE_CD_4_M 0xF000U +#define ICE_CD_4_S 12 + inkey.xlt2_cdid &= ~cpu_to_le16(ICE_CD_4_M); + inkey.xlt2_cdid |= cpu_to_le16(BIT(cdid) << ICE_CD_4_S); + break; + case 8: +#define ICE_CD_8_M 0xFF00U +#define ICE_CD_8_S 16 + inkey.xlt2_cdid &= ~cpu_to_le16(ICE_CD_8_M); + inkey.xlt2_cdid |= cpu_to_le16(BIT(cdid) << ICE_CD_8_S); + break; + default: + ice_debug(hw, ICE_DBG_PKG, "Error in profile config\n"); + break; + } + + return ice_set_key(key, ICE_TCAM_KEY_SZ, (u8 *)&inkey, vl_msk, dc_msk, + nm_msk, 0, ICE_TCAM_KEY_SZ / 2); +} + +/** + * ice_tcam_write_entry - write TCAM entry + * @hw: pointer to the HW struct + * @blk: the block in which to write profile ID to + * @idx: the entry index to write to + * @prof_id: profile ID + * @ptg: packet type group (PTG) portion of key + * @vsig: VSIG portion of key + * @cdid: CDID portion of key + * @flags: flag portion of key + * @vl_msk: valid mask + * @dc_msk: don't care mask + * @nm_msk: never match mask + */ +static enum ice_status +ice_tcam_write_entry(struct ice_hw *hw, enum ice_block blk, u16 idx, + u8 prof_id, u8 ptg, u16 vsig, u8 cdid, u16 flags, + u8 vl_msk[ICE_TCAM_KEY_VAL_SZ], + u8 dc_msk[ICE_TCAM_KEY_VAL_SZ], + u8 nm_msk[ICE_TCAM_KEY_VAL_SZ]) +{ + struct ice_prof_tcam_entry; + enum ice_status status; + + status = ice_prof_gen_key(hw, blk, ptg, vsig, cdid, flags, vl_msk, + dc_msk, nm_msk, hw->blk[blk].prof.t[idx].key); + if (!status) { + hw->blk[blk].prof.t[idx].addr = cpu_to_le16(idx); + hw->blk[blk].prof.t[idx].prof_id = prof_id; + } + + return status; +} + +/** + * ice_vsig_get_ref - returns number of VSIs belong to a VSIG + * @hw: pointer to the hardware structure + * @blk: HW block + * @vsig: VSIG to query + * @refs: pointer to variable to receive the reference count + */ +static enum ice_status +ice_vsig_get_ref(struct ice_hw *hw, enum ice_block blk, u16 vsig, u16 *refs) +{ + u16 idx = vsig & ICE_VSIG_IDX_M; + struct ice_vsig_vsi *ptr; + + *refs = 0; + + if (!hw->blk[blk].xlt2.vsig_tbl[idx].in_use) + return ICE_ERR_DOES_NOT_EXIST; + + ptr = hw->blk[blk].xlt2.vsig_tbl[idx].first_vsi; + while (ptr) { + (*refs)++; + ptr = ptr->next_vsi; + } + + return 0; +} + +/** + * ice_has_prof_vsig - check to see if VSIG has a specific profile + * @hw: pointer to the hardware structure + * @blk: HW block + * @vsig: VSIG to check against + * @hdl: profile handle + */ +static bool +ice_has_prof_vsig(struct ice_hw *hw, enum ice_block blk, u16 vsig, u64 hdl) +{ + u16 idx = vsig & ICE_VSIG_IDX_M; + struct ice_vsig_prof *ent; + + list_for_each_entry(ent, &hw->blk[blk].xlt2.vsig_tbl[idx].prop_lst, + list) + if (ent->profile_cookie == hdl) + return true; + + ice_debug(hw, ICE_DBG_INIT, + "Characteristic list for VSI group %d not found.\n", + vsig); + return false; +} + +/** + * ice_prof_bld_es - build profile ID extraction sequence changes + * @hw: pointer to the HW struct + * @blk: hardware block + * @bld: the update package buffer build to add to + * @chgs: the list of changes to make in hardware + */ +static enum ice_status +ice_prof_bld_es(struct ice_hw *hw, enum ice_block blk, + struct ice_buf_build *bld, struct list_head *chgs) +{ + u16 vec_size = hw->blk[blk].es.fvw * sizeof(struct ice_fv_word); + struct ice_chs_chg *tmp; + + list_for_each_entry(tmp, chgs, list_entry) + if (tmp->type == ICE_PTG_ES_ADD && tmp->add_prof) { + u16 off = tmp->prof_id * hw->blk[blk].es.fvw; + struct ice_pkg_es *p; + u32 id; + + id = ice_sect_id(blk, ICE_VEC_TBL); + p = (struct ice_pkg_es *) + ice_pkg_buf_alloc_section(bld, id, sizeof(*p) + + vec_size - + sizeof(p->es[0])); + + if (!p) + return ICE_ERR_MAX_LIMIT; + + p->count = cpu_to_le16(1); + p->offset = cpu_to_le16(tmp->prof_id); + + memcpy(p->es, &hw->blk[blk].es.t[off], vec_size); + } + + return 0; +} + +/** + * ice_prof_bld_tcam - build profile ID TCAM changes + * @hw: pointer to the HW struct + * @blk: hardware block + * @bld: the update package buffer build to add to + * @chgs: the list of changes to make in hardware + */ +static enum ice_status +ice_prof_bld_tcam(struct ice_hw *hw, enum ice_block blk, + struct ice_buf_build *bld, struct list_head *chgs) +{ + struct ice_chs_chg *tmp; + + list_for_each_entry(tmp, chgs, list_entry) + if (tmp->type == ICE_TCAM_ADD && tmp->add_tcam_idx) { + struct ice_prof_id_section *p; + u32 id; + + id = ice_sect_id(blk, ICE_PROF_TCAM); + p = (struct ice_prof_id_section *) + ice_pkg_buf_alloc_section(bld, id, sizeof(*p)); + + if (!p) + return ICE_ERR_MAX_LIMIT; + + p->count = cpu_to_le16(1); + p->entry[0].addr = cpu_to_le16(tmp->tcam_idx); + p->entry[0].prof_id = tmp->prof_id; + + memcpy(p->entry[0].key, + &hw->blk[blk].prof.t[tmp->tcam_idx].key, + sizeof(hw->blk[blk].prof.t->key)); + } + + return 0; +} + +/** + * ice_prof_bld_xlt1 - build XLT1 changes + * @blk: hardware block + * @bld: the update package buffer build to add to + * @chgs: the list of changes to make in hardware + */ +static enum ice_status +ice_prof_bld_xlt1(enum ice_block blk, struct ice_buf_build *bld, + struct list_head *chgs) +{ + struct ice_chs_chg *tmp; + + list_for_each_entry(tmp, chgs, list_entry) + if (tmp->type == ICE_PTG_ES_ADD && tmp->add_ptg) { + struct ice_xlt1_section *p; + u32 id; + + id = ice_sect_id(blk, ICE_XLT1); + p = (struct ice_xlt1_section *) + ice_pkg_buf_alloc_section(bld, id, sizeof(*p)); + + if (!p) + return ICE_ERR_MAX_LIMIT; + + p->count = cpu_to_le16(1); + p->offset = cpu_to_le16(tmp->ptype); + p->value[0] = tmp->ptg; + } + + return 0; +} + +/** + * ice_prof_bld_xlt2 - build XLT2 changes + * @blk: hardware block + * @bld: the update package buffer build to add to + * @chgs: the list of changes to make in hardware + */ +static enum ice_status +ice_prof_bld_xlt2(enum ice_block blk, struct ice_buf_build *bld, + struct list_head *chgs) +{ + struct ice_chs_chg *tmp; + + list_for_each_entry(tmp, chgs, list_entry) { + struct ice_xlt2_section *p; + u32 id; + + switch (tmp->type) { + case ICE_VSIG_ADD: + case ICE_VSI_MOVE: + case ICE_VSIG_REM: + id = ice_sect_id(blk, ICE_XLT2); + p = (struct ice_xlt2_section *) + ice_pkg_buf_alloc_section(bld, id, sizeof(*p)); + + if (!p) + return ICE_ERR_MAX_LIMIT; + + p->count = cpu_to_le16(1); + p->offset = cpu_to_le16(tmp->vsi); + p->value[0] = cpu_to_le16(tmp->vsig); + break; + default: + break; + } + } + + return 0; +} + +/** + * ice_upd_prof_hw - update hardware using the change list + * @hw: pointer to the HW struct + * @blk: hardware block + * @chgs: the list of changes to make in hardware + */ +static enum ice_status +ice_upd_prof_hw(struct ice_hw *hw, enum ice_block blk, + struct list_head *chgs) +{ + struct ice_buf_build *b; + struct ice_chs_chg *tmp; + enum ice_status status; + u16 pkg_sects; + u16 xlt1 = 0; + u16 xlt2 = 0; + u16 tcam = 0; + u16 es = 0; + u16 sects; + + /* count number of sections we need */ + list_for_each_entry(tmp, chgs, list_entry) { + switch (tmp->type) { + case ICE_PTG_ES_ADD: + if (tmp->add_ptg) + xlt1++; + if (tmp->add_prof) + es++; + break; + case ICE_TCAM_ADD: + tcam++; + break; + case ICE_VSIG_ADD: + case ICE_VSI_MOVE: + case ICE_VSIG_REM: + xlt2++; + break; + default: + break; + } + } + sects = xlt1 + xlt2 + tcam + es; + + if (!sects) + return 0; + + /* Build update package buffer */ + b = ice_pkg_buf_alloc(hw); + if (!b) + return ICE_ERR_NO_MEMORY; + + status = ice_pkg_buf_reserve_section(b, sects); + if (status) + goto error_tmp; + + /* Preserve order of table update: ES, TCAM, PTG, VSIG */ + if (es) { + status = ice_prof_bld_es(hw, blk, b, chgs); + if (status) + goto error_tmp; + } + + if (tcam) { + status = ice_prof_bld_tcam(hw, blk, b, chgs); + if (status) + goto error_tmp; + } + + if (xlt1) { + status = ice_prof_bld_xlt1(blk, b, chgs); + if (status) + goto error_tmp; + } + + if (xlt2) { + status = ice_prof_bld_xlt2(blk, b, chgs); + if (status) + goto error_tmp; + } + + /* After package buffer build check if the section count in buffer is + * non-zero and matches the number of sections detected for package + * update. + */ + pkg_sects = ice_pkg_buf_get_active_sections(b); + if (!pkg_sects || pkg_sects != sects) { + status = ICE_ERR_INVAL_SIZE; + goto error_tmp; + } + + /* update package */ + status = ice_update_pkg(hw, ice_pkg_buf(b), 1); + if (status == ICE_ERR_AQ_ERROR) + ice_debug(hw, ICE_DBG_INIT, "Unable to update HW profile\n"); + +error_tmp: + ice_pkg_buf_free(hw, b); + return status; +} + +/** + * ice_add_prof - add profile + * @hw: pointer to the HW struct + * @blk: hardware block + * @id: profile tracking ID + * @ptypes: array of bitmaps indicating ptypes (ICE_FLOW_PTYPE_MAX bits) + * @es: extraction sequence (length of array is determined by the block) + * + * This function registers a profile, which matches a set of PTGs with a + * particular extraction sequence. While the hardware profile is allocated + * it will not be written until the first call to ice_add_flow that specifies + * the ID value used here. + */ +enum ice_status +ice_add_prof(struct ice_hw *hw, enum ice_block blk, u64 id, u8 ptypes[], + struct ice_fv_word *es) +{ + u32 bytes = DIV_ROUND_UP(ICE_FLOW_PTYPE_MAX, BITS_PER_BYTE); + DECLARE_BITMAP(ptgs_used, ICE_XLT1_CNT); + struct ice_prof_map *prof; + enum ice_status status; + u32 byte = 0; + u8 prof_id; + + bitmap_zero(ptgs_used, ICE_XLT1_CNT); + + mutex_lock(&hw->blk[blk].es.prof_map_lock); + + /* search for existing profile */ + status = ice_find_prof_id(hw, blk, es, &prof_id); + if (status) { + /* allocate profile ID */ + status = ice_alloc_prof_id(hw, blk, &prof_id); + if (status) + goto err_ice_add_prof; + + /* and write new es */ + ice_write_es(hw, blk, prof_id, es); + } + + ice_prof_inc_ref(hw, blk, prof_id); + + /* add profile info */ + prof = devm_kzalloc(ice_hw_to_dev(hw), sizeof(*prof), GFP_KERNEL); + if (!prof) + goto err_ice_add_prof; + + prof->profile_cookie = id; + prof->prof_id = prof_id; + prof->ptg_cnt = 0; + prof->context = 0; + + /* build list of ptgs */ + while (bytes && prof->ptg_cnt < ICE_MAX_PTG_PER_PROFILE) { + u32 bit; + + if (!ptypes[byte]) { + bytes--; + byte++; + continue; + } + + /* Examine 8 bits per byte */ + for_each_set_bit(bit, (unsigned long *)&ptypes[byte], + BITS_PER_BYTE) { + u16 ptype; + u8 ptg; + u8 m; + + ptype = byte * BITS_PER_BYTE + bit; + + /* The package should place all ptypes in a non-zero + * PTG, so the following call should never fail. + */ + if (ice_ptg_find_ptype(hw, blk, ptype, &ptg)) + continue; + + /* If PTG is already added, skip and continue */ + if (test_bit(ptg, ptgs_used)) + continue; + + set_bit(ptg, ptgs_used); + prof->ptg[prof->ptg_cnt] = ptg; + + if (++prof->ptg_cnt >= ICE_MAX_PTG_PER_PROFILE) + break; + + /* nothing left in byte, then exit */ + m = ~((1 << (bit + 1)) - 1); + if (!(ptypes[byte] & m)) + break; + } + + bytes--; + byte++; + } + + list_add(&prof->list, &hw->blk[blk].es.prof_map); + status = 0; + +err_ice_add_prof: + mutex_unlock(&hw->blk[blk].es.prof_map_lock); + return status; +} + +/** + * ice_search_prof_id_low - Search for a profile tracking ID low level + * @hw: pointer to the HW struct + * @blk: hardware block + * @id: profile tracking ID + * + * This will search for a profile tracking ID which was previously added. This + * version assumes that the caller has already acquired the prof map lock. + */ +static struct ice_prof_map * +ice_search_prof_id_low(struct ice_hw *hw, enum ice_block blk, u64 id) +{ + struct ice_prof_map *entry = NULL; + struct ice_prof_map *map; + + list_for_each_entry(map, &hw->blk[blk].es.prof_map, list) + if (map->profile_cookie == id) { + entry = map; + break; + } + + return entry; +} + +/** + * ice_search_prof_id - Search for a profile tracking ID + * @hw: pointer to the HW struct + * @blk: hardware block + * @id: profile tracking ID + * + * This will search for a profile tracking ID which was previously added. + */ +static struct ice_prof_map * +ice_search_prof_id(struct ice_hw *hw, enum ice_block blk, u64 id) +{ + struct ice_prof_map *entry; + + mutex_lock(&hw->blk[blk].es.prof_map_lock); + entry = ice_search_prof_id_low(hw, blk, id); + mutex_unlock(&hw->blk[blk].es.prof_map_lock); + + return entry; +} + +/** + * ice_vsig_prof_id_count - count profiles in a VSIG + * @hw: pointer to the HW struct + * @blk: hardware block + * @vsig: VSIG to remove the profile from + */ +static u16 +ice_vsig_prof_id_count(struct ice_hw *hw, enum ice_block blk, u16 vsig) +{ + u16 idx = vsig & ICE_VSIG_IDX_M, count = 0; + struct ice_vsig_prof *p; + + list_for_each_entry(p, &hw->blk[blk].xlt2.vsig_tbl[idx].prop_lst, + list) + count++; + + return count; +} + +/** + * ice_rel_tcam_idx - release a TCAM index + * @hw: pointer to the HW struct + * @blk: hardware block + * @idx: the index to release + */ +static enum ice_status +ice_rel_tcam_idx(struct ice_hw *hw, enum ice_block blk, u16 idx) +{ + /* Masks to invoke a never match entry */ + u8 vl_msk[ICE_TCAM_KEY_VAL_SZ] = { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF }; + u8 dc_msk[ICE_TCAM_KEY_VAL_SZ] = { 0xFE, 0xFF, 0xFF, 0xFF, 0xFF }; + u8 nm_msk[ICE_TCAM_KEY_VAL_SZ] = { 0x01, 0x00, 0x00, 0x00, 0x00 }; + enum ice_status status; + + /* write the TCAM entry */ + status = ice_tcam_write_entry(hw, blk, idx, 0, 0, 0, 0, 0, vl_msk, + dc_msk, nm_msk); + if (status) + return status; + + /* release the TCAM entry */ + status = ice_free_tcam_ent(hw, blk, idx); + + return status; +} + +/** + * ice_rem_prof_id - remove one profile from a VSIG + * @hw: pointer to the HW struct + * @blk: hardware block + * @prof: pointer to profile structure to remove + */ +static enum ice_status +ice_rem_prof_id(struct ice_hw *hw, enum ice_block blk, + struct ice_vsig_prof *prof) +{ + enum ice_status status; + u16 i; + + for (i = 0; i < prof->tcam_count; i++) + if (prof->tcam[i].in_use) { + prof->tcam[i].in_use = false; + status = ice_rel_tcam_idx(hw, blk, + prof->tcam[i].tcam_idx); + if (status) + return ICE_ERR_HW_TABLE; + } + + return 0; +} + +/** + * ice_rem_vsig - remove VSIG + * @hw: pointer to the HW struct + * @blk: hardware block + * @vsig: the VSIG to remove + * @chg: the change list + */ +static enum ice_status +ice_rem_vsig(struct ice_hw *hw, enum ice_block blk, u16 vsig, + struct list_head *chg) +{ + u16 idx = vsig & ICE_VSIG_IDX_M; + struct ice_vsig_vsi *vsi_cur; + struct ice_vsig_prof *d, *t; + enum ice_status status; + + /* remove TCAM entries */ + list_for_each_entry_safe(d, t, + &hw->blk[blk].xlt2.vsig_tbl[idx].prop_lst, + list) { + status = ice_rem_prof_id(hw, blk, d); + if (status) + return status; + + list_del(&d->list); + devm_kfree(ice_hw_to_dev(hw), d); + } + + /* Move all VSIS associated with this VSIG to the default VSIG */ + vsi_cur = hw->blk[blk].xlt2.vsig_tbl[idx].first_vsi; + /* If the VSIG has at least 1 VSI then iterate through the list + * and remove the VSIs before deleting the group. + */ + if (vsi_cur) + do { + struct ice_vsig_vsi *tmp = vsi_cur->next_vsi; + struct ice_chs_chg *p; + + p = devm_kzalloc(ice_hw_to_dev(hw), sizeof(*p), + GFP_KERNEL); + if (!p) + return ICE_ERR_NO_MEMORY; + + p->type = ICE_VSIG_REM; + p->orig_vsig = vsig; + p->vsig = ICE_DEFAULT_VSIG; + p->vsi = vsi_cur - hw->blk[blk].xlt2.vsis; + + list_add(&p->list_entry, chg); + + vsi_cur = tmp; + } while (vsi_cur); + + return ice_vsig_free(hw, blk, vsig); +} + +/** + * ice_rem_prof_id_vsig - remove a specific profile from a VSIG + * @hw: pointer to the HW struct + * @blk: hardware block + * @vsig: VSIG to remove the profile from + * @hdl: profile handle indicating which profile to remove + * @chg: list to receive a record of changes + */ +static enum ice_status +ice_rem_prof_id_vsig(struct ice_hw *hw, enum ice_block blk, u16 vsig, u64 hdl, + struct list_head *chg) +{ + u16 idx = vsig & ICE_VSIG_IDX_M; + struct ice_vsig_prof *p, *t; + enum ice_status status; + + list_for_each_entry_safe(p, t, + &hw->blk[blk].xlt2.vsig_tbl[idx].prop_lst, + list) + if (p->profile_cookie == hdl) { + if (ice_vsig_prof_id_count(hw, blk, vsig) == 1) + /* this is the last profile, remove the VSIG */ + return ice_rem_vsig(hw, blk, vsig, chg); + + status = ice_rem_prof_id(hw, blk, p); + if (!status) { + list_del(&p->list); + devm_kfree(ice_hw_to_dev(hw), p); + } + return status; + } + + return ICE_ERR_DOES_NOT_EXIST; +} + +/** + * ice_rem_flow_all - remove all flows with a particular profile + * @hw: pointer to the HW struct + * @blk: hardware block + * @id: profile tracking ID + */ +static enum ice_status +ice_rem_flow_all(struct ice_hw *hw, enum ice_block blk, u64 id) +{ + struct ice_chs_chg *del, *tmp; + enum ice_status status; + struct list_head chg; + u16 i; + + INIT_LIST_HEAD(&chg); + + for (i = 1; i < ICE_MAX_VSIGS; i++) + if (hw->blk[blk].xlt2.vsig_tbl[i].in_use) { + if (ice_has_prof_vsig(hw, blk, i, id)) { + status = ice_rem_prof_id_vsig(hw, blk, i, id, + &chg); + if (status) + goto err_ice_rem_flow_all; + } + } + + status = ice_upd_prof_hw(hw, blk, &chg); + +err_ice_rem_flow_all: + list_for_each_entry_safe(del, tmp, &chg, list_entry) { + list_del(&del->list_entry); + devm_kfree(ice_hw_to_dev(hw), del); + } + + return status; +} + +/** + * ice_rem_prof - remove profile + * @hw: pointer to the HW struct + * @blk: hardware block + * @id: profile tracking ID + * + * This will remove the profile specified by the ID parameter, which was + * previously created through ice_add_prof. If any existing entries + * are associated with this profile, they will be removed as well. + */ +enum ice_status ice_rem_prof(struct ice_hw *hw, enum ice_block blk, u64 id) +{ + struct ice_prof_map *pmap; + enum ice_status status; + + mutex_lock(&hw->blk[blk].es.prof_map_lock); + + pmap = ice_search_prof_id_low(hw, blk, id); + if (!pmap) { + status = ICE_ERR_DOES_NOT_EXIST; + goto err_ice_rem_prof; + } + + /* remove all flows with this profile */ + status = ice_rem_flow_all(hw, blk, pmap->profile_cookie); + if (status) + goto err_ice_rem_prof; + + /* dereference profile, and possibly remove */ + ice_prof_dec_ref(hw, blk, pmap->prof_id); + + list_del(&pmap->list); + devm_kfree(ice_hw_to_dev(hw), pmap); + +err_ice_rem_prof: + mutex_unlock(&hw->blk[blk].es.prof_map_lock); + return status; +} + +/** + * ice_get_prof - get profile + * @hw: pointer to the HW struct + * @blk: hardware block + * @hdl: profile handle + * @chg: change list + */ +static enum ice_status +ice_get_prof(struct ice_hw *hw, enum ice_block blk, u64 hdl, + struct list_head *chg) +{ + struct ice_prof_map *map; + struct ice_chs_chg *p; + u16 i; + + /* Get the details on the profile specified by the handle ID */ + map = ice_search_prof_id(hw, blk, hdl); + if (!map) + return ICE_ERR_DOES_NOT_EXIST; + + for (i = 0; i < map->ptg_cnt; i++) + if (!hw->blk[blk].es.written[map->prof_id]) { + /* add ES to change list */ + p = devm_kzalloc(ice_hw_to_dev(hw), sizeof(*p), + GFP_KERNEL); + if (!p) + goto err_ice_get_prof; + + p->type = ICE_PTG_ES_ADD; + p->ptype = 0; + p->ptg = map->ptg[i]; + p->add_ptg = 0; + + p->add_prof = 1; + p->prof_id = map->prof_id; + + hw->blk[blk].es.written[map->prof_id] = true; + + list_add(&p->list_entry, chg); + } + + return 0; + +err_ice_get_prof: + /* let caller clean up the change list */ + return ICE_ERR_NO_MEMORY; +} + +/** + * ice_get_profs_vsig - get a copy of the list of profiles from a VSIG + * @hw: pointer to the HW struct + * @blk: hardware block + * @vsig: VSIG from which to copy the list + * @lst: output list + * + * This routine makes a copy of the list of profiles in the specified VSIG. + */ +static enum ice_status +ice_get_profs_vsig(struct ice_hw *hw, enum ice_block blk, u16 vsig, + struct list_head *lst) +{ + struct ice_vsig_prof *ent1, *ent2; + u16 idx = vsig & ICE_VSIG_IDX_M; + + list_for_each_entry(ent1, &hw->blk[blk].xlt2.vsig_tbl[idx].prop_lst, + list) { + struct ice_vsig_prof *p; + + /* copy to the input list */ + p = devm_kmemdup(ice_hw_to_dev(hw), ent1, sizeof(*p), + GFP_KERNEL); + if (!p) + goto err_ice_get_profs_vsig; + + list_add_tail(&p->list, lst); + } + + return 0; + +err_ice_get_profs_vsig: + list_for_each_entry_safe(ent1, ent2, lst, list) { + list_del(&ent1->list); + devm_kfree(ice_hw_to_dev(hw), ent1); + } + + return ICE_ERR_NO_MEMORY; +} + +/** + * ice_add_prof_to_lst - add profile entry to a list + * @hw: pointer to the HW struct + * @blk: hardware block + * @lst: the list to be added to + * @hdl: profile handle of entry to add + */ +static enum ice_status +ice_add_prof_to_lst(struct ice_hw *hw, enum ice_block blk, + struct list_head *lst, u64 hdl) +{ + struct ice_prof_map *map; + struct ice_vsig_prof *p; + u16 i; + + map = ice_search_prof_id(hw, blk, hdl); + if (!map) + return ICE_ERR_DOES_NOT_EXIST; + + p = devm_kzalloc(ice_hw_to_dev(hw), sizeof(*p), GFP_KERNEL); + if (!p) + return ICE_ERR_NO_MEMORY; + + p->profile_cookie = map->profile_cookie; + p->prof_id = map->prof_id; + p->tcam_count = map->ptg_cnt; + + for (i = 0; i < map->ptg_cnt; i++) { + p->tcam[i].prof_id = map->prof_id; + p->tcam[i].tcam_idx = ICE_INVALID_TCAM; + p->tcam[i].ptg = map->ptg[i]; + } + + list_add(&p->list, lst); + + return 0; +} + +/** + * ice_move_vsi - move VSI to another VSIG + * @hw: pointer to the HW struct + * @blk: hardware block + * @vsi: the VSI to move + * @vsig: the VSIG to move the VSI to + * @chg: the change list + */ +static enum ice_status +ice_move_vsi(struct ice_hw *hw, enum ice_block blk, u16 vsi, u16 vsig, + struct list_head *chg) +{ + enum ice_status status; + struct ice_chs_chg *p; + u16 orig_vsig; + + p = devm_kzalloc(ice_hw_to_dev(hw), sizeof(*p), GFP_KERNEL); + if (!p) + return ICE_ERR_NO_MEMORY; + + status = ice_vsig_find_vsi(hw, blk, vsi, &orig_vsig); + if (!status) + status = ice_vsig_add_mv_vsi(hw, blk, vsi, vsig); + + if (status) { + devm_kfree(ice_hw_to_dev(hw), p); + return status; + } + + p->type = ICE_VSI_MOVE; + p->vsi = vsi; + p->orig_vsig = orig_vsig; + p->vsig = vsig; + + list_add(&p->list_entry, chg); + + return 0; +} + +/** + * ice_prof_tcam_ena_dis - add enable or disable TCAM change + * @hw: pointer to the HW struct + * @blk: hardware block + * @enable: true to enable, false to disable + * @vsig: the VSIG of the TCAM entry + * @tcam: pointer the TCAM info structure of the TCAM to disable + * @chg: the change list + * + * This function appends an enable or disable TCAM entry in the change log + */ +static enum ice_status +ice_prof_tcam_ena_dis(struct ice_hw *hw, enum ice_block blk, bool enable, + u16 vsig, struct ice_tcam_inf *tcam, + struct list_head *chg) +{ + enum ice_status status; + struct ice_chs_chg *p; + + /* Default: enable means change the low flag bit to don't care */ + u8 dc_msk[ICE_TCAM_KEY_VAL_SZ] = { 0x01, 0x00, 0x00, 0x00, 0x00 }; + u8 nm_msk[ICE_TCAM_KEY_VAL_SZ] = { 0x00, 0x00, 0x00, 0x00, 0x00 }; + u8 vl_msk[ICE_TCAM_KEY_VAL_SZ] = { 0x01, 0x00, 0x00, 0x00, 0x00 }; + + /* if disabling, free the TCAM */ + if (!enable) { + status = ice_free_tcam_ent(hw, blk, tcam->tcam_idx); + tcam->tcam_idx = 0; + tcam->in_use = 0; + return status; + } + + /* for re-enabling, reallocate a TCAM */ + status = ice_alloc_tcam_ent(hw, blk, &tcam->tcam_idx); + if (status) + return status; + + /* add TCAM to change list */ + p = devm_kzalloc(ice_hw_to_dev(hw), sizeof(*p), GFP_KERNEL); + if (!p) + return ICE_ERR_NO_MEMORY; + + status = ice_tcam_write_entry(hw, blk, tcam->tcam_idx, tcam->prof_id, + tcam->ptg, vsig, 0, 0, vl_msk, dc_msk, + nm_msk); + if (status) + goto err_ice_prof_tcam_ena_dis; + + tcam->in_use = 1; + + p->type = ICE_TCAM_ADD; + p->add_tcam_idx = true; + p->prof_id = tcam->prof_id; + p->ptg = tcam->ptg; + p->vsig = 0; + p->tcam_idx = tcam->tcam_idx; + + /* log change */ + list_add(&p->list_entry, chg); + + return 0; + +err_ice_prof_tcam_ena_dis: + devm_kfree(ice_hw_to_dev(hw), p); + return status; +} + +/** + * ice_adj_prof_priorities - adjust profile based on priorities + * @hw: pointer to the HW struct + * @blk: hardware block + * @vsig: the VSIG for which to adjust profile priorities + * @chg: the change list + */ +static enum ice_status +ice_adj_prof_priorities(struct ice_hw *hw, enum ice_block blk, u16 vsig, + struct list_head *chg) +{ + DECLARE_BITMAP(ptgs_used, ICE_XLT1_CNT); + struct ice_vsig_prof *t; + enum ice_status status; + u16 idx; + + bitmap_zero(ptgs_used, ICE_XLT1_CNT); + idx = vsig & ICE_VSIG_IDX_M; + + /* Priority is based on the order in which the profiles are added. The + * newest added profile has highest priority and the oldest added + * profile has the lowest priority. Since the profile property list for + * a VSIG is sorted from newest to oldest, this code traverses the list + * in order and enables the first of each PTG that it finds (that is not + * already enabled); it also disables any duplicate PTGs that it finds + * in the older profiles (that are currently enabled). + */ + + list_for_each_entry(t, &hw->blk[blk].xlt2.vsig_tbl[idx].prop_lst, + list) { + u16 i; + + for (i = 0; i < t->tcam_count; i++) { + /* Scan the priorities from newest to oldest. + * Make sure that the newest profiles take priority. + */ + if (test_bit(t->tcam[i].ptg, ptgs_used) && + t->tcam[i].in_use) { + /* need to mark this PTG as never match, as it + * was already in use and therefore duplicate + * (and lower priority) + */ + status = ice_prof_tcam_ena_dis(hw, blk, false, + vsig, + &t->tcam[i], + chg); + if (status) + return status; + } else if (!test_bit(t->tcam[i].ptg, ptgs_used) && + !t->tcam[i].in_use) { + /* need to enable this PTG, as it in not in use + * and not enabled (highest priority) + */ + status = ice_prof_tcam_ena_dis(hw, blk, true, + vsig, + &t->tcam[i], + chg); + if (status) + return status; + } + + /* keep track of used ptgs */ + set_bit(t->tcam[i].ptg, ptgs_used); + } + } + + return 0; +} + +/** + * ice_add_prof_id_vsig - add profile to VSIG + * @hw: pointer to the HW struct + * @blk: hardware block + * @vsig: the VSIG to which this profile is to be added + * @hdl: the profile handle indicating the profile to add + * @chg: the change list + */ +static enum ice_status +ice_add_prof_id_vsig(struct ice_hw *hw, enum ice_block blk, u16 vsig, u64 hdl, + struct list_head *chg) +{ + /* Masks that ignore flags */ + u8 vl_msk[ICE_TCAM_KEY_VAL_SZ] = { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF }; + u8 dc_msk[ICE_TCAM_KEY_VAL_SZ] = { 0xFF, 0xFF, 0x00, 0x00, 0x00 }; + u8 nm_msk[ICE_TCAM_KEY_VAL_SZ] = { 0x00, 0x00, 0x00, 0x00, 0x00 }; + struct ice_prof_map *map; + struct ice_vsig_prof *t; + struct ice_chs_chg *p; + u16 i; + + /* Get the details on the profile specified by the handle ID */ + map = ice_search_prof_id(hw, blk, hdl); + if (!map) + return ICE_ERR_DOES_NOT_EXIST; + + /* Error, if this VSIG already has this profile */ + if (ice_has_prof_vsig(hw, blk, vsig, hdl)) + return ICE_ERR_ALREADY_EXISTS; + + /* new VSIG profile structure */ + t = devm_kzalloc(ice_hw_to_dev(hw), sizeof(*t), GFP_KERNEL); + if (!t) + return ICE_ERR_NO_MEMORY; + + t->profile_cookie = map->profile_cookie; + t->prof_id = map->prof_id; + t->tcam_count = map->ptg_cnt; + + /* create TCAM entries */ + for (i = 0; i < map->ptg_cnt; i++) { + enum ice_status status; + u16 tcam_idx; + + /* add TCAM to change list */ + p = devm_kzalloc(ice_hw_to_dev(hw), sizeof(*p), GFP_KERNEL); + if (!p) + goto err_ice_add_prof_id_vsig; + + /* allocate the TCAM entry index */ + status = ice_alloc_tcam_ent(hw, blk, &tcam_idx); + if (status) { + devm_kfree(ice_hw_to_dev(hw), p); + goto err_ice_add_prof_id_vsig; + } + + t->tcam[i].ptg = map->ptg[i]; + t->tcam[i].prof_id = map->prof_id; + t->tcam[i].tcam_idx = tcam_idx; + t->tcam[i].in_use = true; + + p->type = ICE_TCAM_ADD; + p->add_tcam_idx = true; + p->prof_id = t->tcam[i].prof_id; + p->ptg = t->tcam[i].ptg; + p->vsig = vsig; + p->tcam_idx = t->tcam[i].tcam_idx; + + /* write the TCAM entry */ + status = ice_tcam_write_entry(hw, blk, t->tcam[i].tcam_idx, + t->tcam[i].prof_id, + t->tcam[i].ptg, vsig, 0, 0, + vl_msk, dc_msk, nm_msk); + if (status) + goto err_ice_add_prof_id_vsig; + + /* log change */ + list_add(&p->list_entry, chg); + } + + /* add profile to VSIG */ + list_add(&t->list, + &hw->blk[blk].xlt2.vsig_tbl[(vsig & ICE_VSIG_IDX_M)].prop_lst); + + return 0; + +err_ice_add_prof_id_vsig: + /* let caller clean up the change list */ + devm_kfree(ice_hw_to_dev(hw), t); + return ICE_ERR_NO_MEMORY; +} + +/** + * ice_create_prof_id_vsig - add a new VSIG with a single profile + * @hw: pointer to the HW struct + * @blk: hardware block + * @vsi: the initial VSI that will be in VSIG + * @hdl: the profile handle of the profile that will be added to the VSIG + * @chg: the change list + */ +static enum ice_status +ice_create_prof_id_vsig(struct ice_hw *hw, enum ice_block blk, u16 vsi, u64 hdl, + struct list_head *chg) +{ + enum ice_status status; + struct ice_chs_chg *p; + u16 new_vsig; + + p = devm_kzalloc(ice_hw_to_dev(hw), sizeof(*p), GFP_KERNEL); + if (!p) + return ICE_ERR_NO_MEMORY; + + new_vsig = ice_vsig_alloc(hw, blk); + if (!new_vsig) { + status = ICE_ERR_HW_TABLE; + goto err_ice_create_prof_id_vsig; + } + + status = ice_move_vsi(hw, blk, vsi, new_vsig, chg); + if (status) + goto err_ice_create_prof_id_vsig; + + status = ice_add_prof_id_vsig(hw, blk, new_vsig, hdl, chg); + if (status) + goto err_ice_create_prof_id_vsig; + + p->type = ICE_VSIG_ADD; + p->vsi = vsi; + p->orig_vsig = ICE_DEFAULT_VSIG; + p->vsig = new_vsig; + + list_add(&p->list_entry, chg); + + return 0; + +err_ice_create_prof_id_vsig: + /* let caller clean up the change list */ + devm_kfree(ice_hw_to_dev(hw), p); + return status; +} + +/** + * ice_create_vsig_from_lst - create a new VSIG with a list of profiles + * @hw: pointer to the HW struct + * @blk: hardware block + * @vsi: the initial VSI that will be in VSIG + * @lst: the list of profile that will be added to the VSIG + * @chg: the change list + */ +static enum ice_status +ice_create_vsig_from_lst(struct ice_hw *hw, enum ice_block blk, u16 vsi, + struct list_head *lst, struct list_head *chg) +{ + struct ice_vsig_prof *t; + enum ice_status status; + u16 vsig; + + vsig = ice_vsig_alloc(hw, blk); + if (!vsig) + return ICE_ERR_HW_TABLE; + + status = ice_move_vsi(hw, blk, vsi, vsig, chg); + if (status) + return status; + + list_for_each_entry(t, lst, list) { + status = ice_add_prof_id_vsig(hw, blk, vsig, t->profile_cookie, + chg); + if (status) + return status; + } + + return 0; +} + +/** + * ice_find_prof_vsig - find a VSIG with a specific profile handle + * @hw: pointer to the HW struct + * @blk: hardware block + * @hdl: the profile handle of the profile to search for + * @vsig: returns the VSIG with the matching profile + */ +static bool +ice_find_prof_vsig(struct ice_hw *hw, enum ice_block blk, u64 hdl, u16 *vsig) +{ + struct ice_vsig_prof *t; + enum ice_status status; + struct list_head lst; + + INIT_LIST_HEAD(&lst); + + t = kzalloc(sizeof(*t), GFP_KERNEL); + if (!t) + return false; + + t->profile_cookie = hdl; + list_add(&t->list, &lst); + + status = ice_find_dup_props_vsig(hw, blk, &lst, vsig); + + list_del(&t->list); + kfree(t); + + return !status; +} + +/** + * ice_add_prof_id_flow - add profile flow + * @hw: pointer to the HW struct + * @blk: hardware block + * @vsi: the VSI to enable with the profile specified by ID + * @hdl: profile handle + * + * Calling this function will update the hardware tables to enable the + * profile indicated by the ID parameter for the VSIs specified in the VSI + * array. Once successfully called, the flow will be enabled. + */ +enum ice_status +ice_add_prof_id_flow(struct ice_hw *hw, enum ice_block blk, u16 vsi, u64 hdl) +{ + struct ice_vsig_prof *tmp1, *del1; + struct ice_chs_chg *tmp, *del; + struct list_head union_lst; + enum ice_status status; + struct list_head chg; + u16 vsig; + + INIT_LIST_HEAD(&union_lst); + INIT_LIST_HEAD(&chg); + + /* Get profile */ + status = ice_get_prof(hw, blk, hdl, &chg); + if (status) + return status; + + /* determine if VSI is already part of a VSIG */ + status = ice_vsig_find_vsi(hw, blk, vsi, &vsig); + if (!status && vsig) { + bool only_vsi; + u16 or_vsig; + u16 ref; + + /* found in VSIG */ + or_vsig = vsig; + + /* make sure that there is no overlap/conflict between the new + * characteristics and the existing ones; we don't support that + * scenario + */ + if (ice_has_prof_vsig(hw, blk, vsig, hdl)) { + status = ICE_ERR_ALREADY_EXISTS; + goto err_ice_add_prof_id_flow; + } + + /* last VSI in the VSIG? */ + status = ice_vsig_get_ref(hw, blk, vsig, &ref); + if (status) + goto err_ice_add_prof_id_flow; + only_vsi = (ref == 1); + + /* create a union of the current profiles and the one being + * added + */ + status = ice_get_profs_vsig(hw, blk, vsig, &union_lst); + if (status) + goto err_ice_add_prof_id_flow; + + status = ice_add_prof_to_lst(hw, blk, &union_lst, hdl); + if (status) + goto err_ice_add_prof_id_flow; + + /* search for an existing VSIG with an exact charc match */ + status = ice_find_dup_props_vsig(hw, blk, &union_lst, &vsig); + if (!status) { + /* move VSI to the VSIG that matches */ + status = ice_move_vsi(hw, blk, vsi, vsig, &chg); + if (status) + goto err_ice_add_prof_id_flow; + + /* VSI has been moved out of or_vsig. If the or_vsig had + * only that VSI it is now empty and can be removed. + */ + if (only_vsi) { + status = ice_rem_vsig(hw, blk, or_vsig, &chg); + if (status) + goto err_ice_add_prof_id_flow; + } + } else if (only_vsi) { + /* If the original VSIG only contains one VSI, then it + * will be the requesting VSI. In this case the VSI is + * not sharing entries and we can simply add the new + * profile to the VSIG. + */ + status = ice_add_prof_id_vsig(hw, blk, vsig, hdl, &chg); + if (status) + goto err_ice_add_prof_id_flow; + + /* Adjust priorities */ + status = ice_adj_prof_priorities(hw, blk, vsig, &chg); + if (status) + goto err_ice_add_prof_id_flow; + } else { + /* No match, so we need a new VSIG */ + status = ice_create_vsig_from_lst(hw, blk, vsi, + &union_lst, &chg); + if (status) + goto err_ice_add_prof_id_flow; + + /* Adjust priorities */ + status = ice_adj_prof_priorities(hw, blk, vsig, &chg); + if (status) + goto err_ice_add_prof_id_flow; + } + } else { + /* need to find or add a VSIG */ + /* search for an existing VSIG with an exact charc match */ + if (ice_find_prof_vsig(hw, blk, hdl, &vsig)) { + /* found an exact match */ + /* add or move VSI to the VSIG that matches */ + status = ice_move_vsi(hw, blk, vsi, vsig, &chg); + if (status) + goto err_ice_add_prof_id_flow; + } else { + /* we did not find an exact match */ + /* we need to add a VSIG */ + status = ice_create_prof_id_vsig(hw, blk, vsi, hdl, + &chg); + if (status) + goto err_ice_add_prof_id_flow; + } + } + + /* update hardware */ + if (!status) + status = ice_upd_prof_hw(hw, blk, &chg); + +err_ice_add_prof_id_flow: + list_for_each_entry_safe(del, tmp, &chg, list_entry) { + list_del(&del->list_entry); + devm_kfree(ice_hw_to_dev(hw), del); + } + + list_for_each_entry_safe(del1, tmp1, &union_lst, list) { + list_del(&del1->list); + devm_kfree(ice_hw_to_dev(hw), del1); + } + + return status; +} + +/** + * ice_rem_prof_from_list - remove a profile from list + * @hw: pointer to the HW struct + * @lst: list to remove the profile from + * @hdl: the profile handle indicating the profile to remove + */ +static enum ice_status +ice_rem_prof_from_list(struct ice_hw *hw, struct list_head *lst, u64 hdl) +{ + struct ice_vsig_prof *ent, *tmp; + + list_for_each_entry_safe(ent, tmp, lst, list) + if (ent->profile_cookie == hdl) { + list_del(&ent->list); + devm_kfree(ice_hw_to_dev(hw), ent); + return 0; + } + + return ICE_ERR_DOES_NOT_EXIST; +} + +/** + * ice_rem_prof_id_flow - remove flow + * @hw: pointer to the HW struct + * @blk: hardware block + * @vsi: the VSI from which to remove the profile specified by ID + * @hdl: profile tracking handle + * + * Calling this function will update the hardware tables to remove the + * profile indicated by the ID parameter for the VSIs specified in the VSI + * array. Once successfully called, the flow will be disabled. + */ +enum ice_status +ice_rem_prof_id_flow(struct ice_hw *hw, enum ice_block blk, u16 vsi, u64 hdl) +{ + struct ice_vsig_prof *tmp1, *del1; + struct ice_chs_chg *tmp, *del; + struct list_head chg, copy; + enum ice_status status; + u16 vsig; + + INIT_LIST_HEAD(©); + INIT_LIST_HEAD(&chg); + + /* determine if VSI is already part of a VSIG */ + status = ice_vsig_find_vsi(hw, blk, vsi, &vsig); + if (!status && vsig) { + bool last_profile; + bool only_vsi; + u16 ref; + + /* found in VSIG */ + last_profile = ice_vsig_prof_id_count(hw, blk, vsig) == 1; + status = ice_vsig_get_ref(hw, blk, vsig, &ref); + if (status) + goto err_ice_rem_prof_id_flow; + only_vsi = (ref == 1); + + if (only_vsi) { + /* If the original VSIG only contains one reference, + * which will be the requesting VSI, then the VSI is not + * sharing entries and we can simply remove the specific + * characteristics from the VSIG. + */ + + if (last_profile) { + /* If there are no profiles left for this VSIG, + * then simply remove the the VSIG. + */ + status = ice_rem_vsig(hw, blk, vsig, &chg); + if (status) + goto err_ice_rem_prof_id_flow; + } else { + status = ice_rem_prof_id_vsig(hw, blk, vsig, + hdl, &chg); + if (status) + goto err_ice_rem_prof_id_flow; + + /* Adjust priorities */ + status = ice_adj_prof_priorities(hw, blk, vsig, + &chg); + if (status) + goto err_ice_rem_prof_id_flow; + } + + } else { + /* Make a copy of the VSIG's list of Profiles */ + status = ice_get_profs_vsig(hw, blk, vsig, ©); + if (status) + goto err_ice_rem_prof_id_flow; + + /* Remove specified profile entry from the list */ + status = ice_rem_prof_from_list(hw, ©, hdl); + if (status) + goto err_ice_rem_prof_id_flow; + + if (list_empty(©)) { + status = ice_move_vsi(hw, blk, vsi, + ICE_DEFAULT_VSIG, &chg); + if (status) + goto err_ice_rem_prof_id_flow; + + } else if (!ice_find_dup_props_vsig(hw, blk, ©, + &vsig)) { + /* found an exact match */ + /* add or move VSI to the VSIG that matches */ + /* Search for a VSIG with a matching profile + * list + */ + + /* Found match, move VSI to the matching VSIG */ + status = ice_move_vsi(hw, blk, vsi, vsig, &chg); + if (status) + goto err_ice_rem_prof_id_flow; + } else { + /* since no existing VSIG supports this + * characteristic pattern, we need to create a + * new VSIG and TCAM entries + */ + status = ice_create_vsig_from_lst(hw, blk, vsi, + ©, &chg); + if (status) + goto err_ice_rem_prof_id_flow; + + /* Adjust priorities */ + status = ice_adj_prof_priorities(hw, blk, vsig, + &chg); + if (status) + goto err_ice_rem_prof_id_flow; + } + } + } else { + status = ICE_ERR_DOES_NOT_EXIST; + } + + /* update hardware tables */ + if (!status) + status = ice_upd_prof_hw(hw, blk, &chg); + +err_ice_rem_prof_id_flow: + list_for_each_entry_safe(del, tmp, &chg, list_entry) { + list_del(&del->list_entry); + devm_kfree(ice_hw_to_dev(hw), del); + } + + list_for_each_entry_safe(del1, tmp1, ©, list) { + list_del(&del1->list); + devm_kfree(ice_hw_to_dev(hw), del1); + } + + return status; +} diff --git a/drivers/net/ethernet/intel/ice/ice_flex_pipe.h b/drivers/net/ethernet/intel/ice/ice_flex_pipe.h index 37eb282742d1..c7b5e1a6ea2b 100644 --- a/drivers/net/ethernet/intel/ice/ice_flex_pipe.h +++ b/drivers/net/ethernet/intel/ice/ice_flex_pipe.h @@ -18,6 +18,13 @@ #define ICE_PKG_CNT 4 +enum ice_status +ice_add_prof(struct ice_hw *hw, enum ice_block blk, u64 id, u8 ptypes[], + struct ice_fv_word *es); +enum ice_status +ice_add_prof_id_flow(struct ice_hw *hw, enum ice_block blk, u16 vsi, u64 hdl); +enum ice_status +ice_rem_prof_id_flow(struct ice_hw *hw, enum ice_block blk, u16 vsi, u64 hdl); enum ice_status ice_init_pkg(struct ice_hw *hw, u8 *buff, u32 len); enum ice_status ice_copy_and_init_pkg(struct ice_hw *hw, const u8 *buf, u32 len); @@ -26,4 +33,6 @@ void ice_free_seg(struct ice_hw *hw); void ice_fill_blk_tbls(struct ice_hw *hw); void ice_clear_hw_tbls(struct ice_hw *hw); void ice_free_hw_tbls(struct ice_hw *hw); +enum ice_status +ice_rem_prof(struct ice_hw *hw, enum ice_block blk, u64 id); #endif /* _ICE_FLEX_PIPE_H_ */ diff --git a/drivers/net/ethernet/intel/ice/ice_flex_type.h b/drivers/net/ethernet/intel/ice/ice_flex_type.h index 5d5a7eaffa30..0fb3fe3ff3ea 100644 --- a/drivers/net/ethernet/intel/ice/ice_flex_type.h +++ b/drivers/net/ethernet/intel/ice/ice_flex_type.h @@ -3,6 +3,9 @@ #ifndef _ICE_FLEX_TYPE_H_ #define _ICE_FLEX_TYPE_H_ + +#define ICE_FV_OFFSET_INVAL 0x1FF + /* Extraction Sequence (Field Vector) Table */ struct ice_fv_word { u8 prot_id; @@ -105,37 +108,57 @@ struct ice_buf_hdr { sizeof(struct ice_buf_hdr) - (hd_sz)) / (ent_sz)) /* ice package section IDs */ +#define ICE_SID_XLT0_SW 10 +#define ICE_SID_XLT_KEY_BUILDER_SW 11 #define ICE_SID_XLT1_SW 12 #define ICE_SID_XLT2_SW 13 #define ICE_SID_PROFID_TCAM_SW 14 #define ICE_SID_PROFID_REDIR_SW 15 #define ICE_SID_FLD_VEC_SW 16 +#define ICE_SID_CDID_KEY_BUILDER_SW 17 +#define ICE_SID_CDID_REDIR_SW 18 +#define ICE_SID_XLT0_ACL 20 +#define ICE_SID_XLT_KEY_BUILDER_ACL 21 #define ICE_SID_XLT1_ACL 22 #define ICE_SID_XLT2_ACL 23 #define ICE_SID_PROFID_TCAM_ACL 24 #define ICE_SID_PROFID_REDIR_ACL 25 #define ICE_SID_FLD_VEC_ACL 26 +#define ICE_SID_CDID_KEY_BUILDER_ACL 27 +#define ICE_SID_CDID_REDIR_ACL 28 +#define ICE_SID_XLT0_FD 30 +#define ICE_SID_XLT_KEY_BUILDER_FD 31 #define ICE_SID_XLT1_FD 32 #define ICE_SID_XLT2_FD 33 #define ICE_SID_PROFID_TCAM_FD 34 #define ICE_SID_PROFID_REDIR_FD 35 #define ICE_SID_FLD_VEC_FD 36 +#define ICE_SID_CDID_KEY_BUILDER_FD 37 +#define ICE_SID_CDID_REDIR_FD 38 +#define ICE_SID_XLT0_RSS 40 +#define ICE_SID_XLT_KEY_BUILDER_RSS 41 #define ICE_SID_XLT1_RSS 42 #define ICE_SID_XLT2_RSS 43 #define ICE_SID_PROFID_TCAM_RSS 44 #define ICE_SID_PROFID_REDIR_RSS 45 #define ICE_SID_FLD_VEC_RSS 46 +#define ICE_SID_CDID_KEY_BUILDER_RSS 47 +#define ICE_SID_CDID_REDIR_RSS 48 #define ICE_SID_RXPARSER_BOOST_TCAM 56 +#define ICE_SID_XLT0_PE 80 +#define ICE_SID_XLT_KEY_BUILDER_PE 81 #define ICE_SID_XLT1_PE 82 #define ICE_SID_XLT2_PE 83 #define ICE_SID_PROFID_TCAM_PE 84 #define ICE_SID_PROFID_REDIR_PE 85 #define ICE_SID_FLD_VEC_PE 86 +#define ICE_SID_CDID_KEY_BUILDER_PE 87 +#define ICE_SID_CDID_REDIR_PE 88 /* Label Metadata section IDs */ #define ICE_SID_LBL_FIRST 0x80000010 @@ -152,6 +175,19 @@ enum ice_block { ICE_BLK_COUNT }; +enum ice_sect { + ICE_XLT0 = 0, + ICE_XLT_KB, + ICE_XLT1, + ICE_XLT2, + ICE_PROF_TCAM, + ICE_PROF_REDIR, + ICE_VEC_TBL, + ICE_CDID_KB, + ICE_CDID_REDIR, + ICE_SECT_COUNT +}; + /* package labels */ struct ice_label { __le16 value; @@ -234,6 +270,13 @@ struct ice_prof_redir_section { u8 redir_value[1]; }; +/* package buffer building */ + +struct ice_buf_build { + struct ice_buf buf; + u16 reserved_section_table_entries; +}; + struct ice_pkg_enum { struct ice_buf_table *buf_table; u32 buf_idx; @@ -248,6 +291,12 @@ struct ice_pkg_enum { void *(*handler)(u32 sect_type, void *section, u32 index, u32 *offset); }; +struct ice_pkg_es { + __le16 count; + __le16 offset; + struct ice_fv_word es[1]; +}; + struct ice_es { u32 sid; u16 count; @@ -280,6 +329,35 @@ struct ice_ptg_ptype { u8 ptg; }; +#define ICE_MAX_TCAM_PER_PROFILE 32 +#define ICE_MAX_PTG_PER_PROFILE 32 + +struct ice_prof_map { + struct list_head list; + u64 profile_cookie; + u64 context; + u8 prof_id; + u8 ptg_cnt; + u8 ptg[ICE_MAX_PTG_PER_PROFILE]; +}; + +#define ICE_INVALID_TCAM 0xFFFF + +struct ice_tcam_inf { + u16 tcam_idx; + u8 ptg; + u8 prof_id; + u8 in_use; +}; + +struct ice_vsig_prof { + struct list_head list; + u64 profile_cookie; + u8 prof_id; + u8 tcam_count; + struct ice_tcam_inf tcam[ICE_MAX_TCAM_PER_PROFILE]; +}; + struct ice_vsig_entry { struct list_head prop_lst; struct ice_vsig_vsi *first_vsi; @@ -329,6 +407,13 @@ struct ice_xlt2 { u16 count; }; +/* Profile ID Management */ +struct ice_prof_id_key { + __le16 flags; + u8 xlt1; + __le16 xlt2_cdid; +} __packed; + /* Keys are made up of two values, each one-half the size of the key. * For TCAM, the entire key is 80 bits wide (or 2, 40-bit wide values) */ @@ -371,4 +456,31 @@ struct ice_blk_info { u8 is_list_init; }; +enum ice_chg_type { + ICE_TCAM_NONE = 0, + ICE_PTG_ES_ADD, + ICE_TCAM_ADD, + ICE_VSIG_ADD, + ICE_VSIG_REM, + ICE_VSI_MOVE, +}; + +struct ice_chs_chg { + struct list_head list_entry; + enum ice_chg_type type; + + u8 add_ptg; + u8 add_vsig; + u8 add_tcam_idx; + u8 add_prof; + u16 ptype; + u8 ptg; + u8 prof_id; + u16 vsi; + u16 vsig; + u16 orig_vsig; + u16 tcam_idx; +}; + +#define ICE_FLOW_PTYPE_MAX ICE_XLT1_CNT #endif /* _ICE_FLEX_TYPE_H_ */ diff --git a/drivers/net/ethernet/intel/ice/ice_flow.c b/drivers/net/ethernet/intel/ice/ice_flow.c new file mode 100644 index 000000000000..a05ceb59863b --- /dev/null +++ b/drivers/net/ethernet/intel/ice/ice_flow.c @@ -0,0 +1,1275 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2019, Intel Corporation. */ + +#include "ice_common.h" +#include "ice_flow.h" + +/* Describe properties of a protocol header field */ +struct ice_flow_field_info { + enum ice_flow_seg_hdr hdr; + s16 off; /* Offset from start of a protocol header, in bits */ + u16 size; /* Size of fields in bits */ +}; + +#define ICE_FLOW_FLD_INFO(_hdr, _offset_bytes, _size_bytes) { \ + .hdr = _hdr, \ + .off = (_offset_bytes) * BITS_PER_BYTE, \ + .size = (_size_bytes) * BITS_PER_BYTE, \ +} + +/* Table containing properties of supported protocol header fields */ +static const +struct ice_flow_field_info ice_flds_info[ICE_FLOW_FIELD_IDX_MAX] = { + /* IPv4 / IPv6 */ + /* ICE_FLOW_FIELD_IDX_IPV4_SA */ + ICE_FLOW_FLD_INFO(ICE_FLOW_SEG_HDR_IPV4, 12, sizeof(struct in_addr)), + /* ICE_FLOW_FIELD_IDX_IPV4_DA */ + ICE_FLOW_FLD_INFO(ICE_FLOW_SEG_HDR_IPV4, 16, sizeof(struct in_addr)), + /* ICE_FLOW_FIELD_IDX_IPV6_SA */ + ICE_FLOW_FLD_INFO(ICE_FLOW_SEG_HDR_IPV6, 8, sizeof(struct in6_addr)), + /* ICE_FLOW_FIELD_IDX_IPV6_DA */ + ICE_FLOW_FLD_INFO(ICE_FLOW_SEG_HDR_IPV6, 24, sizeof(struct in6_addr)), + /* Transport */ + /* ICE_FLOW_FIELD_IDX_TCP_SRC_PORT */ + ICE_FLOW_FLD_INFO(ICE_FLOW_SEG_HDR_TCP, 0, sizeof(__be16)), + /* ICE_FLOW_FIELD_IDX_TCP_DST_PORT */ + ICE_FLOW_FLD_INFO(ICE_FLOW_SEG_HDR_TCP, 2, sizeof(__be16)), + /* ICE_FLOW_FIELD_IDX_UDP_SRC_PORT */ + ICE_FLOW_FLD_INFO(ICE_FLOW_SEG_HDR_UDP, 0, sizeof(__be16)), + /* ICE_FLOW_FIELD_IDX_UDP_DST_PORT */ + ICE_FLOW_FLD_INFO(ICE_FLOW_SEG_HDR_UDP, 2, sizeof(__be16)), + /* ICE_FLOW_FIELD_IDX_SCTP_SRC_PORT */ + ICE_FLOW_FLD_INFO(ICE_FLOW_SEG_HDR_SCTP, 0, sizeof(__be16)), + /* ICE_FLOW_FIELD_IDX_SCTP_DST_PORT */ + ICE_FLOW_FLD_INFO(ICE_FLOW_SEG_HDR_SCTP, 2, sizeof(__be16)), + +}; + +/* Bitmaps indicating relevant packet types for a particular protocol header + * + * Packet types for packets with an Outer/First/Single IPv4 header + */ +static const u32 ice_ptypes_ipv4_ofos[] = { + 0x1DC00000, 0x04000800, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, +}; + +/* Packet types for packets with an Innermost/Last IPv4 header */ +static const u32 ice_ptypes_ipv4_il[] = { + 0xE0000000, 0xB807700E, 0x80000003, 0xE01DC03B, + 0x0000000E, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, +}; + +/* Packet types for packets with an Outer/First/Single IPv6 header */ +static const u32 ice_ptypes_ipv6_ofos[] = { + 0x00000000, 0x00000000, 0x77000000, 0x10002000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, +}; + +/* Packet types for packets with an Innermost/Last IPv6 header */ +static const u32 ice_ptypes_ipv6_il[] = { + 0x00000000, 0x03B80770, 0x000001DC, 0x0EE00000, + 0x00000770, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, +}; + +/* UDP Packet types for non-tunneled packets or tunneled + * packets with inner UDP. + */ +static const u32 ice_ptypes_udp_il[] = { + 0x81000000, 0x20204040, 0x04000010, 0x80810102, + 0x00000040, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, +}; + +/* Packet types for packets with an Innermost/Last TCP header */ +static const u32 ice_ptypes_tcp_il[] = { + 0x04000000, 0x80810102, 0x10000040, 0x02040408, + 0x00000102, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, +}; + +/* Packet types for packets with an Innermost/Last SCTP header */ +static const u32 ice_ptypes_sctp_il[] = { + 0x08000000, 0x01020204, 0x20000081, 0x04080810, + 0x00000204, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00000000, 0x00000000, 0x00000000, +}; + +/* Manage parameters and info. used during the creation of a flow profile */ +struct ice_flow_prof_params { + enum ice_block blk; + u16 entry_length; /* # of bytes formatted entry will require */ + u8 es_cnt; + struct ice_flow_prof *prof; + + /* For ACL, the es[0] will have the data of ICE_RX_MDID_PKT_FLAGS_15_0 + * This will give us the direction flags. + */ + struct ice_fv_word es[ICE_MAX_FV_WORDS]; + DECLARE_BITMAP(ptypes, ICE_FLOW_PTYPE_MAX); +}; + +#define ICE_FLOW_SEG_HDRS_L3_MASK \ + (ICE_FLOW_SEG_HDR_IPV4 | ICE_FLOW_SEG_HDR_IPV6) +#define ICE_FLOW_SEG_HDRS_L4_MASK \ + (ICE_FLOW_SEG_HDR_TCP | ICE_FLOW_SEG_HDR_UDP | ICE_FLOW_SEG_HDR_SCTP) + +/** + * ice_flow_val_hdrs - validates packet segments for valid protocol headers + * @segs: array of one or more packet segments that describe the flow + * @segs_cnt: number of packet segments provided + */ +static enum ice_status +ice_flow_val_hdrs(struct ice_flow_seg_info *segs, u8 segs_cnt) +{ + u8 i; + + for (i = 0; i < segs_cnt; i++) { + /* Multiple L3 headers */ + if (segs[i].hdrs & ICE_FLOW_SEG_HDRS_L3_MASK && + !is_power_of_2(segs[i].hdrs & ICE_FLOW_SEG_HDRS_L3_MASK)) + return ICE_ERR_PARAM; + + /* Multiple L4 headers */ + if (segs[i].hdrs & ICE_FLOW_SEG_HDRS_L4_MASK && + !is_power_of_2(segs[i].hdrs & ICE_FLOW_SEG_HDRS_L4_MASK)) + return ICE_ERR_PARAM; + } + + return 0; +} + +/** + * ice_flow_proc_seg_hdrs - process protocol headers present in pkt segments + * @params: information about the flow to be processed + * + * This function identifies the packet types associated with the protocol + * headers being present in packet segments of the specified flow profile. + */ +static enum ice_status +ice_flow_proc_seg_hdrs(struct ice_flow_prof_params *params) +{ + struct ice_flow_prof *prof; + u8 i; + + memset(params->ptypes, 0xff, sizeof(params->ptypes)); + + prof = params->prof; + + for (i = 0; i < params->prof->segs_cnt; i++) { + const unsigned long *src; + u32 hdrs; + + hdrs = prof->segs[i].hdrs; + + if (hdrs & ICE_FLOW_SEG_HDR_IPV4) { + src = !i ? (const unsigned long *)ice_ptypes_ipv4_ofos : + (const unsigned long *)ice_ptypes_ipv4_il; + bitmap_and(params->ptypes, params->ptypes, src, + ICE_FLOW_PTYPE_MAX); + } else if (hdrs & ICE_FLOW_SEG_HDR_IPV6) { + src = !i ? (const unsigned long *)ice_ptypes_ipv6_ofos : + (const unsigned long *)ice_ptypes_ipv6_il; + bitmap_and(params->ptypes, params->ptypes, src, + ICE_FLOW_PTYPE_MAX); + } + + if (hdrs & ICE_FLOW_SEG_HDR_UDP) { + src = (const unsigned long *)ice_ptypes_udp_il; + bitmap_and(params->ptypes, params->ptypes, src, + ICE_FLOW_PTYPE_MAX); + } else if (hdrs & ICE_FLOW_SEG_HDR_TCP) { + bitmap_and(params->ptypes, params->ptypes, + (const unsigned long *)ice_ptypes_tcp_il, + ICE_FLOW_PTYPE_MAX); + } else if (hdrs & ICE_FLOW_SEG_HDR_SCTP) { + src = (const unsigned long *)ice_ptypes_sctp_il; + bitmap_and(params->ptypes, params->ptypes, src, + ICE_FLOW_PTYPE_MAX); + } + } + + return 0; +} + +/** + * ice_flow_xtract_fld - Create an extraction sequence entry for the given field + * @hw: pointer to the HW struct + * @params: information about the flow to be processed + * @seg: packet segment index of the field to be extracted + * @fld: ID of field to be extracted + * + * This function determines the protocol ID, offset, and size of the given + * field. It then allocates one or more extraction sequence entries for the + * given field, and fill the entries with protocol ID and offset information. + */ +static enum ice_status +ice_flow_xtract_fld(struct ice_hw *hw, struct ice_flow_prof_params *params, + u8 seg, enum ice_flow_field fld) +{ + enum ice_prot_id prot_id = ICE_PROT_ID_INVAL; + u8 fv_words = hw->blk[params->blk].es.fvw; + struct ice_flow_fld_info *flds; + u16 cnt, ese_bits, i; + u16 off; + + flds = params->prof->segs[seg].fields; + + switch (fld) { + case ICE_FLOW_FIELD_IDX_IPV4_SA: + case ICE_FLOW_FIELD_IDX_IPV4_DA: + prot_id = seg == 0 ? ICE_PROT_IPV4_OF_OR_S : ICE_PROT_IPV4_IL; + break; + case ICE_FLOW_FIELD_IDX_IPV6_SA: + case ICE_FLOW_FIELD_IDX_IPV6_DA: + prot_id = seg == 0 ? ICE_PROT_IPV6_OF_OR_S : ICE_PROT_IPV6_IL; + break; + case ICE_FLOW_FIELD_IDX_TCP_SRC_PORT: + case ICE_FLOW_FIELD_IDX_TCP_DST_PORT: + prot_id = ICE_PROT_TCP_IL; + break; + case ICE_FLOW_FIELD_IDX_UDP_SRC_PORT: + case ICE_FLOW_FIELD_IDX_UDP_DST_PORT: + prot_id = ICE_PROT_UDP_IL_OR_S; + break; + case ICE_FLOW_FIELD_IDX_SCTP_SRC_PORT: + case ICE_FLOW_FIELD_IDX_SCTP_DST_PORT: + prot_id = ICE_PROT_SCTP_IL; + break; + default: + return ICE_ERR_NOT_IMPL; + } + + /* Each extraction sequence entry is a word in size, and extracts a + * word-aligned offset from a protocol header. + */ + ese_bits = ICE_FLOW_FV_EXTRACT_SZ * BITS_PER_BYTE; + + flds[fld].xtrct.prot_id = prot_id; + flds[fld].xtrct.off = (ice_flds_info[fld].off / ese_bits) * + ICE_FLOW_FV_EXTRACT_SZ; + flds[fld].xtrct.disp = (u8)(ice_flds_info[fld].off % ese_bits); + flds[fld].xtrct.idx = params->es_cnt; + + /* Adjust the next field-entry index after accommodating the number of + * entries this field consumes + */ + cnt = DIV_ROUND_UP(flds[fld].xtrct.disp + ice_flds_info[fld].size, + ese_bits); + + /* Fill in the extraction sequence entries needed for this field */ + off = flds[fld].xtrct.off; + for (i = 0; i < cnt; i++) { + u8 idx; + + /* Make sure the number of extraction sequence required + * does not exceed the block's capability + */ + if (params->es_cnt >= fv_words) + return ICE_ERR_MAX_LIMIT; + + /* some blocks require a reversed field vector layout */ + if (hw->blk[params->blk].es.reverse) + idx = fv_words - params->es_cnt - 1; + else + idx = params->es_cnt; + + params->es[idx].prot_id = prot_id; + params->es[idx].off = off; + params->es_cnt++; + + off += ICE_FLOW_FV_EXTRACT_SZ; + } + + return 0; +} + +/** + * ice_flow_create_xtrct_seq - Create an extraction sequence for given segments + * @hw: pointer to the HW struct + * @params: information about the flow to be processed + * + * This function iterates through all matched fields in the given segments, and + * creates an extraction sequence for the fields. + */ +static enum ice_status +ice_flow_create_xtrct_seq(struct ice_hw *hw, + struct ice_flow_prof_params *params) +{ + struct ice_flow_prof *prof = params->prof; + enum ice_status status = 0; + u8 i; + + for (i = 0; i < prof->segs_cnt; i++) { + u8 j; + + for_each_set_bit(j, (unsigned long *)&prof->segs[i].match, + ICE_FLOW_FIELD_IDX_MAX) { + status = ice_flow_xtract_fld(hw, params, i, + (enum ice_flow_field)j); + if (status) + return status; + } + } + + return status; +} + +/** + * ice_flow_proc_segs - process all packet segments associated with a profile + * @hw: pointer to the HW struct + * @params: information about the flow to be processed + */ +static enum ice_status +ice_flow_proc_segs(struct ice_hw *hw, struct ice_flow_prof_params *params) +{ + enum ice_status status; + + status = ice_flow_proc_seg_hdrs(params); + if (status) + return status; + + status = ice_flow_create_xtrct_seq(hw, params); + if (status) + return status; + + switch (params->blk) { + case ICE_BLK_RSS: + /* Only header information is provided for RSS configuration. + * No further processing is needed. + */ + status = 0; + break; + default: + return ICE_ERR_NOT_IMPL; + } + + return status; +} + +#define ICE_FLOW_FIND_PROF_CHK_FLDS 0x00000001 +#define ICE_FLOW_FIND_PROF_CHK_VSI 0x00000002 +#define ICE_FLOW_FIND_PROF_NOT_CHK_DIR 0x00000004 + +/** + * ice_flow_find_prof_conds - Find a profile matching headers and conditions + * @hw: pointer to the HW struct + * @blk: classification stage + * @dir: flow direction + * @segs: array of one or more packet segments that describe the flow + * @segs_cnt: number of packet segments provided + * @vsi_handle: software VSI handle to check VSI (ICE_FLOW_FIND_PROF_CHK_VSI) + * @conds: additional conditions to be checked (ICE_FLOW_FIND_PROF_CHK_*) + */ +static struct ice_flow_prof * +ice_flow_find_prof_conds(struct ice_hw *hw, enum ice_block blk, + enum ice_flow_dir dir, struct ice_flow_seg_info *segs, + u8 segs_cnt, u16 vsi_handle, u32 conds) +{ + struct ice_flow_prof *p, *prof = NULL; + + mutex_lock(&hw->fl_profs_locks[blk]); + list_for_each_entry(p, &hw->fl_profs[blk], l_entry) + if ((p->dir == dir || conds & ICE_FLOW_FIND_PROF_NOT_CHK_DIR) && + segs_cnt && segs_cnt == p->segs_cnt) { + u8 i; + + /* Check for profile-VSI association if specified */ + if ((conds & ICE_FLOW_FIND_PROF_CHK_VSI) && + ice_is_vsi_valid(hw, vsi_handle) && + !test_bit(vsi_handle, p->vsis)) + continue; + + /* Protocol headers must be checked. Matched fields are + * checked if specified. + */ + for (i = 0; i < segs_cnt; i++) + if (segs[i].hdrs != p->segs[i].hdrs || + ((conds & ICE_FLOW_FIND_PROF_CHK_FLDS) && + segs[i].match != p->segs[i].match)) + break; + + /* A match is found if all segments are matched */ + if (i == segs_cnt) { + prof = p; + break; + } + } + mutex_unlock(&hw->fl_profs_locks[blk]); + + return prof; +} + +/** + * ice_flow_find_prof_id - Look up a profile with given profile ID + * @hw: pointer to the HW struct + * @blk: classification stage + * @prof_id: unique ID to identify this flow profile + */ +static struct ice_flow_prof * +ice_flow_find_prof_id(struct ice_hw *hw, enum ice_block blk, u64 prof_id) +{ + struct ice_flow_prof *p; + + list_for_each_entry(p, &hw->fl_profs[blk], l_entry) + if (p->id == prof_id) + return p; + + return NULL; +} + +/** + * ice_flow_add_prof_sync - Add a flow profile for packet segments and fields + * @hw: pointer to the HW struct + * @blk: classification stage + * @dir: flow direction + * @prof_id: unique ID to identify this flow profile + * @segs: array of one or more packet segments that describe the flow + * @segs_cnt: number of packet segments provided + * @prof: stores the returned flow profile added + * + * Assumption: the caller has acquired the lock to the profile list + */ +static enum ice_status +ice_flow_add_prof_sync(struct ice_hw *hw, enum ice_block blk, + enum ice_flow_dir dir, u64 prof_id, + struct ice_flow_seg_info *segs, u8 segs_cnt, + struct ice_flow_prof **prof) +{ + struct ice_flow_prof_params params; + enum ice_status status; + u8 i; + + if (!prof) + return ICE_ERR_BAD_PTR; + + memset(¶ms, 0, sizeof(params)); + params.prof = devm_kzalloc(ice_hw_to_dev(hw), sizeof(*params.prof), + GFP_KERNEL); + if (!params.prof) + return ICE_ERR_NO_MEMORY; + + /* initialize extraction sequence to all invalid (0xff) */ + for (i = 0; i < ICE_MAX_FV_WORDS; i++) { + params.es[i].prot_id = ICE_PROT_INVALID; + params.es[i].off = ICE_FV_OFFSET_INVAL; + } + + params.blk = blk; + params.prof->id = prof_id; + params.prof->dir = dir; + params.prof->segs_cnt = segs_cnt; + + /* Make a copy of the segments that need to be persistent in the flow + * profile instance + */ + for (i = 0; i < segs_cnt; i++) + memcpy(¶ms.prof->segs[i], &segs[i], sizeof(*segs)); + + status = ice_flow_proc_segs(hw, ¶ms); + if (status) { + ice_debug(hw, ICE_DBG_FLOW, + "Error processing a flow's packet segments\n"); + goto out; + } + + /* Add a HW profile for this flow profile */ + status = ice_add_prof(hw, blk, prof_id, (u8 *)params.ptypes, params.es); + if (status) { + ice_debug(hw, ICE_DBG_FLOW, "Error adding a HW flow profile\n"); + goto out; + } + + INIT_LIST_HEAD(¶ms.prof->entries); + mutex_init(¶ms.prof->entries_lock); + *prof = params.prof; + +out: + if (status) + devm_kfree(ice_hw_to_dev(hw), params.prof); + + return status; +} + +/** + * ice_flow_rem_prof_sync - remove a flow profile + * @hw: pointer to the hardware structure + * @blk: classification stage + * @prof: pointer to flow profile to remove + * + * Assumption: the caller has acquired the lock to the profile list + */ +static enum ice_status +ice_flow_rem_prof_sync(struct ice_hw *hw, enum ice_block blk, + struct ice_flow_prof *prof) +{ + enum ice_status status; + + /* Remove all hardware profiles associated with this flow profile */ + status = ice_rem_prof(hw, blk, prof->id); + if (!status) { + list_del(&prof->l_entry); + mutex_destroy(&prof->entries_lock); + devm_kfree(ice_hw_to_dev(hw), prof); + } + + return status; +} + +/** + * ice_flow_assoc_prof - associate a VSI with a flow profile + * @hw: pointer to the hardware structure + * @blk: classification stage + * @prof: pointer to flow profile + * @vsi_handle: software VSI handle + * + * Assumption: the caller has acquired the lock to the profile list + * and the software VSI handle has been validated + */ +static enum ice_status +ice_flow_assoc_prof(struct ice_hw *hw, enum ice_block blk, + struct ice_flow_prof *prof, u16 vsi_handle) +{ + enum ice_status status = 0; + + if (!test_bit(vsi_handle, prof->vsis)) { + status = ice_add_prof_id_flow(hw, blk, + ice_get_hw_vsi_num(hw, + vsi_handle), + prof->id); + if (!status) + set_bit(vsi_handle, prof->vsis); + else + ice_debug(hw, ICE_DBG_FLOW, + "HW profile add failed, %d\n", + status); + } + + return status; +} + +/** + * ice_flow_disassoc_prof - disassociate a VSI from a flow profile + * @hw: pointer to the hardware structure + * @blk: classification stage + * @prof: pointer to flow profile + * @vsi_handle: software VSI handle + * + * Assumption: the caller has acquired the lock to the profile list + * and the software VSI handle has been validated + */ +static enum ice_status +ice_flow_disassoc_prof(struct ice_hw *hw, enum ice_block blk, + struct ice_flow_prof *prof, u16 vsi_handle) +{ + enum ice_status status = 0; + + if (test_bit(vsi_handle, prof->vsis)) { + status = ice_rem_prof_id_flow(hw, blk, + ice_get_hw_vsi_num(hw, + vsi_handle), + prof->id); + if (!status) + clear_bit(vsi_handle, prof->vsis); + else + ice_debug(hw, ICE_DBG_FLOW, + "HW profile remove failed, %d\n", + status); + } + + return status; +} + +/** + * ice_flow_add_prof - Add a flow profile for packet segments and matched fields + * @hw: pointer to the HW struct + * @blk: classification stage + * @dir: flow direction + * @prof_id: unique ID to identify this flow profile + * @segs: array of one or more packet segments that describe the flow + * @segs_cnt: number of packet segments provided + * @prof: stores the returned flow profile added + */ +static enum ice_status +ice_flow_add_prof(struct ice_hw *hw, enum ice_block blk, enum ice_flow_dir dir, + u64 prof_id, struct ice_flow_seg_info *segs, u8 segs_cnt, + struct ice_flow_prof **prof) +{ + enum ice_status status; + + if (segs_cnt > ICE_FLOW_SEG_MAX) + return ICE_ERR_MAX_LIMIT; + + if (!segs_cnt) + return ICE_ERR_PARAM; + + if (!segs) + return ICE_ERR_BAD_PTR; + + status = ice_flow_val_hdrs(segs, segs_cnt); + if (status) + return status; + + mutex_lock(&hw->fl_profs_locks[blk]); + + status = ice_flow_add_prof_sync(hw, blk, dir, prof_id, segs, segs_cnt, + prof); + if (!status) + list_add(&(*prof)->l_entry, &hw->fl_profs[blk]); + + mutex_unlock(&hw->fl_profs_locks[blk]); + + return status; +} + +/** + * ice_flow_rem_prof - Remove a flow profile and all entries associated with it + * @hw: pointer to the HW struct + * @blk: the block for which the flow profile is to be removed + * @prof_id: unique ID of the flow profile to be removed + */ +static enum ice_status +ice_flow_rem_prof(struct ice_hw *hw, enum ice_block blk, u64 prof_id) +{ + struct ice_flow_prof *prof; + enum ice_status status; + + mutex_lock(&hw->fl_profs_locks[blk]); + + prof = ice_flow_find_prof_id(hw, blk, prof_id); + if (!prof) { + status = ICE_ERR_DOES_NOT_EXIST; + goto out; + } + + /* prof becomes invalid after the call */ + status = ice_flow_rem_prof_sync(hw, blk, prof); + +out: + mutex_unlock(&hw->fl_profs_locks[blk]); + + return status; +} + +/** + * ice_flow_set_fld_ext - specifies locations of field from entry's input buffer + * @seg: packet segment the field being set belongs to + * @fld: field to be set + * @type: type of the field + * @val_loc: if not ICE_FLOW_FLD_OFF_INVAL, location of the value to match from + * entry's input buffer + * @mask_loc: if not ICE_FLOW_FLD_OFF_INVAL, location of mask value from entry's + * input buffer + * @last_loc: if not ICE_FLOW_FLD_OFF_INVAL, location of last/upper value from + * entry's input buffer + * + * This helper function stores information of a field being matched, including + * the type of the field and the locations of the value to match, the mask, and + * and the upper-bound value in the start of the input buffer for a flow entry. + * This function should only be used for fixed-size data structures. + * + * This function also opportunistically determines the protocol headers to be + * present based on the fields being set. Some fields cannot be used alone to + * determine the protocol headers present. Sometimes, fields for particular + * protocol headers are not matched. In those cases, the protocol headers + * must be explicitly set. + */ +static void +ice_flow_set_fld_ext(struct ice_flow_seg_info *seg, enum ice_flow_field fld, + enum ice_flow_fld_match_type type, u16 val_loc, + u16 mask_loc, u16 last_loc) +{ + u64 bit = BIT_ULL(fld); + + seg->match |= bit; + if (type == ICE_FLOW_FLD_TYPE_RANGE) + seg->range |= bit; + + seg->fields[fld].type = type; + seg->fields[fld].src.val = val_loc; + seg->fields[fld].src.mask = mask_loc; + seg->fields[fld].src.last = last_loc; + + ICE_FLOW_SET_HDRS(seg, ice_flds_info[fld].hdr); +} + +/** + * ice_flow_set_fld - specifies locations of field from entry's input buffer + * @seg: packet segment the field being set belongs to + * @fld: field to be set + * @val_loc: if not ICE_FLOW_FLD_OFF_INVAL, location of the value to match from + * entry's input buffer + * @mask_loc: if not ICE_FLOW_FLD_OFF_INVAL, location of mask value from entry's + * input buffer + * @last_loc: if not ICE_FLOW_FLD_OFF_INVAL, location of last/upper value from + * entry's input buffer + * @range: indicate if field being matched is to be in a range + * + * This function specifies the locations, in the form of byte offsets from the + * start of the input buffer for a flow entry, from where the value to match, + * the mask value, and upper value can be extracted. These locations are then + * stored in the flow profile. When adding a flow entry associated with the + * flow profile, these locations will be used to quickly extract the values and + * create the content of a match entry. This function should only be used for + * fixed-size data structures. + */ +static void +ice_flow_set_fld(struct ice_flow_seg_info *seg, enum ice_flow_field fld, + u16 val_loc, u16 mask_loc, u16 last_loc, bool range) +{ + enum ice_flow_fld_match_type t = range ? + ICE_FLOW_FLD_TYPE_RANGE : ICE_FLOW_FLD_TYPE_REG; + + ice_flow_set_fld_ext(seg, fld, t, val_loc, mask_loc, last_loc); +} + +#define ICE_FLOW_RSS_SEG_HDR_L3_MASKS \ + (ICE_FLOW_SEG_HDR_IPV4 | ICE_FLOW_SEG_HDR_IPV6) + +#define ICE_FLOW_RSS_SEG_HDR_L4_MASKS \ + (ICE_FLOW_SEG_HDR_TCP | ICE_FLOW_SEG_HDR_UDP | ICE_FLOW_SEG_HDR_SCTP) + +#define ICE_FLOW_RSS_SEG_HDR_VAL_MASKS \ + (ICE_FLOW_RSS_SEG_HDR_L3_MASKS | \ + ICE_FLOW_RSS_SEG_HDR_L4_MASKS) + +/** + * ice_flow_set_rss_seg_info - setup packet segments for RSS + * @segs: pointer to the flow field segment(s) + * @hash_fields: fields to be hashed on for the segment(s) + * @flow_hdr: protocol header fields within a packet segment + * + * Helper function to extract fields from hash bitmap and use flow + * header value to set flow field segment for further use in flow + * profile entry or removal. + */ +static enum ice_status +ice_flow_set_rss_seg_info(struct ice_flow_seg_info *segs, u64 hash_fields, + u32 flow_hdr) +{ + u64 val; + u8 i; + + for_each_set_bit(i, (unsigned long *)&hash_fields, + ICE_FLOW_FIELD_IDX_MAX) + ice_flow_set_fld(segs, (enum ice_flow_field)i, + ICE_FLOW_FLD_OFF_INVAL, ICE_FLOW_FLD_OFF_INVAL, + ICE_FLOW_FLD_OFF_INVAL, false); + + ICE_FLOW_SET_HDRS(segs, flow_hdr); + + if (segs->hdrs & ~ICE_FLOW_RSS_SEG_HDR_VAL_MASKS) + return ICE_ERR_PARAM; + + val = (u64)(segs->hdrs & ICE_FLOW_RSS_SEG_HDR_L3_MASKS); + if (val && !is_power_of_2(val)) + return ICE_ERR_CFG; + + val = (u64)(segs->hdrs & ICE_FLOW_RSS_SEG_HDR_L4_MASKS); + if (val && !is_power_of_2(val)) + return ICE_ERR_CFG; + + return 0; +} + +/** + * ice_rem_vsi_rss_list - remove VSI from RSS list + * @hw: pointer to the hardware structure + * @vsi_handle: software VSI handle + * + * Remove the VSI from all RSS configurations in the list. + */ +void ice_rem_vsi_rss_list(struct ice_hw *hw, u16 vsi_handle) +{ + struct ice_rss_cfg *r, *tmp; + + if (list_empty(&hw->rss_list_head)) + return; + + mutex_lock(&hw->rss_locks); + list_for_each_entry_safe(r, tmp, &hw->rss_list_head, l_entry) + if (test_and_clear_bit(vsi_handle, r->vsis)) + if (bitmap_empty(r->vsis, ICE_MAX_VSI)) { + list_del(&r->l_entry); + devm_kfree(ice_hw_to_dev(hw), r); + } + mutex_unlock(&hw->rss_locks); +} + +/** + * ice_rem_vsi_rss_cfg - remove RSS configurations associated with VSI + * @hw: pointer to the hardware structure + * @vsi_handle: software VSI handle + * + * This function will iterate through all flow profiles and disassociate + * the VSI from that profile. If the flow profile has no VSIs it will + * be removed. + */ +enum ice_status ice_rem_vsi_rss_cfg(struct ice_hw *hw, u16 vsi_handle) +{ + const enum ice_block blk = ICE_BLK_RSS; + struct ice_flow_prof *p, *t; + enum ice_status status = 0; + + if (!ice_is_vsi_valid(hw, vsi_handle)) + return ICE_ERR_PARAM; + + if (list_empty(&hw->fl_profs[blk])) + return 0; + + mutex_lock(&hw->fl_profs_locks[blk]); + list_for_each_entry_safe(p, t, &hw->fl_profs[blk], l_entry) + if (test_bit(vsi_handle, p->vsis)) { + status = ice_flow_disassoc_prof(hw, blk, p, vsi_handle); + if (status) + break; + + if (bitmap_empty(p->vsis, ICE_MAX_VSI)) { + status = ice_flow_rem_prof_sync(hw, blk, p); + if (status) + break; + } + } + mutex_unlock(&hw->fl_profs_locks[blk]); + + return status; +} + +/** + * ice_rem_rss_list - remove RSS configuration from list + * @hw: pointer to the hardware structure + * @vsi_handle: software VSI handle + * @prof: pointer to flow profile + * + * Assumption: lock has already been acquired for RSS list + */ +static void +ice_rem_rss_list(struct ice_hw *hw, u16 vsi_handle, struct ice_flow_prof *prof) +{ + struct ice_rss_cfg *r, *tmp; + + /* Search for RSS hash fields associated to the VSI that match the + * hash configurations associated to the flow profile. If found + * remove from the RSS entry list of the VSI context and delete entry. + */ + list_for_each_entry_safe(r, tmp, &hw->rss_list_head, l_entry) + if (r->hashed_flds == prof->segs[prof->segs_cnt - 1].match && + r->packet_hdr == prof->segs[prof->segs_cnt - 1].hdrs) { + clear_bit(vsi_handle, r->vsis); + if (bitmap_empty(r->vsis, ICE_MAX_VSI)) { + list_del(&r->l_entry); + devm_kfree(ice_hw_to_dev(hw), r); + } + return; + } +} + +/** + * ice_add_rss_list - add RSS configuration to list + * @hw: pointer to the hardware structure + * @vsi_handle: software VSI handle + * @prof: pointer to flow profile + * + * Assumption: lock has already been acquired for RSS list + */ +static enum ice_status +ice_add_rss_list(struct ice_hw *hw, u16 vsi_handle, struct ice_flow_prof *prof) +{ + struct ice_rss_cfg *r, *rss_cfg; + + list_for_each_entry(r, &hw->rss_list_head, l_entry) + if (r->hashed_flds == prof->segs[prof->segs_cnt - 1].match && + r->packet_hdr == prof->segs[prof->segs_cnt - 1].hdrs) { + set_bit(vsi_handle, r->vsis); + return 0; + } + + rss_cfg = devm_kzalloc(ice_hw_to_dev(hw), sizeof(*rss_cfg), + GFP_KERNEL); + if (!rss_cfg) + return ICE_ERR_NO_MEMORY; + + rss_cfg->hashed_flds = prof->segs[prof->segs_cnt - 1].match; + rss_cfg->packet_hdr = prof->segs[prof->segs_cnt - 1].hdrs; + set_bit(vsi_handle, rss_cfg->vsis); + + list_add_tail(&rss_cfg->l_entry, &hw->rss_list_head); + + return 0; +} + +#define ICE_FLOW_PROF_HASH_S 0 +#define ICE_FLOW_PROF_HASH_M (0xFFFFFFFFULL << ICE_FLOW_PROF_HASH_S) +#define ICE_FLOW_PROF_HDR_S 32 +#define ICE_FLOW_PROF_HDR_M (0x3FFFFFFFULL << ICE_FLOW_PROF_HDR_S) +#define ICE_FLOW_PROF_ENCAP_S 63 +#define ICE_FLOW_PROF_ENCAP_M (BIT_ULL(ICE_FLOW_PROF_ENCAP_S)) + +#define ICE_RSS_OUTER_HEADERS 1 + +/* Flow profile ID format: + * [0:31] - Packet match fields + * [32:62] - Protocol header + * [63] - Encapsulation flag, 0 if non-tunneled, 1 if tunneled + */ +#define ICE_FLOW_GEN_PROFID(hash, hdr, segs_cnt) \ + (u64)(((u64)(hash) & ICE_FLOW_PROF_HASH_M) | \ + (((u64)(hdr) << ICE_FLOW_PROF_HDR_S) & ICE_FLOW_PROF_HDR_M) | \ + ((u8)((segs_cnt) - 1) ? ICE_FLOW_PROF_ENCAP_M : 0)) + +/** + * ice_add_rss_cfg_sync - add an RSS configuration + * @hw: pointer to the hardware structure + * @vsi_handle: software VSI handle + * @hashed_flds: hash bit fields (ICE_FLOW_HASH_*) to configure + * @addl_hdrs: protocol header fields + * @segs_cnt: packet segment count + * + * Assumption: lock has already been acquired for RSS list + */ +static enum ice_status +ice_add_rss_cfg_sync(struct ice_hw *hw, u16 vsi_handle, u64 hashed_flds, + u32 addl_hdrs, u8 segs_cnt) +{ + const enum ice_block blk = ICE_BLK_RSS; + struct ice_flow_prof *prof = NULL; + struct ice_flow_seg_info *segs; + enum ice_status status; + + if (!segs_cnt || segs_cnt > ICE_FLOW_SEG_MAX) + return ICE_ERR_PARAM; + + segs = kcalloc(segs_cnt, sizeof(*segs), GFP_KERNEL); + if (!segs) + return ICE_ERR_NO_MEMORY; + + /* Construct the packet segment info from the hashed fields */ + status = ice_flow_set_rss_seg_info(&segs[segs_cnt - 1], hashed_flds, + addl_hdrs); + if (status) + goto exit; + + /* Search for a flow profile that has matching headers, hash fields + * and has the input VSI associated to it. If found, no further + * operations required and exit. + */ + prof = ice_flow_find_prof_conds(hw, blk, ICE_FLOW_RX, segs, segs_cnt, + vsi_handle, + ICE_FLOW_FIND_PROF_CHK_FLDS | + ICE_FLOW_FIND_PROF_CHK_VSI); + if (prof) + goto exit; + + /* Check if a flow profile exists with the same protocol headers and + * associated with the input VSI. If so disassociate the VSI from + * this profile. The VSI will be added to a new profile created with + * the protocol header and new hash field configuration. + */ + prof = ice_flow_find_prof_conds(hw, blk, ICE_FLOW_RX, segs, segs_cnt, + vsi_handle, ICE_FLOW_FIND_PROF_CHK_VSI); + if (prof) { + status = ice_flow_disassoc_prof(hw, blk, prof, vsi_handle); + if (!status) + ice_rem_rss_list(hw, vsi_handle, prof); + else + goto exit; + + /* Remove profile if it has no VSIs associated */ + if (bitmap_empty(prof->vsis, ICE_MAX_VSI)) { + status = ice_flow_rem_prof(hw, blk, prof->id); + if (status) + goto exit; + } + } + + /* Search for a profile that has same match fields only. If this + * exists then associate the VSI to this profile. + */ + prof = ice_flow_find_prof_conds(hw, blk, ICE_FLOW_RX, segs, segs_cnt, + vsi_handle, + ICE_FLOW_FIND_PROF_CHK_FLDS); + if (prof) { + status = ice_flow_assoc_prof(hw, blk, prof, vsi_handle); + if (!status) + status = ice_add_rss_list(hw, vsi_handle, prof); + goto exit; + } + + /* Create a new flow profile with generated profile and packet + * segment information. + */ + status = ice_flow_add_prof(hw, blk, ICE_FLOW_RX, + ICE_FLOW_GEN_PROFID(hashed_flds, + segs[segs_cnt - 1].hdrs, + segs_cnt), + segs, segs_cnt, &prof); + if (status) + goto exit; + + status = ice_flow_assoc_prof(hw, blk, prof, vsi_handle); + /* If association to a new flow profile failed then this profile can + * be removed. + */ + if (status) { + ice_flow_rem_prof(hw, blk, prof->id); + goto exit; + } + + status = ice_add_rss_list(hw, vsi_handle, prof); + +exit: + kfree(segs); + return status; +} + +/** + * ice_add_rss_cfg - add an RSS configuration with specified hashed fields + * @hw: pointer to the hardware structure + * @vsi_handle: software VSI handle + * @hashed_flds: hash bit fields (ICE_FLOW_HASH_*) to configure + * @addl_hdrs: protocol header fields + * + * This function will generate a flow profile based on fields associated with + * the input fields to hash on, the flow type and use the VSI number to add + * a flow entry to the profile. + */ +enum ice_status +ice_add_rss_cfg(struct ice_hw *hw, u16 vsi_handle, u64 hashed_flds, + u32 addl_hdrs) +{ + enum ice_status status; + + if (hashed_flds == ICE_HASH_INVALID || + !ice_is_vsi_valid(hw, vsi_handle)) + return ICE_ERR_PARAM; + + mutex_lock(&hw->rss_locks); + status = ice_add_rss_cfg_sync(hw, vsi_handle, hashed_flds, addl_hdrs, + ICE_RSS_OUTER_HEADERS); + mutex_unlock(&hw->rss_locks); + + return status; +} + +/* Mapping of AVF hash bit fields to an L3-L4 hash combination. + * As the ice_flow_avf_hdr_field represent individual bit shifts in a hash, + * convert its values to their appropriate flow L3, L4 values. + */ +#define ICE_FLOW_AVF_RSS_IPV4_MASKS \ + (BIT_ULL(ICE_AVF_FLOW_FIELD_IPV4_OTHER) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_FRAG_IPV4)) +#define ICE_FLOW_AVF_RSS_TCP_IPV4_MASKS \ + (BIT_ULL(ICE_AVF_FLOW_FIELD_IPV4_TCP_SYN_NO_ACK) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_IPV4_TCP)) +#define ICE_FLOW_AVF_RSS_UDP_IPV4_MASKS \ + (BIT_ULL(ICE_AVF_FLOW_FIELD_UNICAST_IPV4_UDP) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_MULTICAST_IPV4_UDP) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_IPV4_UDP)) +#define ICE_FLOW_AVF_RSS_ALL_IPV4_MASKS \ + (ICE_FLOW_AVF_RSS_TCP_IPV4_MASKS | ICE_FLOW_AVF_RSS_UDP_IPV4_MASKS | \ + ICE_FLOW_AVF_RSS_IPV4_MASKS | BIT_ULL(ICE_AVF_FLOW_FIELD_IPV4_SCTP)) + +#define ICE_FLOW_AVF_RSS_IPV6_MASKS \ + (BIT_ULL(ICE_AVF_FLOW_FIELD_IPV6_OTHER) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_FRAG_IPV6)) +#define ICE_FLOW_AVF_RSS_UDP_IPV6_MASKS \ + (BIT_ULL(ICE_AVF_FLOW_FIELD_UNICAST_IPV6_UDP) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_MULTICAST_IPV6_UDP) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_IPV6_UDP)) +#define ICE_FLOW_AVF_RSS_TCP_IPV6_MASKS \ + (BIT_ULL(ICE_AVF_FLOW_FIELD_IPV6_TCP_SYN_NO_ACK) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_IPV6_TCP)) +#define ICE_FLOW_AVF_RSS_ALL_IPV6_MASKS \ + (ICE_FLOW_AVF_RSS_TCP_IPV6_MASKS | ICE_FLOW_AVF_RSS_UDP_IPV6_MASKS | \ + ICE_FLOW_AVF_RSS_IPV6_MASKS | BIT_ULL(ICE_AVF_FLOW_FIELD_IPV6_SCTP)) + +/** + * ice_add_avf_rss_cfg - add an RSS configuration for AVF driver + * @hw: pointer to the hardware structure + * @vsi_handle: software VSI handle + * @avf_hash: hash bit fields (ICE_AVF_FLOW_FIELD_*) to configure + * + * This function will take the hash bitmap provided by the AVF driver via a + * message, convert it to ICE-compatible values, and configure RSS flow + * profiles. + */ +enum ice_status +ice_add_avf_rss_cfg(struct ice_hw *hw, u16 vsi_handle, u64 avf_hash) +{ + enum ice_status status = 0; + u64 hash_flds; + + if (avf_hash == ICE_AVF_FLOW_FIELD_INVALID || + !ice_is_vsi_valid(hw, vsi_handle)) + return ICE_ERR_PARAM; + + /* Make sure no unsupported bits are specified */ + if (avf_hash & ~(ICE_FLOW_AVF_RSS_ALL_IPV4_MASKS | + ICE_FLOW_AVF_RSS_ALL_IPV6_MASKS)) + return ICE_ERR_CFG; + + hash_flds = avf_hash; + + /* Always create an L3 RSS configuration for any L4 RSS configuration */ + if (hash_flds & ICE_FLOW_AVF_RSS_ALL_IPV4_MASKS) + hash_flds |= ICE_FLOW_AVF_RSS_IPV4_MASKS; + + if (hash_flds & ICE_FLOW_AVF_RSS_ALL_IPV6_MASKS) + hash_flds |= ICE_FLOW_AVF_RSS_IPV6_MASKS; + + /* Create the corresponding RSS configuration for each valid hash bit */ + while (hash_flds) { + u64 rss_hash = ICE_HASH_INVALID; + + if (hash_flds & ICE_FLOW_AVF_RSS_ALL_IPV4_MASKS) { + if (hash_flds & ICE_FLOW_AVF_RSS_IPV4_MASKS) { + rss_hash = ICE_FLOW_HASH_IPV4; + hash_flds &= ~ICE_FLOW_AVF_RSS_IPV4_MASKS; + } else if (hash_flds & + ICE_FLOW_AVF_RSS_TCP_IPV4_MASKS) { + rss_hash = ICE_FLOW_HASH_IPV4 | + ICE_FLOW_HASH_TCP_PORT; + hash_flds &= ~ICE_FLOW_AVF_RSS_TCP_IPV4_MASKS; + } else if (hash_flds & + ICE_FLOW_AVF_RSS_UDP_IPV4_MASKS) { + rss_hash = ICE_FLOW_HASH_IPV4 | + ICE_FLOW_HASH_UDP_PORT; + hash_flds &= ~ICE_FLOW_AVF_RSS_UDP_IPV4_MASKS; + } else if (hash_flds & + BIT_ULL(ICE_AVF_FLOW_FIELD_IPV4_SCTP)) { + rss_hash = ICE_FLOW_HASH_IPV4 | + ICE_FLOW_HASH_SCTP_PORT; + hash_flds &= + ~BIT_ULL(ICE_AVF_FLOW_FIELD_IPV4_SCTP); + } + } else if (hash_flds & ICE_FLOW_AVF_RSS_ALL_IPV6_MASKS) { + if (hash_flds & ICE_FLOW_AVF_RSS_IPV6_MASKS) { + rss_hash = ICE_FLOW_HASH_IPV6; + hash_flds &= ~ICE_FLOW_AVF_RSS_IPV6_MASKS; + } else if (hash_flds & + ICE_FLOW_AVF_RSS_TCP_IPV6_MASKS) { + rss_hash = ICE_FLOW_HASH_IPV6 | + ICE_FLOW_HASH_TCP_PORT; + hash_flds &= ~ICE_FLOW_AVF_RSS_TCP_IPV6_MASKS; + } else if (hash_flds & + ICE_FLOW_AVF_RSS_UDP_IPV6_MASKS) { + rss_hash = ICE_FLOW_HASH_IPV6 | + ICE_FLOW_HASH_UDP_PORT; + hash_flds &= ~ICE_FLOW_AVF_RSS_UDP_IPV6_MASKS; + } else if (hash_flds & + BIT_ULL(ICE_AVF_FLOW_FIELD_IPV6_SCTP)) { + rss_hash = ICE_FLOW_HASH_IPV6 | + ICE_FLOW_HASH_SCTP_PORT; + hash_flds &= + ~BIT_ULL(ICE_AVF_FLOW_FIELD_IPV6_SCTP); + } + } + + if (rss_hash == ICE_HASH_INVALID) + return ICE_ERR_OUT_OF_RANGE; + + status = ice_add_rss_cfg(hw, vsi_handle, rss_hash, + ICE_FLOW_SEG_HDR_NONE); + if (status) + break; + } + + return status; +} + +/** + * ice_replay_rss_cfg - replay RSS configurations associated with VSI + * @hw: pointer to the hardware structure + * @vsi_handle: software VSI handle + */ +enum ice_status ice_replay_rss_cfg(struct ice_hw *hw, u16 vsi_handle) +{ + enum ice_status status = 0; + struct ice_rss_cfg *r; + + if (!ice_is_vsi_valid(hw, vsi_handle)) + return ICE_ERR_PARAM; + + mutex_lock(&hw->rss_locks); + list_for_each_entry(r, &hw->rss_list_head, l_entry) { + if (test_bit(vsi_handle, r->vsis)) { + status = ice_add_rss_cfg_sync(hw, vsi_handle, + r->hashed_flds, + r->packet_hdr, + ICE_RSS_OUTER_HEADERS); + if (status) + break; + } + } + mutex_unlock(&hw->rss_locks); + + return status; +} + +/** + * ice_get_rss_cfg - returns hashed fields for the given header types + * @hw: pointer to the hardware structure + * @vsi_handle: software VSI handle + * @hdrs: protocol header type + * + * This function will return the match fields of the first instance of flow + * profile having the given header types and containing input VSI + */ +u64 ice_get_rss_cfg(struct ice_hw *hw, u16 vsi_handle, u32 hdrs) +{ + struct ice_rss_cfg *r, *rss_cfg = NULL; + + /* verify if the protocol header is non zero and VSI is valid */ + if (hdrs == ICE_FLOW_SEG_HDR_NONE || !ice_is_vsi_valid(hw, vsi_handle)) + return ICE_HASH_INVALID; + + mutex_lock(&hw->rss_locks); + list_for_each_entry(r, &hw->rss_list_head, l_entry) + if (test_bit(vsi_handle, r->vsis) && + r->packet_hdr == hdrs) { + rss_cfg = r; + break; + } + mutex_unlock(&hw->rss_locks); + + return rss_cfg ? rss_cfg->hashed_flds : ICE_HASH_INVALID; +} diff --git a/drivers/net/ethernet/intel/ice/ice_flow.h b/drivers/net/ethernet/intel/ice/ice_flow.h new file mode 100644 index 000000000000..5558627bd5eb --- /dev/null +++ b/drivers/net/ethernet/intel/ice/ice_flow.h @@ -0,0 +1,207 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (c) 2019, Intel Corporation. */ + +#ifndef _ICE_FLOW_H_ +#define _ICE_FLOW_H_ + +#define ICE_FLOW_ENTRY_HANDLE_INVAL 0 +#define ICE_FLOW_FLD_OFF_INVAL 0xffff + +/* Generate flow hash field from flow field type(s) */ +#define ICE_FLOW_HASH_IPV4 \ + (BIT_ULL(ICE_FLOW_FIELD_IDX_IPV4_SA) | \ + BIT_ULL(ICE_FLOW_FIELD_IDX_IPV4_DA)) +#define ICE_FLOW_HASH_IPV6 \ + (BIT_ULL(ICE_FLOW_FIELD_IDX_IPV6_SA) | \ + BIT_ULL(ICE_FLOW_FIELD_IDX_IPV6_DA)) +#define ICE_FLOW_HASH_TCP_PORT \ + (BIT_ULL(ICE_FLOW_FIELD_IDX_TCP_SRC_PORT) | \ + BIT_ULL(ICE_FLOW_FIELD_IDX_TCP_DST_PORT)) +#define ICE_FLOW_HASH_UDP_PORT \ + (BIT_ULL(ICE_FLOW_FIELD_IDX_UDP_SRC_PORT) | \ + BIT_ULL(ICE_FLOW_FIELD_IDX_UDP_DST_PORT)) +#define ICE_FLOW_HASH_SCTP_PORT \ + (BIT_ULL(ICE_FLOW_FIELD_IDX_SCTP_SRC_PORT) | \ + BIT_ULL(ICE_FLOW_FIELD_IDX_SCTP_DST_PORT)) + +#define ICE_HASH_INVALID 0 +#define ICE_HASH_TCP_IPV4 (ICE_FLOW_HASH_IPV4 | ICE_FLOW_HASH_TCP_PORT) +#define ICE_HASH_TCP_IPV6 (ICE_FLOW_HASH_IPV6 | ICE_FLOW_HASH_TCP_PORT) +#define ICE_HASH_UDP_IPV4 (ICE_FLOW_HASH_IPV4 | ICE_FLOW_HASH_UDP_PORT) +#define ICE_HASH_UDP_IPV6 (ICE_FLOW_HASH_IPV6 | ICE_FLOW_HASH_UDP_PORT) + +/* Protocol header fields within a packet segment. A segment consists of one or + * more protocol headers that make up a logical group of protocol headers. Each + * logical group of protocol headers encapsulates or is encapsulated using/by + * tunneling or encapsulation protocols for network virtualization such as GRE, + * VxLAN, etc. + */ +enum ice_flow_seg_hdr { + ICE_FLOW_SEG_HDR_NONE = 0x00000000, + ICE_FLOW_SEG_HDR_IPV4 = 0x00000004, + ICE_FLOW_SEG_HDR_IPV6 = 0x00000008, + ICE_FLOW_SEG_HDR_TCP = 0x00000040, + ICE_FLOW_SEG_HDR_UDP = 0x00000080, + ICE_FLOW_SEG_HDR_SCTP = 0x00000100, +}; + +enum ice_flow_field { + /* L3 */ + ICE_FLOW_FIELD_IDX_IPV4_SA, + ICE_FLOW_FIELD_IDX_IPV4_DA, + ICE_FLOW_FIELD_IDX_IPV6_SA, + ICE_FLOW_FIELD_IDX_IPV6_DA, + /* L4 */ + ICE_FLOW_FIELD_IDX_TCP_SRC_PORT, + ICE_FLOW_FIELD_IDX_TCP_DST_PORT, + ICE_FLOW_FIELD_IDX_UDP_SRC_PORT, + ICE_FLOW_FIELD_IDX_UDP_DST_PORT, + ICE_FLOW_FIELD_IDX_SCTP_SRC_PORT, + ICE_FLOW_FIELD_IDX_SCTP_DST_PORT, + /* The total number of enums must not exceed 64 */ + ICE_FLOW_FIELD_IDX_MAX +}; + +/* Flow headers and fields for AVF support */ +enum ice_flow_avf_hdr_field { + /* Values 0 - 28 are reserved for future use */ + ICE_AVF_FLOW_FIELD_INVALID = 0, + ICE_AVF_FLOW_FIELD_UNICAST_IPV4_UDP = 29, + ICE_AVF_FLOW_FIELD_MULTICAST_IPV4_UDP, + ICE_AVF_FLOW_FIELD_IPV4_UDP, + ICE_AVF_FLOW_FIELD_IPV4_TCP_SYN_NO_ACK, + ICE_AVF_FLOW_FIELD_IPV4_TCP, + ICE_AVF_FLOW_FIELD_IPV4_SCTP, + ICE_AVF_FLOW_FIELD_IPV4_OTHER, + ICE_AVF_FLOW_FIELD_FRAG_IPV4, + /* Values 37-38 are reserved */ + ICE_AVF_FLOW_FIELD_UNICAST_IPV6_UDP = 39, + ICE_AVF_FLOW_FIELD_MULTICAST_IPV6_UDP, + ICE_AVF_FLOW_FIELD_IPV6_UDP, + ICE_AVF_FLOW_FIELD_IPV6_TCP_SYN_NO_ACK, + ICE_AVF_FLOW_FIELD_IPV6_TCP, + ICE_AVF_FLOW_FIELD_IPV6_SCTP, + ICE_AVF_FLOW_FIELD_IPV6_OTHER, + ICE_AVF_FLOW_FIELD_FRAG_IPV6, + ICE_AVF_FLOW_FIELD_RSVD47, + ICE_AVF_FLOW_FIELD_FCOE_OX, + ICE_AVF_FLOW_FIELD_FCOE_RX, + ICE_AVF_FLOW_FIELD_FCOE_OTHER, + /* Values 51-62 are reserved */ + ICE_AVF_FLOW_FIELD_L2_PAYLOAD = 63, + ICE_AVF_FLOW_FIELD_MAX +}; + +/* Supported RSS offloads This macro is defined to support + * VIRTCHNL_OP_GET_RSS_HENA_CAPS ops. PF driver sends the RSS hardware + * capabilities to the caller of this ops. + */ +#define ICE_DEFAULT_RSS_HENA ( \ + BIT_ULL(ICE_AVF_FLOW_FIELD_IPV4_UDP) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_IPV4_SCTP) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_IPV4_TCP) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_IPV4_OTHER) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_FRAG_IPV4) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_IPV6_UDP) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_IPV6_TCP) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_IPV6_SCTP) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_IPV6_OTHER) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_FRAG_IPV6) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_IPV4_TCP_SYN_NO_ACK) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_UNICAST_IPV4_UDP) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_MULTICAST_IPV4_UDP) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_IPV6_TCP_SYN_NO_ACK) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_UNICAST_IPV6_UDP) | \ + BIT_ULL(ICE_AVF_FLOW_FIELD_MULTICAST_IPV6_UDP)) + +enum ice_flow_dir { + ICE_FLOW_RX = 0x02, +}; + +enum ice_flow_priority { + ICE_FLOW_PRIO_LOW, + ICE_FLOW_PRIO_NORMAL, + ICE_FLOW_PRIO_HIGH +}; + +#define ICE_FLOW_SEG_MAX 2 +#define ICE_FLOW_FV_EXTRACT_SZ 2 + +#define ICE_FLOW_SET_HDRS(seg, val) ((seg)->hdrs |= (u32)(val)) + +struct ice_flow_seg_xtrct { + u8 prot_id; /* Protocol ID of extracted header field */ + u16 off; /* Starting offset of the field in header in bytes */ + u8 idx; /* Index of FV entry used */ + u8 disp; /* Displacement of field in bits fr. FV entry's start */ +}; + +enum ice_flow_fld_match_type { + ICE_FLOW_FLD_TYPE_REG, /* Value, mask */ + ICE_FLOW_FLD_TYPE_RANGE, /* Value, mask, last (upper bound) */ + ICE_FLOW_FLD_TYPE_PREFIX, /* IP address, prefix, size of prefix */ + ICE_FLOW_FLD_TYPE_SIZE, /* Value, mask, size of match */ +}; + +struct ice_flow_fld_loc { + /* Describe offsets of field information relative to the beginning of + * input buffer provided when adding flow entries. + */ + u16 val; /* Offset where the value is located */ + u16 mask; /* Offset where the mask/prefix value is located */ + u16 last; /* Length or offset where the upper value is located */ +}; + +struct ice_flow_fld_info { + enum ice_flow_fld_match_type type; + /* Location where to retrieve data from an input buffer */ + struct ice_flow_fld_loc src; + /* Location where to put the data into the final entry buffer */ + struct ice_flow_fld_loc entry; + struct ice_flow_seg_xtrct xtrct; +}; + +struct ice_flow_seg_info { + u32 hdrs; /* Bitmask indicating protocol headers present */ + u64 match; /* Bitmask indicating header fields to be matched */ + u64 range; /* Bitmask indicating header fields matched as ranges */ + + struct ice_flow_fld_info fields[ICE_FLOW_FIELD_IDX_MAX]; +}; + +struct ice_flow_prof { + struct list_head l_entry; + + u64 id; + enum ice_flow_dir dir; + u8 segs_cnt; + + /* Keep track of flow entries associated with this flow profile */ + struct mutex entries_lock; + struct list_head entries; + + struct ice_flow_seg_info segs[ICE_FLOW_SEG_MAX]; + + /* software VSI handles referenced by this flow profile */ + DECLARE_BITMAP(vsis, ICE_MAX_VSI); +}; + +struct ice_rss_cfg { + struct list_head l_entry; + /* bitmap of VSIs added to the RSS entry */ + DECLARE_BITMAP(vsis, ICE_MAX_VSI); + u64 hashed_flds; + u32 packet_hdr; +}; + +enum ice_status ice_flow_rem_entry(struct ice_hw *hw, u64 entry_h); +void ice_rem_vsi_rss_list(struct ice_hw *hw, u16 vsi_handle); +enum ice_status ice_replay_rss_cfg(struct ice_hw *hw, u16 vsi_handle); +enum ice_status +ice_add_avf_rss_cfg(struct ice_hw *hw, u16 vsi_handle, u64 hashed_flds); +enum ice_status ice_rem_vsi_rss_cfg(struct ice_hw *hw, u16 vsi_handle); +enum ice_status +ice_add_rss_cfg(struct ice_hw *hw, u16 vsi_handle, u64 hashed_flds, + u32 addl_hdrs); +u64 ice_get_rss_cfg(struct ice_hw *hw, u16 vsi_handle, u32 hdrs); +#endif /* _ICE_FLOW_H_ */ diff --git a/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h b/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h index 0997d352709b..878e125d8b42 100644 --- a/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h +++ b/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h @@ -199,6 +199,14 @@ enum ice_rxdid { /* Receive Flex Descriptor Rx opcode values */ #define ICE_RX_OPC_MDID 0x01 +/* Receive Descriptor MDID values that access packet flags */ +enum ice_flex_mdid_pkt_flags { + ICE_RX_MDID_PKT_FLAGS_15_0 = 20, + ICE_RX_MDID_PKT_FLAGS_31_16, + ICE_RX_MDID_PKT_FLAGS_47_32, + ICE_RX_MDID_PKT_FLAGS_63_48, +}; + /* Receive Descriptor MDID values */ enum ice_flex_rx_mdid { ICE_RX_MDID_FLOW_ID_LOWER = 5, diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c index 4cfad81ba496..1874c9f51a32 100644 --- a/drivers/net/ethernet/intel/ice/ice_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_lib.c @@ -3,6 +3,7 @@ #include "ice.h" #include "ice_base.h" +#include "ice_flow.h" #include "ice_lib.h" #include "ice_dcb_lib.h" @@ -493,7 +494,28 @@ bool ice_is_safe_mode(struct ice_pf *pf) } /** - * ice_rss_clean - Delete RSS related VSI structures that hold user inputs + * ice_vsi_clean_rss_flow_fld - Delete RSS configuration + * @vsi: the VSI being cleaned up + * + * This function deletes RSS input set for all flows that were configured + * for this VSI + */ +static void ice_vsi_clean_rss_flow_fld(struct ice_vsi *vsi) +{ + struct ice_pf *pf = vsi->back; + enum ice_status status; + + if (ice_is_safe_mode(pf)) + return; + + status = ice_rem_vsi_rss_cfg(&pf->hw, vsi->idx); + if (status) + dev_dbg(ice_pf_to_dev(pf), "ice_rem_vsi_rss_cfg failed for vsi = %d, error = %d\n", + vsi->vsi_num, status); +} + +/** + * ice_rss_clean - Delete RSS related VSI structures and configuration * @vsi: the VSI being removed */ static void ice_rss_clean(struct ice_vsi *vsi) @@ -507,6 +529,11 @@ static void ice_rss_clean(struct ice_vsi *vsi) devm_kfree(dev, vsi->rss_hkey_user); if (vsi->rss_lut_user) devm_kfree(dev, vsi->rss_lut_user); + + ice_vsi_clean_rss_flow_fld(vsi); + /* remove RSS replay list */ + if (!ice_is_safe_mode(pf)) + ice_rem_vsi_rss_list(&pf->hw, vsi->idx); } /** @@ -1087,6 +1114,115 @@ ice_vsi_cfg_rss_exit: } /** + * ice_vsi_set_vf_rss_flow_fld - Sets VF VSI RSS input set for different flows + * @vsi: VSI to be configured + * + * This function will only be called during the VF VSI setup. Upon successful + * completion of package download, this function will configure default RSS + * input sets for VF VSI. + */ +static void ice_vsi_set_vf_rss_flow_fld(struct ice_vsi *vsi) +{ + struct ice_pf *pf = vsi->back; + enum ice_status status; + struct device *dev; + + dev = ice_pf_to_dev(pf); + if (ice_is_safe_mode(pf)) { + dev_dbg(dev, "Advanced RSS disabled. Package download failed, vsi num = %d\n", + vsi->vsi_num); + return; + } + + status = ice_add_avf_rss_cfg(&pf->hw, vsi->idx, ICE_DEFAULT_RSS_HENA); + if (status) + dev_dbg(dev, "ice_add_avf_rss_cfg failed for vsi = %d, error = %d\n", + vsi->vsi_num, status); +} + +/** + * ice_vsi_set_rss_flow_fld - Sets RSS input set for different flows + * @vsi: VSI to be configured + * + * This function will only be called after successful download package call + * during initialization of PF. Since the downloaded package will erase the + * RSS section, this function will configure RSS input sets for different + * flow types. The last profile added has the highest priority, therefore 2 + * tuple profiles (i.e. IPv4 src/dst) are added before 4 tuple profiles + * (i.e. IPv4 src/dst TCP src/dst port). + */ +static void ice_vsi_set_rss_flow_fld(struct ice_vsi *vsi) +{ + u16 vsi_handle = vsi->idx, vsi_num = vsi->vsi_num; + struct ice_pf *pf = vsi->back; + struct ice_hw *hw = &pf->hw; + enum ice_status status; + struct device *dev; + + dev = ice_pf_to_dev(pf); + if (ice_is_safe_mode(pf)) { + dev_dbg(dev, "Advanced RSS disabled. Package download failed, vsi num = %d\n", + vsi_num); + return; + } + /* configure RSS for IPv4 with input set IP src/dst */ + status = ice_add_rss_cfg(hw, vsi_handle, ICE_FLOW_HASH_IPV4, + ICE_FLOW_SEG_HDR_IPV4); + if (status) + dev_dbg(dev, "ice_add_rss_cfg failed for ipv4 flow, vsi = %d, error = %d\n", + vsi_num, status); + + /* configure RSS for IPv6 with input set IPv6 src/dst */ + status = ice_add_rss_cfg(hw, vsi_handle, ICE_FLOW_HASH_IPV6, + ICE_FLOW_SEG_HDR_IPV6); + if (status) + dev_dbg(dev, "ice_add_rss_cfg failed for ipv6 flow, vsi = %d, error = %d\n", + vsi_num, status); + + /* configure RSS for tcp4 with input set IP src/dst, TCP src/dst */ + status = ice_add_rss_cfg(hw, vsi_handle, ICE_HASH_TCP_IPV4, + ICE_FLOW_SEG_HDR_TCP | ICE_FLOW_SEG_HDR_IPV4); + if (status) + dev_dbg(dev, "ice_add_rss_cfg failed for tcp4 flow, vsi = %d, error = %d\n", + vsi_num, status); + + /* configure RSS for udp4 with input set IP src/dst, UDP src/dst */ + status = ice_add_rss_cfg(hw, vsi_handle, ICE_HASH_UDP_IPV4, + ICE_FLOW_SEG_HDR_UDP | ICE_FLOW_SEG_HDR_IPV4); + if (status) + dev_dbg(dev, "ice_add_rss_cfg failed for udp4 flow, vsi = %d, error = %d\n", + vsi_num, status); + + /* configure RSS for sctp4 with input set IP src/dst */ + status = ice_add_rss_cfg(hw, vsi_handle, ICE_FLOW_HASH_IPV4, + ICE_FLOW_SEG_HDR_SCTP | ICE_FLOW_SEG_HDR_IPV4); + if (status) + dev_dbg(dev, "ice_add_rss_cfg failed for sctp4 flow, vsi = %d, error = %d\n", + vsi_num, status); + + /* configure RSS for tcp6 with input set IPv6 src/dst, TCP src/dst */ + status = ice_add_rss_cfg(hw, vsi_handle, ICE_HASH_TCP_IPV6, + ICE_FLOW_SEG_HDR_TCP | ICE_FLOW_SEG_HDR_IPV6); + if (status) + dev_dbg(dev, "ice_add_rss_cfg failed for tcp6 flow, vsi = %d, error = %d\n", + vsi_num, status); + + /* configure RSS for udp6 with input set IPv6 src/dst, UDP src/dst */ + status = ice_add_rss_cfg(hw, vsi_handle, ICE_HASH_UDP_IPV6, + ICE_FLOW_SEG_HDR_UDP | ICE_FLOW_SEG_HDR_IPV6); + if (status) + dev_dbg(dev, "ice_add_rss_cfg failed for udp6 flow, vsi = %d, error = %d\n", + vsi_num, status); + + /* configure RSS for sctp6 with input set IPv6 src/dst */ + status = ice_add_rss_cfg(hw, vsi_handle, ICE_FLOW_HASH_IPV6, + ICE_FLOW_SEG_HDR_SCTP | ICE_FLOW_SEG_HDR_IPV6); + if (status) + dev_dbg(dev, "ice_add_rss_cfg failed for sctp6 flow, vsi = %d, error = %d\n", + vsi_num, status); +} + +/** * ice_add_mac_to_list - Add a MAC address filter entry to the list * @vsi: the VSI to be forwarded to * @add_list: pointer to the list which contains MAC filter entries @@ -1901,8 +2037,10 @@ ice_vsi_setup(struct ice_pf *pf, struct ice_port_info *pi, * receive traffic on first queue. Hence no need to capture * return value */ - if (test_bit(ICE_FLAG_RSS_ENA, pf->flags)) + if (test_bit(ICE_FLAG_RSS_ENA, pf->flags)) { ice_vsi_cfg_rss_lut_key(vsi); + ice_vsi_set_rss_flow_fld(vsi); + } break; case ICE_VSI_VF: /* VF driver will take care of creating netdev for this type and @@ -1926,8 +2064,10 @@ ice_vsi_setup(struct ice_pf *pf, struct ice_port_info *pi, * receive traffic on first queue. Hence no need to capture * return value */ - if (test_bit(ICE_FLAG_RSS_ENA, pf->flags)) + if (test_bit(ICE_FLAG_RSS_ENA, pf->flags)) { ice_vsi_cfg_rss_lut_key(vsi); + ice_vsi_set_vf_rss_flow_fld(vsi); + } break; case ICE_VSI_LB: ret = ice_vsi_alloc_rings(vsi); diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index eb9d00608e9a..5ae671609f98 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -13,7 +13,7 @@ #define DRV_VERSION_MAJOR 0 #define DRV_VERSION_MINOR 8 -#define DRV_VERSION_BUILD 1 +#define DRV_VERSION_BUILD 2 #define DRV_VERSION __stringify(DRV_VERSION_MAJOR) "." \ __stringify(DRV_VERSION_MINOR) "." \ diff --git a/drivers/net/ethernet/intel/ice/ice_protocol_type.h b/drivers/net/ethernet/intel/ice/ice_protocol_type.h new file mode 100644 index 000000000000..71647566964e --- /dev/null +++ b/drivers/net/ethernet/intel/ice/ice_protocol_type.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (c) 2019, Intel Corporation. */ + +#ifndef _ICE_PROTOCOL_TYPE_H_ +#define _ICE_PROTOCOL_TYPE_H_ +/* Decoders for ice_prot_id: + * - F: First + * - I: Inner + * - L: Last + * - O: Outer + * - S: Single + */ +enum ice_prot_id { + ICE_PROT_ID_INVAL = 0, + ICE_PROT_IPV4_OF_OR_S = 32, + ICE_PROT_IPV4_IL = 33, + ICE_PROT_IPV6_OF_OR_S = 40, + ICE_PROT_IPV6_IL = 41, + ICE_PROT_TCP_IL = 49, + ICE_PROT_UDP_IL_OR_S = 53, + ICE_PROT_SCTP_IL = 96, + ICE_PROT_META_ID = 255, /* when offset == metadata */ + ICE_PROT_INVALID = 255 /* when offset == ICE_FV_OFFSET_INVAL */ +}; +#endif /* _ICE_PROTOCOL_TYPE_H_ */ diff --git a/drivers/net/ethernet/intel/ice/ice_status.h b/drivers/net/ethernet/intel/ice/ice_status.h index c01597885629..a9a8bc3aca42 100644 --- a/drivers/net/ethernet/intel/ice/ice_status.h +++ b/drivers/net/ethernet/intel/ice/ice_status.h @@ -26,6 +26,7 @@ enum ice_status { ICE_ERR_IN_USE = -16, ICE_ERR_MAX_LIMIT = -17, ICE_ERR_RESET_ONGOING = -18, + ICE_ERR_HW_TABLE = -19, ICE_ERR_NVM_CHECKSUM = -51, ICE_ERR_BUF_TOO_SHORT = -52, ICE_ERR_NVM_BLANK_MODE = -53, diff --git a/drivers/net/ethernet/intel/ice/ice_switch.c b/drivers/net/ethernet/intel/ice/ice_switch.c index b5a53f862a83..431266081a80 100644 --- a/drivers/net/ethernet/intel/ice/ice_switch.c +++ b/drivers/net/ethernet/intel/ice/ice_switch.c @@ -50,42 +50,6 @@ static const u8 dummy_eth_header[DUMMY_ETH_HDR_LEN] = { 0x2, 0, 0, 0, 0, 0, ((n) * sizeof(((struct ice_sw_rule_vsi_list *)0)->vsi))) /** - * ice_aq_alloc_free_res - command to allocate/free resources - * @hw: pointer to the HW struct - * @num_entries: number of resource entries in buffer - * @buf: Indirect buffer to hold data parameters and response - * @buf_size: size of buffer for indirect commands - * @opc: pass in the command opcode - * @cd: pointer to command details structure or NULL - * - * Helper function to allocate/free resources using the admin queue commands - */ -static enum ice_status -ice_aq_alloc_free_res(struct ice_hw *hw, u16 num_entries, - struct ice_aqc_alloc_free_res_elem *buf, u16 buf_size, - enum ice_adminq_opc opc, struct ice_sq_cd *cd) -{ - struct ice_aqc_alloc_free_res_cmd *cmd; - struct ice_aq_desc desc; - - cmd = &desc.params.sw_res_ctrl; - - if (!buf) - return ICE_ERR_PARAM; - - if (buf_size < (num_entries * sizeof(buf->elem[0]))) - return ICE_ERR_PARAM; - - ice_fill_dflt_direct_cmd_desc(&desc, opc); - - desc.flags |= cpu_to_le16(ICE_AQ_FLAG_RD); - - cmd->num_entries = cpu_to_le16(num_entries); - - return ice_aq_send_cmd(hw, &desc, buf, buf_size, cd); -} - -/** * ice_init_def_sw_recp - initialize the recipe book keeping tables * @hw: pointer to the HW struct * diff --git a/drivers/net/ethernet/intel/ice/ice_type.h b/drivers/net/ethernet/intel/ice/ice_type.h index c4854a987130..b361ffabb0ca 100644 --- a/drivers/net/ethernet/intel/ice/ice_type.h +++ b/drivers/net/ethernet/intel/ice/ice_type.h @@ -13,6 +13,7 @@ #include "ice_controlq.h" #include "ice_lan_tx_rx.h" #include "ice_flex_type.h" +#include "ice_protocol_type.h" static inline bool ice_is_tc_ena(unsigned long bitmap, u8 tc) { @@ -41,6 +42,7 @@ static inline u32 ice_round_to_num(u32 N, u32 R) #define ICE_DBG_QCTX BIT_ULL(6) #define ICE_DBG_NVM BIT_ULL(7) #define ICE_DBG_LAN BIT_ULL(8) +#define ICE_DBG_FLOW BIT_ULL(9) #define ICE_DBG_SW BIT_ULL(13) #define ICE_DBG_SCHED BIT_ULL(14) #define ICE_DBG_PKG BIT_ULL(16) @@ -559,6 +561,10 @@ struct ice_hw { /* HW block tables */ struct ice_blk_info blk[ICE_BLK_COUNT]; + struct mutex fl_profs_locks[ICE_BLK_COUNT]; /* lock fltr profiles */ + struct list_head fl_profs[ICE_BLK_COUNT]; + struct mutex rss_locks; /* protect RSS configuration */ + struct list_head rss_list_head; }; /* Statistics collected by each port, VSI, VEB, and S-channel */ diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c index 0449d4b28ade..2dfbfdff45a8 100644 --- a/drivers/net/ethernet/marvell/mvneta.c +++ b/drivers/net/ethernet/marvell/mvneta.c @@ -4226,6 +4226,12 @@ static int mvneta_xdp_setup(struct net_device *dev, struct bpf_prog *prog, return -EOPNOTSUPP; } + if (pp->bm_priv) { + NL_SET_ERR_MSG_MOD(extack, + "Hardware Buffer Management not supported on XDP"); + return -EOPNOTSUPP; + } + need_update = !!pp->xdp_prog != !!prog; if (running && need_update) mvneta_stop(dev); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index fc80b59db9a8..220ef9f06f84 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -892,6 +892,8 @@ struct mlx5e_profile { int (*update_rx)(struct mlx5e_priv *priv); void (*update_stats)(struct mlx5e_priv *priv); void (*update_carrier)(struct mlx5e_priv *priv); + unsigned int (*stats_grps_num)(struct mlx5e_priv *priv); + mlx5e_stats_grp_t *stats_grps; struct { mlx5e_fp_handle_rx_cqe handle_rx_cqe; mlx5e_fp_handle_rx_cqe handle_rx_cqe_mpwqe; @@ -964,7 +966,6 @@ struct sk_buff * mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe, struct mlx5e_wqe_frag_info *wi, u32 cqe_bcnt); -void mlx5e_update_stats(struct mlx5e_priv *priv); void mlx5e_get_stats(struct net_device *dev, struct rtnl_link_stats64 *stats); void mlx5e_fold_sw_stats64(struct mlx5e_priv *priv, struct rtnl_link_stats64 *s); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h b/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h index d48292ccda29..0416f7712109 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h @@ -21,6 +21,7 @@ struct mlx5e_tc_table { DECLARE_HASHTABLE(hairpin_tbl, 8); struct notifier_block netdevice_nb; + struct netdev_net_notifier netdevice_nn; }; struct mlx5e_flow_table { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c index 778dab1af8fc..f260dd96873b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c @@ -180,7 +180,7 @@ mlx5e_ktls_tx_post_param_wqes(struct mlx5e_txqsq *sq, struct tx_sync_info { u64 rcd_sn; - s32 sync_len; + u32 sync_len; int nr_frags; skb_frag_t frags[MAX_SKB_FRAGS]; }; @@ -193,13 +193,14 @@ enum mlx5e_ktls_sync_retval { static enum mlx5e_ktls_sync_retval tx_sync_info_get(struct mlx5e_ktls_offload_context_tx *priv_tx, - u32 tcp_seq, struct tx_sync_info *info) + u32 tcp_seq, int datalen, struct tx_sync_info *info) { struct tls_offload_context_tx *tx_ctx = priv_tx->tx_ctx; enum mlx5e_ktls_sync_retval ret = MLX5E_KTLS_SYNC_DONE; struct tls_record_info *record; int remaining, i = 0; unsigned long flags; + bool ends_before; spin_lock_irqsave(&tx_ctx->lock, flags); record = tls_get_record(tx_ctx, tcp_seq, &info->rcd_sn); @@ -209,9 +210,21 @@ tx_sync_info_get(struct mlx5e_ktls_offload_context_tx *priv_tx, goto out; } - if (unlikely(tcp_seq < tls_record_start_seq(record))) { - ret = tls_record_is_start_marker(record) ? - MLX5E_KTLS_SYNC_SKIP_NO_DATA : MLX5E_KTLS_SYNC_FAIL; + /* There are the following cases: + * 1. packet ends before start marker: bypass offload. + * 2. packet starts before start marker and ends after it: drop, + * not supported, breaks contract with kernel. + * 3. packet ends before tls record info starts: drop, + * this packet was already acknowledged and its record info + * was released. + */ + ends_before = before(tcp_seq + datalen, tls_record_start_seq(record)); + + if (unlikely(tls_record_is_start_marker(record))) { + ret = ends_before ? MLX5E_KTLS_SYNC_SKIP_NO_DATA : MLX5E_KTLS_SYNC_FAIL; + goto out; + } else if (ends_before) { + ret = MLX5E_KTLS_SYNC_FAIL; goto out; } @@ -337,7 +350,7 @@ mlx5e_ktls_tx_handle_ooo(struct mlx5e_ktls_offload_context_tx *priv_tx, u8 num_wqebbs; int i = 0; - ret = tx_sync_info_get(priv_tx, seq, &info); + ret = tx_sync_info_get(priv_tx, seq, datalen, &info); if (unlikely(ret != MLX5E_KTLS_SYNC_DONE)) { if (ret == MLX5E_KTLS_SYNC_SKIP_NO_DATA) { stats->tls_skip_no_sync_data++; @@ -351,14 +364,6 @@ mlx5e_ktls_tx_handle_ooo(struct mlx5e_ktls_offload_context_tx *priv_tx, goto err_out; } - if (unlikely(info.sync_len < 0)) { - if (likely(datalen <= -info.sync_len)) - return MLX5E_KTLS_SYNC_DONE; - - stats->tls_drop_bypass_req++; - goto err_out; - } - stats->tls_ooo++; tx_post_resync_params(sq, priv_tx, info.rcd_sn); @@ -378,8 +383,6 @@ mlx5e_ktls_tx_handle_ooo(struct mlx5e_ktls_offload_context_tx *priv_tx, if (unlikely(contig_wqebbs_room < num_wqebbs)) mlx5e_fill_sq_frag_edge(sq, wq, pi, contig_wqebbs_room); - tx_post_resync_params(sq, priv_tx, info.rcd_sn); - for (; i < info.nr_frags; i++) { unsigned int orig_fsz, frag_offset = 0, n = 0; skb_frag_t *f = &info.frags[i]; @@ -455,12 +458,18 @@ struct sk_buff *mlx5e_ktls_handle_tx_skb(struct net_device *netdev, enum mlx5e_ktls_sync_retval ret = mlx5e_ktls_tx_handle_ooo(priv_tx, sq, datalen, seq); - if (likely(ret == MLX5E_KTLS_SYNC_DONE)) + switch (ret) { + case MLX5E_KTLS_SYNC_DONE: *wqe = mlx5e_sq_fetch_wqe(sq, sizeof(**wqe), pi); - else if (ret == MLX5E_KTLS_SYNC_FAIL) + break; + case MLX5E_KTLS_SYNC_SKIP_NO_DATA: + if (likely(!skb->decrypted)) + goto out; + WARN_ON_ONCE(1); + /* fall-through */ + default: /* MLX5E_KTLS_SYNC_FAIL */ goto err_out; - else /* ret == MLX5E_KTLS_SYNC_SKIP_NO_DATA */ - goto out; + } } priv_tx->expected_seq = seq + datalen; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c index c6776f308d5e..d674cb679895 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c @@ -218,13 +218,9 @@ static const struct pflag_desc mlx5e_priv_flags[MLX5E_NUM_PFLAGS]; int mlx5e_ethtool_get_sset_count(struct mlx5e_priv *priv, int sset) { - int i, num_stats = 0; - switch (sset) { case ETH_SS_STATS: - for (i = 0; i < mlx5e_num_stats_grps; i++) - num_stats += mlx5e_stats_grps[i].get_num_stats(priv); - return num_stats; + return mlx5e_stats_total_num(priv); case ETH_SS_PRIV_FLAGS: return MLX5E_NUM_PFLAGS; case ETH_SS_TEST: @@ -242,14 +238,6 @@ static int mlx5e_get_sset_count(struct net_device *dev, int sset) return mlx5e_ethtool_get_sset_count(priv, sset); } -static void mlx5e_fill_stats_strings(struct mlx5e_priv *priv, u8 *data) -{ - int i, idx = 0; - - for (i = 0; i < mlx5e_num_stats_grps; i++) - idx = mlx5e_stats_grps[i].fill_strings(priv, data, idx); -} - void mlx5e_ethtool_get_strings(struct mlx5e_priv *priv, u32 stringset, u8 *data) { int i; @@ -268,7 +256,7 @@ void mlx5e_ethtool_get_strings(struct mlx5e_priv *priv, u32 stringset, u8 *data) break; case ETH_SS_STATS: - mlx5e_fill_stats_strings(priv, data); + mlx5e_stats_fill_strings(priv, data); break; } } @@ -283,14 +271,13 @@ static void mlx5e_get_strings(struct net_device *dev, u32 stringset, u8 *data) void mlx5e_ethtool_get_ethtool_stats(struct mlx5e_priv *priv, struct ethtool_stats *stats, u64 *data) { - int i, idx = 0; + int idx = 0; mutex_lock(&priv->state_lock); - mlx5e_update_stats(priv); + mlx5e_stats_update(priv); mutex_unlock(&priv->state_lock); - for (i = 0; i < mlx5e_num_stats_grps; i++) - idx = mlx5e_stats_grps[i].fill_stats(priv, data, idx); + mlx5e_stats_fill(priv, data, idx); } static void mlx5e_get_ethtool_stats(struct net_device *dev, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 78737fd42616..454d3459bd8b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -159,23 +159,14 @@ static void mlx5e_update_carrier_work(struct work_struct *work) mutex_unlock(&priv->state_lock); } -void mlx5e_update_stats(struct mlx5e_priv *priv) -{ - int i; - - for (i = mlx5e_num_stats_grps - 1; i >= 0; i--) - if (mlx5e_stats_grps[i].update_stats) - mlx5e_stats_grps[i].update_stats(priv); -} - void mlx5e_update_ndo_stats(struct mlx5e_priv *priv) { int i; - for (i = mlx5e_num_stats_grps - 1; i >= 0; i--) - if (mlx5e_stats_grps[i].update_stats_mask & + for (i = mlx5e_nic_stats_grps_num(priv) - 1; i >= 0; i--) + if (mlx5e_nic_stats_grps[i]->update_stats_mask & MLX5E_NDO_UPDATE_STATS) - mlx5e_stats_grps[i].update_stats(priv); + mlx5e_nic_stats_grps[i]->update_stats(priv); } static void mlx5e_update_stats_work(struct work_struct *work) @@ -4878,6 +4869,8 @@ static void mlx5e_build_nic_netdev(struct net_device *netdev) netdev->hw_enc_features |= NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_UDP_TUNNEL_CSUM; netdev->gso_partial_features = NETIF_F_GSO_UDP_TUNNEL_CSUM; + netdev->vlan_features |= NETIF_F_GSO_UDP_TUNNEL | + NETIF_F_GSO_UDP_TUNNEL_CSUM; } if (mlx5e_tunnel_proto_supported(mdev, IPPROTO_GRE)) { @@ -5151,6 +5144,7 @@ static void mlx5e_nic_enable(struct mlx5e_priv *priv) static void mlx5e_nic_disable(struct mlx5e_priv *priv) { + struct net_device *netdev = priv->netdev; struct mlx5_core_dev *mdev = priv->mdev; #ifdef CONFIG_MLX5_CORE_EN_DCB @@ -5171,7 +5165,7 @@ static void mlx5e_nic_disable(struct mlx5e_priv *priv) mlx5e_monitor_counter_cleanup(priv); mlx5e_disable_async_events(priv); - mlx5_lag_remove(mdev); + mlx5_lag_remove(mdev, netdev); } int mlx5e_update_nic_rx(struct mlx5e_priv *priv) @@ -5195,6 +5189,8 @@ static const struct mlx5e_profile mlx5e_nic_profile = { .rx_handlers.handle_rx_cqe_mpwqe = mlx5e_handle_rx_cqe_mpwrq, .max_tc = MLX5E_MAX_NUM_TC, .rq_groups = MLX5E_NUM_RQ_GROUPS(XSK), + .stats_grps = mlx5e_nic_stats_grps, + .stats_grps_num = mlx5e_nic_stats_grps_num, }; /* mlx5e generic netdev management API (move to en_common.c) */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c index 446eb4d6c983..7b48ccacebe2 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -117,24 +117,71 @@ static const struct counter_desc vport_rep_stats_desc[] = { #define NUM_VPORT_REP_SW_COUNTERS ARRAY_SIZE(sw_rep_stats_desc) #define NUM_VPORT_REP_HW_COUNTERS ARRAY_SIZE(vport_rep_stats_desc) -static void mlx5e_rep_get_strings(struct net_device *dev, - u32 stringset, uint8_t *data) +static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(sw_rep) { - int i, j; + return NUM_VPORT_REP_SW_COUNTERS; +} - switch (stringset) { - case ETH_SS_STATS: - for (i = 0; i < NUM_VPORT_REP_SW_COUNTERS; i++) - strcpy(data + (i * ETH_GSTRING_LEN), - sw_rep_stats_desc[i].format); - for (j = 0; j < NUM_VPORT_REP_HW_COUNTERS; j++, i++) - strcpy(data + (i * ETH_GSTRING_LEN), - vport_rep_stats_desc[j].format); - break; - } +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(sw_rep) +{ + int i; + + for (i = 0; i < NUM_VPORT_REP_SW_COUNTERS; i++) + strcpy(data + (idx++) * ETH_GSTRING_LEN, + sw_rep_stats_desc[i].format); + return idx; +} + +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(sw_rep) +{ + int i; + + for (i = 0; i < NUM_VPORT_REP_SW_COUNTERS; i++) + data[idx++] = MLX5E_READ_CTR64_CPU(&priv->stats.sw, + sw_rep_stats_desc, i); + return idx; +} + +static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(sw_rep) +{ + struct mlx5e_sw_stats *s = &priv->stats.sw; + struct rtnl_link_stats64 stats64 = {}; + + memset(s, 0, sizeof(*s)); + mlx5e_fold_sw_stats64(priv, &stats64); + + s->rx_packets = stats64.rx_packets; + s->rx_bytes = stats64.rx_bytes; + s->tx_packets = stats64.tx_packets; + s->tx_bytes = stats64.tx_bytes; + s->tx_queue_dropped = stats64.tx_dropped; +} + +static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(vport_rep) +{ + return NUM_VPORT_REP_HW_COUNTERS; +} + +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(vport_rep) +{ + int i; + + for (i = 0; i < NUM_VPORT_REP_HW_COUNTERS; i++) + strcpy(data + (idx++) * ETH_GSTRING_LEN, vport_rep_stats_desc[i].format); + return idx; +} + +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(vport_rep) +{ + int i; + + for (i = 0; i < NUM_VPORT_REP_HW_COUNTERS; i++) + data[idx++] = MLX5E_READ_CTR64_CPU(&priv->stats.vf_vport, + vport_rep_stats_desc, i); + return idx; } -static void mlx5e_rep_update_hw_counters(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(vport_rep) { struct mlx5_eswitch *esw = priv->mdev->priv.eswitch; struct mlx5e_rep_priv *rpriv = priv->ppriv; @@ -157,64 +204,33 @@ static void mlx5e_rep_update_hw_counters(struct mlx5e_priv *priv) vport_stats->tx_bytes = vf_stats.rx_bytes; } -static void mlx5e_uplink_rep_update_hw_counters(struct mlx5e_priv *priv) -{ - struct mlx5e_pport_stats *pstats = &priv->stats.pport; - struct rtnl_link_stats64 *vport_stats; - - mlx5e_grp_802_3_update_stats(priv); - - vport_stats = &priv->stats.vf_vport; - - vport_stats->rx_packets = PPORT_802_3_GET(pstats, a_frames_received_ok); - vport_stats->rx_bytes = PPORT_802_3_GET(pstats, a_octets_received_ok); - vport_stats->tx_packets = PPORT_802_3_GET(pstats, a_frames_transmitted_ok); - vport_stats->tx_bytes = PPORT_802_3_GET(pstats, a_octets_transmitted_ok); -} - -static void mlx5e_rep_update_sw_counters(struct mlx5e_priv *priv) +static void mlx5e_rep_get_strings(struct net_device *dev, + u32 stringset, uint8_t *data) { - struct mlx5e_sw_stats *s = &priv->stats.sw; - struct rtnl_link_stats64 stats64 = {}; - - memset(s, 0, sizeof(*s)); - mlx5e_fold_sw_stats64(priv, &stats64); + struct mlx5e_priv *priv = netdev_priv(dev); - s->rx_packets = stats64.rx_packets; - s->rx_bytes = stats64.rx_bytes; - s->tx_packets = stats64.tx_packets; - s->tx_bytes = stats64.tx_bytes; - s->tx_queue_dropped = stats64.tx_dropped; + switch (stringset) { + case ETH_SS_STATS: + mlx5e_stats_fill_strings(priv, data); + break; + } } static void mlx5e_rep_get_ethtool_stats(struct net_device *dev, struct ethtool_stats *stats, u64 *data) { struct mlx5e_priv *priv = netdev_priv(dev); - int i, j; - - if (!data) - return; - mutex_lock(&priv->state_lock); - mlx5e_rep_update_sw_counters(priv); - priv->profile->update_stats(priv); - mutex_unlock(&priv->state_lock); - - for (i = 0; i < NUM_VPORT_REP_SW_COUNTERS; i++) - data[i] = MLX5E_READ_CTR64_CPU(&priv->stats.sw, - sw_rep_stats_desc, i); - - for (j = 0; j < NUM_VPORT_REP_HW_COUNTERS; j++, i++) - data[i] = MLX5E_READ_CTR64_CPU(&priv->stats.vf_vport, - vport_rep_stats_desc, j); + mlx5e_ethtool_get_ethtool_stats(priv, stats, data); } static int mlx5e_rep_get_sset_count(struct net_device *dev, int sset) { + struct mlx5e_priv *priv = netdev_priv(dev); + switch (sset) { case ETH_SS_STATS: - return NUM_VPORT_REP_SW_COUNTERS + NUM_VPORT_REP_HW_COUNTERS; + return mlx5e_stats_total_num(priv); default: return -EOPNOTSUPP; } @@ -1674,10 +1690,65 @@ static void mlx5e_cleanup_rep_rx(struct mlx5e_priv *priv) mlx5e_close_drop_rq(&priv->drop_rq); } +static int mlx5e_init_ul_rep_rx(struct mlx5e_priv *priv) +{ + int err = mlx5e_init_rep_rx(priv); + + if (err) + return err; + + mlx5e_create_q_counters(priv); + return 0; +} + +static void mlx5e_cleanup_ul_rep_rx(struct mlx5e_priv *priv) +{ + mlx5e_destroy_q_counters(priv); + mlx5e_cleanup_rep_rx(priv); +} + +static int mlx5e_init_uplink_rep_tx(struct mlx5e_rep_priv *rpriv) +{ + struct mlx5_rep_uplink_priv *uplink_priv; + struct net_device *netdev; + struct mlx5e_priv *priv; + int err; + + netdev = rpriv->netdev; + priv = netdev_priv(netdev); + uplink_priv = &rpriv->uplink_priv; + + mutex_init(&uplink_priv->unready_flows_lock); + INIT_LIST_HEAD(&uplink_priv->unready_flows); + + /* init shared tc flow table */ + err = mlx5e_tc_esw_init(&uplink_priv->tc_ht); + if (err) + return err; + + mlx5_init_port_tun_entropy(&uplink_priv->tun_entropy, priv->mdev); + + /* init indirect block notifications */ + INIT_LIST_HEAD(&uplink_priv->tc_indr_block_priv_list); + uplink_priv->netdevice_nb.notifier_call = mlx5e_nic_rep_netdevice_event; + err = register_netdevice_notifier_dev_net(rpriv->netdev, + &uplink_priv->netdevice_nb, + &uplink_priv->netdevice_nn); + if (err) { + mlx5_core_err(priv->mdev, "Failed to register netdev notifier\n"); + goto tc_esw_cleanup; + } + + return 0; + +tc_esw_cleanup: + mlx5e_tc_esw_cleanup(&uplink_priv->tc_ht); + return err; +} + static int mlx5e_init_rep_tx(struct mlx5e_priv *priv) { struct mlx5e_rep_priv *rpriv = priv->ppriv; - struct mlx5_rep_uplink_priv *uplink_priv; int err; err = mlx5e_create_tises(priv); @@ -1687,52 +1758,41 @@ static int mlx5e_init_rep_tx(struct mlx5e_priv *priv) } if (rpriv->rep->vport == MLX5_VPORT_UPLINK) { - uplink_priv = &rpriv->uplink_priv; - - mutex_init(&uplink_priv->unready_flows_lock); - INIT_LIST_HEAD(&uplink_priv->unready_flows); - - /* init shared tc flow table */ - err = mlx5e_tc_esw_init(&uplink_priv->tc_ht); + err = mlx5e_init_uplink_rep_tx(rpriv); if (err) goto destroy_tises; - - mlx5_init_port_tun_entropy(&uplink_priv->tun_entropy, priv->mdev); - - /* init indirect block notifications */ - INIT_LIST_HEAD(&uplink_priv->tc_indr_block_priv_list); - uplink_priv->netdevice_nb.notifier_call = mlx5e_nic_rep_netdevice_event; - err = register_netdevice_notifier(&uplink_priv->netdevice_nb); - if (err) { - mlx5_core_err(priv->mdev, "Failed to register netdev notifier\n"); - goto tc_esw_cleanup; - } } return 0; -tc_esw_cleanup: - mlx5e_tc_esw_cleanup(&uplink_priv->tc_ht); destroy_tises: mlx5e_destroy_tises(priv); return err; } +static void mlx5e_cleanup_uplink_rep_tx(struct mlx5e_rep_priv *rpriv) +{ + struct mlx5_rep_uplink_priv *uplink_priv = &rpriv->uplink_priv; + + /* clean indirect TC block notifications */ + unregister_netdevice_notifier_dev_net(rpriv->netdev, + &uplink_priv->netdevice_nb, + &uplink_priv->netdevice_nn); + mlx5e_rep_indr_clean_block_privs(rpriv); + + /* delete shared tc flow table */ + mlx5e_tc_esw_cleanup(&rpriv->uplink_priv.tc_ht); + mutex_destroy(&rpriv->uplink_priv.unready_flows_lock); +} + static void mlx5e_cleanup_rep_tx(struct mlx5e_priv *priv) { struct mlx5e_rep_priv *rpriv = priv->ppriv; mlx5e_destroy_tises(priv); - if (rpriv->rep->vport == MLX5_VPORT_UPLINK) { - /* clean indirect TC block notifications */ - unregister_netdevice_notifier(&rpriv->uplink_priv.netdevice_nb); - mlx5e_rep_indr_clean_block_privs(rpriv); - - /* delete shared tc flow table */ - mlx5e_tc_esw_cleanup(&rpriv->uplink_priv.tc_ht); - mutex_destroy(&rpriv->uplink_priv.unready_flows_lock); - } + if (rpriv->rep->vport == MLX5_VPORT_UPLINK) + mlx5e_cleanup_uplink_rep_tx(rpriv); } static void mlx5e_rep_enable(struct mlx5e_priv *priv) @@ -1801,6 +1861,7 @@ static void mlx5e_uplink_rep_enable(struct mlx5e_priv *priv) static void mlx5e_uplink_rep_disable(struct mlx5e_priv *priv) { + struct net_device *netdev = priv->netdev; struct mlx5_core_dev *mdev = priv->mdev; struct mlx5e_rep_priv *rpriv = priv->ppriv; @@ -1809,7 +1870,44 @@ static void mlx5e_uplink_rep_disable(struct mlx5e_priv *priv) #endif mlx5_notifier_unregister(mdev, &priv->events_nb); cancel_work_sync(&rpriv->uplink_priv.reoffload_flows_work); - mlx5_lag_remove(mdev); + mlx5_lag_remove(mdev, netdev); +} + +static MLX5E_DEFINE_STATS_GRP(sw_rep, 0); +static MLX5E_DEFINE_STATS_GRP(vport_rep, MLX5E_NDO_UPDATE_STATS); + +/* The stats groups order is opposite to the update_stats() order calls */ +static mlx5e_stats_grp_t mlx5e_rep_stats_grps[] = { + &MLX5E_STATS_GRP(sw_rep), + &MLX5E_STATS_GRP(vport_rep), +}; + +static unsigned int mlx5e_rep_stats_grps_num(struct mlx5e_priv *priv) +{ + return ARRAY_SIZE(mlx5e_rep_stats_grps); +} + +/* The stats groups order is opposite to the update_stats() order calls */ +static mlx5e_stats_grp_t mlx5e_ul_rep_stats_grps[] = { + &MLX5E_STATS_GRP(sw), + &MLX5E_STATS_GRP(qcnt), + &MLX5E_STATS_GRP(vnic_env), + &MLX5E_STATS_GRP(vport), + &MLX5E_STATS_GRP(802_3), + &MLX5E_STATS_GRP(2863), + &MLX5E_STATS_GRP(2819), + &MLX5E_STATS_GRP(phy), + &MLX5E_STATS_GRP(eth_ext), + &MLX5E_STATS_GRP(pcie), + &MLX5E_STATS_GRP(per_prio), + &MLX5E_STATS_GRP(pme), + &MLX5E_STATS_GRP(channels), + &MLX5E_STATS_GRP(per_port_buff_congest), +}; + +static unsigned int mlx5e_ul_rep_stats_grps_num(struct mlx5e_priv *priv) +{ + return ARRAY_SIZE(mlx5e_ul_rep_stats_grps); } static const struct mlx5e_profile mlx5e_rep_profile = { @@ -1821,29 +1919,33 @@ static const struct mlx5e_profile mlx5e_rep_profile = { .cleanup_tx = mlx5e_cleanup_rep_tx, .enable = mlx5e_rep_enable, .update_rx = mlx5e_update_rep_rx, - .update_stats = mlx5e_rep_update_hw_counters, + .update_stats = mlx5e_update_ndo_stats, .rx_handlers.handle_rx_cqe = mlx5e_handle_rx_cqe_rep, .rx_handlers.handle_rx_cqe_mpwqe = mlx5e_handle_rx_cqe_mpwrq, .max_tc = 1, .rq_groups = MLX5E_NUM_RQ_GROUPS(REGULAR), + .stats_grps = mlx5e_rep_stats_grps, + .stats_grps_num = mlx5e_rep_stats_grps_num, }; static const struct mlx5e_profile mlx5e_uplink_rep_profile = { .init = mlx5e_init_rep, .cleanup = mlx5e_cleanup_rep, - .init_rx = mlx5e_init_rep_rx, - .cleanup_rx = mlx5e_cleanup_rep_rx, + .init_rx = mlx5e_init_ul_rep_rx, + .cleanup_rx = mlx5e_cleanup_ul_rep_rx, .init_tx = mlx5e_init_rep_tx, .cleanup_tx = mlx5e_cleanup_rep_tx, .enable = mlx5e_uplink_rep_enable, .disable = mlx5e_uplink_rep_disable, .update_rx = mlx5e_update_rep_rx, - .update_stats = mlx5e_uplink_rep_update_hw_counters, + .update_stats = mlx5e_update_ndo_stats, .update_carrier = mlx5e_update_carrier, .rx_handlers.handle_rx_cqe = mlx5e_handle_rx_cqe_rep, .rx_handlers.handle_rx_cqe_mpwqe = mlx5e_handle_rx_cqe_mpwrq, .max_tc = MLX5E_MAX_NUM_TC, .rq_groups = MLX5E_NUM_RQ_GROUPS(REGULAR), + .stats_grps = mlx5e_ul_rep_stats_grps, + .stats_grps_num = mlx5e_ul_rep_stats_grps_num, }; static bool diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h index 31f83c8adcc9..3f756d51435f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h @@ -73,6 +73,7 @@ struct mlx5_rep_uplink_priv { */ struct list_head tc_indr_block_priv_list; struct notifier_block netdevice_nb; + struct netdev_net_notifier netdevice_nn; struct mlx5_tun_entropy tun_entropy; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c index 4291db78efc9..30b216d9284c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c @@ -35,6 +35,58 @@ #include "en_accel/ipsec.h" #include "en_accel/tls.h" +static unsigned int stats_grps_num(struct mlx5e_priv *priv) +{ + return !priv->profile->stats_grps_num ? 0 : + priv->profile->stats_grps_num(priv); +} + +unsigned int mlx5e_stats_total_num(struct mlx5e_priv *priv) +{ + mlx5e_stats_grp_t *stats_grps = priv->profile->stats_grps; + const unsigned int num_stats_grps = stats_grps_num(priv); + unsigned int total = 0; + int i; + + for (i = 0; i < num_stats_grps; i++) + total += stats_grps[i]->get_num_stats(priv); + + return total; +} + +void mlx5e_stats_update(struct mlx5e_priv *priv) +{ + mlx5e_stats_grp_t *stats_grps = priv->profile->stats_grps; + const unsigned int num_stats_grps = stats_grps_num(priv); + int i; + + for (i = num_stats_grps - 1; i >= 0; i--) + if (stats_grps[i]->update_stats) + stats_grps[i]->update_stats(priv); +} + +void mlx5e_stats_fill(struct mlx5e_priv *priv, u64 *data, int idx) +{ + mlx5e_stats_grp_t *stats_grps = priv->profile->stats_grps; + const unsigned int num_stats_grps = stats_grps_num(priv); + int i; + + for (i = 0; i < num_stats_grps; i++) + idx = stats_grps[i]->fill_stats(priv, data, idx); +} + +void mlx5e_stats_fill_strings(struct mlx5e_priv *priv, u8 *data) +{ + mlx5e_stats_grp_t *stats_grps = priv->profile->stats_grps; + const unsigned int num_stats_grps = stats_grps_num(priv); + int i, idx = 0; + + for (i = 0; i < num_stats_grps; i++) + idx = stats_grps[i]->fill_strings(priv, data, idx); +} + +/* Concrete NIC Stats */ + static const struct counter_desc sw_stats_desc[] = { { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_packets) }, { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_bytes) }, @@ -146,12 +198,12 @@ static const struct counter_desc sw_stats_desc[] = { #define NUM_SW_COUNTERS ARRAY_SIZE(sw_stats_desc) -static int mlx5e_grp_sw_get_num_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(sw) { return NUM_SW_COUNTERS; } -static int mlx5e_grp_sw_fill_strings(struct mlx5e_priv *priv, u8 *data, int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(sw) { int i; @@ -160,7 +212,7 @@ static int mlx5e_grp_sw_fill_strings(struct mlx5e_priv *priv, u8 *data, int idx) return idx; } -static int mlx5e_grp_sw_fill_stats(struct mlx5e_priv *priv, u64 *data, int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(sw) { int i; @@ -169,7 +221,7 @@ static int mlx5e_grp_sw_fill_stats(struct mlx5e_priv *priv, u64 *data, int idx) return idx; } -static void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(sw) { struct mlx5e_sw_stats *s = &priv->stats.sw; int i; @@ -315,7 +367,7 @@ static const struct counter_desc drop_rq_stats_desc[] = { #define NUM_Q_COUNTERS ARRAY_SIZE(q_stats_desc) #define NUM_DROP_RQ_COUNTERS ARRAY_SIZE(drop_rq_stats_desc) -static int mlx5e_grp_q_get_num_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(qcnt) { int num_stats = 0; @@ -328,7 +380,7 @@ static int mlx5e_grp_q_get_num_stats(struct mlx5e_priv *priv) return num_stats; } -static int mlx5e_grp_q_fill_strings(struct mlx5e_priv *priv, u8 *data, int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(qcnt) { int i; @@ -343,7 +395,7 @@ static int mlx5e_grp_q_fill_strings(struct mlx5e_priv *priv, u8 *data, int idx) return idx; } -static int mlx5e_grp_q_fill_stats(struct mlx5e_priv *priv, u64 *data, int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(qcnt) { int i; @@ -356,7 +408,7 @@ static int mlx5e_grp_q_fill_stats(struct mlx5e_priv *priv, u64 *data, int idx) return idx; } -static void mlx5e_grp_q_update_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(qcnt) { struct mlx5e_qcounter_stats *qcnt = &priv->stats.qcnt; u32 out[MLX5_ST_SZ_DW(query_q_counter_out)]; @@ -391,14 +443,13 @@ static const struct counter_desc vnic_env_stats_dev_oob_desc[] = { (MLX5_CAP_GEN(dev, vnic_env_int_rq_oob) ? \ ARRAY_SIZE(vnic_env_stats_dev_oob_desc) : 0) -static int mlx5e_grp_vnic_env_get_num_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(vnic_env) { return NUM_VNIC_ENV_STEER_COUNTERS(priv->mdev) + NUM_VNIC_ENV_DEV_OOB_COUNTERS(priv->mdev); } -static int mlx5e_grp_vnic_env_fill_strings(struct mlx5e_priv *priv, u8 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(vnic_env) { int i; @@ -412,8 +463,7 @@ static int mlx5e_grp_vnic_env_fill_strings(struct mlx5e_priv *priv, u8 *data, return idx; } -static int mlx5e_grp_vnic_env_fill_stats(struct mlx5e_priv *priv, u64 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(vnic_env) { int i; @@ -427,7 +477,7 @@ static int mlx5e_grp_vnic_env_fill_stats(struct mlx5e_priv *priv, u64 *data, return idx; } -static void mlx5e_grp_vnic_env_update_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(vnic_env) { u32 *out = (u32 *)priv->stats.vnic.query_vnic_env_out; int outlen = MLX5_ST_SZ_BYTES(query_vnic_env_out); @@ -490,13 +540,12 @@ static const struct counter_desc vport_stats_desc[] = { #define NUM_VPORT_COUNTERS ARRAY_SIZE(vport_stats_desc) -static int mlx5e_grp_vport_get_num_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(vport) { return NUM_VPORT_COUNTERS; } -static int mlx5e_grp_vport_fill_strings(struct mlx5e_priv *priv, u8 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(vport) { int i; @@ -505,8 +554,7 @@ static int mlx5e_grp_vport_fill_strings(struct mlx5e_priv *priv, u8 *data, return idx; } -static int mlx5e_grp_vport_fill_stats(struct mlx5e_priv *priv, u64 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(vport) { int i; @@ -516,7 +564,7 @@ static int mlx5e_grp_vport_fill_stats(struct mlx5e_priv *priv, u64 *data, return idx; } -static void mlx5e_grp_vport_update_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(vport) { int outlen = MLX5_ST_SZ_BYTES(query_vport_counter_out); u32 *out = (u32 *)priv->stats.vport.query_vport_out; @@ -555,13 +603,12 @@ static const struct counter_desc pport_802_3_stats_desc[] = { #define NUM_PPORT_802_3_COUNTERS ARRAY_SIZE(pport_802_3_stats_desc) -static int mlx5e_grp_802_3_get_num_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(802_3) { return NUM_PPORT_802_3_COUNTERS; } -static int mlx5e_grp_802_3_fill_strings(struct mlx5e_priv *priv, u8 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(802_3) { int i; @@ -570,8 +617,7 @@ static int mlx5e_grp_802_3_fill_strings(struct mlx5e_priv *priv, u8 *data, return idx; } -static int mlx5e_grp_802_3_fill_stats(struct mlx5e_priv *priv, u64 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(802_3) { int i; @@ -584,7 +630,7 @@ static int mlx5e_grp_802_3_fill_stats(struct mlx5e_priv *priv, u64 *data, #define MLX5_BASIC_PPCNT_SUPPORTED(mdev) \ (MLX5_CAP_GEN(mdev, pcam_reg) ? MLX5_CAP_PCAM_REG(mdev, ppcnt) : 1) -void mlx5e_grp_802_3_update_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(802_3) { struct mlx5e_pport_stats *pstats = &priv->stats.pport; struct mlx5_core_dev *mdev = priv->mdev; @@ -612,13 +658,12 @@ static const struct counter_desc pport_2863_stats_desc[] = { #define NUM_PPORT_2863_COUNTERS ARRAY_SIZE(pport_2863_stats_desc) -static int mlx5e_grp_2863_get_num_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(2863) { return NUM_PPORT_2863_COUNTERS; } -static int mlx5e_grp_2863_fill_strings(struct mlx5e_priv *priv, u8 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(2863) { int i; @@ -627,8 +672,7 @@ static int mlx5e_grp_2863_fill_strings(struct mlx5e_priv *priv, u8 *data, return idx; } -static int mlx5e_grp_2863_fill_stats(struct mlx5e_priv *priv, u64 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(2863) { int i; @@ -638,7 +682,7 @@ static int mlx5e_grp_2863_fill_stats(struct mlx5e_priv *priv, u64 *data, return idx; } -static void mlx5e_grp_2863_update_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(2863) { struct mlx5e_pport_stats *pstats = &priv->stats.pport; struct mlx5_core_dev *mdev = priv->mdev; @@ -673,13 +717,12 @@ static const struct counter_desc pport_2819_stats_desc[] = { #define NUM_PPORT_2819_COUNTERS ARRAY_SIZE(pport_2819_stats_desc) -static int mlx5e_grp_2819_get_num_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(2819) { return NUM_PPORT_2819_COUNTERS; } -static int mlx5e_grp_2819_fill_strings(struct mlx5e_priv *priv, u8 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(2819) { int i; @@ -688,8 +731,7 @@ static int mlx5e_grp_2819_fill_strings(struct mlx5e_priv *priv, u8 *data, return idx; } -static int mlx5e_grp_2819_fill_stats(struct mlx5e_priv *priv, u64 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(2819) { int i; @@ -699,7 +741,7 @@ static int mlx5e_grp_2819_fill_stats(struct mlx5e_priv *priv, u64 *data, return idx; } -static void mlx5e_grp_2819_update_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(2819) { struct mlx5e_pport_stats *pstats = &priv->stats.pport; struct mlx5_core_dev *mdev = priv->mdev; @@ -737,7 +779,7 @@ pport_phy_statistical_err_lanes_stats_desc[] = { #define NUM_PPORT_PHY_STATISTICAL_PER_LANE_COUNTERS \ ARRAY_SIZE(pport_phy_statistical_err_lanes_stats_desc) -static int mlx5e_grp_phy_get_num_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(phy) { struct mlx5_core_dev *mdev = priv->mdev; int num_stats; @@ -754,8 +796,7 @@ static int mlx5e_grp_phy_get_num_stats(struct mlx5e_priv *priv) return num_stats; } -static int mlx5e_grp_phy_fill_strings(struct mlx5e_priv *priv, u8 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(phy) { struct mlx5_core_dev *mdev = priv->mdev; int i; @@ -777,7 +818,7 @@ static int mlx5e_grp_phy_fill_strings(struct mlx5e_priv *priv, u8 *data, return idx; } -static int mlx5e_grp_phy_fill_stats(struct mlx5e_priv *priv, u64 *data, int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(phy) { struct mlx5_core_dev *mdev = priv->mdev; int i; @@ -803,7 +844,7 @@ static int mlx5e_grp_phy_fill_stats(struct mlx5e_priv *priv, u64 *data, int idx) return idx; } -static void mlx5e_grp_phy_update_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(phy) { struct mlx5e_pport_stats *pstats = &priv->stats.pport; struct mlx5_core_dev *mdev = priv->mdev; @@ -833,7 +874,7 @@ static const struct counter_desc pport_eth_ext_stats_desc[] = { #define NUM_PPORT_ETH_EXT_COUNTERS ARRAY_SIZE(pport_eth_ext_stats_desc) -static int mlx5e_grp_eth_ext_get_num_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(eth_ext) { if (MLX5_CAP_PCAM_FEATURE((priv)->mdev, rx_buffer_fullness_counters)) return NUM_PPORT_ETH_EXT_COUNTERS; @@ -841,8 +882,7 @@ static int mlx5e_grp_eth_ext_get_num_stats(struct mlx5e_priv *priv) return 0; } -static int mlx5e_grp_eth_ext_fill_strings(struct mlx5e_priv *priv, u8 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(eth_ext) { int i; @@ -853,8 +893,7 @@ static int mlx5e_grp_eth_ext_fill_strings(struct mlx5e_priv *priv, u8 *data, return idx; } -static int mlx5e_grp_eth_ext_fill_stats(struct mlx5e_priv *priv, u64 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(eth_ext) { int i; @@ -866,7 +905,7 @@ static int mlx5e_grp_eth_ext_fill_stats(struct mlx5e_priv *priv, u64 *data, return idx; } -static void mlx5e_grp_eth_ext_update_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(eth_ext) { struct mlx5e_pport_stats *pstats = &priv->stats.pport; struct mlx5_core_dev *mdev = priv->mdev; @@ -907,7 +946,7 @@ static const struct counter_desc pcie_perf_stall_stats_desc[] = { #define NUM_PCIE_PERF_COUNTERS64 ARRAY_SIZE(pcie_perf_stats_desc64) #define NUM_PCIE_PERF_STALL_COUNTERS ARRAY_SIZE(pcie_perf_stall_stats_desc) -static int mlx5e_grp_pcie_get_num_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(pcie) { int num_stats = 0; @@ -923,8 +962,7 @@ static int mlx5e_grp_pcie_get_num_stats(struct mlx5e_priv *priv) return num_stats; } -static int mlx5e_grp_pcie_fill_strings(struct mlx5e_priv *priv, u8 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(pcie) { int i; @@ -945,8 +983,7 @@ static int mlx5e_grp_pcie_fill_strings(struct mlx5e_priv *priv, u8 *data, return idx; } -static int mlx5e_grp_pcie_fill_stats(struct mlx5e_priv *priv, u64 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(pcie) { int i; @@ -970,7 +1007,7 @@ static int mlx5e_grp_pcie_fill_stats(struct mlx5e_priv *priv, u64 *data, return idx; } -static void mlx5e_grp_pcie_update_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(pcie) { struct mlx5e_pcie_stats *pcie_stats = &priv->stats.pcie; struct mlx5_core_dev *mdev = priv->mdev; @@ -1018,8 +1055,7 @@ static int mlx5e_grp_per_tc_prio_get_num_stats(struct mlx5e_priv *priv) return NUM_PPORT_PER_TC_PRIO_COUNTERS * NUM_PPORT_PRIO; } -static int mlx5e_grp_per_port_buffer_congest_fill_strings(struct mlx5e_priv *priv, - u8 *data, int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(per_port_buff_congest) { struct mlx5_core_dev *mdev = priv->mdev; int i, prio; @@ -1039,8 +1075,7 @@ static int mlx5e_grp_per_port_buffer_congest_fill_strings(struct mlx5e_priv *pri return idx; } -static int mlx5e_grp_per_port_buffer_congest_fill_stats(struct mlx5e_priv *priv, - u64 *data, int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(per_port_buff_congest) { struct mlx5e_pport_stats *pport = &priv->stats.pport; struct mlx5_core_dev *mdev = priv->mdev; @@ -1115,13 +1150,13 @@ static void mlx5e_grp_per_tc_congest_prio_update_stats(struct mlx5e_priv *priv) } } -static int mlx5e_grp_per_port_buffer_congest_get_num_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(per_port_buff_congest) { return mlx5e_grp_per_tc_prio_get_num_stats(priv) + mlx5e_grp_per_tc_congest_prio_get_num_stats(priv); } -static void mlx5e_grp_per_port_buffer_congest_update_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(per_port_buff_congest) { mlx5e_grp_per_tc_prio_update_stats(priv); mlx5e_grp_per_tc_congest_prio_update_stats(priv); @@ -1296,29 +1331,27 @@ static int mlx5e_grp_per_prio_pfc_fill_stats(struct mlx5e_priv *priv, return idx; } -static int mlx5e_grp_per_prio_get_num_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(per_prio) { return mlx5e_grp_per_prio_traffic_get_num_stats() + mlx5e_grp_per_prio_pfc_get_num_stats(priv); } -static int mlx5e_grp_per_prio_fill_strings(struct mlx5e_priv *priv, u8 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(per_prio) { idx = mlx5e_grp_per_prio_traffic_fill_strings(priv, data, idx); idx = mlx5e_grp_per_prio_pfc_fill_strings(priv, data, idx); return idx; } -static int mlx5e_grp_per_prio_fill_stats(struct mlx5e_priv *priv, u64 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(per_prio) { idx = mlx5e_grp_per_prio_traffic_fill_stats(priv, data, idx); idx = mlx5e_grp_per_prio_pfc_fill_stats(priv, data, idx); return idx; } -static void mlx5e_grp_per_prio_update_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(per_prio) { struct mlx5e_pport_stats *pstats = &priv->stats.pport; struct mlx5_core_dev *mdev = priv->mdev; @@ -1353,13 +1386,12 @@ static const struct counter_desc mlx5e_pme_error_desc[] = { #define NUM_PME_STATUS_STATS ARRAY_SIZE(mlx5e_pme_status_desc) #define NUM_PME_ERR_STATS ARRAY_SIZE(mlx5e_pme_error_desc) -static int mlx5e_grp_pme_get_num_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(pme) { return NUM_PME_STATUS_STATS + NUM_PME_ERR_STATS; } -static int mlx5e_grp_pme_fill_strings(struct mlx5e_priv *priv, u8 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(pme) { int i; @@ -1372,8 +1404,7 @@ static int mlx5e_grp_pme_fill_strings(struct mlx5e_priv *priv, u8 *data, return idx; } -static int mlx5e_grp_pme_fill_stats(struct mlx5e_priv *priv, u64 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(pme) { struct mlx5_pme_stats pme_stats; int i; @@ -1391,45 +1422,46 @@ static int mlx5e_grp_pme_fill_stats(struct mlx5e_priv *priv, u64 *data, return idx; } -static int mlx5e_grp_ipsec_get_num_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(pme) { return; } + +static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(ipsec) { return mlx5e_ipsec_get_count(priv); } -static int mlx5e_grp_ipsec_fill_strings(struct mlx5e_priv *priv, u8 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(ipsec) { return idx + mlx5e_ipsec_get_strings(priv, data + idx * ETH_GSTRING_LEN); } -static int mlx5e_grp_ipsec_fill_stats(struct mlx5e_priv *priv, u64 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(ipsec) { return idx + mlx5e_ipsec_get_stats(priv, data + idx); } -static void mlx5e_grp_ipsec_update_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(ipsec) { mlx5e_ipsec_update_stats(priv); } -static int mlx5e_grp_tls_get_num_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(tls) { return mlx5e_tls_get_count(priv); } -static int mlx5e_grp_tls_fill_strings(struct mlx5e_priv *priv, u8 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(tls) { return idx + mlx5e_tls_get_strings(priv, data + idx * ETH_GSTRING_LEN); } -static int mlx5e_grp_tls_fill_stats(struct mlx5e_priv *priv, u64 *data, int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(tls) { return idx + mlx5e_tls_get_stats(priv, data + idx); } +static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(tls) { return; } + static const struct counter_desc rq_stats_desc[] = { { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, packets) }, { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, bytes) }, @@ -1563,7 +1595,7 @@ static const struct counter_desc ch_stats_desc[] = { #define NUM_XSKSQ_STATS ARRAY_SIZE(xsksq_stats_desc) #define NUM_CH_STATS ARRAY_SIZE(ch_stats_desc) -static int mlx5e_grp_channels_get_num_stats(struct mlx5e_priv *priv) +static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(channels) { int max_nch = priv->max_nch; @@ -1576,8 +1608,7 @@ static int mlx5e_grp_channels_get_num_stats(struct mlx5e_priv *priv) (NUM_XSKSQ_STATS * max_nch * priv->xsk.ever_used); } -static int mlx5e_grp_channels_fill_strings(struct mlx5e_priv *priv, u8 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(channels) { bool is_xsk = priv->xsk.ever_used; int max_nch = priv->max_nch; @@ -1619,8 +1650,7 @@ static int mlx5e_grp_channels_fill_strings(struct mlx5e_priv *priv, u8 *data, return idx; } -static int mlx5e_grp_channels_fill_stats(struct mlx5e_priv *priv, u64 *data, - int idx) +static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(channels) { bool is_xsk = priv->xsk.ever_used; int max_nch = priv->max_nch; @@ -1668,104 +1698,46 @@ static int mlx5e_grp_channels_fill_stats(struct mlx5e_priv *priv, u64 *data, return idx; } +static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(channels) { return; } + +MLX5E_DEFINE_STATS_GRP(sw, 0); +MLX5E_DEFINE_STATS_GRP(qcnt, MLX5E_NDO_UPDATE_STATS); +MLX5E_DEFINE_STATS_GRP(vnic_env, 0); +MLX5E_DEFINE_STATS_GRP(vport, MLX5E_NDO_UPDATE_STATS); +MLX5E_DEFINE_STATS_GRP(802_3, MLX5E_NDO_UPDATE_STATS); +MLX5E_DEFINE_STATS_GRP(2863, 0); +MLX5E_DEFINE_STATS_GRP(2819, 0); +MLX5E_DEFINE_STATS_GRP(phy, 0); +MLX5E_DEFINE_STATS_GRP(pcie, 0); +MLX5E_DEFINE_STATS_GRP(per_prio, 0); +MLX5E_DEFINE_STATS_GRP(pme, 0); +MLX5E_DEFINE_STATS_GRP(channels, 0); +MLX5E_DEFINE_STATS_GRP(per_port_buff_congest, 0); +MLX5E_DEFINE_STATS_GRP(eth_ext, 0); +static MLX5E_DEFINE_STATS_GRP(ipsec, 0); +static MLX5E_DEFINE_STATS_GRP(tls, 0); + /* The stats groups order is opposite to the update_stats() order calls */ -const struct mlx5e_stats_grp mlx5e_stats_grps[] = { - { - .get_num_stats = mlx5e_grp_sw_get_num_stats, - .fill_strings = mlx5e_grp_sw_fill_strings, - .fill_stats = mlx5e_grp_sw_fill_stats, - .update_stats = mlx5e_grp_sw_update_stats, - }, - { - .get_num_stats = mlx5e_grp_q_get_num_stats, - .fill_strings = mlx5e_grp_q_fill_strings, - .fill_stats = mlx5e_grp_q_fill_stats, - .update_stats_mask = MLX5E_NDO_UPDATE_STATS, - .update_stats = mlx5e_grp_q_update_stats, - }, - { - .get_num_stats = mlx5e_grp_vnic_env_get_num_stats, - .fill_strings = mlx5e_grp_vnic_env_fill_strings, - .fill_stats = mlx5e_grp_vnic_env_fill_stats, - .update_stats = mlx5e_grp_vnic_env_update_stats, - }, - { - .get_num_stats = mlx5e_grp_vport_get_num_stats, - .fill_strings = mlx5e_grp_vport_fill_strings, - .fill_stats = mlx5e_grp_vport_fill_stats, - .update_stats_mask = MLX5E_NDO_UPDATE_STATS, - .update_stats = mlx5e_grp_vport_update_stats, - }, - { - .get_num_stats = mlx5e_grp_802_3_get_num_stats, - .fill_strings = mlx5e_grp_802_3_fill_strings, - .fill_stats = mlx5e_grp_802_3_fill_stats, - .update_stats_mask = MLX5E_NDO_UPDATE_STATS, - .update_stats = mlx5e_grp_802_3_update_stats, - }, - { - .get_num_stats = mlx5e_grp_2863_get_num_stats, - .fill_strings = mlx5e_grp_2863_fill_strings, - .fill_stats = mlx5e_grp_2863_fill_stats, - .update_stats = mlx5e_grp_2863_update_stats, - }, - { - .get_num_stats = mlx5e_grp_2819_get_num_stats, - .fill_strings = mlx5e_grp_2819_fill_strings, - .fill_stats = mlx5e_grp_2819_fill_stats, - .update_stats = mlx5e_grp_2819_update_stats, - }, - { - .get_num_stats = mlx5e_grp_phy_get_num_stats, - .fill_strings = mlx5e_grp_phy_fill_strings, - .fill_stats = mlx5e_grp_phy_fill_stats, - .update_stats = mlx5e_grp_phy_update_stats, - }, - { - .get_num_stats = mlx5e_grp_eth_ext_get_num_stats, - .fill_strings = mlx5e_grp_eth_ext_fill_strings, - .fill_stats = mlx5e_grp_eth_ext_fill_stats, - .update_stats = mlx5e_grp_eth_ext_update_stats, - }, - { - .get_num_stats = mlx5e_grp_pcie_get_num_stats, - .fill_strings = mlx5e_grp_pcie_fill_strings, - .fill_stats = mlx5e_grp_pcie_fill_stats, - .update_stats = mlx5e_grp_pcie_update_stats, - }, - { - .get_num_stats = mlx5e_grp_per_prio_get_num_stats, - .fill_strings = mlx5e_grp_per_prio_fill_strings, - .fill_stats = mlx5e_grp_per_prio_fill_stats, - .update_stats = mlx5e_grp_per_prio_update_stats, - }, - { - .get_num_stats = mlx5e_grp_pme_get_num_stats, - .fill_strings = mlx5e_grp_pme_fill_strings, - .fill_stats = mlx5e_grp_pme_fill_stats, - }, - { - .get_num_stats = mlx5e_grp_ipsec_get_num_stats, - .fill_strings = mlx5e_grp_ipsec_fill_strings, - .fill_stats = mlx5e_grp_ipsec_fill_stats, - .update_stats = mlx5e_grp_ipsec_update_stats, - }, - { - .get_num_stats = mlx5e_grp_tls_get_num_stats, - .fill_strings = mlx5e_grp_tls_fill_strings, - .fill_stats = mlx5e_grp_tls_fill_stats, - }, - { - .get_num_stats = mlx5e_grp_channels_get_num_stats, - .fill_strings = mlx5e_grp_channels_fill_strings, - .fill_stats = mlx5e_grp_channels_fill_stats, - }, - { - .get_num_stats = mlx5e_grp_per_port_buffer_congest_get_num_stats, - .fill_strings = mlx5e_grp_per_port_buffer_congest_fill_strings, - .fill_stats = mlx5e_grp_per_port_buffer_congest_fill_stats, - .update_stats = mlx5e_grp_per_port_buffer_congest_update_stats, - }, +mlx5e_stats_grp_t mlx5e_nic_stats_grps[] = { + &MLX5E_STATS_GRP(sw), + &MLX5E_STATS_GRP(qcnt), + &MLX5E_STATS_GRP(vnic_env), + &MLX5E_STATS_GRP(vport), + &MLX5E_STATS_GRP(802_3), + &MLX5E_STATS_GRP(2863), + &MLX5E_STATS_GRP(2819), + &MLX5E_STATS_GRP(phy), + &MLX5E_STATS_GRP(eth_ext), + &MLX5E_STATS_GRP(pcie), + &MLX5E_STATS_GRP(per_prio), + &MLX5E_STATS_GRP(pme), + &MLX5E_STATS_GRP(ipsec), + &MLX5E_STATS_GRP(tls), + &MLX5E_STATS_GRP(channels), + &MLX5E_STATS_GRP(per_port_buff_congest), }; -const int mlx5e_num_stats_grps = ARRAY_SIZE(mlx5e_stats_grps); +unsigned int mlx5e_nic_stats_grps_num(struct mlx5e_priv *priv) +{ + return ARRAY_SIZE(mlx5e_nic_stats_grps); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h index 869f3502f631..092b39ffa32a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h @@ -29,6 +29,7 @@ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ + #ifndef __MLX5_EN_STATS_H__ #define __MLX5_EN_STATS_H__ @@ -55,6 +56,56 @@ struct counter_desc { size_t offset; /* Byte offset */ }; +enum { + MLX5E_NDO_UPDATE_STATS = BIT(0x1), +}; + +struct mlx5e_priv; +struct mlx5e_stats_grp { + u16 update_stats_mask; + int (*get_num_stats)(struct mlx5e_priv *priv); + int (*fill_strings)(struct mlx5e_priv *priv, u8 *data, int idx); + int (*fill_stats)(struct mlx5e_priv *priv, u64 *data, int idx); + void (*update_stats)(struct mlx5e_priv *priv); +}; + +typedef const struct mlx5e_stats_grp *const mlx5e_stats_grp_t; + +#define MLX5E_STATS_GRP_OP(grp, name) mlx5e_stats_grp_ ## grp ## _ ## name + +#define MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(grp) \ + int MLX5E_STATS_GRP_OP(grp, num_stats)(struct mlx5e_priv *priv) + +#define MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(grp) \ + void MLX5E_STATS_GRP_OP(grp, update_stats)(struct mlx5e_priv *priv) + +#define MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(grp) \ + int MLX5E_STATS_GRP_OP(grp, fill_strings)(struct mlx5e_priv *priv, u8 *data, int idx) + +#define MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(grp) \ + int MLX5E_STATS_GRP_OP(grp, fill_stats)(struct mlx5e_priv *priv, u64 *data, int idx) + +#define MLX5E_STATS_GRP(grp) mlx5e_stats_grp_ ## grp + +#define MLX5E_DECLARE_STATS_GRP(grp) \ + const struct mlx5e_stats_grp MLX5E_STATS_GRP(grp) + +#define MLX5E_DEFINE_STATS_GRP(grp, mask) \ +MLX5E_DECLARE_STATS_GRP(grp) = { \ + .get_num_stats = MLX5E_STATS_GRP_OP(grp, num_stats), \ + .fill_stats = MLX5E_STATS_GRP_OP(grp, fill_stats), \ + .fill_strings = MLX5E_STATS_GRP_OP(grp, fill_strings), \ + .update_stats = MLX5E_STATS_GRP_OP(grp, update_stats), \ + .update_stats_mask = mask, \ +} + +unsigned int mlx5e_stats_total_num(struct mlx5e_priv *priv); +void mlx5e_stats_update(struct mlx5e_priv *priv); +void mlx5e_stats_fill(struct mlx5e_priv *priv, u64 *data, int idx); +void mlx5e_stats_fill_strings(struct mlx5e_priv *priv, u8 *data); + +/* Concrete NIC Stats */ + struct mlx5e_sw_stats { u64 rx_packets; u64 rx_bytes; @@ -322,22 +373,22 @@ struct mlx5e_stats { struct mlx5e_pcie_stats pcie; }; -enum { - MLX5E_NDO_UPDATE_STATS = BIT(0x1), -}; - -struct mlx5e_priv; -struct mlx5e_stats_grp { - u16 update_stats_mask; - int (*get_num_stats)(struct mlx5e_priv *priv); - int (*fill_strings)(struct mlx5e_priv *priv, u8 *data, int idx); - int (*fill_stats)(struct mlx5e_priv *priv, u64 *data, int idx); - void (*update_stats)(struct mlx5e_priv *priv); -}; - -extern const struct mlx5e_stats_grp mlx5e_stats_grps[]; -extern const int mlx5e_num_stats_grps; +extern mlx5e_stats_grp_t mlx5e_nic_stats_grps[]; +unsigned int mlx5e_nic_stats_grps_num(struct mlx5e_priv *priv); -void mlx5e_grp_802_3_update_stats(struct mlx5e_priv *priv); +extern MLX5E_DECLARE_STATS_GRP(sw); +extern MLX5E_DECLARE_STATS_GRP(qcnt); +extern MLX5E_DECLARE_STATS_GRP(vnic_env); +extern MLX5E_DECLARE_STATS_GRP(vport); +extern MLX5E_DECLARE_STATS_GRP(802_3); +extern MLX5E_DECLARE_STATS_GRP(2863); +extern MLX5E_DECLARE_STATS_GRP(2819); +extern MLX5E_DECLARE_STATS_GRP(phy); +extern MLX5E_DECLARE_STATS_GRP(eth_ext); +extern MLX5E_DECLARE_STATS_GRP(pcie); +extern MLX5E_DECLARE_STATS_GRP(per_prio); +extern MLX5E_DECLARE_STATS_GRP(pme); +extern MLX5E_DECLARE_STATS_GRP(channels); +extern MLX5E_DECLARE_STATS_GRP(per_port_buff_congest); #endif /* __MLX5_EN_STATS_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 26f559b453dc..74091f72c9a8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -1810,6 +1810,40 @@ static void *get_match_headers_value(u32 flags, outer_headers); } +static int mlx5e_flower_parse_meta(struct net_device *filter_dev, + struct flow_cls_offload *f) +{ + struct flow_rule *rule = flow_cls_offload_flow_rule(f); + struct netlink_ext_ack *extack = f->common.extack; + struct net_device *ingress_dev; + struct flow_match_meta match; + + if (!flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_META)) + return 0; + + flow_rule_match_meta(rule, &match); + if (match.mask->ingress_ifindex != 0xFFFFFFFF) { + NL_SET_ERR_MSG_MOD(extack, "Unsupported ingress ifindex mask"); + return -EINVAL; + } + + ingress_dev = __dev_get_by_index(dev_net(filter_dev), + match.key->ingress_ifindex); + if (!ingress_dev) { + NL_SET_ERR_MSG_MOD(extack, + "Can't find the ingress port to match on"); + return -EINVAL; + } + + if (ingress_dev != filter_dev) { + NL_SET_ERR_MSG_MOD(extack, + "Can't match on the ingress filter port"); + return -EINVAL; + } + + return 0; +} + static int __parse_cls_flower(struct mlx5e_priv *priv, struct mlx5_flow_spec *spec, struct flow_cls_offload *f, @@ -1830,6 +1864,7 @@ static int __parse_cls_flower(struct mlx5e_priv *priv, u16 addr_type = 0; u8 ip_proto = 0; u8 *match_level; + int err; match_level = outer_match_level; @@ -1873,6 +1908,10 @@ static int __parse_cls_flower(struct mlx5e_priv *priv, spec); } + err = mlx5e_flower_parse_meta(filter_dev, f); + if (err) + return err; + if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_BASIC)) { struct flow_match_basic match; @@ -4045,6 +4084,13 @@ static int apply_police_params(struct mlx5e_priv *priv, u32 rate, u32 rate_mbps; int err; + vport_num = rpriv->rep->vport; + if (vport_num >= MLX5_VPORT_ECPF) { + NL_SET_ERR_MSG_MOD(extack, + "Ingress rate limit is supported only for Eswitch ports connected to VFs"); + return -EOPNOTSUPP; + } + esw = priv->mdev->priv.eswitch; /* rate is given in bytes/sec. * First convert to bits/sec and then round to the nearest mbit/secs. @@ -4053,8 +4099,6 @@ static int apply_police_params(struct mlx5e_priv *priv, u32 rate, * 1 mbit/sec. */ rate_mbps = rate ? max_t(u32, (rate * 8 + 500000) / 1000000, 1) : 0; - vport_num = rpriv->rep->vport; - err = mlx5_esw_modify_vport_rate(esw, vport_num, rate_mbps); if (err) NL_SET_ERR_MSG_MOD(extack, "failed applying action to hardware"); @@ -4207,7 +4251,10 @@ int mlx5e_tc_nic_init(struct mlx5e_priv *priv) return err; tc->netdevice_nb.notifier_call = mlx5e_tc_netdev_event; - if (register_netdevice_notifier(&tc->netdevice_nb)) { + err = register_netdevice_notifier_dev_net(priv->netdev, + &tc->netdevice_nb, + &tc->netdevice_nn); + if (err) { tc->netdevice_nb.notifier_call = NULL; mlx5_core_warn(priv->mdev, "Failed to register netdev notifier\n"); } @@ -4229,7 +4276,9 @@ void mlx5e_tc_nic_cleanup(struct mlx5e_priv *priv) struct mlx5e_tc_table *tc = &priv->fs.tc; if (tc->netdevice_nb.notifier_call) - unregister_netdevice_notifier(&tc->netdevice_nb); + unregister_netdevice_notifier_dev_net(priv->netdev, + &tc->netdevice_nb, + &tc->netdevice_nn); mutex_destroy(&tc->mod_hdr.lock); mutex_destroy(&tc->hairpin_tbl_lock); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c index 05b13a1e829c..5acf60b1bbfe 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c @@ -1931,8 +1931,10 @@ static void mlx5_eswitch_clear_vf_vports_info(struct mlx5_eswitch *esw) struct mlx5_vport *vport; int i; - mlx5_esw_for_each_vf_vport(esw, i, vport, esw->esw_funcs.num_vfs) + mlx5_esw_for_each_vf_vport(esw, i, vport, esw->esw_funcs.num_vfs) { memset(&vport->info, 0, sizeof(vport->info)); + vport->info.link_state = MLX5_VPORT_ADMIN_STATE_AUTO; + } } /* Public E-Switch API */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index a6d0b62ef234..979f13bdc203 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -1172,7 +1172,7 @@ static int esw_offloads_start(struct mlx5_eswitch *esw, return -EINVAL; } - mlx5_eswitch_disable(esw, false); + mlx5_eswitch_disable(esw, true); mlx5_eswitch_update_num_of_vfs(esw, esw->dev->priv.sriov.num_vfs); err = mlx5_eswitch_enable(esw, MLX5_ESWITCH_OFFLOADS); if (err) { @@ -2014,7 +2014,8 @@ int mlx5_esw_funcs_changed_handler(struct notifier_block *nb, unsigned long type int esw_offloads_enable(struct mlx5_eswitch *esw) { - int err; + struct mlx5_vport *vport; + int err, i; if (MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, reformat) && MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, decap)) @@ -2031,6 +2032,10 @@ int esw_offloads_enable(struct mlx5_eswitch *esw) if (err) goto err_vport_metadata; + /* Representor will control the vport link state */ + mlx5_esw_for_each_vf_vport(esw, i, vport, esw->esw_funcs.num_vfs) + vport->info.link_state = MLX5_VPORT_ADMIN_STATE_DOWN; + err = mlx5_eswitch_enable_pf_vf_vports(esw, MLX5_VPORT_UC_ADDR_CHANGE); if (err) goto err_vports; @@ -2060,7 +2065,7 @@ static int esw_offloads_stop(struct mlx5_eswitch *esw, { int err, err1; - mlx5_eswitch_disable(esw, false); + mlx5_eswitch_disable(esw, true); err = mlx5_eswitch_enable(esw, MLX5_ESWITCH_LEGACY); if (err) { NL_SET_ERR_MSG_MOD(extack, "Failed setting eswitch to legacy"); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c index 3a60eb5360bd..c5a446e295aa 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c @@ -32,10 +32,10 @@ * pools. */ #define ESW_SIZE (16 * 1024 * 1024) -const unsigned int ESW_POOLS[] = { 4 * 1024 * 1024, - 1 * 1024 * 1024, - 64 * 1024, - 4 * 1024, }; +static const unsigned int ESW_POOLS[] = { 4 * 1024 * 1024, + 1 * 1024 * 1024, + 64 * 1024, + 4 * 1024, }; struct mlx5_esw_chains_priv { struct rhashtable chains_ht; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c index 7c87f523e370..56078b23f1a0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c @@ -419,6 +419,28 @@ static void mlx5i_cleanup_rx(struct mlx5e_priv *priv) mlx5e_destroy_q_counters(priv); } +/* The stats groups order is opposite to the update_stats() order calls */ +static mlx5e_stats_grp_t mlx5i_stats_grps[] = { + &MLX5E_STATS_GRP(sw), + &MLX5E_STATS_GRP(qcnt), + &MLX5E_STATS_GRP(vnic_env), + &MLX5E_STATS_GRP(vport), + &MLX5E_STATS_GRP(802_3), + &MLX5E_STATS_GRP(2863), + &MLX5E_STATS_GRP(2819), + &MLX5E_STATS_GRP(phy), + &MLX5E_STATS_GRP(pcie), + &MLX5E_STATS_GRP(per_prio), + &MLX5E_STATS_GRP(pme), + &MLX5E_STATS_GRP(channels), + &MLX5E_STATS_GRP(per_port_buff_congest), +}; + +static unsigned int mlx5i_stats_grps_num(struct mlx5e_priv *priv) +{ + return ARRAY_SIZE(mlx5i_stats_grps); +} + static const struct mlx5e_profile mlx5i_nic_profile = { .init = mlx5i_init, .cleanup = mlx5i_cleanup, @@ -435,6 +457,8 @@ static const struct mlx5e_profile mlx5i_nic_profile = { .rx_handlers.handle_rx_cqe_mpwqe = NULL, /* Not supported */ .max_tc = MLX5I_MAX_NUM_TC, .rq_groups = MLX5E_NUM_RQ_GROUPS(REGULAR), + .stats_grps = mlx5i_stats_grps, + .stats_grps_num = mlx5i_stats_grps_num, }; /* mlx5i netdev NDos */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag.c index fc0d9583475d..b91eabc09fbc 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.c @@ -586,7 +586,8 @@ void mlx5_lag_add(struct mlx5_core_dev *dev, struct net_device *netdev) if (!ldev->nb.notifier_call) { ldev->nb.notifier_call = mlx5_lag_netdev_event; - if (register_netdevice_notifier(&ldev->nb)) { + if (register_netdevice_notifier_dev_net(netdev, &ldev->nb, + &ldev->nn)) { ldev->nb.notifier_call = NULL; mlx5_core_err(dev, "Failed to register LAG netdev notifier\n"); } @@ -599,7 +600,7 @@ void mlx5_lag_add(struct mlx5_core_dev *dev, struct net_device *netdev) } /* Must be called with intf_mutex held */ -void mlx5_lag_remove(struct mlx5_core_dev *dev) +void mlx5_lag_remove(struct mlx5_core_dev *dev, struct net_device *netdev) { struct mlx5_lag *ldev; int i; @@ -619,7 +620,8 @@ void mlx5_lag_remove(struct mlx5_core_dev *dev) if (i == MLX5_MAX_PORTS) { if (ldev->nb.notifier_call) - unregister_netdevice_notifier(&ldev->nb); + unregister_netdevice_notifier_dev_net(netdev, &ldev->nb, + &ldev->nn); mlx5_lag_mp_cleanup(ldev); cancel_delayed_work_sync(&ldev->bond_work); mlx5_lag_dev_free(ldev); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag.h index f1068aac6406..316ab09e2664 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.h @@ -44,6 +44,7 @@ struct mlx5_lag { struct workqueue_struct *wq; struct delayed_work bond_work; struct notifier_block nb; + struct netdev_net_notifier nn; struct lag_mp lag_mp; }; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index cf7b8da0f010..f554cfddcf4e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -1563,6 +1563,7 @@ static const struct pci_device_id mlx5_core_pci_table[] = { { PCI_VDEVICE(MELLANOX, 0x101d) }, /* ConnectX-6 Dx */ { PCI_VDEVICE(MELLANOX, 0x101e), MLX5_PCI_DEV_IS_VF}, /* ConnectX Family mlx5Gen Virtual Function */ { PCI_VDEVICE(MELLANOX, 0x101f) }, /* ConnectX-6 LX */ + { PCI_VDEVICE(MELLANOX, 0x1021) }, /* ConnectX-7 */ { PCI_VDEVICE(MELLANOX, 0xa2d2) }, /* BlueField integrated ConnectX-5 network controller */ { PCI_VDEVICE(MELLANOX, 0xa2d3), MLX5_PCI_DEV_IS_VF}, /* BlueField integrated ConnectX-5 network controller VF */ { PCI_VDEVICE(MELLANOX, 0xa2d6) }, /* BlueField-2 integrated ConnectX-6 Dx network controller */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h index da67b28d6e23..fcce9e0fc82c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h @@ -157,7 +157,7 @@ int mlx5_query_qcam_reg(struct mlx5_core_dev *mdev, u32 *qcam, u8 feature_group, u8 access_reg_group); void mlx5_lag_add(struct mlx5_core_dev *dev, struct net_device *netdev); -void mlx5_lag_remove(struct mlx5_core_dev *dev); +void mlx5_lag_remove(struct mlx5_core_dev *dev, struct net_device *netdev); int mlx5_irq_table_init(struct mlx5_core_dev *dev); void mlx5_irq_table_cleanup(struct mlx5_core_dev *dev); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c index 9359eed10889..6dec2a550a10 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c @@ -677,9 +677,12 @@ int mlx5dr_actions_build_ste_arr(struct mlx5dr_matcher *matcher, goto out_invalid_arg; } if (action->dest_tbl.tbl->level <= matcher->tbl->level) { + mlx5_core_warn_once(dmn->mdev, + "Connecting table to a lower/same level destination table\n"); mlx5dr_dbg(dmn, - "Destination table level should be higher than source table\n"); - goto out_invalid_arg; + "Connecting table at level %d to a destination table at level %d\n", + matcher->tbl->level, + action->dest_tbl.tbl->level); } attr.final_icm_addr = rx_rule ? action->dest_tbl.tbl->rx.s_anchor->chunk->icm_addr : @@ -1314,58 +1317,85 @@ not_found: } static int -dr_action_modify_sw_to_hw(struct mlx5dr_domain *dmn, - __be64 *sw_action, - __be64 *hw_action, - const struct dr_action_modify_field_conv **ret_hw_info) +dr_action_modify_sw_to_hw_add(struct mlx5dr_domain *dmn, + __be64 *sw_action, + __be64 *hw_action, + const struct dr_action_modify_field_conv **ret_hw_info) { const struct dr_action_modify_field_conv *hw_action_info; - u8 offset, length, max_length, action; + u8 max_length; u16 sw_field; - u8 hw_opcode; u32 data; /* Get SW modify action data */ - action = MLX5_GET(set_action_in, sw_action, action_type); - length = MLX5_GET(set_action_in, sw_action, length); - offset = MLX5_GET(set_action_in, sw_action, offset); sw_field = MLX5_GET(set_action_in, sw_action, field); data = MLX5_GET(set_action_in, sw_action, data); /* Convert SW data to HW modify action format */ hw_action_info = dr_action_modify_get_hw_info(sw_field); if (!hw_action_info) { - mlx5dr_dbg(dmn, "Modify action invalid field given\n"); + mlx5dr_dbg(dmn, "Modify add action invalid field given\n"); return -EINVAL; } max_length = hw_action_info->end - hw_action_info->start + 1; - switch (action) { - case MLX5_ACTION_TYPE_SET: - hw_opcode = MLX5DR_ACTION_MDFY_HW_OP_SET; - /* PRM defines that length zero specific length of 32bits */ - if (!length) - length = 32; + MLX5_SET(dr_action_hw_set, hw_action, + opcode, MLX5DR_ACTION_MDFY_HW_OP_ADD); - if (length + offset > max_length) { - mlx5dr_dbg(dmn, "Modify action length + offset exceeds limit\n"); - return -EINVAL; - } - break; + MLX5_SET(dr_action_hw_set, hw_action, destination_field_code, + hw_action_info->hw_field); - case MLX5_ACTION_TYPE_ADD: - hw_opcode = MLX5DR_ACTION_MDFY_HW_OP_ADD; - offset = 0; - length = max_length; - break; + MLX5_SET(dr_action_hw_set, hw_action, destination_left_shifter, + hw_action_info->start); - default: - mlx5dr_info(dmn, "Unsupported action_type for modify action\n"); - return -EOPNOTSUPP; + /* PRM defines that length zero specific length of 32bits */ + MLX5_SET(dr_action_hw_set, hw_action, destination_length, + max_length == 32 ? 0 : max_length); + + MLX5_SET(dr_action_hw_set, hw_action, inline_data, data); + + *ret_hw_info = hw_action_info; + + return 0; +} + +static int +dr_action_modify_sw_to_hw_set(struct mlx5dr_domain *dmn, + __be64 *sw_action, + __be64 *hw_action, + const struct dr_action_modify_field_conv **ret_hw_info) +{ + const struct dr_action_modify_field_conv *hw_action_info; + u8 offset, length, max_length; + u16 sw_field; + u32 data; + + /* Get SW modify action data */ + length = MLX5_GET(set_action_in, sw_action, length); + offset = MLX5_GET(set_action_in, sw_action, offset); + sw_field = MLX5_GET(set_action_in, sw_action, field); + data = MLX5_GET(set_action_in, sw_action, data); + + /* Convert SW data to HW modify action format */ + hw_action_info = dr_action_modify_get_hw_info(sw_field); + if (!hw_action_info) { + mlx5dr_dbg(dmn, "Modify set action invalid field given\n"); + return -EINVAL; + } + + /* PRM defines that length zero specific length of 32bits */ + length = length ? length : 32; + + max_length = hw_action_info->end - hw_action_info->start + 1; + + if (length + offset > max_length) { + mlx5dr_dbg(dmn, "Modify action length + offset exceeds limit\n"); + return -EINVAL; } - MLX5_SET(dr_action_hw_set, hw_action, opcode, hw_opcode); + MLX5_SET(dr_action_hw_set, hw_action, + opcode, MLX5DR_ACTION_MDFY_HW_OP_SET); MLX5_SET(dr_action_hw_set, hw_action, destination_field_code, hw_action_info->hw_field); @@ -1384,48 +1414,236 @@ dr_action_modify_sw_to_hw(struct mlx5dr_domain *dmn, } static int -dr_action_modify_check_field_limitation(struct mlx5dr_domain *dmn, - const __be64 *sw_action) +dr_action_modify_sw_to_hw_copy(struct mlx5dr_domain *dmn, + __be64 *sw_action, + __be64 *hw_action, + const struct dr_action_modify_field_conv **ret_dst_hw_info, + const struct dr_action_modify_field_conv **ret_src_hw_info) +{ + u8 src_offset, dst_offset, src_max_length, dst_max_length, length; + const struct dr_action_modify_field_conv *hw_dst_action_info; + const struct dr_action_modify_field_conv *hw_src_action_info; + u16 src_field, dst_field; + + /* Get SW modify action data */ + src_field = MLX5_GET(copy_action_in, sw_action, src_field); + dst_field = MLX5_GET(copy_action_in, sw_action, dst_field); + src_offset = MLX5_GET(copy_action_in, sw_action, src_offset); + dst_offset = MLX5_GET(copy_action_in, sw_action, dst_offset); + length = MLX5_GET(copy_action_in, sw_action, length); + + /* Convert SW data to HW modify action format */ + hw_src_action_info = dr_action_modify_get_hw_info(src_field); + hw_dst_action_info = dr_action_modify_get_hw_info(dst_field); + if (!hw_src_action_info || !hw_dst_action_info) { + mlx5dr_dbg(dmn, "Modify copy action invalid field given\n"); + return -EINVAL; + } + + /* PRM defines that length zero specific length of 32bits */ + length = length ? length : 32; + + src_max_length = hw_src_action_info->end - + hw_src_action_info->start + 1; + dst_max_length = hw_dst_action_info->end - + hw_dst_action_info->start + 1; + + if (length + src_offset > src_max_length || + length + dst_offset > dst_max_length) { + mlx5dr_dbg(dmn, "Modify action length + offset exceeds limit\n"); + return -EINVAL; + } + + MLX5_SET(dr_action_hw_copy, hw_action, + opcode, MLX5DR_ACTION_MDFY_HW_OP_COPY); + + MLX5_SET(dr_action_hw_copy, hw_action, destination_field_code, + hw_dst_action_info->hw_field); + + MLX5_SET(dr_action_hw_copy, hw_action, destination_left_shifter, + hw_dst_action_info->start + dst_offset); + + MLX5_SET(dr_action_hw_copy, hw_action, destination_length, + length == 32 ? 0 : length); + + MLX5_SET(dr_action_hw_copy, hw_action, source_field_code, + hw_src_action_info->hw_field); + + MLX5_SET(dr_action_hw_copy, hw_action, source_left_shifter, + hw_src_action_info->start + dst_offset); + + *ret_dst_hw_info = hw_dst_action_info; + *ret_src_hw_info = hw_src_action_info; + + return 0; +} + +static int +dr_action_modify_sw_to_hw(struct mlx5dr_domain *dmn, + __be64 *sw_action, + __be64 *hw_action, + const struct dr_action_modify_field_conv **ret_dst_hw_info, + const struct dr_action_modify_field_conv **ret_src_hw_info) { - u16 sw_field; u8 action; + int ret; - sw_field = MLX5_GET(set_action_in, sw_action, field); + *hw_action = 0; + *ret_src_hw_info = NULL; + + /* Get SW modify action type */ action = MLX5_GET(set_action_in, sw_action, action_type); - /* Check if SW field is supported in current domain (RX/TX) */ - if (action == MLX5_ACTION_TYPE_SET) { - if (sw_field == MLX5_ACTION_IN_FIELD_METADATA_REG_A) { + switch (action) { + case MLX5_ACTION_TYPE_SET: + ret = dr_action_modify_sw_to_hw_set(dmn, sw_action, + hw_action, + ret_dst_hw_info); + break; + + case MLX5_ACTION_TYPE_ADD: + ret = dr_action_modify_sw_to_hw_add(dmn, sw_action, + hw_action, + ret_dst_hw_info); + break; + + case MLX5_ACTION_TYPE_COPY: + ret = dr_action_modify_sw_to_hw_copy(dmn, sw_action, + hw_action, + ret_dst_hw_info, + ret_src_hw_info); + break; + + default: + mlx5dr_info(dmn, "Unsupported action_type for modify action\n"); + ret = -EOPNOTSUPP; + } + + return ret; +} + +static int +dr_action_modify_check_set_field_limitation(struct mlx5dr_action *action, + const __be64 *sw_action) +{ + u16 sw_field = MLX5_GET(set_action_in, sw_action, field); + struct mlx5dr_domain *dmn = action->rewrite.dmn; + + if (sw_field == MLX5_ACTION_IN_FIELD_METADATA_REG_A) { + action->rewrite.allow_rx = 0; + if (dmn->type != MLX5DR_DOMAIN_TYPE_NIC_TX) { + mlx5dr_dbg(dmn, "Unsupported field %d for RX/FDB set action\n", + sw_field); + return -EINVAL; + } + } else if (sw_field == MLX5_ACTION_IN_FIELD_METADATA_REG_B) { + action->rewrite.allow_tx = 0; + if (dmn->type != MLX5DR_DOMAIN_TYPE_NIC_RX) { + mlx5dr_dbg(dmn, "Unsupported field %d for TX/FDB set action\n", + sw_field); + return -EINVAL; + } + } + + if (!action->rewrite.allow_rx && !action->rewrite.allow_tx) { + mlx5dr_dbg(dmn, "Modify SET actions not supported on both RX and TX\n"); + return -EINVAL; + } + + return 0; +} + +static int +dr_action_modify_check_add_field_limitation(struct mlx5dr_action *action, + const __be64 *sw_action) +{ + u16 sw_field = MLX5_GET(set_action_in, sw_action, field); + struct mlx5dr_domain *dmn = action->rewrite.dmn; + + if (sw_field != MLX5_ACTION_IN_FIELD_OUT_IP_TTL && + sw_field != MLX5_ACTION_IN_FIELD_OUT_IPV6_HOPLIMIT && + sw_field != MLX5_ACTION_IN_FIELD_OUT_TCP_SEQ_NUM && + sw_field != MLX5_ACTION_IN_FIELD_OUT_TCP_ACK_NUM) { + mlx5dr_dbg(dmn, "Unsupported field %d for add action\n", + sw_field); + return -EINVAL; + } + + return 0; +} + +static int +dr_action_modify_check_copy_field_limitation(struct mlx5dr_action *action, + const __be64 *sw_action) +{ + struct mlx5dr_domain *dmn = action->rewrite.dmn; + u16 sw_fields[2]; + int i; + + sw_fields[0] = MLX5_GET(copy_action_in, sw_action, src_field); + sw_fields[1] = MLX5_GET(copy_action_in, sw_action, dst_field); + + for (i = 0; i < 2; i++) { + if (sw_fields[i] == MLX5_ACTION_IN_FIELD_METADATA_REG_A) { + action->rewrite.allow_rx = 0; if (dmn->type != MLX5DR_DOMAIN_TYPE_NIC_TX) { mlx5dr_dbg(dmn, "Unsupported field %d for RX/FDB set action\n", - sw_field); + sw_fields[i]); return -EINVAL; } - } - - if (sw_field == MLX5_ACTION_IN_FIELD_METADATA_REG_B) { + } else if (sw_fields[i] == MLX5_ACTION_IN_FIELD_METADATA_REG_B) { + action->rewrite.allow_tx = 0; if (dmn->type != MLX5DR_DOMAIN_TYPE_NIC_RX) { mlx5dr_dbg(dmn, "Unsupported field %d for TX/FDB set action\n", - sw_field); + sw_fields[i]); return -EINVAL; } } - } else if (action == MLX5_ACTION_TYPE_ADD) { - if (sw_field != MLX5_ACTION_IN_FIELD_OUT_IP_TTL && - sw_field != MLX5_ACTION_IN_FIELD_OUT_IPV6_HOPLIMIT && - sw_field != MLX5_ACTION_IN_FIELD_OUT_TCP_SEQ_NUM && - sw_field != MLX5_ACTION_IN_FIELD_OUT_TCP_ACK_NUM) { - mlx5dr_dbg(dmn, "Unsupported field %d for add action\n", sw_field); - return -EINVAL; - } - } else { - mlx5dr_info(dmn, "Unsupported action %d modify action\n", action); - return -EOPNOTSUPP; + } + + if (!action->rewrite.allow_rx && !action->rewrite.allow_tx) { + mlx5dr_dbg(dmn, "Modify copy actions not supported on both RX and TX\n"); + return -EINVAL; } return 0; } +static int +dr_action_modify_check_field_limitation(struct mlx5dr_action *action, + const __be64 *sw_action) +{ + struct mlx5dr_domain *dmn = action->rewrite.dmn; + u8 action_type; + int ret; + + action_type = MLX5_GET(set_action_in, sw_action, action_type); + + switch (action_type) { + case MLX5_ACTION_TYPE_SET: + ret = dr_action_modify_check_set_field_limitation(action, + sw_action); + break; + + case MLX5_ACTION_TYPE_ADD: + ret = dr_action_modify_check_add_field_limitation(action, + sw_action); + break; + + case MLX5_ACTION_TYPE_COPY: + ret = dr_action_modify_check_copy_field_limitation(action, + sw_action); + break; + + default: + mlx5dr_info(dmn, "Unsupported action %d modify action\n", + action_type); + ret = -EOPNOTSUPP; + } + + return ret; +} + static bool dr_action_modify_check_is_ttl_modify(const u64 *sw_action) { @@ -1434,7 +1652,7 @@ dr_action_modify_check_is_ttl_modify(const u64 *sw_action) return sw_field == MLX5_ACTION_IN_FIELD_OUT_IP_TTL; } -static int dr_actions_convert_modify_header(struct mlx5dr_domain *dmn, +static int dr_actions_convert_modify_header(struct mlx5dr_action *action, u32 max_hw_actions, u32 num_sw_actions, __be64 sw_actions[], @@ -1442,20 +1660,26 @@ static int dr_actions_convert_modify_header(struct mlx5dr_domain *dmn, u32 *num_hw_actions, bool *modify_ttl) { - const struct dr_action_modify_field_conv *hw_action_info; + const struct dr_action_modify_field_conv *hw_dst_action_info; + const struct dr_action_modify_field_conv *hw_src_action_info; u16 hw_field = MLX5DR_ACTION_MDFY_HW_FLD_RESERVED; u32 l3_type = MLX5DR_ACTION_MDFY_HW_HDR_L3_NONE; u32 l4_type = MLX5DR_ACTION_MDFY_HW_HDR_L4_NONE; + struct mlx5dr_domain *dmn = action->rewrite.dmn; int ret, i, hw_idx = 0; __be64 *sw_action; __be64 hw_action; *modify_ttl = false; + action->rewrite.allow_rx = 1; + action->rewrite.allow_tx = 1; + for (i = 0; i < num_sw_actions; i++) { sw_action = &sw_actions[i]; - ret = dr_action_modify_check_field_limitation(dmn, sw_action); + ret = dr_action_modify_check_field_limitation(action, + sw_action); if (ret) return ret; @@ -1466,32 +1690,35 @@ static int dr_actions_convert_modify_header(struct mlx5dr_domain *dmn, ret = dr_action_modify_sw_to_hw(dmn, sw_action, &hw_action, - &hw_action_info); + &hw_dst_action_info, + &hw_src_action_info); if (ret) return ret; /* Due to a HW limitation we cannot modify 2 different L3 types */ - if (l3_type && hw_action_info->l3_type && - hw_action_info->l3_type != l3_type) { + if (l3_type && hw_dst_action_info->l3_type && + hw_dst_action_info->l3_type != l3_type) { mlx5dr_dbg(dmn, "Action list can't support two different L3 types\n"); return -EINVAL; } - if (hw_action_info->l3_type) - l3_type = hw_action_info->l3_type; + if (hw_dst_action_info->l3_type) + l3_type = hw_dst_action_info->l3_type; /* Due to a HW limitation we cannot modify two different L4 types */ - if (l4_type && hw_action_info->l4_type && - hw_action_info->l4_type != l4_type) { + if (l4_type && hw_dst_action_info->l4_type && + hw_dst_action_info->l4_type != l4_type) { mlx5dr_dbg(dmn, "Action list can't support two different L4 types\n"); return -EINVAL; } - if (hw_action_info->l4_type) - l4_type = hw_action_info->l4_type; + if (hw_dst_action_info->l4_type) + l4_type = hw_dst_action_info->l4_type; /* HW reads and executes two actions at once this means we * need to create a gap if two actions access the same field */ - if ((hw_idx % 2) && hw_field == hw_action_info->hw_field) { + if ((hw_idx % 2) && (hw_field == hw_dst_action_info->hw_field || + (hw_src_action_info && + hw_field == hw_src_action_info->hw_field))) { /* Check if after gap insertion the total number of HW * modify actions doesn't exceeds the limit */ @@ -1501,7 +1728,7 @@ static int dr_actions_convert_modify_header(struct mlx5dr_domain *dmn, return -EINVAL; } } - hw_field = hw_action_info->hw_field; + hw_field = hw_dst_action_info->hw_field; hw_actions[hw_idx] = hw_action; hw_idx++; @@ -1544,7 +1771,7 @@ static int dr_action_create_modify_action(struct mlx5dr_domain *dmn, goto free_chunk; } - ret = dr_actions_convert_modify_header(dmn, + ret = dr_actions_convert_modify_header(action, max_hw_actions, num_sw_actions, actions, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c index 51803eef13dd..c7f10d4f8f8d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* Copyright (c) 2019 Mellanox Technologies. */ +#include <linux/smp.h> #include "dr_types.h" #define QUEUE_SIZE 128 @@ -729,7 +730,7 @@ static struct mlx5dr_cq *dr_create_cq(struct mlx5_core_dev *mdev, if (!in) goto err_cqwq; - vector = smp_processor_id() % mlx5_comp_vectors_count(mdev); + vector = raw_smp_processor_id() % mlx5_comp_vectors_count(mdev); err = mlx5_vector2eqn(mdev, vector, &eqn, &irqn); if (err) { kvfree(in); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c index b43275cde8bf..3abfc8125926 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c @@ -379,7 +379,6 @@ static int mlx5_cmd_dr_create_fte(struct mlx5_flow_root_namespace *ns, if (fte->action.action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST) { list_for_each_entry(dst, &fte->node.children, node.list) { enum mlx5_flow_destination_type type = dst->dest_attr.type; - u32 id; if (num_actions == MLX5_FLOW_CONTEXT_ACTION_MAX || num_term_actions >= MLX5_FLOW_CONTEXT_ACTION_MAX) { @@ -387,19 +386,10 @@ static int mlx5_cmd_dr_create_fte(struct mlx5_flow_root_namespace *ns, goto free_actions; } - switch (type) { - case MLX5_FLOW_DESTINATION_TYPE_COUNTER: - id = dst->dest_attr.counter_id; + if (type == MLX5_FLOW_DESTINATION_TYPE_COUNTER) + continue; - tmp_action = - mlx5dr_action_create_flow_counter(id); - if (!tmp_action) { - err = -ENOMEM; - goto free_actions; - } - fs_dr_actions[fs_dr_num_actions++] = tmp_action; - actions[num_actions++] = tmp_action; - break; + switch (type) { case MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE: tmp_action = create_ft_action(domain, dst); if (!tmp_action) { @@ -432,6 +422,32 @@ static int mlx5_cmd_dr_create_fte(struct mlx5_flow_root_namespace *ns, } } + if (fte->action.action & MLX5_FLOW_CONTEXT_ACTION_COUNT) { + list_for_each_entry(dst, &fte->node.children, node.list) { + u32 id; + + if (dst->dest_attr.type != + MLX5_FLOW_DESTINATION_TYPE_COUNTER) + continue; + + if (num_actions == MLX5_FLOW_CONTEXT_ACTION_MAX) { + err = -ENOSPC; + goto free_actions; + } + + id = dst->dest_attr.counter_id; + tmp_action = + mlx5dr_action_create_flow_counter(id); + if (!tmp_action) { + err = -ENOMEM; + goto free_actions; + } + + fs_dr_actions[fs_dr_num_actions++] = tmp_action; + actions[num_actions++] = tmp_action; + } + } + params.match_sz = match_sz; params.match_buf = (u64 *)fte->val; if (num_term_actions == 1) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5_ifc_dr.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5_ifc_dr.h index 1722f4668269..e01c3766c7de 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5_ifc_dr.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5_ifc_dr.h @@ -32,6 +32,7 @@ enum { }; enum { + MLX5DR_ACTION_MDFY_HW_OP_COPY = 0x1, MLX5DR_ACTION_MDFY_HW_OP_SET = 0x2, MLX5DR_ACTION_MDFY_HW_OP_ADD = 0x3, }; @@ -625,4 +626,19 @@ struct mlx5_ifc_dr_action_hw_set_bits { u8 inline_data[0x20]; }; +struct mlx5_ifc_dr_action_hw_copy_bits { + u8 opcode[0x8]; + u8 destination_field_code[0x8]; + u8 reserved_at_10[0x2]; + u8 destination_left_shifter[0x6]; + u8 reserved_at_18[0x2]; + u8 destination_length[0x6]; + + u8 reserved_at_20[0x8]; + u8 source_field_code[0x8]; + u8 reserved_at_30[0x2]; + u8 source_left_shifter[0x6]; + u8 reserved_at_38[0x8]; +}; + #endif /* MLX5_IFC_DR_H */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wq.c b/drivers/net/ethernet/mellanox/mlx5/core/wq.c index f2a0e72285ba..02f7e4a39578 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/wq.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/wq.c @@ -89,7 +89,7 @@ void mlx5_wq_cyc_wqe_dump(struct mlx5_wq_cyc *wq, u16 ix, u8 nstrides) len = nstrides << wq->fbc.log_stride; wqe = mlx5_wq_cyc_get_wqe(wq, ix); - pr_info("WQE DUMP: WQ size %d WQ cur size %d, WQE index 0x%x, len: %ld\n", + pr_info("WQE DUMP: WQ size %d WQ cur size %d, WQE index 0x%x, len: %zu\n", mlx5_wq_cyc_get_size(wq), wq->cur_sz, ix, len); print_hex_dump(KERN_WARNING, "", DUMP_PREFIX_OFFSET, 16, 1, wqe, len, false); } diff --git a/drivers/net/ethernet/mellanox/mlxsw/minimal.c b/drivers/net/ethernet/mellanox/mlxsw/minimal.c index 2b543911ae00..c4caeeadcba9 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/minimal.c +++ b/drivers/net/ethernet/mellanox/mlxsw/minimal.c @@ -213,8 +213,8 @@ mlxsw_m_port_create(struct mlxsw_m *mlxsw_m, u8 local_port, u8 module) err_register_netdev: mlxsw_m->ports[local_port] = NULL; - free_netdev(dev); err_dev_addr_get: + free_netdev(dev); err_alloc_etherdev: mlxsw_core_port_fini(mlxsw_m->core, local_port); return err; diff --git a/drivers/net/ethernet/mellanox/mlxsw/reg.h b/drivers/net/ethernet/mellanox/mlxsw/reg.h index 0b80e75e87c3..dd6685156396 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/reg.h +++ b/drivers/net/ethernet/mellanox/mlxsw/reg.h @@ -3563,8 +3563,8 @@ MLXSW_ITEM32(reg, qeec, min_shaper_rate, 0x0C, 0, 28); */ MLXSW_ITEM32(reg, qeec, mase, 0x10, 31, 1); -/* A large max rate will disable the max shaper. */ -#define MLXSW_REG_QEEC_MAS_DIS 200000000 /* Kbps */ +/* The largest max shaper value possible to disable the shaper. */ +#define MLXSW_REG_QEEC_MAS_DIS ((1u << 31) - 1) /* Kbps */ /* reg_qeec_max_shaper_rate * Max shaper information rate. @@ -3602,6 +3602,21 @@ MLXSW_ITEM32(reg, qeec, dwrr, 0x18, 15, 1); */ MLXSW_ITEM32(reg, qeec, dwrr_weight, 0x18, 0, 8); +/* reg_qeec_max_shaper_bs + * Max shaper burst size + * Burst size is 2^max_shaper_bs * 512 bits + * For Spectrum-1: Range is: 5..25 + * For Spectrum-2: Range is: 11..25 + * Reserved when ptps = 1 + * Access: RW + */ +MLXSW_ITEM32(reg, qeec, max_shaper_bs, 0x1C, 0, 6); + +#define MLXSW_REG_QEEC_HIGHEST_SHAPER_BS 25 +#define MLXSW_REG_QEEC_LOWEST_SHAPER_BS_SP1 5 +#define MLXSW_REG_QEEC_LOWEST_SHAPER_BS_SP2 11 +#define MLXSW_REG_QEEC_LOWEST_SHAPER_BS_SP3 5 + static inline void mlxsw_reg_qeec_pack(char *payload, u8 local_port, enum mlxsw_reg_qeec_hr hr, u8 index, u8 next_index) diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c index 9eb3ac7669f7..7358b5bc7eb6 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c @@ -1796,6 +1796,8 @@ static int mlxsw_sp_setup_tc(struct net_device *dev, enum tc_setup_type type, return mlxsw_sp_setup_tc_prio(mlxsw_sp_port, type_data); case TC_SETUP_QDISC_ETS: return mlxsw_sp_setup_tc_ets(mlxsw_sp_port, type_data); + case TC_SETUP_QDISC_TBF: + return mlxsw_sp_setup_tc_tbf(mlxsw_sp_port, type_data); default: return -EOPNOTSUPP; } @@ -3577,7 +3579,7 @@ int mlxsw_sp_port_ets_set(struct mlxsw_sp_port *mlxsw_sp_port, int mlxsw_sp_port_ets_maxrate_set(struct mlxsw_sp_port *mlxsw_sp_port, enum mlxsw_reg_qeec_hr hr, u8 index, - u8 next_index, u32 maxrate) + u8 next_index, u32 maxrate, u8 burst_size) { struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp; char qeec_pl[MLXSW_REG_QEEC_LEN]; @@ -3586,6 +3588,7 @@ int mlxsw_sp_port_ets_maxrate_set(struct mlxsw_sp_port *mlxsw_sp_port, next_index); mlxsw_reg_qeec_mase_set(qeec_pl, true); mlxsw_reg_qeec_max_shaper_rate_set(qeec_pl, maxrate); + mlxsw_reg_qeec_max_shaper_bs_set(qeec_pl, burst_size); return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(qeec), qeec_pl); } @@ -3654,14 +3657,14 @@ static int mlxsw_sp_port_ets_init(struct mlxsw_sp_port *mlxsw_sp_port) */ err = mlxsw_sp_port_ets_maxrate_set(mlxsw_sp_port, MLXSW_REG_QEEC_HR_PORT, 0, 0, - MLXSW_REG_QEEC_MAS_DIS); + MLXSW_REG_QEEC_MAS_DIS, 0); if (err) return err; for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) { err = mlxsw_sp_port_ets_maxrate_set(mlxsw_sp_port, MLXSW_REG_QEEC_HR_SUBGROUP, i, 0, - MLXSW_REG_QEEC_MAS_DIS); + MLXSW_REG_QEEC_MAS_DIS, 0); if (err) return err; } @@ -3669,14 +3672,14 @@ static int mlxsw_sp_port_ets_init(struct mlxsw_sp_port *mlxsw_sp_port) err = mlxsw_sp_port_ets_maxrate_set(mlxsw_sp_port, MLXSW_REG_QEEC_HR_TC, i, i, - MLXSW_REG_QEEC_MAS_DIS); + MLXSW_REG_QEEC_MAS_DIS, 0); if (err) return err; err = mlxsw_sp_port_ets_maxrate_set(mlxsw_sp_port, MLXSW_REG_QEEC_HR_TC, i + 8, i, - MLXSW_REG_QEEC_MAS_DIS); + MLXSW_REG_QEEC_MAS_DIS, 0); if (err) return err; } @@ -5173,6 +5176,7 @@ static int mlxsw_sp1_init(struct mlxsw_core *mlxsw_core, mlxsw_sp->span_ops = &mlxsw_sp1_span_ops; mlxsw_sp->listeners = mlxsw_sp1_listener; mlxsw_sp->listeners_count = ARRAY_SIZE(mlxsw_sp1_listener); + mlxsw_sp->lowest_shaper_bs = MLXSW_REG_QEEC_LOWEST_SHAPER_BS_SP1; return mlxsw_sp_init(mlxsw_core, mlxsw_bus_info, extack); } @@ -5197,6 +5201,7 @@ static int mlxsw_sp2_init(struct mlxsw_core *mlxsw_core, mlxsw_sp->port_type_speed_ops = &mlxsw_sp2_port_type_speed_ops; mlxsw_sp->ptp_ops = &mlxsw_sp2_ptp_ops; mlxsw_sp->span_ops = &mlxsw_sp2_span_ops; + mlxsw_sp->lowest_shaper_bs = MLXSW_REG_QEEC_LOWEST_SHAPER_BS_SP2; return mlxsw_sp_init(mlxsw_core, mlxsw_bus_info, extack); } @@ -5219,6 +5224,7 @@ static int mlxsw_sp3_init(struct mlxsw_core *mlxsw_core, mlxsw_sp->port_type_speed_ops = &mlxsw_sp2_port_type_speed_ops; mlxsw_sp->ptp_ops = &mlxsw_sp2_ptp_ops; mlxsw_sp->span_ops = &mlxsw_sp2_span_ops; + mlxsw_sp->lowest_shaper_bs = MLXSW_REG_QEEC_LOWEST_SHAPER_BS_SP3; return mlxsw_sp_init(mlxsw_core, mlxsw_bus_info, extack); } diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h index 5f3b74360dc8..a0f1f9dceec5 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h @@ -189,6 +189,7 @@ struct mlxsw_sp { const struct mlxsw_sp_span_ops *span_ops; const struct mlxsw_listener *listeners; size_t listeners_count; + u32 lowest_shaper_bs; }; static inline struct mlxsw_sp_upper * @@ -487,7 +488,7 @@ int __mlxsw_sp_port_headroom_set(struct mlxsw_sp_port *mlxsw_sp_port, int mtu, struct ieee_pfc *my_pfc); int mlxsw_sp_port_ets_maxrate_set(struct mlxsw_sp_port *mlxsw_sp_port, enum mlxsw_reg_qeec_hr hr, u8 index, - u8 next_index, u32 maxrate); + u8 next_index, u32 maxrate, u8 burst_size); enum mlxsw_reg_spms_state mlxsw_sp_stp_spms_state(u8 stp_state); int mlxsw_sp_port_vid_stp_set(struct mlxsw_sp_port *mlxsw_sp_port, u16 vid, u8 state); @@ -861,6 +862,8 @@ int mlxsw_sp_setup_tc_prio(struct mlxsw_sp_port *mlxsw_sp_port, struct tc_prio_qopt_offload *p); int mlxsw_sp_setup_tc_ets(struct mlxsw_sp_port *mlxsw_sp_port, struct tc_ets_qopt_offload *p); +int mlxsw_sp_setup_tc_tbf(struct mlxsw_sp_port *mlxsw_sp_port, + struct tc_tbf_qopt_offload *p); /* spectrum_fid.c */ bool mlxsw_sp_fid_is_dummy(struct mlxsw_sp *mlxsw_sp, u16 fid_index); diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c index 150b3a144b83..3d3cca596116 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c @@ -8,6 +8,7 @@ #include <linux/string.h> #include <linux/rhashtable.h> #include <linux/netdevice.h> +#include <linux/mutex.h> #include <net/net_namespace.h> #include <net/tc_act/tc_vlan.h> @@ -25,6 +26,7 @@ struct mlxsw_sp_acl { struct mlxsw_sp_fid *dummy_fid; struct rhashtable ruleset_ht; struct list_head rules; + struct mutex rules_lock; /* Protects rules list */ struct { struct delayed_work dw; unsigned long interval; /* ms */ @@ -701,7 +703,9 @@ int mlxsw_sp_acl_rule_add(struct mlxsw_sp *mlxsw_sp, goto err_ruleset_block_bind; } + mutex_lock(&mlxsw_sp->acl->rules_lock); list_add_tail(&rule->list, &mlxsw_sp->acl->rules); + mutex_unlock(&mlxsw_sp->acl->rules_lock); block->rule_count++; block->egress_blocker_rule_count += rule->rulei->egress_bind_blocker; return 0; @@ -723,7 +727,9 @@ void mlxsw_sp_acl_rule_del(struct mlxsw_sp *mlxsw_sp, block->egress_blocker_rule_count -= rule->rulei->egress_bind_blocker; ruleset->ht_key.block->rule_count--; + mutex_lock(&mlxsw_sp->acl->rules_lock); list_del(&rule->list); + mutex_unlock(&mlxsw_sp->acl->rules_lock); if (!ruleset->ht_key.chain_index && mlxsw_sp_acl_ruleset_is_singular(ruleset)) mlxsw_sp_acl_ruleset_block_unbind(mlxsw_sp, ruleset, @@ -783,19 +789,18 @@ static int mlxsw_sp_acl_rules_activity_update(struct mlxsw_sp_acl *acl) struct mlxsw_sp_acl_rule *rule; int err; - /* Protect internal structures from changes */ - rtnl_lock(); + mutex_lock(&acl->rules_lock); list_for_each_entry(rule, &acl->rules, list) { err = mlxsw_sp_acl_rule_activity_update(acl->mlxsw_sp, rule); if (err) goto err_rule_update; } - rtnl_unlock(); + mutex_unlock(&acl->rules_lock); return 0; err_rule_update: - rtnl_unlock(); + mutex_unlock(&acl->rules_lock); return err; } @@ -880,6 +885,7 @@ int mlxsw_sp_acl_init(struct mlxsw_sp *mlxsw_sp) acl->dummy_fid = fid; INIT_LIST_HEAD(&acl->rules); + mutex_init(&acl->rules_lock); err = mlxsw_sp_acl_tcam_init(mlxsw_sp, &acl->tcam); if (err) goto err_acl_ops_init; @@ -892,6 +898,7 @@ int mlxsw_sp_acl_init(struct mlxsw_sp *mlxsw_sp) return 0; err_acl_ops_init: + mutex_destroy(&acl->rules_lock); mlxsw_sp_fid_put(fid); err_fid_get: rhashtable_destroy(&acl->ruleset_ht); @@ -908,6 +915,7 @@ void mlxsw_sp_acl_fini(struct mlxsw_sp *mlxsw_sp) cancel_delayed_work_sync(&mlxsw_sp->acl->rule_activity_update.dw); mlxsw_sp_acl_tcam_fini(mlxsw_sp, &acl->tcam); + mutex_destroy(&acl->rules_lock); WARN_ON(!list_empty(&acl->rules)); mlxsw_sp_fid_put(acl->dummy_fid); rhashtable_destroy(&acl->ruleset_ht); diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_dcb.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_dcb.c index db66f2b56a6d..49a72a8f1f57 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_dcb.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_dcb.c @@ -526,7 +526,7 @@ static int mlxsw_sp_dcbnl_ieee_setmaxrate(struct net_device *dev, err = mlxsw_sp_port_ets_maxrate_set(mlxsw_sp_port, MLXSW_REG_QEEC_HR_SUBGROUP, i, 0, - maxrate->tc_maxrate[i]); + maxrate->tc_maxrate[i], 0); if (err) { netdev_err(dev, "Failed to set maxrate for TC %d\n", i); goto err_port_ets_maxrate_set; @@ -541,7 +541,8 @@ err_port_ets_maxrate_set: for (i--; i >= 0; i--) mlxsw_sp_port_ets_maxrate_set(mlxsw_sp_port, MLXSW_REG_QEEC_HR_SUBGROUP, - i, 0, my_maxrate->tc_maxrate[i]); + i, 0, + my_maxrate->tc_maxrate[i], 0); return err; } diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c index d57c9b15f45e..79a2801d59f6 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c @@ -19,6 +19,7 @@ enum mlxsw_sp_qdisc_type { MLXSW_SP_QDISC_RED, MLXSW_SP_QDISC_PRIO, MLXSW_SP_QDISC_ETS, + MLXSW_SP_QDISC_TBF, }; struct mlxsw_sp_qdisc_ops { @@ -227,6 +228,70 @@ mlxsw_sp_qdisc_bstats_per_priority_get(struct mlxsw_sp_port_xstats *xstats, } } +static void +mlxsw_sp_qdisc_collect_tc_stats(struct mlxsw_sp_port *mlxsw_sp_port, + struct mlxsw_sp_qdisc *mlxsw_sp_qdisc, + u64 *p_tx_bytes, u64 *p_tx_packets, + u64 *p_drops, u64 *p_backlog) +{ + u8 tclass_num = mlxsw_sp_qdisc->tclass_num; + struct mlxsw_sp_port_xstats *xstats; + u64 tx_bytes, tx_packets; + + xstats = &mlxsw_sp_port->periodic_hw_stats.xstats; + mlxsw_sp_qdisc_bstats_per_priority_get(xstats, + mlxsw_sp_qdisc->prio_bitmap, + &tx_packets, &tx_bytes); + + *p_tx_packets += tx_packets; + *p_tx_bytes += tx_bytes; + *p_drops += xstats->wred_drop[tclass_num] + + mlxsw_sp_xstats_tail_drop(xstats, tclass_num); + *p_backlog += mlxsw_sp_xstats_backlog(xstats, tclass_num); +} + +static void +mlxsw_sp_qdisc_update_stats(struct mlxsw_sp *mlxsw_sp, + struct mlxsw_sp_qdisc *mlxsw_sp_qdisc, + u64 tx_bytes, u64 tx_packets, + u64 drops, u64 backlog, + struct tc_qopt_offload_stats *stats_ptr) +{ + struct mlxsw_sp_qdisc_stats *stats_base = &mlxsw_sp_qdisc->stats_base; + + tx_bytes -= stats_base->tx_bytes; + tx_packets -= stats_base->tx_packets; + drops -= stats_base->drops; + backlog -= stats_base->backlog; + + _bstats_update(stats_ptr->bstats, tx_bytes, tx_packets); + stats_ptr->qstats->drops += drops; + stats_ptr->qstats->backlog += mlxsw_sp_cells_bytes(mlxsw_sp, backlog); + + stats_base->backlog += backlog; + stats_base->drops += drops; + stats_base->tx_bytes += tx_bytes; + stats_base->tx_packets += tx_packets; +} + +static void +mlxsw_sp_qdisc_get_tc_stats(struct mlxsw_sp_port *mlxsw_sp_port, + struct mlxsw_sp_qdisc *mlxsw_sp_qdisc, + struct tc_qopt_offload_stats *stats_ptr) +{ + u64 tx_packets = 0; + u64 tx_bytes = 0; + u64 backlog = 0; + u64 drops = 0; + + mlxsw_sp_qdisc_collect_tc_stats(mlxsw_sp_port, mlxsw_sp_qdisc, + &tx_bytes, &tx_packets, + &drops, &backlog); + mlxsw_sp_qdisc_update_stats(mlxsw_sp_port->mlxsw_sp, mlxsw_sp_qdisc, + tx_bytes, tx_packets, drops, backlog, + stats_ptr); +} + static int mlxsw_sp_tclass_congestion_enable(struct mlxsw_sp_port *mlxsw_sp_port, int tclass_num, u32 min, u32 max, @@ -357,19 +422,28 @@ mlxsw_sp_qdisc_red_replace(struct mlxsw_sp_port *mlxsw_sp_port, } static void -mlxsw_sp_qdisc_red_unoffload(struct mlxsw_sp_port *mlxsw_sp_port, - struct mlxsw_sp_qdisc *mlxsw_sp_qdisc, - void *params) +mlxsw_sp_qdisc_leaf_unoffload(struct mlxsw_sp_port *mlxsw_sp_port, + struct mlxsw_sp_qdisc *mlxsw_sp_qdisc, + struct gnet_stats_queue *qstats) { - struct tc_red_qopt_offload_params *p = params; u64 backlog; backlog = mlxsw_sp_cells_bytes(mlxsw_sp_port->mlxsw_sp, mlxsw_sp_qdisc->stats_base.backlog); - p->qstats->backlog -= backlog; + qstats->backlog -= backlog; mlxsw_sp_qdisc->stats_base.backlog = 0; } +static void +mlxsw_sp_qdisc_red_unoffload(struct mlxsw_sp_port *mlxsw_sp_port, + struct mlxsw_sp_qdisc *mlxsw_sp_qdisc, + void *params) +{ + struct tc_red_qopt_offload_params *p = params; + + mlxsw_sp_qdisc_leaf_unoffload(mlxsw_sp_port, mlxsw_sp_qdisc, p->qstats); +} + static int mlxsw_sp_qdisc_get_red_xstats(struct mlxsw_sp_port *mlxsw_sp_port, struct mlxsw_sp_qdisc *mlxsw_sp_qdisc, @@ -403,41 +477,21 @@ mlxsw_sp_qdisc_get_red_stats(struct mlxsw_sp_port *mlxsw_sp_port, struct mlxsw_sp_qdisc *mlxsw_sp_qdisc, struct tc_qopt_offload_stats *stats_ptr) { - u64 tx_bytes, tx_packets, overlimits, drops, backlog; u8 tclass_num = mlxsw_sp_qdisc->tclass_num; struct mlxsw_sp_qdisc_stats *stats_base; struct mlxsw_sp_port_xstats *xstats; + u64 overlimits; xstats = &mlxsw_sp_port->periodic_hw_stats.xstats; stats_base = &mlxsw_sp_qdisc->stats_base; - mlxsw_sp_qdisc_bstats_per_priority_get(xstats, - mlxsw_sp_qdisc->prio_bitmap, - &tx_packets, &tx_bytes); - tx_bytes = tx_bytes - stats_base->tx_bytes; - tx_packets = tx_packets - stats_base->tx_packets; - + mlxsw_sp_qdisc_get_tc_stats(mlxsw_sp_port, mlxsw_sp_qdisc, stats_ptr); overlimits = xstats->wred_drop[tclass_num] + xstats->ecn - stats_base->overlimits; - drops = xstats->wred_drop[tclass_num] + - mlxsw_sp_xstats_tail_drop(xstats, tclass_num) - - stats_base->drops; - backlog = mlxsw_sp_xstats_backlog(xstats, tclass_num); - _bstats_update(stats_ptr->bstats, tx_bytes, tx_packets); stats_ptr->qstats->overlimits += overlimits; - stats_ptr->qstats->drops += drops; - stats_ptr->qstats->backlog += - mlxsw_sp_cells_bytes(mlxsw_sp_port->mlxsw_sp, - backlog) - - mlxsw_sp_cells_bytes(mlxsw_sp_port->mlxsw_sp, - stats_base->backlog); - - stats_base->backlog = backlog; - stats_base->drops += drops; stats_base->overlimits += overlimits; - stats_base->tx_bytes += tx_bytes; - stats_base->tx_packets += tx_packets; + return 0; } @@ -487,6 +541,204 @@ int mlxsw_sp_setup_tc_red(struct mlxsw_sp_port *mlxsw_sp_port, } } +static void +mlxsw_sp_setup_tc_qdisc_leaf_clean_stats(struct mlxsw_sp_port *mlxsw_sp_port, + struct mlxsw_sp_qdisc *mlxsw_sp_qdisc) +{ + u64 backlog_cells = 0; + u64 tx_packets = 0; + u64 tx_bytes = 0; + u64 drops = 0; + + mlxsw_sp_qdisc_collect_tc_stats(mlxsw_sp_port, mlxsw_sp_qdisc, + &tx_bytes, &tx_packets, + &drops, &backlog_cells); + + mlxsw_sp_qdisc->stats_base.tx_packets = tx_packets; + mlxsw_sp_qdisc->stats_base.tx_bytes = tx_bytes; + mlxsw_sp_qdisc->stats_base.drops = drops; + mlxsw_sp_qdisc->stats_base.backlog = 0; +} + +static int +mlxsw_sp_qdisc_tbf_destroy(struct mlxsw_sp_port *mlxsw_sp_port, + struct mlxsw_sp_qdisc *mlxsw_sp_qdisc) +{ + struct mlxsw_sp_qdisc *root_qdisc = mlxsw_sp_port->root_qdisc; + + if (root_qdisc != mlxsw_sp_qdisc) + root_qdisc->stats_base.backlog -= + mlxsw_sp_qdisc->stats_base.backlog; + + return mlxsw_sp_port_ets_maxrate_set(mlxsw_sp_port, + MLXSW_REG_QEEC_HR_SUBGROUP, + mlxsw_sp_qdisc->tclass_num, 0, + MLXSW_REG_QEEC_MAS_DIS, 0); +} + +static int +mlxsw_sp_qdisc_tbf_bs(struct mlxsw_sp_port *mlxsw_sp_port, + u32 max_size, u8 *p_burst_size) +{ + /* TBF burst size is configured in bytes. The ASIC burst size value is + * ((2 ^ bs) * 512 bits. Convert the TBF bytes to 512-bit units. + */ + u32 bs512 = max_size / 64; + u8 bs = fls(bs512); + + if (!bs) + return -EINVAL; + --bs; + + /* Demand a power of two. */ + if ((1 << bs) != bs512) + return -EINVAL; + + if (bs < mlxsw_sp_port->mlxsw_sp->lowest_shaper_bs || + bs > MLXSW_REG_QEEC_HIGHEST_SHAPER_BS) + return -EINVAL; + + *p_burst_size = bs; + return 0; +} + +static u32 +mlxsw_sp_qdisc_tbf_max_size(u8 bs) +{ + return (1U << bs) * 64; +} + +static u64 +mlxsw_sp_qdisc_tbf_rate_kbps(struct tc_tbf_qopt_offload_replace_params *p) +{ + /* TBF interface is in bytes/s, whereas Spectrum ASIC is configured in + * Kbits/s. + */ + return p->rate.rate_bytes_ps / 1000 * 8; +} + +static int +mlxsw_sp_qdisc_tbf_check_params(struct mlxsw_sp_port *mlxsw_sp_port, + struct mlxsw_sp_qdisc *mlxsw_sp_qdisc, + void *params) +{ + struct tc_tbf_qopt_offload_replace_params *p = params; + struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp; + u64 rate_kbps = mlxsw_sp_qdisc_tbf_rate_kbps(p); + u8 burst_size; + int err; + + if (rate_kbps >= MLXSW_REG_QEEC_MAS_DIS) { + dev_err(mlxsw_sp_port->mlxsw_sp->bus_info->dev, + "spectrum: TBF: rate of %lluKbps must be below %u\n", + rate_kbps, MLXSW_REG_QEEC_MAS_DIS); + return -EINVAL; + } + + err = mlxsw_sp_qdisc_tbf_bs(mlxsw_sp_port, p->max_size, &burst_size); + if (err) { + u8 highest_shaper_bs = MLXSW_REG_QEEC_HIGHEST_SHAPER_BS; + + dev_err(mlxsw_sp->bus_info->dev, + "spectrum: TBF: invalid burst size of %u, must be a power of two between %u and %u", + p->max_size, + mlxsw_sp_qdisc_tbf_max_size(mlxsw_sp->lowest_shaper_bs), + mlxsw_sp_qdisc_tbf_max_size(highest_shaper_bs)); + return -EINVAL; + } + + return 0; +} + +static int +mlxsw_sp_qdisc_tbf_replace(struct mlxsw_sp_port *mlxsw_sp_port, + struct mlxsw_sp_qdisc *mlxsw_sp_qdisc, + void *params) +{ + struct tc_tbf_qopt_offload_replace_params *p = params; + u64 rate_kbps = mlxsw_sp_qdisc_tbf_rate_kbps(p); + u8 burst_size; + int err; + + err = mlxsw_sp_qdisc_tbf_bs(mlxsw_sp_port, p->max_size, &burst_size); + if (WARN_ON_ONCE(err)) + /* check_params above was supposed to reject this value. */ + return -EINVAL; + + /* Configure subgroup shaper, so that both UC and MC traffic is subject + * to shaping. That is unlike RED, however UC queue lengths are going to + * be different than MC ones due to different pool and quota + * configurations, so the configuration is not applicable. For shaper on + * the other hand, subjecting the overall stream to the configured + * shaper makes sense. Also note that that is what we do for + * ieee_setmaxrate(). + */ + return mlxsw_sp_port_ets_maxrate_set(mlxsw_sp_port, + MLXSW_REG_QEEC_HR_SUBGROUP, + mlxsw_sp_qdisc->tclass_num, 0, + rate_kbps, burst_size); +} + +static void +mlxsw_sp_qdisc_tbf_unoffload(struct mlxsw_sp_port *mlxsw_sp_port, + struct mlxsw_sp_qdisc *mlxsw_sp_qdisc, + void *params) +{ + struct tc_tbf_qopt_offload_replace_params *p = params; + + mlxsw_sp_qdisc_leaf_unoffload(mlxsw_sp_port, mlxsw_sp_qdisc, p->qstats); +} + +static int +mlxsw_sp_qdisc_get_tbf_stats(struct mlxsw_sp_port *mlxsw_sp_port, + struct mlxsw_sp_qdisc *mlxsw_sp_qdisc, + struct tc_qopt_offload_stats *stats_ptr) +{ + mlxsw_sp_qdisc_get_tc_stats(mlxsw_sp_port, mlxsw_sp_qdisc, + stats_ptr); + return 0; +} + +static struct mlxsw_sp_qdisc_ops mlxsw_sp_qdisc_ops_tbf = { + .type = MLXSW_SP_QDISC_TBF, + .check_params = mlxsw_sp_qdisc_tbf_check_params, + .replace = mlxsw_sp_qdisc_tbf_replace, + .unoffload = mlxsw_sp_qdisc_tbf_unoffload, + .destroy = mlxsw_sp_qdisc_tbf_destroy, + .get_stats = mlxsw_sp_qdisc_get_tbf_stats, + .clean_stats = mlxsw_sp_setup_tc_qdisc_leaf_clean_stats, +}; + +int mlxsw_sp_setup_tc_tbf(struct mlxsw_sp_port *mlxsw_sp_port, + struct tc_tbf_qopt_offload *p) +{ + struct mlxsw_sp_qdisc *mlxsw_sp_qdisc; + + mlxsw_sp_qdisc = mlxsw_sp_qdisc_find(mlxsw_sp_port, p->parent, false); + if (!mlxsw_sp_qdisc) + return -EOPNOTSUPP; + + if (p->command == TC_TBF_REPLACE) + return mlxsw_sp_qdisc_replace(mlxsw_sp_port, p->handle, + mlxsw_sp_qdisc, + &mlxsw_sp_qdisc_ops_tbf, + &p->replace_params); + + if (!mlxsw_sp_qdisc_compare(mlxsw_sp_qdisc, p->handle, + MLXSW_SP_QDISC_TBF)) + return -EOPNOTSUPP; + + switch (p->command) { + case TC_TBF_DESTROY: + return mlxsw_sp_qdisc_destroy(mlxsw_sp_port, mlxsw_sp_qdisc); + case TC_TBF_STATS: + return mlxsw_sp_qdisc_get_stats(mlxsw_sp_port, mlxsw_sp_qdisc, + &p->stats); + default: + return -EOPNOTSUPP; + } +} + static int __mlxsw_sp_qdisc_ets_destroy(struct mlxsw_sp_port *mlxsw_sp_port) { @@ -628,37 +880,23 @@ mlxsw_sp_qdisc_get_prio_stats(struct mlxsw_sp_port *mlxsw_sp_port, struct mlxsw_sp_qdisc *mlxsw_sp_qdisc, struct tc_qopt_offload_stats *stats_ptr) { - u64 tx_bytes, tx_packets, drops = 0, backlog = 0; - struct mlxsw_sp_qdisc_stats *stats_base; - struct mlxsw_sp_port_xstats *xstats; - struct rtnl_link_stats64 *stats; + struct mlxsw_sp_qdisc *tc_qdisc; + u64 tx_packets = 0; + u64 tx_bytes = 0; + u64 backlog = 0; + u64 drops = 0; int i; - xstats = &mlxsw_sp_port->periodic_hw_stats.xstats; - stats = &mlxsw_sp_port->periodic_hw_stats.stats; - stats_base = &mlxsw_sp_qdisc->stats_base; - - tx_bytes = stats->tx_bytes - stats_base->tx_bytes; - tx_packets = stats->tx_packets - stats_base->tx_packets; - for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) { - drops += mlxsw_sp_xstats_tail_drop(xstats, i); - drops += xstats->wred_drop[i]; - backlog += mlxsw_sp_xstats_backlog(xstats, i); + tc_qdisc = &mlxsw_sp_port->tclass_qdiscs[i]; + mlxsw_sp_qdisc_collect_tc_stats(mlxsw_sp_port, tc_qdisc, + &tx_bytes, &tx_packets, + &drops, &backlog); } - drops = drops - stats_base->drops; - _bstats_update(stats_ptr->bstats, tx_bytes, tx_packets); - stats_ptr->qstats->drops += drops; - stats_ptr->qstats->backlog += - mlxsw_sp_cells_bytes(mlxsw_sp_port->mlxsw_sp, - backlog) - - mlxsw_sp_cells_bytes(mlxsw_sp_port->mlxsw_sp, - stats_base->backlog); - stats_base->backlog = backlog; - stats_base->drops += drops; - stats_base->tx_bytes += tx_bytes; - stats_base->tx_packets += tx_packets; + mlxsw_sp_qdisc_update_stats(mlxsw_sp_port->mlxsw_sp, mlxsw_sp_qdisc, + tx_bytes, tx_packets, drops, backlog, + stats_ptr); return 0; } diff --git a/drivers/net/ethernet/natsemi/sonic.c b/drivers/net/ethernet/natsemi/sonic.c index fdebc8598b22..31be3ba66877 100644 --- a/drivers/net/ethernet/natsemi/sonic.c +++ b/drivers/net/ethernet/natsemi/sonic.c @@ -64,6 +64,8 @@ static int sonic_open(struct net_device *dev) netif_dbg(lp, ifup, dev, "%s: initializing sonic driver\n", __func__); + spin_lock_init(&lp->lock); + for (i = 0; i < SONIC_NUM_RRS; i++) { struct sk_buff *skb = netdev_alloc_skb(dev, SONIC_RBSIZE + 2); if (skb == NULL) { @@ -114,6 +116,24 @@ static int sonic_open(struct net_device *dev) return 0; } +/* Wait for the SONIC to become idle. */ +static void sonic_quiesce(struct net_device *dev, u16 mask) +{ + struct sonic_local * __maybe_unused lp = netdev_priv(dev); + int i; + u16 bits; + + for (i = 0; i < 1000; ++i) { + bits = SONIC_READ(SONIC_CMD) & mask; + if (!bits) + return; + if (irqs_disabled() || in_interrupt()) + udelay(20); + else + usleep_range(100, 200); + } + WARN_ONCE(1, "command deadline expired! 0x%04x\n", bits); +} /* * Close the SONIC device @@ -130,6 +150,9 @@ static int sonic_close(struct net_device *dev) /* * stop the SONIC, disable interrupts */ + SONIC_WRITE(SONIC_CMD, SONIC_CR_RXDIS); + sonic_quiesce(dev, SONIC_CR_ALL); + SONIC_WRITE(SONIC_IMR, 0); SONIC_WRITE(SONIC_ISR, 0x7fff); SONIC_WRITE(SONIC_CMD, SONIC_CR_RST); @@ -169,6 +192,9 @@ static void sonic_tx_timeout(struct net_device *dev, unsigned int txqueue) * put the Sonic into software-reset mode and * disable all interrupts before releasing DMA buffers */ + SONIC_WRITE(SONIC_CMD, SONIC_CR_RXDIS); + sonic_quiesce(dev, SONIC_CR_ALL); + SONIC_WRITE(SONIC_IMR, 0); SONIC_WRITE(SONIC_ISR, 0x7fff); SONIC_WRITE(SONIC_CMD, SONIC_CR_RST); @@ -206,8 +232,6 @@ static void sonic_tx_timeout(struct net_device *dev, unsigned int txqueue) * wake the tx queue * Concurrently with all of this, the SONIC is potentially writing to * the status flags of the TDs. - * Until some mutual exclusion is added, this code will not work with SMP. However, - * MIPS Jazz machines and m68k Macs were all uni-processor machines. */ static int sonic_send_packet(struct sk_buff *skb, struct net_device *dev) @@ -215,7 +239,8 @@ static int sonic_send_packet(struct sk_buff *skb, struct net_device *dev) struct sonic_local *lp = netdev_priv(dev); dma_addr_t laddr; int length; - int entry = lp->next_tx; + int entry; + unsigned long flags; netif_dbg(lp, tx_queued, dev, "%s: skb=%p\n", __func__, skb); @@ -237,6 +262,10 @@ static int sonic_send_packet(struct sk_buff *skb, struct net_device *dev) return NETDEV_TX_OK; } + spin_lock_irqsave(&lp->lock, flags); + + entry = lp->next_tx; + sonic_tda_put(dev, entry, SONIC_TD_STATUS, 0); /* clear status */ sonic_tda_put(dev, entry, SONIC_TD_FRAG_COUNT, 1); /* single fragment */ sonic_tda_put(dev, entry, SONIC_TD_PKTSIZE, length); /* length of packet */ @@ -246,10 +275,6 @@ static int sonic_send_packet(struct sk_buff *skb, struct net_device *dev) sonic_tda_put(dev, entry, SONIC_TD_LINK, sonic_tda_get(dev, entry, SONIC_TD_LINK) | SONIC_EOL); - /* - * Must set tx_skb[entry] only after clearing status, and - * before clearing EOL and before stopping queue - */ wmb(); lp->tx_len[entry] = length; lp->tx_laddr[entry] = laddr; @@ -272,6 +297,8 @@ static int sonic_send_packet(struct sk_buff *skb, struct net_device *dev) SONIC_WRITE(SONIC_CMD, SONIC_CR_TXP); + spin_unlock_irqrestore(&lp->lock, flags); + return NETDEV_TX_OK; } @@ -284,15 +311,28 @@ static irqreturn_t sonic_interrupt(int irq, void *dev_id) struct net_device *dev = dev_id; struct sonic_local *lp = netdev_priv(dev); int status; + unsigned long flags; + + /* The lock has two purposes. Firstly, it synchronizes sonic_interrupt() + * with sonic_send_packet() so that the two functions can share state. + * Secondly, it makes sonic_interrupt() re-entrant, as that is required + * by macsonic which must use two IRQs with different priority levels. + */ + spin_lock_irqsave(&lp->lock, flags); + + status = SONIC_READ(SONIC_ISR) & SONIC_IMR_DEFAULT; + if (!status) { + spin_unlock_irqrestore(&lp->lock, flags); - if (!(status = SONIC_READ(SONIC_ISR) & SONIC_IMR_DEFAULT)) return IRQ_NONE; + } do { + SONIC_WRITE(SONIC_ISR, status); /* clear the interrupt(s) */ + if (status & SONIC_INT_PKTRX) { netif_dbg(lp, intr, dev, "%s: packet rx\n", __func__); sonic_rx(dev); /* got packet(s) */ - SONIC_WRITE(SONIC_ISR, SONIC_INT_PKTRX); /* clear the interrupt */ } if (status & SONIC_INT_TXDN) { @@ -300,11 +340,12 @@ static irqreturn_t sonic_interrupt(int irq, void *dev_id) int td_status; int freed_some = 0; - /* At this point, cur_tx is the index of a TD that is one of: - * unallocated/freed (status set & tx_skb[entry] clear) - * allocated and sent (status set & tx_skb[entry] set ) - * allocated and not yet sent (status clear & tx_skb[entry] set ) - * still being allocated by sonic_send_packet (status clear & tx_skb[entry] clear) + /* The state of a Transmit Descriptor may be inferred + * from { tx_skb[entry], td_status } as follows. + * { clear, clear } => the TD has never been used + * { set, clear } => the TD was handed to SONIC + * { set, set } => the TD was handed back + * { clear, set } => the TD is available for re-use */ netif_dbg(lp, intr, dev, "%s: tx done\n", __func__); @@ -313,18 +354,19 @@ static irqreturn_t sonic_interrupt(int irq, void *dev_id) if ((td_status = sonic_tda_get(dev, entry, SONIC_TD_STATUS)) == 0) break; - if (td_status & 0x0001) { + if (td_status & SONIC_TCR_PTX) { lp->stats.tx_packets++; lp->stats.tx_bytes += sonic_tda_get(dev, entry, SONIC_TD_PKTSIZE); } else { - lp->stats.tx_errors++; - if (td_status & 0x0642) + if (td_status & (SONIC_TCR_EXD | + SONIC_TCR_EXC | SONIC_TCR_BCM)) lp->stats.tx_aborted_errors++; - if (td_status & 0x0180) + if (td_status & + (SONIC_TCR_NCRS | SONIC_TCR_CRLS)) lp->stats.tx_carrier_errors++; - if (td_status & 0x0020) + if (td_status & SONIC_TCR_OWC) lp->stats.tx_window_errors++; - if (td_status & 0x0004) + if (td_status & SONIC_TCR_FU) lp->stats.tx_fifo_errors++; } @@ -346,7 +388,6 @@ static irqreturn_t sonic_interrupt(int irq, void *dev_id) if (freed_some || lp->tx_skb[entry] == NULL) netif_wake_queue(dev); /* The ring is no longer full */ lp->cur_tx = entry; - SONIC_WRITE(SONIC_ISR, SONIC_INT_TXDN); /* clear the interrupt */ } /* @@ -355,42 +396,37 @@ static irqreturn_t sonic_interrupt(int irq, void *dev_id) if (status & SONIC_INT_RFO) { netif_dbg(lp, rx_err, dev, "%s: rx fifo overrun\n", __func__); - lp->stats.rx_fifo_errors++; - SONIC_WRITE(SONIC_ISR, SONIC_INT_RFO); /* clear the interrupt */ } if (status & SONIC_INT_RDE) { netif_dbg(lp, rx_err, dev, "%s: rx descriptors exhausted\n", __func__); - lp->stats.rx_dropped++; - SONIC_WRITE(SONIC_ISR, SONIC_INT_RDE); /* clear the interrupt */ } if (status & SONIC_INT_RBAE) { netif_dbg(lp, rx_err, dev, "%s: rx buffer area exceeded\n", __func__); - lp->stats.rx_dropped++; - SONIC_WRITE(SONIC_ISR, SONIC_INT_RBAE); /* clear the interrupt */ } /* counter overruns; all counters are 16bit wide */ - if (status & SONIC_INT_FAE) { + if (status & SONIC_INT_FAE) lp->stats.rx_frame_errors += 65536; - SONIC_WRITE(SONIC_ISR, SONIC_INT_FAE); /* clear the interrupt */ - } - if (status & SONIC_INT_CRC) { + if (status & SONIC_INT_CRC) lp->stats.rx_crc_errors += 65536; - SONIC_WRITE(SONIC_ISR, SONIC_INT_CRC); /* clear the interrupt */ - } - if (status & SONIC_INT_MP) { + if (status & SONIC_INT_MP) lp->stats.rx_missed_errors += 65536; - SONIC_WRITE(SONIC_ISR, SONIC_INT_MP); /* clear the interrupt */ - } /* transmit error */ if (status & SONIC_INT_TXER) { - if (SONIC_READ(SONIC_TCR) & SONIC_TCR_FU) - netif_dbg(lp, tx_err, dev, "%s: tx fifo underrun\n", - __func__); - SONIC_WRITE(SONIC_ISR, SONIC_INT_TXER); /* clear the interrupt */ + u16 tcr = SONIC_READ(SONIC_TCR); + + netif_dbg(lp, tx_err, dev, "%s: TXER intr, TCR %04x\n", + __func__, tcr); + + if (tcr & (SONIC_TCR_EXD | SONIC_TCR_EXC | + SONIC_TCR_FU | SONIC_TCR_BCM)) { + /* Aborted transmission. Try again. */ + netif_stop_queue(dev); + SONIC_WRITE(SONIC_CMD, SONIC_CR_TXP); + } } /* bus retry */ @@ -400,107 +436,164 @@ static irqreturn_t sonic_interrupt(int irq, void *dev_id) /* ... to help debug DMA problems causing endless interrupts. */ /* Bounce the eth interface to turn on the interrupt again. */ SONIC_WRITE(SONIC_IMR, 0); - SONIC_WRITE(SONIC_ISR, SONIC_INT_BR); /* clear the interrupt */ } - /* load CAM done */ - if (status & SONIC_INT_LCD) - SONIC_WRITE(SONIC_ISR, SONIC_INT_LCD); /* clear the interrupt */ - } while((status = SONIC_READ(SONIC_ISR) & SONIC_IMR_DEFAULT)); + status = SONIC_READ(SONIC_ISR) & SONIC_IMR_DEFAULT; + } while (status); + + spin_unlock_irqrestore(&lp->lock, flags); + return IRQ_HANDLED; } +/* Return the array index corresponding to a given Receive Buffer pointer. */ +static int index_from_addr(struct sonic_local *lp, dma_addr_t addr, + unsigned int last) +{ + unsigned int i = last; + + do { + i = (i + 1) & SONIC_RRS_MASK; + if (addr == lp->rx_laddr[i]) + return i; + } while (i != last); + + return -ENOENT; +} + +/* Allocate and map a new skb to be used as a receive buffer. */ +static bool sonic_alloc_rb(struct net_device *dev, struct sonic_local *lp, + struct sk_buff **new_skb, dma_addr_t *new_addr) +{ + *new_skb = netdev_alloc_skb(dev, SONIC_RBSIZE + 2); + if (!*new_skb) + return false; + + if (SONIC_BUS_SCALE(lp->dma_bitmode) == 2) + skb_reserve(*new_skb, 2); + + *new_addr = dma_map_single(lp->device, skb_put(*new_skb, SONIC_RBSIZE), + SONIC_RBSIZE, DMA_FROM_DEVICE); + if (!*new_addr) { + dev_kfree_skb(*new_skb); + *new_skb = NULL; + return false; + } + + return true; +} + +/* Place a new receive resource in the Receive Resource Area and update RWP. */ +static void sonic_update_rra(struct net_device *dev, struct sonic_local *lp, + dma_addr_t old_addr, dma_addr_t new_addr) +{ + unsigned int entry = sonic_rr_entry(dev, SONIC_READ(SONIC_RWP)); + unsigned int end = sonic_rr_entry(dev, SONIC_READ(SONIC_RRP)); + u32 buf; + + /* The resources in the range [RRP, RWP) belong to the SONIC. This loop + * scans the other resources in the RRA, those in the range [RWP, RRP). + */ + do { + buf = (sonic_rra_get(dev, entry, SONIC_RR_BUFADR_H) << 16) | + sonic_rra_get(dev, entry, SONIC_RR_BUFADR_L); + + if (buf == old_addr) + break; + + entry = (entry + 1) & SONIC_RRS_MASK; + } while (entry != end); + + WARN_ONCE(buf != old_addr, "failed to find resource!\n"); + + sonic_rra_put(dev, entry, SONIC_RR_BUFADR_H, new_addr >> 16); + sonic_rra_put(dev, entry, SONIC_RR_BUFADR_L, new_addr & 0xffff); + + entry = (entry + 1) & SONIC_RRS_MASK; + + SONIC_WRITE(SONIC_RWP, sonic_rr_addr(dev, entry)); +} + /* * We have a good packet(s), pass it/them up the network stack. */ static void sonic_rx(struct net_device *dev) { struct sonic_local *lp = netdev_priv(dev); - int status; int entry = lp->cur_rx; + int prev_entry = lp->eol_rx; + bool rbe = false; while (sonic_rda_get(dev, entry, SONIC_RD_IN_USE) == 0) { - struct sk_buff *used_skb; - struct sk_buff *new_skb; - dma_addr_t new_laddr; - u16 bufadr_l; - u16 bufadr_h; - int pkt_len; - - status = sonic_rda_get(dev, entry, SONIC_RD_STATUS); - if (status & SONIC_RCR_PRX) { - /* Malloc up new buffer. */ - new_skb = netdev_alloc_skb(dev, SONIC_RBSIZE + 2); - if (new_skb == NULL) { - lp->stats.rx_dropped++; + u16 status = sonic_rda_get(dev, entry, SONIC_RD_STATUS); + + /* If the RD has LPKT set, the chip has finished with the RB */ + if ((status & SONIC_RCR_PRX) && (status & SONIC_RCR_LPKT)) { + struct sk_buff *new_skb; + dma_addr_t new_laddr; + u32 addr = (sonic_rda_get(dev, entry, + SONIC_RD_PKTPTR_H) << 16) | + sonic_rda_get(dev, entry, SONIC_RD_PKTPTR_L); + int i = index_from_addr(lp, addr, entry); + + if (i < 0) { + WARN_ONCE(1, "failed to find buffer!\n"); break; } - /* provide 16 byte IP header alignment unless DMA requires otherwise */ - if(SONIC_BUS_SCALE(lp->dma_bitmode) == 2) - skb_reserve(new_skb, 2); - - new_laddr = dma_map_single(lp->device, skb_put(new_skb, SONIC_RBSIZE), - SONIC_RBSIZE, DMA_FROM_DEVICE); - if (!new_laddr) { - dev_kfree_skb(new_skb); - printk(KERN_ERR "%s: Failed to map rx buffer, dropping packet.\n", dev->name); + + if (sonic_alloc_rb(dev, lp, &new_skb, &new_laddr)) { + struct sk_buff *used_skb = lp->rx_skb[i]; + int pkt_len; + + /* Pass the used buffer up the stack */ + dma_unmap_single(lp->device, addr, SONIC_RBSIZE, + DMA_FROM_DEVICE); + + pkt_len = sonic_rda_get(dev, entry, + SONIC_RD_PKTLEN); + skb_trim(used_skb, pkt_len); + used_skb->protocol = eth_type_trans(used_skb, + dev); + netif_rx(used_skb); + lp->stats.rx_packets++; + lp->stats.rx_bytes += pkt_len; + + lp->rx_skb[i] = new_skb; + lp->rx_laddr[i] = new_laddr; + } else { + /* Failed to obtain a new buffer so re-use it */ + new_laddr = addr; lp->stats.rx_dropped++; - break; } - - /* now we have a new skb to replace it, pass the used one up the stack */ - dma_unmap_single(lp->device, lp->rx_laddr[entry], SONIC_RBSIZE, DMA_FROM_DEVICE); - used_skb = lp->rx_skb[entry]; - pkt_len = sonic_rda_get(dev, entry, SONIC_RD_PKTLEN); - skb_trim(used_skb, pkt_len); - used_skb->protocol = eth_type_trans(used_skb, dev); - netif_rx(used_skb); - lp->stats.rx_packets++; - lp->stats.rx_bytes += pkt_len; - - /* and insert the new skb */ - lp->rx_laddr[entry] = new_laddr; - lp->rx_skb[entry] = new_skb; - - bufadr_l = (unsigned long)new_laddr & 0xffff; - bufadr_h = (unsigned long)new_laddr >> 16; - sonic_rra_put(dev, entry, SONIC_RR_BUFADR_L, bufadr_l); - sonic_rra_put(dev, entry, SONIC_RR_BUFADR_H, bufadr_h); - } else { - /* This should only happen, if we enable accepting broken packets. */ - lp->stats.rx_errors++; - if (status & SONIC_RCR_FAER) - lp->stats.rx_frame_errors++; - if (status & SONIC_RCR_CRCR) - lp->stats.rx_crc_errors++; - } - if (status & SONIC_RCR_LPKT) { - /* - * this was the last packet out of the current receive buffer - * give the buffer back to the SONIC + /* If RBE is already asserted when RWP advances then + * it's safe to clear RBE after processing this packet. */ - lp->cur_rwp += SIZEOF_SONIC_RR * SONIC_BUS_SCALE(lp->dma_bitmode); - if (lp->cur_rwp >= lp->rra_end) lp->cur_rwp = lp->rra_laddr & 0xffff; - SONIC_WRITE(SONIC_RWP, lp->cur_rwp); - if (SONIC_READ(SONIC_ISR) & SONIC_INT_RBE) { - netif_dbg(lp, rx_err, dev, "%s: rx buffer exhausted\n", - __func__); - SONIC_WRITE(SONIC_ISR, SONIC_INT_RBE); /* clear the flag */ - } - } else - printk(KERN_ERR "%s: rx desc without RCR_LPKT. Shouldn't happen !?\n", - dev->name); + rbe = rbe || SONIC_READ(SONIC_ISR) & SONIC_INT_RBE; + sonic_update_rra(dev, lp, addr, new_laddr); + } /* * give back the descriptor */ - sonic_rda_put(dev, entry, SONIC_RD_LINK, - sonic_rda_get(dev, entry, SONIC_RD_LINK) | SONIC_EOL); + sonic_rda_put(dev, entry, SONIC_RD_STATUS, 0); sonic_rda_put(dev, entry, SONIC_RD_IN_USE, 1); - sonic_rda_put(dev, lp->eol_rx, SONIC_RD_LINK, - sonic_rda_get(dev, lp->eol_rx, SONIC_RD_LINK) & ~SONIC_EOL); - lp->eol_rx = entry; - lp->cur_rx = entry = (entry + 1) & SONIC_RDS_MASK; + + prev_entry = entry; + entry = (entry + 1) & SONIC_RDS_MASK; + } + + lp->cur_rx = entry; + + if (prev_entry != lp->eol_rx) { + /* Advance the EOL flag to put descriptors back into service */ + sonic_rda_put(dev, prev_entry, SONIC_RD_LINK, SONIC_EOL | + sonic_rda_get(dev, prev_entry, SONIC_RD_LINK)); + sonic_rda_put(dev, lp->eol_rx, SONIC_RD_LINK, ~SONIC_EOL & + sonic_rda_get(dev, lp->eol_rx, SONIC_RD_LINK)); + lp->eol_rx = prev_entry; } + + if (rbe) + SONIC_WRITE(SONIC_ISR, SONIC_INT_RBE); /* * If any worth-while packets have been received, netif_rx() * has done a mark_bh(NET_BH) for us and will work on them @@ -550,6 +643,8 @@ static void sonic_multicast_list(struct net_device *dev) (netdev_mc_count(dev) > 15)) { rcr |= SONIC_RCR_AMC; } else { + unsigned long flags; + netif_dbg(lp, ifup, dev, "%s: mc_count %d\n", __func__, netdev_mc_count(dev)); sonic_set_cam_enable(dev, 1); /* always enable our own address */ @@ -563,9 +658,14 @@ static void sonic_multicast_list(struct net_device *dev) i++; } SONIC_WRITE(SONIC_CDC, 16); - /* issue Load CAM command */ SONIC_WRITE(SONIC_CDP, lp->cda_laddr & 0xffff); + + /* LCAM and TXP commands can't be used simultaneously */ + spin_lock_irqsave(&lp->lock, flags); + sonic_quiesce(dev, SONIC_CR_TXP); SONIC_WRITE(SONIC_CMD, SONIC_CR_LCAM); + sonic_quiesce(dev, SONIC_CR_LCAM); + spin_unlock_irqrestore(&lp->lock, flags); } } @@ -580,7 +680,6 @@ static void sonic_multicast_list(struct net_device *dev) */ static int sonic_init(struct net_device *dev) { - unsigned int cmd; struct sonic_local *lp = netdev_priv(dev); int i; @@ -592,12 +691,16 @@ static int sonic_init(struct net_device *dev) SONIC_WRITE(SONIC_ISR, 0x7fff); SONIC_WRITE(SONIC_CMD, SONIC_CR_RST); + /* While in reset mode, clear CAM Enable register */ + SONIC_WRITE(SONIC_CE, 0); + /* * clear software reset flag, disable receiver, clear and * enable interrupts, then completely initialize the SONIC */ SONIC_WRITE(SONIC_CMD, 0); - SONIC_WRITE(SONIC_CMD, SONIC_CR_RXDIS); + SONIC_WRITE(SONIC_CMD, SONIC_CR_RXDIS | SONIC_CR_STP); + sonic_quiesce(dev, SONIC_CR_ALL); /* * initialize the receive resource area @@ -615,15 +718,10 @@ static int sonic_init(struct net_device *dev) } /* initialize all RRA registers */ - lp->rra_end = (lp->rra_laddr + SONIC_NUM_RRS * SIZEOF_SONIC_RR * - SONIC_BUS_SCALE(lp->dma_bitmode)) & 0xffff; - lp->cur_rwp = (lp->rra_laddr + (SONIC_NUM_RRS - 1) * SIZEOF_SONIC_RR * - SONIC_BUS_SCALE(lp->dma_bitmode)) & 0xffff; - - SONIC_WRITE(SONIC_RSA, lp->rra_laddr & 0xffff); - SONIC_WRITE(SONIC_REA, lp->rra_end); - SONIC_WRITE(SONIC_RRP, lp->rra_laddr & 0xffff); - SONIC_WRITE(SONIC_RWP, lp->cur_rwp); + SONIC_WRITE(SONIC_RSA, sonic_rr_addr(dev, 0)); + SONIC_WRITE(SONIC_REA, sonic_rr_addr(dev, SONIC_NUM_RRS)); + SONIC_WRITE(SONIC_RRP, sonic_rr_addr(dev, 0)); + SONIC_WRITE(SONIC_RWP, sonic_rr_addr(dev, SONIC_NUM_RRS - 1)); SONIC_WRITE(SONIC_URRA, lp->rra_laddr >> 16); SONIC_WRITE(SONIC_EOBC, (SONIC_RBSIZE >> 1) - (lp->dma_bitmode ? 2 : 1)); @@ -631,14 +729,7 @@ static int sonic_init(struct net_device *dev) netif_dbg(lp, ifup, dev, "%s: issuing RRRA command\n", __func__); SONIC_WRITE(SONIC_CMD, SONIC_CR_RRRA); - i = 0; - while (i++ < 100) { - if (SONIC_READ(SONIC_CMD) & SONIC_CR_RRRA) - break; - } - - netif_dbg(lp, ifup, dev, "%s: status=%x, i=%d\n", __func__, - SONIC_READ(SONIC_CMD), i); + sonic_quiesce(dev, SONIC_CR_RRRA); /* * Initialize the receive descriptors so that they @@ -713,28 +804,17 @@ static int sonic_init(struct net_device *dev) * load the CAM */ SONIC_WRITE(SONIC_CMD, SONIC_CR_LCAM); - - i = 0; - while (i++ < 100) { - if (SONIC_READ(SONIC_ISR) & SONIC_INT_LCD) - break; - } - netif_dbg(lp, ifup, dev, "%s: CMD=%x, ISR=%x, i=%d\n", __func__, - SONIC_READ(SONIC_CMD), SONIC_READ(SONIC_ISR), i); + sonic_quiesce(dev, SONIC_CR_LCAM); /* * enable receiver, disable loopback * and enable all interrupts */ - SONIC_WRITE(SONIC_CMD, SONIC_CR_RXEN | SONIC_CR_STP); SONIC_WRITE(SONIC_RCR, SONIC_RCR_DEFAULT); SONIC_WRITE(SONIC_TCR, SONIC_TCR_DEFAULT); SONIC_WRITE(SONIC_ISR, 0x7fff); SONIC_WRITE(SONIC_IMR, SONIC_IMR_DEFAULT); - - cmd = SONIC_READ(SONIC_CMD); - if ((cmd & SONIC_CR_RXEN) == 0 || (cmd & SONIC_CR_STP) == 0) - printk(KERN_ERR "sonic_init: failed, status=%x\n", cmd); + SONIC_WRITE(SONIC_CMD, SONIC_CR_RXEN); netif_dbg(lp, ifup, dev, "%s: new status=%x\n", __func__, SONIC_READ(SONIC_CMD)); diff --git a/drivers/net/ethernet/natsemi/sonic.h b/drivers/net/ethernet/natsemi/sonic.h index f1544481aac1..e0e4cba6f6f6 100644 --- a/drivers/net/ethernet/natsemi/sonic.h +++ b/drivers/net/ethernet/natsemi/sonic.h @@ -110,6 +110,9 @@ #define SONIC_CR_TXP 0x0002 #define SONIC_CR_HTX 0x0001 +#define SONIC_CR_ALL (SONIC_CR_LCAM | SONIC_CR_RRRA | \ + SONIC_CR_RXEN | SONIC_CR_TXP) + /* * SONIC data configuration bits */ @@ -175,6 +178,7 @@ #define SONIC_TCR_NCRS 0x0100 #define SONIC_TCR_CRLS 0x0080 #define SONIC_TCR_EXC 0x0040 +#define SONIC_TCR_OWC 0x0020 #define SONIC_TCR_PMB 0x0008 #define SONIC_TCR_FU 0x0004 #define SONIC_TCR_BCM 0x0002 @@ -274,8 +278,9 @@ #define SONIC_NUM_RDS SONIC_NUM_RRS /* number of receive descriptors */ #define SONIC_NUM_TDS 16 /* number of transmit descriptors */ -#define SONIC_RDS_MASK (SONIC_NUM_RDS-1) -#define SONIC_TDS_MASK (SONIC_NUM_TDS-1) +#define SONIC_RRS_MASK (SONIC_NUM_RRS - 1) +#define SONIC_RDS_MASK (SONIC_NUM_RDS - 1) +#define SONIC_TDS_MASK (SONIC_NUM_TDS - 1) #define SONIC_RBSIZE 1520 /* size of one resource buffer */ @@ -312,8 +317,6 @@ struct sonic_local { u32 rda_laddr; /* logical DMA address of RDA */ dma_addr_t rx_laddr[SONIC_NUM_RRS]; /* logical DMA addresses of rx skbuffs */ dma_addr_t tx_laddr[SONIC_NUM_TDS]; /* logical DMA addresses of tx skbuffs */ - unsigned int rra_end; - unsigned int cur_rwp; unsigned int cur_rx; unsigned int cur_tx; /* first unacked transmit packet */ unsigned int eol_rx; @@ -322,6 +325,7 @@ struct sonic_local { int msg_enable; struct device *device; /* generic device */ struct net_device_stats stats; + spinlock_t lock; }; #define TX_TIMEOUT (3 * HZ) @@ -344,30 +348,30 @@ static void sonic_msg_init(struct net_device *dev); as far as we can tell. */ /* OpenBSD calls this "SWO". I'd like to think that sonic_buf_put() is a much better name. */ -static inline void sonic_buf_put(void* base, int bitmode, +static inline void sonic_buf_put(u16 *base, int bitmode, int offset, __u16 val) { if (bitmode) #ifdef __BIG_ENDIAN - ((__u16 *) base + (offset*2))[1] = val; + __raw_writew(val, base + (offset * 2) + 1); #else - ((__u16 *) base + (offset*2))[0] = val; + __raw_writew(val, base + (offset * 2) + 0); #endif else - ((__u16 *) base)[offset] = val; + __raw_writew(val, base + (offset * 1) + 0); } -static inline __u16 sonic_buf_get(void* base, int bitmode, +static inline __u16 sonic_buf_get(u16 *base, int bitmode, int offset) { if (bitmode) #ifdef __BIG_ENDIAN - return ((volatile __u16 *) base + (offset*2))[1]; + return __raw_readw(base + (offset * 2) + 1); #else - return ((volatile __u16 *) base + (offset*2))[0]; + return __raw_readw(base + (offset * 2) + 0); #endif else - return ((volatile __u16 *) base)[offset]; + return __raw_readw(base + (offset * 1) + 0); } /* Inlines that you should actually use for reading/writing DMA buffers */ @@ -447,6 +451,22 @@ static inline __u16 sonic_rra_get(struct net_device* dev, int entry, (entry * SIZEOF_SONIC_RR) + offset); } +static inline u16 sonic_rr_addr(struct net_device *dev, int entry) +{ + struct sonic_local *lp = netdev_priv(dev); + + return lp->rra_laddr + + entry * SIZEOF_SONIC_RR * SONIC_BUS_SCALE(lp->dma_bitmode); +} + +static inline u16 sonic_rr_entry(struct net_device *dev, u16 addr) +{ + struct sonic_local *lp = netdev_priv(dev); + + return (addr - (u16)lp->rra_laddr) / (SIZEOF_SONIC_RR * + SONIC_BUS_SCALE(lp->dma_bitmode)); +} + static const char version[] = "sonic.c:v0.92 20.9.98 tsbogend@alpha.franken.de\n"; diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_init.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_init.c index a496390b8632..07f9067affc6 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_init.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_init.c @@ -2043,6 +2043,7 @@ static void qlcnic_83xx_exec_template_cmd(struct qlcnic_adapter *p_dev, break; } entry += p_hdr->size; + cond_resched(); } p_dev->ahw->reset.seq_index = index; } diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c index afa10a163da1..f34ae8c75bc5 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c @@ -703,6 +703,7 @@ static u32 qlcnic_read_memory_test_agent(struct qlcnic_adapter *adapter, addr += 16; reg_read -= 16; ret += 16; + cond_resched(); } out: mutex_unlock(&adapter->ahw->mem_lock); @@ -1383,6 +1384,7 @@ int qlcnic_dump_fw(struct qlcnic_adapter *adapter) buf_offset += entry->hdr.cap_size; entry_offset += entry->hdr.offset; buffer = fw_dump->data + buf_offset; + cond_resched(); } fw_dump->clr = 1; diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index 7a5fe11378aa..aaa316be6183 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -85,7 +85,6 @@ #define RTL_R16(tp, reg) readw(tp->mmio_addr + (reg)) #define RTL_R32(tp, reg) readl(tp->mmio_addr + (reg)) -#define JUMBO_1K ETH_DATA_LEN #define JUMBO_4K (4*1024 - ETH_HLEN - 2) #define JUMBO_6K (6*1024 - ETH_HLEN - 2) #define JUMBO_7K (7*1024 - ETH_HLEN - 2) @@ -1494,7 +1493,7 @@ static netdev_features_t rtl8169_fix_features(struct net_device *dev, if (dev->mtu > TD_MSS_MAX) features &= ~NETIF_F_ALL_TSO; - if (dev->mtu > JUMBO_1K && + if (dev->mtu > ETH_DATA_LEN && tp->mac_version > RTL_GIGA_MAC_VER_06) features &= ~(NETIF_F_CSUM_MASK | NETIF_F_ALL_TSO); @@ -5360,7 +5359,7 @@ static int rtl_jumbo_max(struct rtl8169_private *tp) { /* Non-GBit versions don't support jumbo frames */ if (!tp->supports_gmii) - return JUMBO_1K; + return 0; switch (tp->mac_version) { /* RTL8169 */ @@ -5591,10 +5590,9 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) dev->hw_features |= NETIF_F_RXALL; dev->hw_features |= NETIF_F_RXFCS; - /* MTU range: 60 - hw-specific max */ - dev->min_mtu = ETH_ZLEN; jumbo_max = rtl_jumbo_max(tp); - dev->max_mtu = jumbo_max; + if (jumbo_max) + dev->max_mtu = jumbo_max; rtl_set_irq_mask(tp); @@ -5624,7 +5622,7 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) (RTL_R32(tp, TxConfig) >> 20) & 0xfcf, pci_irq_vector(pdev, 0)); - if (jumbo_max > JUMBO_1K) + if (jumbo_max) netif_info(tp, probe, dev, "jumbo features [frames: %d bytes, tx checksumming: %s]\n", jumbo_max, tp->mac_version <= RTL_GIGA_MAC_VER_06 ? diff --git a/drivers/net/ethernet/sfc/Makefile b/drivers/net/ethernet/sfc/Makefile index 890fd65caa2d..87d093da22ca 100644 --- a/drivers/net/ethernet/sfc/Makefile +++ b/drivers/net/ethernet/sfc/Makefile @@ -4,7 +4,7 @@ sfc-y += efx.o efx_common.o efx_channels.o nic.o \ tx.o tx_common.o tx_tso.o rx.o rx_common.o \ selftest.o ethtool.o ethtool_common.o ptp.o \ mcdi.o mcdi_port.o mcdi_port_common.o \ - mcdi_functions.o mcdi_mon.o + mcdi_functions.o mcdi_filters.o mcdi_mon.o sfc-$(CONFIG_SFC_MTD) += mtd.o sfc-$(CONFIG_SFC_SRIOV) += sriov.o siena_sriov.o ef10_sriov.o diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c index 4997f61de3d6..397f3bc7da3b 100644 --- a/drivers/net/ethernet/sfc/ef10.c +++ b/drivers/net/ethernet/sfc/ef10.c @@ -13,6 +13,7 @@ #include "mcdi_port_common.h" #include "mcdi_functions.h" #include "nic.h" +#include "mcdi_filters.h" #include "workarounds.h" #include "selftest.h" #include "ef10_sriov.h" @@ -28,28 +29,6 @@ enum { EFX_EF10_TEST = 1, EFX_EF10_REFILL, }; -/* The maximum size of a shared RSS context */ -/* TODO: this should really be from the mcdi protocol export */ -#define EFX_EF10_MAX_SHARED_RSS_CONTEXT_SIZE 64UL - -/* The filter table(s) are managed by firmware and we have write-only - * access. When removing filters we must identify them to the - * firmware by a 64-bit handle, but this is too wide for Linux kernel - * interfaces (32-bit for RX NFC, 16-bit for RFS). Also, we need to - * be able to tell in advance whether a requested insertion will - * replace an existing filter. Therefore we maintain a software hash - * table, which should be at least as large as the hardware hash - * table. - * - * Huntington has a single 8K filter table shared between all filter - * types and both ports. - */ -#define HUNT_FILTER_TBL_ROWS 8192 - -#define EFX_EF10_FILTER_ID_INVALID 0xffff - -#define EFX_EF10_FILTER_DEV_UC_MAX 32 -#define EFX_EF10_FILTER_DEV_MC_MAX 256 /* VLAN list entry */ struct efx_ef10_vlan { @@ -57,95 +36,8 @@ struct efx_ef10_vlan { u16 vid; }; -enum efx_ef10_default_filters { - EFX_EF10_BCAST, - EFX_EF10_UCDEF, - EFX_EF10_MCDEF, - EFX_EF10_VXLAN4_UCDEF, - EFX_EF10_VXLAN4_MCDEF, - EFX_EF10_VXLAN6_UCDEF, - EFX_EF10_VXLAN6_MCDEF, - EFX_EF10_NVGRE4_UCDEF, - EFX_EF10_NVGRE4_MCDEF, - EFX_EF10_NVGRE6_UCDEF, - EFX_EF10_NVGRE6_MCDEF, - EFX_EF10_GENEVE4_UCDEF, - EFX_EF10_GENEVE4_MCDEF, - EFX_EF10_GENEVE6_UCDEF, - EFX_EF10_GENEVE6_MCDEF, - - EFX_EF10_NUM_DEFAULT_FILTERS -}; - -/* Per-VLAN filters information */ -struct efx_ef10_filter_vlan { - struct list_head list; - u16 vid; - u16 uc[EFX_EF10_FILTER_DEV_UC_MAX]; - u16 mc[EFX_EF10_FILTER_DEV_MC_MAX]; - u16 default_filters[EFX_EF10_NUM_DEFAULT_FILTERS]; -}; - -struct efx_ef10_dev_addr { - u8 addr[ETH_ALEN]; -}; - -struct efx_ef10_filter_table { -/* The MCDI match masks supported by this fw & hw, in order of priority */ - u32 rx_match_mcdi_flags[ - MC_CMD_GET_PARSER_DISP_INFO_OUT_SUPPORTED_MATCHES_MAXNUM * 2]; - unsigned int rx_match_count; - - struct rw_semaphore lock; /* Protects entries */ - struct { - unsigned long spec; /* pointer to spec plus flag bits */ -/* AUTO_OLD is used to mark and sweep MAC filters for the device address lists. */ -/* unused flag 1UL */ -#define EFX_EF10_FILTER_FLAG_AUTO_OLD 2UL -#define EFX_EF10_FILTER_FLAGS 3UL - u64 handle; /* firmware handle */ - } *entry; -/* Shadow of net_device address lists, guarded by mac_lock */ - struct efx_ef10_dev_addr dev_uc_list[EFX_EF10_FILTER_DEV_UC_MAX]; - struct efx_ef10_dev_addr dev_mc_list[EFX_EF10_FILTER_DEV_MC_MAX]; - int dev_uc_count; - int dev_mc_count; - bool uc_promisc; - bool mc_promisc; -/* Whether in multicast promiscuous mode when last changed */ - bool mc_promisc_last; - bool mc_overflow; /* Too many MC addrs; should always imply mc_promisc */ - bool vlan_filter; - struct list_head vlan_list; -}; - -/* An arbitrary search limit for the software hash table */ -#define EFX_EF10_FILTER_SEARCH_LIMIT 200 - -static void efx_ef10_rx_free_indir_table(struct efx_nic *efx); -static void efx_ef10_filter_table_remove(struct efx_nic *efx); -static int efx_ef10_filter_add_vlan(struct efx_nic *efx, u16 vid); -static void efx_ef10_filter_del_vlan_internal(struct efx_nic *efx, - struct efx_ef10_filter_vlan *vlan); -static void efx_ef10_filter_del_vlan(struct efx_nic *efx, u16 vid); static int efx_ef10_set_udp_tnl_ports(struct efx_nic *efx, bool unloading); -static u32 efx_ef10_filter_get_unsafe_id(u32 filter_id) -{ - WARN_ON_ONCE(filter_id == EFX_EF10_FILTER_ID_INVALID); - return filter_id & (HUNT_FILTER_TBL_ROWS - 1); -} - -static unsigned int efx_ef10_filter_get_unsafe_pri(u32 filter_id) -{ - return filter_id / (HUNT_FILTER_TBL_ROWS * 2); -} - -static u32 efx_ef10_make_filter_id(unsigned int pri, u16 idx) -{ - return pri * HUNT_FILTER_TBL_ROWS * 2 + idx; -} - static int efx_ef10_get_warm_boot_count(struct efx_nic *efx) { efx_dword_t reg; @@ -546,7 +438,7 @@ static int efx_ef10_add_vlan(struct efx_nic *efx, u16 vid) if (efx->filter_state) { mutex_lock(&efx->mac_lock); down_write(&efx->filter_sem); - rc = efx_ef10_filter_add_vlan(efx, vlan->vid); + rc = efx_mcdi_filter_add_vlan(efx, vlan->vid); up_write(&efx->filter_sem); mutex_unlock(&efx->mac_lock); if (rc) @@ -575,7 +467,7 @@ static void efx_ef10_del_vlan_internal(struct efx_nic *efx, if (efx->filter_state) { down_write(&efx->filter_sem); - efx_ef10_filter_del_vlan(efx, vlan->vid); + efx_mcdi_filter_del_vlan(efx, vlan->vid); up_write(&efx->filter_sem); } @@ -1038,7 +930,7 @@ static void efx_ef10_remove(struct efx_nic *efx) efx_mcdi_mon_remove(efx); - efx_ef10_rx_free_indir_table(efx); + efx_mcdi_rx_free_indir_table(efx); if (nic_data->wc_membase) iounmap(nic_data->wc_membase); @@ -1426,7 +1318,7 @@ static int efx_ef10_init_nic(struct efx_nic *efx) return 0; } -static void efx_ef10_reset_mc_allocations(struct efx_nic *efx) +static void efx_ef10_table_reset_mc_allocations(struct efx_nic *efx) { struct efx_ef10_nic_data *nic_data = efx->nic_data; #ifdef CONFIG_SFC_SRIOV @@ -1507,7 +1399,7 @@ static int efx_ef10_reset(struct efx_nic *efx, enum reset_type reset_type) */ if ((reset_type == RESET_TYPE_ALL || reset_type == RESET_TYPE_MCDI_TIMEOUT) && !rc) - efx_ef10_reset_mc_allocations(efx); + efx_ef10_table_reset_mc_allocations(efx); return rc; } @@ -2123,7 +2015,7 @@ static void efx_ef10_mcdi_reboot_detected(struct efx_nic *efx) struct efx_ef10_nic_data *nic_data = efx->nic_data; /* All our allocations have been reset */ - efx_ef10_reset_mc_allocations(efx); + efx_ef10_table_reset_mc_allocations(efx); /* The datapath firmware might have been changed */ nic_data->must_check_datapath_caps = true; @@ -2497,442 +2389,6 @@ static void efx_ef10_tx_write(struct efx_tx_queue *tx_queue) } } -#define RSS_MODE_HASH_ADDRS (1 << RSS_MODE_HASH_SRC_ADDR_LBN |\ - 1 << RSS_MODE_HASH_DST_ADDR_LBN) -#define RSS_MODE_HASH_PORTS (1 << RSS_MODE_HASH_SRC_PORT_LBN |\ - 1 << RSS_MODE_HASH_DST_PORT_LBN) -#define RSS_CONTEXT_FLAGS_DEFAULT (1 << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_TOEPLITZ_IPV4_EN_LBN |\ - 1 << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_TOEPLITZ_TCPV4_EN_LBN |\ - 1 << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_TOEPLITZ_IPV6_EN_LBN |\ - 1 << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_TOEPLITZ_TCPV6_EN_LBN |\ - (RSS_MODE_HASH_ADDRS | RSS_MODE_HASH_PORTS) << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_TCP_IPV4_RSS_MODE_LBN |\ - RSS_MODE_HASH_ADDRS << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_UDP_IPV4_RSS_MODE_LBN |\ - RSS_MODE_HASH_ADDRS << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_OTHER_IPV4_RSS_MODE_LBN |\ - (RSS_MODE_HASH_ADDRS | RSS_MODE_HASH_PORTS) << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_TCP_IPV6_RSS_MODE_LBN |\ - RSS_MODE_HASH_ADDRS << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_UDP_IPV6_RSS_MODE_LBN |\ - RSS_MODE_HASH_ADDRS << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_OTHER_IPV6_RSS_MODE_LBN) - -static int efx_ef10_get_rss_flags(struct efx_nic *efx, u32 context, u32 *flags) -{ - /* Firmware had a bug (sfc bug 61952) where it would not actually - * fill in the flags field in the response to MC_CMD_RSS_CONTEXT_GET_FLAGS. - * This meant that it would always contain whatever was previously - * in the MCDI buffer. Fortunately, all firmware versions with - * this bug have the same default flags value for a newly-allocated - * RSS context, and the only time we want to get the flags is just - * after allocating. Moreover, the response has a 32-bit hole - * where the context ID would be in the request, so we can use an - * overlength buffer in the request and pre-fill the flags field - * with what we believe the default to be. Thus if the firmware - * has the bug, it will leave our pre-filled value in the flags - * field of the response, and we will get the right answer. - * - * However, this does mean that this function should NOT be used if - * the RSS context flags might not be their defaults - it is ONLY - * reliably correct for a newly-allocated RSS context. - */ - MCDI_DECLARE_BUF(inbuf, MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_LEN); - MCDI_DECLARE_BUF(outbuf, MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_LEN); - size_t outlen; - int rc; - - /* Check we have a hole for the context ID */ - BUILD_BUG_ON(MC_CMD_RSS_CONTEXT_GET_FLAGS_IN_LEN != MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_FLAGS_OFST); - MCDI_SET_DWORD(inbuf, RSS_CONTEXT_GET_FLAGS_IN_RSS_CONTEXT_ID, context); - MCDI_SET_DWORD(inbuf, RSS_CONTEXT_GET_FLAGS_OUT_FLAGS, - RSS_CONTEXT_FLAGS_DEFAULT); - rc = efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_GET_FLAGS, inbuf, - sizeof(inbuf), outbuf, sizeof(outbuf), &outlen); - if (rc == 0) { - if (outlen < MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_LEN) - rc = -EIO; - else - *flags = MCDI_DWORD(outbuf, RSS_CONTEXT_GET_FLAGS_OUT_FLAGS); - } - return rc; -} - -/* Attempt to enable 4-tuple UDP hashing on the specified RSS context. - * If we fail, we just leave the RSS context at its default hash settings, - * which is safe but may slightly reduce performance. - * Defaults are 4-tuple for TCP and 2-tuple for UDP and other-IP, so we - * just need to set the UDP ports flags (for both IP versions). - */ -static void efx_ef10_set_rss_flags(struct efx_nic *efx, - struct efx_rss_context *ctx) -{ - MCDI_DECLARE_BUF(inbuf, MC_CMD_RSS_CONTEXT_SET_FLAGS_IN_LEN); - u32 flags; - - BUILD_BUG_ON(MC_CMD_RSS_CONTEXT_SET_FLAGS_OUT_LEN != 0); - - if (efx_ef10_get_rss_flags(efx, ctx->context_id, &flags) != 0) - return; - MCDI_SET_DWORD(inbuf, RSS_CONTEXT_SET_FLAGS_IN_RSS_CONTEXT_ID, - ctx->context_id); - flags |= RSS_MODE_HASH_PORTS << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_UDP_IPV4_RSS_MODE_LBN; - flags |= RSS_MODE_HASH_PORTS << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_UDP_IPV6_RSS_MODE_LBN; - MCDI_SET_DWORD(inbuf, RSS_CONTEXT_SET_FLAGS_IN_FLAGS, flags); - if (!efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_SET_FLAGS, inbuf, sizeof(inbuf), - NULL, 0, NULL)) - /* Succeeded, so UDP 4-tuple is now enabled */ - ctx->rx_hash_udp_4tuple = true; -} - -static int efx_ef10_alloc_rss_context(struct efx_nic *efx, bool exclusive, - struct efx_rss_context *ctx, - unsigned *context_size) -{ - MCDI_DECLARE_BUF(inbuf, MC_CMD_RSS_CONTEXT_ALLOC_IN_LEN); - MCDI_DECLARE_BUF(outbuf, MC_CMD_RSS_CONTEXT_ALLOC_OUT_LEN); - struct efx_ef10_nic_data *nic_data = efx->nic_data; - size_t outlen; - int rc; - u32 alloc_type = exclusive ? - MC_CMD_RSS_CONTEXT_ALLOC_IN_TYPE_EXCLUSIVE : - MC_CMD_RSS_CONTEXT_ALLOC_IN_TYPE_SHARED; - unsigned rss_spread = exclusive ? - efx->rss_spread : - min(rounddown_pow_of_two(efx->rss_spread), - EFX_EF10_MAX_SHARED_RSS_CONTEXT_SIZE); - - if (!exclusive && rss_spread == 1) { - ctx->context_id = EFX_MCDI_RSS_CONTEXT_INVALID; - if (context_size) - *context_size = 1; - return 0; - } - - if (nic_data->datapath_caps & - 1 << MC_CMD_GET_CAPABILITIES_OUT_RX_RSS_LIMITED_LBN) - return -EOPNOTSUPP; - - MCDI_SET_DWORD(inbuf, RSS_CONTEXT_ALLOC_IN_UPSTREAM_PORT_ID, - nic_data->vport_id); - MCDI_SET_DWORD(inbuf, RSS_CONTEXT_ALLOC_IN_TYPE, alloc_type); - MCDI_SET_DWORD(inbuf, RSS_CONTEXT_ALLOC_IN_NUM_QUEUES, rss_spread); - - rc = efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_ALLOC, inbuf, sizeof(inbuf), - outbuf, sizeof(outbuf), &outlen); - if (rc != 0) - return rc; - - if (outlen < MC_CMD_RSS_CONTEXT_ALLOC_OUT_LEN) - return -EIO; - - ctx->context_id = MCDI_DWORD(outbuf, RSS_CONTEXT_ALLOC_OUT_RSS_CONTEXT_ID); - - if (context_size) - *context_size = rss_spread; - - if (nic_data->datapath_caps & - 1 << MC_CMD_GET_CAPABILITIES_OUT_ADDITIONAL_RSS_MODES_LBN) - efx_ef10_set_rss_flags(efx, ctx); - - return 0; -} - -static int efx_ef10_free_rss_context(struct efx_nic *efx, u32 context) -{ - MCDI_DECLARE_BUF(inbuf, MC_CMD_RSS_CONTEXT_FREE_IN_LEN); - - MCDI_SET_DWORD(inbuf, RSS_CONTEXT_FREE_IN_RSS_CONTEXT_ID, - context); - return efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_FREE, inbuf, sizeof(inbuf), - NULL, 0, NULL); -} - -static int efx_ef10_populate_rss_table(struct efx_nic *efx, u32 context, - const u32 *rx_indir_table, const u8 *key) -{ - MCDI_DECLARE_BUF(tablebuf, MC_CMD_RSS_CONTEXT_SET_TABLE_IN_LEN); - MCDI_DECLARE_BUF(keybuf, MC_CMD_RSS_CONTEXT_SET_KEY_IN_LEN); - int i, rc; - - MCDI_SET_DWORD(tablebuf, RSS_CONTEXT_SET_TABLE_IN_RSS_CONTEXT_ID, - context); - BUILD_BUG_ON(ARRAY_SIZE(efx->rss_context.rx_indir_table) != - MC_CMD_RSS_CONTEXT_SET_TABLE_IN_INDIRECTION_TABLE_LEN); - - /* This iterates over the length of efx->rss_context.rx_indir_table, but - * copies bytes from rx_indir_table. That's because the latter is a - * pointer rather than an array, but should have the same length. - * The efx->rss_context.rx_hash_key loop below is similar. - */ - for (i = 0; i < ARRAY_SIZE(efx->rss_context.rx_indir_table); ++i) - MCDI_PTR(tablebuf, - RSS_CONTEXT_SET_TABLE_IN_INDIRECTION_TABLE)[i] = - (u8) rx_indir_table[i]; - - rc = efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_SET_TABLE, tablebuf, - sizeof(tablebuf), NULL, 0, NULL); - if (rc != 0) - return rc; - - MCDI_SET_DWORD(keybuf, RSS_CONTEXT_SET_KEY_IN_RSS_CONTEXT_ID, - context); - BUILD_BUG_ON(ARRAY_SIZE(efx->rss_context.rx_hash_key) != - MC_CMD_RSS_CONTEXT_SET_KEY_IN_TOEPLITZ_KEY_LEN); - for (i = 0; i < ARRAY_SIZE(efx->rss_context.rx_hash_key); ++i) - MCDI_PTR(keybuf, RSS_CONTEXT_SET_KEY_IN_TOEPLITZ_KEY)[i] = key[i]; - - return efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_SET_KEY, keybuf, - sizeof(keybuf), NULL, 0, NULL); -} - -static void efx_ef10_rx_free_indir_table(struct efx_nic *efx) -{ - int rc; - - if (efx->rss_context.context_id != EFX_MCDI_RSS_CONTEXT_INVALID) { - rc = efx_ef10_free_rss_context(efx, efx->rss_context.context_id); - WARN_ON(rc != 0); - } - efx->rss_context.context_id = EFX_MCDI_RSS_CONTEXT_INVALID; -} - -static int efx_ef10_rx_push_shared_rss_config(struct efx_nic *efx, - unsigned *context_size) -{ - struct efx_ef10_nic_data *nic_data = efx->nic_data; - int rc = efx_ef10_alloc_rss_context(efx, false, &efx->rss_context, - context_size); - - if (rc != 0) - return rc; - - nic_data->rx_rss_context_exclusive = false; - efx_set_default_rx_indir_table(efx, &efx->rss_context); - return 0; -} - -static int efx_ef10_rx_push_exclusive_rss_config(struct efx_nic *efx, - const u32 *rx_indir_table, - const u8 *key) -{ - u32 old_rx_rss_context = efx->rss_context.context_id; - struct efx_ef10_nic_data *nic_data = efx->nic_data; - int rc; - - if (efx->rss_context.context_id == EFX_MCDI_RSS_CONTEXT_INVALID || - !nic_data->rx_rss_context_exclusive) { - rc = efx_ef10_alloc_rss_context(efx, true, &efx->rss_context, - NULL); - if (rc == -EOPNOTSUPP) - return rc; - else if (rc != 0) - goto fail1; - } - - rc = efx_ef10_populate_rss_table(efx, efx->rss_context.context_id, - rx_indir_table, key); - if (rc != 0) - goto fail2; - - if (efx->rss_context.context_id != old_rx_rss_context && - old_rx_rss_context != EFX_MCDI_RSS_CONTEXT_INVALID) - WARN_ON(efx_ef10_free_rss_context(efx, old_rx_rss_context) != 0); - nic_data->rx_rss_context_exclusive = true; - if (rx_indir_table != efx->rss_context.rx_indir_table) - memcpy(efx->rss_context.rx_indir_table, rx_indir_table, - sizeof(efx->rss_context.rx_indir_table)); - if (key != efx->rss_context.rx_hash_key) - memcpy(efx->rss_context.rx_hash_key, key, - efx->type->rx_hash_key_size); - - return 0; - -fail2: - if (old_rx_rss_context != efx->rss_context.context_id) { - WARN_ON(efx_ef10_free_rss_context(efx, efx->rss_context.context_id) != 0); - efx->rss_context.context_id = old_rx_rss_context; - } -fail1: - netif_err(efx, hw, efx->net_dev, "%s: failed rc=%d\n", __func__, rc); - return rc; -} - -static int efx_ef10_rx_push_rss_context_config(struct efx_nic *efx, - struct efx_rss_context *ctx, - const u32 *rx_indir_table, - const u8 *key) -{ - int rc; - - WARN_ON(!mutex_is_locked(&efx->rss_lock)); - - if (ctx->context_id == EFX_MCDI_RSS_CONTEXT_INVALID) { - rc = efx_ef10_alloc_rss_context(efx, true, ctx, NULL); - if (rc) - return rc; - } - - if (!rx_indir_table) /* Delete this context */ - return efx_ef10_free_rss_context(efx, ctx->context_id); - - rc = efx_ef10_populate_rss_table(efx, ctx->context_id, - rx_indir_table, key); - if (rc) - return rc; - - memcpy(ctx->rx_indir_table, rx_indir_table, - sizeof(efx->rss_context.rx_indir_table)); - memcpy(ctx->rx_hash_key, key, efx->type->rx_hash_key_size); - - return 0; -} - -static int efx_ef10_rx_pull_rss_context_config(struct efx_nic *efx, - struct efx_rss_context *ctx) -{ - MCDI_DECLARE_BUF(inbuf, MC_CMD_RSS_CONTEXT_GET_TABLE_IN_LEN); - MCDI_DECLARE_BUF(tablebuf, MC_CMD_RSS_CONTEXT_GET_TABLE_OUT_LEN); - MCDI_DECLARE_BUF(keybuf, MC_CMD_RSS_CONTEXT_GET_KEY_OUT_LEN); - size_t outlen; - int rc, i; - - WARN_ON(!mutex_is_locked(&efx->rss_lock)); - - BUILD_BUG_ON(MC_CMD_RSS_CONTEXT_GET_TABLE_IN_LEN != - MC_CMD_RSS_CONTEXT_GET_KEY_IN_LEN); - - if (ctx->context_id == EFX_MCDI_RSS_CONTEXT_INVALID) - return -ENOENT; - - MCDI_SET_DWORD(inbuf, RSS_CONTEXT_GET_TABLE_IN_RSS_CONTEXT_ID, - ctx->context_id); - BUILD_BUG_ON(ARRAY_SIZE(ctx->rx_indir_table) != - MC_CMD_RSS_CONTEXT_GET_TABLE_OUT_INDIRECTION_TABLE_LEN); - rc = efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_GET_TABLE, inbuf, sizeof(inbuf), - tablebuf, sizeof(tablebuf), &outlen); - if (rc != 0) - return rc; - - if (WARN_ON(outlen != MC_CMD_RSS_CONTEXT_GET_TABLE_OUT_LEN)) - return -EIO; - - for (i = 0; i < ARRAY_SIZE(ctx->rx_indir_table); i++) - ctx->rx_indir_table[i] = MCDI_PTR(tablebuf, - RSS_CONTEXT_GET_TABLE_OUT_INDIRECTION_TABLE)[i]; - - MCDI_SET_DWORD(inbuf, RSS_CONTEXT_GET_KEY_IN_RSS_CONTEXT_ID, - ctx->context_id); - BUILD_BUG_ON(ARRAY_SIZE(ctx->rx_hash_key) != - MC_CMD_RSS_CONTEXT_SET_KEY_IN_TOEPLITZ_KEY_LEN); - rc = efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_GET_KEY, inbuf, sizeof(inbuf), - keybuf, sizeof(keybuf), &outlen); - if (rc != 0) - return rc; - - if (WARN_ON(outlen != MC_CMD_RSS_CONTEXT_GET_KEY_OUT_LEN)) - return -EIO; - - for (i = 0; i < ARRAY_SIZE(ctx->rx_hash_key); ++i) - ctx->rx_hash_key[i] = MCDI_PTR( - keybuf, RSS_CONTEXT_GET_KEY_OUT_TOEPLITZ_KEY)[i]; - - return 0; -} - -static int efx_ef10_rx_pull_rss_config(struct efx_nic *efx) -{ - int rc; - - mutex_lock(&efx->rss_lock); - rc = efx_ef10_rx_pull_rss_context_config(efx, &efx->rss_context); - mutex_unlock(&efx->rss_lock); - return rc; -} - -static void efx_ef10_rx_restore_rss_contexts(struct efx_nic *efx) -{ - struct efx_ef10_nic_data *nic_data = efx->nic_data; - struct efx_rss_context *ctx; - int rc; - - WARN_ON(!mutex_is_locked(&efx->rss_lock)); - - if (!nic_data->must_restore_rss_contexts) - return; - - list_for_each_entry(ctx, &efx->rss_context.list, list) { - /* previous NIC RSS context is gone */ - ctx->context_id = EFX_MCDI_RSS_CONTEXT_INVALID; - /* so try to allocate a new one */ - rc = efx_ef10_rx_push_rss_context_config(efx, ctx, - ctx->rx_indir_table, - ctx->rx_hash_key); - if (rc) - netif_warn(efx, probe, efx->net_dev, - "failed to restore RSS context %u, rc=%d" - "; RSS filters may fail to be applied\n", - ctx->user_id, rc); - } - nic_data->must_restore_rss_contexts = false; -} - -static int efx_ef10_pf_rx_push_rss_config(struct efx_nic *efx, bool user, - const u32 *rx_indir_table, - const u8 *key) -{ - int rc; - - if (efx->rss_spread == 1) - return 0; - - if (!key) - key = efx->rss_context.rx_hash_key; - - rc = efx_ef10_rx_push_exclusive_rss_config(efx, rx_indir_table, key); - - if (rc == -ENOBUFS && !user) { - unsigned context_size; - bool mismatch = false; - size_t i; - - for (i = 0; - i < ARRAY_SIZE(efx->rss_context.rx_indir_table) && !mismatch; - i++) - mismatch = rx_indir_table[i] != - ethtool_rxfh_indir_default(i, efx->rss_spread); - - rc = efx_ef10_rx_push_shared_rss_config(efx, &context_size); - if (rc == 0) { - if (context_size != efx->rss_spread) - netif_warn(efx, probe, efx->net_dev, - "Could not allocate an exclusive RSS" - " context; allocated a shared one of" - " different size." - " Wanted %u, got %u.\n", - efx->rss_spread, context_size); - else if (mismatch) - netif_warn(efx, probe, efx->net_dev, - "Could not allocate an exclusive RSS" - " context; allocated a shared one but" - " could not apply custom" - " indirection.\n"); - else - netif_info(efx, probe, efx->net_dev, - "Could not allocate an exclusive RSS" - " context; allocated a shared one.\n"); - } - } - return rc; -} - -static int efx_ef10_vf_rx_push_rss_config(struct efx_nic *efx, bool user, - const u32 *rx_indir_table - __attribute__ ((unused)), - const u8 *key - __attribute__ ((unused))) -{ - if (user) - return -EOPNOTSUPP; - if (efx->rss_context.context_id != EFX_MCDI_RSS_CONTEXT_INVALID) - return 0; - return efx_ef10_rx_push_shared_rss_config(efx, NULL); -} - /* This creates an entry in the RX descriptor queue */ static inline void efx_ef10_build_rx_desc(struct efx_rx_queue *rx_queue, unsigned int index) @@ -3684,1538 +3140,6 @@ static void efx_ef10_prepare_flr(struct efx_nic *efx) atomic_set(&efx->active_queues, 0); } -/* Decide whether a filter should be exclusive or else should allow - * delivery to additional recipients. Currently we decide that - * filters for specific local unicast MAC and IP addresses are - * exclusive. - */ -static bool efx_ef10_filter_is_exclusive(const struct efx_filter_spec *spec) -{ - if (spec->match_flags & EFX_FILTER_MATCH_LOC_MAC && - !is_multicast_ether_addr(spec->loc_mac)) - return true; - - if ((spec->match_flags & - (EFX_FILTER_MATCH_ETHER_TYPE | EFX_FILTER_MATCH_LOC_HOST)) == - (EFX_FILTER_MATCH_ETHER_TYPE | EFX_FILTER_MATCH_LOC_HOST)) { - if (spec->ether_type == htons(ETH_P_IP) && - !ipv4_is_multicast(spec->loc_host[0])) - return true; - if (spec->ether_type == htons(ETH_P_IPV6) && - ((const u8 *)spec->loc_host)[0] != 0xff) - return true; - } - - return false; -} - -static struct efx_filter_spec * -efx_ef10_filter_entry_spec(const struct efx_ef10_filter_table *table, - unsigned int filter_idx) -{ - return (struct efx_filter_spec *)(table->entry[filter_idx].spec & - ~EFX_EF10_FILTER_FLAGS); -} - -static unsigned int -efx_ef10_filter_entry_flags(const struct efx_ef10_filter_table *table, - unsigned int filter_idx) -{ - return table->entry[filter_idx].spec & EFX_EF10_FILTER_FLAGS; -} - -static void -efx_ef10_filter_set_entry(struct efx_ef10_filter_table *table, - unsigned int filter_idx, - const struct efx_filter_spec *spec, - unsigned int flags) -{ - table->entry[filter_idx].spec = (unsigned long)spec | flags; -} - -static void -efx_ef10_filter_push_prep_set_match_fields(struct efx_nic *efx, - const struct efx_filter_spec *spec, - efx_dword_t *inbuf) -{ - enum efx_encap_type encap_type = efx_filter_get_encap_type(spec); - u32 match_fields = 0, uc_match, mc_match; - - MCDI_SET_DWORD(inbuf, FILTER_OP_IN_OP, - efx_ef10_filter_is_exclusive(spec) ? - MC_CMD_FILTER_OP_IN_OP_INSERT : - MC_CMD_FILTER_OP_IN_OP_SUBSCRIBE); - - /* Convert match flags and values. Unlike almost - * everything else in MCDI, these fields are in - * network byte order. - */ -#define COPY_VALUE(value, mcdi_field) \ - do { \ - match_fields |= \ - 1 << MC_CMD_FILTER_OP_IN_MATCH_ ## \ - mcdi_field ## _LBN; \ - BUILD_BUG_ON( \ - MC_CMD_FILTER_OP_IN_ ## mcdi_field ## _LEN < \ - sizeof(value)); \ - memcpy(MCDI_PTR(inbuf, FILTER_OP_IN_ ## mcdi_field), \ - &value, sizeof(value)); \ - } while (0) -#define COPY_FIELD(gen_flag, gen_field, mcdi_field) \ - if (spec->match_flags & EFX_FILTER_MATCH_ ## gen_flag) { \ - COPY_VALUE(spec->gen_field, mcdi_field); \ - } - /* Handle encap filters first. They will always be mismatch - * (unknown UC or MC) filters - */ - if (encap_type) { - /* ether_type and outer_ip_proto need to be variables - * because COPY_VALUE wants to memcpy them - */ - __be16 ether_type = - htons(encap_type & EFX_ENCAP_FLAG_IPV6 ? - ETH_P_IPV6 : ETH_P_IP); - u8 vni_type = MC_CMD_FILTER_OP_EXT_IN_VNI_TYPE_GENEVE; - u8 outer_ip_proto; - - switch (encap_type & EFX_ENCAP_TYPES_MASK) { - case EFX_ENCAP_TYPE_VXLAN: - vni_type = MC_CMD_FILTER_OP_EXT_IN_VNI_TYPE_VXLAN; - /* fallthrough */ - case EFX_ENCAP_TYPE_GENEVE: - COPY_VALUE(ether_type, ETHER_TYPE); - outer_ip_proto = IPPROTO_UDP; - COPY_VALUE(outer_ip_proto, IP_PROTO); - /* We always need to set the type field, even - * though we're not matching on the TNI. - */ - MCDI_POPULATE_DWORD_1(inbuf, - FILTER_OP_EXT_IN_VNI_OR_VSID, - FILTER_OP_EXT_IN_VNI_TYPE, - vni_type); - break; - case EFX_ENCAP_TYPE_NVGRE: - COPY_VALUE(ether_type, ETHER_TYPE); - outer_ip_proto = IPPROTO_GRE; - COPY_VALUE(outer_ip_proto, IP_PROTO); - break; - default: - WARN_ON(1); - } - - uc_match = MC_CMD_FILTER_OP_EXT_IN_MATCH_IFRM_UNKNOWN_UCAST_DST_LBN; - mc_match = MC_CMD_FILTER_OP_EXT_IN_MATCH_IFRM_UNKNOWN_MCAST_DST_LBN; - } else { - uc_match = MC_CMD_FILTER_OP_EXT_IN_MATCH_UNKNOWN_UCAST_DST_LBN; - mc_match = MC_CMD_FILTER_OP_EXT_IN_MATCH_UNKNOWN_MCAST_DST_LBN; - } - - if (spec->match_flags & EFX_FILTER_MATCH_LOC_MAC_IG) - match_fields |= - is_multicast_ether_addr(spec->loc_mac) ? - 1 << mc_match : - 1 << uc_match; - COPY_FIELD(REM_HOST, rem_host, SRC_IP); - COPY_FIELD(LOC_HOST, loc_host, DST_IP); - COPY_FIELD(REM_MAC, rem_mac, SRC_MAC); - COPY_FIELD(REM_PORT, rem_port, SRC_PORT); - COPY_FIELD(LOC_MAC, loc_mac, DST_MAC); - COPY_FIELD(LOC_PORT, loc_port, DST_PORT); - COPY_FIELD(ETHER_TYPE, ether_type, ETHER_TYPE); - COPY_FIELD(INNER_VID, inner_vid, INNER_VLAN); - COPY_FIELD(OUTER_VID, outer_vid, OUTER_VLAN); - COPY_FIELD(IP_PROTO, ip_proto, IP_PROTO); -#undef COPY_FIELD -#undef COPY_VALUE - MCDI_SET_DWORD(inbuf, FILTER_OP_IN_MATCH_FIELDS, - match_fields); -} - -static void efx_ef10_filter_push_prep(struct efx_nic *efx, - const struct efx_filter_spec *spec, - efx_dword_t *inbuf, u64 handle, - struct efx_rss_context *ctx, - bool replacing) -{ - struct efx_ef10_nic_data *nic_data = efx->nic_data; - u32 flags = spec->flags; - - memset(inbuf, 0, MC_CMD_FILTER_OP_EXT_IN_LEN); - - /* If RSS filter, caller better have given us an RSS context */ - if (flags & EFX_FILTER_FLAG_RX_RSS) { - /* We don't have the ability to return an error, so we'll just - * log a warning and disable RSS for the filter. - */ - if (WARN_ON_ONCE(!ctx)) - flags &= ~EFX_FILTER_FLAG_RX_RSS; - else if (WARN_ON_ONCE(ctx->context_id == EFX_MCDI_RSS_CONTEXT_INVALID)) - flags &= ~EFX_FILTER_FLAG_RX_RSS; - } - - if (replacing) { - MCDI_SET_DWORD(inbuf, FILTER_OP_IN_OP, - MC_CMD_FILTER_OP_IN_OP_REPLACE); - MCDI_SET_QWORD(inbuf, FILTER_OP_IN_HANDLE, handle); - } else { - efx_ef10_filter_push_prep_set_match_fields(efx, spec, inbuf); - } - - MCDI_SET_DWORD(inbuf, FILTER_OP_IN_PORT_ID, nic_data->vport_id); - MCDI_SET_DWORD(inbuf, FILTER_OP_IN_RX_DEST, - spec->dmaq_id == EFX_FILTER_RX_DMAQ_ID_DROP ? - MC_CMD_FILTER_OP_IN_RX_DEST_DROP : - MC_CMD_FILTER_OP_IN_RX_DEST_HOST); - MCDI_SET_DWORD(inbuf, FILTER_OP_IN_TX_DOMAIN, 0); - MCDI_SET_DWORD(inbuf, FILTER_OP_IN_TX_DEST, - MC_CMD_FILTER_OP_IN_TX_DEST_DEFAULT); - MCDI_SET_DWORD(inbuf, FILTER_OP_IN_RX_QUEUE, - spec->dmaq_id == EFX_FILTER_RX_DMAQ_ID_DROP ? - 0 : spec->dmaq_id); - MCDI_SET_DWORD(inbuf, FILTER_OP_IN_RX_MODE, - (flags & EFX_FILTER_FLAG_RX_RSS) ? - MC_CMD_FILTER_OP_IN_RX_MODE_RSS : - MC_CMD_FILTER_OP_IN_RX_MODE_SIMPLE); - if (flags & EFX_FILTER_FLAG_RX_RSS) - MCDI_SET_DWORD(inbuf, FILTER_OP_IN_RX_CONTEXT, ctx->context_id); -} - -static int efx_ef10_filter_push(struct efx_nic *efx, - const struct efx_filter_spec *spec, u64 *handle, - struct efx_rss_context *ctx, bool replacing) -{ - MCDI_DECLARE_BUF(inbuf, MC_CMD_FILTER_OP_EXT_IN_LEN); - MCDI_DECLARE_BUF(outbuf, MC_CMD_FILTER_OP_EXT_OUT_LEN); - size_t outlen; - int rc; - - efx_ef10_filter_push_prep(efx, spec, inbuf, *handle, ctx, replacing); - rc = efx_mcdi_rpc_quiet(efx, MC_CMD_FILTER_OP, inbuf, sizeof(inbuf), - outbuf, sizeof(outbuf), &outlen); - if (rc && spec->priority != EFX_FILTER_PRI_HINT) - efx_mcdi_display_error(efx, MC_CMD_FILTER_OP, sizeof(inbuf), - outbuf, outlen, rc); - if (rc == 0) - *handle = MCDI_QWORD(outbuf, FILTER_OP_OUT_HANDLE); - if (rc == -ENOSPC) - rc = -EBUSY; /* to match efx_farch_filter_insert() */ - return rc; -} - -static u32 efx_ef10_filter_mcdi_flags_from_spec(const struct efx_filter_spec *spec) -{ - enum efx_encap_type encap_type = efx_filter_get_encap_type(spec); - unsigned int match_flags = spec->match_flags; - unsigned int uc_match, mc_match; - u32 mcdi_flags = 0; - -#define MAP_FILTER_TO_MCDI_FLAG(gen_flag, mcdi_field, encap) { \ - unsigned int old_match_flags = match_flags; \ - match_flags &= ~EFX_FILTER_MATCH_ ## gen_flag; \ - if (match_flags != old_match_flags) \ - mcdi_flags |= \ - (1 << ((encap) ? \ - MC_CMD_FILTER_OP_EXT_IN_MATCH_IFRM_ ## \ - mcdi_field ## _LBN : \ - MC_CMD_FILTER_OP_EXT_IN_MATCH_ ##\ - mcdi_field ## _LBN)); \ - } - /* inner or outer based on encap type */ - MAP_FILTER_TO_MCDI_FLAG(REM_HOST, SRC_IP, encap_type); - MAP_FILTER_TO_MCDI_FLAG(LOC_HOST, DST_IP, encap_type); - MAP_FILTER_TO_MCDI_FLAG(REM_MAC, SRC_MAC, encap_type); - MAP_FILTER_TO_MCDI_FLAG(REM_PORT, SRC_PORT, encap_type); - MAP_FILTER_TO_MCDI_FLAG(LOC_MAC, DST_MAC, encap_type); - MAP_FILTER_TO_MCDI_FLAG(LOC_PORT, DST_PORT, encap_type); - MAP_FILTER_TO_MCDI_FLAG(ETHER_TYPE, ETHER_TYPE, encap_type); - MAP_FILTER_TO_MCDI_FLAG(IP_PROTO, IP_PROTO, encap_type); - /* always outer */ - MAP_FILTER_TO_MCDI_FLAG(INNER_VID, INNER_VLAN, false); - MAP_FILTER_TO_MCDI_FLAG(OUTER_VID, OUTER_VLAN, false); -#undef MAP_FILTER_TO_MCDI_FLAG - - /* special handling for encap type, and mismatch */ - if (encap_type) { - match_flags &= ~EFX_FILTER_MATCH_ENCAP_TYPE; - mcdi_flags |= - (1 << MC_CMD_FILTER_OP_EXT_IN_MATCH_ETHER_TYPE_LBN); - mcdi_flags |= (1 << MC_CMD_FILTER_OP_EXT_IN_MATCH_IP_PROTO_LBN); - - uc_match = MC_CMD_FILTER_OP_EXT_IN_MATCH_IFRM_UNKNOWN_UCAST_DST_LBN; - mc_match = MC_CMD_FILTER_OP_EXT_IN_MATCH_IFRM_UNKNOWN_MCAST_DST_LBN; - } else { - uc_match = MC_CMD_FILTER_OP_EXT_IN_MATCH_UNKNOWN_UCAST_DST_LBN; - mc_match = MC_CMD_FILTER_OP_EXT_IN_MATCH_UNKNOWN_MCAST_DST_LBN; - } - - if (match_flags & EFX_FILTER_MATCH_LOC_MAC_IG) { - match_flags &= ~EFX_FILTER_MATCH_LOC_MAC_IG; - mcdi_flags |= - is_multicast_ether_addr(spec->loc_mac) ? - 1 << mc_match : - 1 << uc_match; - } - - /* Did we map them all? */ - WARN_ON_ONCE(match_flags); - - return mcdi_flags; -} - -static int efx_ef10_filter_pri(struct efx_ef10_filter_table *table, - const struct efx_filter_spec *spec) -{ - u32 mcdi_flags = efx_ef10_filter_mcdi_flags_from_spec(spec); - unsigned int match_pri; - - for (match_pri = 0; - match_pri < table->rx_match_count; - match_pri++) - if (table->rx_match_mcdi_flags[match_pri] == mcdi_flags) - return match_pri; - - return -EPROTONOSUPPORT; -} - -static s32 efx_ef10_filter_insert_locked(struct efx_nic *efx, - struct efx_filter_spec *spec, - bool replace_equal) -{ - DECLARE_BITMAP(mc_rem_map, EFX_EF10_FILTER_SEARCH_LIMIT); - struct efx_ef10_nic_data *nic_data = efx->nic_data; - struct efx_ef10_filter_table *table; - struct efx_filter_spec *saved_spec; - struct efx_rss_context *ctx = NULL; - unsigned int match_pri, hash; - unsigned int priv_flags; - bool rss_locked = false; - bool replacing = false; - unsigned int depth, i; - int ins_index = -1; - DEFINE_WAIT(wait); - bool is_mc_recip; - s32 rc; - - WARN_ON(!rwsem_is_locked(&efx->filter_sem)); - table = efx->filter_state; - down_write(&table->lock); - - /* For now, only support RX filters */ - if ((spec->flags & (EFX_FILTER_FLAG_RX | EFX_FILTER_FLAG_TX)) != - EFX_FILTER_FLAG_RX) { - rc = -EINVAL; - goto out_unlock; - } - - rc = efx_ef10_filter_pri(table, spec); - if (rc < 0) - goto out_unlock; - match_pri = rc; - - hash = efx_filter_spec_hash(spec); - is_mc_recip = efx_filter_is_mc_recipient(spec); - if (is_mc_recip) - bitmap_zero(mc_rem_map, EFX_EF10_FILTER_SEARCH_LIMIT); - - if (spec->flags & EFX_FILTER_FLAG_RX_RSS) { - mutex_lock(&efx->rss_lock); - rss_locked = true; - if (spec->rss_context) - ctx = efx_find_rss_context_entry(efx, spec->rss_context); - else - ctx = &efx->rss_context; - if (!ctx) { - rc = -ENOENT; - goto out_unlock; - } - if (ctx->context_id == EFX_MCDI_RSS_CONTEXT_INVALID) { - rc = -EOPNOTSUPP; - goto out_unlock; - } - } - - /* Find any existing filters with the same match tuple or - * else a free slot to insert at. - */ - for (depth = 1; depth < EFX_EF10_FILTER_SEARCH_LIMIT; depth++) { - i = (hash + depth) & (HUNT_FILTER_TBL_ROWS - 1); - saved_spec = efx_ef10_filter_entry_spec(table, i); - - if (!saved_spec) { - if (ins_index < 0) - ins_index = i; - } else if (efx_filter_spec_equal(spec, saved_spec)) { - if (spec->priority < saved_spec->priority && - spec->priority != EFX_FILTER_PRI_AUTO) { - rc = -EPERM; - goto out_unlock; - } - if (!is_mc_recip) { - /* This is the only one */ - if (spec->priority == - saved_spec->priority && - !replace_equal) { - rc = -EEXIST; - goto out_unlock; - } - ins_index = i; - break; - } else if (spec->priority > - saved_spec->priority || - (spec->priority == - saved_spec->priority && - replace_equal)) { - if (ins_index < 0) - ins_index = i; - else - __set_bit(depth, mc_rem_map); - } - } - } - - /* Once we reach the maximum search depth, use the first suitable - * slot, or return -EBUSY if there was none - */ - if (ins_index < 0) { - rc = -EBUSY; - goto out_unlock; - } - - /* Create a software table entry if necessary. */ - saved_spec = efx_ef10_filter_entry_spec(table, ins_index); - if (saved_spec) { - if (spec->priority == EFX_FILTER_PRI_AUTO && - saved_spec->priority >= EFX_FILTER_PRI_AUTO) { - /* Just make sure it won't be removed */ - if (saved_spec->priority > EFX_FILTER_PRI_AUTO) - saved_spec->flags |= EFX_FILTER_FLAG_RX_OVER_AUTO; - table->entry[ins_index].spec &= - ~EFX_EF10_FILTER_FLAG_AUTO_OLD; - rc = ins_index; - goto out_unlock; - } - replacing = true; - priv_flags = efx_ef10_filter_entry_flags(table, ins_index); - } else { - saved_spec = kmalloc(sizeof(*spec), GFP_ATOMIC); - if (!saved_spec) { - rc = -ENOMEM; - goto out_unlock; - } - *saved_spec = *spec; - priv_flags = 0; - } - efx_ef10_filter_set_entry(table, ins_index, saved_spec, priv_flags); - - /* Actually insert the filter on the HW */ - rc = efx_ef10_filter_push(efx, spec, &table->entry[ins_index].handle, - ctx, replacing); - - if (rc == -EINVAL && nic_data->must_realloc_vis) - /* The MC rebooted under us, causing it to reject our filter - * insertion as pointing to an invalid VI (spec->dmaq_id). - */ - rc = -EAGAIN; - - /* Finalise the software table entry */ - if (rc == 0) { - if (replacing) { - /* Update the fields that may differ */ - if (saved_spec->priority == EFX_FILTER_PRI_AUTO) - saved_spec->flags |= - EFX_FILTER_FLAG_RX_OVER_AUTO; - saved_spec->priority = spec->priority; - saved_spec->flags &= EFX_FILTER_FLAG_RX_OVER_AUTO; - saved_spec->flags |= spec->flags; - saved_spec->rss_context = spec->rss_context; - saved_spec->dmaq_id = spec->dmaq_id; - } - } else if (!replacing) { - kfree(saved_spec); - saved_spec = NULL; - } else { - /* We failed to replace, so the old filter is still present. - * Roll back the software table to reflect this. In fact the - * efx_ef10_filter_set_entry() call below will do the right - * thing, so nothing extra is needed here. - */ - } - efx_ef10_filter_set_entry(table, ins_index, saved_spec, priv_flags); - - /* Remove and finalise entries for lower-priority multicast - * recipients - */ - if (is_mc_recip) { - MCDI_DECLARE_BUF(inbuf, MC_CMD_FILTER_OP_EXT_IN_LEN); - unsigned int depth, i; - - memset(inbuf, 0, sizeof(inbuf)); - - for (depth = 0; depth < EFX_EF10_FILTER_SEARCH_LIMIT; depth++) { - if (!test_bit(depth, mc_rem_map)) - continue; - - i = (hash + depth) & (HUNT_FILTER_TBL_ROWS - 1); - saved_spec = efx_ef10_filter_entry_spec(table, i); - priv_flags = efx_ef10_filter_entry_flags(table, i); - - if (rc == 0) { - MCDI_SET_DWORD(inbuf, FILTER_OP_IN_OP, - MC_CMD_FILTER_OP_IN_OP_UNSUBSCRIBE); - MCDI_SET_QWORD(inbuf, FILTER_OP_IN_HANDLE, - table->entry[i].handle); - rc = efx_mcdi_rpc(efx, MC_CMD_FILTER_OP, - inbuf, sizeof(inbuf), - NULL, 0, NULL); - } - - if (rc == 0) { - kfree(saved_spec); - saved_spec = NULL; - priv_flags = 0; - } - efx_ef10_filter_set_entry(table, i, saved_spec, - priv_flags); - } - } - - /* If successful, return the inserted filter ID */ - if (rc == 0) - rc = efx_ef10_make_filter_id(match_pri, ins_index); - -out_unlock: - if (rss_locked) - mutex_unlock(&efx->rss_lock); - up_write(&table->lock); - return rc; -} - -static s32 efx_ef10_filter_insert(struct efx_nic *efx, - struct efx_filter_spec *spec, - bool replace_equal) -{ - s32 ret; - - down_read(&efx->filter_sem); - ret = efx_ef10_filter_insert_locked(efx, spec, replace_equal); - up_read(&efx->filter_sem); - - return ret; -} - -static void efx_ef10_filter_update_rx_scatter(struct efx_nic *efx) -{ - /* no need to do anything here on EF10 */ -} - -/* Remove a filter. - * If !by_index, remove by ID - * If by_index, remove by index - * Filter ID may come from userland and must be range-checked. - * Caller must hold efx->filter_sem for read, and efx->filter_state->lock - * for write. - */ -static int efx_ef10_filter_remove_internal(struct efx_nic *efx, - unsigned int priority_mask, - u32 filter_id, bool by_index) -{ - unsigned int filter_idx = efx_ef10_filter_get_unsafe_id(filter_id); - struct efx_ef10_filter_table *table = efx->filter_state; - MCDI_DECLARE_BUF(inbuf, - MC_CMD_FILTER_OP_IN_HANDLE_OFST + - MC_CMD_FILTER_OP_IN_HANDLE_LEN); - struct efx_filter_spec *spec; - DEFINE_WAIT(wait); - int rc; - - spec = efx_ef10_filter_entry_spec(table, filter_idx); - if (!spec || - (!by_index && - efx_ef10_filter_pri(table, spec) != - efx_ef10_filter_get_unsafe_pri(filter_id))) - return -ENOENT; - - if (spec->flags & EFX_FILTER_FLAG_RX_OVER_AUTO && - priority_mask == (1U << EFX_FILTER_PRI_AUTO)) { - /* Just remove flags */ - spec->flags &= ~EFX_FILTER_FLAG_RX_OVER_AUTO; - table->entry[filter_idx].spec &= ~EFX_EF10_FILTER_FLAG_AUTO_OLD; - return 0; - } - - if (!(priority_mask & (1U << spec->priority))) - return -ENOENT; - - if (spec->flags & EFX_FILTER_FLAG_RX_OVER_AUTO) { - /* Reset to an automatic filter */ - - struct efx_filter_spec new_spec = *spec; - - new_spec.priority = EFX_FILTER_PRI_AUTO; - new_spec.flags = (EFX_FILTER_FLAG_RX | - (efx_rss_active(&efx->rss_context) ? - EFX_FILTER_FLAG_RX_RSS : 0)); - new_spec.dmaq_id = 0; - new_spec.rss_context = 0; - rc = efx_ef10_filter_push(efx, &new_spec, - &table->entry[filter_idx].handle, - &efx->rss_context, - true); - - if (rc == 0) - *spec = new_spec; - } else { - /* Really remove the filter */ - - MCDI_SET_DWORD(inbuf, FILTER_OP_IN_OP, - efx_ef10_filter_is_exclusive(spec) ? - MC_CMD_FILTER_OP_IN_OP_REMOVE : - MC_CMD_FILTER_OP_IN_OP_UNSUBSCRIBE); - MCDI_SET_QWORD(inbuf, FILTER_OP_IN_HANDLE, - table->entry[filter_idx].handle); - rc = efx_mcdi_rpc_quiet(efx, MC_CMD_FILTER_OP, - inbuf, sizeof(inbuf), NULL, 0, NULL); - - if ((rc == 0) || (rc == -ENOENT)) { - /* Filter removed OK or didn't actually exist */ - kfree(spec); - efx_ef10_filter_set_entry(table, filter_idx, NULL, 0); - } else { - efx_mcdi_display_error(efx, MC_CMD_FILTER_OP, - MC_CMD_FILTER_OP_EXT_IN_LEN, - NULL, 0, rc); - } - } - - return rc; -} - -static int efx_ef10_filter_remove_safe(struct efx_nic *efx, - enum efx_filter_priority priority, - u32 filter_id) -{ - struct efx_ef10_filter_table *table; - int rc; - - down_read(&efx->filter_sem); - table = efx->filter_state; - down_write(&table->lock); - rc = efx_ef10_filter_remove_internal(efx, 1U << priority, filter_id, - false); - up_write(&table->lock); - up_read(&efx->filter_sem); - return rc; -} - -/* Caller must hold efx->filter_sem for read */ -static void efx_ef10_filter_remove_unsafe(struct efx_nic *efx, - enum efx_filter_priority priority, - u32 filter_id) -{ - struct efx_ef10_filter_table *table = efx->filter_state; - - if (filter_id == EFX_EF10_FILTER_ID_INVALID) - return; - - down_write(&table->lock); - efx_ef10_filter_remove_internal(efx, 1U << priority, filter_id, - true); - up_write(&table->lock); -} - -static int efx_ef10_filter_get_safe(struct efx_nic *efx, - enum efx_filter_priority priority, - u32 filter_id, struct efx_filter_spec *spec) -{ - unsigned int filter_idx = efx_ef10_filter_get_unsafe_id(filter_id); - const struct efx_filter_spec *saved_spec; - struct efx_ef10_filter_table *table; - int rc; - - down_read(&efx->filter_sem); - table = efx->filter_state; - down_read(&table->lock); - saved_spec = efx_ef10_filter_entry_spec(table, filter_idx); - if (saved_spec && saved_spec->priority == priority && - efx_ef10_filter_pri(table, saved_spec) == - efx_ef10_filter_get_unsafe_pri(filter_id)) { - *spec = *saved_spec; - rc = 0; - } else { - rc = -ENOENT; - } - up_read(&table->lock); - up_read(&efx->filter_sem); - return rc; -} - -static int efx_ef10_filter_clear_rx(struct efx_nic *efx, - enum efx_filter_priority priority) -{ - struct efx_ef10_filter_table *table; - unsigned int priority_mask; - unsigned int i; - int rc; - - priority_mask = (((1U << (priority + 1)) - 1) & - ~(1U << EFX_FILTER_PRI_AUTO)); - - down_read(&efx->filter_sem); - table = efx->filter_state; - down_write(&table->lock); - for (i = 0; i < HUNT_FILTER_TBL_ROWS; i++) { - rc = efx_ef10_filter_remove_internal(efx, priority_mask, - i, true); - if (rc && rc != -ENOENT) - break; - rc = 0; - } - - up_write(&table->lock); - up_read(&efx->filter_sem); - return rc; -} - -static u32 efx_ef10_filter_count_rx_used(struct efx_nic *efx, - enum efx_filter_priority priority) -{ - struct efx_ef10_filter_table *table; - unsigned int filter_idx; - s32 count = 0; - - down_read(&efx->filter_sem); - table = efx->filter_state; - down_read(&table->lock); - for (filter_idx = 0; filter_idx < HUNT_FILTER_TBL_ROWS; filter_idx++) { - if (table->entry[filter_idx].spec && - efx_ef10_filter_entry_spec(table, filter_idx)->priority == - priority) - ++count; - } - up_read(&table->lock); - up_read(&efx->filter_sem); - return count; -} - -static u32 efx_ef10_filter_get_rx_id_limit(struct efx_nic *efx) -{ - struct efx_ef10_filter_table *table = efx->filter_state; - - return table->rx_match_count * HUNT_FILTER_TBL_ROWS * 2; -} - -static s32 efx_ef10_filter_get_rx_ids(struct efx_nic *efx, - enum efx_filter_priority priority, - u32 *buf, u32 size) -{ - struct efx_ef10_filter_table *table; - struct efx_filter_spec *spec; - unsigned int filter_idx; - s32 count = 0; - - down_read(&efx->filter_sem); - table = efx->filter_state; - down_read(&table->lock); - - for (filter_idx = 0; filter_idx < HUNT_FILTER_TBL_ROWS; filter_idx++) { - spec = efx_ef10_filter_entry_spec(table, filter_idx); - if (spec && spec->priority == priority) { - if (count == size) { - count = -EMSGSIZE; - break; - } - buf[count++] = - efx_ef10_make_filter_id( - efx_ef10_filter_pri(table, spec), - filter_idx); - } - } - up_read(&table->lock); - up_read(&efx->filter_sem); - return count; -} - -#ifdef CONFIG_RFS_ACCEL - -static bool efx_ef10_filter_rfs_expire_one(struct efx_nic *efx, u32 flow_id, - unsigned int filter_idx) -{ - struct efx_filter_spec *spec, saved_spec; - struct efx_ef10_filter_table *table; - struct efx_arfs_rule *rule = NULL; - bool ret = true, force = false; - u16 arfs_id; - - down_read(&efx->filter_sem); - table = efx->filter_state; - down_write(&table->lock); - spec = efx_ef10_filter_entry_spec(table, filter_idx); - - if (!spec || spec->priority != EFX_FILTER_PRI_HINT) - goto out_unlock; - - spin_lock_bh(&efx->rps_hash_lock); - if (!efx->rps_hash_table) { - /* In the absence of the table, we always return 0 to ARFS. */ - arfs_id = 0; - } else { - rule = efx_rps_hash_find(efx, spec); - if (!rule) - /* ARFS table doesn't know of this filter, so remove it */ - goto expire; - arfs_id = rule->arfs_id; - ret = efx_rps_check_rule(rule, filter_idx, &force); - if (force) - goto expire; - if (!ret) { - spin_unlock_bh(&efx->rps_hash_lock); - goto out_unlock; - } - } - if (!rps_may_expire_flow(efx->net_dev, spec->dmaq_id, flow_id, arfs_id)) - ret = false; - else if (rule) - rule->filter_id = EFX_ARFS_FILTER_ID_REMOVING; -expire: - saved_spec = *spec; /* remove operation will kfree spec */ - spin_unlock_bh(&efx->rps_hash_lock); - /* At this point (since we dropped the lock), another thread might queue - * up a fresh insertion request (but the actual insertion will be held - * up by our possession of the filter table lock). In that case, it - * will set rule->filter_id to EFX_ARFS_FILTER_ID_PENDING, meaning that - * the rule is not removed by efx_rps_hash_del() below. - */ - if (ret) - ret = efx_ef10_filter_remove_internal(efx, 1U << spec->priority, - filter_idx, true) == 0; - /* While we can't safely dereference rule (we dropped the lock), we can - * still test it for NULL. - */ - if (ret && rule) { - /* Expiring, so remove entry from ARFS table */ - spin_lock_bh(&efx->rps_hash_lock); - efx_rps_hash_del(efx, &saved_spec); - spin_unlock_bh(&efx->rps_hash_lock); - } -out_unlock: - up_write(&table->lock); - up_read(&efx->filter_sem); - return ret; -} - -#endif /* CONFIG_RFS_ACCEL */ - -static int efx_ef10_filter_match_flags_from_mcdi(bool encap, u32 mcdi_flags) -{ - int match_flags = 0; - -#define MAP_FLAG(gen_flag, mcdi_field) do { \ - u32 old_mcdi_flags = mcdi_flags; \ - mcdi_flags &= ~(1 << MC_CMD_FILTER_OP_EXT_IN_MATCH_ ## \ - mcdi_field ## _LBN); \ - if (mcdi_flags != old_mcdi_flags) \ - match_flags |= EFX_FILTER_MATCH_ ## gen_flag; \ - } while (0) - - if (encap) { - /* encap filters must specify encap type */ - match_flags |= EFX_FILTER_MATCH_ENCAP_TYPE; - /* and imply ethertype and ip proto */ - mcdi_flags &= - ~(1 << MC_CMD_FILTER_OP_EXT_IN_MATCH_IP_PROTO_LBN); - mcdi_flags &= - ~(1 << MC_CMD_FILTER_OP_EXT_IN_MATCH_ETHER_TYPE_LBN); - /* VLAN tags refer to the outer packet */ - MAP_FLAG(INNER_VID, INNER_VLAN); - MAP_FLAG(OUTER_VID, OUTER_VLAN); - /* everything else refers to the inner packet */ - MAP_FLAG(LOC_MAC_IG, IFRM_UNKNOWN_UCAST_DST); - MAP_FLAG(LOC_MAC_IG, IFRM_UNKNOWN_MCAST_DST); - MAP_FLAG(REM_HOST, IFRM_SRC_IP); - MAP_FLAG(LOC_HOST, IFRM_DST_IP); - MAP_FLAG(REM_MAC, IFRM_SRC_MAC); - MAP_FLAG(REM_PORT, IFRM_SRC_PORT); - MAP_FLAG(LOC_MAC, IFRM_DST_MAC); - MAP_FLAG(LOC_PORT, IFRM_DST_PORT); - MAP_FLAG(ETHER_TYPE, IFRM_ETHER_TYPE); - MAP_FLAG(IP_PROTO, IFRM_IP_PROTO); - } else { - MAP_FLAG(LOC_MAC_IG, UNKNOWN_UCAST_DST); - MAP_FLAG(LOC_MAC_IG, UNKNOWN_MCAST_DST); - MAP_FLAG(REM_HOST, SRC_IP); - MAP_FLAG(LOC_HOST, DST_IP); - MAP_FLAG(REM_MAC, SRC_MAC); - MAP_FLAG(REM_PORT, SRC_PORT); - MAP_FLAG(LOC_MAC, DST_MAC); - MAP_FLAG(LOC_PORT, DST_PORT); - MAP_FLAG(ETHER_TYPE, ETHER_TYPE); - MAP_FLAG(INNER_VID, INNER_VLAN); - MAP_FLAG(OUTER_VID, OUTER_VLAN); - MAP_FLAG(IP_PROTO, IP_PROTO); - } -#undef MAP_FLAG - - /* Did we map them all? */ - if (mcdi_flags) - return -EINVAL; - - return match_flags; -} - -static void efx_ef10_filter_cleanup_vlans(struct efx_nic *efx) -{ - struct efx_ef10_filter_table *table = efx->filter_state; - struct efx_ef10_filter_vlan *vlan, *next_vlan; - - /* See comment in efx_ef10_filter_table_remove() */ - if (!efx_rwsem_assert_write_locked(&efx->filter_sem)) - return; - - if (!table) - return; - - list_for_each_entry_safe(vlan, next_vlan, &table->vlan_list, list) - efx_ef10_filter_del_vlan_internal(efx, vlan); -} - -static bool efx_ef10_filter_match_supported(struct efx_ef10_filter_table *table, - bool encap, - enum efx_filter_match_flags match_flags) -{ - unsigned int match_pri; - int mf; - - for (match_pri = 0; - match_pri < table->rx_match_count; - match_pri++) { - mf = efx_ef10_filter_match_flags_from_mcdi(encap, - table->rx_match_mcdi_flags[match_pri]); - if (mf == match_flags) - return true; - } - - return false; -} - -static int -efx_ef10_filter_table_probe_matches(struct efx_nic *efx, - struct efx_ef10_filter_table *table, - bool encap) -{ - MCDI_DECLARE_BUF(inbuf, MC_CMD_GET_PARSER_DISP_INFO_IN_LEN); - MCDI_DECLARE_BUF(outbuf, MC_CMD_GET_PARSER_DISP_INFO_OUT_LENMAX); - unsigned int pd_match_pri, pd_match_count; - size_t outlen; - int rc; - - /* Find out which RX filter types are supported, and their priorities */ - MCDI_SET_DWORD(inbuf, GET_PARSER_DISP_INFO_IN_OP, - encap ? - MC_CMD_GET_PARSER_DISP_INFO_IN_OP_GET_SUPPORTED_ENCAP_RX_MATCHES : - MC_CMD_GET_PARSER_DISP_INFO_IN_OP_GET_SUPPORTED_RX_MATCHES); - rc = efx_mcdi_rpc(efx, MC_CMD_GET_PARSER_DISP_INFO, - inbuf, sizeof(inbuf), outbuf, sizeof(outbuf), - &outlen); - if (rc) - return rc; - - pd_match_count = MCDI_VAR_ARRAY_LEN( - outlen, GET_PARSER_DISP_INFO_OUT_SUPPORTED_MATCHES); - - for (pd_match_pri = 0; pd_match_pri < pd_match_count; pd_match_pri++) { - u32 mcdi_flags = - MCDI_ARRAY_DWORD( - outbuf, - GET_PARSER_DISP_INFO_OUT_SUPPORTED_MATCHES, - pd_match_pri); - rc = efx_ef10_filter_match_flags_from_mcdi(encap, mcdi_flags); - if (rc < 0) { - netif_dbg(efx, probe, efx->net_dev, - "%s: fw flags %#x pri %u not supported in driver\n", - __func__, mcdi_flags, pd_match_pri); - } else { - netif_dbg(efx, probe, efx->net_dev, - "%s: fw flags %#x pri %u supported as driver flags %#x pri %u\n", - __func__, mcdi_flags, pd_match_pri, - rc, table->rx_match_count); - table->rx_match_mcdi_flags[table->rx_match_count] = mcdi_flags; - table->rx_match_count++; - } - } - - return 0; -} - -static int efx_ef10_filter_table_probe(struct efx_nic *efx) -{ - struct efx_ef10_nic_data *nic_data = efx->nic_data; - struct net_device *net_dev = efx->net_dev; - struct efx_ef10_filter_table *table; - struct efx_ef10_vlan *vlan; - int rc; - - if (!efx_rwsem_assert_write_locked(&efx->filter_sem)) - return -EINVAL; - - if (efx->filter_state) /* already probed */ - return 0; - - table = kzalloc(sizeof(*table), GFP_KERNEL); - if (!table) - return -ENOMEM; - - table->rx_match_count = 0; - rc = efx_ef10_filter_table_probe_matches(efx, table, false); - if (rc) - goto fail; - if (nic_data->datapath_caps & - (1 << MC_CMD_GET_CAPABILITIES_OUT_VXLAN_NVGRE_LBN)) - rc = efx_ef10_filter_table_probe_matches(efx, table, true); - if (rc) - goto fail; - if ((efx_supported_features(efx) & NETIF_F_HW_VLAN_CTAG_FILTER) && - !(efx_ef10_filter_match_supported(table, false, - (EFX_FILTER_MATCH_OUTER_VID | EFX_FILTER_MATCH_LOC_MAC)) && - efx_ef10_filter_match_supported(table, false, - (EFX_FILTER_MATCH_OUTER_VID | EFX_FILTER_MATCH_LOC_MAC_IG)))) { - netif_info(efx, probe, net_dev, - "VLAN filters are not supported in this firmware variant\n"); - net_dev->features &= ~NETIF_F_HW_VLAN_CTAG_FILTER; - efx->fixed_features &= ~NETIF_F_HW_VLAN_CTAG_FILTER; - net_dev->hw_features &= ~NETIF_F_HW_VLAN_CTAG_FILTER; - } - - table->entry = vzalloc(array_size(HUNT_FILTER_TBL_ROWS, - sizeof(*table->entry))); - if (!table->entry) { - rc = -ENOMEM; - goto fail; - } - - table->mc_promisc_last = false; - table->vlan_filter = - !!(efx->net_dev->features & NETIF_F_HW_VLAN_CTAG_FILTER); - INIT_LIST_HEAD(&table->vlan_list); - init_rwsem(&table->lock); - - efx->filter_state = table; - - list_for_each_entry(vlan, &nic_data->vlan_list, list) { - rc = efx_ef10_filter_add_vlan(efx, vlan->vid); - if (rc) - goto fail_add_vlan; - } - - return 0; - -fail_add_vlan: - efx_ef10_filter_cleanup_vlans(efx); - efx->filter_state = NULL; -fail: - kfree(table); - return rc; -} - -/* Caller must hold efx->filter_sem for read if race against - * efx_ef10_filter_table_remove() is possible - */ -static void efx_ef10_filter_table_restore(struct efx_nic *efx) -{ - struct efx_ef10_filter_table *table = efx->filter_state; - struct efx_ef10_nic_data *nic_data = efx->nic_data; - unsigned int invalid_filters = 0, failed = 0; - struct efx_ef10_filter_vlan *vlan; - struct efx_filter_spec *spec; - struct efx_rss_context *ctx; - unsigned int filter_idx; - u32 mcdi_flags; - int match_pri; - int rc, i; - - WARN_ON(!rwsem_is_locked(&efx->filter_sem)); - - if (!nic_data->must_restore_filters) - return; - - if (!table) - return; - - down_write(&table->lock); - mutex_lock(&efx->rss_lock); - - for (filter_idx = 0; filter_idx < HUNT_FILTER_TBL_ROWS; filter_idx++) { - spec = efx_ef10_filter_entry_spec(table, filter_idx); - if (!spec) - continue; - - mcdi_flags = efx_ef10_filter_mcdi_flags_from_spec(spec); - match_pri = 0; - while (match_pri < table->rx_match_count && - table->rx_match_mcdi_flags[match_pri] != mcdi_flags) - ++match_pri; - if (match_pri >= table->rx_match_count) { - invalid_filters++; - goto not_restored; - } - if (spec->rss_context) - ctx = efx_find_rss_context_entry(efx, spec->rss_context); - else - ctx = &efx->rss_context; - if (spec->flags & EFX_FILTER_FLAG_RX_RSS) { - if (!ctx) { - netif_warn(efx, drv, efx->net_dev, - "Warning: unable to restore a filter with nonexistent RSS context %u.\n", - spec->rss_context); - invalid_filters++; - goto not_restored; - } - if (ctx->context_id == EFX_MCDI_RSS_CONTEXT_INVALID) { - netif_warn(efx, drv, efx->net_dev, - "Warning: unable to restore a filter with RSS context %u as it was not created.\n", - spec->rss_context); - invalid_filters++; - goto not_restored; - } - } - - rc = efx_ef10_filter_push(efx, spec, - &table->entry[filter_idx].handle, - ctx, false); - if (rc) - failed++; - - if (rc) { -not_restored: - list_for_each_entry(vlan, &table->vlan_list, list) - for (i = 0; i < EFX_EF10_NUM_DEFAULT_FILTERS; ++i) - if (vlan->default_filters[i] == filter_idx) - vlan->default_filters[i] = - EFX_EF10_FILTER_ID_INVALID; - - kfree(spec); - efx_ef10_filter_set_entry(table, filter_idx, NULL, 0); - } - } - - mutex_unlock(&efx->rss_lock); - up_write(&table->lock); - - /* This can happen validly if the MC's capabilities have changed, so - * is not an error. - */ - if (invalid_filters) - netif_dbg(efx, drv, efx->net_dev, - "Did not restore %u filters that are now unsupported.\n", - invalid_filters); - - if (failed) - netif_err(efx, hw, efx->net_dev, - "unable to restore %u filters\n", failed); - else - nic_data->must_restore_filters = false; -} - -static void efx_ef10_filter_table_remove(struct efx_nic *efx) -{ - struct efx_ef10_filter_table *table = efx->filter_state; - MCDI_DECLARE_BUF(inbuf, MC_CMD_FILTER_OP_EXT_IN_LEN); - struct efx_filter_spec *spec; - unsigned int filter_idx; - int rc; - - efx_ef10_filter_cleanup_vlans(efx); - efx->filter_state = NULL; - /* If we were called without locking, then it's not safe to free - * the table as others might be using it. So we just WARN, leak - * the memory, and potentially get an inconsistent filter table - * state. - * This should never actually happen. - */ - if (!efx_rwsem_assert_write_locked(&efx->filter_sem)) - return; - - if (!table) - return; - - for (filter_idx = 0; filter_idx < HUNT_FILTER_TBL_ROWS; filter_idx++) { - spec = efx_ef10_filter_entry_spec(table, filter_idx); - if (!spec) - continue; - - MCDI_SET_DWORD(inbuf, FILTER_OP_IN_OP, - efx_ef10_filter_is_exclusive(spec) ? - MC_CMD_FILTER_OP_IN_OP_REMOVE : - MC_CMD_FILTER_OP_IN_OP_UNSUBSCRIBE); - MCDI_SET_QWORD(inbuf, FILTER_OP_IN_HANDLE, - table->entry[filter_idx].handle); - rc = efx_mcdi_rpc_quiet(efx, MC_CMD_FILTER_OP, inbuf, - sizeof(inbuf), NULL, 0, NULL); - if (rc) - netif_info(efx, drv, efx->net_dev, - "%s: filter %04x remove failed\n", - __func__, filter_idx); - kfree(spec); - } - - vfree(table->entry); - kfree(table); -} - -static void efx_ef10_filter_mark_one_old(struct efx_nic *efx, uint16_t *id) -{ - struct efx_ef10_filter_table *table = efx->filter_state; - unsigned int filter_idx; - - efx_rwsem_assert_write_locked(&table->lock); - - if (*id != EFX_EF10_FILTER_ID_INVALID) { - filter_idx = efx_ef10_filter_get_unsafe_id(*id); - if (!table->entry[filter_idx].spec) - netif_dbg(efx, drv, efx->net_dev, - "marked null spec old %04x:%04x\n", *id, - filter_idx); - table->entry[filter_idx].spec |= EFX_EF10_FILTER_FLAG_AUTO_OLD; - *id = EFX_EF10_FILTER_ID_INVALID; - } -} - -/* Mark old per-VLAN filters that may need to be removed */ -static void _efx_ef10_filter_vlan_mark_old(struct efx_nic *efx, - struct efx_ef10_filter_vlan *vlan) -{ - struct efx_ef10_filter_table *table = efx->filter_state; - unsigned int i; - - for (i = 0; i < table->dev_uc_count; i++) - efx_ef10_filter_mark_one_old(efx, &vlan->uc[i]); - for (i = 0; i < table->dev_mc_count; i++) - efx_ef10_filter_mark_one_old(efx, &vlan->mc[i]); - for (i = 0; i < EFX_EF10_NUM_DEFAULT_FILTERS; i++) - efx_ef10_filter_mark_one_old(efx, &vlan->default_filters[i]); -} - -/* Mark old filters that may need to be removed. - * Caller must hold efx->filter_sem for read if race against - * efx_ef10_filter_table_remove() is possible - */ -static void efx_ef10_filter_mark_old(struct efx_nic *efx) -{ - struct efx_ef10_filter_table *table = efx->filter_state; - struct efx_ef10_filter_vlan *vlan; - - down_write(&table->lock); - list_for_each_entry(vlan, &table->vlan_list, list) - _efx_ef10_filter_vlan_mark_old(efx, vlan); - up_write(&table->lock); -} - -static void efx_ef10_filter_uc_addr_list(struct efx_nic *efx) -{ - struct efx_ef10_filter_table *table = efx->filter_state; - struct net_device *net_dev = efx->net_dev; - struct netdev_hw_addr *uc; - unsigned int i; - - table->uc_promisc = !!(net_dev->flags & IFF_PROMISC); - ether_addr_copy(table->dev_uc_list[0].addr, net_dev->dev_addr); - i = 1; - netdev_for_each_uc_addr(uc, net_dev) { - if (i >= EFX_EF10_FILTER_DEV_UC_MAX) { - table->uc_promisc = true; - break; - } - ether_addr_copy(table->dev_uc_list[i].addr, uc->addr); - i++; - } - - table->dev_uc_count = i; -} - -static void efx_ef10_filter_mc_addr_list(struct efx_nic *efx) -{ - struct efx_ef10_filter_table *table = efx->filter_state; - struct net_device *net_dev = efx->net_dev; - struct netdev_hw_addr *mc; - unsigned int i; - - table->mc_overflow = false; - table->mc_promisc = !!(net_dev->flags & (IFF_PROMISC | IFF_ALLMULTI)); - - i = 0; - netdev_for_each_mc_addr(mc, net_dev) { - if (i >= EFX_EF10_FILTER_DEV_MC_MAX) { - table->mc_promisc = true; - table->mc_overflow = true; - break; - } - ether_addr_copy(table->dev_mc_list[i].addr, mc->addr); - i++; - } - - table->dev_mc_count = i; -} - -static int efx_ef10_filter_insert_addr_list(struct efx_nic *efx, - struct efx_ef10_filter_vlan *vlan, - bool multicast, bool rollback) -{ - struct efx_ef10_filter_table *table = efx->filter_state; - struct efx_ef10_dev_addr *addr_list; - enum efx_filter_flags filter_flags; - struct efx_filter_spec spec; - u8 baddr[ETH_ALEN]; - unsigned int i, j; - int addr_count; - u16 *ids; - int rc; - - if (multicast) { - addr_list = table->dev_mc_list; - addr_count = table->dev_mc_count; - ids = vlan->mc; - } else { - addr_list = table->dev_uc_list; - addr_count = table->dev_uc_count; - ids = vlan->uc; - } - - filter_flags = efx_rss_active(&efx->rss_context) ? EFX_FILTER_FLAG_RX_RSS : 0; - - /* Insert/renew filters */ - for (i = 0; i < addr_count; i++) { - EFX_WARN_ON_PARANOID(ids[i] != EFX_EF10_FILTER_ID_INVALID); - efx_filter_init_rx(&spec, EFX_FILTER_PRI_AUTO, filter_flags, 0); - efx_filter_set_eth_local(&spec, vlan->vid, addr_list[i].addr); - rc = efx_ef10_filter_insert_locked(efx, &spec, true); - if (rc < 0) { - if (rollback) { - netif_info(efx, drv, efx->net_dev, - "efx_ef10_filter_insert failed rc=%d\n", - rc); - /* Fall back to promiscuous */ - for (j = 0; j < i; j++) { - efx_ef10_filter_remove_unsafe( - efx, EFX_FILTER_PRI_AUTO, - ids[j]); - ids[j] = EFX_EF10_FILTER_ID_INVALID; - } - return rc; - } else { - /* keep invalid ID, and carry on */ - } - } else { - ids[i] = efx_ef10_filter_get_unsafe_id(rc); - } - } - - if (multicast && rollback) { - /* Also need an Ethernet broadcast filter */ - EFX_WARN_ON_PARANOID(vlan->default_filters[EFX_EF10_BCAST] != - EFX_EF10_FILTER_ID_INVALID); - efx_filter_init_rx(&spec, EFX_FILTER_PRI_AUTO, filter_flags, 0); - eth_broadcast_addr(baddr); - efx_filter_set_eth_local(&spec, vlan->vid, baddr); - rc = efx_ef10_filter_insert_locked(efx, &spec, true); - if (rc < 0) { - netif_warn(efx, drv, efx->net_dev, - "Broadcast filter insert failed rc=%d\n", rc); - /* Fall back to promiscuous */ - for (j = 0; j < i; j++) { - efx_ef10_filter_remove_unsafe( - efx, EFX_FILTER_PRI_AUTO, - ids[j]); - ids[j] = EFX_EF10_FILTER_ID_INVALID; - } - return rc; - } else { - vlan->default_filters[EFX_EF10_BCAST] = - efx_ef10_filter_get_unsafe_id(rc); - } - } - - return 0; -} - -static int efx_ef10_filter_insert_def(struct efx_nic *efx, - struct efx_ef10_filter_vlan *vlan, - enum efx_encap_type encap_type, - bool multicast, bool rollback) -{ - struct efx_ef10_nic_data *nic_data = efx->nic_data; - enum efx_filter_flags filter_flags; - struct efx_filter_spec spec; - u8 baddr[ETH_ALEN]; - int rc; - u16 *id; - - filter_flags = efx_rss_active(&efx->rss_context) ? EFX_FILTER_FLAG_RX_RSS : 0; - - efx_filter_init_rx(&spec, EFX_FILTER_PRI_AUTO, filter_flags, 0); - - if (multicast) - efx_filter_set_mc_def(&spec); - else - efx_filter_set_uc_def(&spec); - - if (encap_type) { - if (nic_data->datapath_caps & - (1 << MC_CMD_GET_CAPABILITIES_OUT_VXLAN_NVGRE_LBN)) - efx_filter_set_encap_type(&spec, encap_type); - else - /* don't insert encap filters on non-supporting - * platforms. ID will be left as INVALID. - */ - return 0; - } - - if (vlan->vid != EFX_FILTER_VID_UNSPEC) - efx_filter_set_eth_local(&spec, vlan->vid, NULL); - - rc = efx_ef10_filter_insert_locked(efx, &spec, true); - if (rc < 0) { - const char *um = multicast ? "Multicast" : "Unicast"; - const char *encap_name = ""; - const char *encap_ipv = ""; - - if ((encap_type & EFX_ENCAP_TYPES_MASK) == - EFX_ENCAP_TYPE_VXLAN) - encap_name = "VXLAN "; - else if ((encap_type & EFX_ENCAP_TYPES_MASK) == - EFX_ENCAP_TYPE_NVGRE) - encap_name = "NVGRE "; - else if ((encap_type & EFX_ENCAP_TYPES_MASK) == - EFX_ENCAP_TYPE_GENEVE) - encap_name = "GENEVE "; - if (encap_type & EFX_ENCAP_FLAG_IPV6) - encap_ipv = "IPv6 "; - else if (encap_type) - encap_ipv = "IPv4 "; - - /* unprivileged functions can't insert mismatch filters - * for encapsulated or unicast traffic, so downgrade - * those warnings to debug. - */ - netif_cond_dbg(efx, drv, efx->net_dev, - rc == -EPERM && (encap_type || !multicast), warn, - "%s%s%s mismatch filter insert failed rc=%d\n", - encap_name, encap_ipv, um, rc); - } else if (multicast) { - /* mapping from encap types to default filter IDs (multicast) */ - static enum efx_ef10_default_filters map[] = { - [EFX_ENCAP_TYPE_NONE] = EFX_EF10_MCDEF, - [EFX_ENCAP_TYPE_VXLAN] = EFX_EF10_VXLAN4_MCDEF, - [EFX_ENCAP_TYPE_NVGRE] = EFX_EF10_NVGRE4_MCDEF, - [EFX_ENCAP_TYPE_GENEVE] = EFX_EF10_GENEVE4_MCDEF, - [EFX_ENCAP_TYPE_VXLAN | EFX_ENCAP_FLAG_IPV6] = - EFX_EF10_VXLAN6_MCDEF, - [EFX_ENCAP_TYPE_NVGRE | EFX_ENCAP_FLAG_IPV6] = - EFX_EF10_NVGRE6_MCDEF, - [EFX_ENCAP_TYPE_GENEVE | EFX_ENCAP_FLAG_IPV6] = - EFX_EF10_GENEVE6_MCDEF, - }; - - /* quick bounds check (BCAST result impossible) */ - BUILD_BUG_ON(EFX_EF10_BCAST != 0); - if (encap_type >= ARRAY_SIZE(map) || map[encap_type] == 0) { - WARN_ON(1); - return -EINVAL; - } - /* then follow map */ - id = &vlan->default_filters[map[encap_type]]; - - EFX_WARN_ON_PARANOID(*id != EFX_EF10_FILTER_ID_INVALID); - *id = efx_ef10_filter_get_unsafe_id(rc); - if (!nic_data->workaround_26807 && !encap_type) { - /* Also need an Ethernet broadcast filter */ - efx_filter_init_rx(&spec, EFX_FILTER_PRI_AUTO, - filter_flags, 0); - eth_broadcast_addr(baddr); - efx_filter_set_eth_local(&spec, vlan->vid, baddr); - rc = efx_ef10_filter_insert_locked(efx, &spec, true); - if (rc < 0) { - netif_warn(efx, drv, efx->net_dev, - "Broadcast filter insert failed rc=%d\n", - rc); - if (rollback) { - /* Roll back the mc_def filter */ - efx_ef10_filter_remove_unsafe( - efx, EFX_FILTER_PRI_AUTO, - *id); - *id = EFX_EF10_FILTER_ID_INVALID; - return rc; - } - } else { - EFX_WARN_ON_PARANOID( - vlan->default_filters[EFX_EF10_BCAST] != - EFX_EF10_FILTER_ID_INVALID); - vlan->default_filters[EFX_EF10_BCAST] = - efx_ef10_filter_get_unsafe_id(rc); - } - } - rc = 0; - } else { - /* mapping from encap types to default filter IDs (unicast) */ - static enum efx_ef10_default_filters map[] = { - [EFX_ENCAP_TYPE_NONE] = EFX_EF10_UCDEF, - [EFX_ENCAP_TYPE_VXLAN] = EFX_EF10_VXLAN4_UCDEF, - [EFX_ENCAP_TYPE_NVGRE] = EFX_EF10_NVGRE4_UCDEF, - [EFX_ENCAP_TYPE_GENEVE] = EFX_EF10_GENEVE4_UCDEF, - [EFX_ENCAP_TYPE_VXLAN | EFX_ENCAP_FLAG_IPV6] = - EFX_EF10_VXLAN6_UCDEF, - [EFX_ENCAP_TYPE_NVGRE | EFX_ENCAP_FLAG_IPV6] = - EFX_EF10_NVGRE6_UCDEF, - [EFX_ENCAP_TYPE_GENEVE | EFX_ENCAP_FLAG_IPV6] = - EFX_EF10_GENEVE6_UCDEF, - }; - - /* quick bounds check (BCAST result impossible) */ - BUILD_BUG_ON(EFX_EF10_BCAST != 0); - if (encap_type >= ARRAY_SIZE(map) || map[encap_type] == 0) { - WARN_ON(1); - return -EINVAL; - } - /* then follow map */ - id = &vlan->default_filters[map[encap_type]]; - EFX_WARN_ON_PARANOID(*id != EFX_EF10_FILTER_ID_INVALID); - *id = rc; - rc = 0; - } - return rc; -} - -/* Remove filters that weren't renewed. */ -static void efx_ef10_filter_remove_old(struct efx_nic *efx) -{ - struct efx_ef10_filter_table *table = efx->filter_state; - int remove_failed = 0; - int remove_noent = 0; - int rc; - int i; - - down_write(&table->lock); - for (i = 0; i < HUNT_FILTER_TBL_ROWS; i++) { - if (READ_ONCE(table->entry[i].spec) & - EFX_EF10_FILTER_FLAG_AUTO_OLD) { - rc = efx_ef10_filter_remove_internal(efx, - 1U << EFX_FILTER_PRI_AUTO, i, true); - if (rc == -ENOENT) - remove_noent++; - else if (rc) - remove_failed++; - } - } - up_write(&table->lock); - - if (remove_failed) - netif_info(efx, drv, efx->net_dev, - "%s: failed to remove %d filters\n", - __func__, remove_failed); - if (remove_noent) - netif_info(efx, drv, efx->net_dev, - "%s: failed to remove %d non-existent filters\n", - __func__, remove_noent); -} - static int efx_ef10_vport_set_mac_address(struct efx_nic *efx) { struct efx_ef10_nic_data *nic_data = efx->nic_data; @@ -5229,7 +3153,7 @@ static int efx_ef10_vport_set_mac_address(struct efx_nic *efx) efx_device_detach_sync(efx); efx_net_stop(efx->net_dev); down_write(&efx->filter_sem); - efx_ef10_filter_table_remove(efx); + efx_mcdi_filter_table_remove(efx); up_write(&efx->filter_sem); rc = efx_ef10_vadaptor_free(efx, nic_data->vport_id); @@ -5261,7 +3185,7 @@ restore_vadaptor: goto reset_nic; restore_filters: down_write(&efx->filter_sem); - rc2 = efx_ef10_filter_table_probe(efx); + rc2 = efx_mcdi_filter_table_probe(efx); up_write(&efx->filter_sem); if (rc2) goto reset_nic; @@ -5282,256 +3206,6 @@ reset_nic: return rc ? rc : rc2; } -/* Caller must hold efx->filter_sem for read if race against - * efx_ef10_filter_table_remove() is possible - */ -static void efx_ef10_filter_vlan_sync_rx_mode(struct efx_nic *efx, - struct efx_ef10_filter_vlan *vlan) -{ - struct efx_ef10_filter_table *table = efx->filter_state; - struct efx_ef10_nic_data *nic_data = efx->nic_data; - - /* Do not install unspecified VID if VLAN filtering is enabled. - * Do not install all specified VIDs if VLAN filtering is disabled. - */ - if ((vlan->vid == EFX_FILTER_VID_UNSPEC) == table->vlan_filter) - return; - - /* Insert/renew unicast filters */ - if (table->uc_promisc) { - efx_ef10_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_NONE, - false, false); - efx_ef10_filter_insert_addr_list(efx, vlan, false, false); - } else { - /* If any of the filters failed to insert, fall back to - * promiscuous mode - add in the uc_def filter. But keep - * our individual unicast filters. - */ - if (efx_ef10_filter_insert_addr_list(efx, vlan, false, false)) - efx_ef10_filter_insert_def(efx, vlan, - EFX_ENCAP_TYPE_NONE, - false, false); - } - efx_ef10_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_VXLAN, - false, false); - efx_ef10_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_VXLAN | - EFX_ENCAP_FLAG_IPV6, - false, false); - efx_ef10_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_NVGRE, - false, false); - efx_ef10_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_NVGRE | - EFX_ENCAP_FLAG_IPV6, - false, false); - efx_ef10_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_GENEVE, - false, false); - efx_ef10_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_GENEVE | - EFX_ENCAP_FLAG_IPV6, - false, false); - - /* Insert/renew multicast filters */ - /* If changing promiscuous state with cascaded multicast filters, remove - * old filters first, so that packets are dropped rather than duplicated - */ - if (nic_data->workaround_26807 && - table->mc_promisc_last != table->mc_promisc) - efx_ef10_filter_remove_old(efx); - if (table->mc_promisc) { - if (nic_data->workaround_26807) { - /* If we failed to insert promiscuous filters, rollback - * and fall back to individual multicast filters - */ - if (efx_ef10_filter_insert_def(efx, vlan, - EFX_ENCAP_TYPE_NONE, - true, true)) { - /* Changing promisc state, so remove old filters */ - efx_ef10_filter_remove_old(efx); - efx_ef10_filter_insert_addr_list(efx, vlan, - true, false); - } - } else { - /* If we failed to insert promiscuous filters, don't - * rollback. Regardless, also insert the mc_list, - * unless it's incomplete due to overflow - */ - efx_ef10_filter_insert_def(efx, vlan, - EFX_ENCAP_TYPE_NONE, - true, false); - if (!table->mc_overflow) - efx_ef10_filter_insert_addr_list(efx, vlan, - true, false); - } - } else { - /* If any filters failed to insert, rollback and fall back to - * promiscuous mode - mc_def filter and maybe broadcast. If - * that fails, roll back again and insert as many of our - * individual multicast filters as we can. - */ - if (efx_ef10_filter_insert_addr_list(efx, vlan, true, true)) { - /* Changing promisc state, so remove old filters */ - if (nic_data->workaround_26807) - efx_ef10_filter_remove_old(efx); - if (efx_ef10_filter_insert_def(efx, vlan, - EFX_ENCAP_TYPE_NONE, - true, true)) - efx_ef10_filter_insert_addr_list(efx, vlan, - true, false); - } - } - efx_ef10_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_VXLAN, - true, false); - efx_ef10_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_VXLAN | - EFX_ENCAP_FLAG_IPV6, - true, false); - efx_ef10_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_NVGRE, - true, false); - efx_ef10_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_NVGRE | - EFX_ENCAP_FLAG_IPV6, - true, false); - efx_ef10_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_GENEVE, - true, false); - efx_ef10_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_GENEVE | - EFX_ENCAP_FLAG_IPV6, - true, false); -} - -/* Caller must hold efx->filter_sem for read if race against - * efx_ef10_filter_table_remove() is possible - */ -static void efx_ef10_filter_sync_rx_mode(struct efx_nic *efx) -{ - struct efx_ef10_filter_table *table = efx->filter_state; - struct net_device *net_dev = efx->net_dev; - struct efx_ef10_filter_vlan *vlan; - bool vlan_filter; - - if (!efx_dev_registered(efx)) - return; - - if (!table) - return; - - efx_ef10_filter_mark_old(efx); - - /* Copy/convert the address lists; add the primary station - * address and broadcast address - */ - netif_addr_lock_bh(net_dev); - efx_ef10_filter_uc_addr_list(efx); - efx_ef10_filter_mc_addr_list(efx); - netif_addr_unlock_bh(net_dev); - - /* If VLAN filtering changes, all old filters are finally removed. - * Do it in advance to avoid conflicts for unicast untagged and - * VLAN 0 tagged filters. - */ - vlan_filter = !!(net_dev->features & NETIF_F_HW_VLAN_CTAG_FILTER); - if (table->vlan_filter != vlan_filter) { - table->vlan_filter = vlan_filter; - efx_ef10_filter_remove_old(efx); - } - - list_for_each_entry(vlan, &table->vlan_list, list) - efx_ef10_filter_vlan_sync_rx_mode(efx, vlan); - - efx_ef10_filter_remove_old(efx); - table->mc_promisc_last = table->mc_promisc; -} - -static struct efx_ef10_filter_vlan *efx_ef10_filter_find_vlan(struct efx_nic *efx, u16 vid) -{ - struct efx_ef10_filter_table *table = efx->filter_state; - struct efx_ef10_filter_vlan *vlan; - - WARN_ON(!rwsem_is_locked(&efx->filter_sem)); - - list_for_each_entry(vlan, &table->vlan_list, list) { - if (vlan->vid == vid) - return vlan; - } - - return NULL; -} - -static int efx_ef10_filter_add_vlan(struct efx_nic *efx, u16 vid) -{ - struct efx_ef10_filter_table *table = efx->filter_state; - struct efx_ef10_filter_vlan *vlan; - unsigned int i; - - if (!efx_rwsem_assert_write_locked(&efx->filter_sem)) - return -EINVAL; - - vlan = efx_ef10_filter_find_vlan(efx, vid); - if (WARN_ON(vlan)) { - netif_err(efx, drv, efx->net_dev, - "VLAN %u already added\n", vid); - return -EALREADY; - } - - vlan = kzalloc(sizeof(*vlan), GFP_KERNEL); - if (!vlan) - return -ENOMEM; - - vlan->vid = vid; - - for (i = 0; i < ARRAY_SIZE(vlan->uc); i++) - vlan->uc[i] = EFX_EF10_FILTER_ID_INVALID; - for (i = 0; i < ARRAY_SIZE(vlan->mc); i++) - vlan->mc[i] = EFX_EF10_FILTER_ID_INVALID; - for (i = 0; i < EFX_EF10_NUM_DEFAULT_FILTERS; i++) - vlan->default_filters[i] = EFX_EF10_FILTER_ID_INVALID; - - list_add_tail(&vlan->list, &table->vlan_list); - - if (efx_dev_registered(efx)) - efx_ef10_filter_vlan_sync_rx_mode(efx, vlan); - - return 0; -} - -static void efx_ef10_filter_del_vlan_internal(struct efx_nic *efx, - struct efx_ef10_filter_vlan *vlan) -{ - unsigned int i; - - /* See comment in efx_ef10_filter_table_remove() */ - if (!efx_rwsem_assert_write_locked(&efx->filter_sem)) - return; - - list_del(&vlan->list); - - for (i = 0; i < ARRAY_SIZE(vlan->uc); i++) - efx_ef10_filter_remove_unsafe(efx, EFX_FILTER_PRI_AUTO, - vlan->uc[i]); - for (i = 0; i < ARRAY_SIZE(vlan->mc); i++) - efx_ef10_filter_remove_unsafe(efx, EFX_FILTER_PRI_AUTO, - vlan->mc[i]); - for (i = 0; i < EFX_EF10_NUM_DEFAULT_FILTERS; i++) - if (vlan->default_filters[i] != EFX_EF10_FILTER_ID_INVALID) - efx_ef10_filter_remove_unsafe(efx, EFX_FILTER_PRI_AUTO, - vlan->default_filters[i]); - - kfree(vlan); -} - -static void efx_ef10_filter_del_vlan(struct efx_nic *efx, u16 vid) -{ - struct efx_ef10_filter_vlan *vlan; - - /* See comment in efx_ef10_filter_table_remove() */ - if (!efx_rwsem_assert_write_locked(&efx->filter_sem)) - return; - - vlan = efx_ef10_filter_find_vlan(efx, vid); - if (!vlan) { - netif_err(efx, drv, efx->net_dev, - "VLAN %u not found in filter state\n", vid); - return; - } - - efx_ef10_filter_del_vlan_internal(efx, vlan); -} - static int efx_ef10_set_mac_address(struct efx_nic *efx) { MCDI_DECLARE_BUF(inbuf, MC_CMD_VADAPTOR_SET_MAC_IN_LEN); @@ -5544,7 +3218,7 @@ static int efx_ef10_set_mac_address(struct efx_nic *efx) mutex_lock(&efx->mac_lock); down_write(&efx->filter_sem); - efx_ef10_filter_table_remove(efx); + efx_mcdi_filter_table_remove(efx); ether_addr_copy(MCDI_PTR(inbuf, VADAPTOR_SET_MAC_IN_MACADDR), efx->net_dev->dev_addr); @@ -5553,7 +3227,7 @@ static int efx_ef10_set_mac_address(struct efx_nic *efx) rc = efx_mcdi_rpc_quiet(efx, MC_CMD_VADAPTOR_SET_MAC, inbuf, sizeof(inbuf), NULL, 0, NULL); - efx_ef10_filter_table_probe(efx); + efx_mcdi_filter_table_probe(efx); up_write(&efx->filter_sem); mutex_unlock(&efx->mac_lock); @@ -5615,14 +3289,14 @@ static int efx_ef10_set_mac_address(struct efx_nic *efx) static int efx_ef10_mac_reconfigure(struct efx_nic *efx) { - efx_ef10_filter_sync_rx_mode(efx); + efx_mcdi_filter_sync_rx_mode(efx); return efx_mcdi_set_mac(efx); } static int efx_ef10_mac_reconfigure_vf(struct efx_nic *efx) { - efx_ef10_filter_sync_rx_mode(efx); + efx_mcdi_filter_sync_rx_mode(efx); return 0; } @@ -6337,8 +4011,8 @@ const struct efx_nic_type efx_hunt_a0_vf_nic_type = { .tx_remove = efx_mcdi_tx_remove, .tx_write = efx_ef10_tx_write, .tx_limit_len = efx_ef10_tx_limit_len, - .rx_push_rss_config = efx_ef10_vf_rx_push_rss_config, - .rx_pull_rss_config = efx_ef10_rx_pull_rss_config, + .rx_push_rss_config = efx_mcdi_vf_rx_push_rss_config, + .rx_pull_rss_config = efx_mcdi_rx_pull_rss_config, .rx_probe = efx_mcdi_rx_probe, .rx_init = efx_mcdi_rx_init, .rx_remove = efx_mcdi_rx_remove, @@ -6351,19 +4025,19 @@ const struct efx_nic_type efx_hunt_a0_vf_nic_type = { .ev_process = efx_ef10_ev_process, .ev_read_ack = efx_ef10_ev_read_ack, .ev_test_generate = efx_ef10_ev_test_generate, - .filter_table_probe = efx_ef10_filter_table_probe, - .filter_table_restore = efx_ef10_filter_table_restore, - .filter_table_remove = efx_ef10_filter_table_remove, - .filter_update_rx_scatter = efx_ef10_filter_update_rx_scatter, - .filter_insert = efx_ef10_filter_insert, - .filter_remove_safe = efx_ef10_filter_remove_safe, - .filter_get_safe = efx_ef10_filter_get_safe, - .filter_clear_rx = efx_ef10_filter_clear_rx, - .filter_count_rx_used = efx_ef10_filter_count_rx_used, - .filter_get_rx_id_limit = efx_ef10_filter_get_rx_id_limit, - .filter_get_rx_ids = efx_ef10_filter_get_rx_ids, + .filter_table_probe = efx_mcdi_filter_table_probe, + .filter_table_restore = efx_mcdi_filter_table_restore, + .filter_table_remove = efx_mcdi_filter_table_remove, + .filter_update_rx_scatter = efx_mcdi_update_rx_scatter, + .filter_insert = efx_mcdi_filter_insert, + .filter_remove_safe = efx_mcdi_filter_remove_safe, + .filter_get_safe = efx_mcdi_filter_get_safe, + .filter_clear_rx = efx_mcdi_filter_clear_rx, + .filter_count_rx_used = efx_mcdi_filter_count_rx_used, + .filter_get_rx_id_limit = efx_mcdi_filter_get_rx_id_limit, + .filter_get_rx_ids = efx_mcdi_filter_get_rx_ids, #ifdef CONFIG_RFS_ACCEL - .filter_rfs_expire_one = efx_ef10_filter_rfs_expire_one, + .filter_rfs_expire_one = efx_mcdi_filter_rfs_expire_one, #endif #ifdef CONFIG_SFC_MTD .mtd_probe = efx_port_dummy_op_int, @@ -6393,7 +4067,7 @@ const struct efx_nic_type efx_hunt_a0_vf_nic_type = { .timer_period_max = 1 << ERF_DD_EVQ_IND_TIMER_VAL_WIDTH, .offload_features = EF10_OFFLOAD_FEATURES, .mcdi_max_ver = 2, - .max_rx_ip_filters = HUNT_FILTER_TBL_ROWS, + .max_rx_ip_filters = EFX_MCDI_FILTER_TBL_ROWS, .hwtstamp_filters = 1 << HWTSTAMP_FILTER_NONE | 1 << HWTSTAMP_FILTER_ALL, .rx_hash_key_size = 40, @@ -6446,11 +4120,11 @@ const struct efx_nic_type efx_hunt_a0_nic_type = { .tx_remove = efx_mcdi_tx_remove, .tx_write = efx_ef10_tx_write, .tx_limit_len = efx_ef10_tx_limit_len, - .rx_push_rss_config = efx_ef10_pf_rx_push_rss_config, - .rx_pull_rss_config = efx_ef10_rx_pull_rss_config, - .rx_push_rss_context_config = efx_ef10_rx_push_rss_context_config, - .rx_pull_rss_context_config = efx_ef10_rx_pull_rss_context_config, - .rx_restore_rss_contexts = efx_ef10_rx_restore_rss_contexts, + .rx_push_rss_config = efx_mcdi_pf_rx_push_rss_config, + .rx_pull_rss_config = efx_mcdi_rx_pull_rss_config, + .rx_push_rss_context_config = efx_mcdi_rx_push_rss_context_config, + .rx_pull_rss_context_config = efx_mcdi_rx_pull_rss_context_config, + .rx_restore_rss_contexts = efx_mcdi_rx_restore_rss_contexts, .rx_probe = efx_mcdi_rx_probe, .rx_init = efx_mcdi_rx_init, .rx_remove = efx_mcdi_rx_remove, @@ -6463,19 +4137,19 @@ const struct efx_nic_type efx_hunt_a0_nic_type = { .ev_process = efx_ef10_ev_process, .ev_read_ack = efx_ef10_ev_read_ack, .ev_test_generate = efx_ef10_ev_test_generate, - .filter_table_probe = efx_ef10_filter_table_probe, - .filter_table_restore = efx_ef10_filter_table_restore, - .filter_table_remove = efx_ef10_filter_table_remove, - .filter_update_rx_scatter = efx_ef10_filter_update_rx_scatter, - .filter_insert = efx_ef10_filter_insert, - .filter_remove_safe = efx_ef10_filter_remove_safe, - .filter_get_safe = efx_ef10_filter_get_safe, - .filter_clear_rx = efx_ef10_filter_clear_rx, - .filter_count_rx_used = efx_ef10_filter_count_rx_used, - .filter_get_rx_id_limit = efx_ef10_filter_get_rx_id_limit, - .filter_get_rx_ids = efx_ef10_filter_get_rx_ids, + .filter_table_probe = efx_mcdi_filter_table_probe, + .filter_table_restore = efx_mcdi_filter_table_restore, + .filter_table_remove = efx_mcdi_filter_table_remove, + .filter_update_rx_scatter = efx_mcdi_update_rx_scatter, + .filter_insert = efx_mcdi_filter_insert, + .filter_remove_safe = efx_mcdi_filter_remove_safe, + .filter_get_safe = efx_mcdi_filter_get_safe, + .filter_clear_rx = efx_mcdi_filter_clear_rx, + .filter_count_rx_used = efx_mcdi_filter_count_rx_used, + .filter_get_rx_id_limit = efx_mcdi_filter_get_rx_id_limit, + .filter_get_rx_ids = efx_mcdi_filter_get_rx_ids, #ifdef CONFIG_RFS_ACCEL - .filter_rfs_expire_one = efx_ef10_filter_rfs_expire_one, + .filter_rfs_expire_one = efx_mcdi_filter_rfs_expire_one, #endif #ifdef CONFIG_SFC_MTD .mtd_probe = efx_ef10_mtd_probe, @@ -6528,7 +4202,7 @@ const struct efx_nic_type efx_hunt_a0_nic_type = { .timer_period_max = 1 << ERF_DD_EVQ_IND_TIMER_VAL_WIDTH, .offload_features = EF10_OFFLOAD_FEATURES, .mcdi_max_ver = 2, - .max_rx_ip_filters = HUNT_FILTER_TBL_ROWS, + .max_rx_ip_filters = EFX_MCDI_FILTER_TBL_ROWS, .hwtstamp_filters = 1 << HWTSTAMP_FILTER_NONE | 1 << HWTSTAMP_FILTER_ALL, .rx_hash_key_size = 40, diff --git a/drivers/net/ethernet/sfc/mcdi_filters.c b/drivers/net/ethernet/sfc/mcdi_filters.c new file mode 100644 index 000000000000..4310ae5bd898 --- /dev/null +++ b/drivers/net/ethernet/sfc/mcdi_filters.c @@ -0,0 +1,2270 @@ +#include "mcdi_filters.h" +#include "mcdi.h" +#include "nic.h" +#include "rx_common.h" + +/* The maximum size of a shared RSS context */ +/* TODO: this should really be from the mcdi protocol export */ +#define EFX_EF10_MAX_SHARED_RSS_CONTEXT_SIZE 64UL + +#define EFX_EF10_FILTER_ID_INVALID 0xffff + +/* An arbitrary search limit for the software hash table */ +#define EFX_EF10_FILTER_SEARCH_LIMIT 200 + +static struct efx_filter_spec * +efx_mcdi_filter_entry_spec(const struct efx_mcdi_filter_table *table, + unsigned int filter_idx) +{ + return (struct efx_filter_spec *)(table->entry[filter_idx].spec & + ~EFX_EF10_FILTER_FLAGS); +} + +static unsigned int +efx_mcdi_filter_entry_flags(const struct efx_mcdi_filter_table *table, + unsigned int filter_idx) +{ + return table->entry[filter_idx].spec & EFX_EF10_FILTER_FLAGS; +} + +static u32 efx_mcdi_filter_get_unsafe_id(u32 filter_id) +{ + WARN_ON_ONCE(filter_id == EFX_EF10_FILTER_ID_INVALID); + return filter_id & (EFX_MCDI_FILTER_TBL_ROWS - 1); +} + +static unsigned int efx_mcdi_filter_get_unsafe_pri(u32 filter_id) +{ + return filter_id / (EFX_MCDI_FILTER_TBL_ROWS * 2); +} + +static u32 efx_mcdi_filter_make_filter_id(unsigned int pri, u16 idx) +{ + return pri * EFX_MCDI_FILTER_TBL_ROWS * 2 + idx; +} + +/* + * Decide whether a filter should be exclusive or else should allow + * delivery to additional recipients. Currently we decide that + * filters for specific local unicast MAC and IP addresses are + * exclusive. + */ +static bool efx_mcdi_filter_is_exclusive(const struct efx_filter_spec *spec) +{ + if (spec->match_flags & EFX_FILTER_MATCH_LOC_MAC && + !is_multicast_ether_addr(spec->loc_mac)) + return true; + + if ((spec->match_flags & + (EFX_FILTER_MATCH_ETHER_TYPE | EFX_FILTER_MATCH_LOC_HOST)) == + (EFX_FILTER_MATCH_ETHER_TYPE | EFX_FILTER_MATCH_LOC_HOST)) { + if (spec->ether_type == htons(ETH_P_IP) && + !ipv4_is_multicast(spec->loc_host[0])) + return true; + if (spec->ether_type == htons(ETH_P_IPV6) && + ((const u8 *)spec->loc_host)[0] != 0xff) + return true; + } + + return false; +} + +static void +efx_mcdi_filter_set_entry(struct efx_mcdi_filter_table *table, + unsigned int filter_idx, + const struct efx_filter_spec *spec, + unsigned int flags) +{ + table->entry[filter_idx].spec = (unsigned long)spec | flags; +} + +static void +efx_mcdi_filter_push_prep_set_match_fields(struct efx_nic *efx, + const struct efx_filter_spec *spec, + efx_dword_t *inbuf) +{ + enum efx_encap_type encap_type = efx_filter_get_encap_type(spec); + u32 match_fields = 0, uc_match, mc_match; + + MCDI_SET_DWORD(inbuf, FILTER_OP_IN_OP, + efx_mcdi_filter_is_exclusive(spec) ? + MC_CMD_FILTER_OP_IN_OP_INSERT : + MC_CMD_FILTER_OP_IN_OP_SUBSCRIBE); + + /* + * Convert match flags and values. Unlike almost + * everything else in MCDI, these fields are in + * network byte order. + */ +#define COPY_VALUE(value, mcdi_field) \ + do { \ + match_fields |= \ + 1 << MC_CMD_FILTER_OP_IN_MATCH_ ## \ + mcdi_field ## _LBN; \ + BUILD_BUG_ON( \ + MC_CMD_FILTER_OP_IN_ ## mcdi_field ## _LEN < \ + sizeof(value)); \ + memcpy(MCDI_PTR(inbuf, FILTER_OP_IN_ ## mcdi_field), \ + &value, sizeof(value)); \ + } while (0) +#define COPY_FIELD(gen_flag, gen_field, mcdi_field) \ + if (spec->match_flags & EFX_FILTER_MATCH_ ## gen_flag) { \ + COPY_VALUE(spec->gen_field, mcdi_field); \ + } + /* + * Handle encap filters first. They will always be mismatch + * (unknown UC or MC) filters + */ + if (encap_type) { + /* + * ether_type and outer_ip_proto need to be variables + * because COPY_VALUE wants to memcpy them + */ + __be16 ether_type = + htons(encap_type & EFX_ENCAP_FLAG_IPV6 ? + ETH_P_IPV6 : ETH_P_IP); + u8 vni_type = MC_CMD_FILTER_OP_EXT_IN_VNI_TYPE_GENEVE; + u8 outer_ip_proto; + + switch (encap_type & EFX_ENCAP_TYPES_MASK) { + case EFX_ENCAP_TYPE_VXLAN: + vni_type = MC_CMD_FILTER_OP_EXT_IN_VNI_TYPE_VXLAN; + /* fallthrough */ + case EFX_ENCAP_TYPE_GENEVE: + COPY_VALUE(ether_type, ETHER_TYPE); + outer_ip_proto = IPPROTO_UDP; + COPY_VALUE(outer_ip_proto, IP_PROTO); + /* + * We always need to set the type field, even + * though we're not matching on the TNI. + */ + MCDI_POPULATE_DWORD_1(inbuf, + FILTER_OP_EXT_IN_VNI_OR_VSID, + FILTER_OP_EXT_IN_VNI_TYPE, + vni_type); + break; + case EFX_ENCAP_TYPE_NVGRE: + COPY_VALUE(ether_type, ETHER_TYPE); + outer_ip_proto = IPPROTO_GRE; + COPY_VALUE(outer_ip_proto, IP_PROTO); + break; + default: + WARN_ON(1); + } + + uc_match = MC_CMD_FILTER_OP_EXT_IN_MATCH_IFRM_UNKNOWN_UCAST_DST_LBN; + mc_match = MC_CMD_FILTER_OP_EXT_IN_MATCH_IFRM_UNKNOWN_MCAST_DST_LBN; + } else { + uc_match = MC_CMD_FILTER_OP_EXT_IN_MATCH_UNKNOWN_UCAST_DST_LBN; + mc_match = MC_CMD_FILTER_OP_EXT_IN_MATCH_UNKNOWN_MCAST_DST_LBN; + } + + if (spec->match_flags & EFX_FILTER_MATCH_LOC_MAC_IG) + match_fields |= + is_multicast_ether_addr(spec->loc_mac) ? + 1 << mc_match : + 1 << uc_match; + COPY_FIELD(REM_HOST, rem_host, SRC_IP); + COPY_FIELD(LOC_HOST, loc_host, DST_IP); + COPY_FIELD(REM_MAC, rem_mac, SRC_MAC); + COPY_FIELD(REM_PORT, rem_port, SRC_PORT); + COPY_FIELD(LOC_MAC, loc_mac, DST_MAC); + COPY_FIELD(LOC_PORT, loc_port, DST_PORT); + COPY_FIELD(ETHER_TYPE, ether_type, ETHER_TYPE); + COPY_FIELD(INNER_VID, inner_vid, INNER_VLAN); + COPY_FIELD(OUTER_VID, outer_vid, OUTER_VLAN); + COPY_FIELD(IP_PROTO, ip_proto, IP_PROTO); +#undef COPY_FIELD +#undef COPY_VALUE + MCDI_SET_DWORD(inbuf, FILTER_OP_IN_MATCH_FIELDS, + match_fields); +} + +static void efx_mcdi_filter_push_prep(struct efx_nic *efx, + const struct efx_filter_spec *spec, + efx_dword_t *inbuf, u64 handle, + struct efx_rss_context *ctx, + bool replacing) +{ + struct efx_ef10_nic_data *nic_data = efx->nic_data; + u32 flags = spec->flags; + + memset(inbuf, 0, MC_CMD_FILTER_OP_EXT_IN_LEN); + + /* If RSS filter, caller better have given us an RSS context */ + if (flags & EFX_FILTER_FLAG_RX_RSS) { + /* + * We don't have the ability to return an error, so we'll just + * log a warning and disable RSS for the filter. + */ + if (WARN_ON_ONCE(!ctx)) + flags &= ~EFX_FILTER_FLAG_RX_RSS; + else if (WARN_ON_ONCE(ctx->context_id == EFX_MCDI_RSS_CONTEXT_INVALID)) + flags &= ~EFX_FILTER_FLAG_RX_RSS; + } + + if (replacing) { + MCDI_SET_DWORD(inbuf, FILTER_OP_IN_OP, + MC_CMD_FILTER_OP_IN_OP_REPLACE); + MCDI_SET_QWORD(inbuf, FILTER_OP_IN_HANDLE, handle); + } else { + efx_mcdi_filter_push_prep_set_match_fields(efx, spec, inbuf); + } + + MCDI_SET_DWORD(inbuf, FILTER_OP_IN_PORT_ID, nic_data->vport_id); + MCDI_SET_DWORD(inbuf, FILTER_OP_IN_RX_DEST, + spec->dmaq_id == EFX_FILTER_RX_DMAQ_ID_DROP ? + MC_CMD_FILTER_OP_IN_RX_DEST_DROP : + MC_CMD_FILTER_OP_IN_RX_DEST_HOST); + MCDI_SET_DWORD(inbuf, FILTER_OP_IN_TX_DOMAIN, 0); + MCDI_SET_DWORD(inbuf, FILTER_OP_IN_TX_DEST, + MC_CMD_FILTER_OP_IN_TX_DEST_DEFAULT); + MCDI_SET_DWORD(inbuf, FILTER_OP_IN_RX_QUEUE, + spec->dmaq_id == EFX_FILTER_RX_DMAQ_ID_DROP ? + 0 : spec->dmaq_id); + MCDI_SET_DWORD(inbuf, FILTER_OP_IN_RX_MODE, + (flags & EFX_FILTER_FLAG_RX_RSS) ? + MC_CMD_FILTER_OP_IN_RX_MODE_RSS : + MC_CMD_FILTER_OP_IN_RX_MODE_SIMPLE); + if (flags & EFX_FILTER_FLAG_RX_RSS) + MCDI_SET_DWORD(inbuf, FILTER_OP_IN_RX_CONTEXT, ctx->context_id); +} + +static int efx_mcdi_filter_push(struct efx_nic *efx, + const struct efx_filter_spec *spec, u64 *handle, + struct efx_rss_context *ctx, bool replacing) +{ + MCDI_DECLARE_BUF(inbuf, MC_CMD_FILTER_OP_EXT_IN_LEN); + MCDI_DECLARE_BUF(outbuf, MC_CMD_FILTER_OP_EXT_OUT_LEN); + size_t outlen; + int rc; + + efx_mcdi_filter_push_prep(efx, spec, inbuf, *handle, ctx, replacing); + rc = efx_mcdi_rpc_quiet(efx, MC_CMD_FILTER_OP, inbuf, sizeof(inbuf), + outbuf, sizeof(outbuf), &outlen); + if (rc && spec->priority != EFX_FILTER_PRI_HINT) + efx_mcdi_display_error(efx, MC_CMD_FILTER_OP, sizeof(inbuf), + outbuf, outlen, rc); + if (rc == 0) + *handle = MCDI_QWORD(outbuf, FILTER_OP_OUT_HANDLE); + if (rc == -ENOSPC) + rc = -EBUSY; /* to match efx_farch_filter_insert() */ + return rc; +} + +static u32 efx_mcdi_filter_mcdi_flags_from_spec(const struct efx_filter_spec *spec) +{ + enum efx_encap_type encap_type = efx_filter_get_encap_type(spec); + unsigned int match_flags = spec->match_flags; + unsigned int uc_match, mc_match; + u32 mcdi_flags = 0; + +#define MAP_FILTER_TO_MCDI_FLAG(gen_flag, mcdi_field, encap) { \ + unsigned int old_match_flags = match_flags; \ + match_flags &= ~EFX_FILTER_MATCH_ ## gen_flag; \ + if (match_flags != old_match_flags) \ + mcdi_flags |= \ + (1 << ((encap) ? \ + MC_CMD_FILTER_OP_EXT_IN_MATCH_IFRM_ ## \ + mcdi_field ## _LBN : \ + MC_CMD_FILTER_OP_EXT_IN_MATCH_ ##\ + mcdi_field ## _LBN)); \ + } + /* inner or outer based on encap type */ + MAP_FILTER_TO_MCDI_FLAG(REM_HOST, SRC_IP, encap_type); + MAP_FILTER_TO_MCDI_FLAG(LOC_HOST, DST_IP, encap_type); + MAP_FILTER_TO_MCDI_FLAG(REM_MAC, SRC_MAC, encap_type); + MAP_FILTER_TO_MCDI_FLAG(REM_PORT, SRC_PORT, encap_type); + MAP_FILTER_TO_MCDI_FLAG(LOC_MAC, DST_MAC, encap_type); + MAP_FILTER_TO_MCDI_FLAG(LOC_PORT, DST_PORT, encap_type); + MAP_FILTER_TO_MCDI_FLAG(ETHER_TYPE, ETHER_TYPE, encap_type); + MAP_FILTER_TO_MCDI_FLAG(IP_PROTO, IP_PROTO, encap_type); + /* always outer */ + MAP_FILTER_TO_MCDI_FLAG(INNER_VID, INNER_VLAN, false); + MAP_FILTER_TO_MCDI_FLAG(OUTER_VID, OUTER_VLAN, false); +#undef MAP_FILTER_TO_MCDI_FLAG + + /* special handling for encap type, and mismatch */ + if (encap_type) { + match_flags &= ~EFX_FILTER_MATCH_ENCAP_TYPE; + mcdi_flags |= + (1 << MC_CMD_FILTER_OP_EXT_IN_MATCH_ETHER_TYPE_LBN); + mcdi_flags |= (1 << MC_CMD_FILTER_OP_EXT_IN_MATCH_IP_PROTO_LBN); + + uc_match = MC_CMD_FILTER_OP_EXT_IN_MATCH_IFRM_UNKNOWN_UCAST_DST_LBN; + mc_match = MC_CMD_FILTER_OP_EXT_IN_MATCH_IFRM_UNKNOWN_MCAST_DST_LBN; + } else { + uc_match = MC_CMD_FILTER_OP_EXT_IN_MATCH_UNKNOWN_UCAST_DST_LBN; + mc_match = MC_CMD_FILTER_OP_EXT_IN_MATCH_UNKNOWN_MCAST_DST_LBN; + } + + if (match_flags & EFX_FILTER_MATCH_LOC_MAC_IG) { + match_flags &= ~EFX_FILTER_MATCH_LOC_MAC_IG; + mcdi_flags |= + is_multicast_ether_addr(spec->loc_mac) ? + 1 << mc_match : + 1 << uc_match; + } + + /* Did we map them all? */ + WARN_ON_ONCE(match_flags); + + return mcdi_flags; +} + +static int efx_mcdi_filter_pri(struct efx_mcdi_filter_table *table, + const struct efx_filter_spec *spec) +{ + u32 mcdi_flags = efx_mcdi_filter_mcdi_flags_from_spec(spec); + unsigned int match_pri; + + for (match_pri = 0; + match_pri < table->rx_match_count; + match_pri++) + if (table->rx_match_mcdi_flags[match_pri] == mcdi_flags) + return match_pri; + + return -EPROTONOSUPPORT; +} + +static s32 efx_mcdi_filter_insert_locked(struct efx_nic *efx, + struct efx_filter_spec *spec, + bool replace_equal) +{ + DECLARE_BITMAP(mc_rem_map, EFX_EF10_FILTER_SEARCH_LIMIT); + struct efx_ef10_nic_data *nic_data = efx->nic_data; + struct efx_mcdi_filter_table *table; + struct efx_filter_spec *saved_spec; + struct efx_rss_context *ctx = NULL; + unsigned int match_pri, hash; + unsigned int priv_flags; + bool rss_locked = false; + bool replacing = false; + unsigned int depth, i; + int ins_index = -1; + DEFINE_WAIT(wait); + bool is_mc_recip; + s32 rc; + + WARN_ON(!rwsem_is_locked(&efx->filter_sem)); + table = efx->filter_state; + down_write(&table->lock); + + /* For now, only support RX filters */ + if ((spec->flags & (EFX_FILTER_FLAG_RX | EFX_FILTER_FLAG_TX)) != + EFX_FILTER_FLAG_RX) { + rc = -EINVAL; + goto out_unlock; + } + + rc = efx_mcdi_filter_pri(table, spec); + if (rc < 0) + goto out_unlock; + match_pri = rc; + + hash = efx_filter_spec_hash(spec); + is_mc_recip = efx_filter_is_mc_recipient(spec); + if (is_mc_recip) + bitmap_zero(mc_rem_map, EFX_EF10_FILTER_SEARCH_LIMIT); + + if (spec->flags & EFX_FILTER_FLAG_RX_RSS) { + mutex_lock(&efx->rss_lock); + rss_locked = true; + if (spec->rss_context) + ctx = efx_find_rss_context_entry(efx, spec->rss_context); + else + ctx = &efx->rss_context; + if (!ctx) { + rc = -ENOENT; + goto out_unlock; + } + if (ctx->context_id == EFX_MCDI_RSS_CONTEXT_INVALID) { + rc = -EOPNOTSUPP; + goto out_unlock; + } + } + + /* Find any existing filters with the same match tuple or + * else a free slot to insert at. + */ + for (depth = 1; depth < EFX_EF10_FILTER_SEARCH_LIMIT; depth++) { + i = (hash + depth) & (EFX_MCDI_FILTER_TBL_ROWS - 1); + saved_spec = efx_mcdi_filter_entry_spec(table, i); + + if (!saved_spec) { + if (ins_index < 0) + ins_index = i; + } else if (efx_filter_spec_equal(spec, saved_spec)) { + if (spec->priority < saved_spec->priority && + spec->priority != EFX_FILTER_PRI_AUTO) { + rc = -EPERM; + goto out_unlock; + } + if (!is_mc_recip) { + /* This is the only one */ + if (spec->priority == + saved_spec->priority && + !replace_equal) { + rc = -EEXIST; + goto out_unlock; + } + ins_index = i; + break; + } else if (spec->priority > + saved_spec->priority || + (spec->priority == + saved_spec->priority && + replace_equal)) { + if (ins_index < 0) + ins_index = i; + else + __set_bit(depth, mc_rem_map); + } + } + } + + /* Once we reach the maximum search depth, use the first suitable + * slot, or return -EBUSY if there was none + */ + if (ins_index < 0) { + rc = -EBUSY; + goto out_unlock; + } + + /* Create a software table entry if necessary. */ + saved_spec = efx_mcdi_filter_entry_spec(table, ins_index); + if (saved_spec) { + if (spec->priority == EFX_FILTER_PRI_AUTO && + saved_spec->priority >= EFX_FILTER_PRI_AUTO) { + /* Just make sure it won't be removed */ + if (saved_spec->priority > EFX_FILTER_PRI_AUTO) + saved_spec->flags |= EFX_FILTER_FLAG_RX_OVER_AUTO; + table->entry[ins_index].spec &= + ~EFX_EF10_FILTER_FLAG_AUTO_OLD; + rc = ins_index; + goto out_unlock; + } + replacing = true; + priv_flags = efx_mcdi_filter_entry_flags(table, ins_index); + } else { + saved_spec = kmalloc(sizeof(*spec), GFP_ATOMIC); + if (!saved_spec) { + rc = -ENOMEM; + goto out_unlock; + } + *saved_spec = *spec; + priv_flags = 0; + } + efx_mcdi_filter_set_entry(table, ins_index, saved_spec, priv_flags); + + /* Actually insert the filter on the HW */ + rc = efx_mcdi_filter_push(efx, spec, &table->entry[ins_index].handle, + ctx, replacing); + + if (rc == -EINVAL && nic_data->must_realloc_vis) + /* The MC rebooted under us, causing it to reject our filter + * insertion as pointing to an invalid VI (spec->dmaq_id). + */ + rc = -EAGAIN; + + /* Finalise the software table entry */ + if (rc == 0) { + if (replacing) { + /* Update the fields that may differ */ + if (saved_spec->priority == EFX_FILTER_PRI_AUTO) + saved_spec->flags |= + EFX_FILTER_FLAG_RX_OVER_AUTO; + saved_spec->priority = spec->priority; + saved_spec->flags &= EFX_FILTER_FLAG_RX_OVER_AUTO; + saved_spec->flags |= spec->flags; + saved_spec->rss_context = spec->rss_context; + saved_spec->dmaq_id = spec->dmaq_id; + } + } else if (!replacing) { + kfree(saved_spec); + saved_spec = NULL; + } else { + /* We failed to replace, so the old filter is still present. + * Roll back the software table to reflect this. In fact the + * efx_mcdi_filter_set_entry() call below will do the right + * thing, so nothing extra is needed here. + */ + } + efx_mcdi_filter_set_entry(table, ins_index, saved_spec, priv_flags); + + /* Remove and finalise entries for lower-priority multicast + * recipients + */ + if (is_mc_recip) { + MCDI_DECLARE_BUF(inbuf, MC_CMD_FILTER_OP_EXT_IN_LEN); + unsigned int depth, i; + + memset(inbuf, 0, sizeof(inbuf)); + + for (depth = 0; depth < EFX_EF10_FILTER_SEARCH_LIMIT; depth++) { + if (!test_bit(depth, mc_rem_map)) + continue; + + i = (hash + depth) & (EFX_MCDI_FILTER_TBL_ROWS - 1); + saved_spec = efx_mcdi_filter_entry_spec(table, i); + priv_flags = efx_mcdi_filter_entry_flags(table, i); + + if (rc == 0) { + MCDI_SET_DWORD(inbuf, FILTER_OP_IN_OP, + MC_CMD_FILTER_OP_IN_OP_UNSUBSCRIBE); + MCDI_SET_QWORD(inbuf, FILTER_OP_IN_HANDLE, + table->entry[i].handle); + rc = efx_mcdi_rpc(efx, MC_CMD_FILTER_OP, + inbuf, sizeof(inbuf), + NULL, 0, NULL); + } + + if (rc == 0) { + kfree(saved_spec); + saved_spec = NULL; + priv_flags = 0; + } + efx_mcdi_filter_set_entry(table, i, saved_spec, + priv_flags); + } + } + + /* If successful, return the inserted filter ID */ + if (rc == 0) + rc = efx_mcdi_filter_make_filter_id(match_pri, ins_index); + +out_unlock: + if (rss_locked) + mutex_unlock(&efx->rss_lock); + up_write(&table->lock); + return rc; +} + +s32 efx_mcdi_filter_insert(struct efx_nic *efx, struct efx_filter_spec *spec, + bool replace_equal) +{ + s32 ret; + + down_read(&efx->filter_sem); + ret = efx_mcdi_filter_insert_locked(efx, spec, replace_equal); + up_read(&efx->filter_sem); + + return ret; +} + +/* + * Remove a filter. + * If !by_index, remove by ID + * If by_index, remove by index + * Filter ID may come from userland and must be range-checked. + * Caller must hold efx->filter_sem for read, and efx->filter_state->lock + * for write. + */ +static int efx_mcdi_filter_remove_internal(struct efx_nic *efx, + unsigned int priority_mask, + u32 filter_id, bool by_index) +{ + unsigned int filter_idx = efx_mcdi_filter_get_unsafe_id(filter_id); + struct efx_mcdi_filter_table *table = efx->filter_state; + MCDI_DECLARE_BUF(inbuf, + MC_CMD_FILTER_OP_IN_HANDLE_OFST + + MC_CMD_FILTER_OP_IN_HANDLE_LEN); + struct efx_filter_spec *spec; + DEFINE_WAIT(wait); + int rc; + + spec = efx_mcdi_filter_entry_spec(table, filter_idx); + if (!spec || + (!by_index && + efx_mcdi_filter_pri(table, spec) != + efx_mcdi_filter_get_unsafe_pri(filter_id))) + return -ENOENT; + + if (spec->flags & EFX_FILTER_FLAG_RX_OVER_AUTO && + priority_mask == (1U << EFX_FILTER_PRI_AUTO)) { + /* Just remove flags */ + spec->flags &= ~EFX_FILTER_FLAG_RX_OVER_AUTO; + table->entry[filter_idx].spec &= ~EFX_EF10_FILTER_FLAG_AUTO_OLD; + return 0; + } + + if (!(priority_mask & (1U << spec->priority))) + return -ENOENT; + + if (spec->flags & EFX_FILTER_FLAG_RX_OVER_AUTO) { + /* Reset to an automatic filter */ + + struct efx_filter_spec new_spec = *spec; + + new_spec.priority = EFX_FILTER_PRI_AUTO; + new_spec.flags = (EFX_FILTER_FLAG_RX | + (efx_rss_active(&efx->rss_context) ? + EFX_FILTER_FLAG_RX_RSS : 0)); + new_spec.dmaq_id = 0; + new_spec.rss_context = 0; + rc = efx_mcdi_filter_push(efx, &new_spec, + &table->entry[filter_idx].handle, + &efx->rss_context, + true); + + if (rc == 0) + *spec = new_spec; + } else { + /* Really remove the filter */ + + MCDI_SET_DWORD(inbuf, FILTER_OP_IN_OP, + efx_mcdi_filter_is_exclusive(spec) ? + MC_CMD_FILTER_OP_IN_OP_REMOVE : + MC_CMD_FILTER_OP_IN_OP_UNSUBSCRIBE); + MCDI_SET_QWORD(inbuf, FILTER_OP_IN_HANDLE, + table->entry[filter_idx].handle); + rc = efx_mcdi_rpc_quiet(efx, MC_CMD_FILTER_OP, + inbuf, sizeof(inbuf), NULL, 0, NULL); + + if ((rc == 0) || (rc == -ENOENT)) { + /* Filter removed OK or didn't actually exist */ + kfree(spec); + efx_mcdi_filter_set_entry(table, filter_idx, NULL, 0); + } else { + efx_mcdi_display_error(efx, MC_CMD_FILTER_OP, + MC_CMD_FILTER_OP_EXT_IN_LEN, + NULL, 0, rc); + } + } + + return rc; +} + +/* Remove filters that weren't renewed. */ +static void efx_mcdi_filter_remove_old(struct efx_nic *efx) +{ + struct efx_mcdi_filter_table *table = efx->filter_state; + int remove_failed = 0; + int remove_noent = 0; + int rc; + int i; + + down_write(&table->lock); + for (i = 0; i < EFX_MCDI_FILTER_TBL_ROWS; i++) { + if (READ_ONCE(table->entry[i].spec) & + EFX_EF10_FILTER_FLAG_AUTO_OLD) { + rc = efx_mcdi_filter_remove_internal(efx, + 1U << EFX_FILTER_PRI_AUTO, i, true); + if (rc == -ENOENT) + remove_noent++; + else if (rc) + remove_failed++; + } + } + up_write(&table->lock); + + if (remove_failed) + netif_info(efx, drv, efx->net_dev, + "%s: failed to remove %d filters\n", + __func__, remove_failed); + if (remove_noent) + netif_info(efx, drv, efx->net_dev, + "%s: failed to remove %d non-existent filters\n", + __func__, remove_noent); +} + +int efx_mcdi_filter_remove_safe(struct efx_nic *efx, + enum efx_filter_priority priority, + u32 filter_id) +{ + struct efx_mcdi_filter_table *table; + int rc; + + down_read(&efx->filter_sem); + table = efx->filter_state; + down_write(&table->lock); + rc = efx_mcdi_filter_remove_internal(efx, 1U << priority, filter_id, + false); + up_write(&table->lock); + up_read(&efx->filter_sem); + return rc; +} + +/* Caller must hold efx->filter_sem for read */ +static void efx_mcdi_filter_remove_unsafe(struct efx_nic *efx, + enum efx_filter_priority priority, + u32 filter_id) +{ + struct efx_mcdi_filter_table *table = efx->filter_state; + + if (filter_id == EFX_EF10_FILTER_ID_INVALID) + return; + + down_write(&table->lock); + efx_mcdi_filter_remove_internal(efx, 1U << priority, filter_id, + true); + up_write(&table->lock); +} + +int efx_mcdi_filter_get_safe(struct efx_nic *efx, + enum efx_filter_priority priority, + u32 filter_id, struct efx_filter_spec *spec) +{ + unsigned int filter_idx = efx_mcdi_filter_get_unsafe_id(filter_id); + const struct efx_filter_spec *saved_spec; + struct efx_mcdi_filter_table *table; + int rc; + + down_read(&efx->filter_sem); + table = efx->filter_state; + down_read(&table->lock); + saved_spec = efx_mcdi_filter_entry_spec(table, filter_idx); + if (saved_spec && saved_spec->priority == priority && + efx_mcdi_filter_pri(table, saved_spec) == + efx_mcdi_filter_get_unsafe_pri(filter_id)) { + *spec = *saved_spec; + rc = 0; + } else { + rc = -ENOENT; + } + up_read(&table->lock); + up_read(&efx->filter_sem); + return rc; +} + +static int efx_mcdi_filter_insert_addr_list(struct efx_nic *efx, + struct efx_mcdi_filter_vlan *vlan, + bool multicast, bool rollback) +{ + struct efx_mcdi_filter_table *table = efx->filter_state; + struct efx_mcdi_dev_addr *addr_list; + enum efx_filter_flags filter_flags; + struct efx_filter_spec spec; + u8 baddr[ETH_ALEN]; + unsigned int i, j; + int addr_count; + u16 *ids; + int rc; + + if (multicast) { + addr_list = table->dev_mc_list; + addr_count = table->dev_mc_count; + ids = vlan->mc; + } else { + addr_list = table->dev_uc_list; + addr_count = table->dev_uc_count; + ids = vlan->uc; + } + + filter_flags = efx_rss_active(&efx->rss_context) ? EFX_FILTER_FLAG_RX_RSS : 0; + + /* Insert/renew filters */ + for (i = 0; i < addr_count; i++) { + EFX_WARN_ON_PARANOID(ids[i] != EFX_EF10_FILTER_ID_INVALID); + efx_filter_init_rx(&spec, EFX_FILTER_PRI_AUTO, filter_flags, 0); + efx_filter_set_eth_local(&spec, vlan->vid, addr_list[i].addr); + rc = efx_mcdi_filter_insert_locked(efx, &spec, true); + if (rc < 0) { + if (rollback) { + netif_info(efx, drv, efx->net_dev, + "efx_mcdi_filter_insert failed rc=%d\n", + rc); + /* Fall back to promiscuous */ + for (j = 0; j < i; j++) { + efx_mcdi_filter_remove_unsafe( + efx, EFX_FILTER_PRI_AUTO, + ids[j]); + ids[j] = EFX_EF10_FILTER_ID_INVALID; + } + return rc; + } else { + /* keep invalid ID, and carry on */ + } + } else { + ids[i] = efx_mcdi_filter_get_unsafe_id(rc); + } + } + + if (multicast && rollback) { + /* Also need an Ethernet broadcast filter */ + EFX_WARN_ON_PARANOID(vlan->default_filters[EFX_EF10_BCAST] != + EFX_EF10_FILTER_ID_INVALID); + efx_filter_init_rx(&spec, EFX_FILTER_PRI_AUTO, filter_flags, 0); + eth_broadcast_addr(baddr); + efx_filter_set_eth_local(&spec, vlan->vid, baddr); + rc = efx_mcdi_filter_insert_locked(efx, &spec, true); + if (rc < 0) { + netif_warn(efx, drv, efx->net_dev, + "Broadcast filter insert failed rc=%d\n", rc); + /* Fall back to promiscuous */ + for (j = 0; j < i; j++) { + efx_mcdi_filter_remove_unsafe( + efx, EFX_FILTER_PRI_AUTO, + ids[j]); + ids[j] = EFX_EF10_FILTER_ID_INVALID; + } + return rc; + } else { + vlan->default_filters[EFX_EF10_BCAST] = + efx_mcdi_filter_get_unsafe_id(rc); + } + } + + return 0; +} + +static int efx_mcdi_filter_insert_def(struct efx_nic *efx, + struct efx_mcdi_filter_vlan *vlan, + enum efx_encap_type encap_type, + bool multicast, bool rollback) +{ + struct efx_ef10_nic_data *nic_data = efx->nic_data; + enum efx_filter_flags filter_flags; + struct efx_filter_spec spec; + u8 baddr[ETH_ALEN]; + int rc; + u16 *id; + + filter_flags = efx_rss_active(&efx->rss_context) ? EFX_FILTER_FLAG_RX_RSS : 0; + + efx_filter_init_rx(&spec, EFX_FILTER_PRI_AUTO, filter_flags, 0); + + if (multicast) + efx_filter_set_mc_def(&spec); + else + efx_filter_set_uc_def(&spec); + + if (encap_type) { + if (nic_data->datapath_caps & + (1 << MC_CMD_GET_CAPABILITIES_OUT_VXLAN_NVGRE_LBN)) + efx_filter_set_encap_type(&spec, encap_type); + else + /* + * don't insert encap filters on non-supporting + * platforms. ID will be left as INVALID. + */ + return 0; + } + + if (vlan->vid != EFX_FILTER_VID_UNSPEC) + efx_filter_set_eth_local(&spec, vlan->vid, NULL); + + rc = efx_mcdi_filter_insert_locked(efx, &spec, true); + if (rc < 0) { + const char *um = multicast ? "Multicast" : "Unicast"; + const char *encap_name = ""; + const char *encap_ipv = ""; + + if ((encap_type & EFX_ENCAP_TYPES_MASK) == + EFX_ENCAP_TYPE_VXLAN) + encap_name = "VXLAN "; + else if ((encap_type & EFX_ENCAP_TYPES_MASK) == + EFX_ENCAP_TYPE_NVGRE) + encap_name = "NVGRE "; + else if ((encap_type & EFX_ENCAP_TYPES_MASK) == + EFX_ENCAP_TYPE_GENEVE) + encap_name = "GENEVE "; + if (encap_type & EFX_ENCAP_FLAG_IPV6) + encap_ipv = "IPv6 "; + else if (encap_type) + encap_ipv = "IPv4 "; + + /* + * unprivileged functions can't insert mismatch filters + * for encapsulated or unicast traffic, so downgrade + * those warnings to debug. + */ + netif_cond_dbg(efx, drv, efx->net_dev, + rc == -EPERM && (encap_type || !multicast), warn, + "%s%s%s mismatch filter insert failed rc=%d\n", + encap_name, encap_ipv, um, rc); + } else if (multicast) { + /* mapping from encap types to default filter IDs (multicast) */ + static enum efx_mcdi_filter_default_filters map[] = { + [EFX_ENCAP_TYPE_NONE] = EFX_EF10_MCDEF, + [EFX_ENCAP_TYPE_VXLAN] = EFX_EF10_VXLAN4_MCDEF, + [EFX_ENCAP_TYPE_NVGRE] = EFX_EF10_NVGRE4_MCDEF, + [EFX_ENCAP_TYPE_GENEVE] = EFX_EF10_GENEVE4_MCDEF, + [EFX_ENCAP_TYPE_VXLAN | EFX_ENCAP_FLAG_IPV6] = + EFX_EF10_VXLAN6_MCDEF, + [EFX_ENCAP_TYPE_NVGRE | EFX_ENCAP_FLAG_IPV6] = + EFX_EF10_NVGRE6_MCDEF, + [EFX_ENCAP_TYPE_GENEVE | EFX_ENCAP_FLAG_IPV6] = + EFX_EF10_GENEVE6_MCDEF, + }; + + /* quick bounds check (BCAST result impossible) */ + BUILD_BUG_ON(EFX_EF10_BCAST != 0); + if (encap_type >= ARRAY_SIZE(map) || map[encap_type] == 0) { + WARN_ON(1); + return -EINVAL; + } + /* then follow map */ + id = &vlan->default_filters[map[encap_type]]; + + EFX_WARN_ON_PARANOID(*id != EFX_EF10_FILTER_ID_INVALID); + *id = efx_mcdi_filter_get_unsafe_id(rc); + if (!nic_data->workaround_26807 && !encap_type) { + /* Also need an Ethernet broadcast filter */ + efx_filter_init_rx(&spec, EFX_FILTER_PRI_AUTO, + filter_flags, 0); + eth_broadcast_addr(baddr); + efx_filter_set_eth_local(&spec, vlan->vid, baddr); + rc = efx_mcdi_filter_insert_locked(efx, &spec, true); + if (rc < 0) { + netif_warn(efx, drv, efx->net_dev, + "Broadcast filter insert failed rc=%d\n", + rc); + if (rollback) { + /* Roll back the mc_def filter */ + efx_mcdi_filter_remove_unsafe( + efx, EFX_FILTER_PRI_AUTO, + *id); + *id = EFX_EF10_FILTER_ID_INVALID; + return rc; + } + } else { + EFX_WARN_ON_PARANOID( + vlan->default_filters[EFX_EF10_BCAST] != + EFX_EF10_FILTER_ID_INVALID); + vlan->default_filters[EFX_EF10_BCAST] = + efx_mcdi_filter_get_unsafe_id(rc); + } + } + rc = 0; + } else { + /* mapping from encap types to default filter IDs (unicast) */ + static enum efx_mcdi_filter_default_filters map[] = { + [EFX_ENCAP_TYPE_NONE] = EFX_EF10_UCDEF, + [EFX_ENCAP_TYPE_VXLAN] = EFX_EF10_VXLAN4_UCDEF, + [EFX_ENCAP_TYPE_NVGRE] = EFX_EF10_NVGRE4_UCDEF, + [EFX_ENCAP_TYPE_GENEVE] = EFX_EF10_GENEVE4_UCDEF, + [EFX_ENCAP_TYPE_VXLAN | EFX_ENCAP_FLAG_IPV6] = + EFX_EF10_VXLAN6_UCDEF, + [EFX_ENCAP_TYPE_NVGRE | EFX_ENCAP_FLAG_IPV6] = + EFX_EF10_NVGRE6_UCDEF, + [EFX_ENCAP_TYPE_GENEVE | EFX_ENCAP_FLAG_IPV6] = + EFX_EF10_GENEVE6_UCDEF, + }; + + /* quick bounds check (BCAST result impossible) */ + BUILD_BUG_ON(EFX_EF10_BCAST != 0); + if (encap_type >= ARRAY_SIZE(map) || map[encap_type] == 0) { + WARN_ON(1); + return -EINVAL; + } + /* then follow map */ + id = &vlan->default_filters[map[encap_type]]; + EFX_WARN_ON_PARANOID(*id != EFX_EF10_FILTER_ID_INVALID); + *id = rc; + rc = 0; + } + return rc; +} + +/* + * Caller must hold efx->filter_sem for read if race against + * efx_mcdi_filter_table_remove() is possible + */ +static void efx_mcdi_filter_vlan_sync_rx_mode(struct efx_nic *efx, + struct efx_mcdi_filter_vlan *vlan) +{ + struct efx_mcdi_filter_table *table = efx->filter_state; + struct efx_ef10_nic_data *nic_data = efx->nic_data; + + /* + * Do not install unspecified VID if VLAN filtering is enabled. + * Do not install all specified VIDs if VLAN filtering is disabled. + */ + if ((vlan->vid == EFX_FILTER_VID_UNSPEC) == table->vlan_filter) + return; + + /* Insert/renew unicast filters */ + if (table->uc_promisc) { + efx_mcdi_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_NONE, + false, false); + efx_mcdi_filter_insert_addr_list(efx, vlan, false, false); + } else { + /* + * If any of the filters failed to insert, fall back to + * promiscuous mode - add in the uc_def filter. But keep + * our individual unicast filters. + */ + if (efx_mcdi_filter_insert_addr_list(efx, vlan, false, false)) + efx_mcdi_filter_insert_def(efx, vlan, + EFX_ENCAP_TYPE_NONE, + false, false); + } + efx_mcdi_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_VXLAN, + false, false); + efx_mcdi_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_VXLAN | + EFX_ENCAP_FLAG_IPV6, + false, false); + efx_mcdi_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_NVGRE, + false, false); + efx_mcdi_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_NVGRE | + EFX_ENCAP_FLAG_IPV6, + false, false); + efx_mcdi_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_GENEVE, + false, false); + efx_mcdi_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_GENEVE | + EFX_ENCAP_FLAG_IPV6, + false, false); + + /* + * Insert/renew multicast filters + * + * If changing promiscuous state with cascaded multicast filters, remove + * old filters first, so that packets are dropped rather than duplicated + */ + if (nic_data->workaround_26807 && + table->mc_promisc_last != table->mc_promisc) + efx_mcdi_filter_remove_old(efx); + if (table->mc_promisc) { + if (nic_data->workaround_26807) { + /* + * If we failed to insert promiscuous filters, rollback + * and fall back to individual multicast filters + */ + if (efx_mcdi_filter_insert_def(efx, vlan, + EFX_ENCAP_TYPE_NONE, + true, true)) { + /* Changing promisc state, so remove old filters */ + efx_mcdi_filter_remove_old(efx); + efx_mcdi_filter_insert_addr_list(efx, vlan, + true, false); + } + } else { + /* + * If we failed to insert promiscuous filters, don't + * rollback. Regardless, also insert the mc_list, + * unless it's incomplete due to overflow + */ + efx_mcdi_filter_insert_def(efx, vlan, + EFX_ENCAP_TYPE_NONE, + true, false); + if (!table->mc_overflow) + efx_mcdi_filter_insert_addr_list(efx, vlan, + true, false); + } + } else { + /* + * If any filters failed to insert, rollback and fall back to + * promiscuous mode - mc_def filter and maybe broadcast. If + * that fails, roll back again and insert as many of our + * individual multicast filters as we can. + */ + if (efx_mcdi_filter_insert_addr_list(efx, vlan, true, true)) { + /* Changing promisc state, so remove old filters */ + if (nic_data->workaround_26807) + efx_mcdi_filter_remove_old(efx); + if (efx_mcdi_filter_insert_def(efx, vlan, + EFX_ENCAP_TYPE_NONE, + true, true)) + efx_mcdi_filter_insert_addr_list(efx, vlan, + true, false); + } + } + efx_mcdi_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_VXLAN, + true, false); + efx_mcdi_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_VXLAN | + EFX_ENCAP_FLAG_IPV6, + true, false); + efx_mcdi_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_NVGRE, + true, false); + efx_mcdi_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_NVGRE | + EFX_ENCAP_FLAG_IPV6, + true, false); + efx_mcdi_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_GENEVE, + true, false); + efx_mcdi_filter_insert_def(efx, vlan, EFX_ENCAP_TYPE_GENEVE | + EFX_ENCAP_FLAG_IPV6, + true, false); +} + +int efx_mcdi_filter_clear_rx(struct efx_nic *efx, + enum efx_filter_priority priority) +{ + struct efx_mcdi_filter_table *table; + unsigned int priority_mask; + unsigned int i; + int rc; + + priority_mask = (((1U << (priority + 1)) - 1) & + ~(1U << EFX_FILTER_PRI_AUTO)); + + down_read(&efx->filter_sem); + table = efx->filter_state; + down_write(&table->lock); + for (i = 0; i < EFX_MCDI_FILTER_TBL_ROWS; i++) { + rc = efx_mcdi_filter_remove_internal(efx, priority_mask, + i, true); + if (rc && rc != -ENOENT) + break; + rc = 0; + } + + up_write(&table->lock); + up_read(&efx->filter_sem); + return rc; +} + +u32 efx_mcdi_filter_count_rx_used(struct efx_nic *efx, + enum efx_filter_priority priority) +{ + struct efx_mcdi_filter_table *table; + unsigned int filter_idx; + s32 count = 0; + + down_read(&efx->filter_sem); + table = efx->filter_state; + down_read(&table->lock); + for (filter_idx = 0; filter_idx < EFX_MCDI_FILTER_TBL_ROWS; filter_idx++) { + if (table->entry[filter_idx].spec && + efx_mcdi_filter_entry_spec(table, filter_idx)->priority == + priority) + ++count; + } + up_read(&table->lock); + up_read(&efx->filter_sem); + return count; +} + +u32 efx_mcdi_filter_get_rx_id_limit(struct efx_nic *efx) +{ + struct efx_mcdi_filter_table *table = efx->filter_state; + + return table->rx_match_count * EFX_MCDI_FILTER_TBL_ROWS * 2; +} + +s32 efx_mcdi_filter_get_rx_ids(struct efx_nic *efx, + enum efx_filter_priority priority, + u32 *buf, u32 size) +{ + struct efx_mcdi_filter_table *table; + struct efx_filter_spec *spec; + unsigned int filter_idx; + s32 count = 0; + + down_read(&efx->filter_sem); + table = efx->filter_state; + down_read(&table->lock); + + for (filter_idx = 0; filter_idx < EFX_MCDI_FILTER_TBL_ROWS; filter_idx++) { + spec = efx_mcdi_filter_entry_spec(table, filter_idx); + if (spec && spec->priority == priority) { + if (count == size) { + count = -EMSGSIZE; + break; + } + buf[count++] = + efx_mcdi_filter_make_filter_id( + efx_mcdi_filter_pri(table, spec), + filter_idx); + } + } + up_read(&table->lock); + up_read(&efx->filter_sem); + return count; +} + +static int efx_mcdi_filter_match_flags_from_mcdi(bool encap, u32 mcdi_flags) +{ + int match_flags = 0; + +#define MAP_FLAG(gen_flag, mcdi_field) do { \ + u32 old_mcdi_flags = mcdi_flags; \ + mcdi_flags &= ~(1 << MC_CMD_FILTER_OP_EXT_IN_MATCH_ ## \ + mcdi_field ## _LBN); \ + if (mcdi_flags != old_mcdi_flags) \ + match_flags |= EFX_FILTER_MATCH_ ## gen_flag; \ + } while (0) + + if (encap) { + /* encap filters must specify encap type */ + match_flags |= EFX_FILTER_MATCH_ENCAP_TYPE; + /* and imply ethertype and ip proto */ + mcdi_flags &= + ~(1 << MC_CMD_FILTER_OP_EXT_IN_MATCH_IP_PROTO_LBN); + mcdi_flags &= + ~(1 << MC_CMD_FILTER_OP_EXT_IN_MATCH_ETHER_TYPE_LBN); + /* VLAN tags refer to the outer packet */ + MAP_FLAG(INNER_VID, INNER_VLAN); + MAP_FLAG(OUTER_VID, OUTER_VLAN); + /* everything else refers to the inner packet */ + MAP_FLAG(LOC_MAC_IG, IFRM_UNKNOWN_UCAST_DST); + MAP_FLAG(LOC_MAC_IG, IFRM_UNKNOWN_MCAST_DST); + MAP_FLAG(REM_HOST, IFRM_SRC_IP); + MAP_FLAG(LOC_HOST, IFRM_DST_IP); + MAP_FLAG(REM_MAC, IFRM_SRC_MAC); + MAP_FLAG(REM_PORT, IFRM_SRC_PORT); + MAP_FLAG(LOC_MAC, IFRM_DST_MAC); + MAP_FLAG(LOC_PORT, IFRM_DST_PORT); + MAP_FLAG(ETHER_TYPE, IFRM_ETHER_TYPE); + MAP_FLAG(IP_PROTO, IFRM_IP_PROTO); + } else { + MAP_FLAG(LOC_MAC_IG, UNKNOWN_UCAST_DST); + MAP_FLAG(LOC_MAC_IG, UNKNOWN_MCAST_DST); + MAP_FLAG(REM_HOST, SRC_IP); + MAP_FLAG(LOC_HOST, DST_IP); + MAP_FLAG(REM_MAC, SRC_MAC); + MAP_FLAG(REM_PORT, SRC_PORT); + MAP_FLAG(LOC_MAC, DST_MAC); + MAP_FLAG(LOC_PORT, DST_PORT); + MAP_FLAG(ETHER_TYPE, ETHER_TYPE); + MAP_FLAG(INNER_VID, INNER_VLAN); + MAP_FLAG(OUTER_VID, OUTER_VLAN); + MAP_FLAG(IP_PROTO, IP_PROTO); + } +#undef MAP_FLAG + + /* Did we map them all? */ + if (mcdi_flags) + return -EINVAL; + + return match_flags; +} + +bool efx_mcdi_filter_match_supported(struct efx_mcdi_filter_table *table, + bool encap, + enum efx_filter_match_flags match_flags) +{ + unsigned int match_pri; + int mf; + + for (match_pri = 0; + match_pri < table->rx_match_count; + match_pri++) { + mf = efx_mcdi_filter_match_flags_from_mcdi(encap, + table->rx_match_mcdi_flags[match_pri]); + if (mf == match_flags) + return true; + } + + return false; +} + +static int +efx_mcdi_filter_table_probe_matches(struct efx_nic *efx, + struct efx_mcdi_filter_table *table, + bool encap) +{ + MCDI_DECLARE_BUF(inbuf, MC_CMD_GET_PARSER_DISP_INFO_IN_LEN); + MCDI_DECLARE_BUF(outbuf, MC_CMD_GET_PARSER_DISP_INFO_OUT_LENMAX); + unsigned int pd_match_pri, pd_match_count; + size_t outlen; + int rc; + + /* Find out which RX filter types are supported, and their priorities */ + MCDI_SET_DWORD(inbuf, GET_PARSER_DISP_INFO_IN_OP, + encap ? + MC_CMD_GET_PARSER_DISP_INFO_IN_OP_GET_SUPPORTED_ENCAP_RX_MATCHES : + MC_CMD_GET_PARSER_DISP_INFO_IN_OP_GET_SUPPORTED_RX_MATCHES); + rc = efx_mcdi_rpc(efx, MC_CMD_GET_PARSER_DISP_INFO, + inbuf, sizeof(inbuf), outbuf, sizeof(outbuf), + &outlen); + if (rc) + return rc; + + pd_match_count = MCDI_VAR_ARRAY_LEN( + outlen, GET_PARSER_DISP_INFO_OUT_SUPPORTED_MATCHES); + + for (pd_match_pri = 0; pd_match_pri < pd_match_count; pd_match_pri++) { + u32 mcdi_flags = + MCDI_ARRAY_DWORD( + outbuf, + GET_PARSER_DISP_INFO_OUT_SUPPORTED_MATCHES, + pd_match_pri); + rc = efx_mcdi_filter_match_flags_from_mcdi(encap, mcdi_flags); + if (rc < 0) { + netif_dbg(efx, probe, efx->net_dev, + "%s: fw flags %#x pri %u not supported in driver\n", + __func__, mcdi_flags, pd_match_pri); + } else { + netif_dbg(efx, probe, efx->net_dev, + "%s: fw flags %#x pri %u supported as driver flags %#x pri %u\n", + __func__, mcdi_flags, pd_match_pri, + rc, table->rx_match_count); + table->rx_match_mcdi_flags[table->rx_match_count] = mcdi_flags; + table->rx_match_count++; + } + } + + return 0; +} + +int efx_mcdi_filter_table_probe(struct efx_nic *efx) +{ + struct efx_ef10_nic_data *nic_data = efx->nic_data; + struct net_device *net_dev = efx->net_dev; + struct efx_mcdi_filter_table *table; + struct efx_mcdi_filter_vlan *vlan; + int rc; + + if (!efx_rwsem_assert_write_locked(&efx->filter_sem)) + return -EINVAL; + + if (efx->filter_state) /* already probed */ + return 0; + + table = kzalloc(sizeof(*table), GFP_KERNEL); + if (!table) + return -ENOMEM; + + table->rx_match_count = 0; + rc = efx_mcdi_filter_table_probe_matches(efx, table, false); + if (rc) + goto fail; + if (nic_data->datapath_caps & + (1 << MC_CMD_GET_CAPABILITIES_OUT_VXLAN_NVGRE_LBN)) + rc = efx_mcdi_filter_table_probe_matches(efx, table, true); + if (rc) + goto fail; + if ((efx_supported_features(efx) & NETIF_F_HW_VLAN_CTAG_FILTER) && + !(efx_mcdi_filter_match_supported(table, false, + (EFX_FILTER_MATCH_OUTER_VID | EFX_FILTER_MATCH_LOC_MAC)) && + efx_mcdi_filter_match_supported(table, false, + (EFX_FILTER_MATCH_OUTER_VID | EFX_FILTER_MATCH_LOC_MAC_IG)))) { + netif_info(efx, probe, net_dev, + "VLAN filters are not supported in this firmware variant\n"); + net_dev->features &= ~NETIF_F_HW_VLAN_CTAG_FILTER; + efx->fixed_features &= ~NETIF_F_HW_VLAN_CTAG_FILTER; + net_dev->hw_features &= ~NETIF_F_HW_VLAN_CTAG_FILTER; + } + + table->entry = vzalloc(array_size(EFX_MCDI_FILTER_TBL_ROWS, + sizeof(*table->entry))); + if (!table->entry) { + rc = -ENOMEM; + goto fail; + } + + table->mc_promisc_last = false; + table->vlan_filter = + !!(efx->net_dev->features & NETIF_F_HW_VLAN_CTAG_FILTER); + INIT_LIST_HEAD(&table->vlan_list); + init_rwsem(&table->lock); + + efx->filter_state = table; + + list_for_each_entry(vlan, &nic_data->vlan_list, list) { + rc = efx_mcdi_filter_add_vlan(efx, vlan->vid); + if (rc) + goto fail_add_vlan; + } + + return 0; + +fail_add_vlan: + efx_mcdi_filter_cleanup_vlans(efx); + efx->filter_state = NULL; +fail: + kfree(table); + return rc; +} + +/* + * Caller must hold efx->filter_sem for read if race against + * efx_mcdi_filter_table_remove() is possible + */ +void efx_mcdi_filter_table_restore(struct efx_nic *efx) +{ + struct efx_mcdi_filter_table *table = efx->filter_state; + struct efx_ef10_nic_data *nic_data = efx->nic_data; + unsigned int invalid_filters = 0, failed = 0; + struct efx_mcdi_filter_vlan *vlan; + struct efx_filter_spec *spec; + struct efx_rss_context *ctx; + unsigned int filter_idx; + u32 mcdi_flags; + int match_pri; + int rc, i; + + WARN_ON(!rwsem_is_locked(&efx->filter_sem)); + + if (!nic_data->must_restore_filters) + return; + + if (!table) + return; + + down_write(&table->lock); + mutex_lock(&efx->rss_lock); + + for (filter_idx = 0; filter_idx < EFX_MCDI_FILTER_TBL_ROWS; filter_idx++) { + spec = efx_mcdi_filter_entry_spec(table, filter_idx); + if (!spec) + continue; + + mcdi_flags = efx_mcdi_filter_mcdi_flags_from_spec(spec); + match_pri = 0; + while (match_pri < table->rx_match_count && + table->rx_match_mcdi_flags[match_pri] != mcdi_flags) + ++match_pri; + if (match_pri >= table->rx_match_count) { + invalid_filters++; + goto not_restored; + } + if (spec->rss_context) + ctx = efx_find_rss_context_entry(efx, spec->rss_context); + else + ctx = &efx->rss_context; + if (spec->flags & EFX_FILTER_FLAG_RX_RSS) { + if (!ctx) { + netif_warn(efx, drv, efx->net_dev, + "Warning: unable to restore a filter with nonexistent RSS context %u.\n", + spec->rss_context); + invalid_filters++; + goto not_restored; + } + if (ctx->context_id == EFX_MCDI_RSS_CONTEXT_INVALID) { + netif_warn(efx, drv, efx->net_dev, + "Warning: unable to restore a filter with RSS context %u as it was not created.\n", + spec->rss_context); + invalid_filters++; + goto not_restored; + } + } + + rc = efx_mcdi_filter_push(efx, spec, + &table->entry[filter_idx].handle, + ctx, false); + if (rc) + failed++; + + if (rc) { +not_restored: + list_for_each_entry(vlan, &table->vlan_list, list) + for (i = 0; i < EFX_EF10_NUM_DEFAULT_FILTERS; ++i) + if (vlan->default_filters[i] == filter_idx) + vlan->default_filters[i] = + EFX_EF10_FILTER_ID_INVALID; + + kfree(spec); + efx_mcdi_filter_set_entry(table, filter_idx, NULL, 0); + } + } + + mutex_unlock(&efx->rss_lock); + up_write(&table->lock); + + /* + * This can happen validly if the MC's capabilities have changed, so + * is not an error. + */ + if (invalid_filters) + netif_dbg(efx, drv, efx->net_dev, + "Did not restore %u filters that are now unsupported.\n", + invalid_filters); + + if (failed) + netif_err(efx, hw, efx->net_dev, + "unable to restore %u filters\n", failed); + else + nic_data->must_restore_filters = false; +} + +void efx_mcdi_filter_table_remove(struct efx_nic *efx) +{ + struct efx_mcdi_filter_table *table = efx->filter_state; + MCDI_DECLARE_BUF(inbuf, MC_CMD_FILTER_OP_EXT_IN_LEN); + struct efx_filter_spec *spec; + unsigned int filter_idx; + int rc; + + efx_mcdi_filter_cleanup_vlans(efx); + efx->filter_state = NULL; + /* + * If we were called without locking, then it's not safe to free + * the table as others might be using it. So we just WARN, leak + * the memory, and potentially get an inconsistent filter table + * state. + * This should never actually happen. + */ + if (!efx_rwsem_assert_write_locked(&efx->filter_sem)) + return; + + if (!table) + return; + + for (filter_idx = 0; filter_idx < EFX_MCDI_FILTER_TBL_ROWS; filter_idx++) { + spec = efx_mcdi_filter_entry_spec(table, filter_idx); + if (!spec) + continue; + + MCDI_SET_DWORD(inbuf, FILTER_OP_IN_OP, + efx_mcdi_filter_is_exclusive(spec) ? + MC_CMD_FILTER_OP_IN_OP_REMOVE : + MC_CMD_FILTER_OP_IN_OP_UNSUBSCRIBE); + MCDI_SET_QWORD(inbuf, FILTER_OP_IN_HANDLE, + table->entry[filter_idx].handle); + rc = efx_mcdi_rpc_quiet(efx, MC_CMD_FILTER_OP, inbuf, + sizeof(inbuf), NULL, 0, NULL); + if (rc) + netif_info(efx, drv, efx->net_dev, + "%s: filter %04x remove failed\n", + __func__, filter_idx); + kfree(spec); + } + + vfree(table->entry); + kfree(table); +} + +static void efx_mcdi_filter_mark_one_old(struct efx_nic *efx, uint16_t *id) +{ + struct efx_mcdi_filter_table *table = efx->filter_state; + unsigned int filter_idx; + + efx_rwsem_assert_write_locked(&table->lock); + + if (*id != EFX_EF10_FILTER_ID_INVALID) { + filter_idx = efx_mcdi_filter_get_unsafe_id(*id); + if (!table->entry[filter_idx].spec) + netif_dbg(efx, drv, efx->net_dev, + "marked null spec old %04x:%04x\n", *id, + filter_idx); + table->entry[filter_idx].spec |= EFX_EF10_FILTER_FLAG_AUTO_OLD; + *id = EFX_EF10_FILTER_ID_INVALID; + } +} + +/* Mark old per-VLAN filters that may need to be removed */ +static void _efx_mcdi_filter_vlan_mark_old(struct efx_nic *efx, + struct efx_mcdi_filter_vlan *vlan) +{ + struct efx_mcdi_filter_table *table = efx->filter_state; + unsigned int i; + + for (i = 0; i < table->dev_uc_count; i++) + efx_mcdi_filter_mark_one_old(efx, &vlan->uc[i]); + for (i = 0; i < table->dev_mc_count; i++) + efx_mcdi_filter_mark_one_old(efx, &vlan->mc[i]); + for (i = 0; i < EFX_EF10_NUM_DEFAULT_FILTERS; i++) + efx_mcdi_filter_mark_one_old(efx, &vlan->default_filters[i]); +} + +/* + * Mark old filters that may need to be removed. + * Caller must hold efx->filter_sem for read if race against + * efx_mcdi_filter_table_remove() is possible + */ +static void efx_mcdi_filter_mark_old(struct efx_nic *efx) +{ + struct efx_mcdi_filter_table *table = efx->filter_state; + struct efx_mcdi_filter_vlan *vlan; + + down_write(&table->lock); + list_for_each_entry(vlan, &table->vlan_list, list) + _efx_mcdi_filter_vlan_mark_old(efx, vlan); + up_write(&table->lock); +} + +int efx_mcdi_filter_add_vlan(struct efx_nic *efx, u16 vid) +{ + struct efx_mcdi_filter_table *table = efx->filter_state; + struct efx_mcdi_filter_vlan *vlan; + unsigned int i; + + if (!efx_rwsem_assert_write_locked(&efx->filter_sem)) + return -EINVAL; + + vlan = efx_mcdi_filter_find_vlan(efx, vid); + if (WARN_ON(vlan)) { + netif_err(efx, drv, efx->net_dev, + "VLAN %u already added\n", vid); + return -EALREADY; + } + + vlan = kzalloc(sizeof(*vlan), GFP_KERNEL); + if (!vlan) + return -ENOMEM; + + vlan->vid = vid; + + for (i = 0; i < ARRAY_SIZE(vlan->uc); i++) + vlan->uc[i] = EFX_EF10_FILTER_ID_INVALID; + for (i = 0; i < ARRAY_SIZE(vlan->mc); i++) + vlan->mc[i] = EFX_EF10_FILTER_ID_INVALID; + for (i = 0; i < EFX_EF10_NUM_DEFAULT_FILTERS; i++) + vlan->default_filters[i] = EFX_EF10_FILTER_ID_INVALID; + + list_add_tail(&vlan->list, &table->vlan_list); + + if (efx_dev_registered(efx)) + efx_mcdi_filter_vlan_sync_rx_mode(efx, vlan); + + return 0; +} + +static void efx_mcdi_filter_del_vlan_internal(struct efx_nic *efx, + struct efx_mcdi_filter_vlan *vlan) +{ + unsigned int i; + + /* See comment in efx_mcdi_filter_table_remove() */ + if (!efx_rwsem_assert_write_locked(&efx->filter_sem)) + return; + + list_del(&vlan->list); + + for (i = 0; i < ARRAY_SIZE(vlan->uc); i++) + efx_mcdi_filter_remove_unsafe(efx, EFX_FILTER_PRI_AUTO, + vlan->uc[i]); + for (i = 0; i < ARRAY_SIZE(vlan->mc); i++) + efx_mcdi_filter_remove_unsafe(efx, EFX_FILTER_PRI_AUTO, + vlan->mc[i]); + for (i = 0; i < EFX_EF10_NUM_DEFAULT_FILTERS; i++) + if (vlan->default_filters[i] != EFX_EF10_FILTER_ID_INVALID) + efx_mcdi_filter_remove_unsafe(efx, EFX_FILTER_PRI_AUTO, + vlan->default_filters[i]); + + kfree(vlan); +} + +void efx_mcdi_filter_del_vlan(struct efx_nic *efx, u16 vid) +{ + struct efx_mcdi_filter_vlan *vlan; + + /* See comment in efx_mcdi_filter_table_remove() */ + if (!efx_rwsem_assert_write_locked(&efx->filter_sem)) + return; + + vlan = efx_mcdi_filter_find_vlan(efx, vid); + if (!vlan) { + netif_err(efx, drv, efx->net_dev, + "VLAN %u not found in filter state\n", vid); + return; + } + + efx_mcdi_filter_del_vlan_internal(efx, vlan); +} + +struct efx_mcdi_filter_vlan *efx_mcdi_filter_find_vlan(struct efx_nic *efx, + u16 vid) +{ + struct efx_mcdi_filter_table *table = efx->filter_state; + struct efx_mcdi_filter_vlan *vlan; + + WARN_ON(!rwsem_is_locked(&efx->filter_sem)); + + list_for_each_entry(vlan, &table->vlan_list, list) { + if (vlan->vid == vid) + return vlan; + } + + return NULL; +} + +void efx_mcdi_filter_cleanup_vlans(struct efx_nic *efx) +{ + struct efx_mcdi_filter_table *table = efx->filter_state; + struct efx_mcdi_filter_vlan *vlan, *next_vlan; + + /* See comment in efx_mcdi_filter_table_remove() */ + if (!efx_rwsem_assert_write_locked(&efx->filter_sem)) + return; + + if (!table) + return; + + list_for_each_entry_safe(vlan, next_vlan, &table->vlan_list, list) + efx_mcdi_filter_del_vlan_internal(efx, vlan); +} + +static void efx_mcdi_filter_uc_addr_list(struct efx_nic *efx) +{ + struct efx_mcdi_filter_table *table = efx->filter_state; + struct net_device *net_dev = efx->net_dev; + struct netdev_hw_addr *uc; + unsigned int i; + + table->uc_promisc = !!(net_dev->flags & IFF_PROMISC); + ether_addr_copy(table->dev_uc_list[0].addr, net_dev->dev_addr); + i = 1; + netdev_for_each_uc_addr(uc, net_dev) { + if (i >= EFX_EF10_FILTER_DEV_UC_MAX) { + table->uc_promisc = true; + break; + } + ether_addr_copy(table->dev_uc_list[i].addr, uc->addr); + i++; + } + + table->dev_uc_count = i; +} + +static void efx_mcdi_filter_mc_addr_list(struct efx_nic *efx) +{ + struct efx_mcdi_filter_table *table = efx->filter_state; + struct net_device *net_dev = efx->net_dev; + struct netdev_hw_addr *mc; + unsigned int i; + + table->mc_overflow = false; + table->mc_promisc = !!(net_dev->flags & (IFF_PROMISC | IFF_ALLMULTI)); + + i = 0; + netdev_for_each_mc_addr(mc, net_dev) { + if (i >= EFX_EF10_FILTER_DEV_MC_MAX) { + table->mc_promisc = true; + table->mc_overflow = true; + break; + } + ether_addr_copy(table->dev_mc_list[i].addr, mc->addr); + i++; + } + + table->dev_mc_count = i; +} + +/* + * Caller must hold efx->filter_sem for read if race against + * efx_mcdi_filter_table_remove() is possible + */ +void efx_mcdi_filter_sync_rx_mode(struct efx_nic *efx) +{ + struct efx_mcdi_filter_table *table = efx->filter_state; + struct net_device *net_dev = efx->net_dev; + struct efx_mcdi_filter_vlan *vlan; + bool vlan_filter; + + if (!efx_dev_registered(efx)) + return; + + if (!table) + return; + + efx_mcdi_filter_mark_old(efx); + + /* + * Copy/convert the address lists; add the primary station + * address and broadcast address + */ + netif_addr_lock_bh(net_dev); + efx_mcdi_filter_uc_addr_list(efx); + efx_mcdi_filter_mc_addr_list(efx); + netif_addr_unlock_bh(net_dev); + + /* + * If VLAN filtering changes, all old filters are finally removed. + * Do it in advance to avoid conflicts for unicast untagged and + * VLAN 0 tagged filters. + */ + vlan_filter = !!(net_dev->features & NETIF_F_HW_VLAN_CTAG_FILTER); + if (table->vlan_filter != vlan_filter) { + table->vlan_filter = vlan_filter; + efx_mcdi_filter_remove_old(efx); + } + + list_for_each_entry(vlan, &table->vlan_list, list) + efx_mcdi_filter_vlan_sync_rx_mode(efx, vlan); + + efx_mcdi_filter_remove_old(efx); + table->mc_promisc_last = table->mc_promisc; +} + +#ifdef CONFIG_RFS_ACCEL + +bool efx_mcdi_filter_rfs_expire_one(struct efx_nic *efx, u32 flow_id, + unsigned int filter_idx) +{ + struct efx_filter_spec *spec, saved_spec; + struct efx_mcdi_filter_table *table; + struct efx_arfs_rule *rule = NULL; + bool ret = true, force = false; + u16 arfs_id; + + down_read(&efx->filter_sem); + table = efx->filter_state; + down_write(&table->lock); + spec = efx_mcdi_filter_entry_spec(table, filter_idx); + + if (!spec || spec->priority != EFX_FILTER_PRI_HINT) + goto out_unlock; + + spin_lock_bh(&efx->rps_hash_lock); + if (!efx->rps_hash_table) { + /* In the absence of the table, we always return 0 to ARFS. */ + arfs_id = 0; + } else { + rule = efx_rps_hash_find(efx, spec); + if (!rule) + /* ARFS table doesn't know of this filter, so remove it */ + goto expire; + arfs_id = rule->arfs_id; + ret = efx_rps_check_rule(rule, filter_idx, &force); + if (force) + goto expire; + if (!ret) { + spin_unlock_bh(&efx->rps_hash_lock); + goto out_unlock; + } + } + if (!rps_may_expire_flow(efx->net_dev, spec->dmaq_id, flow_id, arfs_id)) + ret = false; + else if (rule) + rule->filter_id = EFX_ARFS_FILTER_ID_REMOVING; +expire: + saved_spec = *spec; /* remove operation will kfree spec */ + spin_unlock_bh(&efx->rps_hash_lock); + /* + * At this point (since we dropped the lock), another thread might queue + * up a fresh insertion request (but the actual insertion will be held + * up by our possession of the filter table lock). In that case, it + * will set rule->filter_id to EFX_ARFS_FILTER_ID_PENDING, meaning that + * the rule is not removed by efx_rps_hash_del() below. + */ + if (ret) + ret = efx_mcdi_filter_remove_internal(efx, 1U << spec->priority, + filter_idx, true) == 0; + /* + * While we can't safely dereference rule (we dropped the lock), we can + * still test it for NULL. + */ + if (ret && rule) { + /* Expiring, so remove entry from ARFS table */ + spin_lock_bh(&efx->rps_hash_lock); + efx_rps_hash_del(efx, &saved_spec); + spin_unlock_bh(&efx->rps_hash_lock); + } +out_unlock: + up_write(&table->lock); + up_read(&efx->filter_sem); + return ret; +} + +#endif /* CONFIG_RFS_ACCEL */ + +#define RSS_MODE_HASH_ADDRS (1 << RSS_MODE_HASH_SRC_ADDR_LBN |\ + 1 << RSS_MODE_HASH_DST_ADDR_LBN) +#define RSS_MODE_HASH_PORTS (1 << RSS_MODE_HASH_SRC_PORT_LBN |\ + 1 << RSS_MODE_HASH_DST_PORT_LBN) +#define RSS_CONTEXT_FLAGS_DEFAULT (1 << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_TOEPLITZ_IPV4_EN_LBN |\ + 1 << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_TOEPLITZ_TCPV4_EN_LBN |\ + 1 << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_TOEPLITZ_IPV6_EN_LBN |\ + 1 << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_TOEPLITZ_TCPV6_EN_LBN |\ + (RSS_MODE_HASH_ADDRS | RSS_MODE_HASH_PORTS) << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_TCP_IPV4_RSS_MODE_LBN |\ + RSS_MODE_HASH_ADDRS << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_UDP_IPV4_RSS_MODE_LBN |\ + RSS_MODE_HASH_ADDRS << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_OTHER_IPV4_RSS_MODE_LBN |\ + (RSS_MODE_HASH_ADDRS | RSS_MODE_HASH_PORTS) << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_TCP_IPV6_RSS_MODE_LBN |\ + RSS_MODE_HASH_ADDRS << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_UDP_IPV6_RSS_MODE_LBN |\ + RSS_MODE_HASH_ADDRS << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_OTHER_IPV6_RSS_MODE_LBN) + +int efx_mcdi_get_rss_context_flags(struct efx_nic *efx, u32 context, u32 *flags) +{ + /* + * Firmware had a bug (sfc bug 61952) where it would not actually + * fill in the flags field in the response to MC_CMD_RSS_CONTEXT_GET_FLAGS. + * This meant that it would always contain whatever was previously + * in the MCDI buffer. Fortunately, all firmware versions with + * this bug have the same default flags value for a newly-allocated + * RSS context, and the only time we want to get the flags is just + * after allocating. Moreover, the response has a 32-bit hole + * where the context ID would be in the request, so we can use an + * overlength buffer in the request and pre-fill the flags field + * with what we believe the default to be. Thus if the firmware + * has the bug, it will leave our pre-filled value in the flags + * field of the response, and we will get the right answer. + * + * However, this does mean that this function should NOT be used if + * the RSS context flags might not be their defaults - it is ONLY + * reliably correct for a newly-allocated RSS context. + */ + MCDI_DECLARE_BUF(inbuf, MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_LEN); + MCDI_DECLARE_BUF(outbuf, MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_LEN); + size_t outlen; + int rc; + + /* Check we have a hole for the context ID */ + BUILD_BUG_ON(MC_CMD_RSS_CONTEXT_GET_FLAGS_IN_LEN != MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_FLAGS_OFST); + MCDI_SET_DWORD(inbuf, RSS_CONTEXT_GET_FLAGS_IN_RSS_CONTEXT_ID, context); + MCDI_SET_DWORD(inbuf, RSS_CONTEXT_GET_FLAGS_OUT_FLAGS, + RSS_CONTEXT_FLAGS_DEFAULT); + rc = efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_GET_FLAGS, inbuf, + sizeof(inbuf), outbuf, sizeof(outbuf), &outlen); + if (rc == 0) { + if (outlen < MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_LEN) + rc = -EIO; + else + *flags = MCDI_DWORD(outbuf, RSS_CONTEXT_GET_FLAGS_OUT_FLAGS); + } + return rc; +} + +/* + * Attempt to enable 4-tuple UDP hashing on the specified RSS context. + * If we fail, we just leave the RSS context at its default hash settings, + * which is safe but may slightly reduce performance. + * Defaults are 4-tuple for TCP and 2-tuple for UDP and other-IP, so we + * just need to set the UDP ports flags (for both IP versions). + */ +void efx_mcdi_set_rss_context_flags(struct efx_nic *efx, + struct efx_rss_context *ctx) +{ + MCDI_DECLARE_BUF(inbuf, MC_CMD_RSS_CONTEXT_SET_FLAGS_IN_LEN); + u32 flags; + + BUILD_BUG_ON(MC_CMD_RSS_CONTEXT_SET_FLAGS_OUT_LEN != 0); + + if (efx_mcdi_get_rss_context_flags(efx, ctx->context_id, &flags) != 0) + return; + MCDI_SET_DWORD(inbuf, RSS_CONTEXT_SET_FLAGS_IN_RSS_CONTEXT_ID, + ctx->context_id); + flags |= RSS_MODE_HASH_PORTS << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_UDP_IPV4_RSS_MODE_LBN; + flags |= RSS_MODE_HASH_PORTS << MC_CMD_RSS_CONTEXT_GET_FLAGS_OUT_UDP_IPV6_RSS_MODE_LBN; + MCDI_SET_DWORD(inbuf, RSS_CONTEXT_SET_FLAGS_IN_FLAGS, flags); + if (!efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_SET_FLAGS, inbuf, sizeof(inbuf), + NULL, 0, NULL)) + /* Succeeded, so UDP 4-tuple is now enabled */ + ctx->rx_hash_udp_4tuple = true; +} + +static int efx_mcdi_filter_alloc_rss_context(struct efx_nic *efx, bool exclusive, + struct efx_rss_context *ctx, + unsigned *context_size) +{ + MCDI_DECLARE_BUF(inbuf, MC_CMD_RSS_CONTEXT_ALLOC_IN_LEN); + MCDI_DECLARE_BUF(outbuf, MC_CMD_RSS_CONTEXT_ALLOC_OUT_LEN); + struct efx_ef10_nic_data *nic_data = efx->nic_data; + size_t outlen; + int rc; + u32 alloc_type = exclusive ? + MC_CMD_RSS_CONTEXT_ALLOC_IN_TYPE_EXCLUSIVE : + MC_CMD_RSS_CONTEXT_ALLOC_IN_TYPE_SHARED; + unsigned rss_spread = exclusive ? + efx->rss_spread : + min(rounddown_pow_of_two(efx->rss_spread), + EFX_EF10_MAX_SHARED_RSS_CONTEXT_SIZE); + + if (!exclusive && rss_spread == 1) { + ctx->context_id = EFX_MCDI_RSS_CONTEXT_INVALID; + if (context_size) + *context_size = 1; + return 0; + } + + if (nic_data->datapath_caps & + 1 << MC_CMD_GET_CAPABILITIES_OUT_RX_RSS_LIMITED_LBN) + return -EOPNOTSUPP; + + MCDI_SET_DWORD(inbuf, RSS_CONTEXT_ALLOC_IN_UPSTREAM_PORT_ID, + nic_data->vport_id); + MCDI_SET_DWORD(inbuf, RSS_CONTEXT_ALLOC_IN_TYPE, alloc_type); + MCDI_SET_DWORD(inbuf, RSS_CONTEXT_ALLOC_IN_NUM_QUEUES, rss_spread); + + rc = efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_ALLOC, inbuf, sizeof(inbuf), + outbuf, sizeof(outbuf), &outlen); + if (rc != 0) + return rc; + + if (outlen < MC_CMD_RSS_CONTEXT_ALLOC_OUT_LEN) + return -EIO; + + ctx->context_id = MCDI_DWORD(outbuf, RSS_CONTEXT_ALLOC_OUT_RSS_CONTEXT_ID); + + if (context_size) + *context_size = rss_spread; + + if (nic_data->datapath_caps & + 1 << MC_CMD_GET_CAPABILITIES_OUT_ADDITIONAL_RSS_MODES_LBN) + efx_mcdi_set_rss_context_flags(efx, ctx); + + return 0; +} + +static int efx_mcdi_filter_free_rss_context(struct efx_nic *efx, u32 context) +{ + MCDI_DECLARE_BUF(inbuf, MC_CMD_RSS_CONTEXT_FREE_IN_LEN); + + MCDI_SET_DWORD(inbuf, RSS_CONTEXT_FREE_IN_RSS_CONTEXT_ID, + context); + return efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_FREE, inbuf, sizeof(inbuf), + NULL, 0, NULL); +} + +static int efx_mcdi_filter_populate_rss_table(struct efx_nic *efx, u32 context, + const u32 *rx_indir_table, const u8 *key) +{ + MCDI_DECLARE_BUF(tablebuf, MC_CMD_RSS_CONTEXT_SET_TABLE_IN_LEN); + MCDI_DECLARE_BUF(keybuf, MC_CMD_RSS_CONTEXT_SET_KEY_IN_LEN); + int i, rc; + + MCDI_SET_DWORD(tablebuf, RSS_CONTEXT_SET_TABLE_IN_RSS_CONTEXT_ID, + context); + BUILD_BUG_ON(ARRAY_SIZE(efx->rss_context.rx_indir_table) != + MC_CMD_RSS_CONTEXT_SET_TABLE_IN_INDIRECTION_TABLE_LEN); + + /* This iterates over the length of efx->rss_context.rx_indir_table, but + * copies bytes from rx_indir_table. That's because the latter is a + * pointer rather than an array, but should have the same length. + * The efx->rss_context.rx_hash_key loop below is similar. + */ + for (i = 0; i < ARRAY_SIZE(efx->rss_context.rx_indir_table); ++i) + MCDI_PTR(tablebuf, + RSS_CONTEXT_SET_TABLE_IN_INDIRECTION_TABLE)[i] = + (u8) rx_indir_table[i]; + + rc = efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_SET_TABLE, tablebuf, + sizeof(tablebuf), NULL, 0, NULL); + if (rc != 0) + return rc; + + MCDI_SET_DWORD(keybuf, RSS_CONTEXT_SET_KEY_IN_RSS_CONTEXT_ID, + context); + BUILD_BUG_ON(ARRAY_SIZE(efx->rss_context.rx_hash_key) != + MC_CMD_RSS_CONTEXT_SET_KEY_IN_TOEPLITZ_KEY_LEN); + for (i = 0; i < ARRAY_SIZE(efx->rss_context.rx_hash_key); ++i) + MCDI_PTR(keybuf, RSS_CONTEXT_SET_KEY_IN_TOEPLITZ_KEY)[i] = key[i]; + + return efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_SET_KEY, keybuf, + sizeof(keybuf), NULL, 0, NULL); +} + +void efx_mcdi_rx_free_indir_table(struct efx_nic *efx) +{ + int rc; + + if (efx->rss_context.context_id != EFX_MCDI_RSS_CONTEXT_INVALID) { + rc = efx_mcdi_filter_free_rss_context(efx, efx->rss_context.context_id); + WARN_ON(rc != 0); + } + efx->rss_context.context_id = EFX_MCDI_RSS_CONTEXT_INVALID; +} + +static int efx_mcdi_filter_rx_push_shared_rss_config(struct efx_nic *efx, + unsigned *context_size) +{ + struct efx_ef10_nic_data *nic_data = efx->nic_data; + int rc = efx_mcdi_filter_alloc_rss_context(efx, false, &efx->rss_context, + context_size); + + if (rc != 0) + return rc; + + nic_data->rx_rss_context_exclusive = false; + efx_set_default_rx_indir_table(efx, &efx->rss_context); + return 0; +} + +static int efx_mcdi_filter_rx_push_exclusive_rss_config(struct efx_nic *efx, + const u32 *rx_indir_table, + const u8 *key) +{ + u32 old_rx_rss_context = efx->rss_context.context_id; + struct efx_ef10_nic_data *nic_data = efx->nic_data; + int rc; + + if (efx->rss_context.context_id == EFX_MCDI_RSS_CONTEXT_INVALID || + !nic_data->rx_rss_context_exclusive) { + rc = efx_mcdi_filter_alloc_rss_context(efx, true, &efx->rss_context, + NULL); + if (rc == -EOPNOTSUPP) + return rc; + else if (rc != 0) + goto fail1; + } + + rc = efx_mcdi_filter_populate_rss_table(efx, efx->rss_context.context_id, + rx_indir_table, key); + if (rc != 0) + goto fail2; + + if (efx->rss_context.context_id != old_rx_rss_context && + old_rx_rss_context != EFX_MCDI_RSS_CONTEXT_INVALID) + WARN_ON(efx_mcdi_filter_free_rss_context(efx, old_rx_rss_context) != 0); + nic_data->rx_rss_context_exclusive = true; + if (rx_indir_table != efx->rss_context.rx_indir_table) + memcpy(efx->rss_context.rx_indir_table, rx_indir_table, + sizeof(efx->rss_context.rx_indir_table)); + if (key != efx->rss_context.rx_hash_key) + memcpy(efx->rss_context.rx_hash_key, key, + efx->type->rx_hash_key_size); + + return 0; + +fail2: + if (old_rx_rss_context != efx->rss_context.context_id) { + WARN_ON(efx_mcdi_filter_free_rss_context(efx, efx->rss_context.context_id) != 0); + efx->rss_context.context_id = old_rx_rss_context; + } +fail1: + netif_err(efx, hw, efx->net_dev, "%s: failed rc=%d\n", __func__, rc); + return rc; +} + +int efx_mcdi_rx_push_rss_context_config(struct efx_nic *efx, + struct efx_rss_context *ctx, + const u32 *rx_indir_table, + const u8 *key) +{ + int rc; + + WARN_ON(!mutex_is_locked(&efx->rss_lock)); + + if (ctx->context_id == EFX_MCDI_RSS_CONTEXT_INVALID) { + rc = efx_mcdi_filter_alloc_rss_context(efx, true, ctx, NULL); + if (rc) + return rc; + } + + if (!rx_indir_table) /* Delete this context */ + return efx_mcdi_filter_free_rss_context(efx, ctx->context_id); + + rc = efx_mcdi_filter_populate_rss_table(efx, ctx->context_id, + rx_indir_table, key); + if (rc) + return rc; + + memcpy(ctx->rx_indir_table, rx_indir_table, + sizeof(efx->rss_context.rx_indir_table)); + memcpy(ctx->rx_hash_key, key, efx->type->rx_hash_key_size); + + return 0; +} + +int efx_mcdi_rx_pull_rss_context_config(struct efx_nic *efx, + struct efx_rss_context *ctx) +{ + MCDI_DECLARE_BUF(inbuf, MC_CMD_RSS_CONTEXT_GET_TABLE_IN_LEN); + MCDI_DECLARE_BUF(tablebuf, MC_CMD_RSS_CONTEXT_GET_TABLE_OUT_LEN); + MCDI_DECLARE_BUF(keybuf, MC_CMD_RSS_CONTEXT_GET_KEY_OUT_LEN); + size_t outlen; + int rc, i; + + WARN_ON(!mutex_is_locked(&efx->rss_lock)); + + BUILD_BUG_ON(MC_CMD_RSS_CONTEXT_GET_TABLE_IN_LEN != + MC_CMD_RSS_CONTEXT_GET_KEY_IN_LEN); + + if (ctx->context_id == EFX_MCDI_RSS_CONTEXT_INVALID) + return -ENOENT; + + MCDI_SET_DWORD(inbuf, RSS_CONTEXT_GET_TABLE_IN_RSS_CONTEXT_ID, + ctx->context_id); + BUILD_BUG_ON(ARRAY_SIZE(ctx->rx_indir_table) != + MC_CMD_RSS_CONTEXT_GET_TABLE_OUT_INDIRECTION_TABLE_LEN); + rc = efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_GET_TABLE, inbuf, sizeof(inbuf), + tablebuf, sizeof(tablebuf), &outlen); + if (rc != 0) + return rc; + + if (WARN_ON(outlen != MC_CMD_RSS_CONTEXT_GET_TABLE_OUT_LEN)) + return -EIO; + + for (i = 0; i < ARRAY_SIZE(ctx->rx_indir_table); i++) + ctx->rx_indir_table[i] = MCDI_PTR(tablebuf, + RSS_CONTEXT_GET_TABLE_OUT_INDIRECTION_TABLE)[i]; + + MCDI_SET_DWORD(inbuf, RSS_CONTEXT_GET_KEY_IN_RSS_CONTEXT_ID, + ctx->context_id); + BUILD_BUG_ON(ARRAY_SIZE(ctx->rx_hash_key) != + MC_CMD_RSS_CONTEXT_SET_KEY_IN_TOEPLITZ_KEY_LEN); + rc = efx_mcdi_rpc(efx, MC_CMD_RSS_CONTEXT_GET_KEY, inbuf, sizeof(inbuf), + keybuf, sizeof(keybuf), &outlen); + if (rc != 0) + return rc; + + if (WARN_ON(outlen != MC_CMD_RSS_CONTEXT_GET_KEY_OUT_LEN)) + return -EIO; + + for (i = 0; i < ARRAY_SIZE(ctx->rx_hash_key); ++i) + ctx->rx_hash_key[i] = MCDI_PTR( + keybuf, RSS_CONTEXT_GET_KEY_OUT_TOEPLITZ_KEY)[i]; + + return 0; +} + +int efx_mcdi_rx_pull_rss_config(struct efx_nic *efx) +{ + int rc; + + mutex_lock(&efx->rss_lock); + rc = efx_mcdi_rx_pull_rss_context_config(efx, &efx->rss_context); + mutex_unlock(&efx->rss_lock); + return rc; +} + +void efx_mcdi_rx_restore_rss_contexts(struct efx_nic *efx) +{ + struct efx_ef10_nic_data *nic_data = efx->nic_data; + struct efx_rss_context *ctx; + int rc; + + WARN_ON(!mutex_is_locked(&efx->rss_lock)); + + if (!nic_data->must_restore_rss_contexts) + return; + + list_for_each_entry(ctx, &efx->rss_context.list, list) { + /* previous NIC RSS context is gone */ + ctx->context_id = EFX_MCDI_RSS_CONTEXT_INVALID; + /* so try to allocate a new one */ + rc = efx_mcdi_rx_push_rss_context_config(efx, ctx, + ctx->rx_indir_table, + ctx->rx_hash_key); + if (rc) + netif_warn(efx, probe, efx->net_dev, + "failed to restore RSS context %u, rc=%d" + "; RSS filters may fail to be applied\n", + ctx->user_id, rc); + } + nic_data->must_restore_rss_contexts = false; +} + +int efx_mcdi_pf_rx_push_rss_config(struct efx_nic *efx, bool user, + const u32 *rx_indir_table, + const u8 *key) +{ + int rc; + + if (efx->rss_spread == 1) + return 0; + + if (!key) + key = efx->rss_context.rx_hash_key; + + rc = efx_mcdi_filter_rx_push_exclusive_rss_config(efx, rx_indir_table, key); + + if (rc == -ENOBUFS && !user) { + unsigned context_size; + bool mismatch = false; + size_t i; + + for (i = 0; + i < ARRAY_SIZE(efx->rss_context.rx_indir_table) && !mismatch; + i++) + mismatch = rx_indir_table[i] != + ethtool_rxfh_indir_default(i, efx->rss_spread); + + rc = efx_mcdi_filter_rx_push_shared_rss_config(efx, &context_size); + if (rc == 0) { + if (context_size != efx->rss_spread) + netif_warn(efx, probe, efx->net_dev, + "Could not allocate an exclusive RSS" + " context; allocated a shared one of" + " different size." + " Wanted %u, got %u.\n", + efx->rss_spread, context_size); + else if (mismatch) + netif_warn(efx, probe, efx->net_dev, + "Could not allocate an exclusive RSS" + " context; allocated a shared one but" + " could not apply custom" + " indirection.\n"); + else + netif_info(efx, probe, efx->net_dev, + "Could not allocate an exclusive RSS" + " context; allocated a shared one.\n"); + } + } + return rc; +} + +int efx_mcdi_vf_rx_push_rss_config(struct efx_nic *efx, bool user, + const u32 *rx_indir_table + __attribute__ ((unused)), + const u8 *key + __attribute__ ((unused))) +{ + if (user) + return -EOPNOTSUPP; + if (efx->rss_context.context_id != EFX_MCDI_RSS_CONTEXT_INVALID) + return 0; + return efx_mcdi_filter_rx_push_shared_rss_config(efx, NULL); +} diff --git a/drivers/net/ethernet/sfc/mcdi_filters.h b/drivers/net/ethernet/sfc/mcdi_filters.h new file mode 100644 index 000000000000..1837f4f5d661 --- /dev/null +++ b/drivers/net/ethernet/sfc/mcdi_filters.h @@ -0,0 +1,159 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/**************************************************************************** + * Driver for Solarflare network controllers and boards + * Copyright 2019 Solarflare Communications Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation, incorporated herein by reference. + */ +#ifndef EFX_MCDI_FILTERS_H +#define EFX_MCDI_FILTERS_H + +#include "net_driver.h" +#include "filter.h" +#include "mcdi_pcol.h" + +#define EFX_EF10_FILTER_DEV_UC_MAX 32 +#define EFX_EF10_FILTER_DEV_MC_MAX 256 + +enum efx_mcdi_filter_default_filters { + EFX_EF10_BCAST, + EFX_EF10_UCDEF, + EFX_EF10_MCDEF, + EFX_EF10_VXLAN4_UCDEF, + EFX_EF10_VXLAN4_MCDEF, + EFX_EF10_VXLAN6_UCDEF, + EFX_EF10_VXLAN6_MCDEF, + EFX_EF10_NVGRE4_UCDEF, + EFX_EF10_NVGRE4_MCDEF, + EFX_EF10_NVGRE6_UCDEF, + EFX_EF10_NVGRE6_MCDEF, + EFX_EF10_GENEVE4_UCDEF, + EFX_EF10_GENEVE4_MCDEF, + EFX_EF10_GENEVE6_UCDEF, + EFX_EF10_GENEVE6_MCDEF, + + EFX_EF10_NUM_DEFAULT_FILTERS +}; + +/* Per-VLAN filters information */ +struct efx_mcdi_filter_vlan { + struct list_head list; + u16 vid; + u16 uc[EFX_EF10_FILTER_DEV_UC_MAX]; + u16 mc[EFX_EF10_FILTER_DEV_MC_MAX]; + u16 default_filters[EFX_EF10_NUM_DEFAULT_FILTERS]; +}; + +struct efx_mcdi_dev_addr { + u8 addr[ETH_ALEN]; +}; + +struct efx_mcdi_filter_table { +/* The MCDI match masks supported by this fw & hw, in order of priority */ + u32 rx_match_mcdi_flags[ + MC_CMD_GET_PARSER_DISP_INFO_OUT_SUPPORTED_MATCHES_MAXNUM * 2]; + unsigned int rx_match_count; + + struct rw_semaphore lock; /* Protects entries */ + struct { + unsigned long spec; /* pointer to spec plus flag bits */ +/* AUTO_OLD is used to mark and sweep MAC filters for the device address lists. */ +/* unused flag 1UL */ +#define EFX_EF10_FILTER_FLAG_AUTO_OLD 2UL +#define EFX_EF10_FILTER_FLAGS 3UL + u64 handle; /* firmware handle */ + } *entry; +/* Shadow of net_device address lists, guarded by mac_lock */ + struct efx_mcdi_dev_addr dev_uc_list[EFX_EF10_FILTER_DEV_UC_MAX]; + struct efx_mcdi_dev_addr dev_mc_list[EFX_EF10_FILTER_DEV_MC_MAX]; + int dev_uc_count; + int dev_mc_count; + bool uc_promisc; + bool mc_promisc; +/* Whether in multicast promiscuous mode when last changed */ + bool mc_promisc_last; + bool mc_overflow; /* Too many MC addrs; should always imply mc_promisc */ + bool vlan_filter; + struct list_head vlan_list; +}; + +int efx_mcdi_filter_table_probe(struct efx_nic *efx); +void efx_mcdi_filter_table_remove(struct efx_nic *efx); +void efx_mcdi_filter_table_restore(struct efx_nic *efx); + +/* + * The filter table(s) are managed by firmware and we have write-only + * access. When removing filters we must identify them to the + * firmware by a 64-bit handle, but this is too wide for Linux kernel + * interfaces (32-bit for RX NFC, 16-bit for RFS). Also, we need to + * be able to tell in advance whether a requested insertion will + * replace an existing filter. Therefore we maintain a software hash + * table, which should be at least as large as the hardware hash + * table. + * + * Huntington has a single 8K filter table shared between all filter + * types and both ports. + */ +#define EFX_MCDI_FILTER_TBL_ROWS 8192 + +bool efx_mcdi_filter_match_supported(struct efx_mcdi_filter_table *table, + bool encap, + enum efx_filter_match_flags match_flags); + +void efx_mcdi_filter_sync_rx_mode(struct efx_nic *efx); +s32 efx_mcdi_filter_insert(struct efx_nic *efx, struct efx_filter_spec *spec, + bool replace_equal); +int efx_mcdi_filter_remove_safe(struct efx_nic *efx, + enum efx_filter_priority priority, + u32 filter_id); +int efx_mcdi_filter_get_safe(struct efx_nic *efx, + enum efx_filter_priority priority, + u32 filter_id, struct efx_filter_spec *spec); + +u32 efx_mcdi_filter_count_rx_used(struct efx_nic *efx, + enum efx_filter_priority priority); +int efx_mcdi_filter_clear_rx(struct efx_nic *efx, + enum efx_filter_priority priority); +u32 efx_mcdi_filter_get_rx_id_limit(struct efx_nic *efx); +s32 efx_mcdi_filter_get_rx_ids(struct efx_nic *efx, + enum efx_filter_priority priority, + u32 *buf, u32 size); + +void efx_mcdi_filter_cleanup_vlans(struct efx_nic *efx); +int efx_mcdi_filter_add_vlan(struct efx_nic *efx, u16 vid); +struct efx_mcdi_filter_vlan *efx_mcdi_filter_find_vlan(struct efx_nic *efx, u16 vid); +void efx_mcdi_filter_del_vlan(struct efx_nic *efx, u16 vid); + +void efx_mcdi_rx_free_indir_table(struct efx_nic *efx); +int efx_mcdi_rx_push_rss_context_config(struct efx_nic *efx, + struct efx_rss_context *ctx, + const u32 *rx_indir_table, + const u8 *key); +int efx_mcdi_pf_rx_push_rss_config(struct efx_nic *efx, bool user, + const u32 *rx_indir_table, + const u8 *key); +int efx_mcdi_vf_rx_push_rss_config(struct efx_nic *efx, bool user, + const u32 *rx_indir_table + __attribute__ ((unused)), + const u8 *key + __attribute__ ((unused))); +int efx_mcdi_rx_pull_rss_config(struct efx_nic *efx); +int efx_mcdi_rx_pull_rss_context_config(struct efx_nic *efx, + struct efx_rss_context *ctx); +int efx_mcdi_get_rss_context_flags(struct efx_nic *efx, u32 context, + u32 *flags); +void efx_mcdi_set_rss_context_flags(struct efx_nic *efx, + struct efx_rss_context *ctx); +void efx_mcdi_rx_restore_rss_contexts(struct efx_nic *efx); + +static inline void efx_mcdi_update_rx_scatter(struct efx_nic *efx) +{ + /* no need to do anything here */ +} + +bool efx_mcdi_filter_rfs_expire_one(struct efx_nic *efx, u32 flow_id, + unsigned int filter_idx); + +#endif diff --git a/drivers/net/ethernet/socionext/netsec.c b/drivers/net/ethernet/socionext/netsec.c index 495f6cfdbd91..e8224b543dfc 100644 --- a/drivers/net/ethernet/socionext/netsec.c +++ b/drivers/net/ethernet/socionext/netsec.c @@ -935,7 +935,6 @@ static int netsec_process_rx(struct netsec_priv *priv, int budget) struct netsec_rx_pkt_info rx_info; enum dma_data_direction dma_dir; struct bpf_prog *xdp_prog; - struct sk_buff *skb = NULL; u16 xdp_xmit = 0; u32 xdp_act = 0; int done = 0; @@ -949,7 +948,8 @@ static int netsec_process_rx(struct netsec_priv *priv, int budget) struct netsec_de *de = dring->vaddr + (DESC_SZ * idx); struct netsec_desc *desc = &dring->desc[idx]; struct page *page = virt_to_page(desc->addr); - u32 xdp_result = XDP_PASS; + u32 xdp_result = NETSEC_XDP_PASS; + struct sk_buff *skb = NULL; u16 pkt_len, desc_len; dma_addr_t dma_handle; struct xdp_buff xdp; diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c index 4775f49d7f3b..d10ac54bf385 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c @@ -412,9 +412,9 @@ stmmac_probe_config_dt(struct platform_device *pdev, const char **mac) *mac = NULL; } - rc = of_get_phy_mode(np, &plat->phy_interface); - if (rc) - return ERR_PTR(rc); + plat->phy_interface = device_get_phy_mode(&pdev->dev); + if (plat->phy_interface < 0) + return ERR_PTR(plat->phy_interface); plat->interface = stmmac_of_get_mac_mode(np); if (plat->interface < 0) diff --git a/drivers/net/fddi/skfp/skfddi.c b/drivers/net/fddi/skfp/skfddi.c index 15754451cb87..69c29a2ef95d 100644 --- a/drivers/net/fddi/skfp/skfddi.c +++ b/drivers/net/fddi/skfp/skfddi.c @@ -1520,22 +1520,8 @@ void mac_drv_tx_complete(struct s_smc *smc, volatile struct s_smt_fp_txd *txd) #ifdef DUMPPACKETS void dump_data(unsigned char *Data, int length) { - int i, j; - unsigned char s[255], sh[10]; - if (length > 64) { - length = 64; - } printk(KERN_INFO "---Packet start---\n"); - for (i = 0, j = 0; i < length / 8; i++, j += 8) - printk(KERN_INFO "%02x %02x %02x %02x %02x %02x %02x %02x\n", - Data[j + 0], Data[j + 1], Data[j + 2], Data[j + 3], - Data[j + 4], Data[j + 5], Data[j + 6], Data[j + 7]); - strcpy(s, ""); - for (i = 0; i < length % 8; i++) { - sprintf(sh, "%02x ", Data[j + i]); - strcat(s, sh); - } - printk(KERN_INFO "%s\n", s); + print_hex_dump(KERN_INFO, "", DUMP_PREFIX_NONE, 16, 1, Data, min_t(size_t, length, 64), false); printk(KERN_INFO "------------------\n"); } // dump_data #else diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c index bc85db8e6cdf..7032a2405e1a 100644 --- a/drivers/net/gtp.c +++ b/drivers/net/gtp.c @@ -804,19 +804,21 @@ static struct sock *gtp_encap_enable_socket(int fd, int type, return NULL; } - if (sock->sk->sk_protocol != IPPROTO_UDP) { + sk = sock->sk; + if (sk->sk_protocol != IPPROTO_UDP || + sk->sk_type != SOCK_DGRAM || + (sk->sk_family != AF_INET && sk->sk_family != AF_INET6)) { pr_debug("socket fd=%d not UDP\n", fd); sk = ERR_PTR(-EINVAL); goto out_sock; } - lock_sock(sock->sk); - if (sock->sk->sk_user_data) { + lock_sock(sk); + if (sk->sk_user_data) { sk = ERR_PTR(-EBUSY); goto out_rel_sock; } - sk = sock->sk; sock_hold(sk); tuncfg.sk_user_data = gtp; diff --git a/drivers/net/hyperv/Makefile b/drivers/net/hyperv/Makefile index 3a2aa0708166..0db7ccaec4a4 100644 --- a/drivers/net/hyperv/Makefile +++ b/drivers/net/hyperv/Makefile @@ -1,4 +1,4 @@ # SPDX-License-Identifier: GPL-2.0-only obj-$(CONFIG_HYPERV_NET) += hv_netvsc.o -hv_netvsc-y := netvsc_drv.o netvsc.o rndis_filter.o netvsc_trace.o +hv_netvsc-y := netvsc_drv.o netvsc.o rndis_filter.o netvsc_trace.o netvsc_bpf.o diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h index dc44819946e6..abda736e7c7d 100644 --- a/drivers/net/hyperv/hyperv_net.h +++ b/drivers/net/hyperv/hyperv_net.h @@ -142,6 +142,8 @@ struct netvsc_device_info { u32 send_section_size; u32 recv_section_size; + struct bpf_prog *bprog; + u8 rss_key[NETVSC_HASH_KEYLEN]; }; @@ -189,7 +191,8 @@ int netvsc_send(struct net_device *net, struct hv_netvsc_packet *packet, struct rndis_message *rndis_msg, struct hv_page_buffer *page_buffer, - struct sk_buff *skb); + struct sk_buff *skb, + bool xdp_tx); void netvsc_linkstatus_callback(struct net_device *net, struct rndis_message *resp); int netvsc_recv_callback(struct net_device *net, @@ -198,6 +201,16 @@ int netvsc_recv_callback(struct net_device *net, void netvsc_channel_cb(void *context); int netvsc_poll(struct napi_struct *napi, int budget); +u32 netvsc_run_xdp(struct net_device *ndev, struct netvsc_channel *nvchan, + struct xdp_buff *xdp); +unsigned int netvsc_xdp_fraglen(unsigned int len); +struct bpf_prog *netvsc_xdp_get(struct netvsc_device *nvdev); +int netvsc_xdp_set(struct net_device *dev, struct bpf_prog *prog, + struct netlink_ext_ack *extack, + struct netvsc_device *nvdev); +int netvsc_vf_setxdp(struct net_device *vf_netdev, struct bpf_prog *prog); +int netvsc_bpf(struct net_device *dev, struct netdev_bpf *bpf); + int rndis_set_subchannel(struct net_device *ndev, struct netvsc_device *nvdev, struct netvsc_device_info *dev_info); @@ -832,6 +845,8 @@ struct nvsp_message { #define RNDIS_MAX_PKT_DEFAULT 8 #define RNDIS_PKT_ALIGN_DEFAULT 8 +#define NETVSC_XDP_HDRM 256 + struct multi_send_data { struct sk_buff *skb; /* skb containing the pkt */ struct hv_netvsc_packet *pkt; /* netvsc pkt pending */ @@ -867,6 +882,7 @@ struct netvsc_stats { u64 bytes; u64 broadcast; u64 multicast; + u64 xdp_drop; struct u64_stats_sync syncp; }; @@ -972,6 +988,9 @@ struct netvsc_channel { atomic_t queue_sends; struct nvsc_rsc rsc; + struct bpf_prog __rcu *bpf_prog; + struct xdp_rxq_info xdp_rxq; + struct netvsc_stats tx_stats; struct netvsc_stats rx_stats; }; diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index eab83e71567a..ae3f3084c2ed 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -122,8 +122,10 @@ static void free_netvsc_device(struct rcu_head *head) vfree(nvdev->send_buf); kfree(nvdev->send_section_map); - for (i = 0; i < VRSS_CHANNEL_MAX; i++) + for (i = 0; i < VRSS_CHANNEL_MAX; i++) { + xdp_rxq_info_unreg(&nvdev->chan_table[i].xdp_rxq); vfree(nvdev->chan_table[i].mrc.slots); + } kfree(nvdev); } @@ -900,7 +902,8 @@ int netvsc_send(struct net_device *ndev, struct hv_netvsc_packet *packet, struct rndis_message *rndis_msg, struct hv_page_buffer *pb, - struct sk_buff *skb) + struct sk_buff *skb, + bool xdp_tx) { struct net_device_context *ndev_ctx = netdev_priv(ndev); struct netvsc_device *net_device @@ -923,10 +926,11 @@ int netvsc_send(struct net_device *ndev, packet->send_buf_index = NETVSC_INVALID_INDEX; packet->cp_partial = false; - /* Send control message directly without accessing msd (Multi-Send - * Data) field which may be changed during data packet processing. + /* Send a control message or XDP packet directly without accessing + * msd (Multi-Send Data) field which may be changed during data packet + * processing. */ - if (!skb) + if (!skb || xdp_tx) return netvsc_send_pkt(device, packet, net_device, pb, skb); /* batch packets in send buffer if possible */ @@ -1392,6 +1396,21 @@ struct netvsc_device *netvsc_device_add(struct hv_device *device, nvchan->net_device = net_device; u64_stats_init(&nvchan->tx_stats.syncp); u64_stats_init(&nvchan->rx_stats.syncp); + + ret = xdp_rxq_info_reg(&nvchan->xdp_rxq, ndev, i); + + if (ret) { + netdev_err(ndev, "xdp_rxq_info_reg fail: %d\n", ret); + goto cleanup2; + } + + ret = xdp_rxq_info_reg_mem_model(&nvchan->xdp_rxq, + MEM_TYPE_PAGE_SHARED, NULL); + + if (ret) { + netdev_err(ndev, "xdp reg_mem_model fail: %d\n", ret); + goto cleanup2; + } } /* Enable NAPI handler before init callbacks */ @@ -1437,6 +1456,8 @@ close: cleanup: netif_napi_del(&net_device->chan_table[0].napi); + +cleanup2: free_netvsc_device(&net_device->rcu); return ERR_PTR(ret); diff --git a/drivers/net/hyperv/netvsc_bpf.c b/drivers/net/hyperv/netvsc_bpf.c new file mode 100644 index 000000000000..20adfe544294 --- /dev/null +++ b/drivers/net/hyperv/netvsc_bpf.c @@ -0,0 +1,209 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright (c) 2019, Microsoft Corporation. + * + * Author: + * Haiyang Zhang <haiyangz@microsoft.com> + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include <linux/netdevice.h> +#include <linux/etherdevice.h> +#include <linux/ethtool.h> +#include <linux/bpf.h> +#include <linux/bpf_trace.h> +#include <linux/kernel.h> +#include <net/xdp.h> + +#include <linux/mutex.h> +#include <linux/rtnetlink.h> + +#include "hyperv_net.h" + +u32 netvsc_run_xdp(struct net_device *ndev, struct netvsc_channel *nvchan, + struct xdp_buff *xdp) +{ + void *data = nvchan->rsc.data[0]; + u32 len = nvchan->rsc.len[0]; + struct page *page = NULL; + struct bpf_prog *prog; + u32 act = XDP_PASS; + + xdp->data_hard_start = NULL; + + rcu_read_lock(); + prog = rcu_dereference(nvchan->bpf_prog); + + if (!prog) + goto out; + + /* allocate page buffer for data */ + page = alloc_page(GFP_ATOMIC); + if (!page) { + act = XDP_DROP; + goto out; + } + + xdp->data_hard_start = page_address(page); + xdp->data = xdp->data_hard_start + NETVSC_XDP_HDRM; + xdp_set_data_meta_invalid(xdp); + xdp->data_end = xdp->data + len; + xdp->rxq = &nvchan->xdp_rxq; + xdp->handle = 0; + + memcpy(xdp->data, data, len); + + act = bpf_prog_run_xdp(prog, xdp); + + switch (act) { + case XDP_PASS: + case XDP_TX: + case XDP_DROP: + break; + + case XDP_ABORTED: + trace_xdp_exception(ndev, prog, act); + break; + + default: + bpf_warn_invalid_xdp_action(act); + } + +out: + rcu_read_unlock(); + + if (page && act != XDP_PASS && act != XDP_TX) { + __free_page(page); + xdp->data_hard_start = NULL; + } + + return act; +} + +unsigned int netvsc_xdp_fraglen(unsigned int len) +{ + return SKB_DATA_ALIGN(len) + + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); +} + +struct bpf_prog *netvsc_xdp_get(struct netvsc_device *nvdev) +{ + return rtnl_dereference(nvdev->chan_table[0].bpf_prog); +} + +int netvsc_xdp_set(struct net_device *dev, struct bpf_prog *prog, + struct netlink_ext_ack *extack, + struct netvsc_device *nvdev) +{ + struct bpf_prog *old_prog; + int buf_max, i; + + old_prog = netvsc_xdp_get(nvdev); + + if (!old_prog && !prog) + return 0; + + buf_max = NETVSC_XDP_HDRM + netvsc_xdp_fraglen(dev->mtu + ETH_HLEN); + if (prog && buf_max > PAGE_SIZE) { + netdev_err(dev, "XDP: mtu:%u too large, buf_max:%u\n", + dev->mtu, buf_max); + NL_SET_ERR_MSG_MOD(extack, "XDP: mtu too large"); + + return -EOPNOTSUPP; + } + + if (prog && (dev->features & NETIF_F_LRO)) { + netdev_err(dev, "XDP: not support LRO\n"); + NL_SET_ERR_MSG_MOD(extack, "XDP: not support LRO"); + + return -EOPNOTSUPP; + } + + if (prog) + bpf_prog_add(prog, nvdev->num_chn); + + for (i = 0; i < nvdev->num_chn; i++) + rcu_assign_pointer(nvdev->chan_table[i].bpf_prog, prog); + + if (old_prog) + for (i = 0; i < nvdev->num_chn; i++) + bpf_prog_put(old_prog); + + return 0; +} + +int netvsc_vf_setxdp(struct net_device *vf_netdev, struct bpf_prog *prog) +{ + struct netdev_bpf xdp; + bpf_op_t ndo_bpf; + + ASSERT_RTNL(); + + if (!vf_netdev) + return 0; + + ndo_bpf = vf_netdev->netdev_ops->ndo_bpf; + if (!ndo_bpf) + return 0; + + memset(&xdp, 0, sizeof(xdp)); + + xdp.command = XDP_SETUP_PROG; + xdp.prog = prog; + + return ndo_bpf(vf_netdev, &xdp); +} + +static u32 netvsc_xdp_query(struct netvsc_device *nvdev) +{ + struct bpf_prog *prog = netvsc_xdp_get(nvdev); + + if (prog) + return prog->aux->id; + + return 0; +} + +int netvsc_bpf(struct net_device *dev, struct netdev_bpf *bpf) +{ + struct net_device_context *ndevctx = netdev_priv(dev); + struct netvsc_device *nvdev = rtnl_dereference(ndevctx->nvdev); + struct net_device *vf_netdev = rtnl_dereference(ndevctx->vf_netdev); + struct netlink_ext_ack *extack = bpf->extack; + int ret; + + if (!nvdev || nvdev->destroy) { + if (bpf->command == XDP_QUERY_PROG) { + bpf->prog_id = 0; + return 0; /* Query must always succeed */ + } else { + return -ENODEV; + } + } + + switch (bpf->command) { + case XDP_SETUP_PROG: + ret = netvsc_xdp_set(dev, bpf->prog, extack, nvdev); + + if (ret) + return ret; + + ret = netvsc_vf_setxdp(vf_netdev, bpf->prog); + + if (ret) { + netdev_err(dev, "vf_setxdp failed:%d\n", ret); + NL_SET_ERR_MSG_MOD(extack, "vf_setxdp failed"); + + netvsc_xdp_set(dev, NULL, extack, nvdev); + } + + return ret; + + case XDP_QUERY_PROG: + bpf->prog_id = netvsc_xdp_query(nvdev); + return 0; + + default: + return -EINVAL; + } +} diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c index f3f9eb8a402a..8fc71bd49894 100644 --- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -25,6 +25,7 @@ #include <linux/slab.h> #include <linux/rtnetlink.h> #include <linux/netpoll.h> +#include <linux/bpf.h> #include <net/arp.h> #include <net/route.h> @@ -519,7 +520,7 @@ static int netvsc_vf_xmit(struct net_device *net, struct net_device *vf_netdev, return rc; } -static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net) +static int netvsc_xmit(struct sk_buff *skb, struct net_device *net, bool xdp_tx) { struct net_device_context *net_device_ctx = netdev_priv(net); struct hv_netvsc_packet *packet = NULL; @@ -686,7 +687,7 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net) /* timestamp packet in software */ skb_tx_timestamp(skb); - ret = netvsc_send(net, packet, rndis_msg, pb, skb); + ret = netvsc_send(net, packet, rndis_msg, pb, skb, xdp_tx); if (likely(ret == 0)) return NETDEV_TX_OK; @@ -709,6 +710,11 @@ no_memory: goto drop; } +static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *ndev) +{ + return netvsc_xmit(skb, ndev, false); +} + /* * netvsc_linkstatus_callback - Link up/down notification */ @@ -751,6 +757,22 @@ void netvsc_linkstatus_callback(struct net_device *net, schedule_delayed_work(&ndev_ctx->dwork, 0); } +static void netvsc_xdp_xmit(struct sk_buff *skb, struct net_device *ndev) +{ + int rc; + + skb->queue_mapping = skb_get_rx_queue(skb); + __skb_push(skb, ETH_HLEN); + + rc = netvsc_xmit(skb, ndev, true); + + if (dev_xmit_complete(rc)) + return; + + dev_kfree_skb_any(skb); + ndev->stats.tx_dropped++; +} + static void netvsc_comp_ipcsum(struct sk_buff *skb) { struct iphdr *iph = (struct iphdr *)skb->data; @@ -760,7 +782,8 @@ static void netvsc_comp_ipcsum(struct sk_buff *skb) } static struct sk_buff *netvsc_alloc_recv_skb(struct net_device *net, - struct netvsc_channel *nvchan) + struct netvsc_channel *nvchan, + struct xdp_buff *xdp) { struct napi_struct *napi = &nvchan->napi; const struct ndis_pkt_8021q_info *vlan = nvchan->rsc.vlan; @@ -768,18 +791,37 @@ static struct sk_buff *netvsc_alloc_recv_skb(struct net_device *net, nvchan->rsc.csum_info; const u32 *hash_info = nvchan->rsc.hash_info; struct sk_buff *skb; + void *xbuf = xdp->data_hard_start; int i; - skb = napi_alloc_skb(napi, nvchan->rsc.pktlen); - if (!skb) - return skb; + if (xbuf) { + unsigned int hdroom = xdp->data - xdp->data_hard_start; + unsigned int xlen = xdp->data_end - xdp->data; + unsigned int frag_size = netvsc_xdp_fraglen(hdroom + xlen); - /* - * Copy to skb. This copy is needed here since the memory pointed by - * hv_netvsc_packet cannot be deallocated - */ - for (i = 0; i < nvchan->rsc.cnt; i++) - skb_put_data(skb, nvchan->rsc.data[i], nvchan->rsc.len[i]); + skb = build_skb(xbuf, frag_size); + + if (!skb) { + __free_page(virt_to_page(xbuf)); + return NULL; + } + + skb_reserve(skb, hdroom); + skb_put(skb, xlen); + skb->dev = napi->dev; + } else { + skb = napi_alloc_skb(napi, nvchan->rsc.pktlen); + + if (!skb) + return NULL; + + /* Copy to skb. This copy is needed here since the memory + * pointed by hv_netvsc_packet cannot be deallocated. + */ + for (i = 0; i < nvchan->rsc.cnt; i++) + skb_put_data(skb, nvchan->rsc.data[i], + nvchan->rsc.len[i]); + } skb->protocol = eth_type_trans(skb, net); @@ -829,13 +871,25 @@ int netvsc_recv_callback(struct net_device *net, struct vmbus_channel *channel = nvchan->channel; u16 q_idx = channel->offermsg.offer.sub_channel_index; struct sk_buff *skb; - struct netvsc_stats *rx_stats; + struct netvsc_stats *rx_stats = &nvchan->rx_stats; + struct xdp_buff xdp; + u32 act; if (net->reg_state != NETREG_REGISTERED) return NVSP_STAT_FAIL; + act = netvsc_run_xdp(net, nvchan, &xdp); + + if (act != XDP_PASS && act != XDP_TX) { + u64_stats_update_begin(&rx_stats->syncp); + rx_stats->xdp_drop++; + u64_stats_update_end(&rx_stats->syncp); + + return NVSP_STAT_SUCCESS; /* consumed by XDP */ + } + /* Allocate a skb - TODO direct I/O to pages? */ - skb = netvsc_alloc_recv_skb(net, nvchan); + skb = netvsc_alloc_recv_skb(net, nvchan, &xdp); if (unlikely(!skb)) { ++net_device_ctx->eth_stats.rx_no_memory; @@ -849,7 +903,6 @@ int netvsc_recv_callback(struct net_device *net, * on the synthetic device because modifying the VF device * statistics will not work correctly. */ - rx_stats = &nvchan->rx_stats; u64_stats_update_begin(&rx_stats->syncp); rx_stats->packets++; rx_stats->bytes += nvchan->rsc.pktlen; @@ -860,6 +913,11 @@ int netvsc_recv_callback(struct net_device *net, ++rx_stats->multicast; u64_stats_update_end(&rx_stats->syncp); + if (act == XDP_TX) { + netvsc_xdp_xmit(skb, net); + return NVSP_STAT_SUCCESS; + } + napi_gro_receive(&nvchan->napi, skb); return NVSP_STAT_SUCCESS; } @@ -886,10 +944,11 @@ static void netvsc_get_channels(struct net_device *net, /* Alloc struct netvsc_device_info, and initialize it from either existing * struct netvsc_device, or from default values. */ -static struct netvsc_device_info *netvsc_devinfo_get - (struct netvsc_device *nvdev) +static +struct netvsc_device_info *netvsc_devinfo_get(struct netvsc_device *nvdev) { struct netvsc_device_info *dev_info; + struct bpf_prog *prog; dev_info = kzalloc(sizeof(*dev_info), GFP_ATOMIC); @@ -897,6 +956,8 @@ static struct netvsc_device_info *netvsc_devinfo_get return NULL; if (nvdev) { + ASSERT_RTNL(); + dev_info->num_chn = nvdev->num_chn; dev_info->send_sections = nvdev->send_section_cnt; dev_info->send_section_size = nvdev->send_section_size; @@ -905,6 +966,12 @@ static struct netvsc_device_info *netvsc_devinfo_get memcpy(dev_info->rss_key, nvdev->extension->rss_key, NETVSC_HASH_KEYLEN); + + prog = netvsc_xdp_get(nvdev); + if (prog) { + bpf_prog_inc(prog); + dev_info->bprog = prog; + } } else { dev_info->num_chn = VRSS_CHANNEL_DEFAULT; dev_info->send_sections = NETVSC_DEFAULT_TX; @@ -916,6 +983,17 @@ static struct netvsc_device_info *netvsc_devinfo_get return dev_info; } +/* Free struct netvsc_device_info */ +static void netvsc_devinfo_put(struct netvsc_device_info *dev_info) +{ + if (dev_info->bprog) { + ASSERT_RTNL(); + bpf_prog_put(dev_info->bprog); + } + + kfree(dev_info); +} + static int netvsc_detach(struct net_device *ndev, struct netvsc_device *nvdev) { @@ -927,6 +1005,8 @@ static int netvsc_detach(struct net_device *ndev, if (cancel_work_sync(&nvdev->subchan_work)) nvdev->num_chn = 1; + netvsc_xdp_set(ndev, NULL, NULL, nvdev); + /* If device was up (receiving) then shutdown */ if (netif_running(ndev)) { netvsc_tx_disable(nvdev, ndev); @@ -960,7 +1040,8 @@ static int netvsc_attach(struct net_device *ndev, struct hv_device *hdev = ndev_ctx->device_ctx; struct netvsc_device *nvdev; struct rndis_device *rdev; - int ret; + struct bpf_prog *prog; + int ret = 0; nvdev = rndis_filter_device_add(hdev, dev_info); if (IS_ERR(nvdev)) @@ -976,6 +1057,13 @@ static int netvsc_attach(struct net_device *ndev, } } + prog = dev_info->bprog; + if (prog) { + ret = netvsc_xdp_set(ndev, prog, NULL, nvdev); + if (ret) + goto err1; + } + /* In any case device is now ready */ netif_device_attach(ndev); @@ -985,7 +1073,7 @@ static int netvsc_attach(struct net_device *ndev, if (netif_running(ndev)) { ret = rndis_filter_open(nvdev); if (ret) - goto err; + goto err2; rdev = nvdev->extension; if (!rdev->link_state) @@ -994,9 +1082,10 @@ static int netvsc_attach(struct net_device *ndev, return 0; -err: +err2: netif_device_detach(ndev); +err1: rndis_filter_device_remove(hdev, nvdev); return ret; @@ -1046,7 +1135,7 @@ static int netvsc_set_channels(struct net_device *net, } out: - kfree(device_info); + netvsc_devinfo_put(device_info); return ret; } @@ -1153,7 +1242,7 @@ rollback_vf: dev_set_mtu(vf_netdev, orig_mtu); out: - kfree(device_info); + netvsc_devinfo_put(device_info); return ret; } @@ -1378,8 +1467,8 @@ static const struct { /* statistics per queue (rx/tx packets/bytes) */ #define NETVSC_PCPU_STATS_LEN (num_present_cpus() * ARRAY_SIZE(pcpu_stats)) -/* 4 statistics per queue (rx/tx packets/bytes) */ -#define NETVSC_QUEUE_STATS_LEN(dev) ((dev)->num_chn * 4) +/* 5 statistics per queue (rx/tx packets/bytes, rx xdp_drop) */ +#define NETVSC_QUEUE_STATS_LEN(dev) ((dev)->num_chn * 5) static int netvsc_get_sset_count(struct net_device *dev, int string_set) { @@ -1411,6 +1500,7 @@ static void netvsc_get_ethtool_stats(struct net_device *dev, struct netvsc_ethtool_pcpu_stats *pcpu_sum; unsigned int start; u64 packets, bytes; + u64 xdp_drop; int i, j, cpu; if (!nvdev) @@ -1439,9 +1529,11 @@ static void netvsc_get_ethtool_stats(struct net_device *dev, start = u64_stats_fetch_begin_irq(&qstats->syncp); packets = qstats->packets; bytes = qstats->bytes; + xdp_drop = qstats->xdp_drop; } while (u64_stats_fetch_retry_irq(&qstats->syncp, start)); data[i++] = packets; data[i++] = bytes; + data[i++] = xdp_drop; } pcpu_sum = kvmalloc_array(num_possible_cpus(), @@ -1489,6 +1581,8 @@ static void netvsc_get_strings(struct net_device *dev, u32 stringset, u8 *data) p += ETH_GSTRING_LEN; sprintf(p, "rx_queue_%u_bytes", i); p += ETH_GSTRING_LEN; + sprintf(p, "rx_queue_%u_xdp_drop", i); + p += ETH_GSTRING_LEN; } for_each_present_cpu(cpu) { @@ -1785,10 +1879,27 @@ static int netvsc_set_ringparam(struct net_device *ndev, } out: - kfree(device_info); + netvsc_devinfo_put(device_info); return ret; } +static netdev_features_t netvsc_fix_features(struct net_device *ndev, + netdev_features_t features) +{ + struct net_device_context *ndevctx = netdev_priv(ndev); + struct netvsc_device *nvdev = rtnl_dereference(ndevctx->nvdev); + + if (!nvdev || nvdev->destroy) + return features; + + if ((features & NETIF_F_LRO) && netvsc_xdp_get(nvdev)) { + features ^= NETIF_F_LRO; + netdev_info(ndev, "Skip LRO - unsupported with XDP\n"); + } + + return features; +} + static int netvsc_set_features(struct net_device *ndev, netdev_features_t features) { @@ -1875,12 +1986,14 @@ static const struct net_device_ops device_ops = { .ndo_start_xmit = netvsc_start_xmit, .ndo_change_rx_flags = netvsc_change_rx_flags, .ndo_set_rx_mode = netvsc_set_rx_mode, + .ndo_fix_features = netvsc_fix_features, .ndo_set_features = netvsc_set_features, .ndo_change_mtu = netvsc_change_mtu, .ndo_validate_addr = eth_validate_addr, .ndo_set_mac_address = netvsc_set_mac_addr, .ndo_select_queue = netvsc_select_queue, .ndo_get_stats64 = netvsc_get_stats64, + .ndo_bpf = netvsc_bpf, }; /* @@ -2167,6 +2280,7 @@ static int netvsc_register_vf(struct net_device *vf_netdev) { struct net_device_context *net_device_ctx; struct netvsc_device *netvsc_dev; + struct bpf_prog *prog; struct net_device *ndev; int ret; @@ -2211,6 +2325,9 @@ static int netvsc_register_vf(struct net_device *vf_netdev) vf_netdev->wanted_features = ndev->features; netdev_update_features(vf_netdev); + prog = netvsc_xdp_get(netvsc_dev); + netvsc_vf_setxdp(vf_netdev, prog); + return NOTIFY_OK; } @@ -2252,6 +2369,8 @@ static int netvsc_unregister_vf(struct net_device *vf_netdev) netdev_info(ndev, "VF unregistering: %s\n", vf_netdev->name); + netvsc_vf_setxdp(vf_netdev, NULL); + netdev_rx_handler_unregister(vf_netdev); netdev_upper_dev_unlink(vf_netdev, ndev); RCU_INIT_POINTER(net_device_ctx->vf_netdev, NULL); @@ -2363,14 +2482,14 @@ static int netvsc_probe(struct hv_device *dev, list_add(&net_device_ctx->list, &netvsc_dev_list); rtnl_unlock(); - kfree(device_info); + netvsc_devinfo_put(device_info); return 0; register_failed: rtnl_unlock(); rndis_filter_device_remove(dev, nvdev); rndis_failed: - kfree(device_info); + netvsc_devinfo_put(device_info); devinfo_failed: free_percpu(net_device_ctx->vf_stats); no_stats: @@ -2398,8 +2517,10 @@ static int netvsc_remove(struct hv_device *dev) rtnl_lock(); nvdev = rtnl_dereference(ndev_ctx->nvdev); - if (nvdev) + if (nvdev) { cancel_work_sync(&nvdev->subchan_work); + netvsc_xdp_set(net, NULL, NULL, nvdev); + } /* * Call to the vsc driver to let it know that the device is being @@ -2472,11 +2593,11 @@ static int netvsc_resume(struct hv_device *dev) ret = netvsc_attach(net, device_info); - rtnl_unlock(); - - kfree(device_info); + netvsc_devinfo_put(device_info); net_device_ctx->saved_netvsc_dev_info = NULL; + rtnl_unlock(); + return ret; } static const struct hv_vmbus_device_id id_table[] = { diff --git a/drivers/net/hyperv/rndis_filter.c b/drivers/net/hyperv/rndis_filter.c index e66d77dc28c8..b81ceba38218 100644 --- a/drivers/net/hyperv/rndis_filter.c +++ b/drivers/net/hyperv/rndis_filter.c @@ -235,7 +235,7 @@ static int rndis_filter_send_request(struct rndis_device *dev, trace_rndis_send(dev->ndev, 0, &req->request_msg); rcu_read_lock_bh(); - ret = netvsc_send(dev->ndev, packet, NULL, pb, NULL); + ret = netvsc_send(dev->ndev, packet, NULL, pb, NULL, false); rcu_read_unlock_bh(); return ret; diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig index 6b5ee26795a2..71fc778ce398 100644 --- a/drivers/net/phy/Kconfig +++ b/drivers/net/phy/Kconfig @@ -346,9 +346,10 @@ config DAVICOM_PHY Currently supports dm9161e and dm9131 config DP83822_PHY - tristate "Texas Instruments DP83822/825 PHYs" + tristate "Texas Instruments DP83822/825/826 PHYs" ---help--- - Supports the DP83822 and DP83825I PHYs. + Supports the DP83822, DP83825I, DP83825CM, DP83825CS, DP83825S, + DP83826C and DP83826NC PHYs. config DP83TC811_PHY tristate "Texas Instruments DP83TC811 PHY" diff --git a/drivers/net/phy/dp83822.c b/drivers/net/phy/dp83822.c index 8a4b1d167ce2..fe9aa3ad52a7 100644 --- a/drivers/net/phy/dp83822.c +++ b/drivers/net/phy/dp83822.c @@ -1,6 +1,5 @@ // SPDX-License-Identifier: GPL-2.0 -/* - * Driver for the Texas Instruments DP83822 PHY +/* Driver for the Texas Instruments DP83822, DP83825 and DP83826 PHYs. * * Copyright (C) 2017 Texas Instruments Inc. */ @@ -15,7 +14,12 @@ #include <linux/netdevice.h> #define DP83822_PHY_ID 0x2000a240 +#define DP83825S_PHY_ID 0x2000a140 #define DP83825I_PHY_ID 0x2000a150 +#define DP83825CM_PHY_ID 0x2000a160 +#define DP83825CS_PHY_ID 0x2000a170 +#define DP83826C_PHY_ID 0x2000a130 +#define DP83826NC_PHY_ID 0x2000a110 #define DP83822_DEVADDR 0x1f @@ -319,12 +323,22 @@ static int dp83822_resume(struct phy_device *phydev) static struct phy_driver dp83822_driver[] = { DP83822_PHY_DRIVER(DP83822_PHY_ID, "TI DP83822"), DP83822_PHY_DRIVER(DP83825I_PHY_ID, "TI DP83825I"), + DP83822_PHY_DRIVER(DP83826C_PHY_ID, "TI DP83826C"), + DP83822_PHY_DRIVER(DP83826NC_PHY_ID, "TI DP83826NC"), + DP83822_PHY_DRIVER(DP83825S_PHY_ID, "TI DP83825S"), + DP83822_PHY_DRIVER(DP83825CM_PHY_ID, "TI DP83825M"), + DP83822_PHY_DRIVER(DP83825CS_PHY_ID, "TI DP83825CS"), }; module_phy_driver(dp83822_driver); static struct mdio_device_id __maybe_unused dp83822_tbl[] = { { DP83822_PHY_ID, 0xfffffff0 }, { DP83825I_PHY_ID, 0xfffffff0 }, + { DP83826C_PHY_ID, 0xfffffff0 }, + { DP83826NC_PHY_ID, 0xfffffff0 }, + { DP83825S_PHY_ID, 0xfffffff0 }, + { DP83825CM_PHY_ID, 0xfffffff0 }, + { DP83825CS_PHY_ID, 0xfffffff0 }, { }, }; MODULE_DEVICE_TABLE(mdio, dp83822_tbl); diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c index 317d3a8df316..6f4d7ba8b109 100644 --- a/drivers/net/slip/slip.c +++ b/drivers/net/slip/slip.c @@ -452,9 +452,16 @@ static void slip_transmit(struct work_struct *work) */ static void slip_write_wakeup(struct tty_struct *tty) { - struct slip *sl = tty->disc_data; + struct slip *sl; + + rcu_read_lock(); + sl = rcu_dereference(tty->disc_data); + if (!sl) + goto out; schedule_work(&sl->tx_work); +out: + rcu_read_unlock(); } static void sl_tx_timeout(struct net_device *dev, unsigned int txqueue) @@ -882,10 +889,11 @@ static void slip_close(struct tty_struct *tty) return; spin_lock_bh(&sl->lock); - tty->disc_data = NULL; + rcu_assign_pointer(tty->disc_data, NULL); sl->tty = NULL; spin_unlock_bh(&sl->lock); + synchronize_rcu(); flush_work(&sl->tx_work); /* VSV = very important to remove timers */ diff --git a/drivers/net/tun.c b/drivers/net/tun.c index 3a5a6c655dda..650c937ed56b 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -1936,6 +1936,10 @@ drop: if (ret != XDP_PASS) { rcu_read_unlock(); local_bh_enable(); + if (frags) { + tfile->napi.skb = NULL; + mutex_unlock(&tfile->napi_mutex); + } return total_len; } } diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c index d2d61f082710..eccbf4cd7149 100644 --- a/drivers/net/usb/lan78xx.c +++ b/drivers/net/usb/lan78xx.c @@ -20,6 +20,7 @@ #include <linux/mdio.h> #include <linux/phy.h> #include <net/ip6_checksum.h> +#include <net/vxlan.h> #include <linux/interrupt.h> #include <linux/irqdomain.h> #include <linux/irq.h> @@ -3660,6 +3661,19 @@ static void lan78xx_tx_timeout(struct net_device *net, unsigned int txqueue) tasklet_schedule(&dev->bh); } +static netdev_features_t lan78xx_features_check(struct sk_buff *skb, + struct net_device *netdev, + netdev_features_t features) +{ + if (skb->len + TX_OVERHEAD > MAX_SINGLE_PACKET_SIZE) + features &= ~NETIF_F_GSO_MASK; + + features = vlan_features_check(skb, features); + features = vxlan_features_check(skb, features); + + return features; +} + static const struct net_device_ops lan78xx_netdev_ops = { .ndo_open = lan78xx_open, .ndo_stop = lan78xx_stop, @@ -3673,6 +3687,7 @@ static const struct net_device_ops lan78xx_netdev_ops = { .ndo_set_features = lan78xx_set_features, .ndo_vlan_rx_add_vid = lan78xx_vlan_rx_add_vid, .ndo_vlan_rx_kill_vid = lan78xx_vlan_rx_kill_vid, + .ndo_features_check = lan78xx_features_check, }; static void lan78xx_stat_monitor(struct timer_list *t) diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index 36051288034a..e8cd8c05b156 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -31,7 +31,7 @@ #define NETNEXT_VERSION "11" /* Information for net */ -#define NET_VERSION "10" +#define NET_VERSION "11" #define DRIVER_VERSION "v1." NETNEXT_VERSION "." NET_VERSION #define DRIVER_AUTHOR "Realtek linux nic maintainers <nic_swsd@realtek.com>" @@ -68,6 +68,7 @@ #define PLA_LED_FEATURE 0xdd92 #define PLA_PHYAR 0xde00 #define PLA_BOOT_CTRL 0xe004 +#define PLA_LWAKE_CTRL_REG 0xe007 #define PLA_GPHY_INTR_IMR 0xe022 #define PLA_EEE_CR 0xe040 #define PLA_EEEP_CR 0xe080 @@ -95,6 +96,7 @@ #define PLA_TALLYCNT 0xe890 #define PLA_SFF_STS_7 0xe8de #define PLA_PHYSTATUS 0xe908 +#define PLA_CONFIG6 0xe90a /* CONFIG6 */ #define PLA_BP_BA 0xfc26 #define PLA_BP_0 0xfc28 #define PLA_BP_1 0xfc2a @@ -107,6 +109,7 @@ #define PLA_BP_EN 0xfc38 #define USB_USB2PHY 0xb41e +#define USB_SSPHYLINK1 0xb426 #define USB_SSPHYLINK2 0xb428 #define USB_U2P3_CTRL 0xb460 #define USB_CSR_DUMMY1 0xb464 @@ -300,6 +303,9 @@ #define LINK_ON_WAKE_EN 0x0010 #define LINK_OFF_WAKE_EN 0x0008 +/* PLA_CONFIG6 */ +#define LANWAKE_CLR_EN BIT(0) + /* PLA_CONFIG5 */ #define BWF_EN 0x0040 #define MWF_EN 0x0020 @@ -312,6 +318,7 @@ /* PLA_PHY_PWR */ #define TX_10M_IDLE_EN 0x0080 #define PFM_PWM_SWITCH 0x0040 +#define TEST_IO_OFF BIT(4) /* PLA_MAC_PWR_CTRL */ #define D3_CLK_GATED_EN 0x00004000 @@ -324,6 +331,7 @@ #define MAC_CLK_SPDWN_EN BIT(15) /* PLA_MAC_PWR_CTRL3 */ +#define PLA_MCU_SPDWN_EN BIT(14) #define PKT_AVAIL_SPDWN_EN 0x0100 #define SUSPEND_SPDWN_EN 0x0004 #define U1U2_SPDWN_EN 0x0002 @@ -354,6 +362,9 @@ /* PLA_BOOT_CTRL */ #define AUTOLOAD_DONE 0x0002 +/* PLA_LWAKE_CTRL_REG */ +#define LANWAKE_PIN BIT(7) + /* PLA_SUSPEND_FLAG */ #define LINK_CHG_EVENT BIT(0) @@ -365,13 +376,18 @@ #define DEBUG_LTSSM 0x0082 /* PLA_EXTRA_STATUS */ +#define CUR_LINK_OK BIT(15) #define U3P3_CHECK_EN BIT(7) /* RTL_VER_05 only */ #define LINK_CHANGE_FLAG BIT(8) +#define POLL_LINK_CHG BIT(0) /* USB_USB2PHY */ #define USB2PHY_SUSPEND 0x0001 #define USB2PHY_L1 0x0002 +/* USB_SSPHYLINK1 */ +#define DELAY_PHY_PWR_CHG BIT(1) + /* USB_SSPHYLINK2 */ #define pwd_dn_scale_mask 0x3ffe #define pwd_dn_scale(x) ((x) << 1) @@ -2861,6 +2877,17 @@ static int rtl8153_enable(struct r8152 *tp) r8153_set_rx_early_timeout(tp); r8153_set_rx_early_size(tp); + if (tp->version == RTL_VER_09) { + u32 ocp_data; + + ocp_data = ocp_read_word(tp, MCU_TYPE_USB, USB_FW_TASK); + ocp_data &= ~FC_PATCH_TASK; + ocp_write_word(tp, MCU_TYPE_USB, USB_FW_TASK, ocp_data); + usleep_range(1000, 2000); + ocp_data |= FC_PATCH_TASK; + ocp_write_word(tp, MCU_TYPE_USB, USB_FW_TASK, ocp_data); + } + return rtl_enable(tp); } @@ -3374,8 +3401,8 @@ static void rtl8153b_runtime_enable(struct r8152 *tp, bool enable) r8153b_ups_en(tp, false); r8153_queue_wake(tp, false); rtl_runtime_suspend_enable(tp, false); - r8153_u2p3en(tp, true); - r8153b_u1u2en(tp, true); + if (tp->udev->speed != USB_SPEED_HIGH) + r8153b_u1u2en(tp, true); } } @@ -4673,7 +4700,6 @@ static void r8153b_hw_phy_cfg(struct r8152 *tp) r8153_aldps_en(tp, true); r8152b_enable_fc(tp); - r8153_u2p3en(tp, true); set_bit(PHY_RESET, &tp->flags); } @@ -4952,6 +4978,8 @@ static void rtl8152_down(struct r8152 *tp) static void rtl8153_up(struct r8152 *tp) { + u32 ocp_data; + if (test_bit(RTL8152_UNPLUG, &tp->flags)) return; @@ -4959,6 +4987,19 @@ static void rtl8153_up(struct r8152 *tp) r8153_u2p3en(tp, false); r8153_aldps_en(tp, false); r8153_first_init(tp); + + ocp_data = ocp_read_byte(tp, MCU_TYPE_PLA, PLA_CONFIG6); + ocp_data |= LANWAKE_CLR_EN; + ocp_write_byte(tp, MCU_TYPE_PLA, PLA_CONFIG6, ocp_data); + + ocp_data = ocp_read_byte(tp, MCU_TYPE_PLA, PLA_LWAKE_CTRL_REG); + ocp_data &= ~LANWAKE_PIN; + ocp_write_byte(tp, MCU_TYPE_PLA, PLA_LWAKE_CTRL_REG, ocp_data); + + ocp_data = ocp_read_word(tp, MCU_TYPE_USB, USB_SSPHYLINK1); + ocp_data &= ~DELAY_PHY_PWR_CHG; + ocp_write_word(tp, MCU_TYPE_USB, USB_SSPHYLINK1, ocp_data); + r8153_aldps_en(tp, true); switch (tp->version) { @@ -4977,11 +5018,17 @@ static void rtl8153_up(struct r8152 *tp) static void rtl8153_down(struct r8152 *tp) { + u32 ocp_data; + if (test_bit(RTL8152_UNPLUG, &tp->flags)) { rtl_drop_queued_tx(tp); return; } + ocp_data = ocp_read_byte(tp, MCU_TYPE_PLA, PLA_CONFIG6); + ocp_data &= ~LANWAKE_CLR_EN; + ocp_write_byte(tp, MCU_TYPE_PLA, PLA_CONFIG6, ocp_data); + r8153_u1u2en(tp, false); r8153_u2p3en(tp, false); r8153_power_cut_en(tp, false); @@ -4992,6 +5039,8 @@ static void rtl8153_down(struct r8152 *tp) static void rtl8153b_up(struct r8152 *tp) { + u32 ocp_data; + if (test_bit(RTL8152_UNPLUG, &tp->flags)) return; @@ -5002,18 +5051,29 @@ static void rtl8153b_up(struct r8152 *tp) r8153_first_init(tp); ocp_write_dword(tp, MCU_TYPE_USB, USB_RX_BUF_TH, RX_THR_B); + ocp_data = ocp_read_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL3); + ocp_data &= ~PLA_MCU_SPDWN_EN; + ocp_write_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL3, ocp_data); + r8153_aldps_en(tp, true); - r8153_u2p3en(tp, true); - r8153b_u1u2en(tp, true); + + if (tp->udev->speed != USB_SPEED_HIGH) + r8153b_u1u2en(tp, true); } static void rtl8153b_down(struct r8152 *tp) { + u32 ocp_data; + if (test_bit(RTL8152_UNPLUG, &tp->flags)) { rtl_drop_queued_tx(tp); return; } + ocp_data = ocp_read_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL3); + ocp_data |= PLA_MCU_SPDWN_EN; + ocp_write_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL3, ocp_data); + r8153b_u1u2en(tp, false); r8153_u2p3en(tp, false); r8153b_power_cut_en(tp, false); @@ -5385,6 +5445,16 @@ static void r8153_init(struct r8152 *tp) else ocp_data |= DYNAMIC_BURST; ocp_write_byte(tp, MCU_TYPE_USB, USB_CSR_DUMMY1, ocp_data); + + r8153_queue_wake(tp, false); + + ocp_data = ocp_read_word(tp, MCU_TYPE_PLA, PLA_EXTRA_STATUS); + if (rtl8152_get_speed(tp) & LINK_STATUS) + ocp_data |= CUR_LINK_OK; + else + ocp_data &= ~CUR_LINK_OK; + ocp_data |= POLL_LINK_CHG; + ocp_write_word(tp, MCU_TYPE_PLA, PLA_EXTRA_STATUS, ocp_data); } ocp_data = ocp_read_byte(tp, MCU_TYPE_USB, USB_CSR_DUMMY2); @@ -5414,10 +5484,19 @@ static void r8153_init(struct r8152 *tp) ocp_write_word(tp, MCU_TYPE_USB, USB_CONNECT_TIMER, 0x0001); r8153_power_cut_en(tp, false); + rtl_runtime_suspend_enable(tp, false); r8153_u1u2en(tp, true); r8153_mac_clk_spd(tp, false); usb_enable_lpm(tp->udev); + ocp_data = ocp_read_byte(tp, MCU_TYPE_PLA, PLA_CONFIG6); + ocp_data |= LANWAKE_CLR_EN; + ocp_write_byte(tp, MCU_TYPE_PLA, PLA_CONFIG6, ocp_data); + + ocp_data = ocp_read_byte(tp, MCU_TYPE_PLA, PLA_LWAKE_CTRL_REG); + ocp_data &= ~LANWAKE_PIN; + ocp_write_byte(tp, MCU_TYPE_PLA, PLA_LWAKE_CTRL_REG, ocp_data); + /* rx aggregation */ ocp_data = ocp_read_word(tp, MCU_TYPE_USB, USB_USB_CTRL); ocp_data &= ~(RX_AGG_DISABLE | RX_ZERO_EN); @@ -5482,7 +5561,17 @@ static void r8153b_init(struct r8152 *tp) r8153b_ups_en(tp, false); r8153_queue_wake(tp, false); rtl_runtime_suspend_enable(tp, false); - r8153b_u1u2en(tp, true); + + ocp_data = ocp_read_word(tp, MCU_TYPE_PLA, PLA_EXTRA_STATUS); + if (rtl8152_get_speed(tp) & LINK_STATUS) + ocp_data |= CUR_LINK_OK; + else + ocp_data &= ~CUR_LINK_OK; + ocp_data |= POLL_LINK_CHG; + ocp_write_word(tp, MCU_TYPE_PLA, PLA_EXTRA_STATUS, ocp_data); + + if (tp->udev->speed != USB_SPEED_HIGH) + r8153b_u1u2en(tp, true); usb_enable_lpm(tp->udev); /* MAC clock speed down */ @@ -5490,6 +5579,19 @@ static void r8153b_init(struct r8152 *tp) ocp_data |= MAC_CLK_SPDWN_EN; ocp_write_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL2, ocp_data); + ocp_data = ocp_read_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL3); + ocp_data &= ~PLA_MCU_SPDWN_EN; + ocp_write_word(tp, MCU_TYPE_PLA, PLA_MAC_PWR_CTRL3, ocp_data); + + if (tp->version == RTL_VER_09) { + /* Disable Test IO for 32QFN */ + if (ocp_read_byte(tp, MCU_TYPE_PLA, 0xdc00) & BIT(5)) { + ocp_data = ocp_read_word(tp, MCU_TYPE_PLA, PLA_PHY_PWR); + ocp_data |= TEST_IO_OFF; + ocp_write_word(tp, MCU_TYPE_PLA, PLA_PHY_PWR, ocp_data); + } + } + set_bit(GREEN_ETHERNET, &tp->flags); /* rx aggregation */ @@ -6705,6 +6807,11 @@ static int rtl8152_probe(struct usb_interface *intf, intf->needs_remote_wakeup = 1; + if (!rtl_can_wakeup(tp)) + __rtl_set_wol(tp, 0); + else + tp->saved_wolopts = __rtl_get_wol(tp); + tp->rtl_ops.init(tp); #if IS_BUILTIN(CONFIG_USB_RTL8152) /* Retry in case request_firmware() is not ready yet. */ @@ -6722,10 +6829,6 @@ static int rtl8152_probe(struct usb_interface *intf, goto out1; } - if (!rtl_can_wakeup(tp)) - __rtl_set_wol(tp, 0); - - tp->saved_wolopts = __rtl_get_wol(tp); if (tp->saved_wolopts) device_set_wakeup_enable(&udev->dev, true); else diff --git a/drivers/net/wireless/ath/ar5523/ar5523.c b/drivers/net/wireless/ath/ar5523/ar5523.c index da2d179430ca..49cc4b7ed516 100644 --- a/drivers/net/wireless/ath/ar5523/ar5523.c +++ b/drivers/net/wireless/ath/ar5523/ar5523.c @@ -74,7 +74,7 @@ static void ar5523_read_reply(struct ar5523 *ar, struct ar5523_cmd_hdr *hdr, if (cmd->odata) { if (cmd->olen < olen) { - ar5523_err(ar, "olen to small %d < %d\n", + ar5523_err(ar, "olen too small %d < %d\n", cmd->olen, olen); cmd->olen = 0; cmd->res = -EOVERFLOW; @@ -1770,6 +1770,8 @@ static const struct usb_device_id ar5523_id_table[] = { AR5523_DEVICE_UX(0x0846, 0x4300), /* Netgear / WG111U */ AR5523_DEVICE_UG(0x0846, 0x4250), /* Netgear / WG111T */ AR5523_DEVICE_UG(0x0846, 0x5f00), /* Netgear / WPN111 */ + AR5523_DEVICE_UG(0x083a, 0x4506), /* SMC / EZ Connect + SMCWUSBT-G2 */ AR5523_DEVICE_UG(0x157e, 0x3006), /* Umedia / AR5523_1 */ AR5523_DEVICE_UX(0x157e, 0x3205), /* Umedia / AR5523_2 */ AR5523_DEVICE_UG(0x157e, 0x3006), /* Umedia / TEW444UBEU */ diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c b/drivers/net/wireless/ath/ath10k/htt_rx.c index 7fb15180c685..38a5814cf345 100644 --- a/drivers/net/wireless/ath/ath10k/htt_rx.c +++ b/drivers/net/wireless/ath/ath10k/htt_rx.c @@ -2140,7 +2140,7 @@ static bool ath10k_htt_rx_pn_check_replay_hl(struct ath10k *ar, if (last_pn_valid) pn_invalid = ath10k_htt_rx_pn_cmp48(&new_pn, last_pn); else - peer->tids_last_pn_valid[tid] = 1; + peer->tids_last_pn_valid[tid] = true; if (!pn_invalid) last_pn->pn48 = new_pn.pn48; diff --git a/drivers/net/wireless/ath/ath10k/hw.h b/drivers/net/wireless/ath/ath10k/hw.h index 21b7a2a873b0..775fd62fb92d 100644 --- a/drivers/net/wireless/ath/ath10k/hw.h +++ b/drivers/net/wireless/ath/ath10k/hw.h @@ -816,7 +816,7 @@ ath10k_is_rssi_enable(struct ath10k_hw_params *hw, #define TARGET_10_4_TX_DBG_LOG_SIZE 1024 #define TARGET_10_4_NUM_WDS_ENTRIES 32 -#define TARGET_10_4_DMA_BURST_SIZE 0 +#define TARGET_10_4_DMA_BURST_SIZE 1 #define TARGET_10_4_MAC_AGGR_DELIM 0 #define TARGET_10_4_RX_SKIP_DEFRAG_TIMEOUT_DUP_DETECTION_CHECK 1 #define TARGET_10_4_VOW_CONFIG 0 diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c index bb44f5a0941b..ded7a220a4aa 100644 --- a/drivers/net/wireless/ath/ath10k/pci.c +++ b/drivers/net/wireless/ath/ath10k/pci.c @@ -1578,7 +1578,7 @@ static int ath10k_pci_set_ram_config(struct ath10k *ar, u32 config) return 0; } -/* if an error happened returns < 0, otherwise the length */ +/* Always returns the length */ static int ath10k_pci_dump_memory_sram(struct ath10k *ar, const struct ath10k_mem_region *region, u8 *buf) @@ -1604,11 +1604,22 @@ static int ath10k_pci_dump_memory_reg(struct ath10k *ar, { struct ath10k_pci *ar_pci = ath10k_pci_priv(ar); u32 i; + int ret; + + mutex_lock(&ar->conf_mutex); + if (ar->state != ATH10K_STATE_ON) { + ath10k_warn(ar, "Skipping pci_dump_memory_reg invalid state\n"); + ret = -EIO; + goto done; + } for (i = 0; i < region->len; i += 4) *(u32 *)(buf + i) = ioread32(ar_pci->mem + region->start + i); - return region->len; + ret = region->len; +done: + mutex_unlock(&ar->conf_mutex); + return ret; } /* if an error happened returns < 0, otherwise the length */ @@ -1704,7 +1715,11 @@ static void ath10k_pci_dump_memory(struct ath10k *ar, count = ath10k_pci_dump_memory_sram(ar, current_region, buf); break; case ATH10K_MEM_REGION_TYPE_IOREG: - count = ath10k_pci_dump_memory_reg(ar, current_region, buf); + ret = ath10k_pci_dump_memory_reg(ar, current_region, buf); + if (ret < 0) + break; + + count = ret; break; default: ret = ath10k_pci_dump_memory_generic(ar, current_region, buf); diff --git a/drivers/net/wireless/ath/ath10k/qmi.c b/drivers/net/wireless/ath/ath10k/qmi.c index 7b524b614f22..85dce43c5439 100644 --- a/drivers/net/wireless/ath/ath10k/qmi.c +++ b/drivers/net/wireless/ath/ath10k/qmi.c @@ -84,6 +84,9 @@ static int ath10k_qmi_setup_msa_permissions(struct ath10k_qmi *qmi) int ret; int i; + if (qmi->msa_fixed_perm) + return 0; + for (i = 0; i < qmi->nr_mem_region; i++) { ret = ath10k_qmi_map_msa_permission(qmi, &qmi->mem_region[i]); if (ret) @@ -102,6 +105,9 @@ static void ath10k_qmi_remove_msa_permission(struct ath10k_qmi *qmi) { int i; + if (qmi->msa_fixed_perm) + return; + for (i = 0; i < qmi->nr_mem_region; i++) ath10k_qmi_unmap_msa_permission(qmi, &qmi->mem_region[i]); } @@ -1035,6 +1041,9 @@ static int ath10k_qmi_setup_msa_resources(struct ath10k_qmi *qmi, u32 msa_size) qmi->msa_mem_size = msa_size; } + if (of_property_read_bool(dev->of_node, "qcom,msa-fixed-perm")) + qmi->msa_fixed_perm = true; + ath10k_dbg(ar, ATH10K_DBG_QMI, "msa pa: %pad , msa va: 0x%p\n", &qmi->msa_pa, qmi->msa_va); diff --git a/drivers/net/wireless/ath/ath10k/qmi.h b/drivers/net/wireless/ath/ath10k/qmi.h index 40aafb875ed0..dc257375f161 100644 --- a/drivers/net/wireless/ath/ath10k/qmi.h +++ b/drivers/net/wireless/ath/ath10k/qmi.h @@ -104,6 +104,7 @@ struct ath10k_qmi { bool fw_ready; char fw_build_timestamp[MAX_TIMESTAMP_LEN + 1]; struct ath10k_qmi_cal_data cal_data[MAX_NUM_CAL_V01]; + bool msa_fixed_perm; }; int ath10k_qmi_wlan_enable(struct ath10k *ar, diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c index 6fdf71b8b676..e5316b911e1d 100644 --- a/drivers/net/wireless/ath/ath10k/sdio.c +++ b/drivers/net/wireless/ath/ath10k/sdio.c @@ -642,16 +642,23 @@ static int ath10k_sdio_mbox_rx_fetch(struct ath10k *ar) ret = ath10k_sdio_readsb(ar, ar_sdio->mbox_info.htc_addr, skb->data, pkt->alloc_len); - - if (ret) { - ar_sdio->n_rx_pkts = 0; - ath10k_sdio_mbox_free_rx_pkt(pkt); - return ret; - } + if (ret) + goto err; htc_hdr = (struct ath10k_htc_hdr *)skb->data; pkt->act_len = le16_to_cpu(htc_hdr->len) + sizeof(*htc_hdr); + + if (pkt->act_len > pkt->alloc_len) { + ret = -EINVAL; + goto err; + } + skb_put(skb, pkt->act_len); + return 0; + +err: + ar_sdio->n_rx_pkts = 0; + ath10k_sdio_mbox_free_rx_pkt(pkt); return ret; } @@ -687,6 +694,11 @@ static int ath10k_sdio_mbox_rx_fetch_bundle(struct ath10k *ar) htc_hdr = (struct ath10k_htc_hdr *)(ar_sdio->vsg_buffer + pkt_offset); pkt->act_len = le16_to_cpu(htc_hdr->len) + sizeof(*htc_hdr); + if (pkt->act_len > pkt->alloc_len ) { + ret = -EINVAL; + goto err; + } + skb_put_data(pkt->skb, htc_hdr, pkt->act_len); pkt_offset += pkt->alloc_len; } diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c index 7e85c4916e7f..21081b4a27d7 100644 --- a/drivers/net/wireless/ath/ath10k/snoc.c +++ b/drivers/net/wireless/ath/ath10k/snoc.c @@ -46,7 +46,7 @@ static const char * const ath10k_regulators[] = { }; static const char * const ath10k_clocks[] = { - "cxo_ref_clk_pin", + "cxo_ref_clk_pin", "qdss", }; static void ath10k_snoc_htc_tx_cb(struct ath10k_ce_pipe *ce_state); @@ -582,7 +582,7 @@ static void ath10k_snoc_process_rx_cb(struct ath10k_ce_pipe *ce_state, max_nbytes, DMA_FROM_DEVICE); if (unlikely(max_nbytes < nbytes)) { - ath10k_warn(ar, "rxed more than expected (nbytes %d, max %d)", + ath10k_warn(ar, "rxed more than expected (nbytes %d, max %d)\n", nbytes, max_nbytes); dev_kfree_skb_any(skb); continue; @@ -1201,7 +1201,7 @@ static int ath10k_snoc_request_irq(struct ath10k *ar) irqflags, ce_name[id], ar); if (ret) { ath10k_err(ar, - "failed to register IRQ handler for CE %d: %d", + "failed to register IRQ handler for CE %d: %d\n", id, ret); goto err_irq; } @@ -1466,7 +1466,6 @@ MODULE_DEVICE_TABLE(of, ath10k_snoc_dt_match); static int ath10k_snoc_probe(struct platform_device *pdev) { const struct ath10k_snoc_drv_priv *drv_data; - const struct of_device_id *of_id; struct ath10k_snoc *ar_snoc; struct device *dev; struct ath10k *ar; @@ -1474,18 +1473,16 @@ static int ath10k_snoc_probe(struct platform_device *pdev) int ret; u32 i; - of_id = of_match_device(ath10k_snoc_dt_match, &pdev->dev); - if (!of_id) { - dev_err(&pdev->dev, "failed to find matching device tree id\n"); + dev = &pdev->dev; + drv_data = device_get_match_data(dev); + if (!drv_data) { + dev_err(dev, "failed to find matching device tree id\n"); return -EINVAL; } - drv_data = of_id->data; - dev = &pdev->dev; - ret = dma_set_mask_and_coherent(dev, drv_data->dma_mask); if (ret) { - dev_err(dev, "failed to set dma mask: %d", ret); + dev_err(dev, "failed to set dma mask: %d\n", ret); return ret; } diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c index 13f7531d945d..61885d4d662c 100644 --- a/drivers/net/wireless/ath/ath10k/wmi.c +++ b/drivers/net/wireless/ath/ath10k/wmi.c @@ -9490,7 +9490,7 @@ static int ath10k_wmi_mgmt_tx_clean_up_pending(int msdu_id, void *ptr, msdu = pkt_addr->vaddr; dma_unmap_single(ar->dev, pkt_addr->paddr, - msdu->len, DMA_FROM_DEVICE); + msdu->len, DMA_TO_DEVICE); ieee80211_free_txskb(ar->hw, msdu); return 0; diff --git a/drivers/net/wireless/ath/ath11k/Kconfig b/drivers/net/wireless/ath/ath11k/Kconfig index cfab4fb86aef..c88e16d4022b 100644 --- a/drivers/net/wireless/ath/ath11k/Kconfig +++ b/drivers/net/wireless/ath/ath11k/Kconfig @@ -22,7 +22,7 @@ config ATH11K_DEBUG config ATH11K_DEBUGFS bool "QCA ath11k debugfs support" - depends on ATH11K && DEBUG_FS + depends on ATH11K && DEBUG_FS && MAC80211_DEBUGFS ---help--- Enable ath11k debugfs support diff --git a/drivers/net/wireless/ath/ath11k/Makefile b/drivers/net/wireless/ath/ath11k/Makefile index a91d75c1cfeb..2761d07d938e 100644 --- a/drivers/net/wireless/ath/ath11k/Makefile +++ b/drivers/net/wireless/ath/ath11k/Makefile @@ -17,8 +17,7 @@ ath11k-y += core.o \ ce.o \ peer.o -ath11k-$(CONFIG_ATH11K_DEBUGFS) += debug_htt_stats.o -ath11k-$(CONFIG_MAC80211_DEBUGFS) += debugfs_sta.o +ath11k-$(CONFIG_ATH11K_DEBUGFS) += debug_htt_stats.o debugfs_sta.o ath11k-$(CONFIG_NL80211_TESTMODE) += testmode.o ath11k-$(CONFIG_ATH11K_TRACING) += trace.o diff --git a/drivers/net/wireless/ath/ath11k/debug.c b/drivers/net/wireless/ath/ath11k/debug.c index f48daf17f2d2..8d485171b0b3 100644 --- a/drivers/net/wireless/ath/ath11k/debug.c +++ b/drivers/net/wireless/ath/ath11k/debug.c @@ -931,7 +931,22 @@ static ssize_t ath11k_write_pktlog_filter(struct file *file, HTT_RX_FILTER_TLV_FLAGS_PACKET_HEADER | HTT_RX_FILTER_TLV_FLAGS_ATTENTION; } else if (mode == ATH11K_PKTLOG_MODE_LITE) { + ret = ath11k_dp_tx_htt_h2t_ppdu_stats_req(ar, + HTT_PPDU_STATS_TAG_PKTLOG); + if (ret) { + ath11k_err(ar->ab, "failed to enable pktlog lite: %d\n", ret); + goto out; + } + rx_filter = HTT_RX_FILTER_TLV_LITE_MODE; + } else { + ret = ath11k_dp_tx_htt_h2t_ppdu_stats_req(ar, + HTT_PPDU_STATS_TAG_DEFAULT); + if (ret) { + ath11k_err(ar->ab, "failed to send htt ppdu stats req: %d\n", + ret); + goto out; + } } tlv_filter.rx_filter = rx_filter; diff --git a/drivers/net/wireless/ath/ath11k/debug.h b/drivers/net/wireless/ath/ath11k/debug.h index a317a7bdb9a2..8e8d5588b541 100644 --- a/drivers/net/wireless/ath/ath11k/debug.h +++ b/drivers/net/wireless/ath/ath11k/debug.h @@ -172,6 +172,16 @@ static inline int ath11k_debug_is_extd_rx_stats_enabled(struct ath11k *ar) { return ar->debug.extd_rx_stats; } + +void ath11k_sta_add_debugfs(struct ieee80211_hw *hw, struct ieee80211_vif *vif, + struct ieee80211_sta *sta, struct dentry *dir); +void +ath11k_accumulate_per_peer_tx_stats(struct ath11k_sta *arsta, + struct ath11k_per_peer_tx_stats *peer_stats, + u8 legacy_rate_idx); +void ath11k_update_per_peer_stats_from_txcompl(struct ath11k *ar, + struct sk_buff *msdu, + struct hal_tx_status *ts); #else static inline int ath11k_debug_soc_create(struct ath11k_base *ab) { @@ -243,19 +253,7 @@ static inline bool ath11k_debug_is_pktlog_peer_valid(struct ath11k *ar, u8 *addr { return false; } -#endif /* CONFIG_ATH11K_DEBUGFS */ -#ifdef CONFIG_MAC80211_DEBUGFS -void ath11k_sta_add_debugfs(struct ieee80211_hw *hw, struct ieee80211_vif *vif, - struct ieee80211_sta *sta, struct dentry *dir); -void -ath11k_accumulate_per_peer_tx_stats(struct ath11k_sta *arsta, - struct ath11k_per_peer_tx_stats *peer_stats, - u8 legacy_rate_idx); -void ath11k_update_per_peer_stats_from_txcompl(struct ath11k *ar, - struct sk_buff *msdu, - struct hal_tx_status *ts); -#else /* !CONFIG_MAC80211_DEBUGFS */ static inline void ath11k_accumulate_per_peer_tx_stats(struct ath11k_sta *arsta, struct ath11k_per_peer_tx_stats *peer_stats, diff --git a/drivers/net/wireless/ath/ath11k/debug_htt_stats.c b/drivers/net/wireless/ath/ath11k/debug_htt_stats.c index 090fffa5e53c..9939e909628f 100644 --- a/drivers/net/wireless/ath/ath11k/debug_htt_stats.c +++ b/drivers/net/wireless/ath/ath11k/debug_htt_stats.c @@ -776,11 +776,14 @@ static inline void htt_print_tx_peer_rate_stats_tlv(const void *tag_buf, u32 len = stats_req->buf_len; u32 buf_len = ATH11K_HTT_STATS_BUF_SIZE; char str_buf[HTT_MAX_STRING_LEN] = {0}; - char *tx_gi[HTT_TX_PEER_STATS_NUM_GI_COUNTERS]; + char *tx_gi[HTT_TX_PEER_STATS_NUM_GI_COUNTERS] = {NULL}; u8 j; - for (j = 0; j < HTT_TX_PEER_STATS_NUM_GI_COUNTERS; j++) + for (j = 0; j < HTT_TX_PEER_STATS_NUM_GI_COUNTERS; j++) { tx_gi[j] = kmalloc(HTT_MAX_STRING_LEN, GFP_ATOMIC); + if (!tx_gi[j]) + goto fail; + } len += HTT_DBG_OUT(buf + len, buf_len - len, "HTT_TX_PEER_RATE_STATS_TLV:"); len += HTT_DBG_OUT(buf + len, buf_len - len, "tx_ldpc = %u", @@ -841,15 +844,16 @@ static inline void htt_print_tx_peer_rate_stats_tlv(const void *tag_buf, HTT_TX_PDEV_STATS_NUM_DCM_COUNTERS); len += HTT_DBG_OUT(buf + len, buf_len - len, "tx_dcm = %s\n", str_buf); - for (j = 0; j < HTT_TX_PEER_STATS_NUM_GI_COUNTERS; j++) - kfree(tx_gi[j]); - if (len >= buf_len) buf[buf_len - 1] = 0; else buf[len] = 0; stats_req->buf_len = len; + +fail: + for (j = 0; j < HTT_TX_PEER_STATS_NUM_GI_COUNTERS; j++) + kfree(tx_gi[j]); } static inline void htt_print_rx_peer_rate_stats_tlv(const void *tag_buf, @@ -860,15 +864,21 @@ static inline void htt_print_rx_peer_rate_stats_tlv(const void *tag_buf, u32 len = stats_req->buf_len; u32 buf_len = ATH11K_HTT_STATS_BUF_SIZE; u8 j; - char *rssi_chain[HTT_RX_PEER_STATS_NUM_SPATIAL_STREAMS]; - char *rx_gi[HTT_RX_PEER_STATS_NUM_GI_COUNTERS]; + char *rssi_chain[HTT_RX_PEER_STATS_NUM_SPATIAL_STREAMS] = {NULL}; + char *rx_gi[HTT_RX_PEER_STATS_NUM_GI_COUNTERS] = {NULL}; char str_buf[HTT_MAX_STRING_LEN] = {0}; - for (j = 0; j < HTT_RX_PEER_STATS_NUM_SPATIAL_STREAMS; j++) + for (j = 0; j < HTT_RX_PEER_STATS_NUM_SPATIAL_STREAMS; j++) { rssi_chain[j] = kmalloc(HTT_MAX_STRING_LEN, GFP_ATOMIC); + if (!rssi_chain[j]) + goto fail; + } - for (j = 0; j < HTT_RX_PEER_STATS_NUM_GI_COUNTERS; j++) + for (j = 0; j < HTT_RX_PEER_STATS_NUM_GI_COUNTERS; j++) { rx_gi[j] = kmalloc(HTT_MAX_STRING_LEN, GFP_ATOMIC); + if (!rx_gi[j]) + goto fail; + } len += HTT_DBG_OUT(buf + len, buf_len - len, "HTT_RX_PEER_RATE_STATS_TLV:"); len += HTT_DBG_OUT(buf + len, buf_len - len, "nsts = %u", @@ -928,18 +938,19 @@ static inline void htt_print_rx_peer_rate_stats_tlv(const void *tag_buf, HTT_RX_PDEV_STATS_NUM_PREAMBLE_TYPES); len += HTT_DBG_OUT(buf + len, buf_len - len, "rx_pream = %s\n", str_buf); - for (j = 0; j < HTT_RX_PEER_STATS_NUM_SPATIAL_STREAMS; j++) - kfree(rssi_chain[j]); - - for (j = 0; j < HTT_RX_PEER_STATS_NUM_GI_COUNTERS; j++) - kfree(rx_gi[j]); - if (len >= buf_len) buf[buf_len - 1] = 0; else buf[len] = 0; stats_req->buf_len = len; + +fail: + for (j = 0; j < HTT_RX_PEER_STATS_NUM_SPATIAL_STREAMS; j++) + kfree(rssi_chain[j]); + + for (j = 0; j < HTT_RX_PEER_STATS_NUM_GI_COUNTERS; j++) + kfree(rx_gi[j]); } static inline void @@ -2832,10 +2843,13 @@ static inline void htt_print_tx_pdev_rate_stats_tlv(const void *tag_buf, u32 buf_len = ATH11K_HTT_STATS_BUF_SIZE; u8 j; char str_buf[HTT_MAX_STRING_LEN] = {0}; - char *tx_gi[HTT_TX_PEER_STATS_NUM_GI_COUNTERS]; + char *tx_gi[HTT_TX_PEER_STATS_NUM_GI_COUNTERS] = {NULL}; - for (j = 0; j < HTT_TX_PEER_STATS_NUM_GI_COUNTERS; j++) + for (j = 0; j < HTT_TX_PEER_STATS_NUM_GI_COUNTERS; j++) { tx_gi[j] = kmalloc(HTT_MAX_STRING_LEN, GFP_ATOMIC); + if (!tx_gi[j]) + goto fail; + } len += HTT_DBG_OUT(buf + len, buf_len - len, "HTT_TX_PDEV_RATE_STATS_TLV:"); len += HTT_DBG_OUT(buf + len, buf_len - len, "mac_id = %u", @@ -2988,15 +3002,15 @@ static inline void htt_print_tx_pdev_rate_stats_tlv(const void *tag_buf, HTT_TX_PDEV_STATS_NUM_DCM_COUNTERS); len += HTT_DBG_OUT(buf + len, buf_len - len, "tx_dcm = %s\n", str_buf); - for (j = 0; j < HTT_TX_PEER_STATS_NUM_GI_COUNTERS; j++) - kfree(tx_gi[j]); - if (len >= buf_len) buf[buf_len - 1] = 0; else buf[len] = 0; stats_req->buf_len = len; +fail: + for (j = 0; j < HTT_TX_PEER_STATS_NUM_GI_COUNTERS; j++) + kfree(tx_gi[j]); } static inline void htt_print_rx_pdev_rate_stats_tlv(const void *tag_buf, @@ -3006,16 +3020,30 @@ static inline void htt_print_rx_pdev_rate_stats_tlv(const void *tag_buf, u8 *buf = stats_req->buf; u32 len = stats_req->buf_len; u32 buf_len = ATH11K_HTT_STATS_BUF_SIZE; - u8 j; - char *rssi_chain[HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS]; - char *rx_gi[HTT_RX_PDEV_STATS_NUM_GI_COUNTERS]; + u8 i, j; + u16 index = 0; + char *rssi_chain[HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS] = {NULL}; + char *rx_gi[HTT_RX_PDEV_STATS_NUM_GI_COUNTERS] = {NULL}; char str_buf[HTT_MAX_STRING_LEN] = {0}; + char *rx_pilot_evm_db[HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS] = {NULL}; - for (j = 0; j < HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS; j++) + for (j = 0; j < HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS; j++) { rssi_chain[j] = kmalloc(HTT_MAX_STRING_LEN, GFP_ATOMIC); + if (!rssi_chain[j]) + goto fail; + } - for (j = 0; j < HTT_RX_PDEV_STATS_NUM_GI_COUNTERS; j++) + for (j = 0; j < HTT_RX_PDEV_STATS_NUM_GI_COUNTERS; j++) { rx_gi[j] = kmalloc(HTT_MAX_STRING_LEN, GFP_ATOMIC); + if (!rx_gi[j]) + goto fail; + } + + for (j = 0; j < HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS; j++) { + rx_pilot_evm_db[j] = kmalloc(HTT_MAX_STRING_LEN, GFP_ATOMIC); + if (!rx_pilot_evm_db[j]) + goto fail; + } len += HTT_DBG_OUT(buf + len, buf_len - len, "HTT_RX_PDEV_RATE_STATS_TLV:"); len += HTT_DBG_OUT(buf + len, buf_len - len, "mac_id = %u", @@ -3059,6 +3087,32 @@ static inline void htt_print_rx_pdev_rate_stats_tlv(const void *tag_buf, ARRAY_TO_STRING(str_buf, htt_stats_buf->rx_bw, HTT_RX_PDEV_STATS_NUM_BW_COUNTERS); len += HTT_DBG_OUT(buf + len, buf_len - len, "rx_bw = %s ", str_buf); + len += HTT_DBG_OUT(buf + len, buf_len - len, "rx_evm_nss_count = %u", + htt_stats_buf->nss_count); + + len += HTT_DBG_OUT(buf + len, buf_len - len, "rx_evm_pilot_count = %u", + htt_stats_buf->pilot_count); + + for (j = 0; j < HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS; j++) { + index = 0; + + for (i = 0; i < HTT_RX_PDEV_STATS_RXEVM_MAX_PILOTS_PER_NSS; i++) + index += snprintf(&rx_pilot_evm_db[j][index], + HTT_MAX_STRING_LEN - index, + " %u:%d,", + i, + htt_stats_buf->rx_pilot_evm_db[j][i]); + len += HTT_DBG_OUT(buf + len, buf_len - len, "pilot_evm_dB[%u] = %s ", + j, rx_pilot_evm_db[j]); + } + + index = 0; + memset(str_buf, 0x0, HTT_MAX_STRING_LEN); + for (i = 0; i < HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS; i++) + index += snprintf(&str_buf[index], + HTT_MAX_STRING_LEN - index, + " %u:%d,", i, htt_stats_buf->rx_pilot_evm_db_mean[i]); + len += HTT_DBG_OUT(buf + len, buf_len - len, "pilot_evm_dB_mean = %s ", str_buf); for (j = 0; j < HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS; j++) { ARRAY_TO_STRING(rssi_chain[j], htt_stats_buf->rssi_chain[j], @@ -3079,12 +3133,6 @@ static inline void htt_print_rx_pdev_rate_stats_tlv(const void *tag_buf, HTT_RX_PDEV_STATS_NUM_PREAMBLE_TYPES); len += HTT_DBG_OUT(buf + len, buf_len - len, "rx_pream = %s", str_buf); - for (j = 0; j < HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS; j++) - kfree(rssi_chain[j]); - - for (j = 0; j < HTT_RX_PDEV_STATS_NUM_GI_COUNTERS; j++) - kfree(rx_gi[j]); - len += HTT_DBG_OUT(buf + len, buf_len - len, "rx_11ax_su_ext = %u", htt_stats_buf->rx_11ax_su_ext); len += HTT_DBG_OUT(buf + len, buf_len - len, "rx_11ac_mumimo = %u", @@ -3110,8 +3158,89 @@ static inline void htt_print_rx_pdev_rate_stats_tlv(const void *tag_buf, len += HTT_DBG_OUT(buf + len, buf_len - len, "rx_active_dur_us_low = %u", htt_stats_buf->rx_active_dur_us_low); - len += HTT_DBG_OUT(buf + len, buf_len - len, "rx_active_dur_us_high = %u\n", - htt_stats_buf->rx_active_dur_us_high); + len += HTT_DBG_OUT(buf + len, buf_len - len, "rx_active_dur_us_high = %u", + htt_stats_buf->rx_active_dur_us_high); + len += HTT_DBG_OUT(buf + len, buf_len - len, "rx_11ax_ul_ofdma = %u", + htt_stats_buf->rx_11ax_ul_ofdma); + + memset(str_buf, 0x0, HTT_MAX_STRING_LEN); + ARRAY_TO_STRING(str_buf, htt_stats_buf->ul_ofdma_rx_mcs, + HTT_RX_PDEV_STATS_NUM_MCS_COUNTERS); + len += HTT_DBG_OUT(buf + len, buf_len - len, "ul_ofdma_rx_mcs = %s ", str_buf); + + for (j = 0; j < HTT_RX_PDEV_STATS_NUM_GI_COUNTERS; j++) { + ARRAY_TO_STRING(rx_gi[j], htt_stats_buf->ul_ofdma_rx_gi[j], + HTT_RX_PDEV_STATS_NUM_MCS_COUNTERS); + len += HTT_DBG_OUT(buf + len, buf_len - len, "ul_ofdma_rx_gi[%u] = %s ", + j, rx_gi[j]); + } + + memset(str_buf, 0x0, HTT_MAX_STRING_LEN); + ARRAY_TO_STRING(str_buf, htt_stats_buf->ul_ofdma_rx_nss, + HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS); + len += HTT_DBG_OUT(buf + len, buf_len - len, "ul_ofdma_rx_nss = %s ", str_buf); + + memset(str_buf, 0x0, HTT_MAX_STRING_LEN); + ARRAY_TO_STRING(str_buf, htt_stats_buf->ul_ofdma_rx_bw, + HTT_RX_PDEV_STATS_NUM_BW_COUNTERS); + len += HTT_DBG_OUT(buf + len, buf_len - len, "ul_ofdma_rx_bw = %s ", str_buf); + + len += HTT_DBG_OUT(buf + len, buf_len - len, "ul_ofdma_rx_stbc = %u", + htt_stats_buf->ul_ofdma_rx_stbc); + len += HTT_DBG_OUT(buf + len, buf_len - len, "ul_ofdma_rx_ldpc = %u", + htt_stats_buf->ul_ofdma_rx_ldpc); + + memset(str_buf, 0x0, HTT_MAX_STRING_LEN); + ARRAY_TO_STRING(str_buf, htt_stats_buf->rx_ulofdma_non_data_ppdu, + HTT_RX_PDEV_MAX_OFDMA_NUM_USER); + len += HTT_DBG_OUT(buf + len, buf_len - len, "rx_ulofdma_non_data_ppdu = %s ", + str_buf); + + memset(str_buf, 0x0, HTT_MAX_STRING_LEN); + ARRAY_TO_STRING(str_buf, htt_stats_buf->rx_ulofdma_data_ppdu, + HTT_RX_PDEV_MAX_OFDMA_NUM_USER); + len += HTT_DBG_OUT(buf + len, buf_len - len, "rx_ulofdma_data_ppdu = %s ", + str_buf); + + memset(str_buf, 0x0, HTT_MAX_STRING_LEN); + ARRAY_TO_STRING(str_buf, htt_stats_buf->rx_ulofdma_mpdu_ok, + HTT_RX_PDEV_MAX_OFDMA_NUM_USER); + len += HTT_DBG_OUT(buf + len, buf_len - len, "rx_ulofdma_mpdu_ok = %s ", str_buf); + + memset(str_buf, 0x0, HTT_MAX_STRING_LEN); + ARRAY_TO_STRING(str_buf, htt_stats_buf->rx_ulofdma_mpdu_fail, + HTT_RX_PDEV_MAX_OFDMA_NUM_USER); + len += HTT_DBG_OUT(buf + len, buf_len - len, "rx_ulofdma_mpdu_fail = %s", + str_buf); + + for (j = 0; j < HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS; j++) { + index = 0; + memset(str_buf, 0x0, HTT_MAX_STRING_LEN); + for (i = 0; i < HTT_RX_PDEV_MAX_OFDMA_NUM_USER; i++) + index += snprintf(&str_buf[index], + HTT_MAX_STRING_LEN - index, + " %u:%d,", + i, htt_stats_buf->rx_ul_fd_rssi[j][i]); + len += HTT_DBG_OUT(buf + len, buf_len - len, + "rx_ul_fd_rssi: nss[%u] = %s", j, str_buf); + } + + len += HTT_DBG_OUT(buf + len, buf_len - len, "per_chain_rssi_pkt_type = %#x", + htt_stats_buf->per_chain_rssi_pkt_type); + + for (j = 0; j < HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS; j++) { + index = 0; + memset(str_buf, 0x0, HTT_MAX_STRING_LEN); + for (i = 0; i < HTT_RX_PDEV_STATS_NUM_BW_COUNTERS; i++) + index += snprintf(&str_buf[index], + HTT_MAX_STRING_LEN - index, + " %u:%d,", + i, + htt_stats_buf->rx_per_chain_rssi_in_dbm[j][i]); + len += HTT_DBG_OUT(buf + len, buf_len - len, + "rx_per_chain_rssi_in_dbm[%u] = %s ", j, str_buf); + } + len += HTT_DBG_OUT(buf + len, buf_len - len, "\n"); if (len >= buf_len) buf[buf_len - 1] = 0; @@ -3119,6 +3248,16 @@ static inline void htt_print_rx_pdev_rate_stats_tlv(const void *tag_buf, buf[len] = 0; stats_req->buf_len = len; + +fail: + for (j = 0; j < HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS; j++) + kfree(rssi_chain[j]); + + for (j = 0; j < HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS; j++) + kfree(rx_pilot_evm_db[j]); + + for (i = 0; i < HTT_RX_PDEV_STATS_NUM_GI_COUNTERS; i++) + kfree(rx_gi[i]); } static inline void htt_print_rx_soc_fw_stats_tlv(const void *tag_buf, diff --git a/drivers/net/wireless/ath/ath11k/debug_htt_stats.h b/drivers/net/wireless/ath/ath11k/debug_htt_stats.h index 618f1946bf49..4bdb62dd7b8d 100644 --- a/drivers/net/wireless/ath/ath11k/debug_htt_stats.h +++ b/drivers/net/wireless/ath/ath11k/debug_htt_stats.h @@ -1231,6 +1231,8 @@ struct htt_tx_pdev_rate_stats_tlv { #define HTT_RX_PDEV_STATS_NUM_BW_COUNTERS 4 #define HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS 8 #define HTT_RX_PDEV_STATS_NUM_PREAMBLE_TYPES HTT_STATS_PREAM_COUNT +#define HTT_RX_PDEV_MAX_OFDMA_NUM_USER 8 +#define HTT_RX_PDEV_STATS_RXEVM_MAX_PILOTS_PER_NSS 16 struct htt_rx_pdev_rate_stats_tlv { u32 mac_id__word; @@ -1269,6 +1271,46 @@ struct htt_rx_pdev_rate_stats_tlv { u32 rx_legacy_ofdm_rate[HTT_RX_PDEV_STATS_NUM_LEGACY_OFDM_STATS]; u32 rx_active_dur_us_low; u32 rx_active_dur_us_high; + + u32 rx_11ax_ul_ofdma; + + u32 ul_ofdma_rx_mcs[HTT_RX_PDEV_STATS_NUM_MCS_COUNTERS]; + u32 ul_ofdma_rx_gi[HTT_TX_PDEV_STATS_NUM_GI_COUNTERS] + [HTT_RX_PDEV_STATS_NUM_MCS_COUNTERS]; + u32 ul_ofdma_rx_nss[HTT_TX_PDEV_STATS_NUM_SPATIAL_STREAMS]; + u32 ul_ofdma_rx_bw[HTT_TX_PDEV_STATS_NUM_BW_COUNTERS]; + u32 ul_ofdma_rx_stbc; + u32 ul_ofdma_rx_ldpc; + + /* record the stats for each user index */ + u32 rx_ulofdma_non_data_ppdu[HTT_RX_PDEV_MAX_OFDMA_NUM_USER]; /* ppdu level */ + u32 rx_ulofdma_data_ppdu[HTT_RX_PDEV_MAX_OFDMA_NUM_USER]; /* ppdu level */ + u32 rx_ulofdma_mpdu_ok[HTT_RX_PDEV_MAX_OFDMA_NUM_USER]; /* mpdu level */ + u32 rx_ulofdma_mpdu_fail[HTT_RX_PDEV_MAX_OFDMA_NUM_USER]; /* mpdu level */ + + u32 nss_count; + u32 pilot_count; + /* RxEVM stats in dB */ + s32 rx_pilot_evm_db[HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS] + [HTT_RX_PDEV_STATS_RXEVM_MAX_PILOTS_PER_NSS]; + /* rx_pilot_evm_db_mean: + * EVM mean across pilots, computed as + * mean(10*log10(rx_pilot_evm_linear)) = mean(rx_pilot_evm_db) + */ + s32 rx_pilot_evm_db_mean[HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS]; + s8 rx_ul_fd_rssi[HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS] + [HTT_RX_PDEV_MAX_OFDMA_NUM_USER]; /* dBm units */ + /* per_chain_rssi_pkt_type: + * This field shows what type of rx frame the per-chain RSSI was computed + * on, by recording the frame type and sub-type as bit-fields within this + * field: + * BIT [3 : 0] :- IEEE80211_FC0_TYPE + * BIT [7 : 4] :- IEEE80211_FC0_SUBTYPE + * BIT [31 : 8] :- Reserved + */ + u32 per_chain_rssi_pkt_type; + s8 rx_per_chain_rssi_in_dbm[HTT_RX_PDEV_STATS_NUM_SPATIAL_STREAMS] + [HTT_RX_PDEV_STATS_NUM_BW_COUNTERS]; }; /* == RX PDEV/SOC STATS == */ diff --git a/drivers/net/wireless/ath/ath11k/dp.h b/drivers/net/wireless/ath/ath11k/dp.h index 2f0980f2c762..6ef5be4201b2 100644 --- a/drivers/net/wireless/ath/ath11k/dp.h +++ b/drivers/net/wireless/ath/ath11k/dp.h @@ -507,6 +507,14 @@ enum htt_ppdu_stats_tag_type { | BIT(HTT_PPDU_STATS_TAG_USR_COMPLTN_FLUSH) \ | BIT(HTT_PPDU_STATS_TAG_USR_COMMON_ARRAY)) +#define HTT_PPDU_STATS_TAG_PKTLOG (BIT(HTT_PPDU_STATS_TAG_USR_MPDU_ENQ_BITMAP_64) | \ + BIT(HTT_PPDU_STATS_TAG_USR_MPDU_ENQ_BITMAP_256) | \ + BIT(HTT_PPDU_STATS_TAG_USR_COMPLTN_BA_BITMAP_64) | \ + BIT(HTT_PPDU_STATS_TAG_USR_COMPLTN_BA_BITMAP_256) | \ + BIT(HTT_PPDU_STATS_TAG_INFO) | \ + BIT(HTT_PPDU_STATS_TAG_TX_MGMTCTRL_PAYLOAD) | \ + HTT_PPDU_STATS_TAG_DEFAULT) + /* HTT_H2T_MSG_TYPE_RX_RING_SELECTION_CFG Message * * details: diff --git a/drivers/net/wireless/ath/ath11k/dp_rx.c b/drivers/net/wireless/ath/ath11k/dp_rx.c index 3a3dc7680622..b08da839b7d9 100644 --- a/drivers/net/wireless/ath/ath11k/dp_rx.c +++ b/drivers/net/wireless/ath/ath11k/dp_rx.c @@ -2067,7 +2067,8 @@ static void ath11k_dp_rx_deliver_msdu(struct ath11k *ar, struct napi_struct *nap struct sk_buff *msdu) { static const struct ieee80211_radiotap_he known = { - .data1 = cpu_to_le16(IEEE80211_RADIOTAP_HE_DATA1_DATA_MCS_KNOWN), + .data1 = cpu_to_le16(IEEE80211_RADIOTAP_HE_DATA1_DATA_MCS_KNOWN | + IEEE80211_RADIOTAP_HE_DATA1_BW_RU_ALLOC_KNOWN), .data2 = cpu_to_le16(IEEE80211_RADIOTAP_HE_DATA2_GI_KNOWN), }; struct ieee80211_rx_status *status; diff --git a/drivers/net/wireless/ath/ath11k/dp_tx.c b/drivers/net/wireless/ath/ath11k/dp_tx.c index 918305dda106..6d7d33761caf 100644 --- a/drivers/net/wireless/ath/ath11k/dp_tx.c +++ b/drivers/net/wireless/ath/ath11k/dp_tx.c @@ -461,7 +461,7 @@ void ath11k_dp_tx_completion_handler(struct ath11k_base *ab, int ring_id) int hal_ring_id = dp->tx_ring[ring_id].tcl_comp_ring.ring_id; struct hal_srng *status_ring = &ab->hal.srng_list[hal_ring_id]; struct sk_buff *msdu; - struct hal_tx_status ts; + struct hal_tx_status ts = { 0 }; struct dp_tx_ring *tx_ring = &dp->tx_ring[ring_id]; u32 *desc; u32 msdu_id; @@ -630,7 +630,7 @@ int ath11k_dp_tx_htt_srng_setup(struct ath11k_base *ab, u32 ring_id, dma_addr_t hp_addr, tp_addr; enum htt_srng_ring_type htt_ring_type; enum htt_srng_ring_id htt_ring_id; - int ret = 0; + int ret; skb = ath11k_htc_alloc_skb(ab, len); if (!skb) @@ -642,9 +642,10 @@ int ath11k_dp_tx_htt_srng_setup(struct ath11k_base *ab, u32 ring_id, hp_addr = ath11k_hal_srng_get_hp_addr(ab, srng); tp_addr = ath11k_hal_srng_get_tp_addr(ab, srng); - if (ath11k_dp_tx_get_ring_id_type(ab, mac_id, ring_id, - ring_type, &htt_ring_type, - &htt_ring_id)) + ret = ath11k_dp_tx_get_ring_id_type(ab, mac_id, ring_id, + ring_type, &htt_ring_type, + &htt_ring_id); + if (ret) goto err_free; skb_put(skb, len); @@ -669,10 +670,8 @@ int ath11k_dp_tx_htt_srng_setup(struct ath11k_base *ab, u32 ring_id, HAL_ADDR_MSB_REG_SHIFT; ret = ath11k_hal_srng_get_entrysize(ring_type); - if (ret < 0) { - ret = -EINVAL; + if (ret < 0) goto err_free; - } ring_entry_sz = ret; @@ -817,7 +816,7 @@ int ath11k_dp_tx_htt_rx_filter_setup(struct ath11k_base *ab, u32 ring_id, int len = sizeof(*cmd); enum htt_srng_ring_type htt_ring_type; enum htt_srng_ring_id htt_ring_id; - int ret = 0; + int ret; skb = ath11k_htc_alloc_skb(ab, len); if (!skb) @@ -826,9 +825,10 @@ int ath11k_dp_tx_htt_rx_filter_setup(struct ath11k_base *ab, u32 ring_id, memset(¶ms, 0, sizeof(params)); ath11k_hal_srng_get_params(ab, srng, ¶ms); - if (ath11k_dp_tx_get_ring_id_type(ab, mac_id, ring_id, - ring_type, &htt_ring_type, - &htt_ring_id)) + ret = ath11k_dp_tx_get_ring_id_type(ab, mac_id, ring_id, + ring_type, &htt_ring_type, + &htt_ring_id); + if (ret) goto err_free; skb_put(skb, len); diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c index 556eef9881a7..6640662f5ede 100644 --- a/drivers/net/wireless/ath/ath11k/mac.c +++ b/drivers/net/wireless/ath/ath11k/mac.c @@ -3520,8 +3520,8 @@ static int ath11k_mac_copy_he_cap(struct ath11k *ar, static void ath11k_mac_setup_he_cap(struct ath11k *ar, struct ath11k_pdev_cap *cap) { - struct ieee80211_supported_band *band = NULL; - int count = 0; + struct ieee80211_supported_band *band; + int count; if (cap->supported_bands & WMI_HOST_WLAN_2G_CAP) { count = ath11k_mac_copy_he_cap(ar, cap, @@ -3529,6 +3529,7 @@ static void ath11k_mac_setup_he_cap(struct ath11k *ar, NL80211_BAND_2GHZ); band = &ar->mac.sbands[NL80211_BAND_2GHZ]; band->iftype_data = ar->mac.iftype[NL80211_BAND_2GHZ]; + band->n_iftype_data = count; } if (cap->supported_bands & WMI_HOST_WLAN_5G_CAP) { @@ -3537,9 +3538,8 @@ static void ath11k_mac_setup_he_cap(struct ath11k *ar, NL80211_BAND_5GHZ); band = &ar->mac.sbands[NL80211_BAND_5GHZ]; band->iftype_data = ar->mac.iftype[NL80211_BAND_5GHZ]; + band->n_iftype_data = count; } - - band->n_iftype_data = count; } static int __ath11k_set_antenna(struct ath11k *ar, u32 tx_ant, u32 rx_ant) @@ -4212,12 +4212,6 @@ static int ath11k_mac_op_add_interface(struct ieee80211_hw *hw, arvif->vdev_id, ret); } - ret = ath11k_mac_set_txbf_conf(arvif); - if (ret) { - ath11k_warn(ar->ab, "failed to set txbf conf for vdev %d: %d\n", - arvif->vdev_id, ret); - } - ath11k_dp_vdev_tx_attach(ar, arvif); mutex_unlock(&ar->conf_mutex); @@ -4567,6 +4561,11 @@ ath11k_mac_vdev_start_restart(struct ath11k_vif *arvif, arg.channel.freq, arg.vdev_id); } + ret = ath11k_mac_set_txbf_conf(arvif); + if (ret) + ath11k_warn(ab, "failed to set txbf conf for vdev %d: %d\n", + arvif->vdev_id, ret); + return 0; } @@ -5468,7 +5467,7 @@ static const struct ieee80211_ops ath11k_ops = { .flush = ath11k_mac_op_flush, .sta_statistics = ath11k_mac_op_sta_statistics, CFG80211_TESTMODE_CMD(ath11k_tm_cmd) -#ifdef CONFIG_MAC80211_DEBUGFS +#ifdef CONFIG_ATH11K_DEBUGFS .sta_add_debugfs = ath11k_sta_add_debugfs, #endif }; diff --git a/drivers/net/wireless/ath/ath9k/ar9003_aic.c b/drivers/net/wireless/ath/ath9k/ar9003_aic.c index 547cd46da260..d0f1e8bcc846 100644 --- a/drivers/net/wireless/ath/ath9k/ar9003_aic.c +++ b/drivers/net/wireless/ath/ath9k/ar9003_aic.c @@ -406,7 +406,7 @@ static bool ar9003_aic_cal_post_process(struct ath_hw *ah) sram.com_att_6db = ar9003_aic_find_index(1, fixed_com_att_db); - sram.valid = 1; + sram.valid = true; sram.rot_dir_att_db = min(max(rot_dir_path_att_db, diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c b/drivers/net/wireless/ath/wcn36xx/smd.c index 523550f94a3f..77269ac7f352 100644 --- a/drivers/net/wireless/ath/wcn36xx/smd.c +++ b/drivers/net/wireless/ath/wcn36xx/smd.c @@ -1620,7 +1620,7 @@ int wcn36xx_smd_send_beacon(struct wcn36xx *wcn, struct ieee80211_vif *vif, msg_body.beacon_length6 = msg_body.beacon_length + 6; if (msg_body.beacon_length > BEACON_TEMPLATE_SIZE) { - wcn36xx_err("Beacon is to big: beacon size=%d\n", + wcn36xx_err("Beacon is too big: beacon size=%d\n", msg_body.beacon_length); ret = -ENOMEM; goto out; diff --git a/drivers/net/wireless/ath/wil6210/main.c b/drivers/net/wireless/ath/wil6210/main.c index 6d39547e20fd..3ba5b2550a8c 100644 --- a/drivers/net/wireless/ath/wil6210/main.c +++ b/drivers/net/wireless/ath/wil6210/main.c @@ -762,7 +762,7 @@ int wil_priv_init(struct wil6210_priv *wil) */ wil->rx_buff_id_count = WIL_RX_BUFF_ARR_SIZE_DEFAULT; - wil->amsdu_en = 1; + wil->amsdu_en = true; return 0; diff --git a/drivers/net/wireless/ath/wil6210/txrx.c b/drivers/net/wireless/ath/wil6210/txrx.c index 17118d643d7e..bc8c15fb609d 100644 --- a/drivers/net/wireless/ath/wil6210/txrx.c +++ b/drivers/net/wireless/ath/wil6210/txrx.c @@ -1140,7 +1140,7 @@ static int wil_tx_desc_map(union wil_tx_desc *desc, dma_addr_t pa, void wil_tx_data_init(struct wil_ring_tx_data *txdata) { spin_lock_bh(&txdata->lock); - txdata->dot1x_open = 0; + txdata->dot1x_open = false; txdata->enabled = 0; txdata->idle = 0; txdata->last_idle = 0; diff --git a/drivers/net/wireless/ath/wil6210/wmi.c b/drivers/net/wireless/ath/wil6210/wmi.c index dcba0a4c47b4..23e1ed6a9d6d 100644 --- a/drivers/net/wireless/ath/wil6210/wmi.c +++ b/drivers/net/wireless/ath/wil6210/wmi.c @@ -1513,14 +1513,14 @@ static void wmi_link_stats_parse(struct wil6210_vif *vif, u64 tsf, if (vif->fw_stats_ready) { /* clean old statistics */ vif->fw_stats_tsf = 0; - vif->fw_stats_ready = 0; + vif->fw_stats_ready = false; } wil_link_stats_store_basic(vif, payload + hdr_size); if (!has_next) { vif->fw_stats_tsf = tsf; - vif->fw_stats_ready = 1; + vif->fw_stats_ready = true; } break; @@ -1535,14 +1535,14 @@ static void wmi_link_stats_parse(struct wil6210_vif *vif, u64 tsf, if (wil->fw_stats_global.ready) { /* clean old statistics */ wil->fw_stats_global.tsf = 0; - wil->fw_stats_global.ready = 0; + wil->fw_stats_global.ready = false; } wil_link_stats_store_global(vif, payload + hdr_size); if (!has_next) { wil->fw_stats_global.tsf = tsf; - wil->fw_stats_global.ready = 1; + wil->fw_stats_global.ready = true; } break; diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c index 6eb3064c3721..a2328d3eee03 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c @@ -11,6 +11,7 @@ #include <linux/vmalloc.h> #include <net/cfg80211.h> #include <net/netlink.h> +#include <uapi/linux/if_arp.h> #include <brcmu_utils.h> #include <defs.h> @@ -619,6 +620,82 @@ static bool brcmf_is_ibssmode(struct brcmf_cfg80211_vif *vif) return vif->wdev.iftype == NL80211_IFTYPE_ADHOC; } +/** + * brcmf_mon_add_vif() - create monitor mode virtual interface + * + * @wiphy: wiphy device of new interface. + * @name: name of the new interface. + */ +static struct wireless_dev *brcmf_mon_add_vif(struct wiphy *wiphy, + const char *name) +{ + struct brcmf_cfg80211_info *cfg = wiphy_to_cfg(wiphy); + struct brcmf_cfg80211_vif *vif; + struct net_device *ndev; + struct brcmf_if *ifp; + int err; + + if (cfg->pub->mon_if) { + err = -EEXIST; + goto err_out; + } + + vif = brcmf_alloc_vif(cfg, NL80211_IFTYPE_MONITOR); + if (IS_ERR(vif)) { + err = PTR_ERR(vif); + goto err_out; + } + + ndev = alloc_netdev(sizeof(*ifp), name, NET_NAME_UNKNOWN, ether_setup); + if (!ndev) { + err = -ENOMEM; + goto err_free_vif; + } + ndev->type = ARPHRD_IEEE80211_RADIOTAP; + ndev->ieee80211_ptr = &vif->wdev; + ndev->needs_free_netdev = true; + ndev->priv_destructor = brcmf_cfg80211_free_netdev; + SET_NETDEV_DEV(ndev, wiphy_dev(cfg->wiphy)); + + ifp = netdev_priv(ndev); + ifp->vif = vif; + ifp->ndev = ndev; + ifp->drvr = cfg->pub; + + vif->ifp = ifp; + vif->wdev.netdev = ndev; + + err = brcmf_net_mon_attach(ifp); + if (err) { + brcmf_err("Failed to attach %s device\n", ndev->name); + free_netdev(ndev); + goto err_free_vif; + } + + cfg->pub->mon_if = ifp; + + return &vif->wdev; + +err_free_vif: + brcmf_free_vif(vif); +err_out: + return ERR_PTR(err); +} + +static int brcmf_mon_del_vif(struct wiphy *wiphy, struct wireless_dev *wdev) +{ + struct brcmf_cfg80211_info *cfg = wiphy_to_cfg(wiphy); + struct net_device *ndev = wdev->netdev; + + ndev->netdev_ops->ndo_stop(ndev); + + brcmf_net_detach(ndev, true); + + cfg->pub->mon_if = NULL; + + return 0; +} + static struct wireless_dev *brcmf_cfg80211_add_iface(struct wiphy *wiphy, const char *name, unsigned char name_assign_type, @@ -641,9 +718,10 @@ static struct wireless_dev *brcmf_cfg80211_add_iface(struct wiphy *wiphy, case NL80211_IFTYPE_STATION: case NL80211_IFTYPE_AP_VLAN: case NL80211_IFTYPE_WDS: - case NL80211_IFTYPE_MONITOR: case NL80211_IFTYPE_MESH_POINT: return ERR_PTR(-EOPNOTSUPP); + case NL80211_IFTYPE_MONITOR: + return brcmf_mon_add_vif(wiphy, name); case NL80211_IFTYPE_AP: wdev = brcmf_ap_add_vif(wiphy, name, params); break; @@ -826,9 +904,10 @@ int brcmf_cfg80211_del_iface(struct wiphy *wiphy, struct wireless_dev *wdev) case NL80211_IFTYPE_STATION: case NL80211_IFTYPE_AP_VLAN: case NL80211_IFTYPE_WDS: - case NL80211_IFTYPE_MONITOR: case NL80211_IFTYPE_MESH_POINT: return -EOPNOTSUPP; + case NL80211_IFTYPE_MONITOR: + return brcmf_mon_del_vif(wiphy, wdev); case NL80211_IFTYPE_AP: return brcmf_cfg80211_del_ap_iface(wiphy, wdev); case NL80211_IFTYPE_P2P_CLIENT: @@ -6547,12 +6626,14 @@ static int brcmf_setup_ifmodes(struct wiphy *wiphy, struct brcmf_if *ifp) struct ieee80211_iface_limit *c0_limits = NULL; struct ieee80211_iface_limit *p2p_limits = NULL; struct ieee80211_iface_limit *mbss_limits = NULL; - bool mbss, p2p, rsdb; - int i, c, n_combos; + bool mon_flag, mbss, p2p, rsdb, mchan; + int i, c, n_combos, n_limits; + mon_flag = brcmf_feat_is_enabled(ifp, BRCMF_FEAT_MONITOR_FLAG); mbss = brcmf_feat_is_enabled(ifp, BRCMF_FEAT_MBSS); p2p = brcmf_feat_is_enabled(ifp, BRCMF_FEAT_P2P); rsdb = brcmf_feat_is_enabled(ifp, BRCMF_FEAT_RSDB); + mchan = brcmf_feat_is_enabled(ifp, BRCMF_FEAT_MCHAN); n_combos = 1 + !!(p2p && !rsdb) + !!mbss; combo = kcalloc(n_combos, sizeof(*combo), GFP_KERNEL); @@ -6562,59 +6643,45 @@ static int brcmf_setup_ifmodes(struct wiphy *wiphy, struct brcmf_if *ifp) wiphy->interface_modes = BIT(NL80211_IFTYPE_STATION) | BIT(NL80211_IFTYPE_ADHOC) | BIT(NL80211_IFTYPE_AP); + if (mon_flag) + wiphy->interface_modes |= BIT(NL80211_IFTYPE_MONITOR); + if (p2p) + wiphy->interface_modes |= BIT(NL80211_IFTYPE_P2P_CLIENT) | + BIT(NL80211_IFTYPE_P2P_GO) | + BIT(NL80211_IFTYPE_P2P_DEVICE); c = 0; i = 0; - if (p2p && rsdb) - c0_limits = kcalloc(4, sizeof(*c0_limits), GFP_KERNEL); - else if (p2p) - c0_limits = kcalloc(3, sizeof(*c0_limits), GFP_KERNEL); - else - c0_limits = kcalloc(2, sizeof(*c0_limits), GFP_KERNEL); + n_limits = 1 + mon_flag + (p2p ? 2 : 0) + (rsdb || !p2p); + c0_limits = kcalloc(n_limits, sizeof(*c0_limits), GFP_KERNEL); if (!c0_limits) goto err; - if (p2p && rsdb) { - combo[c].num_different_channels = 2; - wiphy->interface_modes |= BIT(NL80211_IFTYPE_P2P_CLIENT) | - BIT(NL80211_IFTYPE_P2P_GO) | - BIT(NL80211_IFTYPE_P2P_DEVICE); - c0_limits[i].max = 2; - c0_limits[i++].types = BIT(NL80211_IFTYPE_STATION); + + combo[c].num_different_channels = 1 + (rsdb || (p2p && mchan)); + c0_limits[i].max = 1 + rsdb; + c0_limits[i++].types = BIT(NL80211_IFTYPE_STATION); + if (mon_flag) { + c0_limits[i].max = 1; + c0_limits[i++].types = BIT(NL80211_IFTYPE_MONITOR); + } + if (p2p) { c0_limits[i].max = 1; c0_limits[i++].types = BIT(NL80211_IFTYPE_P2P_DEVICE); - c0_limits[i].max = 2; + c0_limits[i].max = 1 + rsdb; c0_limits[i++].types = BIT(NL80211_IFTYPE_P2P_CLIENT) | BIT(NL80211_IFTYPE_P2P_GO); + } + if (p2p && rsdb) { c0_limits[i].max = 2; c0_limits[i++].types = BIT(NL80211_IFTYPE_AP); combo[c].max_interfaces = 5; } else if (p2p) { - if (brcmf_feat_is_enabled(ifp, BRCMF_FEAT_MCHAN)) - combo[c].num_different_channels = 2; - else - combo[c].num_different_channels = 1; - c0_limits[i].max = 1; - c0_limits[i++].types = BIT(NL80211_IFTYPE_STATION); - wiphy->interface_modes |= BIT(NL80211_IFTYPE_P2P_CLIENT) | - BIT(NL80211_IFTYPE_P2P_GO) | - BIT(NL80211_IFTYPE_P2P_DEVICE); - c0_limits[i].max = 1; - c0_limits[i++].types = BIT(NL80211_IFTYPE_P2P_DEVICE); - c0_limits[i].max = 1; - c0_limits[i++].types = BIT(NL80211_IFTYPE_P2P_CLIENT) | - BIT(NL80211_IFTYPE_P2P_GO); combo[c].max_interfaces = i; } else if (rsdb) { - combo[c].num_different_channels = 2; - c0_limits[i].max = 2; - c0_limits[i++].types = BIT(NL80211_IFTYPE_STATION); c0_limits[i].max = 2; c0_limits[i++].types = BIT(NL80211_IFTYPE_AP); combo[c].max_interfaces = 3; } else { - combo[c].num_different_channels = 1; - c0_limits[i].max = 1; - c0_limits[i++].types = BIT(NL80211_IFTYPE_STATION); c0_limits[i].max = 1; c0_limits[i++].types = BIT(NL80211_IFTYPE_AP); combo[c].max_interfaces = i; @@ -6645,14 +6712,20 @@ static int brcmf_setup_ifmodes(struct wiphy *wiphy, struct brcmf_if *ifp) if (mbss) { c++; i = 0; - mbss_limits = kcalloc(1, sizeof(*mbss_limits), GFP_KERNEL); + n_limits = 1 + mon_flag; + mbss_limits = kcalloc(n_limits, sizeof(*mbss_limits), + GFP_KERNEL); if (!mbss_limits) goto err; mbss_limits[i].max = 4; mbss_limits[i++].types = BIT(NL80211_IFTYPE_AP); + if (mon_flag) { + mbss_limits[i].max = 1; + mbss_limits[i++].types = BIT(NL80211_IFTYPE_MONITOR); + } combo[c].beacon_int_infra_match = true; combo[c].num_different_channels = 1; - combo[c].max_interfaces = 4; + combo[c].max_interfaces = 4 + mon_flag; combo[c].n_limits = i; combo[c].limits = mbss_limits; } diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c index d3ddd97fe768..23627c953a5e 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c @@ -673,7 +673,7 @@ fail: return -EBADE; } -static void brcmf_net_detach(struct net_device *ndev, bool rtnl_locked) +void brcmf_net_detach(struct net_device *ndev, bool rtnl_locked) { if (ndev->reg_state == NETREG_REGISTERED) { if (rtnl_locked) @@ -686,6 +686,72 @@ static void brcmf_net_detach(struct net_device *ndev, bool rtnl_locked) } } +static int brcmf_net_mon_open(struct net_device *ndev) +{ + struct brcmf_if *ifp = netdev_priv(ndev); + struct brcmf_pub *drvr = ifp->drvr; + u32 monitor; + int err; + + brcmf_dbg(TRACE, "Enter\n"); + + err = brcmf_fil_cmd_int_get(ifp, BRCMF_C_GET_MONITOR, &monitor); + if (err) { + bphy_err(drvr, "BRCMF_C_GET_MONITOR error (%d)\n", err); + return err; + } else if (monitor) { + bphy_err(drvr, "Monitor mode is already enabled\n"); + return -EEXIST; + } + + monitor = 3; + err = brcmf_fil_cmd_int_set(ifp, BRCMF_C_SET_MONITOR, monitor); + if (err) + bphy_err(drvr, "BRCMF_C_SET_MONITOR error (%d)\n", err); + + return err; +} + +static int brcmf_net_mon_stop(struct net_device *ndev) +{ + struct brcmf_if *ifp = netdev_priv(ndev); + struct brcmf_pub *drvr = ifp->drvr; + u32 monitor; + int err; + + brcmf_dbg(TRACE, "Enter\n"); + + monitor = 0; + err = brcmf_fil_cmd_int_set(ifp, BRCMF_C_SET_MONITOR, monitor); + if (err) + bphy_err(drvr, "BRCMF_C_SET_MONITOR error (%d)\n", err); + + return err; +} + +static const struct net_device_ops brcmf_netdev_ops_mon = { + .ndo_open = brcmf_net_mon_open, + .ndo_stop = brcmf_net_mon_stop, +}; + +int brcmf_net_mon_attach(struct brcmf_if *ifp) +{ + struct brcmf_pub *drvr = ifp->drvr; + struct net_device *ndev; + int err; + + brcmf_dbg(TRACE, "Enter\n"); + + ndev = ifp->ndev; + ndev->netdev_ops = &brcmf_netdev_ops_mon; + + err = register_netdevice(ndev); + if (err) + bphy_err(drvr, "Failed to register %s device\n", ndev->name); + + return err; +} + void brcmf_net_setcarrier(struct brcmf_if *ifp, bool on) { struct net_device *ndev; diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.h b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.h index 6699637d3bf8..33b2ab3b54b0 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.h +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.h @@ -210,6 +210,8 @@ void brcmf_txflowblock_if(struct brcmf_if *ifp, void brcmf_txfinalize(struct brcmf_if *ifp, struct sk_buff *txp, bool success); void brcmf_netif_rx(struct brcmf_if *ifp, struct sk_buff *skb); void brcmf_netif_mon_rx(struct brcmf_if *ifp, struct sk_buff *skb); +void brcmf_net_detach(struct net_device *ndev, bool rtnl_locked); +int brcmf_net_mon_attach(struct brcmf_if *ifp); void brcmf_net_setcarrier(struct brcmf_if *ifp, bool on); int __init brcmf_core_init(void); void __exit brcmf_core_exit(void); diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/feature.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/feature.c index 1c9c74cc958e..5da0dda0d899 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/feature.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/feature.c @@ -38,6 +38,7 @@ static const struct brcmf_feat_fwcap brcmf_fwcap_map[] = { { BRCMF_FEAT_MCHAN, "mchan" }, { BRCMF_FEAT_P2P, "p2p" }, { BRCMF_FEAT_MONITOR, "monitor" }, + { BRCMF_FEAT_MONITOR_FLAG, "rtap" }, { BRCMF_FEAT_MONITOR_FMT_RADIOTAP, "rtap" }, { BRCMF_FEAT_DOT11H, "802.11h" }, { BRCMF_FEAT_SAE, "sae" }, diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/feature.h b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/feature.h index 280a1f6412d4..cda3fc1bab7f 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/feature.h +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/feature.h @@ -23,6 +23,7 @@ * GSCAN: enhanced scan offload feature. * FWSUP: Firmware supplicant. * MONITOR: firmware can pass monitor packets to host. + * MONITOR_FLAG: firmware flags monitor packets. * MONITOR_FMT_RADIOTAP: firmware provides monitor packets with radiotap header * MONITOR_FMT_HW_RX_HDR: firmware provides monitor packets with hw/ucode header * DOT11H: firmware supports 802.11h @@ -44,6 +45,7 @@ BRCMF_FEAT_DEF(GSCAN) \ BRCMF_FEAT_DEF(FWSUP) \ BRCMF_FEAT_DEF(MONITOR) \ + BRCMF_FEAT_DEF(MONITOR_FLAG) \ BRCMF_FEAT_DEF(MONITOR_FMT_RADIOTAP) \ BRCMF_FEAT_DEF(MONITOR_FMT_HW_RX_HDR) \ BRCMF_FEAT_DEF(DOT11H) \ diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwil.h b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwil.h index 0ff6f5212a94..ae4cf4372908 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwil.h +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwil.h @@ -49,6 +49,8 @@ #define BRCMF_C_GET_PM 85 #define BRCMF_C_SET_PM 86 #define BRCMF_C_GET_REVINFO 98 +#define BRCMF_C_GET_MONITOR 107 +#define BRCMF_C_SET_MONITOR 108 #define BRCMF_C_GET_CURR_RATESET 114 #define BRCMF_C_GET_AP 117 #define BRCMF_C_SET_AP 118 diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.c index 2bd892df83cc..5e1a11c07551 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.c @@ -908,7 +908,7 @@ static u8 brcmf_fws_hdrpush(struct brcmf_fws_info *fws, struct sk_buff *skb) wlh += wlh[1] + 2; if (entry->send_tim_signal) { - entry->send_tim_signal = 0; + entry->send_tim_signal = false; wlh[0] = BRCMF_FWS_TYPE_PENDING_TRAFFIC_BMP; wlh[1] = BRCMF_FWS_TYPE_PENDING_TRAFFIC_BMP_LEN; wlh[2] = entry->mac_handle; diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/msgbuf.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/msgbuf.c index e3dd8623be4e..8bb4f1fa790e 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/msgbuf.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/msgbuf.c @@ -365,7 +365,7 @@ brcmf_msgbuf_get_pktid(struct device *dev, struct brcmf_msgbuf_pktids *pktids, struct brcmf_msgbuf_pktid *pktid; struct sk_buff *skb; - if (idx < 0 || idx >= pktids->array_size) { + if (idx >= pktids->array_size) { brcmf_err("Invalid packet id %d (max %d)\n", idx, pktids->array_size); return NULL; diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c index f9df95bc7fa1..f9047db6a11d 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c @@ -4243,6 +4243,12 @@ static void brcmf_sdio_firmware_callback(struct device *dev, int err, } if (err == 0) { + /* Assign bus interface call back */ + sdiod->bus_if->dev = sdiod->dev; + sdiod->bus_if->ops = &brcmf_sdio_bus_ops; + sdiod->bus_if->chip = bus->ci->chip; + sdiod->bus_if->chiprev = bus->ci->chiprev; + /* Allow full data communication using DPC from now on. */ brcmf_sdiod_change_state(bus->sdiodev, BRCMF_SDIOD_DATA); @@ -4259,12 +4265,6 @@ static void brcmf_sdio_firmware_callback(struct device *dev, int err, sdio_release_host(sdiod->func1); - /* Assign bus interface call back */ - sdiod->bus_if->dev = sdiod->dev; - sdiod->bus_if->ops = &brcmf_sdio_bus_ops; - sdiod->bus_if->chip = bus->ci->chip; - sdiod->bus_if->chiprev = bus->ci->chiprev; - err = brcmf_alloc(sdiod->dev, sdiod->settings); if (err) { brcmf_err("brcmf_alloc failed\n"); diff --git a/drivers/net/wireless/cisco/airo.c b/drivers/net/wireless/cisco/airo.c index f43c06569ea1..c4c8f1b62e1e 100644 --- a/drivers/net/wireless/cisco/airo.c +++ b/drivers/net/wireless/cisco/airo.c @@ -7790,16 +7790,8 @@ static int readrids(struct net_device *dev, aironet_ioctl *comp) { case AIROGVLIST: ridcode = RID_APLIST; break; case AIROGDRVNAM: ridcode = RID_DRVNAME; break; case AIROGEHTENC: ridcode = RID_ETHERENCAP; break; - case AIROGWEPKTMP: ridcode = RID_WEP_TEMP; - /* Only super-user can read WEP keys */ - if (!capable(CAP_NET_ADMIN)) - return -EPERM; - break; - case AIROGWEPKNV: ridcode = RID_WEP_PERM; - /* Only super-user can read WEP keys */ - if (!capable(CAP_NET_ADMIN)) - return -EPERM; - break; + case AIROGWEPKTMP: ridcode = RID_WEP_TEMP; break; + case AIROGWEPKNV: ridcode = RID_WEP_PERM; break; case AIROGSTAT: ridcode = RID_STATUS; break; case AIROGSTATSD32: ridcode = RID_STATSDELTA; break; case AIROGSTATSC32: ridcode = RID_STATS; break; @@ -7813,7 +7805,13 @@ static int readrids(struct net_device *dev, aironet_ioctl *comp) { return -EINVAL; } - if ((iobuf = kmalloc(RIDSIZE, GFP_KERNEL)) == NULL) + if (ridcode == RID_WEP_TEMP || ridcode == RID_WEP_PERM) { + /* Only super-user can read WEP keys */ + if (!capable(CAP_NET_ADMIN)) + return -EPERM; + } + + if ((iobuf = kzalloc(RIDSIZE, GFP_KERNEL)) == NULL) return -ENOMEM; PC4500_readrid(ai,ridcode,iobuf,RIDSIZE, 1); diff --git a/drivers/net/wireless/intel/iwlegacy/common.c b/drivers/net/wireless/intel/iwlegacy/common.c index d966b29b45ee..348c17ce72f5 100644 --- a/drivers/net/wireless/intel/iwlegacy/common.c +++ b/drivers/net/wireless/intel/iwlegacy/common.c @@ -699,7 +699,7 @@ il_eeprom_init(struct il_priv *il) u32 gp = _il_rd(il, CSR_EEPROM_GP); int sz; int ret; - u16 addr; + int addr; /* allocate eeprom */ sz = il->cfg->eeprom_size; diff --git a/drivers/net/wireless/intel/iwlwifi/cfg/1000.c b/drivers/net/wireless/intel/iwlwifi/cfg/1000.c index b92255b91b72..8a4579bb10d3 100644 --- a/drivers/net/wireless/intel/iwlwifi/cfg/1000.c +++ b/drivers/net/wireless/intel/iwlwifi/cfg/1000.c @@ -77,8 +77,7 @@ static const struct iwl_eeprom_params iwl1000_eeprom_params = { .trans.base_params = &iwl1000_base_params, \ .eeprom_params = &iwl1000_eeprom_params, \ .led_mode = IWL_LED_BLINK, \ - .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K, \ - .trans.csr = &iwl_csr_v1 + .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K const struct iwl_cfg iwl1000_bgn_cfg = { .name = "Intel(R) Centrino(R) Wireless-N 1000 BGN", @@ -104,8 +103,7 @@ const struct iwl_cfg iwl1000_bg_cfg = { .eeprom_params = &iwl1000_eeprom_params, \ .led_mode = IWL_LED_RF_STATE, \ .rx_with_siso_diversity = true, \ - .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K, \ - .trans.csr = &iwl_csr_v1 + .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K const struct iwl_cfg iwl100_bgn_cfg = { .name = "Intel(R) Centrino(R) Wireless-N 100 BGN", diff --git a/drivers/net/wireless/intel/iwlwifi/cfg/2000.c b/drivers/net/wireless/intel/iwlwifi/cfg/2000.c index 2b1ae0cecc83..7140a5f3ea8b 100644 --- a/drivers/net/wireless/intel/iwlwifi/cfg/2000.c +++ b/drivers/net/wireless/intel/iwlwifi/cfg/2000.c @@ -103,8 +103,7 @@ static const struct iwl_eeprom_params iwl20x0_eeprom_params = { .trans.base_params = &iwl2000_base_params, \ .eeprom_params = &iwl20x0_eeprom_params, \ .led_mode = IWL_LED_RF_STATE, \ - .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K, \ - .trans.csr = &iwl_csr_v1 + .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K const struct iwl_cfg iwl2000_2bgn_cfg = { @@ -131,8 +130,7 @@ const struct iwl_cfg iwl2000_2bgn_d_cfg = { .trans.base_params = &iwl2030_base_params, \ .eeprom_params = &iwl20x0_eeprom_params, \ .led_mode = IWL_LED_RF_STATE, \ - .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K, \ - .trans.csr = &iwl_csr_v1 + .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K const struct iwl_cfg iwl2030_2bgn_cfg = { .name = "Intel(R) Centrino(R) Wireless-N 2230 BGN", @@ -153,8 +151,7 @@ const struct iwl_cfg iwl2030_2bgn_cfg = { .eeprom_params = &iwl20x0_eeprom_params, \ .led_mode = IWL_LED_RF_STATE, \ .rx_with_siso_diversity = true, \ - .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K, \ - .trans.csr = &iwl_csr_v1 + .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K const struct iwl_cfg iwl105_bgn_cfg = { .name = "Intel(R) Centrino(R) Wireless-N 105 BGN", @@ -181,8 +178,7 @@ const struct iwl_cfg iwl105_bgn_d_cfg = { .eeprom_params = &iwl20x0_eeprom_params, \ .led_mode = IWL_LED_RF_STATE, \ .rx_with_siso_diversity = true, \ - .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K, \ - .trans.csr = &iwl_csr_v1 + .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K const struct iwl_cfg iwl135_bgn_cfg = { .name = "Intel(R) Centrino(R) Wireless-N 135 BGN", diff --git a/drivers/net/wireless/intel/iwlwifi/cfg/22000.c b/drivers/net/wireless/intel/iwlwifi/cfg/22000.c index 56dc335a788c..a22a830019c0 100644 --- a/drivers/net/wireless/intel/iwlwifi/cfg/22000.c +++ b/drivers/net/wireless/intel/iwlwifi/cfg/22000.c @@ -92,6 +92,7 @@ #define IWL_22000_SO_A_GF_A_FW_PRE "iwlwifi-so-a0-gf-a0-" #define IWL_22000_TY_A_GF_A_FW_PRE "iwlwifi-ty-a0-gf-a0-" #define IWL_22000_SO_A_GF4_A_FW_PRE "iwlwifi-so-a0-gf4-a0-" +#define IWL_22000_SOSNJ_A_GF4_A_FW_PRE "iwlwifi-SoSnj-a0-gf4-a0-" #define IWL_22000_HR_MODULE_FIRMWARE(api) \ IWL_22000_HR_FW_PRE __stringify(api) ".ucode" @@ -199,7 +200,6 @@ static const struct iwl_ht_params iwl_22000_ht_params = { IWL_DEVICE_22000_COMMON, \ .trans.device_family = IWL_DEVICE_FAMILY_22000, \ .trans.base_params = &iwl_22000_base_params, \ - .trans.csr = &iwl_csr_v1, \ .gp2_reg_addr = 0xa02c68, \ .mon_dram_regs = { \ .write_ptr = { \ @@ -217,7 +217,6 @@ static const struct iwl_ht_params iwl_22000_ht_params = { .trans.umac_prph_offset = 0x300000, \ .trans.device_family = IWL_DEVICE_FAMILY_AX210, \ .trans.base_params = &iwl_ax210_base_params, \ - .trans.csr = &iwl_csr_v1, \ .min_txq_size = 128, \ .gp2_reg_addr = 0xd02c68, \ .min_256_ba_txq_size = 512, \ @@ -236,24 +235,16 @@ static const struct iwl_ht_params iwl_22000_ht_params = { }, \ } -const struct iwl_cfg iwl22000_2ac_cfg_hr = { - .name = "Intel(R) Dual Band Wireless AC 22000", - .fw_name_pre = IWL_22000_HR_FW_PRE, - IWL_DEVICE_22500, -}; - -const struct iwl_cfg iwl22000_2ac_cfg_hr_cdb = { - .name = "Intel(R) Dual Band Wireless AC 22000", - .fw_name_pre = IWL_22000_HR_CDB_FW_PRE, - IWL_DEVICE_22500, - .cdb = true, -}; - -const struct iwl_cfg iwl22000_2ac_cfg_jf = { - .name = "Intel(R) Dual Band Wireless AC 22000", - .fw_name_pre = IWL_22000_JF_FW_PRE, - IWL_DEVICE_22500, -}; +/* + * If the device doesn't support HE, no need to have that many buffers. + * 22000 devices can split multiple frames into a single RB, so fewer are + * needed; AX210 cannot (but use smaller RBs by default) - these sizes + * were picked according to 8 MSDUs inside 256 A-MSDUs in an A-MPDU, with + * additional overhead to account for processing time. + */ +#define IWL_NUM_RBDS_NON_HE 512 +#define IWL_NUM_RBDS_22000_HE 2048 +#define IWL_NUM_RBDS_AX210_HE 4096 const struct iwl_cfg iwl_ax101_cfg_qu_hr = { .name = "Intel(R) Wi-Fi 6 AX101", @@ -266,6 +257,7 @@ const struct iwl_cfg iwl_ax101_cfg_qu_hr = { */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, .tx_with_siso_diversity = true, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; const struct iwl_cfg iwl_ax201_cfg_qu_hr = { @@ -278,6 +270,7 @@ const struct iwl_cfg iwl_ax201_cfg_qu_hr = { * HT size; mac80211 would otherwise pick the HE max (256) by default. */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; const struct iwl_cfg iwl_ax101_cfg_qu_c0_hr_b0 = { @@ -290,6 +283,7 @@ const struct iwl_cfg iwl_ax101_cfg_qu_c0_hr_b0 = { * HT size; mac80211 would otherwise pick the HE max (256) by default. */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; const struct iwl_cfg iwl_ax201_cfg_qu_c0_hr_b0 = { @@ -302,6 +296,7 @@ const struct iwl_cfg iwl_ax201_cfg_qu_c0_hr_b0 = { * HT size; mac80211 would otherwise pick the HE max (256) by default. */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; const struct iwl_cfg iwl_ax101_cfg_quz_hr = { @@ -314,6 +309,7 @@ const struct iwl_cfg iwl_ax101_cfg_quz_hr = { * HT size; mac80211 would otherwise pick the HE max (256) by default. */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; const struct iwl_cfg iwl_ax201_cfg_quz_hr = { @@ -326,6 +322,7 @@ const struct iwl_cfg iwl_ax201_cfg_quz_hr = { * HT size; mac80211 would otherwise pick the HE max (256) by default. */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; const struct iwl_cfg iwl_ax1650s_cfg_quz_hr = { @@ -338,6 +335,7 @@ const struct iwl_cfg iwl_ax1650s_cfg_quz_hr = { * HT size; mac80211 would otherwise pick the HE max (256) by default. */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; const struct iwl_cfg iwl_ax1650i_cfg_quz_hr = { @@ -350,6 +348,7 @@ const struct iwl_cfg iwl_ax1650i_cfg_quz_hr = { * HT size; mac80211 would otherwise pick the HE max (256) by default. */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; const struct iwl_cfg iwl_ax200_cfg_cc = { @@ -363,6 +362,7 @@ const struct iwl_cfg iwl_ax200_cfg_cc = { */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, .trans.bisr_workaround = 1, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; const struct iwl_cfg killer1650x_2ax_cfg = { @@ -376,6 +376,7 @@ const struct iwl_cfg killer1650x_2ax_cfg = { */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, .trans.bisr_workaround = 1, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; const struct iwl_cfg killer1650w_2ax_cfg = { @@ -389,6 +390,7 @@ const struct iwl_cfg killer1650w_2ax_cfg = { */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, .trans.bisr_workaround = 1, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; /* @@ -400,48 +402,56 @@ const struct iwl_cfg iwl9461_2ac_cfg_qu_b0_jf_b0 = { .name = "Intel(R) Wireless-AC 9461", .fw_name_pre = IWL_QU_B_JF_B_FW_PRE, IWL_DEVICE_22500, + .num_rbds = IWL_NUM_RBDS_NON_HE, }; const struct iwl_cfg iwl9462_2ac_cfg_qu_b0_jf_b0 = { .name = "Intel(R) Wireless-AC 9462", .fw_name_pre = IWL_QU_B_JF_B_FW_PRE, IWL_DEVICE_22500, + .num_rbds = IWL_NUM_RBDS_NON_HE, }; const struct iwl_cfg iwl9560_2ac_cfg_qu_b0_jf_b0 = { .name = "Intel(R) Wireless-AC 9560", .fw_name_pre = IWL_QU_B_JF_B_FW_PRE, IWL_DEVICE_22500, + .num_rbds = IWL_NUM_RBDS_NON_HE, }; const struct iwl_cfg iwl9560_2ac_160_cfg_qu_b0_jf_b0 = { .name = "Intel(R) Wireless-AC 9560 160MHz", .fw_name_pre = IWL_QU_B_JF_B_FW_PRE, IWL_DEVICE_22500, + .num_rbds = IWL_NUM_RBDS_NON_HE, }; const struct iwl_cfg iwl9461_2ac_cfg_qu_c0_jf_b0 = { .name = "Intel(R) Wireless-AC 9461", .fw_name_pre = IWL_QU_C_JF_B_FW_PRE, IWL_DEVICE_22500, + .num_rbds = IWL_NUM_RBDS_NON_HE, }; const struct iwl_cfg iwl9462_2ac_cfg_qu_c0_jf_b0 = { .name = "Intel(R) Wireless-AC 9462", .fw_name_pre = IWL_QU_C_JF_B_FW_PRE, IWL_DEVICE_22500, + .num_rbds = IWL_NUM_RBDS_NON_HE, }; const struct iwl_cfg iwl9560_2ac_cfg_qu_c0_jf_b0 = { .name = "Intel(R) Wireless-AC 9560", .fw_name_pre = IWL_QU_C_JF_B_FW_PRE, IWL_DEVICE_22500, + .num_rbds = IWL_NUM_RBDS_NON_HE, }; const struct iwl_cfg iwl9560_2ac_160_cfg_qu_c0_jf_b0 = { .name = "Intel(R) Wireless-AC 9560 160MHz", .fw_name_pre = IWL_QU_C_JF_B_FW_PRE, IWL_DEVICE_22500, + .num_rbds = IWL_NUM_RBDS_NON_HE, }; const struct iwl_cfg iwl9560_2ac_cfg_qnj_jf_b0 = { @@ -454,6 +464,7 @@ const struct iwl_cfg iwl9560_2ac_cfg_qnj_jf_b0 = { * HT size; mac80211 would otherwise pick the HE max (256) by default. */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, + .num_rbds = IWL_NUM_RBDS_NON_HE, }; const struct iwl_cfg iwl9560_2ac_cfg_quz_a0_jf_b0_soc = { @@ -468,6 +479,7 @@ const struct iwl_cfg iwl9560_2ac_cfg_quz_a0_jf_b0_soc = { .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, .integrated = true, .soc_latency = 5000, + .num_rbds = IWL_NUM_RBDS_NON_HE, }; const struct iwl_cfg iwl9560_2ac_160_cfg_quz_a0_jf_b0_soc = { @@ -482,6 +494,7 @@ const struct iwl_cfg iwl9560_2ac_160_cfg_quz_a0_jf_b0_soc = { .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, .integrated = true, .soc_latency = 5000, + .num_rbds = IWL_NUM_RBDS_NON_HE, }; const struct iwl_cfg iwl9461_2ac_cfg_quz_a0_jf_b0_soc = { @@ -496,6 +509,7 @@ const struct iwl_cfg iwl9461_2ac_cfg_quz_a0_jf_b0_soc = { .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, .integrated = true, .soc_latency = 5000, + .num_rbds = IWL_NUM_RBDS_NON_HE, }; const struct iwl_cfg iwl9462_2ac_cfg_quz_a0_jf_b0_soc = { @@ -510,6 +524,7 @@ const struct iwl_cfg iwl9462_2ac_cfg_quz_a0_jf_b0_soc = { .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, .integrated = true, .soc_latency = 5000, + .num_rbds = IWL_NUM_RBDS_NON_HE, }; const struct iwl_cfg iwl9560_killer_s_2ac_cfg_quz_a0_jf_b0_soc = { @@ -524,6 +539,7 @@ const struct iwl_cfg iwl9560_killer_s_2ac_cfg_quz_a0_jf_b0_soc = { .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, .integrated = true, .soc_latency = 5000, + .num_rbds = IWL_NUM_RBDS_NON_HE, }; const struct iwl_cfg iwl9560_killer_i_2ac_cfg_quz_a0_jf_b0_soc = { @@ -538,18 +554,21 @@ const struct iwl_cfg iwl9560_killer_i_2ac_cfg_quz_a0_jf_b0_soc = { .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, .integrated = true, .soc_latency = 5000, + .num_rbds = IWL_NUM_RBDS_NON_HE, }; const struct iwl_cfg killer1550i_2ac_cfg_qu_b0_jf_b0 = { .name = "Killer (R) Wireless-AC 1550i Wireless Network Adapter (9560NGW)", .fw_name_pre = IWL_QU_B_JF_B_FW_PRE, IWL_DEVICE_22500, + .num_rbds = IWL_NUM_RBDS_NON_HE, }; const struct iwl_cfg killer1550s_2ac_cfg_qu_b0_jf_b0 = { .name = "Killer (R) Wireless-AC 1550s Wireless Network Adapter (9560NGW)", .fw_name_pre = IWL_QU_B_JF_B_FW_PRE, IWL_DEVICE_22500, + .num_rbds = IWL_NUM_RBDS_NON_HE, }; const struct iwl_cfg killer1650s_2ax_cfg_qu_b0_hr_b0 = { @@ -562,6 +581,7 @@ const struct iwl_cfg killer1650s_2ax_cfg_qu_b0_hr_b0 = { * HT size; mac80211 would otherwise pick the HE max (256) by default. */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; const struct iwl_cfg killer1650i_2ax_cfg_qu_b0_hr_b0 = { @@ -574,6 +594,7 @@ const struct iwl_cfg killer1650i_2ax_cfg_qu_b0_hr_b0 = { * HT size; mac80211 would otherwise pick the HE max (256) by default. */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; const struct iwl_cfg killer1650s_2ax_cfg_qu_c0_hr_b0 = { @@ -586,6 +607,7 @@ const struct iwl_cfg killer1650s_2ax_cfg_qu_c0_hr_b0 = { * HT size; mac80211 would otherwise pick the HE max (256) by default. */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; const struct iwl_cfg killer1650i_2ax_cfg_qu_c0_hr_b0 = { @@ -598,6 +620,7 @@ const struct iwl_cfg killer1650i_2ax_cfg_qu_c0_hr_b0 = { * HT size; mac80211 would otherwise pick the HE max (256) by default. */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; const struct iwl_cfg iwl22000_2ax_cfg_jf = { @@ -610,6 +633,7 @@ const struct iwl_cfg iwl22000_2ax_cfg_jf = { * HT size; mac80211 would otherwise pick the HE max (256) by default. */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; const struct iwl_cfg iwl22000_2ax_cfg_qnj_hr_a0_f0 = { @@ -622,6 +646,7 @@ const struct iwl_cfg iwl22000_2ax_cfg_qnj_hr_a0_f0 = { * HT size; mac80211 would otherwise pick the HE max (256) by default. */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; const struct iwl_cfg iwl22000_2ax_cfg_qnj_hr_b0 = { @@ -634,6 +659,7 @@ const struct iwl_cfg iwl22000_2ax_cfg_qnj_hr_b0 = { * HT size; mac80211 would otherwise pick the HE max (256) by default. */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; const struct iwl_cfg iwl22000_2ax_cfg_qnj_hr_a0 = { @@ -646,18 +672,21 @@ const struct iwl_cfg iwl22000_2ax_cfg_qnj_hr_a0 = { * HT size; mac80211 would otherwise pick the HE max (256) by default. */ .max_tx_agg_size = IEEE80211_MAX_AMPDU_BUF_HT, + .num_rbds = IWL_NUM_RBDS_22000_HE, }; const struct iwl_cfg iwlax210_2ax_cfg_so_jf_a0 = { .name = "Intel(R) Wireless-AC 9560 160MHz", .fw_name_pre = IWL_22000_SO_A_JF_B_FW_PRE, IWL_DEVICE_AX210, + .num_rbds = IWL_NUM_RBDS_NON_HE, }; const struct iwl_cfg iwlax210_2ax_cfg_so_hr_a0 = { .name = "Intel(R) Wi-Fi 7 AX210 160MHz", .fw_name_pre = IWL_22000_SO_A_HR_B_FW_PRE, IWL_DEVICE_AX210, + .num_rbds = IWL_NUM_RBDS_AX210_HE, }; const struct iwl_cfg iwlax211_2ax_cfg_so_gf_a0 = { @@ -665,6 +694,7 @@ const struct iwl_cfg iwlax211_2ax_cfg_so_gf_a0 = { .fw_name_pre = IWL_22000_SO_A_GF_A_FW_PRE, .uhb_supported = true, IWL_DEVICE_AX210, + .num_rbds = IWL_NUM_RBDS_AX210_HE, }; const struct iwl_cfg iwlax210_2ax_cfg_ty_gf_a0 = { @@ -672,12 +702,23 @@ const struct iwl_cfg iwlax210_2ax_cfg_ty_gf_a0 = { .fw_name_pre = IWL_22000_TY_A_GF_A_FW_PRE, .uhb_supported = true, IWL_DEVICE_AX210, + .num_rbds = IWL_NUM_RBDS_AX210_HE, }; const struct iwl_cfg iwlax411_2ax_cfg_so_gf4_a0 = { .name = "Intel(R) Wi-Fi 7 AX411 160MHz", .fw_name_pre = IWL_22000_SO_A_GF4_A_FW_PRE, + .uhb_supported = true, + IWL_DEVICE_AX210, + .num_rbds = IWL_NUM_RBDS_AX210_HE, +}; + +const struct iwl_cfg iwlax411_2ax_cfg_sosnj_gf4_a0 = { + .name = "Intel(R) Wi-Fi 7 AX411 160MHz", + .fw_name_pre = IWL_22000_SOSNJ_A_GF4_A_FW_PRE, + .uhb_supported = true, IWL_DEVICE_AX210, + .num_rbds = IWL_NUM_RBDS_AX210_HE, }; MODULE_FIRMWARE(IWL_22000_HR_MODULE_FIRMWARE(IWL_22000_UCODE_API_MAX)); diff --git a/drivers/net/wireless/intel/iwlwifi/cfg/5000.c b/drivers/net/wireless/intel/iwlwifi/cfg/5000.c index aab4495c6085..3591336dc644 100644 --- a/drivers/net/wireless/intel/iwlwifi/cfg/5000.c +++ b/drivers/net/wireless/intel/iwlwifi/cfg/5000.c @@ -75,8 +75,7 @@ static const struct iwl_eeprom_params iwl5000_eeprom_params = { .trans.base_params = &iwl5000_base_params, \ .eeprom_params = &iwl5000_eeprom_params, \ .led_mode = IWL_LED_BLINK, \ - .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K, \ - .trans.csr = &iwl_csr_v1 + .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K const struct iwl_cfg iwl5300_agn_cfg = { .name = "Intel(R) Ultimate N WiFi Link 5300 AGN", @@ -125,7 +124,6 @@ const struct iwl_cfg iwl5350_agn_cfg = { .ht_params = &iwl5000_ht_params, .led_mode = IWL_LED_BLINK, .internal_wimax_coex = true, - .trans.csr = &iwl_csr_v1, }; #define IWL_DEVICE_5150 \ @@ -141,8 +139,7 @@ const struct iwl_cfg iwl5350_agn_cfg = { .eeprom_params = &iwl5000_eeprom_params, \ .led_mode = IWL_LED_BLINK, \ .internal_wimax_coex = true, \ - .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K, \ - .trans.csr = &iwl_csr_v1 + .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K const struct iwl_cfg iwl5150_agn_cfg = { .name = "Intel(R) WiMAX/WiFi Link 5150 AGN", diff --git a/drivers/net/wireless/intel/iwlwifi/cfg/6000.c b/drivers/net/wireless/intel/iwlwifi/cfg/6000.c index 39ea81903dbe..b4a8a6804c39 100644 --- a/drivers/net/wireless/intel/iwlwifi/cfg/6000.c +++ b/drivers/net/wireless/intel/iwlwifi/cfg/6000.c @@ -124,8 +124,7 @@ static const struct iwl_eeprom_params iwl6000_eeprom_params = { .trans.base_params = &iwl6000_g2_base_params, \ .eeprom_params = &iwl6000_eeprom_params, \ .led_mode = IWL_LED_RF_STATE, \ - .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K, \ - .trans.csr = &iwl_csr_v1 + .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K const struct iwl_cfg iwl6005_2agn_cfg = { .name = "Intel(R) Centrino(R) Advanced-N 6205 AGN", @@ -179,8 +178,7 @@ const struct iwl_cfg iwl6005_2agn_mow2_cfg = { .trans.base_params = &iwl6000_g2_base_params, \ .eeprom_params = &iwl6000_eeprom_params, \ .led_mode = IWL_LED_RF_STATE, \ - .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K, \ - .trans.csr = &iwl_csr_v1 + .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K const struct iwl_cfg iwl6030_2agn_cfg = { .name = "Intel(R) Centrino(R) Advanced-N 6230 AGN", @@ -216,8 +214,7 @@ const struct iwl_cfg iwl6030_2bg_cfg = { .trans.base_params = &iwl6000_g2_base_params, \ .eeprom_params = &iwl6000_eeprom_params, \ .led_mode = IWL_LED_RF_STATE, \ - .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K, \ - .trans.csr = &iwl_csr_v1 + .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K const struct iwl_cfg iwl6035_2agn_cfg = { .name = "Intel(R) Centrino(R) Advanced-N 6235 AGN", @@ -272,8 +269,7 @@ const struct iwl_cfg iwl130_bg_cfg = { .trans.base_params = &iwl6000_base_params, \ .eeprom_params = &iwl6000_eeprom_params, \ .led_mode = IWL_LED_BLINK, \ - .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K, \ - .trans.csr = &iwl_csr_v1 + .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K const struct iwl_cfg iwl6000i_2agn_cfg = { .name = "Intel(R) Centrino(R) Advanced-N 6200 AGN", @@ -306,8 +302,7 @@ const struct iwl_cfg iwl6000i_2bg_cfg = { .eeprom_params = &iwl6000_eeprom_params, \ .led_mode = IWL_LED_BLINK, \ .internal_wimax_coex = true, \ - .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K, \ - .trans.csr = &iwl_csr_v1 + .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K const struct iwl_cfg iwl6050_2agn_cfg = { .name = "Intel(R) Centrino(R) Advanced-N + WiMAX 6250 AGN", @@ -333,8 +328,7 @@ const struct iwl_cfg iwl6050_2abg_cfg = { .eeprom_params = &iwl6000_eeprom_params, \ .led_mode = IWL_LED_BLINK, \ .internal_wimax_coex = true, \ - .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K, \ - .trans.csr = &iwl_csr_v1 + .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K const struct iwl_cfg iwl6150_bgn_cfg = { .name = "Intel(R) Centrino(R) Wireless-N + WiMAX 6150 BGN", @@ -361,7 +355,6 @@ const struct iwl_cfg iwl6000_3agn_cfg = { .eeprom_params = &iwl6000_eeprom_params, .ht_params = &iwl6000_ht_params, .led_mode = IWL_LED_BLINK, - .trans.csr = &iwl_csr_v1, }; MODULE_FIRMWARE(IWL6000_MODULE_FIRMWARE(IWL6000_UCODE_API_MAX)); diff --git a/drivers/net/wireless/intel/iwlwifi/cfg/7000.c b/drivers/net/wireless/intel/iwlwifi/cfg/7000.c index deb520aeb3f8..b72993e07fc2 100644 --- a/drivers/net/wireless/intel/iwlwifi/cfg/7000.c +++ b/drivers/net/wireless/intel/iwlwifi/cfg/7000.c @@ -154,8 +154,7 @@ static const struct iwl_ht_params iwl7000_ht_params = { .nvm_hw_section_num = 0, \ .non_shared_ant = ANT_A, \ .max_ht_ampdu_exponent = IEEE80211_HT_MAX_AMPDU_64K, \ - .dccm_offset = IWL7000_DCCM_OFFSET, \ - .trans.csr = &iwl_csr_v1 + .dccm_offset = IWL7000_DCCM_OFFSET #define IWL_DEVICE_7000 \ IWL_DEVICE_7000_COMMON, \ diff --git a/drivers/net/wireless/intel/iwlwifi/cfg/8000.c b/drivers/net/wireless/intel/iwlwifi/cfg/8000.c index b3cc477140c0..280d84fa5cb1 100644 --- a/drivers/net/wireless/intel/iwlwifi/cfg/8000.c +++ b/drivers/net/wireless/intel/iwlwifi/cfg/8000.c @@ -151,8 +151,7 @@ static const struct iwl_tt_params iwl8000_tt_params = { .apmg_not_supported = true, \ .nvm_type = IWL_NVM_EXT, \ .dbgc_supported = true, \ - .min_umac_error_event_table = 0x800000, \ - .trans.csr = &iwl_csr_v1 + .min_umac_error_event_table = 0x800000 #define IWL_DEVICE_8000 \ IWL_DEVICE_8000_COMMON, \ diff --git a/drivers/net/wireless/intel/iwlwifi/cfg/9000.c b/drivers/net/wireless/intel/iwlwifi/cfg/9000.c index e9155b9b5ee4..379ea788e424 100644 --- a/drivers/net/wireless/intel/iwlwifi/cfg/9000.c +++ b/drivers/net/wireless/intel/iwlwifi/cfg/9000.c @@ -138,13 +138,13 @@ static const struct iwl_tt_params iwl9000_tt_params = { .thermal_params = &iwl9000_tt_params, \ .apmg_not_supported = true, \ .trans.mq_rx_supported = true, \ + .num_rbds = 512, \ .vht_mu_mimo_supported = true, \ .mac_addr_from_csr = true, \ .trans.rf_id = true, \ .nvm_type = IWL_NVM_EXT, \ .dbgc_supported = true, \ .min_umac_error_event_table = 0x800000, \ - .trans.csr = &iwl_csr_v1, \ .d3_debug_data_base_addr = 0x401000, \ .d3_debug_data_length = 92 * 1024, \ .ht_params = &iwl9000_ht_params, \ @@ -171,6 +171,12 @@ static const struct iwl_tt_params iwl9000_tt_params = { }, \ } +const struct iwl_cfg_trans_params iwl9000_trans_cfg = { + .device_family = IWL_DEVICE_FAMILY_9000, + .base_params = &iwl9000_base_params, + .mq_rx_supported = true, + .rf_id = true, +}; const struct iwl_cfg iwl9160_2ac_cfg = { .name = "Intel(R) Dual Band Wireless AC 9160", @@ -184,8 +190,10 @@ const struct iwl_cfg iwl9260_2ac_cfg = { IWL_DEVICE_9000, }; +const char iwl9260_160_name[] = "Intel(R) Wireless-AC 9260 160MHz"; +const char iwl9560_160_name[] = "Intel(R) Wireless-AC 9560 160MHz"; + const struct iwl_cfg iwl9260_2ac_160_cfg = { - .name = "Intel(R) Wireless-AC 9260 160MHz", .fw_name_pre = IWL9260_FW_PRE, IWL_DEVICE_9000, }; diff --git a/drivers/net/wireless/intel/iwlwifi/dvm/main.c b/drivers/net/wireless/intel/iwlwifi/dvm/main.c index 4f2789bb3b5b..598ee7315558 100644 --- a/drivers/net/wireless/intel/iwlwifi/dvm/main.c +++ b/drivers/net/wireless/intel/iwlwifi/dvm/main.c @@ -1255,7 +1255,7 @@ static struct iwl_op_mode *iwl_op_mode_dvm_start(struct iwl_trans *trans, ************************/ hw = iwl_alloc_all(); if (!hw) { - pr_err("%s: Cannot allocate network device\n", cfg->name); + pr_err("%s: Cannot allocate network device\n", trans->name); goto out; } @@ -1390,7 +1390,7 @@ static struct iwl_op_mode *iwl_op_mode_dvm_start(struct iwl_trans *trans, * 2. Read REV register ***********************/ IWL_INFO(priv, "Detected %s, REV=0x%X\n", - priv->cfg->name, priv->trans->hw_rev); + priv->trans->name, priv->trans->hw_rev); if (iwl_trans_start_hw(priv->trans)) goto out_free_hw; diff --git a/drivers/net/wireless/intel/iwlwifi/dvm/tx.c b/drivers/net/wireless/intel/iwlwifi/dvm/tx.c index cd73fc5cfcbb..fd454836adbe 100644 --- a/drivers/net/wireless/intel/iwlwifi/dvm/tx.c +++ b/drivers/net/wireless/intel/iwlwifi/dvm/tx.c @@ -267,7 +267,7 @@ int iwlagn_tx_skb(struct iwl_priv *priv, struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb); struct iwl_station_priv *sta_priv = NULL; struct iwl_rxon_context *ctx = &priv->contexts[IWL_RXON_CTX_BSS]; - struct iwl_device_cmd *dev_cmd; + struct iwl_device_tx_cmd *dev_cmd; struct iwl_tx_cmd *tx_cmd; __le16 fc; u8 hdr_len; @@ -348,7 +348,6 @@ int iwlagn_tx_skb(struct iwl_priv *priv, if (unlikely(!dev_cmd)) goto drop_unlock_priv; - memset(dev_cmd, 0, sizeof(*dev_cmd)); dev_cmd->hdr.cmd = REPLY_TX; tx_cmd = (struct iwl_tx_cmd *) dev_cmd->payload; diff --git a/drivers/net/wireless/intel/iwlwifi/fw/acpi.c b/drivers/net/wireless/intel/iwlwifi/fw/acpi.c index 40fe2d667622..48d375a86d86 100644 --- a/drivers/net/wireless/intel/iwlwifi/fw/acpi.c +++ b/drivers/net/wireless/intel/iwlwifi/fw/acpi.c @@ -357,8 +357,8 @@ int iwl_sar_get_ewrd_table(struct iwl_fw_runtime *fwrt) { union acpi_object *wifi_pkg, *data; bool enabled; - int i, n_profiles, tbl_rev; - int ret = 0; + int i, n_profiles, tbl_rev, pos; + int ret = 0; data = iwl_acpi_get_object(fwrt->dev, ACPI_EWRD_METHOD); if (IS_ERR(data)) @@ -390,10 +390,10 @@ int iwl_sar_get_ewrd_table(struct iwl_fw_runtime *fwrt) goto out_free; } - for (i = 0; i < n_profiles; i++) { - /* the tables start at element 3 */ - int pos = 3; + /* the tables start at element 3 */ + pos = 3; + for (i = 0; i < n_profiles; i++) { /* The EWRD profiles officially go from 2 to 4, but we * save them in sar_profiles[1-3] (because we don't * have profile 0). So in the array we start from 1. diff --git a/drivers/net/wireless/intel/iwlwifi/fw/api/location.h b/drivers/net/wireless/intel/iwlwifi/fw/api/location.h index 7a0fe5adefa5..a0d6802c2715 100644 --- a/drivers/net/wireless/intel/iwlwifi/fw/api/location.h +++ b/drivers/net/wireless/intel/iwlwifi/fw/api/location.h @@ -240,7 +240,7 @@ enum iwl_tof_responder_cfg_flags { }; /** - * struct iwl_tof_responder_config_cmd - ToF AP mode (for debug) + * struct iwl_tof_responder_config_cmd_v6 - ToF AP mode (for debug) * @cmd_valid_fields: &iwl_tof_responder_cmd_valid_field * @responder_cfg_flags: &iwl_tof_responder_cfg_flags * @bandwidth: current AP Bandwidth: &enum iwl_tof_bandwidth @@ -258,7 +258,7 @@ enum iwl_tof_responder_cfg_flags { * @bssid: Current AP BSSID * @reserved2: reserved */ -struct iwl_tof_responder_config_cmd { +struct iwl_tof_responder_config_cmd_v6 { __le32 cmd_valid_fields; __le32 responder_cfg_flags; u8 bandwidth; @@ -274,6 +274,42 @@ struct iwl_tof_responder_config_cmd { __le16 reserved2; } __packed; /* TOF_RESPONDER_CONFIG_CMD_API_S_VER_6 */ +/** + * struct iwl_tof_responder_config_cmd - ToF AP mode (for debug) + * @cmd_valid_fields: &iwl_tof_responder_cmd_valid_field + * @responder_cfg_flags: &iwl_tof_responder_cfg_flags + * @format_bw: bits 0 - 3: &enum iwl_location_frame_format. + * bits 4 - 7: &enum iwl_location_bw. + * @rate: current AP rate + * @channel_num: current AP Channel + * @ctrl_ch_position: coding of the control channel position relative to + * the center frequency, see iwl_mvm_get_ctrl_pos() + * @sta_id: index of the AP STA when in AP mode + * @reserved1: reserved + * @toa_offset: Artificial addition [pSec] for the ToA - to be used for debug + * purposes, simulating station movement by adding various values + * to this field + * @common_calib: XVT: common calibration value + * @specific_calib: XVT: specific calibration value + * @bssid: Current AP BSSID + * @reserved2: reserved + */ +struct iwl_tof_responder_config_cmd { + __le32 cmd_valid_fields; + __le32 responder_cfg_flags; + u8 format_bw; + u8 rate; + u8 channel_num; + u8 ctrl_ch_position; + u8 sta_id; + u8 reserved1; + __le16 toa_offset; + __le16 common_calib; + __le16 specific_calib; + u8 bssid[ETH_ALEN]; + __le16 reserved2; +} __packed; /* TOF_RESPONDER_CONFIG_CMD_API_S_VER_6 */ + #define IWL_LCI_CIVIC_IE_MAX_SIZE 400 /** @@ -403,7 +439,7 @@ enum iwl_initiator_ap_flags { }; /** - * struct iwl_tof_range_req_ap_entry - AP configuration parameters + * struct iwl_tof_range_req_ap_entry_v3 - AP configuration parameters * @initiator_ap_flags: see &enum iwl_initiator_ap_flags. * @channel_num: AP Channel number * @bandwidth: AP bandwidth. One of iwl_tof_bandwidth. @@ -420,7 +456,7 @@ enum iwl_initiator_ap_flags { * @reserved: For alignment and future use * @tsf_delta: not in use */ -struct iwl_tof_range_req_ap_entry { +struct iwl_tof_range_req_ap_entry_v3 { __le32 initiator_ap_flags; u8 channel_num; u8 bandwidth; @@ -435,6 +471,72 @@ struct iwl_tof_range_req_ap_entry { } __packed; /* LOCATION_RANGE_REQ_AP_ENTRY_CMD_API_S_VER_3 */ /** + * enum iwl_location_frame_format - location frame formats + * @IWL_LOCATION_FRAME_FORMAT_LEGACY: legacy + * @IWL_LOCATION_FRAME_FORMAT_HT: HT + * @IWL_LOCATION_FRAME_FORMAT_VHT: VHT + * @IWL_LOCATION_FRAME_FORMAT_HE: HE + */ +enum iwl_location_frame_format { + IWL_LOCATION_FRAME_FORMAT_LEGACY, + IWL_LOCATION_FRAME_FORMAT_HT, + IWL_LOCATION_FRAME_FORMAT_VHT, + IWL_LOCATION_FRAME_FORMAT_HE, +}; + +/** + * enum iwl_location_bw - location bandwidth selection + * @IWL_LOCATION_BW_20MHZ: 20MHz + * @IWL_LOCATION_BW_40MHZ: 40MHz + * @IWL_LOCATION_BW_80MHZ: 80MHz + */ +enum iwl_location_bw { + IWL_LOCATION_BW_20MHZ, + IWL_LOCATION_BW_40MHZ, + IWL_LOCATION_BW_80MHZ, +}; + +#define HLTK_11AZ_LEN 32 +#define TK_11AZ_LEN 32 + +#define LOCATION_BW_POS 4 + +/** + * struct iwl_tof_range_req_ap_entry - AP configuration parameters + * @initiator_ap_flags: see &enum iwl_initiator_ap_flags. + * @channel_num: AP Channel number + * @format_bw: bits 0 - 3: &enum iwl_location_frame_format. + * bits 4 - 7: &enum iwl_location_bw. + * @ctrl_ch_position: Coding of the control channel position relative to the + * center frequency, see iwl_mvm_get_ctrl_pos(). + * @ftmr_max_retries: Max number of retries to send the FTMR in case of no + * reply from the AP. + * @bssid: AP's BSSID + * @burst_period: Recommended value to be sent to the AP. Measurement + * periodicity In units of 100ms. ignored if num_of_bursts_exp = 0 + * @samples_per_burst: the number of FTMs pairs in single Burst (1-31); + * @num_of_bursts: Recommended value to be sent to the AP. 2s Exponent of + * the number of measurement iterations (min 2^0 = 1, max 2^14) + * @reserved: For alignment and future use + * @hltk: HLTK to be used for secured 11az measurement + * @tk: TK to be used for secured 11az measurement + */ +struct iwl_tof_range_req_ap_entry { + __le32 initiator_ap_flags; + u8 channel_num; + u8 format_bw; + u8 ctrl_ch_position; + u8 ftmr_max_retries; + u8 bssid[ETH_ALEN]; + __le16 burst_period; + u8 samples_per_burst; + u8 num_of_bursts; + __le16 reserved; + u8 hltk[HLTK_11AZ_LEN]; + u8 tk[TK_11AZ_LEN]; +} __packed; /* LOCATION_RANGE_REQ_AP_ENTRY_CMD_API_S_VER_4 */ + +/** * enum iwl_tof_response_mode * @IWL_MVM_TOF_RESPONSE_ASAP: report each AP measurement separately as soon as * possible (not supported for this release) @@ -536,6 +638,38 @@ struct iwl_tof_range_req_cmd_v5 { /* LOCATION_RANGE_REQ_CMD_API_S_VER_5 */ /** + * struct iwl_tof_range_req_cmd_v7 - start measurement cmd + * @initiator_flags: see flags @ iwl_tof_initiator_flags + * @request_id: A Token incremented per request. The same Token will be + * sent back in the range response + * @num_of_ap: Number of APs to measure (error if > IWL_MVM_TOF_MAX_APS) + * @range_req_bssid: ranging request BSSID + * @macaddr_mask: Bits set to 0 shall be copied from the MAC address template. + * Bits set to 1 shall be randomized by the UMAC + * @macaddr_template: MAC address template to use for non-randomized bits + * @req_timeout_ms: Requested timeout of the response in units of milliseconds. + * This is the session time for completing the measurement. + * @tsf_mac_id: report the measurement start time for each ap in terms of the + * TSF of this mac id. 0xff to disable TSF reporting. + * @common_calib: The common calib value to inject to this measurement calc + * @specific_calib: The specific calib value to inject to this measurement calc + * @ap: per-AP request data, see &struct iwl_tof_range_req_ap_entry_v2. + */ +struct iwl_tof_range_req_cmd_v7 { + __le32 initiator_flags; + u8 request_id; + u8 num_of_ap; + u8 range_req_bssid[ETH_ALEN]; + u8 macaddr_mask[ETH_ALEN]; + u8 macaddr_template[ETH_ALEN]; + __le32 req_timeout_ms; + __le32 tsf_mac_id; + __le16 common_calib; + __le16 specific_calib; + struct iwl_tof_range_req_ap_entry_v3 ap[IWL_MVM_TOF_MAX_APS]; +} __packed; /* LOCATION_RANGE_REQ_CMD_API_S_VER_7 */ + +/** * struct iwl_tof_range_req_cmd - start measurement cmd * @initiator_flags: see flags @ iwl_tof_initiator_flags * @request_id: A Token incremented per request. The same Token will be @@ -565,7 +699,7 @@ struct iwl_tof_range_req_cmd { __le16 common_calib; __le16 specific_calib; struct iwl_tof_range_req_ap_entry ap[IWL_MVM_TOF_MAX_APS]; -} __packed; /* LOCATION_RANGE_REQ_CMD_API_S_VER_7 */ +} __packed; /* LOCATION_RANGE_REQ_CMD_API_S_VER_8 */ /* * enum iwl_tof_range_request_status - status of the sent request diff --git a/drivers/net/wireless/intel/iwlwifi/fw/api/scan.h b/drivers/net/wireless/intel/iwlwifi/fw/api/scan.h index 408798f351c6..1b2b5fa56e19 100644 --- a/drivers/net/wireless/intel/iwlwifi/fw/api/scan.h +++ b/drivers/net/wireless/intel/iwlwifi/fw/api/scan.h @@ -922,21 +922,6 @@ struct iwl_scan_probe_params_v4 { #define SCAN_MAX_NUM_CHANS_V3 67 /** - * struct iwl_scan_channel_params_v3 - * @flags: channel flags &enum iwl_scan_channel_flags - * @count: num of channels in scan request - * @reserved: for future use and alignment - * @channel_config: array of explicit channel configurations - * for 2.4Ghz and 5.2Ghz bands - */ -struct iwl_scan_channel_params_v3 { - u8 flags; - u8 count; - __le16 reserved; - struct iwl_scan_channel_cfg_umac channel_config[SCAN_MAX_NUM_CHANS_V3]; -} __packed; /* SCAN_CHANNEL_PARAMS_API_S_VER_3 */ - -/** * struct iwl_scan_channel_params_v4 * @flags: channel flags &enum iwl_scan_channel_flags * @count: num of channels in scan request @@ -1011,20 +996,6 @@ struct iwl_scan_periodic_parms_v1 { } __packed; /* SCAN_PERIODIC_PARAMS_API_S_VER_1 */ /** - * struct iwl_scan_req_params_v11 - * @general_params: &struct iwl_scan_general_params_v10 - * @channel_params: &struct iwl_scan_channel_params_v3 - * @periodic_params: &struct iwl_scan_periodic_parms_v1 - * @probe_params: &struct iwl_scan_probe_params_v3 - */ -struct iwl_scan_req_params_v11 { - struct iwl_scan_general_params_v10 general_params; - struct iwl_scan_channel_params_v3 channel_params; - struct iwl_scan_periodic_parms_v1 periodic_params; - struct iwl_scan_probe_params_v3 probe_params; -} __packed; /* SCAN_REQUEST_PARAMS_API_S_VER_11 */ - -/** * struct iwl_scan_req_params_v12 * @general_params: &struct iwl_scan_general_params_v10 * @channel_params: &struct iwl_scan_channel_params_v4 @@ -1053,18 +1024,6 @@ struct iwl_scan_req_params_v13 { } __packed; /* SCAN_REQUEST_PARAMS_API_S_VER_13 */ /** - * struct iwl_scan_req_umac_v11 - * @uid: scan id, &enum iwl_umac_scan_uid_offsets - * @ooc_priority: out of channel priority - &enum iwl_scan_priority - * @scan_params: scan parameters - */ -struct iwl_scan_req_umac_v11 { - __le32 uid; - __le32 ooc_priority; - struct iwl_scan_req_params_v11 scan_params; -} __packed; /* SCAN_REQUEST_CMD_UMAC_API_S_VER_11 */ - -/** * struct iwl_scan_req_umac_v12 * @uid: scan id, &enum iwl_umac_scan_uid_offsets * @ooc_priority: out of channel priority - &enum iwl_scan_priority diff --git a/drivers/net/wireless/intel/iwlwifi/fw/api/tx.h b/drivers/net/wireless/intel/iwlwifi/fw/api/tx.h index f89a9e16a8c0..f1d1fe96fecc 100644 --- a/drivers/net/wireless/intel/iwlwifi/fw/api/tx.h +++ b/drivers/net/wireless/intel/iwlwifi/fw/api/tx.h @@ -813,6 +813,7 @@ enum iwl_mac_beacon_flags { IWL_MAC_BEACON_ANT_A = BIT(9), IWL_MAC_BEACON_ANT_B = BIT(10), IWL_MAC_BEACON_ANT_C = BIT(11), + IWL_MAC_BEACON_FILS = BIT(12), }; /** @@ -820,6 +821,7 @@ enum iwl_mac_beacon_flags { * @byte_cnt: byte count of the beacon frame. * @flags: least significant byte for rate code. The most significant byte * is &enum iwl_mac_beacon_flags. + * @short_ssid: Short SSID * @reserved: reserved * @template_id: currently equal to the mac context id of the coresponding mac. * @tim_idx: the offset of the tim IE in the beacon @@ -831,14 +833,15 @@ enum iwl_mac_beacon_flags { struct iwl_mac_beacon_cmd { __le16 byte_cnt; __le16 flags; - __le64 reserved; + __le32 short_ssid; + __le32 reserved; __le32 template_id; __le32 tim_idx; __le32 tim_size; __le32 ecsa_offset; __le32 csa_offset; struct ieee80211_hdr frame[0]; -} __packed; /* BEACON_TEMPLATE_CMD_API_S_VER_9 */ +} __packed; /* BEACON_TEMPLATE_CMD_API_S_VER_10 */ struct iwl_beacon_notif { struct iwl_mvm_tx_resp beacon_notify_hdr; diff --git a/drivers/net/wireless/intel/iwlwifi/fw/dbg.c b/drivers/net/wireless/intel/iwlwifi/fw/dbg.c index ed90dd104366..91df1ee25dd0 100644 --- a/drivers/net/wireless/intel/iwlwifi/fw/dbg.c +++ b/drivers/net/wireless/intel/iwlwifi/fw/dbg.c @@ -929,7 +929,7 @@ iwl_fw_error_dump_file(struct iwl_fw_runtime *fwrt, cpu_to_le32(CSR_HW_REV_STEP(fwrt->trans->hw_rev)); memcpy(dump_info->fw_human_readable, fwrt->fw->human_readable, sizeof(dump_info->fw_human_readable)); - strncpy(dump_info->dev_human_readable, fwrt->trans->cfg->name, + strncpy(dump_info->dev_human_readable, fwrt->trans->name, sizeof(dump_info->dev_human_readable) - 1); strncpy(dump_info->bus_human_readable, fwrt->dev->bus->name, sizeof(dump_info->bus_human_readable) - 1); @@ -1230,13 +1230,15 @@ static bool iwl_ini_txf_iter(struct iwl_fw_runtime *fwrt, iter->lmac = 0; } - if (!iter->internal_txf) + if (!iter->internal_txf) { for (iter->fifo++; iter->fifo < txf_num; iter->fifo++) { iter->fifo_size = cfg->lmac[iter->lmac].txfifo_size[iter->fifo]; if (iter->fifo_size && (lmac_bitmap & BIT(iter->fifo))) return true; } + iter->fifo--; + } iter->internal_txf = 1; @@ -2351,9 +2353,6 @@ int iwl_fw_dbg_ini_collect(struct iwl_fw_runtime *fwrt, u32 occur, delay; unsigned long idx; - if (test_bit(STATUS_GEN_ACTIVE_TRIGS, &fwrt->status)) - return -EBUSY; - if (!iwl_fw_ini_trigger_on(fwrt, trig)) { IWL_WARN(fwrt, "WRT: Trigger %d is not active, aborting dump\n", tp_id); @@ -2669,12 +2668,7 @@ int iwl_fw_dbg_stop_restart_recording(struct iwl_fw_runtime *fwrt, { int ret = 0; - /* if the FW crashed or not debug monitor cfg was given, there is - * no point in changing the recording state - */ - if (test_bit(STATUS_FW_ERROR, &fwrt->trans->status) || - (!fwrt->trans->dbg.dest_tlv && - fwrt->trans->dbg.ini_dest == IWL_FW_INI_LOCATION_INVALID)) + if (test_bit(STATUS_FW_ERROR, &fwrt->trans->status)) return 0; if (fw_has_capa(&fwrt->fw->ucode_capa, diff --git a/drivers/net/wireless/intel/iwlwifi/fw/debugfs.c b/drivers/net/wireless/intel/iwlwifi/fw/debugfs.c index ca3b1a461dea..89f74116569d 100644 --- a/drivers/net/wireless/intel/iwlwifi/fw/debugfs.c +++ b/drivers/net/wireless/intel/iwlwifi/fw/debugfs.c @@ -320,31 +320,6 @@ out: FWRT_DEBUGFS_WRITE_FILE_OPS(send_hcmd, 512); -static ssize_t iwl_dbgfs_fw_dbg_domain_write(struct iwl_fw_runtime *fwrt, - char *buf, size_t count) -{ - u32 new_domain; - int ret; - - if (!iwl_trans_fw_running(fwrt->trans)) - return -EIO; - - ret = kstrtou32(buf, 0, &new_domain); - if (ret) - return ret; - - if (new_domain != fwrt->trans->dbg.domains_bitmap) { - ret = iwl_dbg_tlv_gen_active_trigs(fwrt, new_domain); - if (ret) - return ret; - - iwl_dbg_tlv_time_point(fwrt, IWL_FW_INI_TIME_POINT_PERIODIC, - NULL); - } - - return count; -} - static ssize_t iwl_dbgfs_fw_dbg_domain_read(struct iwl_fw_runtime *fwrt, size_t size, char *buf) { @@ -352,7 +327,7 @@ static ssize_t iwl_dbgfs_fw_dbg_domain_read(struct iwl_fw_runtime *fwrt, fwrt->trans->dbg.domains_bitmap); } -FWRT_DEBUGFS_READ_WRITE_FILE_OPS(fw_dbg_domain, 20); +FWRT_DEBUGFS_READ_FILE_OPS(fw_dbg_domain, 20); void iwl_fwrt_dbgfs_register(struct iwl_fw_runtime *fwrt, struct dentry *dbgfs_dir) @@ -360,5 +335,5 @@ void iwl_fwrt_dbgfs_register(struct iwl_fw_runtime *fwrt, INIT_DELAYED_WORK(&fwrt->timestamp.wk, iwl_fw_timestamp_marker_wk); FWRT_DEBUGFS_ADD_FILE(timestamp_marker, dbgfs_dir, 0200); FWRT_DEBUGFS_ADD_FILE(send_hcmd, dbgfs_dir, 0200); - FWRT_DEBUGFS_ADD_FILE(fw_dbg_domain, dbgfs_dir, 0600); + FWRT_DEBUGFS_ADD_FILE(fw_dbg_domain, dbgfs_dir, 0400); } diff --git a/drivers/net/wireless/intel/iwlwifi/fw/img.h b/drivers/net/wireless/intel/iwlwifi/fw/img.h index 994880a83652..90ca5f929cf9 100644 --- a/drivers/net/wireless/intel/iwlwifi/fw/img.h +++ b/drivers/net/wireless/intel/iwlwifi/fw/img.h @@ -251,7 +251,7 @@ struct iwl_fw_dbg { struct iwl_fw { u32 ucode_ver; - char fw_version[ETHTOOL_FWVERS_LEN]; + char fw_version[64]; /* ucode images */ struct fw_img img[IWL_UCODE_TYPE_MAX]; diff --git a/drivers/net/wireless/intel/iwlwifi/fw/runtime.h b/drivers/net/wireless/intel/iwlwifi/fw/runtime.h index c24575ff0e54..f8c6ed823bc5 100644 --- a/drivers/net/wireless/intel/iwlwifi/fw/runtime.h +++ b/drivers/net/wireless/intel/iwlwifi/fw/runtime.h @@ -69,7 +69,7 @@ #include "iwl-eeprom-parse.h" #include "fw/acpi.h" -#define IWL_FW_DBG_DOMAIN IWL_FW_INI_DOMAIN_ALWAYS_ON +#define IWL_FW_DBG_DOMAIN IWL_TRANS_FW_DBG_DOMAIN(fwrt->trans) struct iwl_fw_runtime_ops { int (*dump_start)(void *ctx); @@ -130,14 +130,6 @@ struct iwl_txf_iter_data { }; /** - * enum iwl_fw_runtime_status - fw runtime status flags - * @STATUS_GEN_ACTIVE_TRIGS: generating active trigger list - */ -enum iwl_fw_runtime_status { - STATUS_GEN_ACTIVE_TRIGS, -}; - -/** * struct iwl_fw_runtime - runtime data for firmware * @fw: firmware image * @cfg: NIC configuration @@ -150,7 +142,6 @@ enum iwl_fw_runtime_status { * @smem_cfg: saved firmware SMEM configuration * @cur_fw_img: current firmware image, must be maintained by * the driver by calling &iwl_fw_set_current_image() - * @status: &enum iwl_fw_runtime_status * @dump: debug dump data */ struct iwl_fw_runtime { @@ -171,8 +162,6 @@ struct iwl_fw_runtime { /* memory configuration */ struct iwl_fwrt_shared_mem_cfg smem_cfg; - unsigned long status; - /* debug */ struct { const struct iwl_fw_dump_desc *desc; diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-config.h b/drivers/net/wireless/intel/iwlwifi/iwl-config.h index 317eac066082..be6a2bf9ce74 100644 --- a/drivers/net/wireless/intel/iwlwifi/iwl-config.h +++ b/drivers/net/wireless/intel/iwlwifi/iwl-config.h @@ -285,52 +285,6 @@ struct iwl_pwr_tx_backoff { }; /** - * struct iwl_csr_params - * - * @flag_sw_reset: reset the device - * @flag_mac_clock_ready: - * Indicates MAC (ucode processor, etc.) is powered up and can run. - * Internal resources are accessible. - * NOTE: This does not indicate that the processor is actually running. - * NOTE: This does not indicate that device has completed - * init or post-power-down restore of internal SRAM memory. - * Use CSR_UCODE_DRV_GP1_BIT_MAC_SLEEP as indication that - * SRAM is restored and uCode is in normal operation mode. - * This note is relevant only for pre 5xxx devices. - * NOTE: After device reset, this bit remains "0" until host sets - * INIT_DONE - * @flag_init_done: Host sets this to put device into fully operational - * D0 power mode. Host resets this after SW_RESET to put device into - * low power mode. - * @flag_mac_access_req: Host sets this to request and maintain MAC wakeup, - * to allow host access to device-internal resources. Host must wait for - * mac_clock_ready (and !GOING_TO_SLEEP) before accessing non-CSR device - * registers. - * @flag_val_mac_access_en: mac access is enabled - * @flag_master_dis: disable master - * @flag_stop_master: stop master - * @addr_sw_reset: address for resetting the device - * @mac_addr0_otp: first part of MAC address from OTP - * @mac_addr1_otp: second part of MAC address from OTP - * @mac_addr0_strap: first part of MAC address from strap - * @mac_addr1_strap: second part of MAC address from strap - */ -struct iwl_csr_params { - u8 flag_sw_reset; - u8 flag_mac_clock_ready; - u8 flag_init_done; - u8 flag_mac_access_req; - u8 flag_val_mac_access_en; - u8 flag_master_dis; - u8 flag_stop_master; - u8 addr_sw_reset; - u32 mac_addr0_otp; - u32 mac_addr1_otp; - u32 mac_addr0_strap; - u32 mac_addr1_strap; -}; - -/** * struct iwl_cfg_trans - information needed to start the trans * * These values cannot be changed when multiple configs are used for a @@ -348,7 +302,6 @@ struct iwl_csr_params { */ struct iwl_cfg_trans_params { const struct iwl_base_params *base_params; - const struct iwl_csr_params *csr; enum iwl_device_family device_family; u32 umac_prph_offset; u32 rf_id:1, @@ -431,6 +384,8 @@ struct iwl_fw_mon_regs { * @uhb_supported: ultra high band channels supported * @min_256_ba_txq_size: minimum number of slots required in a TX queue which * supports 256 BA aggregation + * @num_rbds: number of receive buffer descriptors to use + * (only used for multi-queue capable devices) * * We enable the driver to be backward compatible wrt. hardware features. * API differences in uCode shouldn't be handled here but through TLVs @@ -485,6 +440,7 @@ struct iwl_cfg { u8 max_vht_ampdu_exponent; u8 ucode_api_max; u8 ucode_api_min; + u16 num_rbds; u32 min_umac_error_event_table; u32 extra_phy_cfg_flags; u32 d3_debug_data_base_addr; @@ -496,12 +452,22 @@ struct iwl_cfg { const struct iwl_fw_mon_regs mon_smem_regs; }; -extern const struct iwl_csr_params iwl_csr_v1; -extern const struct iwl_csr_params iwl_csr_v2; +#define IWL_CFG_ANY (~0) + +struct iwl_dev_info { + u16 device; + u16 subdevice; + const struct iwl_cfg *cfg; + const char *name; +}; /* * This list declares the config structures for all devices. */ +extern const struct iwl_cfg_trans_params iwl9000_trans_cfg; +extern const char iwl9260_160_name[]; +extern const char iwl9560_160_name[]; + #if IS_ENABLED(CONFIG_IWLDVM) extern const struct iwl_cfg iwl5300_agn_cfg; extern const struct iwl_cfg iwl5100_agn_cfg; @@ -595,9 +561,6 @@ extern const struct iwl_cfg iwl9560_2ac_cfg_shared_clk; extern const struct iwl_cfg iwl9560_2ac_160_cfg_shared_clk; extern const struct iwl_cfg iwl9560_killer_2ac_cfg_shared_clk; extern const struct iwl_cfg iwl9560_killer_s_2ac_cfg_shared_clk; -extern const struct iwl_cfg iwl22000_2ac_cfg_hr; -extern const struct iwl_cfg iwl22000_2ac_cfg_hr_cdb; -extern const struct iwl_cfg iwl22000_2ac_cfg_jf; extern const struct iwl_cfg iwl_ax101_cfg_qu_hr; extern const struct iwl_cfg iwl_ax101_cfg_qu_c0_hr_b0; extern const struct iwl_cfg iwl_ax101_cfg_quz_hr; @@ -636,6 +599,7 @@ extern const struct iwl_cfg iwlax210_2ax_cfg_so_hr_a0; extern const struct iwl_cfg iwlax211_2ax_cfg_so_gf_a0; extern const struct iwl_cfg iwlax210_2ax_cfg_ty_gf_a0; extern const struct iwl_cfg iwlax411_2ax_cfg_so_gf4_a0; +extern const struct iwl_cfg iwlax411_2ax_cfg_sosnj_gf4_a0; #endif /* CPTCFG_IWLMVM || CPTCFG_IWLFMAC */ #endif /* __IWL_CONFIG_H__ */ diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-context-info.h b/drivers/net/wireless/intel/iwlwifi/iwl-context-info.h index 5ed07e37e3ee..eeaa8cbdddce 100644 --- a/drivers/net/wireless/intel/iwlwifi/iwl-context-info.h +++ b/drivers/net/wireless/intel/iwlwifi/iwl-context-info.h @@ -6,7 +6,7 @@ * GPL LICENSE SUMMARY * * Copyright(c) 2017 Intel Deutschland GmbH - * Copyright(c) 2018 Intel Corporation + * Copyright(c) 2018 - 2019 Intel Corporation * * This program is free software; you can redistribute it and/or modify * it under the terms of version 2 of the GNU General Public License as @@ -20,7 +20,7 @@ * BSD LICENSE * * Copyright(c) 2017 Intel Deutschland GmbH - * Copyright(c) 2018 Intel Corporation + * Copyright(c) 2018 - 2019 Intel Corporation * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -64,12 +64,12 @@ * the init done for driver command that configures several system modes * @IWL_CTXT_INFO_EARLY_DEBUG: enable early debug * @IWL_CTXT_INFO_ENABLE_CDMP: enable core dump - * @IWL_CTXT_INFO_RB_CB_SIZE_POS: position of the RBD Cyclic Buffer Size + * @IWL_CTXT_INFO_RB_CB_SIZE: mask of the RBD Cyclic Buffer Size * exponent, the actual size is 2**value, valid sizes are 8-2048. * The value is four bits long. Maximum valid exponent is 12 * @IWL_CTXT_INFO_TFD_FORMAT_LONG: use long TFD Format (the * default is short format - not supported by the driver) - * @IWL_CTXT_INFO_RB_SIZE_POS: RB size position + * @IWL_CTXT_INFO_RB_SIZE: RB size mask * (values are IWL_CTXT_INFO_RB_SIZE_*K) * @IWL_CTXT_INFO_RB_SIZE_1K: Value for 1K RB size * @IWL_CTXT_INFO_RB_SIZE_2K: Value for 2K RB size @@ -83,12 +83,12 @@ * @IWL_CTXT_INFO_RB_SIZE_32K: Value for 32K RB size */ enum iwl_context_info_flags { - IWL_CTXT_INFO_AUTO_FUNC_INIT = BIT(0), - IWL_CTXT_INFO_EARLY_DEBUG = BIT(1), - IWL_CTXT_INFO_ENABLE_CDMP = BIT(2), - IWL_CTXT_INFO_RB_CB_SIZE_POS = 4, - IWL_CTXT_INFO_TFD_FORMAT_LONG = BIT(8), - IWL_CTXT_INFO_RB_SIZE_POS = 9, + IWL_CTXT_INFO_AUTO_FUNC_INIT = 0x0001, + IWL_CTXT_INFO_EARLY_DEBUG = 0x0002, + IWL_CTXT_INFO_ENABLE_CDMP = 0x0004, + IWL_CTXT_INFO_RB_CB_SIZE = 0x00f0, + IWL_CTXT_INFO_TFD_FORMAT_LONG = 0x0100, + IWL_CTXT_INFO_RB_SIZE = 0x1e00, IWL_CTXT_INFO_RB_SIZE_1K = 0x1, IWL_CTXT_INFO_RB_SIZE_2K = 0x2, IWL_CTXT_INFO_RB_SIZE_4K = 0x4, diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-csr.h b/drivers/net/wireless/intel/iwlwifi/iwl-csr.h index 92d9898ab7c2..cb9e8e189a1a 100644 --- a/drivers/net/wireless/intel/iwlwifi/iwl-csr.h +++ b/drivers/net/wireless/intel/iwlwifi/iwl-csr.h @@ -256,6 +256,7 @@ /* RESET */ #define CSR_RESET_REG_FLAG_NEVO_RESET (0x00000001) #define CSR_RESET_REG_FLAG_FORCE_NMI (0x00000002) +#define CSR_RESET_REG_FLAG_SW_RESET (0x00000080) #define CSR_RESET_REG_FLAG_MASTER_DISABLED (0x00000100) #define CSR_RESET_REG_FLAG_STOP_MASTER (0x00000200) #define CSR_RESET_LINK_PWR_MGMT_DISABLED (0x80000000) @@ -278,11 +279,35 @@ * 4: GOING_TO_SLEEP * Indicates MAC is entering a power-saving sleep power-down. * Not a good time to access device-internal resources. + * 3: MAC_ACCESS_REQ + * Host sets this to request and maintain MAC wakeup, to allow host + * access to device-internal resources. Host must wait for + * MAC_CLOCK_READY (and !GOING_TO_SLEEP) before accessing non-CSR + * device registers. + * 2: INIT_DONE + * Host sets this to put device into fully operational D0 power mode. + * Host resets this after SW_RESET to put device into low power mode. + * 0: MAC_CLOCK_READY + * Indicates MAC (ucode processor, etc.) is powered up and can run. + * Internal resources are accessible. + * NOTE: This does not indicate that the processor is actually running. + * NOTE: This does not indicate that device has completed + * init or post-power-down restore of internal SRAM memory. + * Use CSR_UCODE_DRV_GP1_BIT_MAC_SLEEP as indication that + * SRAM is restored and uCode is in normal operation mode. + * Later devices (5xxx/6xxx/1xxx) use non-volatile SRAM, and + * do not need to save/restore it. + * NOTE: After device reset, this bit remains "0" until host sets + * INIT_DONE */ +#define CSR_GP_CNTRL_REG_FLAG_MAC_CLOCK_READY (0x00000001) #define CSR_GP_CNTRL_REG_FLAG_INIT_DONE (0x00000004) -#define CSR_GP_CNTRL_REG_FLAG_GOING_TO_SLEEP (0x00000010) +#define CSR_GP_CNTRL_REG_FLAG_MAC_ACCESS_REQ (0x00000008) +#define CSR_GP_CNTRL_REG_FLAG_GOING_TO_SLEEP (0x00000010) #define CSR_GP_CNTRL_REG_FLAG_XTAL_ON (0x00000400) +#define CSR_GP_CNTRL_REG_VAL_MAC_ACCESS_EN (0x00000001) + #define CSR_GP_CNTRL_REG_MSK_POWER_SAVE_TYPE (0x07000000) #define CSR_GP_CNTRL_REG_FLAG_RFKILL_WAKE_L1A_EN (0x04000000) #define CSR_GP_CNTRL_REG_FLAG_HW_RF_KILL_SW (0x08000000) @@ -379,7 +404,7 @@ enum { /* CSR GIO */ -#define CSR_GIO_REG_VAL_L0S_ENABLED (0x00000002) +#define CSR_GIO_REG_VAL_L0S_DISABLED (0x00000002) /* * UCODE-DRIVER GP (general purpose) mailbox register 1 diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-dbg-tlv.c b/drivers/net/wireless/intel/iwlwifi/iwl-dbg-tlv.c index f266647dc08c..4208e720f6e6 100644 --- a/drivers/net/wireless/intel/iwlwifi/iwl-dbg-tlv.c +++ b/drivers/net/wireless/intel/iwlwifi/iwl-dbg-tlv.c @@ -290,10 +290,19 @@ void iwl_dbg_tlv_alloc(struct iwl_trans *trans, struct iwl_ucode_tlv *tlv, struct iwl_fw_ini_header *hdr = (void *)&tlv->data[0]; u32 type = le32_to_cpu(tlv->type); u32 tlv_idx = type - IWL_UCODE_TLV_DEBUG_BASE; + u32 domain = le32_to_cpu(hdr->domain); enum iwl_ini_cfg_state *cfg_state = ext ? &trans->dbg.external_ini_cfg : &trans->dbg.internal_ini_cfg; int ret; + if (domain != IWL_FW_INI_DOMAIN_ALWAYS_ON && + !(domain & trans->dbg.domains_bitmap)) { + IWL_DEBUG_FW(trans, + "WRT: Skipping TLV with disabled domain 0x%0x (0x%0x)\n", + domain, trans->dbg.domains_bitmap); + return; + } + if (tlv_idx >= ARRAY_SIZE(dbg_tlv_alloc) || !dbg_tlv_alloc[tlv_idx]) { IWL_ERR(trans, "WRT: Unsupported TLV type 0x%x\n", type); goto out_err; @@ -480,7 +489,14 @@ static int iwl_dbg_tlv_alloc_fragment(struct iwl_fw_runtime *fwrt, if (!frag || frag->size || !pages) return -EIO; - while (pages) { + /* + * We try to allocate as many pages as we can, starting with + * the requested amount and going down until we can allocate + * something. Because of DIV_ROUND_UP(), pages will never go + * down to 0 and stop the loop, so stop when pages reaches 1, + * which is too small anyway. + */ + while (pages > 1) { block = dma_alloc_coherent(fwrt->dev, pages * PAGE_SIZE, &physical, GFP_KERNEL | __GFP_NOWARN); @@ -660,7 +676,6 @@ static void iwl_dbg_tlv_send_hcmds(struct iwl_fw_runtime *fwrt, list_for_each_entry(node, hcmd_list, list) { struct iwl_fw_ini_hcmd_tlv *hcmd = (void *)node->tlv.data; struct iwl_fw_ini_hcmd *hcmd_data = &hcmd->hcmd; - u32 domain = le32_to_cpu(hcmd->hdr.domain); u16 hcmd_len = le32_to_cpu(node->tlv.length) - sizeof(*hcmd); struct iwl_host_cmd cmd = { .id = WIDE_ID(hcmd_data->group, hcmd_data->id), @@ -668,10 +683,6 @@ static void iwl_dbg_tlv_send_hcmds(struct iwl_fw_runtime *fwrt, .data = { hcmd_data->data, }, }; - if (domain != IWL_FW_INI_DOMAIN_ALWAYS_ON && - !(domain & fwrt->trans->dbg.domains_bitmap)) - continue; - iwl_trans_send_cmd(fwrt->trans, &cmd); } } @@ -891,55 +902,17 @@ static void iwl_dbg_tlv_gen_active_trig_list(struct iwl_fw_runtime *fwrt, struct iwl_dbg_tlv_time_point_data *tp) { - struct iwl_dbg_tlv_node *node, *tmp; + struct iwl_dbg_tlv_node *node; struct list_head *trig_list = &tp->trig_list; struct list_head *active_trig_list = &tp->active_trig_list; - list_for_each_entry_safe(node, tmp, active_trig_list, list) { - list_del(&node->list); - kfree(node); - } - list_for_each_entry(node, trig_list, list) { struct iwl_ucode_tlv *tlv = &node->tlv; - struct iwl_fw_ini_trigger_tlv *trig = (void *)tlv->data; - u32 domain = le32_to_cpu(trig->hdr.domain); - - if (domain != IWL_FW_INI_DOMAIN_ALWAYS_ON && - !(domain & fwrt->trans->dbg.domains_bitmap)) - continue; iwl_dbg_tlv_add_active_trigger(fwrt, active_trig_list, tlv); } } -int iwl_dbg_tlv_gen_active_trigs(struct iwl_fw_runtime *fwrt, u32 new_domain) -{ - int i; - - if (test_and_set_bit(STATUS_GEN_ACTIVE_TRIGS, &fwrt->status)) - return -EBUSY; - - iwl_fw_flush_dumps(fwrt); - - fwrt->trans->dbg.domains_bitmap = new_domain; - - IWL_DEBUG_FW(fwrt, - "WRT: Generating active triggers list, domain 0x%x\n", - fwrt->trans->dbg.domains_bitmap); - - for (i = 0; i < ARRAY_SIZE(fwrt->trans->dbg.time_point); i++) { - struct iwl_dbg_tlv_time_point_data *tp = - &fwrt->trans->dbg.time_point[i]; - - iwl_dbg_tlv_gen_active_trig_list(fwrt, tp); - } - - clear_bit(STATUS_GEN_ACTIVE_TRIGS, &fwrt->status); - - return 0; -} - static bool iwl_dbg_tlv_check_fw_pkt(struct iwl_fw_runtime *fwrt, struct iwl_fwrt_dump_data *dump_data, union iwl_dbg_tlv_tp_data *tp_data, @@ -1013,7 +986,16 @@ static void iwl_dbg_tlv_init_cfg(struct iwl_fw_runtime *fwrt) enum iwl_fw_ini_buffer_location *ini_dest = &fwrt->trans->dbg.ini_dest; int ret, i; - iwl_dbg_tlv_gen_active_trigs(fwrt, IWL_FW_DBG_DOMAIN); + IWL_DEBUG_FW(fwrt, + "WRT: Generating active triggers list, domain 0x%x\n", + fwrt->trans->dbg.domains_bitmap); + + for (i = 0; i < ARRAY_SIZE(fwrt->trans->dbg.time_point); i++) { + struct iwl_dbg_tlv_time_point_data *tp = + &fwrt->trans->dbg.time_point[i]; + + iwl_dbg_tlv_gen_active_trig_list(fwrt, tp); + } *ini_dest = IWL_FW_INI_LOCATION_INVALID; for (i = 0; i < IWL_FW_INI_ALLOCATION_NUM; i++) { diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-dbg-tlv.h b/drivers/net/wireless/intel/iwlwifi/iwl-dbg-tlv.h index f18946872569..1360676b3b21 100644 --- a/drivers/net/wireless/intel/iwlwifi/iwl-dbg-tlv.h +++ b/drivers/net/wireless/intel/iwlwifi/iwl-dbg-tlv.h @@ -105,7 +105,6 @@ void iwl_dbg_tlv_init(struct iwl_trans *trans); void iwl_dbg_tlv_time_point(struct iwl_fw_runtime *fwrt, enum iwl_fw_ini_time_point tp_id, union iwl_dbg_tlv_tp_data *tp_data); -int iwl_dbg_tlv_gen_active_trigs(struct iwl_fw_runtime *fwrt, u32 new_domain); void iwl_dbg_tlv_del_timers(struct iwl_trans *trans); #endif /* __iwl_dbg_tlv_h__*/ diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-drv.c b/drivers/net/wireless/intel/iwlwifi/iwl-drv.c index 4096ccf58b07..2d1cb4647c3b 100644 --- a/drivers/net/wireless/intel/iwlwifi/iwl-drv.c +++ b/drivers/net/wireless/intel/iwlwifi/iwl-drv.c @@ -493,6 +493,16 @@ static void iwl_set_ucode_capabilities(struct iwl_drv *drv, const u8 *data, } } +static const char *iwl_reduced_fw_name(struct iwl_drv *drv) +{ + const char *name = drv->firmware_name; + + if (strncmp(name, "iwlwifi-", 8) == 0) + name += 8; + + return name; +} + static int iwl_parse_v1_v2_firmware(struct iwl_drv *drv, const struct firmware *ucode_raw, struct iwl_firmware_pieces *pieces) @@ -551,12 +561,12 @@ static int iwl_parse_v1_v2_firmware(struct iwl_drv *drv, snprintf(drv->fw.fw_version, sizeof(drv->fw.fw_version), - "%u.%u.%u.%u%s", + "%u.%u.%u.%u%s %s", IWL_UCODE_MAJOR(drv->fw.ucode_ver), IWL_UCODE_MINOR(drv->fw.ucode_ver), IWL_UCODE_API(drv->fw.ucode_ver), IWL_UCODE_SERIAL(drv->fw.ucode_ver), - buildstr); + buildstr, iwl_reduced_fw_name(drv)); /* Verify size of file vs. image size info in file's header */ @@ -636,12 +646,12 @@ static int iwl_parse_tlv_firmware(struct iwl_drv *drv, snprintf(drv->fw.fw_version, sizeof(drv->fw.fw_version), - "%u.%u.%u.%u%s", + "%u.%u.%u.%u%s %s", IWL_UCODE_MAJOR(drv->fw.ucode_ver), IWL_UCODE_MINOR(drv->fw.ucode_ver), IWL_UCODE_API(drv->fw.ucode_ver), IWL_UCODE_SERIAL(drv->fw.ucode_ver), - buildstr); + buildstr, iwl_reduced_fw_name(drv)); data = ucode->data; @@ -895,11 +905,13 @@ static int iwl_parse_tlv_firmware(struct iwl_drv *drv, if (major >= 35) snprintf(drv->fw.fw_version, sizeof(drv->fw.fw_version), - "%u.%08x.%u", major, minor, local_comp); + "%u.%08x.%u %s", major, minor, + local_comp, iwl_reduced_fw_name(drv)); else snprintf(drv->fw.fw_version, sizeof(drv->fw.fw_version), - "%u.%u.%u", major, minor, local_comp); + "%u.%u.%u %s", major, minor, + local_comp, iwl_reduced_fw_name(drv)); break; } case IWL_UCODE_TLV_FW_DBG_DEST: { @@ -1647,6 +1659,8 @@ struct iwl_drv *iwl_drv_start(struct iwl_trans *trans) drv->trans->dbgfs_dir = debugfs_create_dir("trans", drv->dbgfs_drv); #endif + drv->trans->dbg.domains_bitmap = IWL_TRANS_FW_DBG_DOMAIN(drv->trans); + ret = iwl_request_firmware(drv, true); if (ret) { IWL_ERR(trans, "Couldn't request the fw\n"); @@ -1817,9 +1831,6 @@ MODULE_PARM_DESC(antenna_coupling, module_param_named(nvm_file, iwlwifi_mod_params.nvm_file, charp, 0444); MODULE_PARM_DESC(nvm_file, "NVM file name"); -module_param_named(lar_disable, iwlwifi_mod_params.lar_disable, bool, 0444); -MODULE_PARM_DESC(lar_disable, "disable LAR functionality (default: N)"); - module_param_named(uapsd_disable, iwlwifi_mod_params.uapsd_disable, uint, 0644); MODULE_PARM_DESC(uapsd_disable, "disable U-APSD functionality bitmap 1: BSS 2: P2P Client (default: 3)"); diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-fh.h b/drivers/net/wireless/intel/iwlwifi/iwl-fh.h index 8836f85afe85..bf673ce5f183 100644 --- a/drivers/net/wireless/intel/iwlwifi/iwl-fh.h +++ b/drivers/net/wireless/intel/iwlwifi/iwl-fh.h @@ -611,10 +611,7 @@ static inline unsigned int FH_MEM_CBBC_QUEUE(struct iwl_trans *trans, */ #define FH_TX_CHICKEN_BITS_SCD_AUTO_RETRY_EN (0x00000002) -#define MQ_RX_TABLE_SIZE 512 -#define MQ_RX_TABLE_MASK (MQ_RX_TABLE_SIZE - 1) -#define MQ_RX_NUM_RBDS (MQ_RX_TABLE_SIZE - 1) -#define RX_POOL_SIZE (MQ_RX_NUM_RBDS + \ +#define RX_POOL_SIZE(rbds) ((rbds) - 1 + \ IWL_MAX_RX_HW_QUEUES * \ (RX_CLAIM_REQ_ALLOC - RX_POST_REQ_ALLOC)) /* cb size is the exponent */ diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-io.c b/drivers/net/wireless/intel/iwlwifi/iwl-io.c index 1b7414bf7bef..2139f0b8f2bb 100644 --- a/drivers/net/wireless/intel/iwlwifi/iwl-io.c +++ b/drivers/net/wireless/intel/iwlwifi/iwl-io.c @@ -70,36 +70,6 @@ #include "iwl-prph.h" #include "iwl-fh.h" -const struct iwl_csr_params iwl_csr_v1 = { - .flag_mac_clock_ready = 0, - .flag_val_mac_access_en = 0, - .flag_init_done = 2, - .flag_mac_access_req = 3, - .flag_sw_reset = 7, - .flag_master_dis = 8, - .flag_stop_master = 9, - .addr_sw_reset = CSR_BASE + 0x020, - .mac_addr0_otp = 0x380, - .mac_addr1_otp = 0x384, - .mac_addr0_strap = 0x388, - .mac_addr1_strap = 0x38C -}; - -const struct iwl_csr_params iwl_csr_v2 = { - .flag_init_done = 6, - .flag_mac_clock_ready = 20, - .flag_val_mac_access_en = 20, - .flag_mac_access_req = 21, - .flag_master_dis = 28, - .flag_stop_master = 29, - .flag_sw_reset = 31, - .addr_sw_reset = CSR_BASE + 0x024, - .mac_addr0_otp = 0x30, - .mac_addr1_otp = 0x34, - .mac_addr0_strap = 0x38, - .mac_addr1_strap = 0x3C -}; - void iwl_write8(struct iwl_trans *trans, u32 ofs, u8 val) { trace_iwlwifi_dev_iowrite8(trans->dev, ofs, val); @@ -506,8 +476,7 @@ int iwl_finish_nic_init(struct iwl_trans *trans, * Set "initialization complete" bit to move adapter from * D0U* --> D0A* (powered-up active) state. */ - iwl_set_bit(trans, CSR_GP_CNTRL, - BIT(cfg_trans->csr->flag_init_done)); + iwl_set_bit(trans, CSR_GP_CNTRL, CSR_GP_CNTRL_REG_FLAG_INIT_DONE); if (cfg_trans->device_family == IWL_DEVICE_FAMILY_8000) udelay(2); @@ -518,8 +487,8 @@ int iwl_finish_nic_init(struct iwl_trans *trans, * and accesses to uCode SRAM. */ err = iwl_poll_bit(trans, CSR_GP_CNTRL, - BIT(cfg_trans->csr->flag_mac_clock_ready), - BIT(cfg_trans->csr->flag_mac_clock_ready), + CSR_GP_CNTRL_REG_FLAG_MAC_CLOCK_READY, + CSR_GP_CNTRL_REG_FLAG_MAC_CLOCK_READY, 25000); if (err < 0) IWL_DEBUG_INFO(trans, "Failed to wake NIC\n"); diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-modparams.h b/drivers/net/wireless/intel/iwlwifi/iwl-modparams.h index ebea3f308b5d..82e5cac23d8d 100644 --- a/drivers/net/wireless/intel/iwlwifi/iwl-modparams.h +++ b/drivers/net/wireless/intel/iwlwifi/iwl-modparams.h @@ -115,7 +115,6 @@ enum iwl_uapsd_disable { * @nvm_file: specifies a external NVM file * @uapsd_disable: disable U-APSD, see &enum iwl_uapsd_disable, default = * IWL_DISABLE_UAPSD_BSS | IWL_DISABLE_UAPSD_P2P_CLIENT - * @lar_disable: disable LAR (regulatory), default = 0 * @fw_monitor: allow to use firmware monitor * @disable_11ac: disable VHT capabilities, default = false. * @remove_when_gone: remove an inaccessible device from the PCIe bus. @@ -136,7 +135,6 @@ struct iwl_mod_params { int antenna_coupling; char *nvm_file; u32 uapsd_disable; - bool lar_disable; bool fw_monitor; bool disable_11ac; /** diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c b/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c index 1e240a2a8329..bab0999f002c 100644 --- a/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c +++ b/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c @@ -224,6 +224,34 @@ enum iwl_nvm_channel_flags { NVM_CHANNEL_DC_HIGH = BIT(12), }; +/** + * enum iwl_reg_capa_flags - global flags applied for the whole regulatory + * domain. + * @REG_CAPA_BF_CCD_LOW_BAND: Beam-forming or Cyclic Delay Diversity in the + * 2.4Ghz band is allowed. + * @REG_CAPA_BF_CCD_HIGH_BAND: Beam-forming or Cyclic Delay Diversity in the + * 5Ghz band is allowed. + * @REG_CAPA_160MHZ_ALLOWED: 11ac channel with a width of 160Mhz is allowed + * for this regulatory domain (valid only in 5Ghz). + * @REG_CAPA_80MHZ_ALLOWED: 11ac channel with a width of 80Mhz is allowed + * for this regulatory domain (valid only in 5Ghz). + * @REG_CAPA_MCS_8_ALLOWED: 11ac with MCS 8 is allowed. + * @REG_CAPA_MCS_9_ALLOWED: 11ac with MCS 9 is allowed. + * @REG_CAPA_40MHZ_FORBIDDEN: 11n channel with a width of 40Mhz is forbidden + * for this regulatory domain (valid only in 5Ghz). + * @REG_CAPA_DC_HIGH_ENABLED: DC HIGH allowed. + */ +enum iwl_reg_capa_flags { + REG_CAPA_BF_CCD_LOW_BAND = BIT(0), + REG_CAPA_BF_CCD_HIGH_BAND = BIT(1), + REG_CAPA_160MHZ_ALLOWED = BIT(2), + REG_CAPA_80MHZ_ALLOWED = BIT(3), + REG_CAPA_MCS_8_ALLOWED = BIT(4), + REG_CAPA_MCS_9_ALLOWED = BIT(5), + REG_CAPA_40MHZ_FORBIDDEN = BIT(7), + REG_CAPA_DC_HIGH_ENABLED = BIT(9), +}; + static inline void iwl_nvm_print_channel_flags(struct device *dev, u32 level, int chan, u32 flags) { @@ -801,12 +829,8 @@ static void iwl_flip_hw_address(__le32 mac_addr0, __le32 mac_addr1, u8 *dest) static void iwl_set_hw_address_from_csr(struct iwl_trans *trans, struct iwl_nvm_data *data) { - __le32 mac_addr0 = - cpu_to_le32(iwl_read32(trans, - trans->trans_cfg->csr->mac_addr0_strap)); - __le32 mac_addr1 = - cpu_to_le32(iwl_read32(trans, - trans->trans_cfg->csr->mac_addr1_strap)); + __le32 mac_addr0 = cpu_to_le32(iwl_read32(trans, CSR_MAC_ADDR0_STRAP)); + __le32 mac_addr1 = cpu_to_le32(iwl_read32(trans, CSR_MAC_ADDR1_STRAP)); iwl_flip_hw_address(mac_addr0, mac_addr1, data->hw_addr); /* @@ -816,10 +840,8 @@ static void iwl_set_hw_address_from_csr(struct iwl_trans *trans, if (is_valid_ether_addr(data->hw_addr)) return; - mac_addr0 = cpu_to_le32(iwl_read32(trans, - trans->trans_cfg->csr->mac_addr0_otp)); - mac_addr1 = cpu_to_le32(iwl_read32(trans, - trans->trans_cfg->csr->mac_addr1_otp)); + mac_addr0 = cpu_to_le32(iwl_read32(trans, CSR_MAC_ADDR0_OTP)); + mac_addr1 = cpu_to_le32(iwl_read32(trans, CSR_MAC_ADDR1_OTP)); iwl_flip_hw_address(mac_addr0, mac_addr1, data->hw_addr); } @@ -939,10 +961,11 @@ iwl_nvm_no_wide_in_5ghz(struct iwl_trans *trans, const struct iwl_cfg *cfg, struct iwl_nvm_data * iwl_parse_nvm_data(struct iwl_trans *trans, const struct iwl_cfg *cfg, + const struct iwl_fw *fw, const __be16 *nvm_hw, const __le16 *nvm_sw, const __le16 *nvm_calib, const __le16 *regulatory, const __le16 *mac_override, const __le16 *phy_sku, - u8 tx_chains, u8 rx_chains, bool lar_fw_supported) + u8 tx_chains, u8 rx_chains) { struct iwl_nvm_data *data; bool lar_enabled; @@ -1022,7 +1045,8 @@ iwl_parse_nvm_data(struct iwl_trans *trans, const struct iwl_cfg *cfg, return NULL; } - if (lar_fw_supported && lar_enabled) + if (lar_enabled && + fw_has_capa(&fw->ucode_capa, IWL_UCODE_TLV_CAPA_LAR_SUPPORT)) sbands_flags |= IWL_NVM_SBANDS_FLAGS_LAR; if (iwl_nvm_no_wide_in_5ghz(trans, cfg, nvm_hw)) @@ -1038,6 +1062,7 @@ IWL_EXPORT_SYMBOL(iwl_parse_nvm_data); static u32 iwl_nvm_get_regdom_bw_flags(const u16 *nvm_chan, int ch_idx, u16 nvm_flags, + u16 cap_flags, const struct iwl_cfg *cfg) { u32 flags = NL80211_RRF_NO_HT40; @@ -1076,13 +1101,27 @@ static u32 iwl_nvm_get_regdom_bw_flags(const u16 *nvm_chan, (flags & NL80211_RRF_NO_IR)) flags |= NL80211_RRF_GO_CONCURRENT; + /* + * cap_flags is per regulatory domain so apply it for every channel + */ + if (ch_idx >= NUM_2GHZ_CHANNELS) { + if (cap_flags & REG_CAPA_40MHZ_FORBIDDEN) + flags |= NL80211_RRF_NO_HT40; + + if (!(cap_flags & REG_CAPA_80MHZ_ALLOWED)) + flags |= NL80211_RRF_NO_80MHZ; + + if (!(cap_flags & REG_CAPA_160MHZ_ALLOWED)) + flags |= NL80211_RRF_NO_160MHZ; + } + return flags; } struct ieee80211_regdomain * iwl_parse_nvm_mcc_info(struct device *dev, const struct iwl_cfg *cfg, int num_of_ch, __le32 *channels, u16 fw_mcc, - u16 geo_info) + u16 geo_info, u16 cap) { int ch_idx; u16 ch_flags; @@ -1140,7 +1179,8 @@ iwl_parse_nvm_mcc_info(struct device *dev, const struct iwl_cfg *cfg, } reg_rule_flags = iwl_nvm_get_regdom_bw_flags(nvm_chan, ch_idx, - ch_flags, cfg); + ch_flags, cap, + cfg); /* we can't continue the same rule */ if (ch_idx == 0 || prev_reg_rule_flags != reg_rule_flags || @@ -1405,9 +1445,6 @@ struct iwl_nvm_data *iwl_get_nvm(struct iwl_trans *trans, .id = WIDE_ID(REGULATORY_AND_NVM_GROUP, NVM_GET_INFO) }; int ret; - bool lar_fw_supported = !iwlwifi_mod_params.lar_disable && - fw_has_capa(&fw->ucode_capa, - IWL_UCODE_TLV_CAPA_LAR_SUPPORT); bool empty_otp; u32 mac_flags; u32 sbands_flags = 0; @@ -1485,7 +1522,9 @@ struct iwl_nvm_data *iwl_get_nvm(struct iwl_trans *trans, nvm->valid_tx_ant = (u8)le32_to_cpu(rsp->phy_sku.tx_chains); nvm->valid_rx_ant = (u8)le32_to_cpu(rsp->phy_sku.rx_chains); - if (le32_to_cpu(rsp->regulatory.lar_enabled) && lar_fw_supported) { + if (le32_to_cpu(rsp->regulatory.lar_enabled) && + fw_has_capa(&fw->ucode_capa, + IWL_UCODE_TLV_CAPA_LAR_SUPPORT)) { nvm->lar_enabled = true; sbands_flags |= IWL_NVM_SBANDS_FLAGS_LAR; } diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.h b/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.h index b7e1ddf8f177..fb0b385d10fd 100644 --- a/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.h +++ b/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.h @@ -7,7 +7,7 @@ * * Copyright(c) 2008 - 2015 Intel Corporation. All rights reserved. * Copyright(c) 2016 - 2017 Intel Deutschland GmbH - * Copyright(c) 2018 Intel Corporation + * Copyright(c) 2018 - 2019 Intel Corporation * * This program is free software; you can redistribute it and/or modify * it under the terms of version 2 of the GNU General Public License as @@ -29,7 +29,7 @@ * * Copyright(c) 2005 - 2014 Intel Corporation. All rights reserved. * Copyright(c) 2016 - 2017 Intel Deutschland GmbH - * Copyright(c) 2018 Intel Corporation + * Copyright(c) 2018 - 2019 Intel Corporation * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -85,10 +85,11 @@ enum iwl_nvm_sbands_flags { */ struct iwl_nvm_data * iwl_parse_nvm_data(struct iwl_trans *trans, const struct iwl_cfg *cfg, + const struct iwl_fw *fw, const __be16 *nvm_hw, const __le16 *nvm_sw, const __le16 *nvm_calib, const __le16 *regulatory, const __le16 *mac_override, const __le16 *phy_sku, - u8 tx_chains, u8 rx_chains, bool lar_fw_supported); + u8 tx_chains, u8 rx_chains); /** * iwl_parse_mcc_info - parse MCC (mobile country code) info coming from FW @@ -103,7 +104,7 @@ iwl_parse_nvm_data(struct iwl_trans *trans, const struct iwl_cfg *cfg, struct ieee80211_regdomain * iwl_parse_nvm_mcc_info(struct device *dev, const struct iwl_cfg *cfg, int num_of_ch, __le32 *channels, u16 fw_mcc, - u16 geo_info); + u16 geo_info, u16 cap); /** * struct iwl_nvm_section - describes an NVM section in memory. diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-prph.h b/drivers/net/wireless/intel/iwlwifi/iwl-prph.h index 14c8ba23f3b9..1136d9784f9d 100644 --- a/drivers/net/wireless/intel/iwlwifi/iwl-prph.h +++ b/drivers/net/wireless/intel/iwlwifi/iwl-prph.h @@ -411,13 +411,6 @@ enum { HW_STEP_LOCATION_BITS = 24, }; -#define AUX_MISC_MASTER1_EN 0xA20818 -enum aux_misc_master1_en { - AUX_MISC_MASTER1_EN_SBE_MSK = 0x1, -}; - -#define AUX_MISC_MASTER1_SMPHR_STATUS 0xA20800 -#define RSA_ENABLE 0xA24B08 #define PREG_AUX_BUS_WPROT_0 0xA04CC0 /* device family 9000 WPROT register */ @@ -430,6 +423,9 @@ enum aux_misc_master1_en { #define UMAG_SB_CPU_1_STATUS 0xA038C0 #define UMAG_SB_CPU_2_STATUS 0xA038C4 #define UMAG_GEN_HW_STATUS 0xA038C8 +#define UREG_UMAC_CURRENT_PC 0xa05c18 +#define UREG_LMAC1_CURRENT_PC 0xa05c1c +#define UREG_LMAC2_CURRENT_PC 0xa05c20 /* For UMAG_GEN_HW_STATUS reg check */ enum { diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-trans.c b/drivers/net/wireless/intel/iwlwifi/iwl-trans.c index 28bdc9a9617e..f91197e4ae40 100644 --- a/drivers/net/wireless/intel/iwlwifi/iwl-trans.c +++ b/drivers/net/wireless/intel/iwlwifi/iwl-trans.c @@ -66,7 +66,9 @@ struct iwl_trans *iwl_trans_alloc(unsigned int priv_size, struct device *dev, - const struct iwl_trans_ops *ops) + const struct iwl_trans_ops *ops, + unsigned int cmd_pool_size, + unsigned int cmd_pool_align) { struct iwl_trans *trans; #ifdef CONFIG_LOCKDEP @@ -90,10 +92,8 @@ struct iwl_trans *iwl_trans_alloc(unsigned int priv_size, "iwl_cmd_pool:%s", dev_name(trans->dev)); trans->dev_cmd_pool = kmem_cache_create(trans->dev_cmd_pool_name, - sizeof(struct iwl_device_cmd), - sizeof(void *), - SLAB_HWCACHE_ALIGN, - NULL); + cmd_pool_size, cmd_pool_align, + SLAB_HWCACHE_ALIGN, NULL); if (!trans->dev_cmd_pool) return NULL; diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-trans.h b/drivers/net/wireless/intel/iwlwifi/iwl-trans.h index 8cadad7364ac..7b3b1f4c99b4 100644 --- a/drivers/net/wireless/intel/iwlwifi/iwl-trans.h +++ b/drivers/net/wireless/intel/iwlwifi/iwl-trans.h @@ -112,6 +112,8 @@ * 6) Eventually, the free function will be called. */ +#define IWL_TRANS_FW_DBG_DOMAIN(trans) IWL_FW_INI_DOMAIN_ALWAYS_ON + #define FH_RSCSR_FRAME_SIZE_MSK 0x00003FFF /* bits 0-13 */ #define FH_RSCSR_FRAME_INVALID 0x55550000 #define FH_RSCSR_FRAME_ALIGN 0x40 @@ -193,6 +195,18 @@ struct iwl_device_cmd { }; } __packed; +/** + * struct iwl_device_tx_cmd - buffer for TX command + * @hdr: the header + * @payload: the payload placeholder + * + * The actual structure is sized dynamically according to need. + */ +struct iwl_device_tx_cmd { + struct iwl_cmd_header hdr; + u8 payload[]; +} __packed; + #define TFD_MAX_PAYLOAD_SIZE (sizeof(struct iwl_device_cmd)) /* @@ -358,6 +372,24 @@ iwl_trans_get_rb_size_order(enum iwl_amsdu_size rb_size) } } +static inline int +iwl_trans_get_rb_size(enum iwl_amsdu_size rb_size) +{ + switch (rb_size) { + case IWL_AMSDU_2K: + return 2 * 1024; + case IWL_AMSDU_4K: + return 4 * 1024; + case IWL_AMSDU_8K: + return 8 * 1024; + case IWL_AMSDU_12K: + return 12 * 1024; + default: + WARN_ON(1); + return 0; + } +} + struct iwl_hcmd_names { u8 cmd_id; const char *const cmd_name; @@ -544,7 +576,7 @@ struct iwl_trans_ops { int (*send_cmd)(struct iwl_trans *trans, struct iwl_host_cmd *cmd); int (*tx)(struct iwl_trans *trans, struct sk_buff *skb, - struct iwl_device_cmd *dev_cmd, int queue); + struct iwl_device_tx_cmd *dev_cmd, int queue); void (*reclaim)(struct iwl_trans *trans, int queue, int ssn, struct sk_buff_head *skbs); @@ -839,6 +871,8 @@ struct iwl_trans { enum iwl_plat_pm_mode system_pm_mode; + const char *name; + /* pointer to trans specific struct */ /*Ensure that this pointer will always be aligned to sizeof pointer */ char trans_specific[0] __aligned(sizeof(void *)); @@ -948,22 +982,22 @@ iwl_trans_dump_data(struct iwl_trans *trans, u32 dump_mask) return trans->ops->dump_data(trans, dump_mask); } -static inline struct iwl_device_cmd * +static inline struct iwl_device_tx_cmd * iwl_trans_alloc_tx_cmd(struct iwl_trans *trans) { - return kmem_cache_alloc(trans->dev_cmd_pool, GFP_ATOMIC); + return kmem_cache_zalloc(trans->dev_cmd_pool, GFP_ATOMIC); } int iwl_trans_send_cmd(struct iwl_trans *trans, struct iwl_host_cmd *cmd); static inline void iwl_trans_free_tx_cmd(struct iwl_trans *trans, - struct iwl_device_cmd *dev_cmd) + struct iwl_device_tx_cmd *dev_cmd) { kmem_cache_free(trans->dev_cmd_pool, dev_cmd); } static inline int iwl_trans_tx(struct iwl_trans *trans, struct sk_buff *skb, - struct iwl_device_cmd *dev_cmd, int queue) + struct iwl_device_tx_cmd *dev_cmd, int queue) { if (unlikely(test_bit(STATUS_FW_ERROR, &trans->status))) return -EIO; @@ -1271,7 +1305,9 @@ static inline bool iwl_trans_dbg_ini_valid(struct iwl_trans *trans) *****************************************************/ struct iwl_trans *iwl_trans_alloc(unsigned int priv_size, struct device *dev, - const struct iwl_trans_ops *ops); + const struct iwl_trans_ops *ops, + unsigned int cmd_pool_size, + unsigned int cmd_pool_align); void iwl_trans_free(struct iwl_trans *trans); /***************************************************** diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/constants.h b/drivers/net/wireless/intel/iwlwifi/mvm/constants.h index 60aff2ecec12..58df25e2fb32 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/constants.h +++ b/drivers/net/wireless/intel/iwlwifi/mvm/constants.h @@ -154,5 +154,6 @@ #define IWL_MVM_D3_DEBUG false #define IWL_MVM_USE_TWT false #define IWL_MVM_AMPDU_CONSEC_DROPS_DELBA 10 +#define IWL_MVM_USE_NSSN_SYNC 0 #endif /* __MVM_CONSTANTS_H */ diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/d3.c b/drivers/net/wireless/intel/iwlwifi/mvm/d3.c index 43ebb2149b63..8878409d2f07 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/d3.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/d3.c @@ -989,6 +989,8 @@ static int __iwl_mvm_suspend(struct ieee80211_hw *hw, mutex_lock(&mvm->mutex); + set_bit(IWL_MVM_STATUS_IN_D3, &mvm->status); + vif = iwl_mvm_get_bss_vif(mvm); if (IS_ERR_OR_NULL(vif)) { ret = 1; @@ -1083,6 +1085,8 @@ static int __iwl_mvm_suspend(struct ieee80211_hw *hw, ieee80211_restart_hw(mvm->hw); } } + + clear_bit(IWL_MVM_STATUS_IN_D3, &mvm->status); } out_noreset: mutex_unlock(&mvm->mutex); @@ -1929,6 +1933,8 @@ static int __iwl_mvm_resume(struct iwl_mvm *mvm, bool test) mutex_lock(&mvm->mutex); + clear_bit(IWL_MVM_STATUS_IN_D3, &mvm->status); + /* get the BSS vif pointer again */ vif = iwl_mvm_get_bss_vif(mvm); if (IS_ERR_OR_NULL(vif)) diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/debugfs.c b/drivers/net/wireless/intel/iwlwifi/mvm/debugfs.c index aa659162a7c2..190cf15b825c 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/debugfs.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/debugfs.c @@ -752,7 +752,7 @@ static ssize_t iwl_dbgfs_fw_ver_read(struct file *file, char __user *user_buf, pos += scnprintf(pos, endpos - pos, "FW: %s\n", mvm->fwrt.fw->human_readable); pos += scnprintf(pos, endpos - pos, "Device: %s\n", - mvm->fwrt.trans->cfg->name); + mvm->fwrt.trans->name); pos += scnprintf(pos, endpos - pos, "Bus: %s\n", mvm->fwrt.dev->bus->name); diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/ftm-initiator.c b/drivers/net/wireless/intel/iwlwifi/mvm/ftm-initiator.c index 9f4b117db9d7..f783d6d53b6f 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/ftm-initiator.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/ftm-initiator.c @@ -208,10 +208,11 @@ static void iwl_mvm_ftm_cmd(struct iwl_mvm *mvm, struct ieee80211_vif *vif, cmd->tsf_mac_id = cpu_to_le32(0xff); } -static int iwl_mvm_ftm_target_chandef(struct iwl_mvm *mvm, - struct cfg80211_pmsr_request_peer *peer, - u8 *channel, u8 *bandwidth, - u8 *ctrl_ch_position) +static int +iwl_mvm_ftm_target_chandef_v1(struct iwl_mvm *mvm, + struct cfg80211_pmsr_request_peer *peer, + u8 *channel, u8 *bandwidth, + u8 *ctrl_ch_position) { u32 freq = peer->chandef.chan->center_freq; @@ -243,15 +244,54 @@ static int iwl_mvm_ftm_target_chandef(struct iwl_mvm *mvm, } static int +iwl_mvm_ftm_target_chandef_v2(struct iwl_mvm *mvm, + struct cfg80211_pmsr_request_peer *peer, + u8 *channel, u8 *format_bw, + u8 *ctrl_ch_position) +{ + u32 freq = peer->chandef.chan->center_freq; + + *channel = ieee80211_frequency_to_channel(freq); + + switch (peer->chandef.width) { + case NL80211_CHAN_WIDTH_20_NOHT: + *format_bw = IWL_LOCATION_FRAME_FORMAT_LEGACY; + *format_bw |= IWL_LOCATION_BW_20MHZ << LOCATION_BW_POS; + break; + case NL80211_CHAN_WIDTH_20: + *format_bw = IWL_LOCATION_FRAME_FORMAT_HT; + *format_bw |= IWL_LOCATION_BW_20MHZ << LOCATION_BW_POS; + break; + case NL80211_CHAN_WIDTH_40: + *format_bw = IWL_LOCATION_FRAME_FORMAT_HT; + *format_bw |= IWL_LOCATION_BW_40MHZ << LOCATION_BW_POS; + break; + case NL80211_CHAN_WIDTH_80: + *format_bw = IWL_LOCATION_FRAME_FORMAT_VHT; + *format_bw |= IWL_LOCATION_BW_80MHZ << LOCATION_BW_POS; + break; + default: + IWL_ERR(mvm, "Unsupported BW in FTM request (%d)\n", + peer->chandef.width); + return -EINVAL; + } + + *ctrl_ch_position = (peer->chandef.width > NL80211_CHAN_WIDTH_20) ? + iwl_mvm_get_ctrl_pos(&peer->chandef) : 0; + + return 0; +} + +static int iwl_mvm_ftm_put_target_v2(struct iwl_mvm *mvm, struct cfg80211_pmsr_request_peer *peer, struct iwl_tof_range_req_ap_entry_v2 *target) { int ret; - ret = iwl_mvm_ftm_target_chandef(mvm, peer, &target->channel_num, - &target->bandwidth, - &target->ctrl_ch_position); + ret = iwl_mvm_ftm_target_chandef_v1(mvm, peer, &target->channel_num, + &target->bandwidth, + &target->ctrl_ch_position); if (ret) return ret; @@ -278,18 +318,11 @@ iwl_mvm_ftm_put_target_v2(struct iwl_mvm *mvm, #define FTM_PUT_FLAG(flag) (target->initiator_ap_flags |= \ cpu_to_le32(IWL_INITIATOR_AP_FLAGS_##flag)) -static int iwl_mvm_ftm_put_target(struct iwl_mvm *mvm, - struct cfg80211_pmsr_request_peer *peer, - struct iwl_tof_range_req_ap_entry *target) +static void +iwl_mvm_ftm_put_target_common(struct iwl_mvm *mvm, + struct cfg80211_pmsr_request_peer *peer, + struct iwl_tof_range_req_ap_entry *target) { - int ret; - - ret = iwl_mvm_ftm_target_chandef(mvm, peer, &target->channel_num, - &target->bandwidth, - &target->ctrl_ch_position); - if (ret) - return ret; - memcpy(target->bssid, peer->addr, ETH_ALEN); target->burst_period = cpu_to_le16(peer->ftm.burst_period); @@ -314,60 +347,166 @@ static int iwl_mvm_ftm_put_target(struct iwl_mvm *mvm, FTM_PUT_FLAG(ALGO_LR); else if (IWL_MVM_FTM_INITIATOR_ALGO == IWL_TOF_ALGO_TYPE_FFT) FTM_PUT_FLAG(ALGO_FFT); +} + +static int +iwl_mvm_ftm_put_target_v3(struct iwl_mvm *mvm, + struct cfg80211_pmsr_request_peer *peer, + struct iwl_tof_range_req_ap_entry_v3 *target) +{ + int ret; + + ret = iwl_mvm_ftm_target_chandef_v1(mvm, peer, &target->channel_num, + &target->bandwidth, + &target->ctrl_ch_position); + if (ret) + return ret; + + /* + * Versions 3 and 4 has some common fields, so + * iwl_mvm_ftm_put_target_common() can be used for version 7 too. + */ + iwl_mvm_ftm_put_target_common(mvm, peer, (void *)target); return 0; } -int iwl_mvm_ftm_start(struct iwl_mvm *mvm, struct ieee80211_vif *vif, - struct cfg80211_pmsr_request *req) +static int iwl_mvm_ftm_put_target_v4(struct iwl_mvm *mvm, + struct cfg80211_pmsr_request_peer *peer, + struct iwl_tof_range_req_ap_entry *target) +{ + int ret; + + ret = iwl_mvm_ftm_target_chandef_v2(mvm, peer, &target->channel_num, + &target->format_bw, + &target->ctrl_ch_position); + if (ret) + return ret; + + iwl_mvm_ftm_put_target_common(mvm, peer, target); + + return 0; +} + +static int iwl_mvm_ftm_send_cmd(struct iwl_mvm *mvm, struct iwl_host_cmd *hcmd) +{ + u32 status; + int err = iwl_mvm_send_cmd_status(mvm, hcmd, &status); + + if (!err && status) { + IWL_ERR(mvm, "FTM range request command failure, status: %u\n", + status); + err = iwl_ftm_range_request_status_to_err(status); + } + + return err; +} + +static int iwl_mvm_ftm_start_v5(struct iwl_mvm *mvm, struct ieee80211_vif *vif, + struct cfg80211_pmsr_request *req) { struct iwl_tof_range_req_cmd_v5 cmd_v5; - struct iwl_tof_range_req_cmd cmd; - bool new_api = fw_has_api(&mvm->fw->ucode_capa, - IWL_UCODE_TLV_API_FTM_NEW_RANGE_REQ); - u8 num_of_ap; struct iwl_host_cmd hcmd = { .id = iwl_cmd_id(TOF_RANGE_REQ_CMD, LOCATION_GROUP, 0), .dataflags[0] = IWL_HCMD_DFL_DUP, + .data[0] = &cmd_v5, + .len[0] = sizeof(cmd_v5), }; - u32 status = 0; - int err, i; + u8 i; + int err; - lockdep_assert_held(&mvm->mutex); + iwl_mvm_ftm_cmd_v5(mvm, vif, &cmd_v5, req); - if (mvm->ftm_initiator.req) - return -EBUSY; + for (i = 0; i < cmd_v5.num_of_ap; i++) { + struct cfg80211_pmsr_request_peer *peer = &req->peers[i]; - if (new_api) { - iwl_mvm_ftm_cmd(mvm, vif, &cmd, req); - hcmd.data[0] = &cmd; - hcmd.len[0] = sizeof(cmd); - num_of_ap = cmd.num_of_ap; - } else { - iwl_mvm_ftm_cmd_v5(mvm, vif, &cmd_v5, req); - hcmd.data[0] = &cmd_v5; - hcmd.len[0] = sizeof(cmd_v5); - num_of_ap = cmd_v5.num_of_ap; + err = iwl_mvm_ftm_put_target_v2(mvm, peer, &cmd_v5.ap[i]); + if (err) + return err; } - for (i = 0; i < num_of_ap; i++) { + return iwl_mvm_ftm_send_cmd(mvm, &hcmd); +} + +static int iwl_mvm_ftm_start_v7(struct iwl_mvm *mvm, struct ieee80211_vif *vif, + struct cfg80211_pmsr_request *req) +{ + struct iwl_tof_range_req_cmd_v7 cmd_v7; + struct iwl_host_cmd hcmd = { + .id = iwl_cmd_id(TOF_RANGE_REQ_CMD, LOCATION_GROUP, 0), + .dataflags[0] = IWL_HCMD_DFL_DUP, + .data[0] = &cmd_v7, + .len[0] = sizeof(cmd_v7), + }; + u8 i; + int err; + + /* + * Versions 7 and 8 has the same structure except from the responders + * list, so iwl_mvm_ftm_cmd() can be used for version 7 too. + */ + iwl_mvm_ftm_cmd(mvm, vif, (void *)&cmd_v7, req); + + for (i = 0; i < cmd_v7.num_of_ap; i++) { struct cfg80211_pmsr_request_peer *peer = &req->peers[i]; - if (new_api) - err = iwl_mvm_ftm_put_target(mvm, peer, &cmd.ap[i]); - else - err = iwl_mvm_ftm_put_target_v2(mvm, peer, - &cmd_v5.ap[i]); + err = iwl_mvm_ftm_put_target_v3(mvm, peer, &cmd_v7.ap[i]); + if (err) + return err; + } + + return iwl_mvm_ftm_send_cmd(mvm, &hcmd); +} +static int iwl_mvm_ftm_start_v8(struct iwl_mvm *mvm, struct ieee80211_vif *vif, + struct cfg80211_pmsr_request *req) +{ + struct iwl_tof_range_req_cmd cmd; + struct iwl_host_cmd hcmd = { + .id = iwl_cmd_id(TOF_RANGE_REQ_CMD, LOCATION_GROUP, 0), + .dataflags[0] = IWL_HCMD_DFL_DUP, + .data[0] = &cmd, + .len[0] = sizeof(cmd), + }; + u8 i; + int err; + + iwl_mvm_ftm_cmd(mvm, vif, &cmd, req); + + for (i = 0; i < cmd.num_of_ap; i++) { + struct cfg80211_pmsr_request_peer *peer = &req->peers[i]; + + err = iwl_mvm_ftm_put_target_v4(mvm, peer, &cmd.ap[i]); if (err) return err; } - err = iwl_mvm_send_cmd_status(mvm, &hcmd, &status); - if (!err && status) { - IWL_ERR(mvm, "FTM range request command failure, status: %u\n", - status); - err = iwl_ftm_range_request_status_to_err(status); + return iwl_mvm_ftm_send_cmd(mvm, &hcmd); +} + +int iwl_mvm_ftm_start(struct iwl_mvm *mvm, struct ieee80211_vif *vif, + struct cfg80211_pmsr_request *req) +{ + bool new_api = fw_has_api(&mvm->fw->ucode_capa, + IWL_UCODE_TLV_API_FTM_NEW_RANGE_REQ); + int err; + + lockdep_assert_held(&mvm->mutex); + + if (mvm->ftm_initiator.req) + return -EBUSY; + + if (new_api) { + u8 cmd_ver = iwl_mvm_lookup_cmd_ver(mvm->fw, LOCATION_GROUP, + TOF_RANGE_REQ_CMD); + + if (cmd_ver == 8) + err = iwl_mvm_ftm_start_v8(mvm, vif, req); + else + err = iwl_mvm_ftm_start_v7(mvm, vif, req); + + } else { + err = iwl_mvm_ftm_start_v5(mvm, vif, req); } if (!err) { diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/ftm-responder.c b/drivers/net/wireless/intel/iwlwifi/mvm/ftm-responder.c index 1513b8b4062f..834564198409 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/ftm-responder.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/ftm-responder.c @@ -6,7 +6,7 @@ * GPL LICENSE SUMMARY * * Copyright(c) 2015 - 2017 Intel Deutschland GmbH - * Copyright (C) 2018 Intel Corporation + * Copyright (C) 2018 - 2019 Intel Corporation * * This program is free software; you can redistribute it and/or modify * it under the terms of version 2 of the GNU General Public License as @@ -27,7 +27,7 @@ * BSD LICENSE * * Copyright(c) 2015 - 2017 Intel Deutschland GmbH - * Copyright (C) 2018 Intel Corporation + * Copyright (C) 2018 - 2019 Intel Corporation * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -62,12 +62,72 @@ #include "mvm.h" #include "constants.h" +static int iwl_mvm_ftm_responder_set_bw_v1(struct cfg80211_chan_def *chandef, + u8 *bw, u8 *ctrl_ch_position) +{ + switch (chandef->width) { + case NL80211_CHAN_WIDTH_20_NOHT: + *bw = IWL_TOF_BW_20_LEGACY; + break; + case NL80211_CHAN_WIDTH_20: + *bw = IWL_TOF_BW_20_HT; + break; + case NL80211_CHAN_WIDTH_40: + *bw = IWL_TOF_BW_40; + *ctrl_ch_position = iwl_mvm_get_ctrl_pos(chandef); + break; + case NL80211_CHAN_WIDTH_80: + *bw = IWL_TOF_BW_80; + *ctrl_ch_position = iwl_mvm_get_ctrl_pos(chandef); + break; + default: + return -ENOTSUPP; + } + + return 0; +} + +static int iwl_mvm_ftm_responder_set_bw_v2(struct cfg80211_chan_def *chandef, + u8 *format_bw, + u8 *ctrl_ch_position) +{ + switch (chandef->width) { + case NL80211_CHAN_WIDTH_20_NOHT: + *format_bw = IWL_LOCATION_FRAME_FORMAT_LEGACY; + *format_bw |= IWL_LOCATION_BW_20MHZ << LOCATION_BW_POS; + break; + case NL80211_CHAN_WIDTH_20: + *format_bw = IWL_LOCATION_FRAME_FORMAT_HT; + *format_bw |= IWL_LOCATION_BW_20MHZ << LOCATION_BW_POS; + break; + case NL80211_CHAN_WIDTH_40: + *format_bw = IWL_LOCATION_FRAME_FORMAT_HT; + *format_bw |= IWL_LOCATION_BW_40MHZ << LOCATION_BW_POS; + *ctrl_ch_position = iwl_mvm_get_ctrl_pos(chandef); + break; + case NL80211_CHAN_WIDTH_80: + *format_bw = IWL_LOCATION_FRAME_FORMAT_VHT; + *format_bw |= IWL_LOCATION_BW_80MHZ << LOCATION_BW_POS; + *ctrl_ch_position = iwl_mvm_get_ctrl_pos(chandef); + break; + default: + return -ENOTSUPP; + } + + return 0; +} + static int iwl_mvm_ftm_responder_cmd(struct iwl_mvm *mvm, struct ieee80211_vif *vif, struct cfg80211_chan_def *chandef) { struct iwl_mvm_vif *mvmvif = iwl_mvm_vif_from_mac80211(vif); + /* + * The command structure is the same for versions 6 and 7, (only the + * field interpretation is different), so the same struct can be use + * for all cases. + */ struct iwl_tof_responder_config_cmd cmd = { .channel_num = chandef->chan->hw_value, .cmd_valid_fields = @@ -76,27 +136,22 @@ iwl_mvm_ftm_responder_cmd(struct iwl_mvm *mvm, IWL_TOF_RESPONDER_CMD_VALID_STA_ID), .sta_id = mvmvif->bcast_sta.sta_id, }; + u8 cmd_ver = iwl_mvm_lookup_cmd_ver(mvm->fw, LOCATION_GROUP, + TOF_RESPONDER_CONFIG_CMD); + int err; lockdep_assert_held(&mvm->mutex); - switch (chandef->width) { - case NL80211_CHAN_WIDTH_20_NOHT: - cmd.bandwidth = IWL_TOF_BW_20_LEGACY; - break; - case NL80211_CHAN_WIDTH_20: - cmd.bandwidth = IWL_TOF_BW_20_HT; - break; - case NL80211_CHAN_WIDTH_40: - cmd.bandwidth = IWL_TOF_BW_40; - cmd.ctrl_ch_position = iwl_mvm_get_ctrl_pos(chandef); - break; - case NL80211_CHAN_WIDTH_80: - cmd.bandwidth = IWL_TOF_BW_80; - cmd.ctrl_ch_position = iwl_mvm_get_ctrl_pos(chandef); - break; - default: - WARN_ON(1); - return -EINVAL; + if (cmd_ver == 7) + err = iwl_mvm_ftm_responder_set_bw_v2(chandef, &cmd.format_bw, + &cmd.ctrl_ch_position); + else + err = iwl_mvm_ftm_responder_set_bw_v1(chandef, &cmd.format_bw, + &cmd.ctrl_ch_position); + + if (err) { + IWL_ERR(mvm, "Failed to set responder bandwidth\n"); + return err; } memcpy(cmd.bssid, vif->addr, ETH_ALEN); diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/fw.c b/drivers/net/wireless/intel/iwlwifi/mvm/fw.c index dd685f7eb410..54c094e88474 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/fw.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/fw.c @@ -353,22 +353,35 @@ static int iwl_mvm_load_ucode_wait_alive(struct iwl_mvm *mvm, if (ret) { struct iwl_trans *trans = mvm->trans; - if (ret == -ETIMEDOUT) - iwl_fw_dbg_error_collect(&mvm->fwrt, - FW_DBG_TRIGGER_ALIVE_TIMEOUT); - - if (trans->trans_cfg->device_family >= IWL_DEVICE_FAMILY_22000) + if (trans->trans_cfg->device_family >= + IWL_DEVICE_FAMILY_22000) { IWL_ERR(mvm, "SecBoot CPU1 Status: 0x%x, CPU2 Status: 0x%x\n", iwl_read_umac_prph(trans, UMAG_SB_CPU_1_STATUS), iwl_read_umac_prph(trans, UMAG_SB_CPU_2_STATUS)); - else if (trans->trans_cfg->device_family >= - IWL_DEVICE_FAMILY_8000) + IWL_ERR(mvm, "UMAC PC: 0x%x\n", + iwl_read_umac_prph(trans, + UREG_UMAC_CURRENT_PC)); + IWL_ERR(mvm, "LMAC PC: 0x%x\n", + iwl_read_umac_prph(trans, + UREG_LMAC1_CURRENT_PC)); + if (iwl_mvm_is_cdb_supported(mvm)) + IWL_ERR(mvm, "LMAC2 PC: 0x%x\n", + iwl_read_umac_prph(trans, + UREG_LMAC2_CURRENT_PC)); + } else if (trans->trans_cfg->device_family >= + IWL_DEVICE_FAMILY_8000) { IWL_ERR(mvm, "SecBoot CPU1 Status: 0x%x, CPU2 Status: 0x%x\n", iwl_read_prph(trans, SB_CPU_1_STATUS), iwl_read_prph(trans, SB_CPU_2_STATUS)); + } + + if (ret == -ETIMEDOUT) + iwl_fw_dbg_error_collect(&mvm->fwrt, + FW_DBG_TRIGGER_ALIVE_TIMEOUT); + iwl_fw_set_current_image(&mvm->fwrt, old_type); return ret; } @@ -841,9 +854,13 @@ int iwl_mvm_ppag_send_cmd(struct iwl_mvm *mvm) return 0; } + if (!mvm->fwrt.ppag_table.enabled) { + IWL_DEBUG_RADIO(mvm, + "PPAG not enabled, command not sent.\n"); + return 0; + } + IWL_DEBUG_RADIO(mvm, "Sending PER_PLATFORM_ANT_GAIN_CMD\n"); - IWL_DEBUG_RADIO(mvm, "PPAG is %s\n", - mvm->fwrt.ppag_table.enabled ? "enabled" : "disabled"); for (i = 0; i < ACPI_PPAG_NUM_CHAINS; i++) { for (j = 0; j < ACPI_PPAG_NUM_SUB_BANDS; j++) { diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c index 32dc9d6f0fb6..6717f25c46b1 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c @@ -256,7 +256,8 @@ struct ieee80211_regdomain *iwl_mvm_get_regdomain(struct wiphy *wiphy, __le32_to_cpu(resp->n_channels), resp->channels, __le16_to_cpu(resp->mcc), - __le16_to_cpu(resp->geo_info)); + __le16_to_cpu(resp->geo_info), + __le16_to_cpu(resp->cap)); /* Store the return source id */ src_id = resp->source_id; kfree(resp); @@ -754,6 +755,20 @@ int iwl_mvm_mac_setup_register(struct iwl_mvm *mvm) return ret; } +static void iwl_mvm_tx_skb(struct iwl_mvm *mvm, struct sk_buff *skb, + struct ieee80211_sta *sta) +{ + if (likely(sta)) { + if (likely(iwl_mvm_tx_skb_sta(mvm, skb, sta) == 0)) + return; + } else { + if (likely(iwl_mvm_tx_skb_non_sta(mvm, skb) == 0)) + return; + } + + ieee80211_free_txskb(mvm->hw, skb); +} + static void iwl_mvm_mac_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, struct sk_buff *skb) @@ -797,14 +812,7 @@ static void iwl_mvm_mac_tx(struct ieee80211_hw *hw, } } - if (sta) { - if (iwl_mvm_tx_skb(mvm, skb, sta)) - goto drop; - return; - } - - if (iwl_mvm_tx_skb_non_sta(mvm, skb)) - goto drop; + iwl_mvm_tx_skb(mvm, skb, sta); return; drop: ieee80211_free_txskb(hw, skb); @@ -854,10 +862,7 @@ void iwl_mvm_mac_itxq_xmit(struct ieee80211_hw *hw, struct ieee80211_txq *txq) break; } - if (!txq->sta) - iwl_mvm_tx_skb_non_sta(mvm, skb); - else - iwl_mvm_tx_skb(mvm, skb, txq->sta); + iwl_mvm_tx_skb(mvm, skb, txq->sta); } } while (atomic_dec_return(&mvmtxq->tx_request)); rcu_read_unlock(); @@ -4771,6 +4776,125 @@ static int iwl_mvm_mac_get_survey(struct ieee80211_hw *hw, int idx, return ret; } +static void iwl_mvm_set_sta_rate(u32 rate_n_flags, struct rate_info *rinfo) +{ + switch (rate_n_flags & RATE_MCS_CHAN_WIDTH_MSK) { + case RATE_MCS_CHAN_WIDTH_20: + rinfo->bw = RATE_INFO_BW_20; + break; + case RATE_MCS_CHAN_WIDTH_40: + rinfo->bw = RATE_INFO_BW_40; + break; + case RATE_MCS_CHAN_WIDTH_80: + rinfo->bw = RATE_INFO_BW_80; + break; + case RATE_MCS_CHAN_WIDTH_160: + rinfo->bw = RATE_INFO_BW_160; + break; + } + + if (rate_n_flags & RATE_MCS_HT_MSK) { + rinfo->flags |= RATE_INFO_FLAGS_MCS; + rinfo->mcs = u32_get_bits(rate_n_flags, RATE_HT_MCS_INDEX_MSK); + rinfo->nss = u32_get_bits(rate_n_flags, + RATE_HT_MCS_NSS_MSK) + 1; + if (rate_n_flags & RATE_MCS_SGI_MSK) + rinfo->flags |= RATE_INFO_FLAGS_SHORT_GI; + } else if (rate_n_flags & RATE_MCS_VHT_MSK) { + rinfo->flags |= RATE_INFO_FLAGS_VHT_MCS; + rinfo->mcs = u32_get_bits(rate_n_flags, + RATE_VHT_MCS_RATE_CODE_MSK); + rinfo->nss = u32_get_bits(rate_n_flags, + RATE_VHT_MCS_NSS_MSK) + 1; + if (rate_n_flags & RATE_MCS_SGI_MSK) + rinfo->flags |= RATE_INFO_FLAGS_SHORT_GI; + } else if (rate_n_flags & RATE_MCS_HE_MSK) { + u32 gi_ltf = u32_get_bits(rate_n_flags, + RATE_MCS_HE_GI_LTF_MSK); + + rinfo->flags |= RATE_INFO_FLAGS_HE_MCS; + rinfo->mcs = u32_get_bits(rate_n_flags, + RATE_VHT_MCS_RATE_CODE_MSK); + rinfo->nss = u32_get_bits(rate_n_flags, + RATE_VHT_MCS_NSS_MSK) + 1; + + if (rate_n_flags & RATE_MCS_HE_106T_MSK) { + rinfo->bw = RATE_INFO_BW_HE_RU; + rinfo->he_ru_alloc = NL80211_RATE_INFO_HE_RU_ALLOC_106; + } + + switch (rate_n_flags & RATE_MCS_HE_TYPE_MSK) { + case RATE_MCS_HE_TYPE_SU: + case RATE_MCS_HE_TYPE_EXT_SU: + if (gi_ltf == 0 || gi_ltf == 1) + rinfo->he_gi = NL80211_RATE_INFO_HE_GI_0_8; + else if (gi_ltf == 2) + rinfo->he_gi = NL80211_RATE_INFO_HE_GI_1_6; + else if (rate_n_flags & RATE_MCS_SGI_MSK) + rinfo->he_gi = NL80211_RATE_INFO_HE_GI_0_8; + else + rinfo->he_gi = NL80211_RATE_INFO_HE_GI_3_2; + break; + case RATE_MCS_HE_TYPE_MU: + if (gi_ltf == 0 || gi_ltf == 1) + rinfo->he_gi = NL80211_RATE_INFO_HE_GI_0_8; + else if (gi_ltf == 2) + rinfo->he_gi = NL80211_RATE_INFO_HE_GI_1_6; + else + rinfo->he_gi = NL80211_RATE_INFO_HE_GI_3_2; + break; + case RATE_MCS_HE_TYPE_TRIG: + if (gi_ltf == 0 || gi_ltf == 1) + rinfo->he_gi = NL80211_RATE_INFO_HE_GI_1_6; + else + rinfo->he_gi = NL80211_RATE_INFO_HE_GI_3_2; + break; + } + + if (rate_n_flags & RATE_HE_DUAL_CARRIER_MODE_MSK) + rinfo->he_dcm = 1; + } else { + switch (u32_get_bits(rate_n_flags, RATE_LEGACY_RATE_MSK)) { + case IWL_RATE_1M_PLCP: + rinfo->legacy = 10; + break; + case IWL_RATE_2M_PLCP: + rinfo->legacy = 20; + break; + case IWL_RATE_5M_PLCP: + rinfo->legacy = 55; + break; + case IWL_RATE_11M_PLCP: + rinfo->legacy = 110; + break; + case IWL_RATE_6M_PLCP: + rinfo->legacy = 60; + break; + case IWL_RATE_9M_PLCP: + rinfo->legacy = 90; + break; + case IWL_RATE_12M_PLCP: + rinfo->legacy = 120; + break; + case IWL_RATE_18M_PLCP: + rinfo->legacy = 180; + break; + case IWL_RATE_24M_PLCP: + rinfo->legacy = 240; + break; + case IWL_RATE_36M_PLCP: + rinfo->legacy = 360; + break; + case IWL_RATE_48M_PLCP: + rinfo->legacy = 480; + break; + case IWL_RATE_54M_PLCP: + rinfo->legacy = 540; + break; + } + } +} + static void iwl_mvm_mac_sta_statistics(struct ieee80211_hw *hw, struct ieee80211_vif *vif, struct ieee80211_sta *sta, @@ -4785,6 +4909,13 @@ static void iwl_mvm_mac_sta_statistics(struct ieee80211_hw *hw, sinfo->filled |= BIT_ULL(NL80211_STA_INFO_SIGNAL_AVG); } + if (iwl_mvm_has_tlc_offload(mvm)) { + struct iwl_lq_sta_rs_fw *lq_sta = &mvmsta->lq_sta.rs_fw; + + iwl_mvm_set_sta_rate(lq_sta->last_rate_n_flags, &sinfo->txrate); + sinfo->filled |= BIT_ULL(NL80211_STA_INFO_TX_BITRATE); + } + /* if beacon filtering isn't on mac80211 does it anyway */ if (!(vif->driver_flags & IEEE80211_VIF_BEACON_FILTER)) return; diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/mvm.h b/drivers/net/wireless/intel/iwlwifi/mvm/mvm.h index 3ec8de00f3aa..d6ecc2d2bf21 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/mvm.h +++ b/drivers/net/wireless/intel/iwlwifi/mvm/mvm.h @@ -1160,6 +1160,7 @@ struct iwl_mvm { * @IWL_MVM_STATUS_ROC_AUX_RUNNING: AUX remain-on-channel is running * @IWL_MVM_STATUS_FIRMWARE_RUNNING: firmware is running * @IWL_MVM_STATUS_NEED_FLUSH_P2P: need to flush P2P bcast STA + * @IWL_MVM_STATUS_IN_D3: in D3 (or at least about to go into it) */ enum iwl_mvm_status { IWL_MVM_STATUS_HW_RFKILL, @@ -1170,6 +1171,7 @@ enum iwl_mvm_status { IWL_MVM_STATUS_ROC_AUX_RUNNING, IWL_MVM_STATUS_FIRMWARE_RUNNING, IWL_MVM_STATUS_NEED_FLUSH_P2P, + IWL_MVM_STATUS_IN_D3, }; /* Keep track of completed init configuration */ @@ -1298,9 +1300,6 @@ static inline bool iwl_mvm_is_lar_supported(struct iwl_mvm *mvm) bool tlv_lar = fw_has_capa(&mvm->fw->ucode_capa, IWL_UCODE_TLV_CAPA_LAR_SUPPORT); - if (iwlwifi_mod_params.lar_disable) - return false; - /* * Enable LAR only if it is supported by the FW (TLV) && * enabled in the NVM @@ -1508,8 +1507,8 @@ int __must_check iwl_mvm_send_cmd_status(struct iwl_mvm *mvm, int __must_check iwl_mvm_send_cmd_pdu_status(struct iwl_mvm *mvm, u32 id, u16 len, const void *data, u32 *status); -int iwl_mvm_tx_skb(struct iwl_mvm *mvm, struct sk_buff *skb, - struct ieee80211_sta *sta); +int iwl_mvm_tx_skb_sta(struct iwl_mvm *mvm, struct sk_buff *skb, + struct ieee80211_sta *sta); int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct sk_buff *skb); void iwl_mvm_set_tx_cmd(struct iwl_mvm *mvm, struct sk_buff *skb, struct iwl_tx_cmd *tx_cmd, diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/nvm.c b/drivers/net/wireless/intel/iwlwifi/mvm/nvm.c index 945c1ea5cda8..70b29bf16bb9 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/nvm.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/nvm.c @@ -178,7 +178,7 @@ static int iwl_nvm_read_chunk(struct iwl_mvm *mvm, u16 section, } else { IWL_DEBUG_EEPROM(mvm->trans->dev, "NVM access command failed with status %d (device: %s)\n", - ret, mvm->cfg->name); + ret, mvm->trans->name); ret = -ENODATA; } goto exit; @@ -277,11 +277,10 @@ iwl_parse_nvm_sections(struct iwl_mvm *mvm) struct iwl_nvm_section *sections = mvm->nvm_sections; const __be16 *hw; const __le16 *sw, *calib, *regulatory, *mac_override, *phy_sku; - bool lar_enabled; int regulatory_type; /* Checking for required sections */ - if (mvm->trans->cfg->nvm_type != IWL_NVM_EXT) { + if (mvm->trans->cfg->nvm_type == IWL_NVM) { if (!mvm->nvm_sections[NVM_SECTION_TYPE_SW].data || !mvm->nvm_sections[mvm->cfg->nvm_hw_section_num].data) { IWL_ERR(mvm, "Can't parse empty OTP/NVM sections\n"); @@ -327,14 +326,9 @@ iwl_parse_nvm_sections(struct iwl_mvm *mvm) (const __le16 *)sections[NVM_SECTION_TYPE_REGULATORY_SDP].data : (const __le16 *)sections[NVM_SECTION_TYPE_REGULATORY].data; - lar_enabled = !iwlwifi_mod_params.lar_disable && - fw_has_capa(&mvm->fw->ucode_capa, - IWL_UCODE_TLV_CAPA_LAR_SUPPORT); - - return iwl_parse_nvm_data(mvm->trans, mvm->cfg, hw, sw, calib, + return iwl_parse_nvm_data(mvm->trans, mvm->cfg, mvm->fw, hw, sw, calib, regulatory, mac_override, phy_sku, - mvm->fw->valid_tx_ant, mvm->fw->valid_rx_ant, - lar_enabled); + mvm->fw->valid_tx_ant, mvm->fw->valid_rx_ant); } /* Loads the NVM data stored in mvm->nvm_sections into the NIC */ diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/ops.c b/drivers/net/wireless/intel/iwlwifi/mvm/ops.c index 1b07a8e8f069..dfe02440d474 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/ops.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/ops.c @@ -830,7 +830,7 @@ iwl_op_mode_mvm_start(struct iwl_trans *trans, const struct iwl_cfg *cfg, } IWL_INFO(mvm, "Detected %s, REV=0x%X\n", - mvm->cfg->name, mvm->trans->hw_rev); + mvm->trans->name, mvm->trans->hw_rev); if (iwlwifi_mod_params.nvm_file) mvm->nvm_file_name = iwlwifi_mod_params.nvm_file; diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/power.c b/drivers/net/wireless/intel/iwlwifi/mvm/power.c index 25d7faea1c62..c146303ec73b 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/power.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/power.c @@ -198,7 +198,7 @@ static void iwl_mvm_power_configure_uapsd(struct iwl_mvm *mvm, if (!mvmvif->queue_params[ac].uapsd) continue; - if (mvm->fwrt.cur_fw_img != IWL_UCODE_WOWLAN) + if (!test_bit(IWL_MVM_STATUS_IN_D3, &mvm->status)) cmd->flags |= cpu_to_le16(POWER_FLAGS_ADVANCE_PM_ENA_MSK); @@ -233,15 +233,15 @@ static void iwl_mvm_power_configure_uapsd(struct iwl_mvm *mvm, cmd->flags |= cpu_to_le16(POWER_FLAGS_SNOOZE_ENA_MSK); cmd->snooze_interval = cpu_to_le16(IWL_MVM_PS_SNOOZE_INTERVAL); cmd->snooze_window = - (mvm->fwrt.cur_fw_img == IWL_UCODE_WOWLAN) ? + test_bit(IWL_MVM_STATUS_IN_D3, &mvm->status) ? cpu_to_le16(IWL_MVM_WOWLAN_PS_SNOOZE_WINDOW) : cpu_to_le16(IWL_MVM_PS_SNOOZE_WINDOW); } cmd->uapsd_max_sp = mvm->hw->uapsd_max_sp_len; - if (mvm->fwrt.cur_fw_img == IWL_UCODE_WOWLAN || cmd->flags & - cpu_to_le16(POWER_FLAGS_SNOOZE_ENA_MSK)) { + if (test_bit(IWL_MVM_STATUS_IN_D3, &mvm->status) || + cmd->flags & cpu_to_le16(POWER_FLAGS_SNOOZE_ENA_MSK)) { cmd->rx_data_timeout_uapsd = cpu_to_le32(IWL_MVM_WOWLAN_PS_RX_DATA_TIMEOUT); cmd->tx_data_timeout_uapsd = @@ -354,8 +354,7 @@ static bool iwl_mvm_power_is_radar(struct ieee80211_vif *vif) static void iwl_mvm_power_config_skip_dtim(struct iwl_mvm *mvm, struct ieee80211_vif *vif, - struct iwl_mac_power_cmd *cmd, - bool host_awake) + struct iwl_mac_power_cmd *cmd) { int dtimper = vif->bss_conf.dtim_period ?: 1; int skip; @@ -370,7 +369,7 @@ static void iwl_mvm_power_config_skip_dtim(struct iwl_mvm *mvm, if (dtimper >= 10) return; - if (host_awake) { + if (!test_bit(IWL_MVM_STATUS_IN_D3, &mvm->status)) { if (iwlmvm_mod_params.power_scheme != IWL_POWER_SCHEME_LP) return; skip = 2; @@ -390,8 +389,7 @@ static void iwl_mvm_power_config_skip_dtim(struct iwl_mvm *mvm, static void iwl_mvm_power_build_cmd(struct iwl_mvm *mvm, struct ieee80211_vif *vif, - struct iwl_mac_power_cmd *cmd, - bool host_awake) + struct iwl_mac_power_cmd *cmd) { int dtimper, bi; int keep_alive; @@ -437,9 +435,9 @@ static void iwl_mvm_power_build_cmd(struct iwl_mvm *mvm, cmd->lprx_rssi_threshold = POWER_LPRX_RSSI_THRESHOLD; } - iwl_mvm_power_config_skip_dtim(mvm, vif, cmd, host_awake); + iwl_mvm_power_config_skip_dtim(mvm, vif, cmd); - if (!host_awake) { + if (test_bit(IWL_MVM_STATUS_IN_D3, &mvm->status)) { cmd->rx_data_timeout = cpu_to_le32(IWL_MVM_WOWLAN_PS_RX_DATA_TIMEOUT); cmd->tx_data_timeout = @@ -512,8 +510,7 @@ static int iwl_mvm_power_send_cmd(struct iwl_mvm *mvm, { struct iwl_mac_power_cmd cmd = {}; - iwl_mvm_power_build_cmd(mvm, vif, &cmd, - mvm->fwrt.cur_fw_img != IWL_UCODE_WOWLAN); + iwl_mvm_power_build_cmd(mvm, vif, &cmd); iwl_mvm_power_log(mvm, &cmd); #ifdef CONFIG_IWLWIFI_DEBUGFS memcpy(&iwl_mvm_vif_from_mac80211(vif)->mac_pwr_cmd, &cmd, sizeof(cmd)); @@ -536,7 +533,7 @@ int iwl_mvm_power_update_device(struct iwl_mvm *mvm) cmd.flags |= cpu_to_le16(DEVICE_POWER_FLAGS_POWER_SAVE_ENA_MSK); #ifdef CONFIG_IWLWIFI_DEBUGFS - if ((mvm->fwrt.cur_fw_img == IWL_UCODE_WOWLAN) ? + if (test_bit(IWL_MVM_STATUS_IN_D3, &mvm->status) ? mvm->disable_power_off_d3 : mvm->disable_power_off) cmd.flags &= cpu_to_le16(~DEVICE_POWER_FLAGS_POWER_SAVE_ENA_MSK); @@ -943,7 +940,7 @@ static int iwl_mvm_power_set_ba(struct iwl_mvm *mvm, if (!mvmvif->bf_data.bf_enabled) return 0; - if (mvm->fwrt.cur_fw_img == IWL_UCODE_WOWLAN) + if (test_bit(IWL_MVM_STATUS_IN_D3, &mvm->status)) cmd.ba_escape_timer = cpu_to_le32(IWL_BA_ESCAPE_TIMER_D3); mvmvif->bf_data.ba_enabled = !(!mvmvif->pm_enabled || diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c b/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c index ef99c49247b7..c15f7dbc9516 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c @@ -514,14 +514,17 @@ static bool iwl_mvm_is_sn_less(u16 sn1, u16 sn2, u16 buffer_size) static void iwl_mvm_sync_nssn(struct iwl_mvm *mvm, u8 baid, u16 nssn) { - struct iwl_mvm_rss_sync_notif notif = { - .metadata.type = IWL_MVM_RXQ_NSSN_SYNC, - .metadata.sync = 0, - .nssn_sync.baid = baid, - .nssn_sync.nssn = nssn, - }; - - iwl_mvm_sync_rx_queues_internal(mvm, (void *)¬if, sizeof(notif)); + if (IWL_MVM_USE_NSSN_SYNC) { + struct iwl_mvm_rss_sync_notif notif = { + .metadata.type = IWL_MVM_RXQ_NSSN_SYNC, + .metadata.sync = 0, + .nssn_sync.baid = baid, + .nssn_sync.nssn = nssn, + }; + + iwl_mvm_sync_rx_queues_internal(mvm, (void *)¬if, + sizeof(notif)); + } } #define RX_REORDER_BUF_TIMEOUT_MQ (HZ / 10) diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/scan.c b/drivers/net/wireless/intel/iwlwifi/mvm/scan.c index a046ac9fa852..3b263c81bcae 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/scan.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/scan.c @@ -1213,7 +1213,7 @@ static int iwl_mvm_legacy_config_scan(struct iwl_mvm *mvm) cmd_size = sizeof(struct iwl_scan_config_v2); else cmd_size = sizeof(struct iwl_scan_config_v1); - cmd_size += num_channels; + cmd_size += mvm->fw->ucode_capa.n_scan_channels; cfg = kzalloc(cmd_size, GFP_KERNEL); if (!cfg) @@ -1907,20 +1907,6 @@ iwl_mvm_scan_umac_fill_probe_p_v4(struct iwl_mvm_scan_params *params, } static void -iwl_mvm_scan_umac_fill_ch_p_v3(struct iwl_mvm *mvm, - struct iwl_mvm_scan_params *params, - struct ieee80211_vif *vif, - struct iwl_scan_channel_params_v3 *cp) -{ - cp->flags = iwl_mvm_scan_umac_chan_flags_v2(mvm, params, vif); - cp->count = params->n_channels; - - iwl_mvm_umac_scan_cfg_channels(mvm, params->channels, - params->n_channels, 0, - cp->channel_config); -} - -static void iwl_mvm_scan_umac_fill_ch_p_v4(struct iwl_mvm *mvm, struct iwl_mvm_scan_params *params, struct ieee80211_vif *vif, @@ -1937,37 +1923,6 @@ iwl_mvm_scan_umac_fill_ch_p_v4(struct iwl_mvm *mvm, vif->type); } -static int iwl_mvm_scan_umac_v11(struct iwl_mvm *mvm, struct ieee80211_vif *vif, - struct iwl_mvm_scan_params *params, int type, - int uid) -{ - struct iwl_scan_req_umac_v11 *cmd = mvm->scan_cmd; - struct iwl_scan_req_params_v11 *scan_p = &cmd->scan_params; - int ret; - u16 gen_flags; - - mvm->scan_uid_status[uid] = type; - - cmd->ooc_priority = cpu_to_le32(iwl_mvm_scan_umac_ooc_priority(params)); - cmd->uid = cpu_to_le32(uid); - - gen_flags = iwl_mvm_scan_umac_flags_v2(mvm, params, vif, type); - iwl_mvm_scan_umac_fill_general_p_v10(mvm, params, vif, - &scan_p->general_params, - gen_flags); - - ret = iwl_mvm_fill_scan_sched_params(params, - scan_p->periodic_params.schedule, - &scan_p->periodic_params.delay); - if (ret) - return ret; - - iwl_mvm_scan_umac_fill_probe_p_v3(params, &scan_p->probe_params); - iwl_mvm_scan_umac_fill_ch_p_v3(mvm, params, vif, - &scan_p->channel_params); - - return 0; -} static int iwl_mvm_scan_umac_v12(struct iwl_mvm *mvm, struct ieee80211_vif *vif, struct iwl_mvm_scan_params *params, int type, @@ -2152,7 +2107,6 @@ static const struct iwl_scan_umac_handler iwl_scan_umac_handlers[] = { /* set the newest version first to shorten the list traverse time */ IWL_SCAN_UMAC_HANDLER(13), IWL_SCAN_UMAC_HANDLER(12), - IWL_SCAN_UMAC_HANDLER(11), }; static int iwl_mvm_build_scan_cmd(struct iwl_mvm *mvm, @@ -2511,7 +2465,6 @@ static int iwl_scan_req_umac_get_size(u8 scan_ver) switch (scan_ver) { IWL_SCAN_REQ_UMAC_HANDLE_SIZE(13); IWL_SCAN_REQ_UMAC_HANDLE_SIZE(12); - IWL_SCAN_REQ_UMAC_HANDLE_SIZE(11); } return 0; diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c index 6a241d37a057..a8d0d17f79fd 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c @@ -490,13 +490,13 @@ static void iwl_mvm_set_tx_cmd_crypto(struct iwl_mvm *mvm, /* * Allocates and sets the Tx cmd the driver data pointers in the skb */ -static struct iwl_device_cmd * +static struct iwl_device_tx_cmd * iwl_mvm_set_tx_params(struct iwl_mvm *mvm, struct sk_buff *skb, struct ieee80211_tx_info *info, int hdrlen, struct ieee80211_sta *sta, u8 sta_id) { struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)skb->data; - struct iwl_device_cmd *dev_cmd; + struct iwl_device_tx_cmd *dev_cmd; struct iwl_tx_cmd *tx_cmd; dev_cmd = iwl_trans_alloc_tx_cmd(mvm->trans); @@ -504,11 +504,6 @@ iwl_mvm_set_tx_params(struct iwl_mvm *mvm, struct sk_buff *skb, if (unlikely(!dev_cmd)) return NULL; - /* Make sure we zero enough of dev_cmd */ - BUILD_BUG_ON(sizeof(struct iwl_tx_cmd_gen2) > sizeof(*tx_cmd)); - BUILD_BUG_ON(sizeof(struct iwl_tx_cmd_gen3) > sizeof(*tx_cmd)); - - memset(dev_cmd, 0, sizeof(dev_cmd->hdr) + sizeof(*tx_cmd)); dev_cmd->hdr.cmd = TX_CMD; if (iwl_mvm_has_new_tx_api(mvm)) { @@ -597,7 +592,7 @@ out: } static void iwl_mvm_skb_prepare_status(struct sk_buff *skb, - struct iwl_device_cmd *cmd) + struct iwl_device_tx_cmd *cmd) { struct ieee80211_tx_info *skb_info = IEEE80211_SKB_CB(skb); @@ -716,7 +711,7 @@ int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct sk_buff *skb) { struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)skb->data; struct ieee80211_tx_info info; - struct iwl_device_cmd *dev_cmd; + struct iwl_device_tx_cmd *dev_cmd; u8 sta_id; int hdrlen = ieee80211_hdrlen(hdr->frame_control); __le16 fc = hdr->frame_control; @@ -1073,7 +1068,7 @@ static int iwl_mvm_tx_mpdu(struct iwl_mvm *mvm, struct sk_buff *skb, { struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)skb->data; struct iwl_mvm_sta *mvmsta; - struct iwl_device_cmd *dev_cmd; + struct iwl_device_tx_cmd *dev_cmd; __le16 fc; u16 seq_number = 0; u8 tid = IWL_MAX_TID_COUNT; @@ -1149,7 +1144,7 @@ static int iwl_mvm_tx_mpdu(struct iwl_mvm *mvm, struct sk_buff *skb, if (WARN_ONCE(txq_id == IWL_MVM_INVALID_QUEUE, "Invalid TXQ id")) { iwl_trans_free_tx_cmd(mvm->trans, dev_cmd); spin_unlock(&mvmsta->lock); - return 0; + return -1; } if (!iwl_mvm_has_new_tx_api(mvm)) { @@ -1201,8 +1196,8 @@ drop: return -1; } -int iwl_mvm_tx_skb(struct iwl_mvm *mvm, struct sk_buff *skb, - struct ieee80211_sta *sta) +int iwl_mvm_tx_skb_sta(struct iwl_mvm *mvm, struct sk_buff *skb, + struct ieee80211_sta *sta) { struct iwl_mvm_sta *mvmsta = iwl_mvm_sta_from_mac80211(sta); struct ieee80211_tx_info info; diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c b/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c index a4e09a5b1816..01f248ba8fec 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info-gen3.c @@ -206,7 +206,7 @@ int iwl_pcie_ctxt_info_gen3_init(struct iwl_trans *trans, ctxt_info_gen3->mtr_size = cpu_to_le16(TFD_QUEUE_CB_SIZE(cmdq_size)); ctxt_info_gen3->mcr_size = - cpu_to_le16(RX_QUEUE_CB_SIZE(MQ_RX_TABLE_SIZE)); + cpu_to_le16(RX_QUEUE_CB_SIZE(trans->cfg->num_rbds)); trans_pcie->ctxt_info_gen3 = ctxt_info_gen3; trans_pcie->prph_info = prph_info; diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info.c b/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info.c index d38cefbb779e..acd01d86f101 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/ctxt-info.c @@ -57,6 +57,42 @@ #include "internal.h" #include "iwl-prph.h" +static void *_iwl_pcie_ctxt_info_dma_alloc_coherent(struct iwl_trans *trans, + size_t size, + dma_addr_t *phys, + int depth) +{ + void *result; + + if (WARN(depth > 2, + "failed to allocate DMA memory not crossing 2^32 boundary")) + return NULL; + + result = dma_alloc_coherent(trans->dev, size, phys, GFP_KERNEL); + + if (!result) + return NULL; + + if (unlikely(iwl_pcie_crosses_4g_boundary(*phys, size))) { + void *old = result; + dma_addr_t oldphys = *phys; + + result = _iwl_pcie_ctxt_info_dma_alloc_coherent(trans, size, + phys, + depth + 1); + dma_free_coherent(trans->dev, size, old, oldphys); + } + + return result; +} + +static void *iwl_pcie_ctxt_info_dma_alloc_coherent(struct iwl_trans *trans, + size_t size, + dma_addr_t *phys) +{ + return _iwl_pcie_ctxt_info_dma_alloc_coherent(trans, size, phys, 0); +} + void iwl_pcie_ctxt_info_free_paging(struct iwl_trans *trans) { struct iwl_self_init_dram *dram = &trans->init_dram; @@ -161,14 +197,17 @@ int iwl_pcie_ctxt_info_init(struct iwl_trans *trans, struct iwl_context_info *ctxt_info; struct iwl_context_info_rbd_cfg *rx_cfg; u32 control_flags = 0, rb_size; + dma_addr_t phys; int ret; - ctxt_info = dma_alloc_coherent(trans->dev, sizeof(*ctxt_info), - &trans_pcie->ctxt_info_dma_addr, - GFP_KERNEL); + ctxt_info = iwl_pcie_ctxt_info_dma_alloc_coherent(trans, + sizeof(*ctxt_info), + &phys); if (!ctxt_info) return -ENOMEM; + trans_pcie->ctxt_info_dma_addr = phys; + ctxt_info->version.version = 0; ctxt_info->version.mac_id = cpu_to_le16((u16)iwl_read32(trans, CSR_HW_REV)); @@ -193,11 +232,12 @@ int iwl_pcie_ctxt_info_init(struct iwl_trans *trans, rb_size = IWL_CTXT_INFO_RB_SIZE_4K; } - BUILD_BUG_ON(RX_QUEUE_CB_SIZE(MQ_RX_TABLE_SIZE) > 0xF); - control_flags = IWL_CTXT_INFO_TFD_FORMAT_LONG | - (RX_QUEUE_CB_SIZE(MQ_RX_TABLE_SIZE) << - IWL_CTXT_INFO_RB_CB_SIZE_POS) | - (rb_size << IWL_CTXT_INFO_RB_SIZE_POS); + WARN_ON(RX_QUEUE_CB_SIZE(trans->cfg->num_rbds) > 12); + control_flags = IWL_CTXT_INFO_TFD_FORMAT_LONG; + control_flags |= + u32_encode_bits(RX_QUEUE_CB_SIZE(trans->cfg->num_rbds), + IWL_CTXT_INFO_RB_CB_SIZE); + control_flags |= u32_encode_bits(rb_size, IWL_CTXT_INFO_RB_SIZE); ctxt_info->control.control_flags = cpu_to_le32(control_flags); /* initialize RX default queue */ diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/drv.c b/drivers/net/wireless/intel/iwlwifi/pcie/drv.c index b0b7eca1754e..97f227f3cbc3 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/drv.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/drv.c @@ -565,14 +565,8 @@ static const struct pci_device_id iwl_hw_card_ids[] = { {IWL_PCI_DEVICE(0x06F0, 0x40A4, iwl9462_2ac_cfg_quz_a0_jf_b0_soc)}, {IWL_PCI_DEVICE(0x06F0, 0x4234, iwl9560_2ac_cfg_quz_a0_jf_b0_soc)}, {IWL_PCI_DEVICE(0x06F0, 0x42A4, iwl9462_2ac_cfg_quz_a0_jf_b0_soc)}, - {IWL_PCI_DEVICE(0x2526, 0x0010, iwl9260_2ac_160_cfg)}, - {IWL_PCI_DEVICE(0x2526, 0x0014, iwl9260_2ac_160_cfg)}, - {IWL_PCI_DEVICE(0x2526, 0x0018, iwl9260_2ac_160_cfg)}, - {IWL_PCI_DEVICE(0x2526, 0x001C, iwl9260_2ac_160_cfg)}, - {IWL_PCI_DEVICE(0x2526, 0x0030, iwl9560_2ac_160_cfg)}, + {IWL_PCI_DEVICE(0x2526, 0x0034, iwl9560_2ac_cfg)}, - {IWL_PCI_DEVICE(0x2526, 0x0038, iwl9560_2ac_160_cfg)}, - {IWL_PCI_DEVICE(0x2526, 0x003C, iwl9560_2ac_160_cfg)}, {IWL_PCI_DEVICE(0x2526, 0x0060, iwl9461_2ac_cfg_soc)}, {IWL_PCI_DEVICE(0x2526, 0x0064, iwl9461_2ac_cfg_soc)}, {IWL_PCI_DEVICE(0x2526, 0x00A0, iwl9462_2ac_cfg_soc)}, @@ -598,21 +592,12 @@ static const struct pci_device_id iwl_hw_card_ids[] = { {IWL_PCI_DEVICE(0x2526, 0x1610, iwl9270_2ac_cfg)}, {IWL_PCI_DEVICE(0x2526, 0x2030, iwl9560_2ac_160_cfg_soc)}, {IWL_PCI_DEVICE(0x2526, 0x2034, iwl9560_2ac_160_cfg_soc)}, - {IWL_PCI_DEVICE(0x2526, 0x4010, iwl9260_2ac_160_cfg)}, - {IWL_PCI_DEVICE(0x2526, 0x4018, iwl9260_2ac_160_cfg)}, - {IWL_PCI_DEVICE(0x2526, 0x401C, iwl9260_2ac_160_cfg)}, - {IWL_PCI_DEVICE(0x2526, 0x4030, iwl9560_2ac_160_cfg)}, {IWL_PCI_DEVICE(0x2526, 0x4034, iwl9560_2ac_160_cfg_soc)}, {IWL_PCI_DEVICE(0x2526, 0x40A4, iwl9462_2ac_cfg_soc)}, {IWL_PCI_DEVICE(0x2526, 0x4234, iwl9560_2ac_cfg_soc)}, {IWL_PCI_DEVICE(0x2526, 0x42A4, iwl9462_2ac_cfg_soc)}, - {IWL_PCI_DEVICE(0x2526, 0x6010, iwl9260_2ac_160_cfg)}, - {IWL_PCI_DEVICE(0x2526, 0x6014, iwl9260_2ac_160_cfg)}, - {IWL_PCI_DEVICE(0x2526, 0x8014, iwl9260_2ac_160_cfg)}, - {IWL_PCI_DEVICE(0x2526, 0x8010, iwl9260_2ac_160_cfg)}, - {IWL_PCI_DEVICE(0x2526, 0xA014, iwl9260_2ac_160_cfg)}, - {IWL_PCI_DEVICE(0x2526, 0xE010, iwl9260_2ac_160_cfg)}, - {IWL_PCI_DEVICE(0x2526, 0xE014, iwl9260_2ac_160_cfg)}, + {IWL_PCI_DEVICE(0x2526, PCI_ANY_ID, iwl9000_trans_cfg)}, + {IWL_PCI_DEVICE(0x271B, 0x0010, iwl9160_2ac_cfg)}, {IWL_PCI_DEVICE(0x271B, 0x0014, iwl9160_2ac_cfg)}, {IWL_PCI_DEVICE(0x271B, 0x0210, iwl9160_2ac_cfg)}, @@ -910,7 +895,6 @@ static const struct pci_device_id iwl_hw_card_ids[] = { {IWL_PCI_DEVICE(0x2720, 0x0074, iwl_ax201_cfg_qu_hr)}, {IWL_PCI_DEVICE(0x2720, 0x0078, iwl_ax201_cfg_qu_hr)}, {IWL_PCI_DEVICE(0x2720, 0x007C, iwl_ax201_cfg_qu_hr)}, - {IWL_PCI_DEVICE(0x2720, 0x0090, iwl22000_2ac_cfg_hr_cdb)}, {IWL_PCI_DEVICE(0x2720, 0x0244, iwl_ax101_cfg_qu_hr)}, {IWL_PCI_DEVICE(0x2720, 0x0310, iwl_ax201_cfg_qu_hr)}, {IWL_PCI_DEVICE(0x2720, 0x0A10, iwl_ax201_cfg_qu_hr)}, @@ -987,28 +971,74 @@ static const struct pci_device_id iwl_hw_card_ids[] = { }; MODULE_DEVICE_TABLE(pci, iwl_hw_card_ids); +#define IWL_DEV_INFO(_device, _subdevice, _cfg, _name) \ + { .device = (_device), .subdevice = (_subdevice), .cfg = &(_cfg), \ + .name = _name } + +static const struct iwl_dev_info iwl_dev_info_table[] = { +#if IS_ENABLED(CONFIG_IWLMVM) + IWL_DEV_INFO(0x2526, 0x0010, iwl9260_2ac_160_cfg, iwl9260_160_name), + IWL_DEV_INFO(0x2526, 0x0014, iwl9260_2ac_160_cfg, iwl9260_160_name), + IWL_DEV_INFO(0x2526, 0x0018, iwl9260_2ac_160_cfg, iwl9260_160_name), + IWL_DEV_INFO(0x2526, 0x001C, iwl9260_2ac_160_cfg, iwl9260_160_name), + IWL_DEV_INFO(0x2526, 0x6010, iwl9260_2ac_160_cfg, iwl9260_160_name), + IWL_DEV_INFO(0x2526, 0x6014, iwl9260_2ac_160_cfg, iwl9260_160_name), + IWL_DEV_INFO(0x2526, 0x8014, iwl9260_2ac_160_cfg, iwl9260_160_name), + IWL_DEV_INFO(0x2526, 0x8010, iwl9260_2ac_160_cfg, iwl9260_160_name), + IWL_DEV_INFO(0x2526, 0xA014, iwl9260_2ac_160_cfg, iwl9260_160_name), + IWL_DEV_INFO(0x2526, 0xE010, iwl9260_2ac_160_cfg, iwl9260_160_name), + IWL_DEV_INFO(0x2526, 0xE014, iwl9260_2ac_160_cfg, iwl9260_160_name), + + IWL_DEV_INFO(0x2526, 0x0030, iwl9560_2ac_160_cfg, iwl9560_160_name), + IWL_DEV_INFO(0x2526, 0x0038, iwl9560_2ac_160_cfg, iwl9560_160_name), + IWL_DEV_INFO(0x2526, 0x003C, iwl9560_2ac_160_cfg, iwl9560_160_name), + IWL_DEV_INFO(0x2526, 0x4030, iwl9560_2ac_160_cfg, iwl9560_160_name), +#endif /* CONFIG_IWLMVM */ +}; + /* PCI registers */ #define PCI_CFG_RETRY_TIMEOUT 0x041 static int iwl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) { - const struct iwl_cfg *cfg = (struct iwl_cfg *)(ent->driver_data); + const struct iwl_cfg_trans_params *trans = + (struct iwl_cfg_trans_params *)(ent->driver_data); const struct iwl_cfg *cfg_7265d __maybe_unused = NULL; struct iwl_trans *iwl_trans; + struct iwl_trans_pcie *trans_pcie; unsigned long flags; - int ret; + int i, ret; + /* + * This is needed for backwards compatibility with the old + * tables, so we don't need to change all the config structs + * at the same time. The cfg is used to compare with the old + * full cfg structs. + */ + const struct iwl_cfg *cfg = (struct iwl_cfg *)(ent->driver_data); - iwl_trans = iwl_trans_pcie_alloc(pdev, ent, &cfg->trans); + /* make sure trans is the first element in iwl_cfg */ + BUILD_BUG_ON(offsetof(struct iwl_cfg, trans)); + + iwl_trans = iwl_trans_pcie_alloc(pdev, ent, trans); if (IS_ERR(iwl_trans)) return PTR_ERR(iwl_trans); - /* the trans_cfg should never change, so set it now */ - iwl_trans->trans_cfg = &cfg->trans; + trans_pcie = IWL_TRANS_GET_PCIE_TRANS(iwl_trans); - if (WARN_ONCE(!iwl_trans->trans_cfg->csr, - "CSR addresses aren't configured\n")) { - ret = -EINVAL; - goto out_free_trans; + /* the trans_cfg should never change, so set it now */ + iwl_trans->trans_cfg = trans; + + for (i = 0; i < ARRAY_SIZE(iwl_dev_info_table); i++) { + const struct iwl_dev_info *dev_info = &iwl_dev_info_table[i]; + + if ((dev_info->device == IWL_CFG_ANY || + dev_info->device == pdev->device) && + (dev_info->subdevice == IWL_CFG_ANY || + dev_info->subdevice == pdev->subsystem_device)) { + iwl_trans->cfg = dev_info->cfg; + iwl_trans->name = dev_info->name; + goto found; + } } #if IS_ENABLED(CONFIG_IWLMVM) @@ -1027,22 +1057,22 @@ static int iwl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) cfg_7265d = &iwl7265d_n_cfg; if (cfg_7265d && (iwl_trans->hw_rev & CSR_HW_REV_TYPE_MSK) == CSR_HW_REV_TYPE_7265D) - cfg = cfg_7265d; + iwl_trans->cfg = cfg_7265d; iwl_trans->hw_rf_id = iwl_read32(iwl_trans, CSR_HW_RF_ID); if (cfg == &iwlax210_2ax_cfg_so_hr_a0) { if (iwl_trans->hw_rev == CSR_HW_REV_TYPE_TY) { - cfg = &iwlax210_2ax_cfg_ty_gf_a0; + iwl_trans->cfg = &iwlax210_2ax_cfg_ty_gf_a0; } else if (CSR_HW_RF_ID_TYPE_CHIP_ID(iwl_trans->hw_rf_id) == CSR_HW_RF_ID_TYPE_CHIP_ID(CSR_HW_RF_ID_TYPE_JF)) { - cfg = &iwlax210_2ax_cfg_so_jf_a0; + iwl_trans->cfg = &iwlax210_2ax_cfg_so_jf_a0; } else if (CSR_HW_RF_ID_TYPE_CHIP_ID(iwl_trans->hw_rf_id) == CSR_HW_RF_ID_TYPE_CHIP_ID(CSR_HW_RF_ID_TYPE_GF)) { - cfg = &iwlax211_2ax_cfg_so_gf_a0; + iwl_trans->cfg = &iwlax211_2ax_cfg_so_gf_a0; } else if (CSR_HW_RF_ID_TYPE_CHIP_ID(iwl_trans->hw_rf_id) == CSR_HW_RF_ID_TYPE_CHIP_ID(CSR_HW_RF_ID_TYPE_GF4)) { - cfg = &iwlax411_2ax_cfg_so_gf4_a0; + iwl_trans->cfg = &iwlax411_2ax_cfg_so_gf4_a0; } } else if (cfg == &iwl_ax101_cfg_qu_hr) { if ((CSR_HW_RF_ID_TYPE_CHIP_ID(iwl_trans->hw_rf_id) == @@ -1050,13 +1080,17 @@ static int iwl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) iwl_trans->hw_rev == CSR_HW_REV_TYPE_QNJ_B0) || (CSR_HW_RF_ID_TYPE_CHIP_ID(iwl_trans->hw_rf_id) == CSR_HW_RF_ID_TYPE_CHIP_ID(CSR_HW_RF_ID_TYPE_HR1))) { - cfg = &iwl22000_2ax_cfg_qnj_hr_b0; + iwl_trans->cfg = &iwl22000_2ax_cfg_qnj_hr_b0; + } else if (CSR_HW_RF_ID_TYPE_CHIP_ID(iwl_trans->hw_rf_id) == + CSR_HW_RF_ID_TYPE_CHIP_ID(CSR_HW_RF_ID_TYPE_HR) && + iwl_trans->hw_rev == CSR_HW_REV_TYPE_QUZ) { + iwl_trans->cfg = &iwl_ax101_cfg_quz_hr; } else if (CSR_HW_RF_ID_TYPE_CHIP_ID(iwl_trans->hw_rf_id) == CSR_HW_RF_ID_TYPE_CHIP_ID(CSR_HW_RF_ID_TYPE_HR)) { - cfg = &iwl_ax101_cfg_qu_hr; + iwl_trans->cfg = &iwl_ax101_cfg_qu_hr; } else if (CSR_HW_RF_ID_TYPE_CHIP_ID(iwl_trans->hw_rf_id) == CSR_HW_RF_ID_TYPE_CHIP_ID(CSR_HW_RF_ID_TYPE_JF)) { - cfg = &iwl22000_2ax_cfg_jf; + iwl_trans->cfg = &iwl22000_2ax_cfg_jf; } else if (CSR_HW_RF_ID_TYPE_CHIP_ID(iwl_trans->hw_rf_id) == CSR_HW_RF_ID_TYPE_CHIP_ID(CSR_HW_RF_ID_TYPE_HRCDB)) { IWL_ERR(iwl_trans, "RF ID HRCDB is not supported\n"); @@ -1073,19 +1107,11 @@ static int iwl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) hw_status = iwl_read_prph(iwl_trans, UMAG_GEN_HW_STATUS); if (CSR_HW_RF_STEP(iwl_trans->hw_rf_id) == SILICON_B_STEP) - /* - * b step fw is the same for physical card and fpga - */ - cfg = &iwl22000_2ax_cfg_qnj_hr_b0; + iwl_trans->cfg = &iwl22000_2ax_cfg_qnj_hr_b0; else if ((hw_status & UMAG_GEN_HW_IS_FPGA) && - CSR_HW_RF_STEP(iwl_trans->hw_rf_id) == SILICON_A_STEP) { - cfg = &iwl22000_2ax_cfg_qnj_hr_a0_f0; - } else { - /* - * a step no FPGA - */ - cfg = &iwl22000_2ac_cfg_hr; - } + CSR_HW_RF_STEP(iwl_trans->hw_rf_id) == + SILICON_A_STEP) + iwl_trans->cfg = &iwl22000_2ax_cfg_qnj_hr_a0_f0; } /* @@ -1096,17 +1122,21 @@ static int iwl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) */ if (iwl_trans->hw_rev == CSR_HW_REV_TYPE_QU_C0) { if (cfg == &iwl_ax101_cfg_qu_hr) - cfg = &iwl_ax101_cfg_qu_c0_hr_b0; + iwl_trans->cfg = &iwl_ax101_cfg_qu_c0_hr_b0; else if (cfg == &iwl_ax201_cfg_qu_hr) - cfg = &iwl_ax201_cfg_qu_c0_hr_b0; + iwl_trans->cfg = &iwl_ax201_cfg_qu_c0_hr_b0; else if (cfg == &iwl9461_2ac_cfg_qu_b0_jf_b0) - cfg = &iwl9461_2ac_cfg_qu_c0_jf_b0; + iwl_trans->cfg = &iwl9461_2ac_cfg_qu_c0_jf_b0; else if (cfg == &iwl9462_2ac_cfg_qu_b0_jf_b0) - cfg = &iwl9462_2ac_cfg_qu_c0_jf_b0; + iwl_trans->cfg = &iwl9462_2ac_cfg_qu_c0_jf_b0; else if (cfg == &iwl9560_2ac_cfg_qu_b0_jf_b0) - cfg = &iwl9560_2ac_cfg_qu_c0_jf_b0; + iwl_trans->cfg = &iwl9560_2ac_cfg_qu_c0_jf_b0; else if (cfg == &iwl9560_2ac_160_cfg_qu_b0_jf_b0) - cfg = &iwl9560_2ac_160_cfg_qu_c0_jf_b0; + iwl_trans->cfg = &iwl9560_2ac_160_cfg_qu_c0_jf_b0; + else if (cfg == &killer1650s_2ax_cfg_qu_b0_hr_b0) + iwl_trans->cfg = &killer1650s_2ax_cfg_qu_c0_hr_b0; + else if (cfg == &killer1650i_2ax_cfg_qu_b0_hr_b0) + iwl_trans->cfg = &killer1650i_2ax_cfg_qu_c0_hr_b0; } /* same thing for QuZ... */ @@ -1126,8 +1156,27 @@ static int iwl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) } #endif - /* now set the real cfg we decided to use */ - iwl_trans->cfg = cfg; + /* + * If we didn't set the cfg yet, assume the trans is actually + * a full cfg from the old tables. + */ + if (!iwl_trans->cfg) + iwl_trans->cfg = cfg; + +found: + /* if we don't have a name yet, copy name from the old cfg */ + if (!iwl_trans->name) + iwl_trans->name = iwl_trans->cfg->name; + + if (iwl_trans->trans_cfg->mq_rx_supported) { + if (WARN_ON(!iwl_trans->cfg->num_rbds)) { + ret = -EINVAL; + goto out_free_trans; + } + trans_pcie->num_rx_bufs = iwl_trans->cfg->num_rbds; + } else { + trans_pcie->num_rx_bufs = RX_QUEUE_SIZE; + } if (iwl_trans->trans_cfg->device_family >= IWL_DEVICE_FAMILY_8000 && iwl_trans_grab_nic_access(iwl_trans, &flags)) { diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h index a091690f6c79..72f144c3a46e 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h +++ b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h @@ -106,6 +106,8 @@ struct iwl_host_cmd; * @page: driver's pointer to the rxb page * @invalid: rxb is in driver ownership - not owned by HW * @vid: index of this rxb in the global table + * @offset: indicates which offset of the page (in bytes) + * this buffer uses (if multiple RBs fit into one page) */ struct iwl_rx_mem_buffer { dma_addr_t page_dma; @@ -113,6 +115,7 @@ struct iwl_rx_mem_buffer { u16 vid; bool invalid; struct list_head list; + u32 offset; }; /** @@ -305,7 +308,7 @@ struct iwl_cmd_meta { #define IWL_FIRST_TB_SIZE_ALIGN ALIGN(IWL_FIRST_TB_SIZE, 64) struct iwl_pcie_txq_entry { - struct iwl_device_cmd *cmd; + void *cmd; struct sk_buff *skb; /* buffer to free after command completes */ const void *free_buf; @@ -491,6 +494,7 @@ struct cont_rec { * @sw_csum_tx: if true, then the transport will compute the csum of the TXed * frame. * @rx_page_order: page order for receive buffer size + * @rx_buf_bytes: RX buffer (RB) size in bytes * @reg_lock: protect hw register access * @mutex: to protect stop_device / start_fw / start_hw * @cmd_in_flight: true when we have a host command in flight @@ -510,11 +514,16 @@ struct cont_rec { * @in_rescan: true if we have triggered a device rescan * @base_rb_stts: base virtual address of receive buffer status for all queues * @base_rb_stts_dma: base physical address of receive buffer status + * @supported_dma_mask: DMA mask to validate the actual address against, + * will be DMA_BIT_MASK(11) or DMA_BIT_MASK(12) depending on the device + * @alloc_page_lock: spinlock for the page allocator + * @alloc_page: allocated page to still use parts of + * @alloc_page_used: how much of the allocated page was already used (bytes) */ struct iwl_trans_pcie { struct iwl_rxq *rxq; - struct iwl_rx_mem_buffer rx_pool[RX_POOL_SIZE]; - struct iwl_rx_mem_buffer *global_table[RX_POOL_SIZE]; + struct iwl_rx_mem_buffer *rx_pool; + struct iwl_rx_mem_buffer **global_table; struct iwl_rb_allocator rba; union { struct iwl_context_info *ctxt_info; @@ -573,6 +582,7 @@ struct iwl_trans_pcie { u8 no_reclaim_cmds[MAX_NO_RECLAIM_CMDS]; u8 max_tbs; u16 tfd_size; + u16 num_rx_bufs; enum iwl_amsdu_size rx_buf_size; bool bc_table_dword; @@ -580,6 +590,13 @@ struct iwl_trans_pcie { bool sw_csum_tx; bool pcie_dbg_dumped_once; u32 rx_page_order; + u32 rx_buf_bytes; + u32 supported_dma_mask; + + /* allocator lock for the two values below */ + spinlock_t alloc_page_lock; + struct page *alloc_page; + u32 alloc_page_used; /*protect hw register */ spinlock_t reg_lock; @@ -672,6 +689,16 @@ void iwl_pcie_disable_ict(struct iwl_trans *trans); /***************************************************** * TX / HCMD ******************************************************/ +/* + * We need this inline in case dma_addr_t is only 32-bits - since the + * hardware is always 64-bit, the issue can still occur in that case, + * so use u64 for 'phys' here to force the addition in 64-bit. + */ +static inline bool iwl_pcie_crosses_4g_boundary(u64 phys, u16 len) +{ + return upper_32_bits(phys) != upper_32_bits(phys + len); +} + int iwl_pcie_tx_init(struct iwl_trans *trans); int iwl_pcie_gen2_tx_init(struct iwl_trans *trans, int txq_id, int queue_size); @@ -688,7 +715,7 @@ void iwl_trans_pcie_txq_set_shared_mode(struct iwl_trans *trans, u32 txq_id, void iwl_trans_pcie_log_scd_error(struct iwl_trans *trans, struct iwl_txq *txq); int iwl_trans_pcie_tx(struct iwl_trans *trans, struct sk_buff *skb, - struct iwl_device_cmd *dev_cmd, int txq_id); + struct iwl_device_tx_cmd *dev_cmd, int txq_id); void iwl_pcie_txq_check_wrptrs(struct iwl_trans *trans); int iwl_trans_pcie_send_hcmd(struct iwl_trans *trans, struct iwl_host_cmd *cmd); void iwl_pcie_cmdq_reclaim(struct iwl_trans *trans, int txq_id, int idx); @@ -1082,7 +1109,8 @@ void iwl_pcie_apply_destination(struct iwl_trans *trans); void iwl_pcie_free_tso_page(struct iwl_trans_pcie *trans_pcie, struct sk_buff *skb); #ifdef CONFIG_INET -struct iwl_tso_hdr_page *get_page_hdr(struct iwl_trans *trans, size_t len); +struct iwl_tso_hdr_page *get_page_hdr(struct iwl_trans *trans, size_t len, + struct sk_buff *skb); #endif /* common functions that are used by gen3 transport */ @@ -1106,7 +1134,7 @@ int iwl_trans_pcie_dyn_txq_alloc(struct iwl_trans *trans, unsigned int timeout); void iwl_trans_pcie_dyn_txq_free(struct iwl_trans *trans, int queue); int iwl_trans_pcie_gen2_tx(struct iwl_trans *trans, struct sk_buff *skb, - struct iwl_device_cmd *dev_cmd, int txq_id); + struct iwl_device_tx_cmd *dev_cmd, int txq_id); int iwl_trans_pcie_gen2_send_hcmd(struct iwl_trans *trans, struct iwl_host_cmd *cmd); void iwl_trans_pcie_gen2_stop_device(struct iwl_trans *trans); diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c index 452da44a21e0..427fcea5cb2d 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c @@ -240,7 +240,7 @@ static void iwl_pcie_rxq_inc_wr_ptr(struct iwl_trans *trans, IWL_DEBUG_INFO(trans, "Rx queue requesting wakeup, GP1 = 0x%x\n", reg); iwl_set_bit(trans, CSR_GP_CNTRL, - BIT(trans->trans_cfg->csr->flag_mac_access_req)); + CSR_GP_CNTRL_REG_FLAG_MAC_ACCESS_REQ); rxq->need_update = true; return; } @@ -298,6 +298,7 @@ static void iwl_pcie_restock_bd(struct iwl_trans *trans, static void iwl_pcie_rxmq_restock(struct iwl_trans *trans, struct iwl_rxq *rxq) { + struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); struct iwl_rx_mem_buffer *rxb; /* @@ -318,8 +319,8 @@ static void iwl_pcie_rxmq_restock(struct iwl_trans *trans, list); list_del(&rxb->list); rxb->invalid = false; - /* 12 first bits are expected to be empty */ - WARN_ON(rxb->page_dma & DMA_BIT_MASK(12)); + /* some low bits are expected to be unset (depending on hw) */ + WARN_ON(rxb->page_dma & trans_pcie->supported_dma_mask); /* Point to Rx buffer via next RBD in circular buffer */ iwl_pcie_restock_bd(trans, rxq, rxb); rxq->write = (rxq->write + 1) & (rxq->queue_size - 1); @@ -412,15 +413,34 @@ void iwl_pcie_rxq_restock(struct iwl_trans *trans, struct iwl_rxq *rxq) * */ static struct page *iwl_pcie_rx_alloc_page(struct iwl_trans *trans, - gfp_t priority) + u32 *offset, gfp_t priority) { struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); + unsigned int rbsize = iwl_trans_get_rb_size(trans_pcie->rx_buf_size); + unsigned int allocsize = PAGE_SIZE << trans_pcie->rx_page_order; struct page *page; gfp_t gfp_mask = priority; if (trans_pcie->rx_page_order > 0) gfp_mask |= __GFP_COMP; + if (trans_pcie->alloc_page) { + spin_lock_bh(&trans_pcie->alloc_page_lock); + /* recheck */ + if (trans_pcie->alloc_page) { + *offset = trans_pcie->alloc_page_used; + page = trans_pcie->alloc_page; + trans_pcie->alloc_page_used += rbsize; + if (trans_pcie->alloc_page_used >= allocsize) + trans_pcie->alloc_page = NULL; + else + get_page(page); + spin_unlock_bh(&trans_pcie->alloc_page_lock); + return page; + } + spin_unlock_bh(&trans_pcie->alloc_page_lock); + } + /* Alloc a new receive buffer */ page = alloc_pages(gfp_mask, trans_pcie->rx_page_order); if (!page) { @@ -436,6 +456,18 @@ static struct page *iwl_pcie_rx_alloc_page(struct iwl_trans *trans, "Failed to alloc_pages\n"); return NULL; } + + if (2 * rbsize <= allocsize) { + spin_lock_bh(&trans_pcie->alloc_page_lock); + if (!trans_pcie->alloc_page) { + get_page(page); + trans_pcie->alloc_page = page; + trans_pcie->alloc_page_used = rbsize; + } + spin_unlock_bh(&trans_pcie->alloc_page_lock); + } + + *offset = 0; return page; } @@ -456,6 +488,8 @@ void iwl_pcie_rxq_alloc_rbs(struct iwl_trans *trans, gfp_t priority, struct page *page; while (1) { + unsigned int offset; + spin_lock(&rxq->lock); if (list_empty(&rxq->rx_used)) { spin_unlock(&rxq->lock); @@ -463,8 +497,7 @@ void iwl_pcie_rxq_alloc_rbs(struct iwl_trans *trans, gfp_t priority, } spin_unlock(&rxq->lock); - /* Alloc a new receive buffer */ - page = iwl_pcie_rx_alloc_page(trans, priority); + page = iwl_pcie_rx_alloc_page(trans, &offset, priority); if (!page) return; @@ -482,10 +515,11 @@ void iwl_pcie_rxq_alloc_rbs(struct iwl_trans *trans, gfp_t priority, BUG_ON(rxb->page); rxb->page = page; + rxb->offset = offset; /* Get physical address of the RB */ rxb->page_dma = - dma_map_page(trans->dev, page, 0, - PAGE_SIZE << trans_pcie->rx_page_order, + dma_map_page(trans->dev, page, rxb->offset, + trans_pcie->rx_buf_bytes, DMA_FROM_DEVICE); if (dma_mapping_error(trans->dev, rxb->page_dma)) { rxb->page = NULL; @@ -510,12 +544,11 @@ void iwl_pcie_free_rbs_pool(struct iwl_trans *trans) struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); int i; - for (i = 0; i < RX_POOL_SIZE; i++) { + for (i = 0; i < RX_POOL_SIZE(trans_pcie->num_rx_bufs); i++) { if (!trans_pcie->rx_pool[i].page) continue; dma_unmap_page(trans->dev, trans_pcie->rx_pool[i].page_dma, - PAGE_SIZE << trans_pcie->rx_page_order, - DMA_FROM_DEVICE); + trans_pcie->rx_buf_bytes, DMA_FROM_DEVICE); __free_pages(trans_pcie->rx_pool[i].page, trans_pcie->rx_page_order); trans_pcie->rx_pool[i].page = NULL; @@ -568,15 +601,17 @@ static void iwl_pcie_rx_allocator(struct iwl_trans *trans) BUG_ON(rxb->page); /* Alloc a new receive buffer */ - page = iwl_pcie_rx_alloc_page(trans, gfp_mask); + page = iwl_pcie_rx_alloc_page(trans, &rxb->offset, + gfp_mask); if (!page) continue; rxb->page = page; /* Get physical address of the RB */ - rxb->page_dma = dma_map_page(trans->dev, page, 0, - PAGE_SIZE << trans_pcie->rx_page_order, - DMA_FROM_DEVICE); + rxb->page_dma = dma_map_page(trans->dev, page, + rxb->offset, + trans_pcie->rx_buf_bytes, + DMA_FROM_DEVICE); if (dma_mapping_error(trans->dev, rxb->page_dma)) { rxb->page = NULL; __free_pages(page, trans_pcie->rx_page_order); @@ -738,7 +773,7 @@ static int iwl_pcie_alloc_rxq_dma(struct iwl_trans *trans, spin_lock_init(&rxq->lock); if (trans->trans_cfg->mq_rx_supported) - rxq->queue_size = MQ_RX_TABLE_SIZE; + rxq->queue_size = trans->cfg->num_rbds; else rxq->queue_size = RX_QUEUE_SIZE; @@ -807,8 +842,18 @@ static int iwl_pcie_rx_alloc(struct iwl_trans *trans) trans_pcie->rxq = kcalloc(trans->num_rx_queues, sizeof(struct iwl_rxq), GFP_KERNEL); - if (!trans_pcie->rxq) - return -ENOMEM; + trans_pcie->rx_pool = kcalloc(RX_POOL_SIZE(trans_pcie->num_rx_bufs), + sizeof(trans_pcie->rx_pool[0]), + GFP_KERNEL); + trans_pcie->global_table = + kcalloc(RX_POOL_SIZE(trans_pcie->num_rx_bufs), + sizeof(trans_pcie->global_table[0]), + GFP_KERNEL); + if (!trans_pcie->rxq || !trans_pcie->rx_pool || + !trans_pcie->global_table) { + ret = -ENOMEM; + goto err; + } spin_lock_init(&rba->lock); @@ -845,6 +890,8 @@ err: trans_pcie->base_rb_stts = NULL; trans_pcie->base_rb_stts_dma = 0; } + kfree(trans_pcie->rx_pool); + kfree(trans_pcie->global_table); kfree(trans_pcie->rxq); return ret; @@ -1081,12 +1128,11 @@ static int _iwl_pcie_rx_init(struct iwl_trans *trans) /* move the pool to the default queue and allocator ownerships */ queue_size = trans->trans_cfg->mq_rx_supported ? - MQ_RX_NUM_RBDS : RX_QUEUE_SIZE; + trans_pcie->num_rx_bufs - 1 : RX_QUEUE_SIZE; allocator_pool_size = trans->num_rx_queues * (RX_CLAIM_REQ_ALLOC - RX_POST_REQ_ALLOC); num_alloc = queue_size + allocator_pool_size; - BUILD_BUG_ON(ARRAY_SIZE(trans_pcie->global_table) != - ARRAY_SIZE(trans_pcie->rx_pool)); + for (i = 0; i < num_alloc; i++) { struct iwl_rx_mem_buffer *rxb = &trans_pcie->rx_pool[i]; @@ -1177,7 +1223,12 @@ void iwl_pcie_rx_free(struct iwl_trans *trans) if (rxq->napi.poll) netif_napi_del(&rxq->napi); } + kfree(trans_pcie->rx_pool); + kfree(trans_pcie->global_table); kfree(trans_pcie->rxq); + + if (trans_pcie->alloc_page) + __free_pages(trans_pcie->alloc_page, trans_pcie->rx_page_order); } static void iwl_pcie_rx_move_to_allocator(struct iwl_rxq *rxq, @@ -1235,7 +1286,7 @@ static void iwl_pcie_rx_handle_rb(struct iwl_trans *trans, struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); struct iwl_txq *txq = trans_pcie->txq[trans_pcie->cmd_queue]; bool page_stolen = false; - int max_len = PAGE_SIZE << trans_pcie->rx_page_order; + int max_len = trans_pcie->rx_buf_bytes; u32 offset = 0; if (WARN_ON(!rxb)) @@ -1249,7 +1300,7 @@ static void iwl_pcie_rx_handle_rb(struct iwl_trans *trans, bool reclaim; int index, cmd_index, len; struct iwl_rx_cmd_buffer rxcb = { - ._offset = offset, + ._offset = rxb->offset + offset, ._rx_page_order = trans_pcie->rx_page_order, ._page = rxb->page, ._page_stolen = false, @@ -1355,8 +1406,8 @@ static void iwl_pcie_rx_handle_rb(struct iwl_trans *trans, * rx_free list for reuse later. */ if (rxb->page != NULL) { rxb->page_dma = - dma_map_page(trans->dev, rxb->page, 0, - PAGE_SIZE << trans_pcie->rx_page_order, + dma_map_page(trans->dev, rxb->page, rxb->offset, + trans_pcie->rx_buf_bytes, DMA_FROM_DEVICE); if (dma_mapping_error(trans->dev, rxb->page_dma)) { /* @@ -1390,13 +1441,12 @@ static struct iwl_rx_mem_buffer *iwl_pcie_get_rxb(struct iwl_trans *trans, return rxb; } - /* used_bd is a 32/16 bit but only 12 are used to retrieve the vid */ if (trans->trans_cfg->device_family >= IWL_DEVICE_FAMILY_AX210) - vid = le16_to_cpu(rxq->cd[i].rbid) & 0x0FFF; + vid = le16_to_cpu(rxq->cd[i].rbid); else - vid = le32_to_cpu(rxq->bd_32[i]) & 0x0FFF; + vid = le32_to_cpu(rxq->bd_32[i]) & 0x0FFF; /* 12-bit VID */ - if (!vid || vid > ARRAY_SIZE(trans_pcie->global_table)) + if (!vid || vid > RX_POOL_SIZE(trans_pcie->num_rx_bufs)) goto out_err; rxb = trans_pcie->global_table[vid - 1]; @@ -1529,13 +1579,13 @@ out: napi = &rxq->napi; if (napi->poll) { + napi_gro_flush(napi, false); + if (napi->rx_count) { netif_receive_skb_list(&napi->rx_list); INIT_LIST_HEAD(&napi->rx_list); napi->rx_count = 0; } - - napi_gro_flush(napi, false); } iwl_pcie_rxq_restock(trans, rxq); diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/trans-gen2.c b/drivers/net/wireless/intel/iwlwifi/pcie/trans-gen2.c index 0d8b2a8ffa5d..19a2c72081ab 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/trans-gen2.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans-gen2.c @@ -132,8 +132,7 @@ static void iwl_pcie_gen2_apm_stop(struct iwl_trans *trans, bool op_mode_leave) * Clear "initialization complete" bit to move adapter from * D0A* (powered-up Active) --> D0U* (Uninitialized) state. */ - iwl_clear_bit(trans, CSR_GP_CNTRL, - BIT(trans->trans_cfg->csr->flag_init_done)); + iwl_clear_bit(trans, CSR_GP_CNTRL, CSR_GP_CNTRL_REG_FLAG_INIT_DONE); } void _iwl_trans_pcie_gen2_stop_device(struct iwl_trans *trans) @@ -175,7 +174,7 @@ void _iwl_trans_pcie_gen2_stop_device(struct iwl_trans *trans) /* Make sure (redundant) we've released our request to stay awake */ iwl_clear_bit(trans, CSR_GP_CNTRL, - BIT(trans->trans_cfg->csr->flag_mac_access_req)); + CSR_GP_CNTRL_REG_FLAG_MAC_ACCESS_REQ); /* Stop the device, and put it in low power state */ iwl_pcie_gen2_apm_stop(trans, false); diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c index a0677131634d..38d8fe21690a 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c @@ -79,6 +79,7 @@ #include "iwl-agn-hw.h" #include "fw/error-dump.h" #include "fw/dbg.h" +#include "fw/api/tx.h" #include "internal.h" #include "iwl-fh.h" @@ -183,8 +184,7 @@ out: static void iwl_trans_pcie_sw_reset(struct iwl_trans *trans) { /* Reset entire device - do controller reset (results in SHRD_HW_RST) */ - iwl_set_bit(trans, trans->trans_cfg->csr->addr_sw_reset, - BIT(trans->trans_cfg->csr->flag_sw_reset)); + iwl_set_bit(trans, CSR_RESET, CSR_RESET_REG_FLAG_SW_RESET); usleep_range(5000, 6000); } @@ -301,18 +301,13 @@ void iwl_pcie_apm_config(struct iwl_trans *trans) u16 cap; /* - * HW bug W/A for instability in PCIe bus L0S->L1 transition. - * Check if BIOS (or OS) enabled L1-ASPM on this device. - * If so (likely), disable L0S, so device moves directly L0->L1; - * costs negligible amount of power savings. - * If not (unlikely), enable L0S, so there is at least some - * power savings, even without L1. + * L0S states have been found to be unstable with our devices + * and in newer hardware they are not officially supported at + * all, so we must always set the L0S_DISABLED bit. */ + iwl_set_bit(trans, CSR_GIO_REG, CSR_GIO_REG_VAL_L0S_DISABLED); + pcie_capability_read_word(trans_pcie->pci_dev, PCI_EXP_LNKCTL, &lctl); - if (lctl & PCI_EXP_LNKCTL_ASPM_L1) - iwl_set_bit(trans, CSR_GIO_REG, CSR_GIO_REG_VAL_L0S_ENABLED); - else - iwl_clear_bit(trans, CSR_GIO_REG, CSR_GIO_REG_VAL_L0S_ENABLED); trans->pm_support = !(lctl & PCI_EXP_LNKCTL_ASPM_L0S); pcie_capability_read_word(trans_pcie->pci_dev, PCI_EXP_DEVCTL2, &cap); @@ -487,8 +482,7 @@ static void iwl_pcie_apm_lp_xtal_enable(struct iwl_trans *trans) * Clear "initialization complete" bit to move adapter from * D0A* (powered-up Active) --> D0U* (Uninitialized) state. */ - iwl_clear_bit(trans, CSR_GP_CNTRL, - BIT(trans->trans_cfg->csr->flag_init_done)); + iwl_clear_bit(trans, CSR_GP_CNTRL, CSR_GP_CNTRL_REG_FLAG_INIT_DONE); /* Activates XTAL resources monitor */ __iwl_trans_pcie_set_bit(trans, CSR_MONITOR_CFG_REG, @@ -510,12 +504,11 @@ void iwl_pcie_apm_stop_master(struct iwl_trans *trans) int ret; /* stop device's busmaster DMA activity */ - iwl_set_bit(trans, trans->trans_cfg->csr->addr_sw_reset, - BIT(trans->trans_cfg->csr->flag_stop_master)); + iwl_set_bit(trans, CSR_RESET, CSR_RESET_REG_FLAG_STOP_MASTER); - ret = iwl_poll_bit(trans, trans->trans_cfg->csr->addr_sw_reset, - BIT(trans->trans_cfg->csr->flag_master_dis), - BIT(trans->trans_cfg->csr->flag_master_dis), 100); + ret = iwl_poll_bit(trans, CSR_RESET, + CSR_RESET_REG_FLAG_MASTER_DISABLED, + CSR_RESET_REG_FLAG_MASTER_DISABLED, 100); if (ret < 0) IWL_WARN(trans, "Master Disable Timed Out, 100 usec\n"); @@ -564,8 +557,7 @@ static void iwl_pcie_apm_stop(struct iwl_trans *trans, bool op_mode_leave) * Clear "initialization complete" bit to move adapter from * D0A* (powered-up Active) --> D0U* (Uninitialized) state. */ - iwl_clear_bit(trans, CSR_GP_CNTRL, - BIT(trans->trans_cfg->csr->flag_init_done)); + iwl_clear_bit(trans, CSR_GP_CNTRL, CSR_GP_CNTRL_REG_FLAG_INIT_DONE); } static int iwl_pcie_nic_init(struct iwl_trans *trans) @@ -1270,7 +1262,7 @@ static void _iwl_trans_pcie_stop_device(struct iwl_trans *trans) /* Make sure (redundant) we've released our request to stay awake */ iwl_clear_bit(trans, CSR_GP_CNTRL, - BIT(trans->trans_cfg->csr->flag_mac_access_req)); + CSR_GP_CNTRL_REG_FLAG_MAC_ACCESS_REQ); /* Stop the device, and put it in low power state */ iwl_pcie_apm_stop(trans, false); @@ -1494,9 +1486,8 @@ void iwl_pcie_d3_complete_suspend(struct iwl_trans *trans, iwl_pcie_synchronize_irqs(trans); iwl_clear_bit(trans, CSR_GP_CNTRL, - BIT(trans->trans_cfg->csr->flag_mac_access_req)); - iwl_clear_bit(trans, CSR_GP_CNTRL, - BIT(trans->trans_cfg->csr->flag_init_done)); + CSR_GP_CNTRL_REG_FLAG_MAC_ACCESS_REQ); + iwl_clear_bit(trans, CSR_GP_CNTRL, CSR_GP_CNTRL_REG_FLAG_INIT_DONE); if (reset) { /* @@ -1561,7 +1552,7 @@ static int iwl_trans_pcie_d3_resume(struct iwl_trans *trans, } iwl_set_bit(trans, CSR_GP_CNTRL, - BIT(trans->trans_cfg->csr->flag_mac_access_req)); + CSR_GP_CNTRL_REG_FLAG_MAC_ACCESS_REQ); ret = iwl_finish_nic_init(trans, trans->trans_cfg); if (ret) @@ -1583,7 +1574,7 @@ static int iwl_trans_pcie_d3_resume(struct iwl_trans *trans, if (!reset) { iwl_clear_bit(trans, CSR_GP_CNTRL, - BIT(trans->trans_cfg->csr->flag_mac_access_req)); + CSR_GP_CNTRL_REG_FLAG_MAC_ACCESS_REQ); } else { iwl_trans_pcie_tx_reset(trans); @@ -1945,6 +1936,11 @@ static void iwl_trans_pcie_configure(struct iwl_trans *trans, trans_pcie->rx_buf_size = trans_cfg->rx_buf_size; trans_pcie->rx_page_order = iwl_trans_get_rb_size_order(trans_pcie->rx_buf_size); + trans_pcie->rx_buf_bytes = + iwl_trans_get_rb_size(trans_pcie->rx_buf_size); + trans_pcie->supported_dma_mask = DMA_BIT_MASK(12); + if (trans->trans_cfg->device_family >= IWL_DEVICE_FAMILY_AX210) + trans_pcie->supported_dma_mask = DMA_BIT_MASK(11); trans_pcie->bc_table_dword = trans_cfg->bc_table_dword; trans_pcie->scd_set_active = trans_cfg->scd_set_active; @@ -2054,7 +2050,7 @@ static bool iwl_trans_pcie_grab_nic_access(struct iwl_trans *trans, /* this bit wakes up the NIC */ __iwl_trans_pcie_set_bit(trans, CSR_GP_CNTRL, - BIT(trans->trans_cfg->csr->flag_mac_access_req)); + CSR_GP_CNTRL_REG_FLAG_MAC_ACCESS_REQ); if (trans->trans_cfg->device_family >= IWL_DEVICE_FAMILY_8000) udelay(2); @@ -2079,8 +2075,8 @@ static bool iwl_trans_pcie_grab_nic_access(struct iwl_trans *trans, * and do not save/restore SRAM when power cycling. */ ret = iwl_poll_bit(trans, CSR_GP_CNTRL, - BIT(trans->trans_cfg->csr->flag_val_mac_access_en), - (BIT(trans->trans_cfg->csr->flag_mac_clock_ready) | + CSR_GP_CNTRL_REG_VAL_MAC_ACCESS_EN, + (CSR_GP_CNTRL_REG_FLAG_MAC_CLOCK_READY | CSR_GP_CNTRL_REG_FLAG_GOING_TO_SLEEP), 15000); if (unlikely(ret < 0)) { u32 cntrl = iwl_read32(trans, CSR_GP_CNTRL); @@ -2162,7 +2158,7 @@ static void iwl_trans_pcie_release_nic_access(struct iwl_trans *trans, goto out; __iwl_trans_pcie_clear_bit(trans, CSR_GP_CNTRL, - BIT(trans->trans_cfg->csr->flag_mac_access_req)); + CSR_GP_CNTRL_REG_FLAG_MAC_ACCESS_REQ); /* * Above we read the CSR_GP_CNTRL register, which will flush * any previous writes, but we need the write that clears the @@ -2963,7 +2959,7 @@ static u32 iwl_trans_pcie_dump_rbs(struct iwl_trans *trans, int allocated_rb_nums) { struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); - int max_len = PAGE_SIZE << trans_pcie->rx_page_order; + int max_len = trans_pcie->rx_buf_bytes; /* Dump RBs is supported only for pre-9000 devices (1 queue) */ struct iwl_rxq *rxq = &trans_pcie->rxq[0]; u32 i, r, j, rb_len = 0; @@ -2989,9 +2985,9 @@ static u32 iwl_trans_pcie_dump_rbs(struct iwl_trans *trans, rb->index = cpu_to_le32(i); memcpy(rb->data, page_address(rxb->page), max_len); /* remap the page for the free benefit */ - rxb->page_dma = dma_map_page(trans->dev, rxb->page, 0, - max_len, - DMA_FROM_DEVICE); + rxb->page_dma = dma_map_page(trans->dev, rxb->page, + rxb->offset, max_len, + DMA_FROM_DEVICE); *data = iwl_fw_error_next_data(*data); } @@ -3460,19 +3456,34 @@ struct iwl_trans *iwl_trans_pcie_alloc(struct pci_dev *pdev, { struct iwl_trans_pcie *trans_pcie; struct iwl_trans *trans; - int ret, addr_size; + int ret, addr_size, txcmd_size, txcmd_align; + const struct iwl_trans_ops *ops = &trans_ops_pcie_gen2; + + if (!cfg_trans->gen2) { + ops = &trans_ops_pcie; + txcmd_size = sizeof(struct iwl_tx_cmd); + txcmd_align = sizeof(void *); + } else if (cfg_trans->device_family < IWL_DEVICE_FAMILY_AX210) { + txcmd_size = sizeof(struct iwl_tx_cmd_gen2); + txcmd_align = 64; + } else { + txcmd_size = sizeof(struct iwl_tx_cmd_gen3); + txcmd_align = 128; + } + + txcmd_size += sizeof(struct iwl_cmd_header); + txcmd_size += 36; /* biggest possible 802.11 header */ + + /* Ensure device TX cmd cannot reach/cross a page boundary in gen2 */ + if (WARN_ON(cfg_trans->gen2 && txcmd_size >= txcmd_align)) + return ERR_PTR(-EINVAL); ret = pcim_enable_device(pdev); if (ret) return ERR_PTR(ret); - if (cfg_trans->gen2) - trans = iwl_trans_alloc(sizeof(struct iwl_trans_pcie), - &pdev->dev, &trans_ops_pcie_gen2); - else - trans = iwl_trans_alloc(sizeof(struct iwl_trans_pcie), - &pdev->dev, &trans_ops_pcie); - + trans = iwl_trans_alloc(sizeof(struct iwl_trans_pcie), &pdev->dev, ops, + txcmd_size, txcmd_align); if (!trans) return ERR_PTR(-ENOMEM); @@ -3482,6 +3493,7 @@ struct iwl_trans *iwl_trans_pcie_alloc(struct pci_dev *pdev, trans_pcie->opmode_down = true; spin_lock_init(&trans_pcie->irq_lock); spin_lock_init(&trans_pcie->reg_lock); + spin_lock_init(&trans_pcie->alloc_page_lock); mutex_init(&trans_pcie->mutex); init_waitqueue_head(&trans_pcie->ucode_write_waitq); diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c b/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c index 8ca0250de99e..86fc00167817 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c @@ -221,6 +221,17 @@ static int iwl_pcie_gen2_set_tb(struct iwl_trans *trans, int idx = iwl_pcie_gen2_get_num_tbs(trans, tfd); struct iwl_tfh_tb *tb; + /* + * Only WARN here so we know about the issue, but we mess up our + * unmap path because not every place currently checks for errors + * returned from this function - it can only return an error if + * there's no more space, and so when we know there is enough we + * don't always check ... + */ + WARN(iwl_pcie_crosses_4g_boundary(addr, len), + "possible DMA problem with iova:0x%llx, len:%d\n", + (unsigned long long)addr, len); + if (WARN_ON(idx >= IWL_TFH_NUM_TBS)) return -EINVAL; tb = &tfd->tbs[idx]; @@ -240,13 +251,114 @@ static int iwl_pcie_gen2_set_tb(struct iwl_trans *trans, return idx; } +static struct page *get_workaround_page(struct iwl_trans *trans, + struct sk_buff *skb) +{ + struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); + struct page **page_ptr; + struct page *ret; + + page_ptr = (void *)((u8 *)skb->cb + trans_pcie->page_offs); + + ret = alloc_page(GFP_ATOMIC); + if (!ret) + return NULL; + + /* set the chaining pointer to the previous page if there */ + *(void **)(page_address(ret) + PAGE_SIZE - sizeof(void *)) = *page_ptr; + *page_ptr = ret; + + return ret; +} + +/* + * Add a TB and if needed apply the FH HW bug workaround; + * meta != NULL indicates that it's a page mapping and we + * need to dma_unmap_page() and set the meta->tbs bit in + * this case. + */ +static int iwl_pcie_gen2_set_tb_with_wa(struct iwl_trans *trans, + struct sk_buff *skb, + struct iwl_tfh_tfd *tfd, + dma_addr_t phys, void *virt, + u16 len, struct iwl_cmd_meta *meta) +{ + dma_addr_t oldphys = phys; + struct page *page; + int ret; + + if (unlikely(dma_mapping_error(trans->dev, phys))) + return -ENOMEM; + + if (likely(!iwl_pcie_crosses_4g_boundary(phys, len))) { + ret = iwl_pcie_gen2_set_tb(trans, tfd, phys, len); + + if (ret < 0) + goto unmap; + + if (meta) + meta->tbs |= BIT(ret); + + ret = 0; + goto trace; + } + + /* + * Work around a hardware bug. If (as expressed in the + * condition above) the TB ends on a 32-bit boundary, + * then the next TB may be accessed with the wrong + * address. + * To work around it, copy the data elsewhere and make + * a new mapping for it so the device will not fail. + */ + + if (WARN_ON(len > PAGE_SIZE - sizeof(void *))) { + ret = -ENOBUFS; + goto unmap; + } + + page = get_workaround_page(trans, skb); + if (!page) { + ret = -ENOMEM; + goto unmap; + } + + memcpy(page_address(page), virt, len); + + phys = dma_map_single(trans->dev, page_address(page), len, + DMA_TO_DEVICE); + if (unlikely(dma_mapping_error(trans->dev, phys))) + return -ENOMEM; + ret = iwl_pcie_gen2_set_tb(trans, tfd, phys, len); + if (ret < 0) { + /* unmap the new allocation as single */ + oldphys = phys; + meta = NULL; + goto unmap; + } + IWL_WARN(trans, + "TB bug workaround: copied %d bytes from 0x%llx to 0x%llx\n", + len, (unsigned long long)oldphys, (unsigned long long)phys); + + ret = 0; +unmap: + if (meta) + dma_unmap_page(trans->dev, oldphys, len, DMA_TO_DEVICE); + else + dma_unmap_single(trans->dev, oldphys, len, DMA_TO_DEVICE); +trace: + trace_iwlwifi_dev_tx_tb(trans->dev, skb, virt, phys, len); + + return ret; +} + static int iwl_pcie_gen2_build_amsdu(struct iwl_trans *trans, struct sk_buff *skb, struct iwl_tfh_tfd *tfd, int start_len, - u8 hdr_len, struct iwl_device_cmd *dev_cmd) + u8 hdr_len, + struct iwl_device_tx_cmd *dev_cmd) { #ifdef CONFIG_INET - struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); struct iwl_tx_cmd_gen2 *tx_cmd = (void *)dev_cmd->payload; struct ieee80211_hdr *hdr = (void *)skb->data; unsigned int snap_ip_tcp_hdrlen, ip_hdrlen, total_len, hdr_room; @@ -254,7 +366,6 @@ static int iwl_pcie_gen2_build_amsdu(struct iwl_trans *trans, u16 length, amsdu_pad; u8 *start_hdr; struct iwl_tso_hdr_page *hdr_page; - struct page **page_ptr; struct tso_t tso; trace_iwlwifi_dev_tx(trans->dev, skb, tfd, sizeof(*tfd), @@ -270,14 +381,11 @@ static int iwl_pcie_gen2_build_amsdu(struct iwl_trans *trans, (3 + snap_ip_tcp_hdrlen + sizeof(struct ethhdr)); /* Our device supports 9 segments at most, it will fit in 1 page */ - hdr_page = get_page_hdr(trans, hdr_room); + hdr_page = get_page_hdr(trans, hdr_room, skb); if (!hdr_page) return -ENOMEM; - get_page(hdr_page->page); start_hdr = hdr_page->pos; - page_ptr = (void *)((u8 *)skb->cb + trans_pcie->page_offs); - *page_ptr = hdr_page->page; /* * Pull the ieee80211 header to be able to use TSO core, @@ -332,6 +440,11 @@ static int iwl_pcie_gen2_build_amsdu(struct iwl_trans *trans, dev_kfree_skb(csum_skb); goto out_err; } + /* + * No need for _with_wa, this is from the TSO page and + * we leave some space at the end of it so can't hit + * the buggy scenario. + */ iwl_pcie_gen2_set_tb(trans, tfd, tb_phys, tb_len); trace_iwlwifi_dev_tx_tb(trans->dev, skb, start_hdr, tb_phys, tb_len); @@ -343,16 +456,18 @@ static int iwl_pcie_gen2_build_amsdu(struct iwl_trans *trans, /* put the payload */ while (data_left) { + int ret; + tb_len = min_t(unsigned int, tso.size, data_left); tb_phys = dma_map_single(trans->dev, tso.data, tb_len, DMA_TO_DEVICE); - if (unlikely(dma_mapping_error(trans->dev, tb_phys))) { + ret = iwl_pcie_gen2_set_tb_with_wa(trans, skb, tfd, + tb_phys, tso.data, + tb_len, NULL); + if (ret) { dev_kfree_skb(csum_skb); goto out_err; } - iwl_pcie_gen2_set_tb(trans, tfd, tb_phys, tb_len); - trace_iwlwifi_dev_tx_tb(trans->dev, skb, tso.data, - tb_phys, tb_len); data_left -= tb_len; tso_build_data(skb, &tso, tb_len); @@ -372,7 +487,7 @@ out_err: static struct iwl_tfh_tfd *iwl_pcie_gen2_build_tx_amsdu(struct iwl_trans *trans, struct iwl_txq *txq, - struct iwl_device_cmd *dev_cmd, + struct iwl_device_tx_cmd *dev_cmd, struct sk_buff *skb, struct iwl_cmd_meta *out_meta, int hdr_len, @@ -386,6 +501,11 @@ iwl_tfh_tfd *iwl_pcie_gen2_build_tx_amsdu(struct iwl_trans *trans, tb_phys = iwl_pcie_get_first_tb_dma(txq, idx); + /* + * No need for _with_wa, the first TB allocation is aligned up + * to a 64-byte boundary and thus can't be at the end or cross + * a page boundary (much less a 2^32 boundary). + */ iwl_pcie_gen2_set_tb(trans, tfd, tb_phys, IWL_FIRST_TB_SIZE); /* @@ -404,6 +524,10 @@ iwl_tfh_tfd *iwl_pcie_gen2_build_tx_amsdu(struct iwl_trans *trans, tb_phys = dma_map_single(trans->dev, tb1_addr, len, DMA_TO_DEVICE); if (unlikely(dma_mapping_error(trans->dev, tb_phys))) goto out_err; + /* + * No need for _with_wa(), we ensure (via alignment) that the data + * here can never cross or end at a page boundary. + */ iwl_pcie_gen2_set_tb(trans, tfd, tb_phys, len); if (iwl_pcie_gen2_build_amsdu(trans, skb, tfd, @@ -430,24 +554,19 @@ static int iwl_pcie_gen2_tx_add_frags(struct iwl_trans *trans, for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) { const skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; dma_addr_t tb_phys; - int tb_idx; + unsigned int fragsz = skb_frag_size(frag); + int ret; - if (!skb_frag_size(frag)) + if (!fragsz) continue; tb_phys = skb_frag_dma_map(trans->dev, frag, 0, - skb_frag_size(frag), DMA_TO_DEVICE); - - if (unlikely(dma_mapping_error(trans->dev, tb_phys))) - return -ENOMEM; - tb_idx = iwl_pcie_gen2_set_tb(trans, tfd, tb_phys, - skb_frag_size(frag)); - trace_iwlwifi_dev_tx_tb(trans->dev, skb, skb_frag_address(frag), - tb_phys, skb_frag_size(frag)); - if (tb_idx < 0) - return tb_idx; - - out_meta->tbs |= BIT(tb_idx); + fragsz, DMA_TO_DEVICE); + ret = iwl_pcie_gen2_set_tb_with_wa(trans, skb, tfd, tb_phys, + skb_frag_address(frag), + fragsz, out_meta); + if (ret) + return ret; } return 0; @@ -456,7 +575,7 @@ static int iwl_pcie_gen2_tx_add_frags(struct iwl_trans *trans, static struct iwl_tfh_tfd *iwl_pcie_gen2_build_tx(struct iwl_trans *trans, struct iwl_txq *txq, - struct iwl_device_cmd *dev_cmd, + struct iwl_device_tx_cmd *dev_cmd, struct sk_buff *skb, struct iwl_cmd_meta *out_meta, int hdr_len, @@ -475,6 +594,11 @@ iwl_tfh_tfd *iwl_pcie_gen2_build_tx(struct iwl_trans *trans, /* The first TB points to bi-directional DMA data */ memcpy(&txq->first_tb_bufs[idx], dev_cmd, IWL_FIRST_TB_SIZE); + /* + * No need for _with_wa, the first TB allocation is aligned up + * to a 64-byte boundary and thus can't be at the end or cross + * a page boundary (much less a 2^32 boundary). + */ iwl_pcie_gen2_set_tb(trans, tfd, tb_phys, IWL_FIRST_TB_SIZE); /* @@ -496,6 +620,10 @@ iwl_tfh_tfd *iwl_pcie_gen2_build_tx(struct iwl_trans *trans, tb_phys = dma_map_single(trans->dev, tb1_addr, tb1_len, DMA_TO_DEVICE); if (unlikely(dma_mapping_error(trans->dev, tb_phys))) goto out_err; + /* + * No need for _with_wa(), we ensure (via alignment) that the data + * here can never cross or end at a page boundary. + */ iwl_pcie_gen2_set_tb(trans, tfd, tb_phys, tb1_len); trace_iwlwifi_dev_tx(trans->dev, skb, tfd, sizeof(*tfd), &dev_cmd->hdr, IWL_FIRST_TB_SIZE + tb1_len, hdr_len); @@ -504,26 +632,30 @@ iwl_tfh_tfd *iwl_pcie_gen2_build_tx(struct iwl_trans *trans, tb2_len = skb_headlen(skb) - hdr_len; if (tb2_len > 0) { + int ret; + tb_phys = dma_map_single(trans->dev, skb->data + hdr_len, tb2_len, DMA_TO_DEVICE); - if (unlikely(dma_mapping_error(trans->dev, tb_phys))) + ret = iwl_pcie_gen2_set_tb_with_wa(trans, skb, tfd, tb_phys, + skb->data + hdr_len, tb2_len, + NULL); + if (ret) goto out_err; - iwl_pcie_gen2_set_tb(trans, tfd, tb_phys, tb2_len); - trace_iwlwifi_dev_tx_tb(trans->dev, skb, skb->data + hdr_len, - tb_phys, tb2_len); } if (iwl_pcie_gen2_tx_add_frags(trans, skb, tfd, out_meta)) goto out_err; skb_walk_frags(skb, frag) { + int ret; + tb_phys = dma_map_single(trans->dev, frag->data, skb_headlen(frag), DMA_TO_DEVICE); - if (unlikely(dma_mapping_error(trans->dev, tb_phys))) + ret = iwl_pcie_gen2_set_tb_with_wa(trans, skb, tfd, tb_phys, + frag->data, + skb_headlen(frag), NULL); + if (ret) goto out_err; - iwl_pcie_gen2_set_tb(trans, tfd, tb_phys, skb_headlen(frag)); - trace_iwlwifi_dev_tx_tb(trans->dev, skb, frag->data, - tb_phys, skb_headlen(frag)); if (iwl_pcie_gen2_tx_add_frags(trans, frag, tfd, out_meta)) goto out_err; } @@ -538,7 +670,7 @@ out_err: static struct iwl_tfh_tfd *iwl_pcie_gen2_build_tfd(struct iwl_trans *trans, struct iwl_txq *txq, - struct iwl_device_cmd *dev_cmd, + struct iwl_device_tx_cmd *dev_cmd, struct sk_buff *skb, struct iwl_cmd_meta *out_meta) { @@ -578,7 +710,7 @@ struct iwl_tfh_tfd *iwl_pcie_gen2_build_tfd(struct iwl_trans *trans, } int iwl_trans_pcie_gen2_tx(struct iwl_trans *trans, struct sk_buff *skb, - struct iwl_device_cmd *dev_cmd, int txq_id) + struct iwl_device_tx_cmd *dev_cmd, int txq_id) { struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); struct iwl_cmd_meta *out_meta; @@ -587,6 +719,10 @@ int iwl_trans_pcie_gen2_tx(struct iwl_trans *trans, struct sk_buff *skb, int idx; void *tfd; + if (WARN_ONCE(txq_id >= IWL_MAX_TVQM_QUEUES, + "queue %d out of range", txq_id)) + return -EINVAL; + if (WARN_ONCE(!test_bit(txq_id, trans_pcie->queue_used), "TX on unused queue %d\n", txq_id)) return -EINVAL; @@ -603,7 +739,7 @@ int iwl_trans_pcie_gen2_tx(struct iwl_trans *trans, struct sk_buff *skb, /* don't put the packet on the ring, if there is no room */ if (unlikely(iwl_queue_space(trans, txq) < 3)) { - struct iwl_device_cmd **dev_cmd_ptr; + struct iwl_device_tx_cmd **dev_cmd_ptr; dev_cmd_ptr = (void *)((u8 *)skb->cb + trans_pcie->dev_cmd_offs); @@ -1101,9 +1237,15 @@ void iwl_pcie_gen2_txq_free_memory(struct iwl_trans *trans, static void iwl_pcie_gen2_txq_free(struct iwl_trans *trans, int txq_id) { struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); - struct iwl_txq *txq = trans_pcie->txq[txq_id]; + struct iwl_txq *txq; int i; + if (WARN_ONCE(txq_id >= IWL_MAX_TVQM_QUEUES, + "queue %d out of range", txq_id)) + return; + + txq = trans_pcie->txq[txq_id]; + if (WARN_ON(!txq)) return; @@ -1258,6 +1400,10 @@ void iwl_trans_pcie_dyn_txq_free(struct iwl_trans *trans, int queue) { struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); + if (WARN(queue >= IWL_MAX_TVQM_QUEUES, + "queue %d out of range", queue)) + return; + /* * Upon HW Rfkill - we stop the device, and then stop the queues * in the op_mode. Just for the sake of the simplicity of the op_mode, diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c index f21f16ab2a97..2f69a6469fe7 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c @@ -213,8 +213,8 @@ static void iwl_pcie_txq_update_byte_cnt_tbl(struct iwl_trans *trans, u8 sec_ctl = 0; u16 len = byte_cnt + IWL_TX_CRC_SIZE + IWL_TX_DELIMITER_SIZE; __le16 bc_ent; - struct iwl_tx_cmd *tx_cmd = - (void *)txq->entries[txq->write_ptr].cmd->payload; + struct iwl_device_tx_cmd *dev_cmd = txq->entries[txq->write_ptr].cmd; + struct iwl_tx_cmd *tx_cmd = (void *)dev_cmd->payload; u8 sta_id = tx_cmd->sta_id; scd_bc_tbl = trans_pcie->scd_bc_tbls.addr; @@ -257,8 +257,8 @@ static void iwl_pcie_txq_inval_byte_cnt_tbl(struct iwl_trans *trans, int read_ptr = txq->read_ptr; u8 sta_id = 0; __le16 bc_ent; - struct iwl_tx_cmd *tx_cmd = - (void *)txq->entries[read_ptr].cmd->payload; + struct iwl_device_tx_cmd *dev_cmd = txq->entries[read_ptr].cmd; + struct iwl_tx_cmd *tx_cmd = (void *)dev_cmd->payload; WARN_ON(read_ptr >= TFD_QUEUE_SIZE_MAX); @@ -306,7 +306,7 @@ static void iwl_pcie_txq_inc_wr_ptr(struct iwl_trans *trans, IWL_DEBUG_INFO(trans, "Tx queue %d requesting wakeup, GP1 = 0x%x\n", txq_id, reg); iwl_set_bit(trans, CSR_GP_CNTRL, - BIT(trans->trans_cfg->csr->flag_mac_access_req)); + CSR_GP_CNTRL_REG_FLAG_MAC_ACCESS_REQ); txq->need_update = true; return; } @@ -624,12 +624,18 @@ void iwl_pcie_free_tso_page(struct iwl_trans_pcie *trans_pcie, struct sk_buff *skb) { struct page **page_ptr; + struct page *next; page_ptr = (void *)((u8 *)skb->cb + trans_pcie->page_offs); + next = *page_ptr; + *page_ptr = NULL; - if (*page_ptr) { - __free_page(*page_ptr); - *page_ptr = NULL; + while (next) { + struct page *tmp = next; + + next = *(void **)(page_address(next) + PAGE_SIZE - + sizeof(void *)); + __free_page(tmp); } } @@ -646,7 +652,7 @@ static void iwl_pcie_clear_cmd_in_flight(struct iwl_trans *trans) trans_pcie->cmd_hold_nic_awake = false; __iwl_trans_pcie_clear_bit(trans, CSR_GP_CNTRL, - BIT(trans->trans_cfg->csr->flag_mac_access_req)); + CSR_GP_CNTRL_REG_FLAG_MAC_ACCESS_REQ); } /* @@ -1196,7 +1202,7 @@ void iwl_trans_pcie_reclaim(struct iwl_trans *trans, int txq_id, int ssn, while (!skb_queue_empty(&overflow_skbs)) { struct sk_buff *skb = __skb_dequeue(&overflow_skbs); - struct iwl_device_cmd *dev_cmd_ptr; + struct iwl_device_tx_cmd *dev_cmd_ptr; dev_cmd_ptr = *(void **)((u8 *)skb->cb + trans_pcie->dev_cmd_offs); @@ -1255,16 +1261,16 @@ static int iwl_pcie_set_cmd_in_flight(struct iwl_trans *trans, if (trans->trans_cfg->base_params->apmg_wake_up_wa && !trans_pcie->cmd_hold_nic_awake) { __iwl_trans_pcie_set_bit(trans, CSR_GP_CNTRL, - BIT(trans->trans_cfg->csr->flag_mac_access_req)); + CSR_GP_CNTRL_REG_FLAG_MAC_ACCESS_REQ); ret = iwl_poll_bit(trans, CSR_GP_CNTRL, - BIT(trans->trans_cfg->csr->flag_val_mac_access_en), - (BIT(trans->trans_cfg->csr->flag_mac_clock_ready) | + CSR_GP_CNTRL_REG_VAL_MAC_ACCESS_EN, + (CSR_GP_CNTRL_REG_FLAG_MAC_CLOCK_READY | CSR_GP_CNTRL_REG_FLAG_GOING_TO_SLEEP), 15000); if (ret < 0) { __iwl_trans_pcie_clear_bit(trans, CSR_GP_CNTRL, - BIT(trans->trans_cfg->csr->flag_mac_access_req)); + CSR_GP_CNTRL_REG_FLAG_MAC_ACCESS_REQ); IWL_ERR(trans, "Failed to wake NIC for hcmd\n"); return -EIO; } @@ -2052,17 +2058,34 @@ static int iwl_fill_data_tbs(struct iwl_trans *trans, struct sk_buff *skb, } #ifdef CONFIG_INET -struct iwl_tso_hdr_page *get_page_hdr(struct iwl_trans *trans, size_t len) +struct iwl_tso_hdr_page *get_page_hdr(struct iwl_trans *trans, size_t len, + struct sk_buff *skb) { struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); struct iwl_tso_hdr_page *p = this_cpu_ptr(trans_pcie->tso_hdr_page); + struct page **page_ptr; + + page_ptr = (void *)((u8 *)skb->cb + trans_pcie->page_offs); + + if (WARN_ON(*page_ptr)) + return NULL; if (!p->page) goto alloc; - /* enough room on this page */ - if (p->pos + len < (u8 *)page_address(p->page) + PAGE_SIZE) - return p; + /* + * Check if there's enough room on this page + * + * Note that we put a page chaining pointer *last* in the + * page - we need it somewhere, and if it's there then we + * avoid DMA mapping the last bits of the page which may + * trigger the 32-bit boundary hardware bug. + * + * (see also get_workaround_page() in tx-gen2.c) + */ + if (p->pos + len < (u8 *)page_address(p->page) + PAGE_SIZE - + sizeof(void *)) + goto out; /* We don't have enough room on this page, get a new one. */ __free_page(p->page); @@ -2072,6 +2095,11 @@ alloc: if (!p->page) return NULL; p->pos = page_address(p->page); + /* set the chaining pointer to NULL */ + *(void **)(page_address(p->page) + PAGE_SIZE - sizeof(void *)) = NULL; +out: + *page_ptr = p->page; + get_page(p->page); return p; } @@ -2097,7 +2125,8 @@ static void iwl_compute_pseudo_hdr_csum(void *iph, struct tcphdr *tcph, static int iwl_fill_data_tbs_amsdu(struct iwl_trans *trans, struct sk_buff *skb, struct iwl_txq *txq, u8 hdr_len, struct iwl_cmd_meta *out_meta, - struct iwl_device_cmd *dev_cmd, u16 tb1_len) + struct iwl_device_tx_cmd *dev_cmd, + u16 tb1_len) { struct iwl_tx_cmd *tx_cmd = (void *)dev_cmd->payload; struct iwl_trans_pcie *trans_pcie = txq->trans_pcie; @@ -2107,7 +2136,6 @@ static int iwl_fill_data_tbs_amsdu(struct iwl_trans *trans, struct sk_buff *skb, u16 length, iv_len, amsdu_pad; u8 *start_hdr; struct iwl_tso_hdr_page *hdr_page; - struct page **page_ptr; struct tso_t tso; /* if the packet is protected, then it must be CCMP or GCMP */ @@ -2130,14 +2158,11 @@ static int iwl_fill_data_tbs_amsdu(struct iwl_trans *trans, struct sk_buff *skb, (3 + snap_ip_tcp_hdrlen + sizeof(struct ethhdr)) + iv_len; /* Our device supports 9 segments at most, it will fit in 1 page */ - hdr_page = get_page_hdr(trans, hdr_room); + hdr_page = get_page_hdr(trans, hdr_room, skb); if (!hdr_page) return -ENOMEM; - get_page(hdr_page->page); start_hdr = hdr_page->pos; - page_ptr = (void *)((u8 *)skb->cb + trans_pcie->page_offs); - *page_ptr = hdr_page->page; memcpy(hdr_page->pos, skb->data + hdr_len, iv_len); hdr_page->pos += iv_len; @@ -2279,7 +2304,8 @@ static int iwl_fill_data_tbs_amsdu(struct iwl_trans *trans, struct sk_buff *skb, static int iwl_fill_data_tbs_amsdu(struct iwl_trans *trans, struct sk_buff *skb, struct iwl_txq *txq, u8 hdr_len, struct iwl_cmd_meta *out_meta, - struct iwl_device_cmd *dev_cmd, u16 tb1_len) + struct iwl_device_tx_cmd *dev_cmd, + u16 tb1_len) { /* No A-MSDU without CONFIG_INET */ WARN_ON(1); @@ -2289,7 +2315,7 @@ static int iwl_fill_data_tbs_amsdu(struct iwl_trans *trans, struct sk_buff *skb, #endif /* CONFIG_INET */ int iwl_trans_pcie_tx(struct iwl_trans *trans, struct sk_buff *skb, - struct iwl_device_cmd *dev_cmd, int txq_id) + struct iwl_device_tx_cmd *dev_cmd, int txq_id) { struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); struct ieee80211_hdr *hdr; @@ -2346,7 +2372,7 @@ int iwl_trans_pcie_tx(struct iwl_trans *trans, struct sk_buff *skb, /* don't put the packet on the ring, if there is no room */ if (unlikely(iwl_queue_space(trans, txq) < 3)) { - struct iwl_device_cmd **dev_cmd_ptr; + struct iwl_device_tx_cmd **dev_cmd_ptr; dev_cmd_ptr = (void *)((u8 *)skb->cb + trans_pcie->dev_cmd_offs); diff --git a/drivers/net/wireless/intersil/hostap/hostap_ap.c b/drivers/net/wireless/intersil/hostap/hostap_ap.c index 0094b1d2b577..3ec46f48cfde 100644 --- a/drivers/net/wireless/intersil/hostap/hostap_ap.c +++ b/drivers/net/wireless/intersil/hostap/hostap_ap.c @@ -2508,7 +2508,7 @@ static int prism2_hostapd_add_sta(struct ap_data *ap, sta->supported_rates[0] = 2; if (sta->tx_supp_rates & WLAN_RATE_2M) sta->supported_rates[1] = 4; - if (sta->tx_supp_rates & WLAN_RATE_5M5) + if (sta->tx_supp_rates & WLAN_RATE_5M5) sta->supported_rates[2] = 11; if (sta->tx_supp_rates & WLAN_RATE_11M) sta->supported_rates[3] = 22; diff --git a/drivers/net/wireless/marvell/libertas/cfg.c b/drivers/net/wireless/marvell/libertas/cfg.c index 57edfada0665..c9401c121a14 100644 --- a/drivers/net/wireless/marvell/libertas/cfg.c +++ b/drivers/net/wireless/marvell/libertas/cfg.c @@ -273,6 +273,10 @@ add_ie_rates(u8 *tlv, const u8 *ie, int *nrates) int hw, ap, ap_max = ie[1]; u8 hw_rate; + if (ap_max > MAX_RATES) { + lbs_deb_assoc("invalid rates\n"); + return tlv; + } /* Advance past IE header */ ie += 2; @@ -1717,6 +1721,9 @@ static int lbs_ibss_join_existing(struct lbs_private *priv, struct cmd_ds_802_11_ad_hoc_join cmd; u8 preamble = RADIO_PREAMBLE_SHORT; int ret = 0; + int hw, i; + u8 rates_max; + u8 *rates; /* TODO: set preamble based on scan result */ ret = lbs_set_radio(priv, preamble, 1); @@ -1775,9 +1782,12 @@ static int lbs_ibss_join_existing(struct lbs_private *priv, if (!rates_eid) { lbs_add_rates(cmd.bss.rates); } else { - int hw, i; - u8 rates_max = rates_eid[1]; - u8 *rates = cmd.bss.rates; + rates_max = rates_eid[1]; + if (rates_max > MAX_RATES) { + lbs_deb_join("invalid rates"); + goto out; + } + rates = cmd.bss.rates; for (hw = 0; hw < ARRAY_SIZE(lbs_rates); hw++) { u8 hw_rate = lbs_rates[hw].bitrate / 5; for (i = 0; i < rates_max; i++) { diff --git a/drivers/net/wireless/marvell/mwifiex/tdls.c b/drivers/net/wireless/marvell/mwifiex/tdls.c index 7caf1d26124a..f8f282ce39bd 100644 --- a/drivers/net/wireless/marvell/mwifiex/tdls.c +++ b/drivers/net/wireless/marvell/mwifiex/tdls.c @@ -894,7 +894,7 @@ void mwifiex_process_tdls_action_frame(struct mwifiex_private *priv, u8 *peer, *pos, *end; u8 i, action, basic; u16 cap = 0; - int ie_len = 0; + int ies_len = 0; if (len < (sizeof(struct ethhdr) + 3)) return; @@ -916,7 +916,7 @@ void mwifiex_process_tdls_action_frame(struct mwifiex_private *priv, pos = buf + sizeof(struct ethhdr) + 4; /* payload 1+ category 1 + action 1 + dialog 1 */ cap = get_unaligned_le16(pos); - ie_len = len - sizeof(struct ethhdr) - TDLS_REQ_FIX_LEN; + ies_len = len - sizeof(struct ethhdr) - TDLS_REQ_FIX_LEN; pos += 2; break; @@ -926,7 +926,7 @@ void mwifiex_process_tdls_action_frame(struct mwifiex_private *priv, /* payload 1+ category 1 + action 1 + dialog 1 + status code 2*/ pos = buf + sizeof(struct ethhdr) + 6; cap = get_unaligned_le16(pos); - ie_len = len - sizeof(struct ethhdr) - TDLS_RESP_FIX_LEN; + ies_len = len - sizeof(struct ethhdr) - TDLS_RESP_FIX_LEN; pos += 2; break; @@ -934,7 +934,7 @@ void mwifiex_process_tdls_action_frame(struct mwifiex_private *priv, if (len < (sizeof(struct ethhdr) + TDLS_CONFIRM_FIX_LEN)) return; pos = buf + sizeof(struct ethhdr) + TDLS_CONFIRM_FIX_LEN; - ie_len = len - sizeof(struct ethhdr) - TDLS_CONFIRM_FIX_LEN; + ies_len = len - sizeof(struct ethhdr) - TDLS_CONFIRM_FIX_LEN; break; default: mwifiex_dbg(priv->adapter, ERROR, "Unknown TDLS frame type.\n"); @@ -947,33 +947,33 @@ void mwifiex_process_tdls_action_frame(struct mwifiex_private *priv, sta_ptr->tdls_cap.capab = cpu_to_le16(cap); - for (end = pos + ie_len; pos + 1 < end; pos += 2 + pos[1]) { - if (pos + 2 + pos[1] > end) + for (end = pos + ies_len; pos + 1 < end; pos += 2 + pos[1]) { + u8 ie_len = pos[1]; + + if (pos + 2 + ie_len > end) break; switch (*pos) { case WLAN_EID_SUPP_RATES: - if (pos[1] > 32) + if (ie_len > sizeof(sta_ptr->tdls_cap.rates)) return; - sta_ptr->tdls_cap.rates_len = pos[1]; - for (i = 0; i < pos[1]; i++) + sta_ptr->tdls_cap.rates_len = ie_len; + for (i = 0; i < ie_len; i++) sta_ptr->tdls_cap.rates[i] = pos[i + 2]; break; case WLAN_EID_EXT_SUPP_RATES: - if (pos[1] > 32) + if (ie_len > sizeof(sta_ptr->tdls_cap.rates)) return; basic = sta_ptr->tdls_cap.rates_len; - if (pos[1] > 32 - basic) + if (ie_len > sizeof(sta_ptr->tdls_cap.rates) - basic) return; - for (i = 0; i < pos[1]; i++) + for (i = 0; i < ie_len; i++) sta_ptr->tdls_cap.rates[basic + i] = pos[i + 2]; - sta_ptr->tdls_cap.rates_len += pos[1]; + sta_ptr->tdls_cap.rates_len += ie_len; break; case WLAN_EID_HT_CAPABILITY: - if (pos > end - sizeof(struct ieee80211_ht_cap) - 2) - return; - if (pos[1] != sizeof(struct ieee80211_ht_cap)) + if (ie_len != sizeof(struct ieee80211_ht_cap)) return; /* copy the ie's value into ht_capb*/ memcpy((u8 *)&sta_ptr->tdls_cap.ht_capb, pos + 2, @@ -981,59 +981,45 @@ void mwifiex_process_tdls_action_frame(struct mwifiex_private *priv, sta_ptr->is_11n_enabled = 1; break; case WLAN_EID_HT_OPERATION: - if (pos > end - - sizeof(struct ieee80211_ht_operation) - 2) - return; - if (pos[1] != sizeof(struct ieee80211_ht_operation)) + if (ie_len != sizeof(struct ieee80211_ht_operation)) return; /* copy the ie's value into ht_oper*/ memcpy(&sta_ptr->tdls_cap.ht_oper, pos + 2, sizeof(struct ieee80211_ht_operation)); break; case WLAN_EID_BSS_COEX_2040: - if (pos > end - 3) - return; - if (pos[1] != 1) + if (ie_len != sizeof(pos[2])) return; sta_ptr->tdls_cap.coex_2040 = pos[2]; break; case WLAN_EID_EXT_CAPABILITY: - if (pos > end - sizeof(struct ieee_types_header)) - return; - if (pos[1] < sizeof(struct ieee_types_header)) + if (ie_len < sizeof(struct ieee_types_header)) return; - if (pos[1] > 8) + if (ie_len > 8) return; memcpy((u8 *)&sta_ptr->tdls_cap.extcap, pos, sizeof(struct ieee_types_header) + - min_t(u8, pos[1], 8)); + min_t(u8, ie_len, 8)); break; case WLAN_EID_RSN: - if (pos > end - sizeof(struct ieee_types_header)) + if (ie_len < sizeof(struct ieee_types_header)) return; - if (pos[1] < sizeof(struct ieee_types_header)) - return; - if (pos[1] > IEEE_MAX_IE_SIZE - + if (ie_len > IEEE_MAX_IE_SIZE - sizeof(struct ieee_types_header)) return; memcpy((u8 *)&sta_ptr->tdls_cap.rsn_ie, pos, sizeof(struct ieee_types_header) + - min_t(u8, pos[1], IEEE_MAX_IE_SIZE - + min_t(u8, ie_len, IEEE_MAX_IE_SIZE - sizeof(struct ieee_types_header))); break; case WLAN_EID_QOS_CAPA: - if (pos > end - 3) - return; - if (pos[1] != 1) + if (ie_len != sizeof(pos[2])) return; sta_ptr->tdls_cap.qos_info = pos[2]; break; case WLAN_EID_VHT_OPERATION: if (priv->adapter->is_hw_11ac_capable) { - if (pos > end - - sizeof(struct ieee80211_vht_operation) - 2) - return; - if (pos[1] != + if (ie_len != sizeof(struct ieee80211_vht_operation)) return; /* copy the ie's value into vhtoper*/ @@ -1043,10 +1029,7 @@ void mwifiex_process_tdls_action_frame(struct mwifiex_private *priv, break; case WLAN_EID_VHT_CAPABILITY: if (priv->adapter->is_hw_11ac_capable) { - if (pos > end - - sizeof(struct ieee80211_vht_cap) - 2) - return; - if (pos[1] != sizeof(struct ieee80211_vht_cap)) + if (ie_len != sizeof(struct ieee80211_vht_cap)) return; /* copy the ie's value into vhtcap*/ memcpy((u8 *)&sta_ptr->tdls_cap.vhtcap, pos + 2, @@ -1056,9 +1039,7 @@ void mwifiex_process_tdls_action_frame(struct mwifiex_private *priv, break; case WLAN_EID_AID: if (priv->adapter->is_hw_11ac_capable) { - if (pos > end - 4) - return; - if (pos[1] != 2) + if (ie_len != sizeof(u16)) return; sta_ptr->tdls_cap.aid = get_unaligned_le16((pos + 2)); diff --git a/drivers/net/wireless/mediatek/mt76/airtime.c b/drivers/net/wireless/mediatek/mt76/airtime.c index 55116f395f9a..a4a785467748 100644 --- a/drivers/net/wireless/mediatek/mt76/airtime.c +++ b/drivers/net/wireless/mediatek/mt76/airtime.c @@ -242,7 +242,7 @@ u32 mt76_calc_rx_airtime(struct mt76_dev *dev, struct mt76_rx_status *status, return 0; sband = dev->hw->wiphy->bands[status->band]; - if (!sband || status->rate_idx > sband->n_bitrates) + if (!sband || status->rate_idx >= sband->n_bitrates) return 0; rate = &sband->bitrates[status->rate_idx]; diff --git a/drivers/net/wireless/mediatek/mt76/mac80211.c b/drivers/net/wireless/mediatek/mt76/mac80211.c index b9f2a401041a..96018fd65779 100644 --- a/drivers/net/wireless/mediatek/mt76/mac80211.c +++ b/drivers/net/wireless/mediatek/mt76/mac80211.c @@ -378,7 +378,8 @@ void mt76_unregister_device(struct mt76_dev *dev) { struct ieee80211_hw *hw = dev->hw; - mt76_led_cleanup(dev); + if (IS_ENABLED(CONFIG_MT76_LEDS)) + mt76_led_cleanup(dev); mt76_tx_status_check(dev, NULL, true); ieee80211_unregister_hw(hw); } diff --git a/drivers/net/wireless/quantenna/qtnfmac/cfg80211.c b/drivers/net/wireless/quantenna/qtnfmac/cfg80211.c index 59d089e092f9..8849faa5bc10 100644 --- a/drivers/net/wireless/quantenna/qtnfmac/cfg80211.c +++ b/drivers/net/wireless/quantenna/qtnfmac/cfg80211.c @@ -1056,7 +1056,8 @@ static void qtnf_cfg80211_reg_notifier(struct wiphy *wiphy, pr_debug("MAC%u: initiator=%d alpha=%c%c\n", mac->macid, req->initiator, req->alpha2[0], req->alpha2[1]); - ret = qtnf_cmd_reg_notify(mac, req, qtnf_mac_slave_radar_get(wiphy)); + ret = qtnf_cmd_reg_notify(mac, req, qtnf_slave_radar_get(), + qtnf_dfs_offload_get()); if (ret) { pr_err("MAC%u: failed to update region to %c%c: %d\n", mac->macid, req->alpha2[0], req->alpha2[1], ret); @@ -1078,7 +1079,8 @@ struct wiphy *qtnf_wiphy_allocate(struct qtnf_bus *bus) { struct wiphy *wiphy; - if (bus->hw_info.hw_capab & QLINK_HW_CAPAB_DFS_OFFLOAD) + if (qtnf_dfs_offload_get() && + (bus->hw_info.hw_capab & QLINK_HW_CAPAB_DFS_OFFLOAD)) qtn_cfg80211_ops.start_radar_detection = NULL; if (!(bus->hw_info.hw_capab & QLINK_HW_CAPAB_PWR_MGMT)) @@ -1163,7 +1165,8 @@ int qtnf_wiphy_register(struct qtnf_hw_info *hw_info, struct qtnf_wmac *mac) WIPHY_FLAG_NETNS_OK; wiphy->flags &= ~WIPHY_FLAG_PS_ON_BY_DEFAULT; - if (hw_info->hw_capab & QLINK_HW_CAPAB_DFS_OFFLOAD) + if (qtnf_dfs_offload_get() && + (hw_info->hw_capab & QLINK_HW_CAPAB_DFS_OFFLOAD)) wiphy_ext_feature_set(wiphy, NL80211_EXT_FEATURE_DFS_OFFLOAD); if (hw_info->hw_capab & QLINK_HW_CAPAB_SCAN_DWELL) diff --git a/drivers/net/wireless/quantenna/qtnfmac/commands.c b/drivers/net/wireless/quantenna/qtnfmac/commands.c index 548f6ff6d0f2..d0d7ec8794c4 100644 --- a/drivers/net/wireless/quantenna/qtnfmac/commands.c +++ b/drivers/net/wireless/quantenna/qtnfmac/commands.c @@ -257,6 +257,14 @@ int qtnf_cmd_send_start_ap(struct qtnf_vif *vif, cmd->pbss = s->pbss; cmd->ht_required = s->ht_required; cmd->vht_required = s->vht_required; + cmd->twt_responder = s->twt_responder; + if (s->he_obss_pd.enable) { + cmd->sr_params.sr_control |= QLINK_SR_SRG_INFORMATION_PRESENT; + cmd->sr_params.srg_obss_pd_min_offset = + s->he_obss_pd.min_offset; + cmd->sr_params.srg_obss_pd_max_offset = + s->he_obss_pd.max_offset; + } aen = &cmd->aen; aen->auth_type = s->auth_type; @@ -510,6 +518,8 @@ qtnf_sta_info_parse_rate(struct rate_info *rate_dst, rate_dst->flags |= RATE_INFO_FLAGS_MCS; else if (rate_src->flags & QLINK_STA_INFO_RATE_FLAG_VHT_MCS) rate_dst->flags |= RATE_INFO_FLAGS_VHT_MCS; + else if (rate_src->flags & QLINK_STA_INFO_RATE_FLAG_HE_MCS) + rate_dst->flags |= RATE_INFO_FLAGS_HE_MCS; if (rate_src->flags & QLINK_STA_INFO_RATE_FLAG_SHORT_GI) rate_dst->flags |= RATE_INFO_FLAGS_SHORT_GI; @@ -2454,7 +2464,7 @@ out: } int qtnf_cmd_reg_notify(struct qtnf_wmac *mac, struct regulatory_request *req, - bool slave_radar) + bool slave_radar, bool dfs_offload) { struct wiphy *wiphy = priv_to_wiphy(mac); struct qtnf_bus *bus = mac->bus; @@ -2517,6 +2527,7 @@ int qtnf_cmd_reg_notify(struct qtnf_wmac *mac, struct regulatory_request *req, } cmd->slave_radar = slave_radar; + cmd->dfs_offload = dfs_offload; cmd->num_channels = 0; for (band = 0; band < NUM_NL80211_BANDS; band++) { diff --git a/drivers/net/wireless/quantenna/qtnfmac/commands.h b/drivers/net/wireless/quantenna/qtnfmac/commands.h index 761755bf9ede..ab273257b078 100644 --- a/drivers/net/wireless/quantenna/qtnfmac/commands.h +++ b/drivers/net/wireless/quantenna/qtnfmac/commands.h @@ -58,7 +58,7 @@ int qtnf_cmd_send_disconnect(struct qtnf_vif *vif, int qtnf_cmd_send_updown_intf(struct qtnf_vif *vif, bool up); int qtnf_cmd_reg_notify(struct qtnf_wmac *mac, struct regulatory_request *req, - bool slave_radar); + bool slave_radar, bool dfs_offload); int qtnf_cmd_get_chan_stats(struct qtnf_wmac *mac, u16 channel, struct qtnf_chan_stats *stats); int qtnf_cmd_send_chan_switch(struct qtnf_vif *vif, diff --git a/drivers/net/wireless/quantenna/qtnfmac/core.c b/drivers/net/wireless/quantenna/qtnfmac/core.c index 648dfc38bd70..4320180f8c07 100644 --- a/drivers/net/wireless/quantenna/qtnfmac/core.c +++ b/drivers/net/wireless/quantenna/qtnfmac/core.c @@ -21,8 +21,22 @@ static bool slave_radar = true; module_param(slave_radar, bool, 0644); MODULE_PARM_DESC(slave_radar, "set 0 to disable radar detection in slave mode"); +static bool dfs_offload; +module_param(dfs_offload, bool, 0644); +MODULE_PARM_DESC(dfs_offload, "set 1 to enable DFS offload to firmware"); + static struct dentry *qtnf_debugfs_dir; +bool qtnf_slave_radar_get(void) +{ + return slave_radar; +} + +bool qtnf_dfs_offload_get(void) +{ + return dfs_offload; +} + struct qtnf_wmac *qtnf_core_get_mac(const struct qtnf_bus *bus, u8 macid) { struct qtnf_wmac *mac = NULL; @@ -211,9 +225,6 @@ static int qtnf_netdev_port_parent_id(struct net_device *ndev, const struct qtnf_vif *vif = qtnf_netdev_get_priv(ndev); const struct qtnf_bus *bus = vif->mac->bus; - if (!(bus->hw_info.hw_capab & QLINK_HW_CAPAB_HW_BRIDGE)) - return -EOPNOTSUPP; - ppid->id_len = sizeof(bus->hw_id); memcpy(&ppid->id, bus->hw_id, ppid->id_len); @@ -456,11 +467,6 @@ static struct qtnf_wmac *qtnf_core_mac_alloc(struct qtnf_bus *bus, return mac; } -bool qtnf_mac_slave_radar_get(struct wiphy *wiphy) -{ - return slave_radar; -} - static const struct ethtool_ops qtnf_ethtool_ops = { .get_drvinfo = cfg80211_get_drvinfo, }; @@ -656,12 +662,24 @@ bool qtnf_netdev_is_qtn(const struct net_device *ndev) return ndev->netdev_ops == &qtnf_netdev_ops; } +static int qtnf_check_br_ports(struct net_device *dev, void *data) +{ + struct net_device *ndev = data; + + if (dev != ndev && netdev_port_same_parent_id(dev, ndev)) + return -ENOTSUPP; + + return 0; +} + static int qtnf_core_netdevice_event(struct notifier_block *nb, unsigned long event, void *ptr) { struct net_device *ndev = netdev_notifier_info_to_dev(ptr); const struct netdev_notifier_changeupper_info *info; + struct net_device *brdev; struct qtnf_vif *vif; + struct qtnf_bus *bus; int br_domain; int ret = 0; @@ -672,25 +690,34 @@ static int qtnf_core_netdevice_event(struct notifier_block *nb, return NOTIFY_OK; vif = qtnf_netdev_get_priv(ndev); + bus = vif->mac->bus; switch (event) { case NETDEV_CHANGEUPPER: info = ptr; + brdev = info->upper_dev; - if (!netif_is_bridge_master(info->upper_dev)) + if (!netif_is_bridge_master(brdev)) break; pr_debug("[VIF%u.%u] change bridge: %s %s\n", - vif->mac->macid, vif->vifid, - netdev_name(info->upper_dev), + vif->mac->macid, vif->vifid, netdev_name(brdev), info->linking ? "add" : "del"); - if (info->linking) - br_domain = info->upper_dev->ifindex; - else - br_domain = ndev->ifindex; + if (IS_ENABLED(CONFIG_NET_SWITCHDEV) && + (bus->hw_info.hw_capab & QLINK_HW_CAPAB_HW_BRIDGE)) { + if (info->linking) + br_domain = brdev->ifindex; + else + br_domain = ndev->ifindex; + + ret = qtnf_cmd_netdev_changeupper(vif, br_domain); + } else { + ret = netdev_walk_all_lower_dev(brdev, + qtnf_check_br_ports, + ndev); + } - ret = qtnf_cmd_netdev_changeupper(vif, br_domain); break; default: break; @@ -763,13 +790,11 @@ int qtnf_core_attach(struct qtnf_bus *bus) } } - if (bus->hw_info.hw_capab & QLINK_HW_CAPAB_HW_BRIDGE) { - bus->netdev_nb.notifier_call = qtnf_core_netdevice_event; - ret = register_netdevice_notifier(&bus->netdev_nb); - if (ret) { - pr_err("failed to register netdev notifier: %d\n", ret); - goto error; - } + bus->netdev_nb.notifier_call = qtnf_core_netdevice_event; + ret = register_netdevice_notifier(&bus->netdev_nb); + if (ret) { + pr_err("failed to register netdev notifier: %d\n", ret); + goto error; } bus->fw_state = QTNF_FW_STATE_RUNNING; diff --git a/drivers/net/wireless/quantenna/qtnfmac/core.h b/drivers/net/wireless/quantenna/qtnfmac/core.h index 116ec16aa15b..d715e1cd0006 100644 --- a/drivers/net/wireless/quantenna/qtnfmac/core.h +++ b/drivers/net/wireless/quantenna/qtnfmac/core.h @@ -133,7 +133,8 @@ struct qtnf_vif *qtnf_mac_get_free_vif(struct qtnf_wmac *mac); struct qtnf_vif *qtnf_mac_get_base_vif(struct qtnf_wmac *mac); void qtnf_mac_iface_comb_free(struct qtnf_wmac *mac); void qtnf_mac_ext_caps_free(struct qtnf_wmac *mac); -bool qtnf_mac_slave_radar_get(struct wiphy *wiphy); +bool qtnf_slave_radar_get(void); +bool qtnf_dfs_offload_get(void); struct wiphy *qtnf_wiphy_allocate(struct qtnf_bus *bus); int qtnf_core_net_attach(struct qtnf_wmac *mac, struct qtnf_vif *priv, const char *name, unsigned char name_assign_type); diff --git a/drivers/net/wireless/quantenna/qtnfmac/qlink.h b/drivers/net/wireless/quantenna/qtnfmac/qlink.h index 75527f1bb306..b2edb03819d1 100644 --- a/drivers/net/wireless/quantenna/qtnfmac/qlink.h +++ b/drivers/net/wireless/quantenna/qtnfmac/qlink.h @@ -6,7 +6,7 @@ #include <linux/ieee80211.h> -#define QLINK_PROTO_VER 15 +#define QLINK_PROTO_VER 16 #define QLINK_MACID_RSVD 0xFF #define QLINK_VIFID_RSVD 0xFF @@ -196,6 +196,45 @@ struct qlink_sta_info_state { __le32 value; } __packed; +/** + * enum qlink_sr_ctrl_flags - control flags for spatial reuse parameter set + * + * @QLINK_SR_PSR_DISALLOWED: indicates whether or not PSR-based spatial reuse + * transmissions are allowed for STAs associated with the AP + * @QLINK_SR_NON_SRG_OBSS_PD_SR_DISALLOWED: indicates whether or not + * Non-SRG OBSS PD spatial reuse transmissions are allowed for STAs associated + * with the AP + * @NON_SRG_OFFSET_PRESENT: indicates whether or not Non-SRG OBSS PD Max offset + * field is valid in the element + * @QLINK_SR_SRG_INFORMATION_PRESENT: indicates whether or not SRG OBSS PD + * Min/Max offset fields ore valid in the element + */ +enum qlink_sr_ctrl_flags { + QLINK_SR_PSR_DISALLOWED = BIT(0), + QLINK_SR_NON_SRG_OBSS_PD_SR_DISALLOWED = BIT(1), + QLINK_SR_NON_SRG_OFFSET_PRESENT = BIT(2), + QLINK_SR_SRG_INFORMATION_PRESENT = BIT(3), +}; + +/** + * struct qlink_sr_params - spatial reuse parameters + * + * @sr_control: spatial reuse control field; flags contained in this field are + * defined in @qlink_sr_ctrl_flags + * @non_srg_obss_pd_max: added to -82 dBm to generate the value of the + * Non-SRG OBSS PD Max parameter + * @srg_obss_pd_min_offset: added to -82 dBm to generate the value of the + * SRG OBSS PD Min parameter + * @srg_obss_pd_max_offset: added to -82 dBm to generate the value of the + * SRG PBSS PD Max parameter + */ +struct qlink_sr_params { + u8 sr_control; + u8 non_srg_obss_pd_max; + u8 srg_obss_pd_min_offset; + u8 srg_obss_pd_max_offset; +} __packed; + /* QLINK Command messages related definitions */ @@ -596,8 +635,9 @@ enum qlink_user_reg_hint_type { * of &enum qlink_user_reg_hint_type. * @num_channels: number of &struct qlink_tlv_channel in a variable portion of a * payload. - * @slave_radar: whether slave device should enable radar detection. * @dfs_region: one of &enum qlink_dfs_regions. + * @slave_radar: whether slave device should enable radar detection. + * @dfs_offload: enable or disable DFS offload to firmware. * @info: variable portion of regulatory notifier callback. */ struct qlink_cmd_reg_notify { @@ -608,7 +648,7 @@ struct qlink_cmd_reg_notify { u8 num_channels; u8 dfs_region; u8 slave_radar; - u8 rsvd[1]; + u8 dfs_offload; u8 info[0]; } __packed; @@ -650,6 +690,8 @@ enum qlink_hidden_ssid { * @ht_required: stations must support HT * @vht_required: stations must support VHT * @aen: encryption info + * @sr_params: spatial reuse parameters + * @twt_responder: enable Target Wake Time * @info: variable configurations */ struct qlink_cmd_start_ap { @@ -665,6 +707,9 @@ struct qlink_cmd_start_ap { u8 ht_required; u8 vht_required; struct qlink_auth_encr aen; + struct qlink_sr_params sr_params; + u8 twt_responder; + u8 rsvd[3]; u8 info[0]; } __packed; @@ -948,6 +993,7 @@ enum qlink_sta_info_rate_flags { QLINK_STA_INFO_RATE_FLAG_VHT_MCS = BIT(1), QLINK_STA_INFO_RATE_FLAG_SHORT_GI = BIT(2), QLINK_STA_INFO_RATE_FLAG_60G = BIT(3), + QLINK_STA_INFO_RATE_FLAG_HE_MCS = BIT(4), }; /** diff --git a/drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtc8192e2ant.c b/drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtc8192e2ant.c index 3c96c320236c..658ff425c256 100644 --- a/drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtc8192e2ant.c +++ b/drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtc8192e2ant.c @@ -862,7 +862,7 @@ static void btc8192e2ant_set_sw_rf_rx_lpf_corner(struct btc_coexist *btcoexist, /* Resume RF Rx LPF corner * After initialized, we can use coex_dm->btRf0x1eBackup */ - if (btcoexist->initilized) { + if (btcoexist->initialized) { RT_TRACE(rtlpriv, COMP_BT_COEXIST, DBG_LOUD, "[BTCoex], Resume RF Rx LPF corner!!\n"); btcoexist->btc_set_rf_reg(btcoexist, BTC_RF_A, 0x1e, diff --git a/drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c b/drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c index 191dafd03189..a4940a3842de 100644 --- a/drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c +++ b/drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c @@ -1461,7 +1461,7 @@ void exhalbtc_init_coex_dm(struct btc_coexist *btcoexist) ex_btc8192e2ant_init_coex_dm(btcoexist); } - btcoexist->initilized = true; + btcoexist->initialized = true; } void exhalbtc_ips_notify(struct btc_coexist *btcoexist, u8 type) diff --git a/drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.h b/drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.h index 8c0a7fdbf200..a96a995dd850 100644 --- a/drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.h +++ b/drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.h @@ -679,7 +679,7 @@ struct btc_coexist { bool auto_report_2ant; bool dbg_mode_1ant; bool dbg_mode_2ant; - bool initilized; + bool initialized; bool stop_coex_dm; bool manual_control; struct btc_statistics statistics; diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/phy.c b/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/phy.c index 5ca900f97d66..d13983ec09ad 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/phy.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/phy.c @@ -264,7 +264,7 @@ static bool _rtl88e_check_condition(struct ieee80211_hw *hw, u32 _board = rtlefuse->board_type; /*need efuse define*/ u32 _interface = rtlhal->interface; u32 _platform = 0x08;/*SupportPlatform */ - u32 cond = condition; + u32 cond; if (condition == 0xCDCDCDCD) return true; diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/sw.c index a0eda51e833c..4865639ac9ea 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/sw.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/sw.c @@ -9,7 +9,6 @@ #include "phy.h" #include "dm.h" #include "hw.h" -#include "sw.h" #include "trx.h" #include "led.h" #include "table.h" @@ -59,7 +58,7 @@ static void rtl88e_init_aspm_vars(struct ieee80211_hw *hw) rtlpci->const_support_pciaspm = rtlpriv->cfg->mod_params->aspm_support; } -int rtl88e_init_sw_vars(struct ieee80211_hw *hw) +static int rtl88e_init_sw_vars(struct ieee80211_hw *hw) { int err = 0; struct rtl_priv *rtlpriv = rtl_priv(hw); @@ -173,7 +172,7 @@ int rtl88e_init_sw_vars(struct ieee80211_hw *hw) return err; } -void rtl88e_deinit_sw_vars(struct ieee80211_hw *hw) +static void rtl88e_deinit_sw_vars(struct ieee80211_hw *hw) { struct rtl_priv *rtlpriv = rtl_priv(hw); @@ -189,7 +188,7 @@ void rtl88e_deinit_sw_vars(struct ieee80211_hw *hw) } /* get bt coexist status */ -bool rtl88e_get_btc_status(void) +static bool rtl88e_get_btc_status(void) { return false; } diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/sw.h b/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/sw.h deleted file mode 100644 index 1407151503f8..000000000000 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8188ee/sw.h +++ /dev/null @@ -1,12 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -/* Copyright(c) 2009-2013 Realtek Corporation.*/ - -#ifndef __RTL92CE_SW_H__ -#define __RTL92CE_SW_H__ - -int rtl88e_init_sw_vars(struct ieee80211_hw *hw); -void rtl88e_deinit_sw_vars(struct ieee80211_hw *hw); -bool rtl88e_get_btc_status(void); - - -#endif diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192ce/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8192ce/sw.c index 900788e4018c..ed68c850f9a2 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8192ce/sw.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192ce/sw.c @@ -14,7 +14,6 @@ #include "../rtl8192c/phy_common.h" #include "hw.h" #include "rf.h" -#include "sw.h" #include "trx.h" #include "led.h" @@ -65,7 +64,7 @@ static void rtl92c_init_aspm_vars(struct ieee80211_hw *hw) rtlpci->const_support_pciaspm = rtlpriv->cfg->mod_params->aspm_support; } -int rtl92c_init_sw_vars(struct ieee80211_hw *hw) +static int rtl92c_init_sw_vars(struct ieee80211_hw *hw) { int err; struct rtl_priv *rtlpriv = rtl_priv(hw); @@ -161,7 +160,7 @@ int rtl92c_init_sw_vars(struct ieee80211_hw *hw) return 0; } -void rtl92c_deinit_sw_vars(struct ieee80211_hw *hw) +static void rtl92c_deinit_sw_vars(struct ieee80211_hw *hw) { struct rtl_priv *rtlpriv = rtl_priv(hw); diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192ce/sw.h b/drivers/net/wireless/realtek/rtlwifi/rtl8192ce/sw.h deleted file mode 100644 index f2d121a60159..000000000000 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8192ce/sw.h +++ /dev/null @@ -1,15 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -/* Copyright(c) 2009-2012 Realtek Corporation.*/ - -#ifndef __RTL92CE_SW_H__ -#define __RTL92CE_SW_H__ - -int rtl92c_init_sw_vars(struct ieee80211_hw *hw); -void rtl92c_deinit_sw_vars(struct ieee80211_hw *hw); -void rtl92c_init_var_map(struct ieee80211_hw *hw); -bool _rtl92ce_phy_config_bb_with_headerfile(struct ieee80211_hw *hw, - u8 configtype); -bool _rtl92ce_phy_config_bb_with_pgheaderfile(struct ieee80211_hw *hw, - u8 configtype); - -#endif diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/sw.c index ab3e4aebad39..b53daf1b29f7 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/sw.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/sw.c @@ -12,7 +12,6 @@ #include "mac.h" #include "dm.h" #include "rf.h" -#include "sw.h" #include "trx.h" #include "led.h" #include "hw.h" @@ -252,45 +251,45 @@ static struct rtl_hal_cfg rtl92cu_hal_cfg = { .maps[RTL_RC_HT_RATEMCS15] = DESC_RATEMCS15, }; -#define USB_VENDER_ID_REALTEK 0x0bda +#define USB_VENDOR_ID_REALTEK 0x0bda /* 2010-10-19 DID_USB_V3.4 */ static const struct usb_device_id rtl8192c_usb_ids[] = { /*=== Realtek demoboard ===*/ /* Default ID */ - {RTL_USB_DEVICE(USB_VENDER_ID_REALTEK, 0x8191, rtl92cu_hal_cfg)}, + {RTL_USB_DEVICE(USB_VENDOR_ID_REALTEK, 0x8191, rtl92cu_hal_cfg)}, /****** 8188CU ********/ /* RTL8188CTV */ - {RTL_USB_DEVICE(USB_VENDER_ID_REALTEK, 0x018a, rtl92cu_hal_cfg)}, + {RTL_USB_DEVICE(USB_VENDOR_ID_REALTEK, 0x018a, rtl92cu_hal_cfg)}, /* 8188CE-VAU USB minCard */ - {RTL_USB_DEVICE(USB_VENDER_ID_REALTEK, 0x8170, rtl92cu_hal_cfg)}, + {RTL_USB_DEVICE(USB_VENDOR_ID_REALTEK, 0x8170, rtl92cu_hal_cfg)}, /* 8188cu 1*1 dongle */ - {RTL_USB_DEVICE(USB_VENDER_ID_REALTEK, 0x8176, rtl92cu_hal_cfg)}, + {RTL_USB_DEVICE(USB_VENDOR_ID_REALTEK, 0x8176, rtl92cu_hal_cfg)}, /* 8188cu 1*1 dongle, (b/g mode only) */ - {RTL_USB_DEVICE(USB_VENDER_ID_REALTEK, 0x8177, rtl92cu_hal_cfg)}, + {RTL_USB_DEVICE(USB_VENDOR_ID_REALTEK, 0x8177, rtl92cu_hal_cfg)}, /* 8188cu Slim Solo */ - {RTL_USB_DEVICE(USB_VENDER_ID_REALTEK, 0x817a, rtl92cu_hal_cfg)}, + {RTL_USB_DEVICE(USB_VENDOR_ID_REALTEK, 0x817a, rtl92cu_hal_cfg)}, /* 8188cu Slim Combo */ - {RTL_USB_DEVICE(USB_VENDER_ID_REALTEK, 0x817b, rtl92cu_hal_cfg)}, + {RTL_USB_DEVICE(USB_VENDOR_ID_REALTEK, 0x817b, rtl92cu_hal_cfg)}, /* 8188RU High-power USB Dongle */ - {RTL_USB_DEVICE(USB_VENDER_ID_REALTEK, 0x817d, rtl92cu_hal_cfg)}, + {RTL_USB_DEVICE(USB_VENDOR_ID_REALTEK, 0x817d, rtl92cu_hal_cfg)}, /* 8188CE-VAU USB minCard (b/g mode only) */ - {RTL_USB_DEVICE(USB_VENDER_ID_REALTEK, 0x817e, rtl92cu_hal_cfg)}, + {RTL_USB_DEVICE(USB_VENDOR_ID_REALTEK, 0x817e, rtl92cu_hal_cfg)}, /* 8188RU in Alfa AWUS036NHR */ - {RTL_USB_DEVICE(USB_VENDER_ID_REALTEK, 0x817f, rtl92cu_hal_cfg)}, + {RTL_USB_DEVICE(USB_VENDOR_ID_REALTEK, 0x817f, rtl92cu_hal_cfg)}, /* RTL8188CUS-VL */ - {RTL_USB_DEVICE(USB_VENDER_ID_REALTEK, 0x818a, rtl92cu_hal_cfg)}, - {RTL_USB_DEVICE(USB_VENDER_ID_REALTEK, 0x819a, rtl92cu_hal_cfg)}, + {RTL_USB_DEVICE(USB_VENDOR_ID_REALTEK, 0x818a, rtl92cu_hal_cfg)}, + {RTL_USB_DEVICE(USB_VENDOR_ID_REALTEK, 0x819a, rtl92cu_hal_cfg)}, /* 8188 Combo for BC4 */ - {RTL_USB_DEVICE(USB_VENDER_ID_REALTEK, 0x8754, rtl92cu_hal_cfg)}, + {RTL_USB_DEVICE(USB_VENDOR_ID_REALTEK, 0x8754, rtl92cu_hal_cfg)}, /****** 8192CU ********/ /* 8192cu 2*2 */ - {RTL_USB_DEVICE(USB_VENDER_ID_REALTEK, 0x8178, rtl92cu_hal_cfg)}, + {RTL_USB_DEVICE(USB_VENDOR_ID_REALTEK, 0x8178, rtl92cu_hal_cfg)}, /* 8192CE-VAU USB minCard */ - {RTL_USB_DEVICE(USB_VENDER_ID_REALTEK, 0x817c, rtl92cu_hal_cfg)}, + {RTL_USB_DEVICE(USB_VENDOR_ID_REALTEK, 0x817c, rtl92cu_hal_cfg)}, /*=== Customer ID ===*/ /****** 8188CU ********/ @@ -329,7 +328,7 @@ static const struct usb_device_id rtl8192c_usb_ids[] = { /****** 8188 RU ********/ /* Netcore */ - {RTL_USB_DEVICE(USB_VENDER_ID_REALTEK, 0x317f, rtl92cu_hal_cfg)}, + {RTL_USB_DEVICE(USB_VENDOR_ID_REALTEK, 0x317f, rtl92cu_hal_cfg)}, /****** 8188CUS Slim Solo********/ {RTL_USB_DEVICE(0x04f2, 0xaff7, rtl92cu_hal_cfg)}, /*Xavi*/ diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/sw.h b/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/sw.h deleted file mode 100644 index 662440c4cdc9..000000000000 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/sw.h +++ /dev/null @@ -1,27 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -/* Copyright(c) 2009-2012 Realtek Corporation.*/ - -#ifndef __RTL92CU_SW_H__ -#define __RTL92CU_SW_H__ - -#define EFUSE_MAX_SECTION 16 - -void rtl92cu_phy_rf6052_set_cck_txpower(struct ieee80211_hw *hw, - u8 *powerlevel); -void rtl92cu_phy_rf6052_set_ofdm_txpower(struct ieee80211_hw *hw, - u8 *ppowerlevel, u8 channel); -bool _rtl92cu_phy_config_bb_with_headerfile(struct ieee80211_hw *hw, - u8 configtype); -bool _rtl92cu_phy_config_bb_with_pgheaderfile(struct ieee80211_hw *hw, - u8 configtype); -void _rtl92cu_phy_lc_calibrate(struct ieee80211_hw *hw, bool is2t); -void rtl92cu_phy_set_rf_reg(struct ieee80211_hw *hw, - enum radio_path rfpath, - u32 regaddr, u32 bitmask, u32 data); -bool rtl92cu_phy_set_rf_power_state(struct ieee80211_hw *hw, - enum rf_pwrstate rfpwr_state); -u32 rtl92cu_phy_query_rf_reg(struct ieee80211_hw *hw, - enum radio_path rfpath, u32 regaddr, u32 bitmask); -void rtl92cu_phy_set_bw_mode_callback(struct ieee80211_hw *hw); - -#endif diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/dm.c b/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/dm.c index 648f9108ed4b..551aa86825ed 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/dm.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/dm.c @@ -12,124 +12,6 @@ #include "fw.h" #include "trx.h" -static const u32 ofdmswing_table[OFDM_TABLE_SIZE] = { - 0x7f8001fe, /* 0, +6.0dB */ - 0x788001e2, /* 1, +5.5dB */ - 0x71c001c7, /* 2, +5.0dB */ - 0x6b8001ae, /* 3, +4.5dB */ - 0x65400195, /* 4, +4.0dB */ - 0x5fc0017f, /* 5, +3.5dB */ - 0x5a400169, /* 6, +3.0dB */ - 0x55400155, /* 7, +2.5dB */ - 0x50800142, /* 8, +2.0dB */ - 0x4c000130, /* 9, +1.5dB */ - 0x47c0011f, /* 10, +1.0dB */ - 0x43c0010f, /* 11, +0.5dB */ - 0x40000100, /* 12, +0dB */ - 0x3c8000f2, /* 13, -0.5dB */ - 0x390000e4, /* 14, -1.0dB */ - 0x35c000d7, /* 15, -1.5dB */ - 0x32c000cb, /* 16, -2.0dB */ - 0x300000c0, /* 17, -2.5dB */ - 0x2d4000b5, /* 18, -3.0dB */ - 0x2ac000ab, /* 19, -3.5dB */ - 0x288000a2, /* 20, -4.0dB */ - 0x26000098, /* 21, -4.5dB */ - 0x24000090, /* 22, -5.0dB */ - 0x22000088, /* 23, -5.5dB */ - 0x20000080, /* 24, -6.0dB */ - 0x1e400079, /* 25, -6.5dB */ - 0x1c800072, /* 26, -7.0dB */ - 0x1b00006c, /* 27. -7.5dB */ - 0x19800066, /* 28, -8.0dB */ - 0x18000060, /* 29, -8.5dB */ - 0x16c0005b, /* 30, -9.0dB */ - 0x15800056, /* 31, -9.5dB */ - 0x14400051, /* 32, -10.0dB */ - 0x1300004c, /* 33, -10.5dB */ - 0x12000048, /* 34, -11.0dB */ - 0x11000044, /* 35, -11.5dB */ - 0x10000040, /* 36, -12.0dB */ - 0x0f00003c, /* 37, -12.5dB */ - 0x0e400039, /* 38, -13.0dB */ - 0x0d800036, /* 39, -13.5dB */ - 0x0cc00033, /* 40, -14.0dB */ - 0x0c000030, /* 41, -14.5dB */ - 0x0b40002d, /* 42, -15.0dB */ -}; - -static const u8 cckswing_table_ch1ch13[CCK_TABLE_SIZE][8] = { - {0x36, 0x35, 0x2e, 0x25, 0x1c, 0x12, 0x09, 0x04}, /* 0, +0dB */ - {0x33, 0x32, 0x2b, 0x23, 0x1a, 0x11, 0x08, 0x04}, /* 1, -0.5dB */ - {0x30, 0x2f, 0x29, 0x21, 0x19, 0x10, 0x08, 0x03}, /* 2, -1.0dB */ - {0x2d, 0x2d, 0x27, 0x1f, 0x18, 0x0f, 0x08, 0x03}, /* 3, -1.5dB */ - {0x2b, 0x2a, 0x25, 0x1e, 0x16, 0x0e, 0x07, 0x03}, /* 4, -2.0dB */ - {0x28, 0x28, 0x22, 0x1c, 0x15, 0x0d, 0x07, 0x03}, /* 5, -2.5dB */ - {0x26, 0x25, 0x21, 0x1b, 0x14, 0x0d, 0x06, 0x03}, /* 6, -3.0dB */ - {0x24, 0x23, 0x1f, 0x19, 0x13, 0x0c, 0x06, 0x03}, /* 7, -3.5dB */ - {0x22, 0x21, 0x1d, 0x18, 0x11, 0x0b, 0x06, 0x02}, /* 8, -4.0dB */ - {0x20, 0x20, 0x1b, 0x16, 0x11, 0x08, 0x05, 0x02}, /* 9, -4.5dB */ - {0x1f, 0x1e, 0x1a, 0x15, 0x10, 0x0a, 0x05, 0x02}, /* 10, -5.0dB */ - {0x1d, 0x1c, 0x18, 0x14, 0x0f, 0x0a, 0x05, 0x02}, /* 11, -5.5dB */ - {0x1b, 0x1a, 0x17, 0x13, 0x0e, 0x09, 0x04, 0x02}, /* 12, -6.0dB */ - {0x1a, 0x19, 0x16, 0x12, 0x0d, 0x09, 0x04, 0x02}, /* 13, -6.5dB */ - {0x18, 0x17, 0x15, 0x11, 0x0c, 0x08, 0x04, 0x02}, /* 14, -7.0dB */ - {0x17, 0x16, 0x13, 0x10, 0x0c, 0x08, 0x04, 0x02}, /* 15, -7.5dB */ - {0x16, 0x15, 0x12, 0x0f, 0x0b, 0x07, 0x04, 0x01}, /* 16, -8.0dB */ - {0x14, 0x14, 0x11, 0x0e, 0x0b, 0x07, 0x03, 0x02}, /* 17, -8.5dB */ - {0x13, 0x13, 0x10, 0x0d, 0x0a, 0x06, 0x03, 0x01}, /* 18, -9.0dB */ - {0x12, 0x12, 0x0f, 0x0c, 0x09, 0x06, 0x03, 0x01}, /* 19, -9.5dB */ - {0x11, 0x11, 0x0f, 0x0c, 0x09, 0x06, 0x03, 0x01}, /* 20, -10.0dB */ - {0x10, 0x10, 0x0e, 0x0b, 0x08, 0x05, 0x03, 0x01}, /* 21, -10.5dB */ - {0x0f, 0x0f, 0x0d, 0x0b, 0x08, 0x05, 0x03, 0x01}, /* 22, -11.0dB */ - {0x0e, 0x0e, 0x0c, 0x0a, 0x08, 0x05, 0x02, 0x01}, /* 23, -11.5dB */ - {0x0d, 0x0d, 0x0c, 0x0a, 0x07, 0x05, 0x02, 0x01}, /* 24, -12.0dB */ - {0x0d, 0x0c, 0x0b, 0x09, 0x07, 0x04, 0x02, 0x01}, /* 25, -12.5dB */ - {0x0c, 0x0c, 0x0a, 0x09, 0x06, 0x04, 0x02, 0x01}, /* 26, -13.0dB */ - {0x0b, 0x0b, 0x0a, 0x08, 0x06, 0x04, 0x02, 0x01}, /* 27, -13.5dB */ - {0x0b, 0x0a, 0x09, 0x08, 0x06, 0x04, 0x02, 0x01}, /* 28, -14.0dB */ - {0x0a, 0x0a, 0x09, 0x07, 0x05, 0x03, 0x02, 0x01}, /* 29, -14.5dB */ - {0x0a, 0x09, 0x08, 0x07, 0x05, 0x03, 0x02, 0x01}, /* 30, -15.0dB */ - {0x09, 0x09, 0x08, 0x06, 0x05, 0x03, 0x01, 0x01}, /* 31, -15.5dB */ - {0x09, 0x08, 0x07, 0x06, 0x04, 0x03, 0x01, 0x01} /* 32, -16.0dB */ -}; - -static const u8 cckswing_table_ch14[CCK_TABLE_SIZE][8] = { - {0x36, 0x35, 0x2e, 0x1b, 0x00, 0x00, 0x00, 0x00}, /* 0, +0dB */ - {0x33, 0x32, 0x2b, 0x19, 0x00, 0x00, 0x00, 0x00}, /* 1, -0.5dB */ - {0x30, 0x2f, 0x29, 0x18, 0x00, 0x00, 0x00, 0x00}, /* 2, -1.0dB */ - {0x2d, 0x2d, 0x17, 0x17, 0x00, 0x00, 0x00, 0x00}, /* 3, -1.5dB */ - {0x2b, 0x2a, 0x25, 0x15, 0x00, 0x00, 0x00, 0x00}, /* 4, -2.0dB */ - {0x28, 0x28, 0x24, 0x14, 0x00, 0x00, 0x00, 0x00}, /* 5, -2.5dB */ - {0x26, 0x25, 0x21, 0x13, 0x00, 0x00, 0x00, 0x00}, /* 6, -3.0dB */ - {0x24, 0x23, 0x1f, 0x12, 0x00, 0x00, 0x00, 0x00}, /* 7, -3.5dB */ - {0x22, 0x21, 0x1d, 0x11, 0x00, 0x00, 0x00, 0x00}, /* 8, -4.0dB */ - {0x20, 0x20, 0x1b, 0x10, 0x00, 0x00, 0x00, 0x00}, /* 9, -4.5dB */ - {0x1f, 0x1e, 0x1a, 0x0f, 0x00, 0x00, 0x00, 0x00}, /* 10, -5.0dB */ - {0x1d, 0x1c, 0x18, 0x0e, 0x00, 0x00, 0x00, 0x00}, /* 11, -5.5dB */ - {0x1b, 0x1a, 0x17, 0x0e, 0x00, 0x00, 0x00, 0x00}, /* 12, -6.0dB */ - {0x1a, 0x19, 0x16, 0x0d, 0x00, 0x00, 0x00, 0x00}, /* 13, -6.5dB */ - {0x18, 0x17, 0x15, 0x0c, 0x00, 0x00, 0x00, 0x00}, /* 14, -7.0dB */ - {0x17, 0x16, 0x13, 0x0b, 0x00, 0x00, 0x00, 0x00}, /* 15, -7.5dB */ - {0x16, 0x15, 0x12, 0x0b, 0x00, 0x00, 0x00, 0x00}, /* 16, -8.0dB */ - {0x14, 0x14, 0x11, 0x0a, 0x00, 0x00, 0x00, 0x00}, /* 17, -8.5dB */ - {0x13, 0x13, 0x10, 0x0a, 0x00, 0x00, 0x00, 0x00}, /* 18, -9.0dB */ - {0x12, 0x12, 0x0f, 0x09, 0x00, 0x00, 0x00, 0x00}, /* 19, -9.5dB */ - {0x11, 0x11, 0x0f, 0x09, 0x00, 0x00, 0x00, 0x00}, /* 20, -10.0dB */ - {0x10, 0x10, 0x0e, 0x08, 0x00, 0x00, 0x00, 0x00}, /* 21, -10.5dB */ - {0x0f, 0x0f, 0x0d, 0x08, 0x00, 0x00, 0x00, 0x00}, /* 22, -11.0dB */ - {0x0e, 0x0e, 0x0c, 0x07, 0x00, 0x00, 0x00, 0x00}, /* 23, -11.5dB */ - {0x0d, 0x0d, 0x0c, 0x07, 0x00, 0x00, 0x00, 0x00}, /* 24, -12.0dB */ - {0x0d, 0x0c, 0x0b, 0x06, 0x00, 0x00, 0x00, 0x00}, /* 25, -12.5dB */ - {0x0c, 0x0c, 0x0a, 0x06, 0x00, 0x00, 0x00, 0x00}, /* 26, -13.0dB */ - {0x0b, 0x0b, 0x0a, 0x06, 0x00, 0x00, 0x00, 0x00}, /* 27, -13.5dB */ - {0x0b, 0x0a, 0x09, 0x05, 0x00, 0x00, 0x00, 0x00}, /* 28, -14.0dB */ - {0x0a, 0x0a, 0x09, 0x05, 0x00, 0x00, 0x00, 0x00}, /* 29, -14.5dB */ - {0x0a, 0x09, 0x08, 0x05, 0x00, 0x00, 0x00, 0x00}, /* 30, -15.0dB */ - {0x09, 0x09, 0x08, 0x05, 0x00, 0x00, 0x00, 0x00}, /* 31, -15.5dB */ - {0x09, 0x08, 0x07, 0x04, 0x00, 0x00, 0x00, 0x00} /* 32, -16.0dB */ -}; - static void rtl92ee_dm_false_alarm_counter_statistics(struct ieee80211_hw *hw) { u32 ret_value; diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/sw.c index b6ee7dae5943..b337d599b6f4 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/sw.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/sw.c @@ -9,7 +9,6 @@ #include "phy.h" #include "dm.h" #include "hw.h" -#include "sw.h" #include "fw.h" #include "trx.h" #include "led.h" @@ -65,7 +64,7 @@ static void rtl92ee_init_aspm_vars(struct ieee80211_hw *hw) rtlpci->const_support_pciaspm = rtlpriv->cfg->mod_params->aspm_support; } -int rtl92ee_init_sw_vars(struct ieee80211_hw *hw) +static int rtl92ee_init_sw_vars(struct ieee80211_hw *hw) { struct rtl_priv *rtlpriv = rtl_priv(hw); struct rtl_pci *rtlpci = rtl_pcidev(rtl_pcipriv(hw)); @@ -164,7 +163,7 @@ int rtl92ee_init_sw_vars(struct ieee80211_hw *hw) return 0; } -void rtl92ee_deinit_sw_vars(struct ieee80211_hw *hw) +static void rtl92ee_deinit_sw_vars(struct ieee80211_hw *hw) { struct rtl_priv *rtlpriv = rtl_priv(hw); @@ -175,7 +174,7 @@ void rtl92ee_deinit_sw_vars(struct ieee80211_hw *hw) } /* get bt coexist status */ -bool rtl92ee_get_btc_status(void) +static bool rtl92ee_get_btc_status(void) { return true; } diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/sw.h b/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/sw.h deleted file mode 100644 index 36e29a2da0fd..000000000000 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/sw.h +++ /dev/null @@ -1,11 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -/* Copyright(c) 2009-2014 Realtek Corporation.*/ - -#ifndef __RTL92E_SW_H__ -#define __RTL92E_SW_H__ - -int rtl92ee_init_sw_vars(struct ieee80211_hw *hw); -void rtl92ee_deinit_sw_vars(struct ieee80211_hw *hw); -bool rtl92ee_get_btc_status(void); - -#endif diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192se/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8192se/sw.c index 1c7ee569f4bf..7a54497b7df2 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8192se/sw.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192se/sw.c @@ -11,7 +11,6 @@ #include "dm.h" #include "fw.h" #include "hw.h" -#include "sw.h" #include "trx.h" #include "led.h" diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192se/sw.h b/drivers/net/wireless/realtek/rtlwifi/rtl8192se/sw.h deleted file mode 100644 index a31efba0e6b5..000000000000 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8192se/sw.h +++ /dev/null @@ -1,13 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -/* Copyright(c) 2009-2012 Realtek Corporation.*/ - -#ifndef __REALTEK_PCI92SE_SW_H__ -#define __REALTEK_PCI92SE_SW_H__ - -#define EFUSE_MAX_SECTION 16 - -int rtl92se_init_sw(struct ieee80211_hw *hw); -void rtl92se_deinit_sw(struct ieee80211_hw *hw); -void rtl92se_init_var_map(struct ieee80211_hw *hw); - -#endif diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c index d8260c7afe09..c61a92df9d73 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/dm.c @@ -13,118 +13,6 @@ #include "fw.h" #include "hal_btc.h" -static const u32 ofdmswing_table[OFDM_TABLE_SIZE] = { - 0x7f8001fe, - 0x788001e2, - 0x71c001c7, - 0x6b8001ae, - 0x65400195, - 0x5fc0017f, - 0x5a400169, - 0x55400155, - 0x50800142, - 0x4c000130, - 0x47c0011f, - 0x43c0010f, - 0x40000100, - 0x3c8000f2, - 0x390000e4, - 0x35c000d7, - 0x32c000cb, - 0x300000c0, - 0x2d4000b5, - 0x2ac000ab, - 0x288000a2, - 0x26000098, - 0x24000090, - 0x22000088, - 0x20000080, - 0x1e400079, - 0x1c800072, - 0x1b00006c, - 0x19800066, - 0x18000060, - 0x16c0005b, - 0x15800056, - 0x14400051, - 0x1300004c, - 0x12000048, - 0x11000044, - 0x10000040, -}; - -static const u8 cckswing_table_ch1ch13[CCK_TABLE_SIZE][8] = { - {0x36, 0x35, 0x2e, 0x25, 0x1c, 0x12, 0x09, 0x04}, - {0x33, 0x32, 0x2b, 0x23, 0x1a, 0x11, 0x08, 0x04}, - {0x30, 0x2f, 0x29, 0x21, 0x19, 0x10, 0x08, 0x03}, - {0x2d, 0x2d, 0x27, 0x1f, 0x18, 0x0f, 0x08, 0x03}, - {0x2b, 0x2a, 0x25, 0x1e, 0x16, 0x0e, 0x07, 0x03}, - {0x28, 0x28, 0x22, 0x1c, 0x15, 0x0d, 0x07, 0x03}, - {0x26, 0x25, 0x21, 0x1b, 0x14, 0x0d, 0x06, 0x03}, - {0x24, 0x23, 0x1f, 0x19, 0x13, 0x0c, 0x06, 0x03}, - {0x22, 0x21, 0x1d, 0x18, 0x11, 0x0b, 0x06, 0x02}, - {0x20, 0x20, 0x1b, 0x16, 0x11, 0x08, 0x05, 0x02}, - {0x1f, 0x1e, 0x1a, 0x15, 0x10, 0x0a, 0x05, 0x02}, - {0x1d, 0x1c, 0x18, 0x14, 0x0f, 0x0a, 0x05, 0x02}, - {0x1b, 0x1a, 0x17, 0x13, 0x0e, 0x09, 0x04, 0x02}, - {0x1a, 0x19, 0x16, 0x12, 0x0d, 0x09, 0x04, 0x02}, - {0x18, 0x17, 0x15, 0x11, 0x0c, 0x08, 0x04, 0x02}, - {0x17, 0x16, 0x13, 0x10, 0x0c, 0x08, 0x04, 0x02}, - {0x16, 0x15, 0x12, 0x0f, 0x0b, 0x07, 0x04, 0x01}, - {0x14, 0x14, 0x11, 0x0e, 0x0b, 0x07, 0x03, 0x02}, - {0x13, 0x13, 0x10, 0x0d, 0x0a, 0x06, 0x03, 0x01}, - {0x12, 0x12, 0x0f, 0x0c, 0x09, 0x06, 0x03, 0x01}, - {0x11, 0x11, 0x0f, 0x0c, 0x09, 0x06, 0x03, 0x01}, - {0x10, 0x10, 0x0e, 0x0b, 0x08, 0x05, 0x03, 0x01}, - {0x0f, 0x0f, 0x0d, 0x0b, 0x08, 0x05, 0x03, 0x01}, - {0x0e, 0x0e, 0x0c, 0x0a, 0x08, 0x05, 0x02, 0x01}, - {0x0d, 0x0d, 0x0c, 0x0a, 0x07, 0x05, 0x02, 0x01}, - {0x0d, 0x0c, 0x0b, 0x09, 0x07, 0x04, 0x02, 0x01}, - {0x0c, 0x0c, 0x0a, 0x09, 0x06, 0x04, 0x02, 0x01}, - {0x0b, 0x0b, 0x0a, 0x08, 0x06, 0x04, 0x02, 0x01}, - {0x0b, 0x0a, 0x09, 0x08, 0x06, 0x04, 0x02, 0x01}, - {0x0a, 0x0a, 0x09, 0x07, 0x05, 0x03, 0x02, 0x01}, - {0x0a, 0x09, 0x08, 0x07, 0x05, 0x03, 0x02, 0x01}, - {0x09, 0x09, 0x08, 0x06, 0x05, 0x03, 0x01, 0x01}, - {0x09, 0x08, 0x07, 0x06, 0x04, 0x03, 0x01, 0x01} -}; - -static const u8 cckswing_table_ch14[CCK_TABLE_SIZE][8] = { - {0x36, 0x35, 0x2e, 0x1b, 0x00, 0x00, 0x00, 0x00}, - {0x33, 0x32, 0x2b, 0x19, 0x00, 0x00, 0x00, 0x00}, - {0x30, 0x2f, 0x29, 0x18, 0x00, 0x00, 0x00, 0x00}, - {0x2d, 0x2d, 0x17, 0x17, 0x00, 0x00, 0x00, 0x00}, - {0x2b, 0x2a, 0x25, 0x15, 0x00, 0x00, 0x00, 0x00}, - {0x28, 0x28, 0x24, 0x14, 0x00, 0x00, 0x00, 0x00}, - {0x26, 0x25, 0x21, 0x13, 0x00, 0x00, 0x00, 0x00}, - {0x24, 0x23, 0x1f, 0x12, 0x00, 0x00, 0x00, 0x00}, - {0x22, 0x21, 0x1d, 0x11, 0x00, 0x00, 0x00, 0x00}, - {0x20, 0x20, 0x1b, 0x10, 0x00, 0x00, 0x00, 0x00}, - {0x1f, 0x1e, 0x1a, 0x0f, 0x00, 0x00, 0x00, 0x00}, - {0x1d, 0x1c, 0x18, 0x0e, 0x00, 0x00, 0x00, 0x00}, - {0x1b, 0x1a, 0x17, 0x0e, 0x00, 0x00, 0x00, 0x00}, - {0x1a, 0x19, 0x16, 0x0d, 0x00, 0x00, 0x00, 0x00}, - {0x18, 0x17, 0x15, 0x0c, 0x00, 0x00, 0x00, 0x00}, - {0x17, 0x16, 0x13, 0x0b, 0x00, 0x00, 0x00, 0x00}, - {0x16, 0x15, 0x12, 0x0b, 0x00, 0x00, 0x00, 0x00}, - {0x14, 0x14, 0x11, 0x0a, 0x00, 0x00, 0x00, 0x00}, - {0x13, 0x13, 0x10, 0x0a, 0x00, 0x00, 0x00, 0x00}, - {0x12, 0x12, 0x0f, 0x09, 0x00, 0x00, 0x00, 0x00}, - {0x11, 0x11, 0x0f, 0x09, 0x00, 0x00, 0x00, 0x00}, - {0x10, 0x10, 0x0e, 0x08, 0x00, 0x00, 0x00, 0x00}, - {0x0f, 0x0f, 0x0d, 0x08, 0x00, 0x00, 0x00, 0x00}, - {0x0e, 0x0e, 0x0c, 0x07, 0x00, 0x00, 0x00, 0x00}, - {0x0d, 0x0d, 0x0c, 0x07, 0x00, 0x00, 0x00, 0x00}, - {0x0d, 0x0c, 0x0b, 0x06, 0x00, 0x00, 0x00, 0x00}, - {0x0c, 0x0c, 0x0a, 0x06, 0x00, 0x00, 0x00, 0x00}, - {0x0b, 0x0b, 0x0a, 0x06, 0x00, 0x00, 0x00, 0x00}, - {0x0b, 0x0a, 0x09, 0x05, 0x00, 0x00, 0x00, 0x00}, - {0x0a, 0x0a, 0x09, 0x05, 0x00, 0x00, 0x00, 0x00}, - {0x0a, 0x09, 0x08, 0x05, 0x00, 0x00, 0x00, 0x00}, - {0x09, 0x09, 0x08, 0x05, 0x00, 0x00, 0x00, 0x00}, - {0x09, 0x08, 0x07, 0x04, 0x00, 0x00, 0x00, 0x00} -}; - static u8 rtl8723e_dm_initial_gain_min_pwdb(struct ieee80211_hw *hw) { struct rtl_priv *rtlpriv = rtl_priv(hw); diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c index 5702ac6deebf..ea86d5bf33d2 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.c @@ -11,7 +11,6 @@ #include "fw.h" #include "../rtl8723com/fw_common.h" #include "hw.h" -#include "sw.h" #include "trx.h" #include "led.h" #include "table.h" @@ -67,7 +66,7 @@ static void rtl8723e_init_aspm_vars(struct ieee80211_hw *hw) rtlpci->const_support_pciaspm = rtlpriv->cfg->mod_params->aspm_support; } -int rtl8723e_init_sw_vars(struct ieee80211_hw *hw) +static int rtl8723e_init_sw_vars(struct ieee80211_hw *hw) { struct rtl_priv *rtlpriv = rtl_priv(hw); struct rtl_pci *rtlpci = rtl_pcidev(rtl_pcipriv(hw)); @@ -166,7 +165,7 @@ int rtl8723e_init_sw_vars(struct ieee80211_hw *hw) return 0; } -void rtl8723e_deinit_sw_vars(struct ieee80211_hw *hw) +static void rtl8723e_deinit_sw_vars(struct ieee80211_hw *hw) { struct rtl_priv *rtlpriv = rtl_priv(hw); @@ -177,7 +176,7 @@ void rtl8723e_deinit_sw_vars(struct ieee80211_hw *hw) } /* get bt coexist status */ -bool rtl8723e_get_btc_status(void) +static bool rtl8723e_get_btc_status(void) { return true; } diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.h b/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.h deleted file mode 100644 index 200ce39a0dd7..000000000000 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723ae/sw.h +++ /dev/null @@ -1,13 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -/* Copyright(c) 2009-2012 Realtek Corporation.*/ - -#ifndef __RTL8723E_SW_H__ -#define __RTL8723E_SW_H__ - -int rtl8723e_init_sw_vars(struct ieee80211_hw *hw); -void rtl8723e_deinit_sw_vars(struct ieee80211_hw *hw); -void rtl8723e_init_var_map(struct ieee80211_hw *hw); -bool rtl8723e_get_btc_status(void); - - -#endif diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723be/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8723be/sw.c index 3c8528f0ecb3..36209ac5b208 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723be/sw.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8723be/sw.c @@ -13,7 +13,6 @@ #include "hw.h" #include "fw.h" #include "../rtl8723com/fw_common.h" -#include "sw.h" #include "trx.h" #include "led.h" #include "table.h" @@ -64,7 +63,7 @@ static void rtl8723be_init_aspm_vars(struct ieee80211_hw *hw) rtlpci->const_support_pciaspm = rtlpriv->cfg->mod_params->aspm_support; } -int rtl8723be_init_sw_vars(struct ieee80211_hw *hw) +static int rtl8723be_init_sw_vars(struct ieee80211_hw *hw) { int err = 0; struct rtl_priv *rtlpriv = rtl_priv(hw); @@ -170,7 +169,7 @@ int rtl8723be_init_sw_vars(struct ieee80211_hw *hw) return 0; } -void rtl8723be_deinit_sw_vars(struct ieee80211_hw *hw) +static void rtl8723be_deinit_sw_vars(struct ieee80211_hw *hw) { struct rtl_priv *rtlpriv = rtl_priv(hw); @@ -181,7 +180,7 @@ void rtl8723be_deinit_sw_vars(struct ieee80211_hw *hw) } /* get bt coexist status */ -bool rtl8723be_get_btc_status(void) +static bool rtl8723be_get_btc_status(void) { return true; } diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8723be/sw.h b/drivers/net/wireless/realtek/rtlwifi/rtl8723be/sw.h deleted file mode 100644 index 6ecacf9fbfd7..000000000000 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8723be/sw.h +++ /dev/null @@ -1,13 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -/* Copyright(c) 2009-2014 Realtek Corporation.*/ - -#ifndef __RTL8723BE_SW_H__ -#define __RTL8723BE_SW_H__ - -int rtl8723be_init_sw_vars(struct ieee80211_hw *hw); -void rtl8723be_deinit_sw_vars(struct ieee80211_hw *hw); -void rtl8723be_init_var_map(struct ieee80211_hw *hw); -bool rtl8723be_get_btc_status(void); - - -#endif diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/dm.c b/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/dm.c index b54230433a6b..f57e8794f0ec 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/dm.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/dm.c @@ -93,124 +93,6 @@ static const u32 rtl8821ae_txscaling_table[TXSCALE_TABLE_SIZE] = { 0x3FE /* 36, +6.0dB */ }; -static const u32 ofdmswing_table[] = { - 0x0b40002d, /* 0, -15.0dB */ - 0x0c000030, /* 1, -14.5dB */ - 0x0cc00033, /* 2, -14.0dB */ - 0x0d800036, /* 3, -13.5dB */ - 0x0e400039, /* 4, -13.0dB */ - 0x0f00003c, /* 5, -12.5dB */ - 0x10000040, /* 6, -12.0dB */ - 0x11000044, /* 7, -11.5dB */ - 0x12000048, /* 8, -11.0dB */ - 0x1300004c, /* 9, -10.5dB */ - 0x14400051, /* 10, -10.0dB */ - 0x15800056, /* 11, -9.5dB */ - 0x16c0005b, /* 12, -9.0dB */ - 0x18000060, /* 13, -8.5dB */ - 0x19800066, /* 14, -8.0dB */ - 0x1b00006c, /* 15, -7.5dB */ - 0x1c800072, /* 16, -7.0dB */ - 0x1e400079, /* 17, -6.5dB */ - 0x20000080, /* 18, -6.0dB */ - 0x22000088, /* 19, -5.5dB */ - 0x24000090, /* 20, -5.0dB */ - 0x26000098, /* 21, -4.5dB */ - 0x288000a2, /* 22, -4.0dB */ - 0x2ac000ab, /* 23, -3.5dB */ - 0x2d4000b5, /* 24, -3.0dB */ - 0x300000c0, /* 25, -2.5dB */ - 0x32c000cb, /* 26, -2.0dB */ - 0x35c000d7, /* 27, -1.5dB */ - 0x390000e4, /* 28, -1.0dB */ - 0x3c8000f2, /* 29, -0.5dB */ - 0x40000100, /* 30, +0dB */ - 0x43c0010f, /* 31, +0.5dB */ - 0x47c0011f, /* 32, +1.0dB */ - 0x4c000130, /* 33, +1.5dB */ - 0x50800142, /* 34, +2.0dB */ - 0x55400155, /* 35, +2.5dB */ - 0x5a400169, /* 36, +3.0dB */ - 0x5fc0017f, /* 37, +3.5dB */ - 0x65400195, /* 38, +4.0dB */ - 0x6b8001ae, /* 39, +4.5dB */ - 0x71c001c7, /* 40, +5.0dB */ - 0x788001e2, /* 41, +5.5dB */ - 0x7f8001fe /* 42, +6.0dB */ -}; - -static const u8 cckswing_table_ch1ch13[CCK_TABLE_SIZE][8] = { - {0x09, 0x08, 0x07, 0x06, 0x04, 0x03, 0x01, 0x01}, /* 0, -16.0dB */ - {0x09, 0x09, 0x08, 0x06, 0x05, 0x03, 0x01, 0x01}, /* 1, -15.5dB */ - {0x0a, 0x09, 0x08, 0x07, 0x05, 0x03, 0x02, 0x01}, /* 2, -15.0dB */ - {0x0a, 0x0a, 0x09, 0x07, 0x05, 0x03, 0x02, 0x01}, /* 3, -14.5dB */ - {0x0b, 0x0a, 0x09, 0x08, 0x06, 0x04, 0x02, 0x01}, /* 4, -14.0dB */ - {0x0b, 0x0b, 0x0a, 0x08, 0x06, 0x04, 0x02, 0x01}, /* 5, -13.5dB */ - {0x0c, 0x0c, 0x0a, 0x09, 0x06, 0x04, 0x02, 0x01}, /* 6, -13.0dB */ - {0x0d, 0x0c, 0x0b, 0x09, 0x07, 0x04, 0x02, 0x01}, /* 7, -12.5dB */ - {0x0d, 0x0d, 0x0c, 0x0a, 0x07, 0x05, 0x02, 0x01}, /* 8, -12.0dB */ - {0x0e, 0x0e, 0x0c, 0x0a, 0x08, 0x05, 0x02, 0x01}, /* 9, -11.5dB */ - {0x0f, 0x0f, 0x0d, 0x0b, 0x08, 0x05, 0x03, 0x01}, /* 10, -11.0dB */ - {0x10, 0x10, 0x0e, 0x0b, 0x08, 0x05, 0x03, 0x01}, /* 11, -10.5dB */ - {0x11, 0x11, 0x0f, 0x0c, 0x09, 0x06, 0x03, 0x01}, /* 12, -10.0dB */ - {0x12, 0x12, 0x0f, 0x0c, 0x09, 0x06, 0x03, 0x01}, /* 13, -9.5dB */ - {0x13, 0x13, 0x10, 0x0d, 0x0a, 0x06, 0x03, 0x01}, /* 14, -9.0dB */ - {0x14, 0x14, 0x11, 0x0e, 0x0b, 0x07, 0x03, 0x02}, /* 15, -8.5dB */ - {0x16, 0x15, 0x12, 0x0f, 0x0b, 0x07, 0x04, 0x01}, /* 16, -8.0dB */ - {0x17, 0x16, 0x13, 0x10, 0x0c, 0x08, 0x04, 0x02}, /* 17, -7.5dB */ - {0x18, 0x17, 0x15, 0x11, 0x0c, 0x08, 0x04, 0x02}, /* 18, -7.0dB */ - {0x1a, 0x19, 0x16, 0x12, 0x0d, 0x09, 0x04, 0x02}, /* 19, -6.5dB */ - {0x1b, 0x1a, 0x17, 0x13, 0x0e, 0x09, 0x04, 0x02}, /* 20, -6.0dB */ - {0x1d, 0x1c, 0x18, 0x14, 0x0f, 0x0a, 0x05, 0x02}, /* 21, -5.5dB */ - {0x1f, 0x1e, 0x1a, 0x15, 0x10, 0x0a, 0x05, 0x02}, /* 22, -5.0dB */ - {0x20, 0x20, 0x1b, 0x16, 0x11, 0x08, 0x05, 0x02}, /* 23, -4.5dB */ - {0x22, 0x21, 0x1d, 0x18, 0x11, 0x0b, 0x06, 0x02}, /* 24, -4.0dB */ - {0x24, 0x23, 0x1f, 0x19, 0x13, 0x0c, 0x06, 0x03}, /* 25, -3.5dB */ - {0x26, 0x25, 0x21, 0x1b, 0x14, 0x0d, 0x06, 0x03}, /* 26, -3.0dB */ - {0x28, 0x28, 0x22, 0x1c, 0x15, 0x0d, 0x07, 0x03}, /* 27, -2.5dB */ - {0x2b, 0x2a, 0x25, 0x1e, 0x16, 0x0e, 0x07, 0x03}, /* 28, -2.0dB */ - {0x2d, 0x2d, 0x27, 0x1f, 0x18, 0x0f, 0x08, 0x03}, /* 29, -1.5dB */ - {0x30, 0x2f, 0x29, 0x21, 0x19, 0x10, 0x08, 0x03}, /* 30, -1.0dB */ - {0x33, 0x32, 0x2b, 0x23, 0x1a, 0x11, 0x08, 0x04}, /* 31, -0.5dB */ - {0x36, 0x35, 0x2e, 0x25, 0x1c, 0x12, 0x09, 0x04} /* 32, +0dB */ -}; - -static const u8 cckswing_table_ch14[CCK_TABLE_SIZE][8] = { - {0x09, 0x08, 0x07, 0x04, 0x00, 0x00, 0x00, 0x00}, /* 0, -16.0dB */ - {0x09, 0x09, 0x08, 0x05, 0x00, 0x00, 0x00, 0x00}, /* 1, -15.5dB */ - {0x0a, 0x09, 0x08, 0x05, 0x00, 0x00, 0x00, 0x00}, /* 2, -15.0dB */ - {0x0a, 0x0a, 0x09, 0x05, 0x00, 0x00, 0x00, 0x00}, /* 3, -14.5dB */ - {0x0b, 0x0a, 0x09, 0x05, 0x00, 0x00, 0x00, 0x00}, /* 4, -14.0dB */ - {0x0b, 0x0b, 0x0a, 0x06, 0x00, 0x00, 0x00, 0x00}, /* 5, -13.5dB */ - {0x0c, 0x0c, 0x0a, 0x06, 0x00, 0x00, 0x00, 0x00}, /* 6, -13.0dB */ - {0x0d, 0x0c, 0x0b, 0x06, 0x00, 0x00, 0x00, 0x00}, /* 7, -12.5dB */ - {0x0d, 0x0d, 0x0c, 0x07, 0x00, 0x00, 0x00, 0x00}, /* 8, -12.0dB */ - {0x0e, 0x0e, 0x0c, 0x07, 0x00, 0x00, 0x00, 0x00}, /* 9, -11.5dB */ - {0x0f, 0x0f, 0x0d, 0x08, 0x00, 0x00, 0x00, 0x00}, /* 10, -11.0dB */ - {0x10, 0x10, 0x0e, 0x08, 0x00, 0x00, 0x00, 0x00}, /* 11, -10.5dB */ - {0x11, 0x11, 0x0f, 0x09, 0x00, 0x00, 0x00, 0x00}, /* 12, -10.0dB */ - {0x12, 0x12, 0x0f, 0x09, 0x00, 0x00, 0x00, 0x00}, /* 13, -9.5dB */ - {0x13, 0x13, 0x10, 0x0a, 0x00, 0x00, 0x00, 0x00}, /* 14, -9.0dB */ - {0x14, 0x14, 0x11, 0x0a, 0x00, 0x00, 0x00, 0x00}, /* 15, -8.5dB */ - {0x16, 0x15, 0x12, 0x0b, 0x00, 0x00, 0x00, 0x00}, /* 16, -8.0dB */ - {0x17, 0x16, 0x13, 0x0b, 0x00, 0x00, 0x00, 0x00}, /* 17, -7.5dB */ - {0x18, 0x17, 0x15, 0x0c, 0x00, 0x00, 0x00, 0x00}, /* 18, -7.0dB */ - {0x1a, 0x19, 0x16, 0x0d, 0x00, 0x00, 0x00, 0x00}, /* 19, -6.5dB */ - {0x1b, 0x1a, 0x17, 0x0e, 0x00, 0x00, 0x00, 0x00}, /* 20, -6.0dB */ - {0x1d, 0x1c, 0x18, 0x0e, 0x00, 0x00, 0x00, 0x00}, /* 21, -5.5dB */ - {0x1f, 0x1e, 0x1a, 0x0f, 0x00, 0x00, 0x00, 0x00}, /* 22, -5.0dB */ - {0x20, 0x20, 0x1b, 0x10, 0x00, 0x00, 0x00, 0x00}, /* 23, -4.5dB */ - {0x22, 0x21, 0x1d, 0x11, 0x00, 0x00, 0x00, 0x00}, /* 24, -4.0dB */ - {0x24, 0x23, 0x1f, 0x12, 0x00, 0x00, 0x00, 0x00}, /* 25, -3.5dB */ - {0x26, 0x25, 0x21, 0x13, 0x00, 0x00, 0x00, 0x00}, /* 26, -3.0dB */ - {0x28, 0x28, 0x24, 0x14, 0x00, 0x00, 0x00, 0x00}, /* 27, -2.5dB */ - {0x2b, 0x2a, 0x25, 0x15, 0x00, 0x00, 0x00, 0x00}, /* 28, -2.0dB */ - {0x2d, 0x2d, 0x17, 0x17, 0x00, 0x00, 0x00, 0x00}, /* 29, -1.5dB */ - {0x30, 0x2f, 0x29, 0x18, 0x00, 0x00, 0x00, 0x00}, /* 30, -1.0dB */ - {0x33, 0x32, 0x2b, 0x19, 0x00, 0x00, 0x00, 0x00}, /* 31, -0.5dB */ - {0x36, 0x35, 0x2e, 0x1b, 0x00, 0x00, 0x00, 0x00} /* 32, +0dB */ -}; - static const u32 edca_setting_dl[PEER_MAX] = { 0xa44f, /* 0 UNKNOWN */ 0x5ea44f, /* 1 REALTEK_90 */ diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/sw.c b/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/sw.c index 3def6a2b3450..d8df816753cb 100644 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/sw.c +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/sw.c @@ -10,7 +10,6 @@ #include "dm.h" #include "hw.h" #include "fw.h" -#include "sw.h" #include "trx.h" #include "led.h" #include "table.h" @@ -65,7 +64,7 @@ static void rtl8821ae_init_aspm_vars(struct ieee80211_hw *hw) } /*InitializeVariables8812E*/ -int rtl8821ae_init_sw_vars(struct ieee80211_hw *hw) +static int rtl8821ae_init_sw_vars(struct ieee80211_hw *hw) { int err = 0; struct rtl_priv *rtlpriv = rtl_priv(hw); @@ -211,7 +210,7 @@ int rtl8821ae_init_sw_vars(struct ieee80211_hw *hw) return 0; } -void rtl8821ae_deinit_sw_vars(struct ieee80211_hw *hw) +static void rtl8821ae_deinit_sw_vars(struct ieee80211_hw *hw) { struct rtl_priv *rtlpriv = rtl_priv(hw); @@ -228,7 +227,7 @@ void rtl8821ae_deinit_sw_vars(struct ieee80211_hw *hw) } /* get bt coexist status */ -bool rtl8821ae_get_btc_status(void) +static bool rtl8821ae_get_btc_status(void) { return true; } diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/sw.h b/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/sw.h deleted file mode 100644 index 9d7610f84b20..000000000000 --- a/drivers/net/wireless/realtek/rtlwifi/rtl8821ae/sw.h +++ /dev/null @@ -1,12 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -/* Copyright(c) 2009-2010 Realtek Corporation.*/ - -#ifndef __RTL8821AE_SW_H__ -#define __RTL8821AE_SW_H__ - -int rtl8821ae_init_sw_vars(struct ieee80211_hw *hw); -void rtl8821ae_deinit_sw_vars(struct ieee80211_hw *hw); -void rtl8821ae_init_var_map(struct ieee80211_hw *hw); -bool rtl8821ae_get_btc_status(void); - -#endif diff --git a/drivers/net/wireless/realtek/rtw88/Makefile b/drivers/net/wireless/realtek/rtw88/Makefile index 15e12155a04c..cac148d13cf1 100644 --- a/drivers/net/wireless/realtek/rtw88/Makefile +++ b/drivers/net/wireless/realtek/rtw88/Makefile @@ -15,6 +15,7 @@ rtw88-y += main.o \ ps.o \ sec.o \ bf.o \ + wow.o \ regd.o rtw88-$(CONFIG_RTW88_8822BE) += rtw8822b.o rtw8822b_table.o diff --git a/drivers/net/wireless/realtek/rtw88/debug.h b/drivers/net/wireless/realtek/rtw88/debug.h index cd28f675e9cb..a0f36f29b4a6 100644 --- a/drivers/net/wireless/realtek/rtw88/debug.h +++ b/drivers/net/wireless/realtek/rtw88/debug.h @@ -18,6 +18,7 @@ enum rtw_debug_mask { RTW_DBG_DEBUGFS = 0x00000200, RTW_DBG_PS = 0x00000400, RTW_DBG_BF = 0x00000800, + RTW_DBG_WOW = 0x00001000, RTW_DBG_ALL = 0xffffffff }; diff --git a/drivers/net/wireless/realtek/rtw88/fw.c b/drivers/net/wireless/realtek/rtw88/fw.c index b8c581161f61..243441453ead 100644 --- a/drivers/net/wireless/realtek/rtw88/fw.c +++ b/drivers/net/wireless/realtek/rtw88/fw.c @@ -10,6 +10,7 @@ #include "sec.h" #include "debug.h" #include "util.h" +#include "wow.h" static void rtw_fw_c2h_cmd_handle_ext(struct rtw_dev *rtwdev, struct sk_buff *skb) @@ -482,6 +483,96 @@ void rtw_fw_set_pwr_mode(struct rtw_dev *rtwdev) rtw_fw_send_h2c_command(rtwdev, h2c_pkt); } +void rtw_fw_set_keep_alive_cmd(struct rtw_dev *rtwdev, bool enable) +{ + u8 h2c_pkt[H2C_PKT_SIZE] = {0}; + struct rtw_fw_wow_keep_alive_para mode = { + .adopt = true, + .pkt_type = KEEP_ALIVE_NULL_PKT, + .period = 5, + }; + + SET_H2C_CMD_ID_CLASS(h2c_pkt, H2C_CMD_KEEP_ALIVE); + SET_KEEP_ALIVE_ENABLE(h2c_pkt, enable); + SET_KEEP_ALIVE_ADOPT(h2c_pkt, mode.adopt); + SET_KEEP_ALIVE_PKT_TYPE(h2c_pkt, mode.pkt_type); + SET_KEEP_ALIVE_CHECK_PERIOD(h2c_pkt, mode.period); + + rtw_fw_send_h2c_command(rtwdev, h2c_pkt); +} + +void rtw_fw_set_disconnect_decision_cmd(struct rtw_dev *rtwdev, bool enable) +{ + struct rtw_wow_param *rtw_wow = &rtwdev->wow; + u8 h2c_pkt[H2C_PKT_SIZE] = {0}; + struct rtw_fw_wow_disconnect_para mode = { + .adopt = true, + .period = 30, + .retry_count = 5, + }; + + SET_H2C_CMD_ID_CLASS(h2c_pkt, H2C_CMD_DISCONNECT_DECISION); + + if (test_bit(RTW_WOW_FLAG_EN_DISCONNECT, rtw_wow->flags)) { + SET_DISCONNECT_DECISION_ENABLE(h2c_pkt, enable); + SET_DISCONNECT_DECISION_ADOPT(h2c_pkt, mode.adopt); + SET_DISCONNECT_DECISION_CHECK_PERIOD(h2c_pkt, mode.period); + SET_DISCONNECT_DECISION_TRY_PKT_NUM(h2c_pkt, mode.retry_count); + } + + rtw_fw_send_h2c_command(rtwdev, h2c_pkt); +} + +void rtw_fw_set_wowlan_ctrl_cmd(struct rtw_dev *rtwdev, bool enable) +{ + struct rtw_wow_param *rtw_wow = &rtwdev->wow; + u8 h2c_pkt[H2C_PKT_SIZE] = {0}; + + SET_H2C_CMD_ID_CLASS(h2c_pkt, H2C_CMD_WOWLAN); + + SET_WOWLAN_FUNC_ENABLE(h2c_pkt, enable); + if (rtw_wow_mgd_linked(rtwdev)) { + if (test_bit(RTW_WOW_FLAG_EN_MAGIC_PKT, rtw_wow->flags)) + SET_WOWLAN_MAGIC_PKT_ENABLE(h2c_pkt, enable); + if (test_bit(RTW_WOW_FLAG_EN_DISCONNECT, rtw_wow->flags)) + SET_WOWLAN_DEAUTH_WAKEUP_ENABLE(h2c_pkt, enable); + if (test_bit(RTW_WOW_FLAG_EN_REKEY_PKT, rtw_wow->flags)) + SET_WOWLAN_REKEY_WAKEUP_ENABLE(h2c_pkt, enable); + if (rtw_wow->pattern_cnt) + SET_WOWLAN_PATTERN_MATCH_ENABLE(h2c_pkt, enable); + } + + rtw_fw_send_h2c_command(rtwdev, h2c_pkt); +} + +void rtw_fw_set_aoac_global_info_cmd(struct rtw_dev *rtwdev, + u8 pairwise_key_enc, + u8 group_key_enc) +{ + u8 h2c_pkt[H2C_PKT_SIZE] = {0}; + + SET_H2C_CMD_ID_CLASS(h2c_pkt, H2C_CMD_AOAC_GLOBAL_INFO); + + SET_AOAC_GLOBAL_INFO_PAIRWISE_ENC_ALG(h2c_pkt, pairwise_key_enc); + SET_AOAC_GLOBAL_INFO_GROUP_ENC_ALG(h2c_pkt, group_key_enc); + + rtw_fw_send_h2c_command(rtwdev, h2c_pkt); +} + +void rtw_fw_set_remote_wake_ctrl_cmd(struct rtw_dev *rtwdev, bool enable) +{ + u8 h2c_pkt[H2C_PKT_SIZE] = {0}; + + SET_H2C_CMD_ID_CLASS(h2c_pkt, H2C_CMD_REMOTE_WAKE_CTRL); + + SET_REMOTE_WAKECTRL_ENABLE(h2c_pkt, enable); + + if (rtw_wow_no_link(rtwdev)) + SET_REMOTE_WAKE_CTRL_NLO_OFFLOAD_EN(h2c_pkt, enable); + + rtw_fw_send_h2c_command(rtwdev, h2c_pkt); +} + static u8 rtw_get_rsvd_page_location(struct rtw_dev *rtwdev, enum rtw_rsvd_packet_type type) { @@ -496,6 +587,26 @@ static u8 rtw_get_rsvd_page_location(struct rtw_dev *rtwdev, return location; } +void rtw_fw_set_nlo_info(struct rtw_dev *rtwdev, bool enable) +{ + u8 h2c_pkt[H2C_PKT_SIZE] = {0}; + u8 loc_nlo; + + loc_nlo = rtw_get_rsvd_page_location(rtwdev, RSVD_NLO_INFO); + + SET_H2C_CMD_ID_CLASS(h2c_pkt, H2C_CMD_NLO_INFO); + + SET_NLO_FUN_EN(h2c_pkt, enable); + if (enable) { + if (rtw_fw_lps_deep_mode) + SET_NLO_PS_32K(h2c_pkt, enable); + SET_NLO_IGNORE_SECURITY(h2c_pkt, enable); + SET_NLO_LOC_NLO_INFO(h2c_pkt, loc_nlo); + } + + rtw_fw_send_h2c_command(rtwdev, h2c_pkt); +} + void rtw_fw_set_pg_info(struct rtw_dev *rtwdev) { struct rtw_lps_conf *conf = &rtwdev->lps_conf; @@ -510,10 +621,45 @@ void rtw_fw_set_pg_info(struct rtw_dev *rtwdev) LPS_PG_INFO_LOC(h2c_pkt, loc_pg); LPS_PG_DPK_LOC(h2c_pkt, loc_dpk); LPS_PG_SEC_CAM_EN(h2c_pkt, conf->sec_cam_backup); + LPS_PG_PATTERN_CAM_EN(h2c_pkt, conf->pattern_cam_backup); rtw_fw_send_h2c_command(rtwdev, h2c_pkt); } +u8 rtw_get_rsvd_page_probe_req_location(struct rtw_dev *rtwdev, + struct cfg80211_ssid *ssid) +{ + struct rtw_rsvd_page *rsvd_pkt; + u8 location = 0; + + list_for_each_entry(rsvd_pkt, &rtwdev->rsvd_page_list, list) { + if (rsvd_pkt->type != RSVD_PROBE_REQ) + continue; + if ((!ssid && !rsvd_pkt->ssid) || + rtw_ssid_equal(rsvd_pkt->ssid, ssid)) + location = rsvd_pkt->page; + } + + return location; +} + +u16 rtw_get_rsvd_page_probe_req_size(struct rtw_dev *rtwdev, + struct cfg80211_ssid *ssid) +{ + struct rtw_rsvd_page *rsvd_pkt; + u16 size = 0; + + list_for_each_entry(rsvd_pkt, &rtwdev->rsvd_page_list, list) { + if (rsvd_pkt->type != RSVD_PROBE_REQ) + continue; + if ((!ssid && !rsvd_pkt->ssid) || + rtw_ssid_equal(rsvd_pkt->ssid, ssid)) + size = rsvd_pkt->skb->len; + } + + return size; +} + void rtw_send_rsvd_page_h2c(struct rtw_dev *rtwdev) { u8 h2c_pkt[H2C_PKT_SIZE] = {0}; @@ -559,6 +705,95 @@ rtw_beacon_get(struct ieee80211_hw *hw, struct ieee80211_vif *vif) return skb_new; } +static struct sk_buff *rtw_nlo_info_get(struct ieee80211_hw *hw) +{ + struct rtw_dev *rtwdev = hw->priv; + struct rtw_chip_info *chip = rtwdev->chip; + struct rtw_pno_request *pno_req = &rtwdev->wow.pno_req; + struct rtw_nlo_info_hdr *nlo_hdr; + struct cfg80211_ssid *ssid; + struct sk_buff *skb; + u8 *pos, loc; + u32 size; + int i; + + if (!pno_req->inited || !pno_req->match_set_cnt) + return NULL; + + size = sizeof(struct rtw_nlo_info_hdr) + pno_req->match_set_cnt * + IEEE80211_MAX_SSID_LEN + chip->tx_pkt_desc_sz; + + skb = alloc_skb(size, GFP_KERNEL); + if (!skb) + return NULL; + + skb_reserve(skb, chip->tx_pkt_desc_sz); + + nlo_hdr = skb_put_zero(skb, sizeof(struct rtw_nlo_info_hdr)); + + nlo_hdr->nlo_count = pno_req->match_set_cnt; + nlo_hdr->hidden_ap_count = pno_req->match_set_cnt; + + /* pattern check for firmware */ + memset(nlo_hdr->pattern_check, 0xA5, FW_NLO_INFO_CHECK_SIZE); + + for (i = 0; i < pno_req->match_set_cnt; i++) + nlo_hdr->ssid_len[i] = pno_req->match_sets[i].ssid.ssid_len; + + for (i = 0; i < pno_req->match_set_cnt; i++) { + ssid = &pno_req->match_sets[i].ssid; + loc = rtw_get_rsvd_page_probe_req_location(rtwdev, ssid); + if (!loc) { + rtw_err(rtwdev, "failed to get probe req rsvd loc\n"); + kfree(skb); + return NULL; + } + nlo_hdr->location[i] = loc; + } + + for (i = 0; i < pno_req->match_set_cnt; i++) { + pos = skb_put_zero(skb, IEEE80211_MAX_SSID_LEN); + memcpy(pos, pno_req->match_sets[i].ssid.ssid, + pno_req->match_sets[i].ssid.ssid_len); + } + + return skb; +} + +static struct sk_buff *rtw_cs_channel_info_get(struct ieee80211_hw *hw) +{ + struct rtw_dev *rtwdev = hw->priv; + struct rtw_chip_info *chip = rtwdev->chip; + struct rtw_pno_request *pno_req = &rtwdev->wow.pno_req; + struct ieee80211_channel *channels = pno_req->channels; + struct sk_buff *skb; + int count = pno_req->channel_cnt; + u8 *pos; + int i = 0; + + skb = alloc_skb(4 * count + chip->tx_pkt_desc_sz, GFP_KERNEL); + if (!skb) + return NULL; + + skb_reserve(skb, chip->tx_pkt_desc_sz); + + for (i = 0; i < count; i++) { + pos = skb_put_zero(skb, 4); + + CHSW_INFO_SET_CH(pos, channels[i].hw_value); + + if (channels[i].flags & IEEE80211_CHAN_RADAR) + CHSW_INFO_SET_ACTION_ID(pos, 0); + else + CHSW_INFO_SET_ACTION_ID(pos, 1); + CHSW_INFO_SET_TIMEOUT(pos, 1); + CHSW_INFO_SET_PRI_CH_IDX(pos, 1); + CHSW_INFO_SET_BW(pos, 0); + } + + return skb; +} + static struct sk_buff *rtw_lps_pg_dpk_get(struct ieee80211_hw *hw) { struct rtw_dev *rtwdev = hw->priv; @@ -591,6 +826,7 @@ static struct sk_buff *rtw_lps_pg_info_get(struct ieee80211_hw *hw, struct rtw_chip_info *chip = rtwdev->chip; struct rtw_lps_conf *conf = &rtwdev->lps_conf; struct rtw_lps_pg_info_hdr *pg_info_hdr; + struct rtw_wow_param *rtw_wow = &rtwdev->wow; struct sk_buff *skb; u32 size; @@ -605,19 +841,22 @@ static struct sk_buff *rtw_lps_pg_info_get(struct ieee80211_hw *hw, pg_info_hdr->macid = find_first_bit(rtwdev->mac_id_map, RTW_MAX_MAC_ID_NUM); pg_info_hdr->sec_cam_count = rtw_sec_cam_pg_backup(rtwdev, pg_info_hdr->sec_cam); + pg_info_hdr->pattern_count = rtw_wow->pattern_cnt; conf->sec_cam_backup = pg_info_hdr->sec_cam_count != 0; + conf->pattern_cam_backup = rtw_wow->pattern_cnt != 0; return skb; } static struct sk_buff *rtw_get_rsvd_page_skb(struct ieee80211_hw *hw, struct ieee80211_vif *vif, - enum rtw_rsvd_packet_type type) + struct rtw_rsvd_page *rsvd_pkt) { struct sk_buff *skb_new; + struct cfg80211_ssid *ssid; - switch (type) { + switch (rsvd_pkt->type) { case RSVD_BEACON: skb_new = rtw_beacon_get(hw, vif); break; @@ -639,6 +878,21 @@ static struct sk_buff *rtw_get_rsvd_page_skb(struct ieee80211_hw *hw, case RSVD_LPS_PG_INFO: skb_new = rtw_lps_pg_info_get(hw, vif); break; + case RSVD_PROBE_REQ: + ssid = (struct cfg80211_ssid *)rsvd_pkt->ssid; + if (ssid) + skb_new = ieee80211_probereq_get(hw, vif->addr, + ssid->ssid, + ssid->ssid_len, 0); + else + skb_new = ieee80211_probereq_get(hw, vif->addr, NULL, 0, 0); + break; + case RSVD_NLO_INFO: + skb_new = rtw_nlo_info_get(hw); + break; + case RSVD_CH_INFO: + skb_new = rtw_cs_channel_info_get(hw); + break; default: return NULL; } @@ -680,25 +934,53 @@ static void rtw_rsvd_page_list_to_buf(struct rtw_dev *rtwdev, u8 page_size, memcpy(buf, skb->data, skb->len); } +static struct rtw_rsvd_page *rtw_alloc_rsvd_page(struct rtw_dev *rtwdev, + enum rtw_rsvd_packet_type type, + bool txdesc) +{ + struct rtw_rsvd_page *rsvd_pkt = NULL; + + rsvd_pkt = kzalloc(sizeof(*rsvd_pkt), GFP_KERNEL); + + if (!rsvd_pkt) + return NULL; + + rsvd_pkt->type = type; + rsvd_pkt->add_txdesc = txdesc; + + return rsvd_pkt; +} + +static void rtw_insert_rsvd_page(struct rtw_dev *rtwdev, + struct rtw_rsvd_page *rsvd_pkt) +{ + lockdep_assert_held(&rtwdev->mutex); + list_add_tail(&rsvd_pkt->list, &rtwdev->rsvd_page_list); +} + void rtw_add_rsvd_page(struct rtw_dev *rtwdev, enum rtw_rsvd_packet_type type, bool txdesc) { struct rtw_rsvd_page *rsvd_pkt; - lockdep_assert_held(&rtwdev->mutex); + rsvd_pkt = rtw_alloc_rsvd_page(rtwdev, type, txdesc); + if (!rsvd_pkt) + return; - list_for_each_entry(rsvd_pkt, &rtwdev->rsvd_page_list, list) { - if (rsvd_pkt->type == type) - return; - } + rtw_insert_rsvd_page(rtwdev, rsvd_pkt); +} - rsvd_pkt = kmalloc(sizeof(*rsvd_pkt), GFP_KERNEL); +void rtw_add_rsvd_page_probe_req(struct rtw_dev *rtwdev, + struct cfg80211_ssid *ssid) +{ + struct rtw_rsvd_page *rsvd_pkt; + + rsvd_pkt = rtw_alloc_rsvd_page(rtwdev, RSVD_PROBE_REQ, true); if (!rsvd_pkt) return; - rsvd_pkt->type = type; - rsvd_pkt->add_txdesc = txdesc; - list_add_tail(&rsvd_pkt->list, &rtwdev->rsvd_page_list); + rsvd_pkt->ssid = ssid; + rtw_insert_rsvd_page(rtwdev, rsvd_pkt); } void rtw_reset_rsvd_page(struct rtw_dev *rtwdev) @@ -795,7 +1077,7 @@ static u8 *rtw_build_rsvd_page(struct rtw_dev *rtwdev, page_margin = page_size - tx_desc_sz; list_for_each_entry(rsvd_pkt, &rtwdev->rsvd_page_list, list) { - iter = rtw_get_rsvd_page_skb(hw, vif, rsvd_pkt->type); + iter = rtw_get_rsvd_page_skb(hw, vif, rsvd_pkt); if (!iter) { rtw_err(rtwdev, "failed to build rsvd packet\n"); goto release_skb; @@ -857,13 +1139,16 @@ static u8 *rtw_build_rsvd_page(struct rtw_dev *rtwdev, page += rtw_len_to_page(rsvd_pkt->skb->len, page_size); kfree_skb(rsvd_pkt->skb); + rsvd_pkt->skb = NULL; } return buf; release_skb: - list_for_each_entry(rsvd_pkt, &rtwdev->rsvd_page_list, list) + list_for_each_entry(rsvd_pkt, &rtwdev->rsvd_page_list, list) { kfree_skb(rsvd_pkt->skb); + rsvd_pkt->skb = NULL; + } return NULL; } @@ -973,3 +1258,81 @@ out: rtw_write8(rtwdev, REG_RCR + 2, rcr); return 0; } + +static void __rtw_fw_update_pkt(struct rtw_dev *rtwdev, u8 pkt_id, u16 size, + u8 location) +{ + u8 h2c_pkt[H2C_PKT_SIZE] = {0}; + u16 total_size = H2C_PKT_HDR_SIZE + H2C_PKT_UPDATE_PKT_LEN; + + rtw_h2c_pkt_set_header(h2c_pkt, H2C_PKT_UPDATE_PKT); + + SET_PKT_H2C_TOTAL_LEN(h2c_pkt, total_size); + UPDATE_PKT_SET_PKT_ID(h2c_pkt, pkt_id); + UPDATE_PKT_SET_LOCATION(h2c_pkt, location); + + /* include txdesc size */ + UPDATE_PKT_SET_SIZE(h2c_pkt, size); + + rtw_fw_send_h2c_packet(rtwdev, h2c_pkt); +} + +void rtw_fw_update_pkt_probe_req(struct rtw_dev *rtwdev, + struct cfg80211_ssid *ssid) +{ + u8 loc; + u32 size; + + loc = rtw_get_rsvd_page_probe_req_location(rtwdev, ssid); + if (!loc) { + rtw_err(rtwdev, "failed to get probe_req rsvd loc\n"); + return; + } + + size = rtw_get_rsvd_page_probe_req_size(rtwdev, ssid); + if (!size) { + rtw_err(rtwdev, "failed to get probe_req rsvd size\n"); + return; + } + + __rtw_fw_update_pkt(rtwdev, RTW_PACKET_PROBE_REQ, size, loc); +} + +void rtw_fw_channel_switch(struct rtw_dev *rtwdev, bool enable) +{ + struct rtw_pno_request *rtw_pno_req = &rtwdev->wow.pno_req; + u8 h2c_pkt[H2C_PKT_SIZE] = {0}; + u16 total_size = H2C_PKT_HDR_SIZE + H2C_PKT_CH_SWITCH_LEN; + u8 loc_ch_info; + const struct rtw_ch_switch_option cs_option = { + .dest_ch_en = 1, + .dest_ch = 1, + .periodic_option = 2, + .normal_period = 5, + .normal_period_sel = 0, + .normal_cycle = 10, + .slow_period = 1, + .slow_period_sel = 1, + }; + + rtw_h2c_pkt_set_header(h2c_pkt, H2C_PKT_CH_SWITCH); + SET_PKT_H2C_TOTAL_LEN(h2c_pkt, total_size); + + CH_SWITCH_SET_START(h2c_pkt, enable); + CH_SWITCH_SET_DEST_CH_EN(h2c_pkt, cs_option.dest_ch_en); + CH_SWITCH_SET_DEST_CH(h2c_pkt, cs_option.dest_ch); + CH_SWITCH_SET_NORMAL_PERIOD(h2c_pkt, cs_option.normal_period); + CH_SWITCH_SET_NORMAL_PERIOD_SEL(h2c_pkt, cs_option.normal_period_sel); + CH_SWITCH_SET_SLOW_PERIOD(h2c_pkt, cs_option.slow_period); + CH_SWITCH_SET_SLOW_PERIOD_SEL(h2c_pkt, cs_option.slow_period_sel); + CH_SWITCH_SET_NORMAL_CYCLE(h2c_pkt, cs_option.normal_cycle); + CH_SWITCH_SET_PERIODIC_OPT(h2c_pkt, cs_option.periodic_option); + + CH_SWITCH_SET_CH_NUM(h2c_pkt, rtw_pno_req->channel_cnt); + CH_SWITCH_SET_INFO_SIZE(h2c_pkt, rtw_pno_req->channel_cnt * 4); + + loc_ch_info = rtw_get_rsvd_page_location(rtwdev, RSVD_CH_INFO); + CH_SWITCH_SET_INFO_LOC(h2c_pkt, loc_ch_info); + + rtw_fw_send_h2c_packet(rtwdev, h2c_pkt); +} diff --git a/drivers/net/wireless/realtek/rtw88/fw.h b/drivers/net/wireless/realtek/rtw88/fw.h index 73d1b9ca8efc..ccd27bd45775 100644 --- a/drivers/net/wireless/realtek/rtw88/fw.h +++ b/drivers/net/wireless/realtek/rtw88/fw.h @@ -12,6 +12,8 @@ #define FW_HDR_SIZE 64 #define FW_HDR_CHKSUM_SIZE 8 +#define FW_NLO_INFO_CHECK_SIZE 4 + #define FIFO_PAGE_SIZE_SHIFT 12 #define FIFO_PAGE_SIZE 4096 #define RSVD_PAGE_START_ADDR 0x780 @@ -45,6 +47,9 @@ enum rtw_rsvd_packet_type { RSVD_QOS_NULL, RSVD_LPS_PG_DPK, RSVD_LPS_PG_INFO, + RSVD_PROBE_REQ, + RSVD_NLO_INFO, + RSVD_CH_INFO, }; enum rtw_fw_rf_type { @@ -98,6 +103,57 @@ struct rtw_rsvd_page { enum rtw_rsvd_packet_type type; u8 page; bool add_txdesc; + struct cfg80211_ssid *ssid; +}; + +enum rtw_keep_alive_pkt_type { + KEEP_ALIVE_NULL_PKT = 0, + KEEP_ALIVE_ARP_RSP = 1, +}; + +struct rtw_nlo_info_hdr { + u8 nlo_count; + u8 hidden_ap_count; + u8 rsvd1[2]; + u8 pattern_check[FW_NLO_INFO_CHECK_SIZE]; + u8 rsvd2[8]; + u8 ssid_len[16]; + u8 chiper[16]; + u8 rsvd3[16]; + u8 location[8]; +} __packed; + +enum rtw_packet_type { + RTW_PACKET_PROBE_REQ = 0x00, + + RTW_PACKET_UNDEFINE = 0x7FFFFFFF, +}; + +struct rtw_fw_wow_keep_alive_para { + bool adopt; + u8 pkt_type; + u8 period; /* unit: sec */ +}; + +struct rtw_fw_wow_disconnect_para { + bool adopt; + u8 period; /* unit: sec */ + u8 retry_count; +}; + +struct rtw_ch_switch_option { + u8 periodic_option; + u32 tsf_high; + u32 tsf_low; + u8 dest_ch_en; + u8 absolute_time_en; + u8 dest_ch; + u8 normal_period; + u8 normal_period_sel; + u8 normal_cycle; + u8 slow_period; + u8 slow_period_sel; + u8 nlo_en; }; struct rtw_fw_hdr { @@ -146,6 +202,12 @@ struct rtw_fw_hdr { #define H2C_PKT_PHYDM_INFO 0x11 #define H2C_PKT_IQK 0x0E +#define H2C_PKT_CH_SWITCH 0x02 +#define H2C_PKT_UPDATE_PKT 0x0C + +#define H2C_PKT_CH_SWITCH_LEN 0x20 +#define H2C_PKT_UPDATE_PKT_LEN 0x4 + #define SET_PKT_H2C_CATEGORY(h2c_pkt, value) \ le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, GENMASK(6, 0)) #define SET_PKT_H2C_CMD_ID(h2c_pkt, value) \ @@ -182,6 +244,57 @@ static inline void rtw_h2c_pkt_set_header(u8 *h2c_pkt, u8 sub_id) #define IQK_SET_SEGMENT_IQK(h2c_pkt, value) \ le32p_replace_bits((__le32 *)(h2c_pkt) + 0x02, value, BIT(1)) +#define CHSW_INFO_SET_CH(pkt, value) \ + le32p_replace_bits((__le32 *)(pkt) + 0x00, value, GENMASK(7, 0)) +#define CHSW_INFO_SET_PRI_CH_IDX(pkt, value) \ + le32p_replace_bits((__le32 *)(pkt) + 0x00, value, GENMASK(11, 8)) +#define CHSW_INFO_SET_BW(pkt, value) \ + le32p_replace_bits((__le32 *)(pkt) + 0x00, value, GENMASK(15, 12)) +#define CHSW_INFO_SET_TIMEOUT(pkt, value) \ + le32p_replace_bits((__le32 *)(pkt) + 0x00, value, GENMASK(23, 16)) +#define CHSW_INFO_SET_ACTION_ID(pkt, value) \ + le32p_replace_bits((__le32 *)(pkt) + 0x00, value, GENMASK(30, 24)) + +#define UPDATE_PKT_SET_SIZE(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x02, value, GENMASK(15, 0)) +#define UPDATE_PKT_SET_PKT_ID(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x02, value, GENMASK(23, 16)) +#define UPDATE_PKT_SET_LOCATION(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x02, value, GENMASK(31, 24)) + +#define CH_SWITCH_SET_START(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x02, value, BIT(0)) +#define CH_SWITCH_SET_DEST_CH_EN(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x02, value, BIT(1)) +#define CH_SWITCH_SET_ABSOLUTE_TIME(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x02, value, BIT(2)) +#define CH_SWITCH_SET_PERIODIC_OPT(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x02, value, GENMASK(4, 3)) +#define CH_SWITCH_SET_INFO_LOC(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x02, value, GENMASK(15, 8)) +#define CH_SWITCH_SET_CH_NUM(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x02, value, GENMASK(23, 16)) +#define CH_SWITCH_SET_PRI_CH_IDX(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x02, value, GENMASK(27, 24)) +#define CH_SWITCH_SET_DEST_CH(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x03, value, GENMASK(7, 0)) +#define CH_SWITCH_SET_NORMAL_PERIOD(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x03, value, GENMASK(13, 8)) +#define CH_SWITCH_SET_NORMAL_PERIOD_SEL(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x03, value, GENMASK(15, 14)) +#define CH_SWITCH_SET_SLOW_PERIOD(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x03, value, GENMASK(21, 16)) +#define CH_SWITCH_SET_SLOW_PERIOD_SEL(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x03, value, GENMASK(23, 22)) +#define CH_SWITCH_SET_NORMAL_CYCLE(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x03, value, GENMASK(31, 24)) +#define CH_SWITCH_SET_TSF_HIGH(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x04, value, GENMASK(31, 0)) +#define CH_SWITCH_SET_TSF_LOW(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x05, value, GENMASK(31, 0)) +#define CH_SWITCH_SET_INFO_SIZE(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x06, value, GENMASK(15, 0)) + /* Command H2C */ #define H2C_CMD_RSVD_PAGE 0x0 #define H2C_CMD_MEDIA_STATUS_RPT 0x01 @@ -198,6 +311,13 @@ static inline void rtw_h2c_pkt_set_header(u8 *h2c_pkt, u8 sub_id) #define H2C_CMD_QUERY_BT_MP_INFO 0x67 #define H2C_CMD_BT_WIFI_CONTROL 0x69 +#define H2C_CMD_KEEP_ALIVE 0x03 +#define H2C_CMD_DISCONNECT_DECISION 0x04 +#define H2C_CMD_WOWLAN 0x80 +#define H2C_CMD_REMOTE_WAKE_CTRL 0x81 +#define H2C_CMD_AOAC_GLOBAL_INFO 0x82 +#define H2C_CMD_NLO_INFO 0x8C + #define SET_H2C_CMD_ID_CLASS(h2c_pkt, value) \ le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, GENMASK(7, 0)) @@ -224,6 +344,8 @@ static inline void rtw_h2c_pkt_set_header(u8 *h2c_pkt, u8 sub_id) le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, GENMASK(31, 24)) #define LPS_PG_SEC_CAM_EN(h2c_pkt, value) \ le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, BIT(8)) +#define LPS_PG_PATTERN_CAM_EN(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, BIT(10)) #define SET_RSSI_INFO_MACID(h2c_pkt, value) \ le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, GENMASK(15, 8)) #define SET_RSSI_INFO_RSSI(h2c_pkt, value) \ @@ -301,6 +423,56 @@ static inline void rtw_h2c_pkt_set_header(u8 *h2c_pkt, u8 sub_id) #define SET_BT_WIFI_CONTROL_DATA5(h2c_pkt, value) \ le32p_replace_bits((__le32 *)(h2c_pkt) + 0x01, value, GENMASK(23, 16)) +#define SET_KEEP_ALIVE_ENABLE(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, BIT(8)) +#define SET_KEEP_ALIVE_ADOPT(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, BIT(9)) +#define SET_KEEP_ALIVE_PKT_TYPE(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, BIT(10)) +#define SET_KEEP_ALIVE_CHECK_PERIOD(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, GENMASK(23, 16)) + +#define SET_DISCONNECT_DECISION_ENABLE(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, BIT(8)) +#define SET_DISCONNECT_DECISION_ADOPT(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, BIT(9)) +#define SET_DISCONNECT_DECISION_CHECK_PERIOD(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, GENMASK(23, 16)) +#define SET_DISCONNECT_DECISION_TRY_PKT_NUM(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, GENMASK(31, 24)) + +#define SET_WOWLAN_FUNC_ENABLE(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, BIT(8)) +#define SET_WOWLAN_PATTERN_MATCH_ENABLE(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, BIT(9)) +#define SET_WOWLAN_MAGIC_PKT_ENABLE(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, BIT(10)) +#define SET_WOWLAN_UNICAST_PKT_ENABLE(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, BIT(11)) +#define SET_WOWLAN_REKEY_WAKEUP_ENABLE(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, BIT(14)) +#define SET_WOWLAN_DEAUTH_WAKEUP_ENABLE(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, BIT(15)) + +#define SET_REMOTE_WAKECTRL_ENABLE(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, BIT(8)) +#define SET_REMOTE_WAKE_CTRL_NLO_OFFLOAD_EN(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, BIT(12)) + +#define SET_AOAC_GLOBAL_INFO_PAIRWISE_ENC_ALG(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, GENMASK(15, 8)) +#define SET_AOAC_GLOBAL_INFO_GROUP_ENC_ALG(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, GENMASK(23, 16)) + +#define SET_NLO_FUN_EN(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, BIT(8)) +#define SET_NLO_PS_32K(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, BIT(9)) +#define SET_NLO_IGNORE_SECURITY(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, BIT(10)) +#define SET_NLO_LOC_NLO_INFO(h2c_pkt, value) \ + le32p_replace_bits((__le32 *)(h2c_pkt) + 0x00, value, GENMASK(23, 16)) + static inline struct rtw_c2h_cmd *get_c2h_from_skb(struct sk_buff *skb) { u32 pkt_offset; @@ -332,6 +504,8 @@ void rtw_fw_send_ra_info(struct rtw_dev *rtwdev, struct rtw_sta_info *si); void rtw_fw_media_status_report(struct rtw_dev *rtwdev, u8 mac_id, bool conn); void rtw_add_rsvd_page(struct rtw_dev *rtwdev, enum rtw_rsvd_packet_type type, bool txdesc); +void rtw_add_rsvd_page_probe_req(struct rtw_dev *rtwdev, + struct cfg80211_ssid *ssid); int rtw_fw_write_data_rsvd_page(struct rtw_dev *rtwdev, u16 pg_addr, u8 *buf, u32 size); void rtw_reset_rsvd_page(struct rtw_dev *rtwdev); @@ -340,4 +514,16 @@ int rtw_fw_download_rsvd_page(struct rtw_dev *rtwdev, void rtw_send_rsvd_page_h2c(struct rtw_dev *rtwdev); int rtw_dump_drv_rsvd_page(struct rtw_dev *rtwdev, u32 offset, u32 size, u32 *buf); +void rtw_fw_set_remote_wake_ctrl_cmd(struct rtw_dev *rtwdev, bool enable); +void rtw_fw_set_wowlan_ctrl_cmd(struct rtw_dev *rtwdev, bool enable); +void rtw_fw_set_keep_alive_cmd(struct rtw_dev *rtwdev, bool enable); +void rtw_fw_set_disconnect_decision_cmd(struct rtw_dev *rtwdev, bool enable); +void rtw_fw_set_aoac_global_info_cmd(struct rtw_dev *rtwdev, + u8 pairwise_key_enc, + u8 group_key_enc); + +void rtw_fw_set_nlo_info(struct rtw_dev *rtwdev, bool enable); +void rtw_fw_update_pkt_probe_req(struct rtw_dev *rtwdev, + struct cfg80211_ssid *ssid); +void rtw_fw_channel_switch(struct rtw_dev *rtwdev, bool enable); #endif diff --git a/drivers/net/wireless/realtek/rtw88/hci.h b/drivers/net/wireless/realtek/rtw88/hci.h index 3d91aea942c3..85a81a578fd5 100644 --- a/drivers/net/wireless/realtek/rtw88/hci.h +++ b/drivers/net/wireless/realtek/rtw88/hci.h @@ -15,6 +15,7 @@ struct rtw_hci_ops { void (*stop)(struct rtw_dev *rtwdev); void (*deep_ps)(struct rtw_dev *rtwdev, bool enter); void (*link_ps)(struct rtw_dev *rtwdev, bool enter); + void (*interface_cfg)(struct rtw_dev *rtwdev); int (*write_data_rsvd_page)(struct rtw_dev *rtwdev, u8 *buf, u32 size); int (*write_data_h2c)(struct rtw_dev *rtwdev, u8 *buf, u32 size); @@ -59,6 +60,11 @@ static inline void rtw_hci_link_ps(struct rtw_dev *rtwdev, bool enter) rtwdev->hci.ops->link_ps(rtwdev, enter); } +static inline void rtw_hci_interface_cfg(struct rtw_dev *rtwdev) +{ + rtwdev->hci.ops->interface_cfg(rtwdev); +} + static inline int rtw_hci_write_data_rsvd_page(struct rtw_dev *rtwdev, u8 *buf, u32 size) { diff --git a/drivers/net/wireless/realtek/rtw88/mac.c b/drivers/net/wireless/realtek/rtw88/mac.c index 507970387b2a..cadf0abbe16b 100644 --- a/drivers/net/wireless/realtek/rtw88/mac.c +++ b/drivers/net/wireless/realtek/rtw88/mac.c @@ -16,10 +16,12 @@ void rtw_set_channel_mac(struct rtw_dev *rtwdev, u8 channel, u8 bw, u8 value8; txsc20 = primary_ch_idx; - if (txsc20 == 1 || txsc20 == 3) - txsc40 = 9; - else - txsc40 = 10; + if (bw == RTW_CHANNEL_WIDTH_80) { + if (txsc20 == 1 || txsc20 == 3) + txsc40 = 9; + else + txsc40 = 10; + } rtw_write8(rtwdev, REG_DATA_SC, BIT_TXSC_20M(txsc20) | BIT_TXSC_40M(txsc40)); @@ -1034,5 +1036,7 @@ int rtw_mac_init(struct rtw_dev *rtwdev) if (ret) return ret; + rtw_hci_interface_cfg(rtwdev); + return 0; } diff --git a/drivers/net/wireless/realtek/rtw88/mac80211.c b/drivers/net/wireless/realtek/rtw88/mac80211.c index 34a1c3b53cd4..6fc33e11d08c 100644 --- a/drivers/net/wireless/realtek/rtw88/mac80211.c +++ b/drivers/net/wireless/realtek/rtw88/mac80211.c @@ -12,6 +12,7 @@ #include "reg.h" #include "bf.h" #include "debug.h" +#include "wow.h" static void rtw_ops_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, @@ -152,12 +153,10 @@ static int rtw_ops_add_interface(struct ieee80211_hw *hw, u8 bcn_ctrl = 0; rtwvif->port = port; - rtwvif->vif = vif; rtwvif->stats.tx_unicast = 0; rtwvif->stats.rx_unicast = 0; rtwvif->stats.tx_cnt = 0; rtwvif->stats.rx_cnt = 0; - rtwvif->in_lps = false; memset(&rtwvif->bfee, 0, sizeof(struct rtw_bfee)); rtwvif->conf = &rtw_vif_port[port]; rtw_txq_init(rtwdev, vif->txq); @@ -735,6 +734,44 @@ static int rtw_ops_set_bitrate_mask(struct ieee80211_hw *hw, return 0; } +#ifdef CONFIG_PM +static int rtw_ops_suspend(struct ieee80211_hw *hw, + struct cfg80211_wowlan *wowlan) +{ + struct rtw_dev *rtwdev = hw->priv; + int ret; + + mutex_lock(&rtwdev->mutex); + ret = rtw_wow_suspend(rtwdev, wowlan); + if (ret) + rtw_err(rtwdev, "failed to suspend for wow %d\n", ret); + mutex_unlock(&rtwdev->mutex); + + return ret ? 1 : 0; +} + +static int rtw_ops_resume(struct ieee80211_hw *hw) +{ + struct rtw_dev *rtwdev = hw->priv; + int ret; + + mutex_lock(&rtwdev->mutex); + ret = rtw_wow_resume(rtwdev); + if (ret) + rtw_err(rtwdev, "failed to resume for wow %d\n", ret); + mutex_unlock(&rtwdev->mutex); + + return ret ? 1 : 0; +} + +static void rtw_ops_set_wakeup(struct ieee80211_hw *hw, bool enabled) +{ + struct rtw_dev *rtwdev = hw->priv; + + device_set_wakeup_enable(rtwdev->dev, enabled); +} +#endif + const struct ieee80211_ops rtw_ops = { .tx = rtw_ops_tx, .wake_tx_queue = rtw_ops_wake_tx_queue, @@ -757,5 +794,10 @@ const struct ieee80211_ops rtw_ops = { .sta_statistics = rtw_ops_sta_statistics, .flush = rtw_ops_flush, .set_bitrate_mask = rtw_ops_set_bitrate_mask, +#ifdef CONFIG_PM + .suspend = rtw_ops_suspend, + .resume = rtw_ops_resume, + .set_wakeup = rtw_ops_set_wakeup, +#endif }; EXPORT_SYMBOL(rtw_ops); diff --git a/drivers/net/wireless/realtek/rtw88/main.c b/drivers/net/wireless/realtek/rtw88/main.c index ae61415e1665..2845d2838f7b 100644 --- a/drivers/net/wireless/realtek/rtw88/main.c +++ b/drivers/net/wireless/realtek/rtw88/main.c @@ -706,8 +706,8 @@ void rtw_update_sta_info(struct rtw_dev *rtwdev, struct rtw_sta_info *si) if (sta->vht_cap.cap & IEEE80211_VHT_CAP_SHORT_GI_80) is_support_sgi = true; } else if (sta->ht_cap.ht_supported) { - ra_mask |= (sta->ht_cap.mcs.rx_mask[NL80211_BAND_5GHZ] << 20) | - (sta->ht_cap.mcs.rx_mask[NL80211_BAND_2GHZ] << 12); + ra_mask |= (sta->ht_cap.mcs.rx_mask[1] << 20) | + (sta->ht_cap.mcs.rx_mask[0] << 12); if (sta->ht_cap.cap & IEEE80211_HT_CAP_RX_STBC) stbc_en = HT_STBC_EN; if (sta->ht_cap.cap & IEEE80211_HT_CAP_LDPC_CODING) @@ -717,6 +717,9 @@ void rtw_update_sta_info(struct rtw_dev *rtwdev, struct rtw_sta_info *si) is_support_sgi = true; } + if (efuse->hw_cap.nss == 1) + ra_mask &= RA_MASK_VHT_RATES_1SS | RA_MASK_HT_RATES_1SS; + if (hal->current_band_type == RTW_BAND_5G) { ra_mask |= (u64)sta->supp_rates[NL80211_BAND_5GHZ] << 4; if (sta->vht_cap.vht_supported) { @@ -750,11 +753,6 @@ void rtw_update_sta_info(struct rtw_dev *rtwdev, struct rtw_sta_info *si) wireless_set = 0; } - if (efuse->hw_cap.nss == 1) { - ra_mask &= RA_MASK_VHT_RATES_1SS; - ra_mask &= RA_MASK_HT_RATES_1SS; - } - switch (sta->bandwidth) { case IEEE80211_STA_RX_BW_80: bw_mode = RTW_CHANNEL_WIDTH_80; @@ -793,6 +791,26 @@ void rtw_update_sta_info(struct rtw_dev *rtwdev, struct rtw_sta_info *si) rtw_fw_send_ra_info(rtwdev, si); } +static int rtw_wait_firmware_completion(struct rtw_dev *rtwdev) +{ + struct rtw_chip_info *chip = rtwdev->chip; + struct rtw_fw_state *fw; + + fw = &rtwdev->fw; + wait_for_completion(&fw->completion); + if (!fw->firmware) + return -EINVAL; + + if (chip->wow_fw_name) { + fw = &rtwdev->wow_fw; + wait_for_completion(&fw->completion); + if (!fw->firmware) + return -EINVAL; + } + + return 0; +} + static int rtw_power_on(struct rtw_dev *rtwdev) { struct rtw_chip_info *chip = rtwdev->chip; @@ -813,11 +831,10 @@ static int rtw_power_on(struct rtw_dev *rtwdev) goto err; } - wait_for_completion(&fw->completion); - if (!fw->firmware) { - ret = -EINVAL; - rtw_err(rtwdev, "failed to load firmware\n"); - goto err; + ret = rtw_wait_firmware_completion(rtwdev); + if (ret) { + rtw_err(rtwdev, "failed to wait firmware completion\n"); + goto err_off; } ret = rtw_download_firmware(rtwdev, fw); @@ -881,7 +898,7 @@ int rtw_core_start(struct rtw_dev *rtwdev) static void rtw_power_off(struct rtw_dev *rtwdev) { - rtwdev->hci.ops->stop(rtwdev); + rtw_hci_stop(rtwdev); rtw_mac_power_off(rtwdev); } @@ -1020,8 +1037,8 @@ static void rtw_unset_supported_band(struct ieee80211_hw *hw, static void rtw_load_firmware_cb(const struct firmware *firmware, void *context) { - struct rtw_dev *rtwdev = context; - struct rtw_fw_state *fw = &rtwdev->fw; + struct rtw_fw_state *fw = context; + struct rtw_dev *rtwdev = fw->rtwdev; const struct rtw_fw_hdr *fw_hdr; if (!firmware || !firmware->data) { @@ -1043,17 +1060,35 @@ static void rtw_load_firmware_cb(const struct firmware *firmware, void *context) fw->version, fw->sub_version, fw->sub_index, fw->h2c_version); } -static int rtw_load_firmware(struct rtw_dev *rtwdev, const char *fw_name) +static int rtw_load_firmware(struct rtw_dev *rtwdev, enum rtw_fw_type type) { - struct rtw_fw_state *fw = &rtwdev->fw; + const char *fw_name; + struct rtw_fw_state *fw; int ret; + switch (type) { + case RTW_WOWLAN_FW: + fw = &rtwdev->wow_fw; + fw_name = rtwdev->chip->wow_fw_name; + break; + + case RTW_NORMAL_FW: + fw = &rtwdev->fw; + fw_name = rtwdev->chip->fw_name; + break; + + default: + rtw_warn(rtwdev, "unsupported firmware type\n"); + return -ENOENT; + } + + fw->rtwdev = rtwdev; init_completion(&fw->completion); ret = request_firmware_nowait(THIS_MODULE, true, fw_name, rtwdev->dev, - GFP_KERNEL, rtwdev, rtw_load_firmware_cb); + GFP_KERNEL, fw, rtw_load_firmware_cb); if (ret) { - rtw_err(rtwdev, "async firmware request failed\n"); + rtw_err(rtwdev, "failed to async firmware request\n"); return ret; } @@ -1341,7 +1376,6 @@ int rtw_core_init(struct rtw_dev *rtwdev) skb_queue_head_init(&rtwdev->coex.queue); skb_queue_head_init(&rtwdev->tx_report.queue); - spin_lock_init(&rtwdev->dm_lock); spin_lock_init(&rtwdev->rf_lock); spin_lock_init(&rtwdev->h2c.lock); spin_lock_init(&rtwdev->txq_lock); @@ -1372,12 +1406,19 @@ int rtw_core_init(struct rtw_dev *rtwdev) BIT_HTC_LOC_CTRL | BIT_APP_PHYSTS | BIT_AB | BIT_AM | BIT_APM; - ret = rtw_load_firmware(rtwdev, rtwdev->chip->fw_name); + ret = rtw_load_firmware(rtwdev, RTW_NORMAL_FW); if (ret) { rtw_warn(rtwdev, "no firmware loaded\n"); return ret; } + if (chip->wow_fw_name) { + ret = rtw_load_firmware(rtwdev, RTW_WOWLAN_FW); + if (ret) { + rtw_warn(rtwdev, "no wow firmware loaded\n"); + return ret; + } + } return 0; } EXPORT_SYMBOL(rtw_core_init); @@ -1385,12 +1426,16 @@ EXPORT_SYMBOL(rtw_core_init); void rtw_core_deinit(struct rtw_dev *rtwdev) { struct rtw_fw_state *fw = &rtwdev->fw; + struct rtw_fw_state *wow_fw = &rtwdev->wow_fw; struct rtw_rsvd_page *rsvd_pkt, *tmp; unsigned long flags; if (fw->firmware) release_firmware(fw->firmware); + if (wow_fw->firmware) + release_firmware(wow_fw->firmware); + tasklet_kill(&rtwdev->tx_tasklet); spin_lock_irqsave(&rtwdev->tx_report.q_lock, flags); skb_queue_purge(&rtwdev->tx_report.queue); @@ -1445,6 +1490,10 @@ int rtw_register_hw(struct rtw_dev *rtwdev, struct ieee80211_hw *hw) wiphy_ext_feature_set(hw->wiphy, NL80211_EXT_FEATURE_CAN_REPLACE_PTK0); +#ifdef CONFIG_PM + hw->wiphy->wowlan = rtwdev->chip->wowlan_stub; + hw->wiphy->max_sched_scan_ssids = rtwdev->chip->max_sched_scan_ssids; +#endif rtw_set_supported_band(hw, rtwdev->chip); SET_IEEE80211_PERM_ADDR(hw, rtwdev->efuse.addr); diff --git a/drivers/net/wireless/realtek/rtw88/main.h b/drivers/net/wireless/realtek/rtw88/main.h index d012eefcd0da..f334d201bfb5 100644 --- a/drivers/net/wireless/realtek/rtw88/main.h +++ b/drivers/net/wireless/realtek/rtw88/main.h @@ -19,6 +19,10 @@ #define RTW_MAX_SEC_CAM_NUM 32 #define MAX_PG_CAM_BACKUP_NUM 8 +#define RTW_MAX_PATTERN_NUM 12 +#define RTW_MAX_PATTERN_MASK_SIZE 16 +#define RTW_MAX_PATTERN_SIZE 128 + #define RTW_WATCH_DOG_DELAY_TIME round_jiffies_relative(HZ * 2) #define RFREG_MASK 0xfffff @@ -193,6 +197,11 @@ enum rtw_rx_queue_type { RTK_MAX_RX_QUEUE_NUM }; +enum rtw_fw_type { + RTW_NORMAL_FW = 0x0, + RTW_WOWLAN_FW = 0x1, +}; + enum rtw_rate_index { RTW_RATEID_BGN_40M_2SS = 0, RTW_RATEID_BGN_40M_1SS = 1, @@ -337,6 +346,7 @@ enum rtw_flags { RTW_FLAG_LEISURE_PS_DEEP, RTW_FLAG_DIG_DISABLE, RTW_FLAG_BUSY_TRAFFIC, + RTW_FLAG_WOWLAN, NUM_OF_RTW_FLAGS, }; @@ -367,6 +377,15 @@ enum rtw_snr { RTW_SNR_NUM }; +enum rtw_wow_flags { + RTW_WOW_FLAG_EN_MAGIC_PKT, + RTW_WOW_FLAG_EN_REKEY_PKT, + RTW_WOW_FLAG_EN_DISCONNECT, + + /* keep it last */ + RTW_WOW_FLAG_MAX, +}; + /* the power index is represented by differences, which cck-1s & ht40-1s are * the base values, so for 1s's differences, there are only ht20 & ofdm */ @@ -608,6 +627,7 @@ struct rtw_lps_conf { u8 smart_ps; u8 port_id; bool sec_cam_backup; + bool pattern_cam_backup; }; enum rtw_hw_key_type { @@ -716,7 +736,6 @@ struct rtw_bf_info { }; struct rtw_vif { - struct ieee80211_vif *vif; enum rtw_net_type net_type; u16 aid; u8 mac_addr[ETH_ALEN]; @@ -727,7 +746,6 @@ struct rtw_vif { const struct rtw_vif_port *conf; struct rtw_traffic_stats stats; - bool in_lps; struct rtw_bfee bfee; }; @@ -902,6 +920,33 @@ struct rtw_intf_phy_para { u16 platform; }; +struct rtw_wow_pattern { + u16 crc; + u8 type; + u8 valid; + u8 mask[RTW_MAX_PATTERN_MASK_SIZE]; +}; + +struct rtw_pno_request { + bool inited; + u32 match_set_cnt; + struct cfg80211_match_set *match_sets; + u8 channel_cnt; + struct ieee80211_channel *channels; + struct cfg80211_sched_scan_plan scan_plan; +}; + +struct rtw_wow_param { + struct ieee80211_vif *wow_vif; + DECLARE_BITMAP(flags, RTW_WOW_FLAG_MAX); + u8 txpause; + u8 pattern_cnt; + struct rtw_wow_pattern patterns[RTW_MAX_PATTERN_NUM]; + + bool ips_enabled; + struct rtw_pno_request pno_req; +}; + struct rtw_intf_phy_para_table { struct rtw_intf_phy_para *usb2_para; struct rtw_intf_phy_para *usb3_para; @@ -1030,6 +1075,10 @@ struct rtw_chip_info { u8 bfer_su_max_num; u8 bfer_mu_max_num; + const char *wow_fw_name; + const struct wiphy_wowlan_support *wowlan_stub; + const u8 max_sched_scan_ssids; + /* coex paras */ u32 coex_para_ver; u8 bt_desired_ver; @@ -1456,6 +1505,7 @@ struct rtw_fifo_conf { struct rtw_fw_state { const struct firmware *firmware; + struct rtw_dev *rtwdev; struct completion completion; u16 version; u8 sub_version; @@ -1534,9 +1584,6 @@ struct rtw_dev { /* ensures exclusive access from mac80211 callbacks */ struct mutex mutex; - /* lock for dm to use */ - spinlock_t dm_lock; - /* read/write rf register */ spinlock_t rf_lock; @@ -1580,6 +1627,9 @@ struct rtw_dev { u8 mp_mode; + struct rtw_fw_state wow_fw; + struct rtw_wow_param wow; + /* hci related data, must be last */ u8 priv[0] __aligned(sizeof(void *)); }; @@ -1605,6 +1655,18 @@ static inline struct ieee80211_vif *rtwvif_to_vif(struct rtw_vif *rtwvif) return container_of(p, struct ieee80211_vif, drv_priv); } +static inline bool rtw_ssid_equal(struct cfg80211_ssid *a, + struct cfg80211_ssid *b) +{ + if (!a || !b || a->ssid_len != b->ssid_len) + return false; + + if (memcmp(a->ssid, b->ssid, a->ssid_len)) + return false; + + return true; +} + void rtw_get_channel_params(struct cfg80211_chan_def *chandef, struct rtw_channel_params *ch_param); bool check_hw_ready(struct rtw_dev *rtwdev, u32 addr, u32 mask, u32 target); diff --git a/drivers/net/wireless/realtek/rtw88/pci.c b/drivers/net/wireless/realtek/rtw88/pci.c index a58e8276a41a..1fbc14c149ec 100644 --- a/drivers/net/wireless/realtek/rtw88/pci.c +++ b/drivers/net/wireless/realtek/rtw88/pci.c @@ -6,6 +6,7 @@ #include <linux/pci.h> #include "main.h" #include "pci.h" +#include "reg.h" #include "tx.h" #include "rx.h" #include "fw.h" @@ -486,13 +487,6 @@ static void rtw_pci_disable_interrupt(struct rtw_dev *rtwdev, rtwpci->irq_enabled = false; } -static int rtw_pci_setup(struct rtw_dev *rtwdev) -{ - rtw_pci_reset_trx_ring(rtwdev); - - return 0; -} - static void rtw_pci_dma_reset(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci) { /* reset dma and rx tag */ @@ -501,11 +495,22 @@ static void rtw_pci_dma_reset(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci) rtwpci->rx_tag = 0; } +static int rtw_pci_setup(struct rtw_dev *rtwdev) +{ + struct rtw_pci *rtwpci = (struct rtw_pci *)rtwdev->priv; + + rtw_pci_reset_trx_ring(rtwdev); + rtw_pci_dma_reset(rtwdev, rtwpci); + + return 0; +} + static void rtw_pci_dma_release(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci) { struct rtw_pci_tx_ring *tx_ring; u8 queue; + rtw_pci_reset_trx_ring(rtwdev); for (queue = 0; queue < RTK_MAX_TX_QUEUE_NUM; queue++) { tx_ring = &rtwpci->tx_rings[queue]; rtw_pci_free_tx_ring_skbs(rtwdev, tx_ring); @@ -517,8 +522,6 @@ static int rtw_pci_start(struct rtw_dev *rtwdev) struct rtw_pci *rtwpci = (struct rtw_pci *)rtwdev->priv; unsigned long flags; - rtw_pci_dma_reset(rtwdev, rtwpci); - spin_lock_irqsave(&rtwpci->irq_lock, flags); rtw_pci_enable_interrupt(rtwdev, rtwpci); spin_unlock_irqrestore(&rtwpci->irq_lock, flags); @@ -832,6 +835,11 @@ static void rtw_pci_tx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci, while (count--) { skb = skb_dequeue(&ring->queue); + if (!skb) { + rtw_err(rtwdev, "failed to dequeue %d skb TX queue %d, BD=0x%08x, rp %d -> %d\n", + count, hw_queue, bd_idx, ring->r.rp, cur_rp); + break; + } tx_data = rtw_pci_get_tx_data(skb); pci_unmap_single(rtwpci->pdev, tx_data->dma, skb->len, PCI_DMA_TODEVICE); @@ -1222,6 +1230,21 @@ static void rtw_pci_link_cfg(struct rtw_dev *rtwdev) rtwpci->link_ctrl = link_ctrl; } +static void rtw_pci_interface_cfg(struct rtw_dev *rtwdev) +{ + struct rtw_chip_info *chip = rtwdev->chip; + + switch (chip->id) { + case RTW_CHIP_TYPE_8822C: + if (rtwdev->hal.cut_version >= RTW_CHIP_VER_CUT_D) + rtw_write32_mask(rtwdev, REG_HCI_MIX_CFG, + BIT_PCIE_EMAC_PDN_AUX_TO_FAST_CLK, 1); + break; + default: + break; + } +} + static void rtw_pci_phy_cfg(struct rtw_dev *rtwdev) { struct rtw_chip_info *chip = rtwdev->chip; @@ -1264,6 +1287,23 @@ static void rtw_pci_phy_cfg(struct rtw_dev *rtwdev) rtw_pci_link_cfg(rtwdev); } +#ifdef CONFIG_PM +static int rtw_pci_suspend(struct device *dev) +{ + return 0; +} + +static int rtw_pci_resume(struct device *dev) +{ + return 0; +} + +static SIMPLE_DEV_PM_OPS(rtw_pm_ops, rtw_pci_suspend, rtw_pci_resume); +#define RTW_PM_OPS (&rtw_pm_ops) +#else +#define RTW_PM_OPS NULL +#endif + static int rtw_pci_claim(struct rtw_dev *rtwdev, struct pci_dev *pdev) { int ret; @@ -1330,6 +1370,7 @@ static struct rtw_hci_ops rtw_pci_ops = { .stop = rtw_pci_stop, .deep_ps = rtw_pci_deep_ps, .link_ps = rtw_pci_link_ps, + .interface_cfg = rtw_pci_interface_cfg, .read8 = rtw_pci_read8, .read16 = rtw_pci_read16, @@ -1489,6 +1530,7 @@ static struct pci_driver rtw_pci_driver = { .id_table = rtw_pci_id_table, .probe = rtw_pci_probe, .remove = rtw_pci_remove, + .driver.pm = RTW_PM_OPS, }; module_pci_driver(rtw_pci_driver); diff --git a/drivers/net/wireless/realtek/rtw88/pci.h b/drivers/net/wireless/realtek/rtw88/pci.h index 49bf29a92152..1580cfc57361 100644 --- a/drivers/net/wireless/realtek/rtw88/pci.h +++ b/drivers/net/wireless/realtek/rtw88/pci.h @@ -210,7 +210,7 @@ struct rtw_pci { void __iomem *mmap; }; -static u32 max_num_of_tx_queue(u8 queue) +static inline u32 max_num_of_tx_queue(u8 queue) { u32 max_num; diff --git a/drivers/net/wireless/realtek/rtw88/phy.c b/drivers/net/wireless/realtek/rtw88/phy.c index a3e1e9578b65..eea9d888fbf1 100644 --- a/drivers/net/wireless/realtek/rtw88/phy.c +++ b/drivers/net/wireless/realtek/rtw88/phy.c @@ -1434,7 +1434,7 @@ static void rtw_load_rfk_table(struct rtw_dev *rtwdev) rtw_load_table(rtwdev, chip->rfk_init_tbl); - dpk_info->is_dpk_pwr_on = 1; + dpk_info->is_dpk_pwr_on = true; } void rtw_phy_load_tables(struct rtw_dev *rtwdev) diff --git a/drivers/net/wireless/realtek/rtw88/ps.c b/drivers/net/wireless/realtek/rtw88/ps.c index 913e6f47130f..7a189a9926fe 100644 --- a/drivers/net/wireless/realtek/rtw88/ps.c +++ b/drivers/net/wireless/realtek/rtw88/ps.c @@ -91,11 +91,11 @@ void rtw_power_mode_change(struct rtw_dev *rtwdev, bool enter) return; /* check confirm power mode has left power save state */ - for (polling_cnt = 0; polling_cnt < 3; polling_cnt++) { + for (polling_cnt = 0; polling_cnt < 50; polling_cnt++) { polling = rtw_read8(rtwdev, rtwdev->hci.cpwm_addr); if ((polling ^ confirm) & BIT_RPWM_TOGGLE) return; - mdelay(20); + udelay(100); } /* in case of fw/hw missed the request, retry */ diff --git a/drivers/net/wireless/realtek/rtw88/reg.h b/drivers/net/wireless/realtek/rtw88/reg.h index 7e817bc997eb..9d94534c9674 100644 --- a/drivers/net/wireless/realtek/rtw88/reg.h +++ b/drivers/net/wireless/realtek/rtw88/reg.h @@ -9,6 +9,8 @@ #define BIT_FEN_CPUEN BIT(2) #define BIT_FEN_BB_GLB_RST BIT(1) #define BIT_FEN_BB_RSTB BIT(0) +#define BIT_R_DIS_PRST BIT(6) +#define BIT_WLOCK_1C_B6 BIT(5) #define REG_SYS_PW_CTRL 0x0004 #define REG_SYS_CLK_CTRL 0x0008 #define BIT_CPU_CLK_EN BIT(14) @@ -160,8 +162,12 @@ #define REG_CR 0x0100 #define REG_TRXFF_BNDY 0x0114 #define REG_RXFF_BNDY 0x011C +#define REG_FE1IMR 0x0120 +#define BIT_FS_RXDONE BIT(16) #define REG_PKTBUF_DBG_CTRL 0x0140 #define REG_C2HEVT 0x01A0 +#define REG_MCUTST_II 0x01C4 +#define REG_WOWLAN_WAKE_REASON 0x01C7 #define REG_HMETFR 0x01CC #define REG_HMEBOX0 0x01D0 #define REG_HMEBOX1 0x01D4 @@ -192,9 +198,18 @@ #define REG_H2C_TAIL 0x0248 #define REG_H2C_READ_ADDR 0x024C #define REG_H2C_INFO 0x0254 +#define REG_RXPKT_NUM 0x0284 +#define BIT_RXDMA_REQ BIT(19) +#define BIT_RW_RELEASE BIT(18) +#define BIT_RXDMA_IDLE BIT(17) +#define REG_RXPKTNUM 0x02B0 #define REG_INT_MIG 0x0304 +#define REG_HCI_MIX_CFG 0x03FC +#define BIT_PCIE_EMAC_PDN_AUX_TO_FAST_CLK BIT(26) +#define REG_BCNQ_INFO 0x0418 +#define BIT_MGQ_CPU_EMPTY BIT(24) #define REG_FWHW_TXQ_CTRL 0x0420 #define BIT_EN_BCNQ_DL BIT(22) #define BIT_EN_WR_FREE_TAIL BIT(20) @@ -323,6 +338,20 @@ #define BIT_RFMOD_80M BIT(8) #define BIT_RFMOD_40M BIT(7) #define REG_WMAC_TRXPTCL_CTL_H 0x066C +#define REG_WKFMCAM_CMD 0x0698 +#define BIT_WKFCAM_POLLING_V1 BIT(31) +#define BIT_WKFCAM_CLR_V1 BIT(30) +#define BIT_WKFCAM_WE BIT(16) +#define BIT_SHIFT_WKFCAM_ADDR_V2 8 +#define BIT_MASK_WKFCAM_ADDR_V2 0xff +#define BIT_WKFCAM_ADDR_V2(x) \ + (((x) & BIT_MASK_WKFCAM_ADDR_V2) << BIT_SHIFT_WKFCAM_ADDR_V2) +#define REG_WKFMCAM_RWD 0x069C +#define BIT_WKFMCAM_VALID BIT(31) +#define BIT_WKFMCAM_BC BIT(26) +#define BIT_WKFMCAM_MC BIT(25) +#define BIT_WKFMCAM_UC BIT(24) + #define REG_RXFLTMAP0 0x06A0 #define REG_RXFLTMAP1 0x06A2 #define REG_RXFLTMAP2 0x06A4 diff --git a/drivers/net/wireless/realtek/rtw88/rtw8822c.c b/drivers/net/wireless/realtek/rtw88/rtw8822c.c index 174029836833..3865097696d4 100644 --- a/drivers/net/wireless/realtek/rtw88/rtw8822c.c +++ b/drivers/net/wireless/realtek/rtw88/rtw8822c.c @@ -3487,12 +3487,12 @@ static struct rtw_pwr_seq_cmd trans_cardemu_to_act_8822c[] = { RTW_PWR_CUT_ALL_MSK, RTW_PWR_INTF_ALL_MSK, RTW_PWR_ADDR_MAC, - RTW_PWR_CMD_WRITE, BIT(7), 0}, - {0x0005, + RTW_PWR_CMD_WRITE, (BIT(4) | BIT(3)), 0}, + {0x1018, RTW_PWR_CUT_ALL_MSK, RTW_PWR_INTF_ALL_MSK, RTW_PWR_ADDR_MAC, - RTW_PWR_CMD_WRITE, (BIT(4) | BIT(3)), 0}, + RTW_PWR_CMD_WRITE, BIT(2), BIT(2)}, {0x0005, RTW_PWR_CUT_ALL_MSK, RTW_PWR_INTF_ALL_MSK, @@ -4060,6 +4060,18 @@ static const struct rtw_pwr_track_tbl rtw8822c_rtw_pwr_track_tbl = { .pwrtrk_2g_ccka_p = rtw8822c_pwrtrk_2g_cck_a_p, }; +#ifdef CONFIG_PM +static const struct wiphy_wowlan_support rtw_wowlan_stub_8822c = { + .flags = WIPHY_WOWLAN_MAGIC_PKT | WIPHY_WOWLAN_GTK_REKEY_FAILURE | + WIPHY_WOWLAN_DISCONNECT | WIPHY_WOWLAN_SUPPORTS_GTK_REKEY | + WIPHY_WOWLAN_NET_DETECT, + .n_patterns = RTW_MAX_PATTERN_NUM, + .pattern_max_len = RTW_MAX_PATTERN_SIZE, + .pattern_min_len = 1, + .max_nd_match_sets = 4, +}; +#endif + struct rtw_chip_info rtw8822c_hw_spec = { .ops = &rtw8822c_ops, .id = RTW_CHIP_TYPE_8822C, @@ -4106,6 +4118,11 @@ struct rtw_chip_info rtw8822c_hw_spec = { .bfer_su_max_num = 2, .bfer_mu_max_num = 1, +#ifdef CONFIG_PM + .wow_fw_name = "rtw88/rtw8822c_wow_fw.bin", + .wowlan_stub = &rtw_wowlan_stub_8822c, + .max_sched_scan_ssids = 4, +#endif .coex_para_ver = 0x19062706, .bt_desired_ver = 0x6, .scbd_support = true, @@ -4135,3 +4152,4 @@ struct rtw_chip_info rtw8822c_hw_spec = { EXPORT_SYMBOL(rtw8822c_hw_spec); MODULE_FIRMWARE("rtw88/rtw8822c_fw.bin"); +MODULE_FIRMWARE("rtw88/rtw8822c_wow_fw.bin"); diff --git a/drivers/net/wireless/realtek/rtw88/util.h b/drivers/net/wireless/realtek/rtw88/util.h index 7bd2843b0bce..41c10e7144df 100644 --- a/drivers/net/wireless/realtek/rtw88/util.h +++ b/drivers/net/wireless/realtek/rtw88/util.h @@ -15,6 +15,8 @@ struct rtw_dev; IEEE80211_IFACE_ITER_NORMAL, iterator, data) #define rtw_iterate_stas_atomic(rtwdev, iterator, data) \ ieee80211_iterate_stations_atomic(rtwdev->hw, iterator, data) +#define rtw_iterate_keys(rtwdev, vif, iterator, data) \ + ieee80211_iter_keys(rtwdev->hw, vif, iterator, data) static inline u8 *get_hdr_bssid(struct ieee80211_hdr *hdr) { diff --git a/drivers/net/wireless/realtek/rtw88/wow.c b/drivers/net/wireless/realtek/rtw88/wow.c new file mode 100644 index 000000000000..af5c27e1bb07 --- /dev/null +++ b/drivers/net/wireless/realtek/rtw88/wow.c @@ -0,0 +1,890 @@ +// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause +/* Copyright(c) 2018-2019 Realtek Corporation + */ + +#include "main.h" +#include "fw.h" +#include "wow.h" +#include "reg.h" +#include "debug.h" +#include "mac.h" +#include "ps.h" + +static void rtw_wow_show_wakeup_reason(struct rtw_dev *rtwdev) +{ + u8 reason; + + reason = rtw_read8(rtwdev, REG_WOWLAN_WAKE_REASON); + + if (reason == RTW_WOW_RSN_RX_DEAUTH) + rtw_dbg(rtwdev, RTW_DBG_WOW, "WOW: Rx deauth\n"); + else if (reason == RTW_WOW_RSN_DISCONNECT) + rtw_dbg(rtwdev, RTW_DBG_WOW, "WOW: AP is off\n"); + else if (reason == RTW_WOW_RSN_RX_MAGIC_PKT) + rtw_dbg(rtwdev, RTW_DBG_WOW, "WOW: Rx magic packet\n"); + else if (reason == RTW_WOW_RSN_RX_GTK_REKEY) + rtw_dbg(rtwdev, RTW_DBG_WOW, "WOW: Rx gtk rekey\n"); + else if (reason == RTW_WOW_RSN_RX_PTK_REKEY) + rtw_dbg(rtwdev, RTW_DBG_WOW, "WOW: Rx ptk rekey\n"); + else if (reason == RTW_WOW_RSN_RX_PATTERN_MATCH) + rtw_dbg(rtwdev, RTW_DBG_WOW, "WOW: Rx pattern match packet\n"); + else if (reason == RTW_WOW_RSN_RX_NLO) + rtw_dbg(rtwdev, RTW_DBG_WOW, "Rx NLO\n"); + else + rtw_warn(rtwdev, "Unknown wakeup reason %x\n", reason); +} + +static void rtw_wow_pattern_write_cam(struct rtw_dev *rtwdev, u8 addr, + u32 wdata) +{ + rtw_write32(rtwdev, REG_WKFMCAM_RWD, wdata); + rtw_write32(rtwdev, REG_WKFMCAM_CMD, BIT_WKFCAM_POLLING_V1 | + BIT_WKFCAM_WE | BIT_WKFCAM_ADDR_V2(addr)); + + if (!check_hw_ready(rtwdev, REG_WKFMCAM_CMD, BIT_WKFCAM_POLLING_V1, 0)) + rtw_err(rtwdev, "failed to write pattern cam\n"); +} + +static void rtw_wow_pattern_write_cam_ent(struct rtw_dev *rtwdev, u8 id, + struct rtw_wow_pattern *rtw_pattern) +{ + int i; + u8 addr; + u32 wdata; + + for (i = 0; i < RTW_MAX_PATTERN_MASK_SIZE / 4; i++) { + addr = (id << 3) + i; + wdata = rtw_pattern->mask[i * 4]; + wdata |= rtw_pattern->mask[i * 4 + 1] << 8; + wdata |= rtw_pattern->mask[i * 4 + 2] << 16; + wdata |= rtw_pattern->mask[i * 4 + 3] << 24; + rtw_wow_pattern_write_cam(rtwdev, addr, wdata); + } + + wdata = rtw_pattern->crc; + addr = (id << 3) + RTW_MAX_PATTERN_MASK_SIZE / 4; + + switch (rtw_pattern->type) { + case RTW_PATTERN_BROADCAST: + wdata |= BIT_WKFMCAM_BC | BIT_WKFMCAM_VALID; + break; + case RTW_PATTERN_MULTICAST: + wdata |= BIT_WKFMCAM_MC | BIT_WKFMCAM_VALID; + break; + case RTW_PATTERN_UNICAST: + wdata |= BIT_WKFMCAM_UC | BIT_WKFMCAM_VALID; + break; + default: + break; + } + rtw_wow_pattern_write_cam(rtwdev, addr, wdata); +} + +/* RTK internal CRC16 for Pattern Cam */ +static u16 __rtw_cal_crc16(u8 data, u16 crc) +{ + u8 shift_in, data_bit; + u8 crc_bit4, crc_bit11, crc_bit15; + u16 crc_result; + int index; + + for (index = 0; index < 8; index++) { + crc_bit15 = ((crc & BIT(15)) ? 1 : 0); + data_bit = (data & (BIT(0) << index) ? 1 : 0); + shift_in = crc_bit15 ^ data_bit; + + crc_result = crc << 1; + + if (shift_in == 0) + crc_result &= (~BIT(0)); + else + crc_result |= BIT(0); + + crc_bit11 = ((crc & BIT(11)) ? 1 : 0) ^ shift_in; + + if (crc_bit11 == 0) + crc_result &= (~BIT(12)); + else + crc_result |= BIT(12); + + crc_bit4 = ((crc & BIT(4)) ? 1 : 0) ^ shift_in; + + if (crc_bit4 == 0) + crc_result &= (~BIT(5)); + else + crc_result |= BIT(5); + + crc = crc_result; + } + return crc; +} + +static u16 rtw_calc_crc(u8 *pdata, int length) +{ + u16 crc = 0xffff; + int i; + + for (i = 0; i < length; i++) + crc = __rtw_cal_crc16(pdata[i], crc); + + /* get 1' complement */ + return ~crc; +} + +static void rtw_wow_pattern_generate(struct rtw_dev *rtwdev, + struct rtw_vif *rtwvif, + const struct cfg80211_pkt_pattern *pkt_pattern, + struct rtw_wow_pattern *rtw_pattern) +{ + const u8 *mask; + const u8 *pattern; + u8 mask_hw[RTW_MAX_PATTERN_MASK_SIZE] = {0}; + u8 content[RTW_MAX_PATTERN_SIZE] = {0}; + u8 mac_addr[ETH_ALEN] = {0}; + u8 mask_len; + u16 count; + int len; + int i; + + pattern = pkt_pattern->pattern; + len = pkt_pattern->pattern_len; + mask = pkt_pattern->mask; + + ether_addr_copy(mac_addr, rtwvif->mac_addr); + memset(rtw_pattern, 0, sizeof(*rtw_pattern)); + + mask_len = DIV_ROUND_UP(len, 8); + + if (is_broadcast_ether_addr(pattern)) + rtw_pattern->type = RTW_PATTERN_BROADCAST; + else if (is_multicast_ether_addr(pattern)) + rtw_pattern->type = RTW_PATTERN_MULTICAST; + else if (ether_addr_equal(pattern, mac_addr)) + rtw_pattern->type = RTW_PATTERN_UNICAST; + else + rtw_pattern->type = RTW_PATTERN_INVALID; + + /* translate mask from os to mask for hw + * pattern from OS uses 'ethenet frame', like this: + * | 6 | 6 | 2 | 20 | Variable | 4 | + * |--------+--------+------+-----------+------------+-----| + * | 802.3 Mac Header | IP Header | TCP Packet | FCS | + * | DA | SA | Type | + * + * BUT, packet catched by our HW is in '802.11 frame', begin from LLC + * | 24 or 30 | 6 | 2 | 20 | Variable | 4 | + * |-------------------+--------+------+-----------+------------+-----| + * | 802.11 MAC Header | LLC | IP Header | TCP Packet | FCS | + * | Others | Tpye | + * + * Therefore, we need translate mask_from_OS to mask_to_hw. + * We should left-shift mask by 6 bits, then set the new bit[0~5] = 0, + * because new mask[0~5] means 'SA', but our HW packet begins from LLC, + * bit[0~5] corresponds to first 6 Bytes in LLC, they just don't match. + */ + + /* Shift 6 bits */ + for (i = 0; i < mask_len - 1; i++) { + mask_hw[i] = u8_get_bits(mask[i], GENMASK(7, 6)); + mask_hw[i] |= u8_get_bits(mask[i + 1], GENMASK(5, 0)) << 2; + } + mask_hw[i] = u8_get_bits(mask[i], GENMASK(7, 6)); + + /* Set bit 0-5 to zero */ + mask_hw[0] &= (~GENMASK(5, 0)); + + memcpy(rtw_pattern->mask, mask_hw, RTW_MAX_PATTERN_MASK_SIZE); + + /* To get the wake up pattern from the mask. + * We do not count first 12 bits which means + * DA[6] and SA[6] in the pattern to match HW design. + */ + count = 0; + for (i = 12; i < len; i++) { + if ((mask[i / 8] >> (i % 8)) & 0x01) { + content[count] = pattern[i]; + count++; + } + } + + rtw_pattern->crc = rtw_calc_crc(content, count); +} + +static void rtw_wow_pattern_clear_cam(struct rtw_dev *rtwdev) +{ + bool ret; + + rtw_write32(rtwdev, REG_WKFMCAM_CMD, BIT_WKFCAM_POLLING_V1 | + BIT_WKFCAM_CLR_V1); + + ret = check_hw_ready(rtwdev, REG_WKFMCAM_CMD, BIT_WKFCAM_POLLING_V1, 0); + if (!ret) + rtw_err(rtwdev, "failed to clean pattern cam\n"); +} + +static void rtw_wow_pattern_write(struct rtw_dev *rtwdev) +{ + struct rtw_wow_param *rtw_wow = &rtwdev->wow; + struct rtw_wow_pattern *rtw_pattern = rtw_wow->patterns; + int i = 0; + + for (i = 0; i < rtw_wow->pattern_cnt; i++) + rtw_wow_pattern_write_cam_ent(rtwdev, i, rtw_pattern + i); +} + +static void rtw_wow_pattern_clear(struct rtw_dev *rtwdev) +{ + struct rtw_wow_param *rtw_wow = &rtwdev->wow; + + rtw_wow_pattern_clear_cam(rtwdev); + + rtw_wow->pattern_cnt = 0; + memset(rtw_wow->patterns, 0, sizeof(rtw_wow->patterns)); +} + +static void rtw_wow_bb_stop(struct rtw_dev *rtwdev) +{ + struct rtw_wow_param *rtw_wow = &rtwdev->wow; + + /* wait 100ms for firmware to finish TX */ + msleep(100); + + if (!rtw_read32_mask(rtwdev, REG_BCNQ_INFO, BIT_MGQ_CPU_EMPTY)) + rtw_warn(rtwdev, "Wrong status of MGQ_CPU empty!\n"); + + rtw_wow->txpause = rtw_read8(rtwdev, REG_TXPAUSE); + rtw_write8(rtwdev, REG_TXPAUSE, 0xff); + rtw_write8_clr(rtwdev, REG_SYS_FUNC_EN, BIT_FEN_BB_RSTB); +} + +static void rtw_wow_bb_start(struct rtw_dev *rtwdev) +{ + struct rtw_wow_param *rtw_wow = &rtwdev->wow; + + rtw_write8_set(rtwdev, REG_SYS_FUNC_EN, BIT_FEN_BB_RSTB); + rtw_write8(rtwdev, REG_TXPAUSE, rtw_wow->txpause); +} + +static void rtw_wow_rx_dma_stop(struct rtw_dev *rtwdev) +{ + /* wait 100ms for HW to finish rx dma */ + msleep(100); + + rtw_write32_set(rtwdev, REG_RXPKT_NUM, BIT_RW_RELEASE); + + if (!check_hw_ready(rtwdev, REG_RXPKT_NUM, BIT_RXDMA_IDLE, 1)) + rtw_err(rtwdev, "failed to stop rx dma\n"); +} + +static void rtw_wow_rx_dma_start(struct rtw_dev *rtwdev) +{ + rtw_write32_clr(rtwdev, REG_RXPKT_NUM, BIT_RW_RELEASE); +} + +static bool rtw_wow_check_fw_status(struct rtw_dev *rtwdev, bool wow_enable) +{ + bool ret; + + /* wait 100ms for wow firmware to finish work */ + msleep(100); + + if (wow_enable) { + if (!rtw_read8(rtwdev, REG_WOWLAN_WAKE_REASON)) + ret = 0; + } else { + if (rtw_read32_mask(rtwdev, REG_FE1IMR, BIT_FS_RXDONE) == 0 && + rtw_read32_mask(rtwdev, REG_RXPKT_NUM, BIT_RW_RELEASE) == 0) + ret = 0; + } + + if (ret) + rtw_err(rtwdev, "failed to check wow status %s\n", + wow_enable ? "enabled" : "disabled"); + + return ret; +} + +static void rtw_wow_fw_security_type_iter(struct ieee80211_hw *hw, + struct ieee80211_vif *vif, + struct ieee80211_sta *sta, + struct ieee80211_key_conf *key, + void *data) +{ + struct rtw_fw_key_type_iter_data *iter_data = data; + struct rtw_dev *rtwdev = hw->priv; + u8 hw_key_type; + + if (vif != rtwdev->wow.wow_vif) + return; + + switch (key->cipher) { + case WLAN_CIPHER_SUITE_WEP40: + hw_key_type = RTW_CAM_WEP40; + break; + case WLAN_CIPHER_SUITE_WEP104: + hw_key_type = RTW_CAM_WEP104; + break; + case WLAN_CIPHER_SUITE_TKIP: + hw_key_type = RTW_CAM_TKIP; + key->flags |= IEEE80211_KEY_FLAG_GENERATE_MMIC; + break; + case WLAN_CIPHER_SUITE_CCMP: + hw_key_type = RTW_CAM_AES; + key->flags |= IEEE80211_KEY_FLAG_SW_MGMT_TX; + break; + default: + rtw_err(rtwdev, "Unsupported key type for wowlan mode\n"); + hw_key_type = 0; + break; + } + + if (sta) + iter_data->pairwise_key_type = hw_key_type; + else + iter_data->group_key_type = hw_key_type; +} + +static void rtw_wow_fw_security_type(struct rtw_dev *rtwdev) +{ + struct rtw_fw_key_type_iter_data data = {}; + struct ieee80211_vif *wow_vif = rtwdev->wow.wow_vif; + + data.rtwdev = rtwdev; + rtw_iterate_keys(rtwdev, wow_vif, + rtw_wow_fw_security_type_iter, &data); + rtw_fw_set_aoac_global_info_cmd(rtwdev, data.pairwise_key_type, + data.group_key_type); +} + +static int rtw_wow_fw_start(struct rtw_dev *rtwdev) +{ + if (rtw_wow_mgd_linked(rtwdev)) { + rtw_send_rsvd_page_h2c(rtwdev); + rtw_wow_pattern_write(rtwdev); + rtw_wow_fw_security_type(rtwdev); + rtw_fw_set_disconnect_decision_cmd(rtwdev, true); + rtw_fw_set_keep_alive_cmd(rtwdev, true); + } else if (rtw_wow_no_link(rtwdev)) { + rtw_fw_set_nlo_info(rtwdev, true); + rtw_fw_update_pkt_probe_req(rtwdev, NULL); + rtw_fw_channel_switch(rtwdev, true); + } + + rtw_fw_set_wowlan_ctrl_cmd(rtwdev, true); + rtw_fw_set_remote_wake_ctrl_cmd(rtwdev, true); + + return rtw_wow_check_fw_status(rtwdev, true); +} + +static int rtw_wow_fw_stop(struct rtw_dev *rtwdev) +{ + if (rtw_wow_mgd_linked(rtwdev)) { + rtw_fw_set_disconnect_decision_cmd(rtwdev, false); + rtw_fw_set_keep_alive_cmd(rtwdev, false); + rtw_wow_pattern_clear(rtwdev); + } else if (rtw_wow_no_link(rtwdev)) { + rtw_fw_channel_switch(rtwdev, false); + rtw_fw_set_nlo_info(rtwdev, false); + } + + rtw_fw_set_wowlan_ctrl_cmd(rtwdev, false); + rtw_fw_set_remote_wake_ctrl_cmd(rtwdev, false); + + return rtw_wow_check_fw_status(rtwdev, false); +} + +static void rtw_wow_avoid_reset_mac(struct rtw_dev *rtwdev) +{ + /* When resuming from wowlan mode, some hosts issue signal + * (PCIE: PREST, USB: SE0RST) to device, and lead to reset + * mac core. If it happens, the connection to AP will be lost. + * Setting REG_RSV_CTRL Register can avoid this process. + */ + switch (rtw_hci_type(rtwdev)) { + case RTW_HCI_TYPE_PCIE: + case RTW_HCI_TYPE_USB: + rtw_write8(rtwdev, REG_RSV_CTRL, BIT_WLOCK_1C_B6); + rtw_write8(rtwdev, REG_RSV_CTRL, + BIT_WLOCK_1C_B6 | BIT_R_DIS_PRST); + break; + default: + rtw_warn(rtwdev, "Unsupported hci type to disable reset MAC\n"); + break; + } +} + +static void rtw_wow_fw_media_status_iter(void *data, struct ieee80211_sta *sta) +{ + struct rtw_sta_info *si = (struct rtw_sta_info *)sta->drv_priv; + struct rtw_fw_media_status_iter_data *iter_data = data; + struct rtw_dev *rtwdev = iter_data->rtwdev; + + rtw_fw_media_status_report(rtwdev, si->mac_id, iter_data->connect); +} + +static void rtw_wow_fw_media_status(struct rtw_dev *rtwdev, bool connect) +{ + struct rtw_fw_media_status_iter_data data; + + data.rtwdev = rtwdev; + data.connect = connect; + + rtw_iterate_stas_atomic(rtwdev, rtw_wow_fw_media_status_iter, &data); +} + +static void rtw_wow_config_pno_rsvd_page(struct rtw_dev *rtwdev) +{ + struct rtw_wow_param *rtw_wow = &rtwdev->wow; + struct rtw_pno_request *rtw_pno_req = &rtw_wow->pno_req; + struct cfg80211_ssid *ssid; + int i; + + for (i = 0 ; i < rtw_pno_req->match_set_cnt; i++) { + ssid = &rtw_pno_req->match_sets[i].ssid; + rtw_add_rsvd_page_probe_req(rtwdev, ssid); + } + rtw_add_rsvd_page_probe_req(rtwdev, NULL); + rtw_add_rsvd_page(rtwdev, RSVD_NLO_INFO, false); + rtw_add_rsvd_page(rtwdev, RSVD_CH_INFO, true); +} + +static void rtw_wow_config_linked_rsvd_page(struct rtw_dev *rtwdev) +{ + rtw_add_rsvd_page(rtwdev, RSVD_PS_POLL, true); + rtw_add_rsvd_page(rtwdev, RSVD_QOS_NULL, true); + rtw_add_rsvd_page(rtwdev, RSVD_NULL, true); + rtw_add_rsvd_page(rtwdev, RSVD_LPS_PG_DPK, true); + rtw_add_rsvd_page(rtwdev, RSVD_LPS_PG_INFO, true); +} + +static void rtw_wow_config_rsvd_page(struct rtw_dev *rtwdev) +{ + rtw_reset_rsvd_page(rtwdev); + + if (rtw_wow_mgd_linked(rtwdev)) { + rtw_wow_config_linked_rsvd_page(rtwdev); + } else if (test_bit(RTW_FLAG_WOWLAN, rtwdev->flags) && + rtw_wow_no_link(rtwdev)) { + rtw_wow_config_pno_rsvd_page(rtwdev); + } +} + +static int rtw_wow_dl_fw_rsvd_page(struct rtw_dev *rtwdev) +{ + struct ieee80211_vif *wow_vif = rtwdev->wow.wow_vif; + + rtw_wow_config_rsvd_page(rtwdev); + + return rtw_fw_download_rsvd_page(rtwdev, wow_vif); +} + +static int rtw_wow_swap_fw(struct rtw_dev *rtwdev, enum rtw_fw_type type) +{ + struct rtw_fw_state *fw; + int ret; + + switch (type) { + case RTW_WOWLAN_FW: + fw = &rtwdev->wow_fw; + break; + + case RTW_NORMAL_FW: + fw = &rtwdev->fw; + break; + + default: + rtw_warn(rtwdev, "unsupported firmware type to swap\n"); + return -ENOENT; + } + + ret = rtw_download_firmware(rtwdev, fw); + if (ret) + goto out; + + rtw_fw_send_general_info(rtwdev); + rtw_fw_send_phydm_info(rtwdev); + rtw_wow_fw_media_status(rtwdev, true); + +out: + return ret; +} + +static void rtw_wow_check_pno(struct rtw_dev *rtwdev, + struct cfg80211_sched_scan_request *nd_config) +{ + struct rtw_wow_param *rtw_wow = &rtwdev->wow; + struct rtw_pno_request *pno_req = &rtw_wow->pno_req; + struct ieee80211_channel *channel; + int i, size; + + if (!nd_config->n_match_sets || !nd_config->n_channels) + goto err; + + pno_req->match_set_cnt = nd_config->n_match_sets; + size = sizeof(*pno_req->match_sets) * pno_req->match_set_cnt; + pno_req->match_sets = kmemdup(nd_config->match_sets, size, GFP_KERNEL); + if (!pno_req->match_sets) + goto err; + + pno_req->channel_cnt = nd_config->n_channels; + size = sizeof(*nd_config->channels[0]) * nd_config->n_channels; + pno_req->channels = kmalloc(size, GFP_KERNEL); + if (!pno_req->channels) + goto channel_err; + + for (i = 0 ; i < pno_req->channel_cnt; i++) { + channel = pno_req->channels + i; + memcpy(channel, nd_config->channels[i], sizeof(*channel)); + } + + pno_req->scan_plan = *nd_config->scan_plans; + pno_req->inited = true; + + rtw_dbg(rtwdev, RTW_DBG_WOW, "WOW: net-detect is enabled\n"); + + return; + +channel_err: + kfree(pno_req->match_sets); + +err: + rtw_dbg(rtwdev, RTW_DBG_WOW, "WOW: net-detect is disabled\n"); +} + +static int rtw_wow_leave_linked_ps(struct rtw_dev *rtwdev) +{ + if (!test_bit(RTW_FLAG_WOWLAN, rtwdev->flags)) + cancel_delayed_work_sync(&rtwdev->watch_dog_work); + + rtw_leave_lps(rtwdev); + + return 0; +} + +static int rtw_wow_leave_no_link_ps(struct rtw_dev *rtwdev) +{ + struct rtw_wow_param *rtw_wow = &rtwdev->wow; + int ret = 0; + + if (test_bit(RTW_FLAG_WOWLAN, rtwdev->flags)) { + if (rtw_fw_lps_deep_mode) + rtw_leave_lps_deep(rtwdev); + } else { + if (test_bit(RTW_FLAG_INACTIVE_PS, rtwdev->flags)) { + rtw_wow->ips_enabled = true; + ret = rtw_leave_ips(rtwdev); + if (ret) + return ret; + } + } + + return 0; +} + +static int rtw_wow_leave_ps(struct rtw_dev *rtwdev) +{ + int ret = 0; + + if (rtw_wow_mgd_linked(rtwdev)) + ret = rtw_wow_leave_linked_ps(rtwdev); + else if (rtw_wow_no_link(rtwdev)) + ret = rtw_wow_leave_no_link_ps(rtwdev); + + return ret; +} + +static int rtw_wow_restore_ps(struct rtw_dev *rtwdev) +{ + int ret = 0; + + if (rtw_wow_no_link(rtwdev) && rtwdev->wow.ips_enabled) + ret = rtw_enter_ips(rtwdev); + + return ret; +} + +static int rtw_wow_enter_linked_ps(struct rtw_dev *rtwdev) +{ + struct rtw_wow_param *rtw_wow = &rtwdev->wow; + struct ieee80211_vif *wow_vif = rtw_wow->wow_vif; + struct rtw_vif *rtwvif = (struct rtw_vif *)wow_vif->drv_priv; + + rtw_enter_lps(rtwdev, rtwvif->port); + + return 0; +} + +static int rtw_wow_enter_no_link_ps(struct rtw_dev *rtwdev) +{ + /* firmware enters deep ps by itself if supported */ + set_bit(RTW_FLAG_LEISURE_PS_DEEP, rtwdev->flags); + + return 0; +} + +static int rtw_wow_enter_ps(struct rtw_dev *rtwdev) +{ + int ret = 0; + + if (rtw_wow_mgd_linked(rtwdev)) + ret = rtw_wow_enter_linked_ps(rtwdev); + else if (rtw_wow_no_link(rtwdev) && rtw_fw_lps_deep_mode) + ret = rtw_wow_enter_no_link_ps(rtwdev); + + return ret; +} + +static void rtw_wow_stop_trx(struct rtw_dev *rtwdev) +{ + rtw_wow_bb_stop(rtwdev); + rtw_wow_rx_dma_stop(rtwdev); +} + +static int rtw_wow_start(struct rtw_dev *rtwdev) +{ + int ret; + + ret = rtw_wow_fw_start(rtwdev); + if (ret) + goto out; + + rtw_hci_stop(rtwdev); + rtw_wow_bb_start(rtwdev); + rtw_wow_avoid_reset_mac(rtwdev); + +out: + return ret; +} + +static int rtw_wow_enable(struct rtw_dev *rtwdev) +{ + int ret = 0; + + rtw_wow_stop_trx(rtwdev); + + ret = rtw_wow_swap_fw(rtwdev, RTW_WOWLAN_FW); + if (ret) { + rtw_err(rtwdev, "failed to swap wow fw\n"); + goto error; + } + + set_bit(RTW_FLAG_WOWLAN, rtwdev->flags); + + ret = rtw_wow_dl_fw_rsvd_page(rtwdev); + if (ret) { + rtw_err(rtwdev, "failed to download wowlan rsvd page\n"); + goto error; + } + + ret = rtw_wow_start(rtwdev); + if (ret) { + rtw_err(rtwdev, "failed to start wow\n"); + goto error; + } + + return ret; + +error: + clear_bit(RTW_FLAG_WOWLAN, rtwdev->flags); + return ret; +} + +static int rtw_wow_stop(struct rtw_dev *rtwdev) +{ + int ret; + + /* some HCI related registers will be reset after resume, + * need to set them again. + */ + ret = rtw_hci_setup(rtwdev); + if (ret) { + rtw_err(rtwdev, "failed to setup hci\n"); + return ret; + } + + ret = rtw_hci_start(rtwdev); + if (ret) { + rtw_err(rtwdev, "failed to start hci\n"); + return ret; + } + + ret = rtw_wow_fw_stop(rtwdev); + if (ret) + rtw_err(rtwdev, "failed to stop wowlan fw\n"); + + rtw_wow_bb_stop(rtwdev); + + return ret; +} + +static void rtw_wow_resume_trx(struct rtw_dev *rtwdev) +{ + rtw_wow_rx_dma_start(rtwdev); + rtw_wow_bb_start(rtwdev); + ieee80211_queue_delayed_work(rtwdev->hw, &rtwdev->watch_dog_work, + RTW_WATCH_DOG_DELAY_TIME); +} + +static int rtw_wow_disable(struct rtw_dev *rtwdev) +{ + int ret; + + clear_bit(RTW_FLAG_WOWLAN, rtwdev->flags); + + ret = rtw_wow_stop(rtwdev); + if (ret) { + rtw_err(rtwdev, "failed to stop wow\n"); + goto out; + } + + ret = rtw_wow_swap_fw(rtwdev, RTW_NORMAL_FW); + if (ret) { + rtw_err(rtwdev, "failed to swap normal fw\n"); + goto out; + } + + ret = rtw_wow_dl_fw_rsvd_page(rtwdev); + if (ret) + rtw_err(rtwdev, "failed to download normal rsvd page\n"); + +out: + rtw_wow_resume_trx(rtwdev); + return ret; +} + +static void rtw_wow_vif_iter(void *data, u8 *mac, struct ieee80211_vif *vif) +{ + struct rtw_dev *rtwdev = data; + struct rtw_vif *rtwvif = (struct rtw_vif *)vif->drv_priv; + struct rtw_wow_param *rtw_wow = &rtwdev->wow; + + /* Current wowlan function support setting of only one STATION vif. + * So when one suitable vif is found, stop the iteration. + */ + if (rtw_wow->wow_vif || vif->type != NL80211_IFTYPE_STATION) + return; + + switch (rtwvif->net_type) { + case RTW_NET_MGD_LINKED: + rtw_wow->wow_vif = vif; + break; + case RTW_NET_NO_LINK: + if (rtw_wow->pno_req.inited) + rtwdev->wow.wow_vif = vif; + break; + default: + break; + } +} + +static int rtw_wow_set_wakeups(struct rtw_dev *rtwdev, + struct cfg80211_wowlan *wowlan) +{ + struct rtw_wow_param *rtw_wow = &rtwdev->wow; + struct rtw_wow_pattern *rtw_patterns = rtw_wow->patterns; + struct rtw_vif *rtwvif; + int i; + + if (wowlan->disconnect) + set_bit(RTW_WOW_FLAG_EN_DISCONNECT, rtw_wow->flags); + if (wowlan->magic_pkt) + set_bit(RTW_WOW_FLAG_EN_MAGIC_PKT, rtw_wow->flags); + if (wowlan->gtk_rekey_failure) + set_bit(RTW_WOW_FLAG_EN_REKEY_PKT, rtw_wow->flags); + + if (wowlan->nd_config) + rtw_wow_check_pno(rtwdev, wowlan->nd_config); + + rtw_iterate_vifs_atomic(rtwdev, rtw_wow_vif_iter, rtwdev); + if (!rtw_wow->wow_vif) + return -EPERM; + + rtwvif = (struct rtw_vif *)rtw_wow->wow_vif->drv_priv; + if (wowlan->n_patterns && wowlan->patterns) { + rtw_wow->pattern_cnt = wowlan->n_patterns; + for (i = 0; i < wowlan->n_patterns; i++) + rtw_wow_pattern_generate(rtwdev, rtwvif, + wowlan->patterns + i, + rtw_patterns + i); + } + + return 0; +} + +static void rtw_wow_clear_wakeups(struct rtw_dev *rtwdev) +{ + struct rtw_wow_param *rtw_wow = &rtwdev->wow; + struct rtw_pno_request *pno_req = &rtw_wow->pno_req; + + if (pno_req->inited) { + kfree(pno_req->channels); + kfree(pno_req->match_sets); + } + + memset(rtw_wow, 0, sizeof(rtwdev->wow)); +} + +int rtw_wow_suspend(struct rtw_dev *rtwdev, struct cfg80211_wowlan *wowlan) +{ + int ret = 0; + + ret = rtw_wow_set_wakeups(rtwdev, wowlan); + if (ret) { + rtw_err(rtwdev, "failed to set wakeup event\n"); + goto out; + } + + ret = rtw_wow_leave_ps(rtwdev); + if (ret) { + rtw_err(rtwdev, "failed to leave ps from normal mode\n"); + goto out; + } + + ret = rtw_wow_enable(rtwdev); + if (ret) { + rtw_err(rtwdev, "failed to enable wow\n"); + rtw_wow_restore_ps(rtwdev); + goto out; + } + + ret = rtw_wow_enter_ps(rtwdev); + if (ret) + rtw_err(rtwdev, "failed to enter ps for wow\n"); + +out: + return ret; +} + +int rtw_wow_resume(struct rtw_dev *rtwdev) +{ + int ret; + + /* If wowlan mode is not enabled, do nothing */ + if (!test_bit(RTW_FLAG_WOWLAN, rtwdev->flags)) { + rtw_err(rtwdev, "wow is not enabled\n"); + ret = -EPERM; + goto out; + } + + ret = rtw_wow_leave_ps(rtwdev); + if (ret) { + rtw_err(rtwdev, "failed to leave ps from wowlan mode\n"); + goto out; + } + + rtw_wow_show_wakeup_reason(rtwdev); + + ret = rtw_wow_disable(rtwdev); + if (ret) { + rtw_err(rtwdev, "failed to disable wow\n"); + goto out; + } + + ret = rtw_wow_restore_ps(rtwdev); + if (ret) + rtw_err(rtwdev, "failed to restore ps to normal mode\n"); + +out: + rtw_wow_clear_wakeups(rtwdev); + return ret; +} diff --git a/drivers/net/wireless/realtek/rtw88/wow.h b/drivers/net/wireless/realtek/rtw88/wow.h new file mode 100644 index 000000000000..289368a2cba4 --- /dev/null +++ b/drivers/net/wireless/realtek/rtw88/wow.h @@ -0,0 +1,58 @@ +/* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */ +/* Copyright(c) 2018-2019 Realtek Corporation + */ + +#ifndef __RTW_WOW_H__ +#define __RTW_WOW_H__ + +#define PNO_CHECK_BYTE 4 + +enum rtw_wow_pattern_type { + RTW_PATTERN_BROADCAST = 0, + RTW_PATTERN_MULTICAST, + RTW_PATTERN_UNICAST, + RTW_PATTERN_VALID, + RTW_PATTERN_INVALID, +}; + +enum rtw_wake_reason { + RTW_WOW_RSN_RX_PTK_REKEY = 0x1, + RTW_WOW_RSN_RX_GTK_REKEY = 0x2, + RTW_WOW_RSN_RX_DEAUTH = 0x8, + RTW_WOW_RSN_DISCONNECT = 0x10, + RTW_WOW_RSN_RX_MAGIC_PKT = 0x21, + RTW_WOW_RSN_RX_PATTERN_MATCH = 0x23, + RTW_WOW_RSN_RX_NLO = 0x55, +}; + +struct rtw_fw_media_status_iter_data { + struct rtw_dev *rtwdev; + u8 connect; +}; + +struct rtw_fw_key_type_iter_data { + struct rtw_dev *rtwdev; + u8 group_key_type; + u8 pairwise_key_type; +}; + +static inline bool rtw_wow_mgd_linked(struct rtw_dev *rtwdev) +{ + struct ieee80211_vif *wow_vif = rtwdev->wow.wow_vif; + struct rtw_vif *rtwvif = (struct rtw_vif *)wow_vif->drv_priv; + + return (rtwvif->net_type == RTW_NET_MGD_LINKED); +} + +static inline bool rtw_wow_no_link(struct rtw_dev *rtwdev) +{ + struct ieee80211_vif *wow_vif = rtwdev->wow.wow_vif; + struct rtw_vif *rtwvif = (struct rtw_vif *)wow_vif->drv_priv; + + return (rtwvif->net_type == RTW_NET_NO_LINK); +} + +int rtw_wow_suspend(struct rtw_dev *rtwdev, struct cfg80211_wowlan *wowlan); +int rtw_wow_resume(struct rtw_dev *rtwdev); + +#endif diff --git a/drivers/net/wireless/st/cw1200/txrx.c b/drivers/net/wireless/st/cw1200/txrx.c index 2dfcdb145944..400dd585916b 100644 --- a/drivers/net/wireless/st/cw1200/txrx.c +++ b/drivers/net/wireless/st/cw1200/txrx.c @@ -715,7 +715,7 @@ void cw1200_tx(struct ieee80211_hw *dev, }; struct ieee80211_sta *sta; struct wsm_tx *wsm; - bool tid_update = 0; + bool tid_update = false; u8 flags = 0; int ret; diff --git a/drivers/net/wireless/ti/wlcore/cmd.c b/drivers/net/wireless/ti/wlcore/cmd.c index 2a48fc6ad56c..6ef8fc9ae627 100644 --- a/drivers/net/wireless/ti/wlcore/cmd.c +++ b/drivers/net/wireless/ti/wlcore/cmd.c @@ -1429,7 +1429,7 @@ out: int wl1271_cmd_set_ap_key(struct wl1271 *wl, struct wl12xx_vif *wlvif, u16 action, u8 id, u8 key_type, u8 key_size, const u8 *key, u8 hlid, u32 tx_seq_32, - u16 tx_seq_16) + u16 tx_seq_16, bool is_pairwise) { struct wl1271_cmd_set_keys *cmd; int ret = 0; @@ -1444,8 +1444,10 @@ int wl1271_cmd_set_ap_key(struct wl1271 *wl, struct wl12xx_vif *wlvif, lid_type = WEP_DEFAULT_LID_TYPE; else lid_type = BROADCAST_LID_TYPE; - } else { + } else if (is_pairwise) { lid_type = UNICAST_LID_TYPE; + } else { + lid_type = BROADCAST_LID_TYPE; } wl1271_debug(DEBUG_CRYPT, "ap key action: %d id: %d lid: %d type: %d" diff --git a/drivers/net/wireless/ti/wlcore/cmd.h b/drivers/net/wireless/ti/wlcore/cmd.h index 084375ba4abf..bfad7b5a1ac6 100644 --- a/drivers/net/wireless/ti/wlcore/cmd.h +++ b/drivers/net/wireless/ti/wlcore/cmd.h @@ -65,7 +65,7 @@ int wl1271_cmd_set_sta_key(struct wl1271 *wl, struct wl12xx_vif *wlvif, int wl1271_cmd_set_ap_key(struct wl1271 *wl, struct wl12xx_vif *wlvif, u16 action, u8 id, u8 key_type, u8 key_size, const u8 *key, u8 hlid, u32 tx_seq_32, - u16 tx_seq_16); + u16 tx_seq_16, bool is_pairwise); int wl12xx_cmd_set_peer_state(struct wl1271 *wl, struct wl12xx_vif *wlvif, u8 hlid); int wl12xx_roc(struct wl1271 *wl, struct wl12xx_vif *wlvif, u8 role_id, diff --git a/drivers/net/wireless/ti/wlcore/main.c b/drivers/net/wireless/ti/wlcore/main.c index e994995d79ab..ed049c9f7e29 100644 --- a/drivers/net/wireless/ti/wlcore/main.c +++ b/drivers/net/wireless/ti/wlcore/main.c @@ -3273,7 +3273,7 @@ out: static int wl1271_record_ap_key(struct wl1271 *wl, struct wl12xx_vif *wlvif, u8 id, u8 key_type, u8 key_size, const u8 *key, u8 hlid, u32 tx_seq_32, - u16 tx_seq_16) + u16 tx_seq_16, bool is_pairwise) { struct wl1271_ap_key *ap_key; int i; @@ -3311,6 +3311,7 @@ static int wl1271_record_ap_key(struct wl1271 *wl, struct wl12xx_vif *wlvif, ap_key->hlid = hlid; ap_key->tx_seq_32 = tx_seq_32; ap_key->tx_seq_16 = tx_seq_16; + ap_key->is_pairwise = is_pairwise; wlvif->ap.recorded_keys[i] = ap_key; return 0; @@ -3346,7 +3347,7 @@ static int wl1271_ap_init_hwenc(struct wl1271 *wl, struct wl12xx_vif *wlvif) key->id, key->key_type, key->key_size, key->key, hlid, key->tx_seq_32, - key->tx_seq_16); + key->tx_seq_16, key->is_pairwise); if (ret < 0) goto out; @@ -3369,7 +3370,8 @@ out: static int wl1271_set_key(struct wl1271 *wl, struct wl12xx_vif *wlvif, u16 action, u8 id, u8 key_type, u8 key_size, const u8 *key, u32 tx_seq_32, - u16 tx_seq_16, struct ieee80211_sta *sta) + u16 tx_seq_16, struct ieee80211_sta *sta, + bool is_pairwise) { int ret; bool is_ap = (wlvif->bss_type == BSS_TYPE_AP_BSS); @@ -3396,12 +3398,12 @@ static int wl1271_set_key(struct wl1271 *wl, struct wl12xx_vif *wlvif, ret = wl1271_record_ap_key(wl, wlvif, id, key_type, key_size, key, hlid, tx_seq_32, - tx_seq_16); + tx_seq_16, is_pairwise); } else { ret = wl1271_cmd_set_ap_key(wl, wlvif, action, id, key_type, key_size, key, hlid, tx_seq_32, - tx_seq_16); + tx_seq_16, is_pairwise); } if (ret < 0) @@ -3501,6 +3503,7 @@ int wlcore_set_key(struct wl1271 *wl, enum set_key_cmd cmd, u16 tx_seq_16 = 0; u8 key_type; u8 hlid; + bool is_pairwise; wl1271_debug(DEBUG_MAC80211, "mac80211 set key"); @@ -3550,12 +3553,14 @@ int wlcore_set_key(struct wl1271 *wl, enum set_key_cmd cmd, return -EOPNOTSUPP; } + is_pairwise = key_conf->flags & IEEE80211_KEY_FLAG_PAIRWISE; + switch (cmd) { case SET_KEY: ret = wl1271_set_key(wl, wlvif, KEY_ADD_OR_REPLACE, key_conf->keyidx, key_type, key_conf->keylen, key_conf->key, - tx_seq_32, tx_seq_16, sta); + tx_seq_32, tx_seq_16, sta, is_pairwise); if (ret < 0) { wl1271_error("Could not add or replace key"); return ret; @@ -3581,7 +3586,7 @@ int wlcore_set_key(struct wl1271 *wl, enum set_key_cmd cmd, ret = wl1271_set_key(wl, wlvif, KEY_REMOVE, key_conf->keyidx, key_type, key_conf->keylen, key_conf->key, - 0, 0, sta); + 0, 0, sta, is_pairwise); if (ret < 0) { wl1271_error("Could not remove key"); return ret; @@ -6223,6 +6228,7 @@ static int wl1271_init_ieee80211(struct wl1271 *wl) ieee80211_hw_set(wl->hw, SUPPORT_FAST_XMIT); ieee80211_hw_set(wl->hw, CHANCTX_STA_CSA); + ieee80211_hw_set(wl->hw, SUPPORTS_PER_STA_GTK); ieee80211_hw_set(wl->hw, QUEUE_CONTROL); ieee80211_hw_set(wl->hw, TX_AMPDU_SETUP_IN_HW); ieee80211_hw_set(wl->hw, AMPDU_AGGREGATION); @@ -6267,7 +6273,8 @@ static int wl1271_init_ieee80211(struct wl1271 *wl) wl->hw->wiphy->flags |= WIPHY_FLAG_AP_UAPSD | WIPHY_FLAG_HAS_REMAIN_ON_CHANNEL | - WIPHY_FLAG_HAS_CHANNEL_SWITCH; + WIPHY_FLAG_HAS_CHANNEL_SWITCH | ++ WIPHY_FLAG_IBSS_RSN; wl->hw->wiphy->features |= NL80211_FEATURE_AP_SCAN; diff --git a/drivers/net/wireless/ti/wlcore/wlcore_i.h b/drivers/net/wireless/ti/wlcore/wlcore_i.h index 6fab60b0e367..eefae3f867b9 100644 --- a/drivers/net/wireless/ti/wlcore/wlcore_i.h +++ b/drivers/net/wireless/ti/wlcore/wlcore_i.h @@ -212,6 +212,7 @@ struct wl1271_ap_key { u8 hlid; u32 tx_seq_32; u16 tx_seq_16; + bool is_pairwise; }; enum wl12xx_flags { diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 4937a088d7d8..fbeb9f73ef28 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5074,18 +5074,25 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, 0x0422, quirk_no_ext_tags); #ifdef CONFIG_PCI_ATS /* - * Some devices have a broken ATS implementation causing IOMMU stalls. - * Don't use ATS for those devices. + * Some devices require additional driver setup to enable ATS. Don't use + * ATS for those devices as ATS will be enabled before the driver has had a + * chance to load and configure the device. */ -static void quirk_no_ats(struct pci_dev *pdev) +static void quirk_amd_harvest_no_ats(struct pci_dev *pdev) { - pci_info(pdev, "disabling ATS (broken on this device)\n"); + if (pdev->device == 0x7340 && pdev->revision != 0xc5) + return; + + pci_info(pdev, "disabling ATS\n"); pdev->ats_cap = 0; } /* AMD Stoney platform GPU */ -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x98e4, quirk_no_ats); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x6900, quirk_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x98e4, quirk_amd_harvest_no_ats); +/* AMD Iceland dGPU */ +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x6900, quirk_amd_harvest_no_ats); +/* AMD Navi14 dGPU */ +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7340, quirk_amd_harvest_no_ats); #endif /* CONFIG_PCI_ATS */ /* Freescale PCIe doesn't support MSI in RC mode */ diff --git a/drivers/pinctrl/intel/pinctrl-sunrisepoint.c b/drivers/pinctrl/intel/pinctrl-sunrisepoint.c index 44d7f50bbc82..d936e7aa74c4 100644 --- a/drivers/pinctrl/intel/pinctrl-sunrisepoint.c +++ b/drivers/pinctrl/intel/pinctrl-sunrisepoint.c @@ -49,6 +49,7 @@ .padown_offset = SPT_PAD_OWN, \ .padcfglock_offset = SPT_PADCFGLOCK, \ .hostown_offset = SPT_HOSTSW_OWN, \ + .is_offset = SPT_GPI_IS, \ .ie_offset = SPT_GPI_IE, \ .pin_base = (s), \ .npins = ((e) - (s) + 1), \ diff --git a/drivers/s390/net/qeth_core.h b/drivers/s390/net/qeth_core.h index 7c37e94fb28b..9575a627a1e1 100644 --- a/drivers/s390/net/qeth_core.h +++ b/drivers/s390/net/qeth_core.h @@ -558,7 +558,6 @@ enum qeth_channel_states { */ enum qeth_card_states { CARD_STATE_DOWN, - CARD_STATE_HARDSETUP, CARD_STATE_SOFTSETUP, }; @@ -608,6 +607,8 @@ struct qeth_cmd_buffer { long timeout; unsigned char *data; void (*finalize)(struct qeth_card *card, struct qeth_cmd_buffer *iob); + bool (*match)(struct qeth_cmd_buffer *iob, + struct qeth_cmd_buffer *reply); void (*callback)(struct qeth_card *card, struct qeth_cmd_buffer *iob, unsigned int data_length); int rc; @@ -618,6 +619,14 @@ static inline void qeth_get_cmd(struct qeth_cmd_buffer *iob) refcount_inc(&iob->ref_count); } +static inline struct qeth_ipa_cmd *__ipa_reply(struct qeth_cmd_buffer *iob) +{ + if (!IS_IPA(iob->data)) + return NULL; + + return (struct qeth_ipa_cmd *) PDU_ENCAPSULATION(iob->data); +} + static inline struct qeth_ipa_cmd *__ipa_cmd(struct qeth_cmd_buffer *iob) { return (struct qeth_ipa_cmd *)(iob->data + IPA_PDU_HEADER_SIZE); @@ -727,11 +736,10 @@ struct qeth_osn_info { struct qeth_discipline { const struct device_type *devtype; - int (*recover)(void *ptr); int (*setup) (struct ccwgroup_device *); void (*remove) (struct ccwgroup_device *); - int (*set_online) (struct ccwgroup_device *); - int (*set_offline) (struct ccwgroup_device *); + int (*set_online)(struct qeth_card *card); + void (*set_offline)(struct qeth_card *card); int (*do_ioctl)(struct net_device *dev, struct ifreq *rq, int cmd); int (*control_event_handler)(struct qeth_card *card, struct qeth_ipa_cmd *cmd); @@ -987,14 +995,11 @@ struct net_device *qeth_clone_netdev(struct net_device *orig); struct qeth_card *qeth_get_card_by_busid(char *bus_id); void qeth_set_allowed_threads(struct qeth_card *, unsigned long , int); int qeth_threads_running(struct qeth_card *, unsigned long); -int qeth_do_run_thread(struct qeth_card *, unsigned long); -void qeth_clear_thread_start_bit(struct qeth_card *, unsigned long); -void qeth_clear_thread_running_bit(struct qeth_card *, unsigned long); int qeth_core_hardsetup_card(struct qeth_card *card, bool *carrier_ok); int qeth_stop_channel(struct qeth_channel *channel); +int qeth_set_offline(struct qeth_card *card, bool resetting); void qeth_print_status_message(struct qeth_card *); -int qeth_init_qdio_queues(struct qeth_card *); int qeth_send_ipa_cmd(struct qeth_card *, struct qeth_cmd_buffer *, int (*reply_cb) (struct qeth_card *, struct qeth_reply *, unsigned long), @@ -1027,7 +1032,9 @@ void qeth_setadp_promisc_mode(struct qeth_card *card, bool enable); int qeth_setadpparms_change_macaddr(struct qeth_card *); void qeth_tx_timeout(struct net_device *, unsigned int txqueue); void qeth_prepare_ipa_cmd(struct qeth_card *card, struct qeth_cmd_buffer *iob, - u16 cmd_length); + u16 cmd_length, + bool (*match)(struct qeth_cmd_buffer *iob, + struct qeth_cmd_buffer *reply)); int qeth_query_switch_attributes(struct qeth_card *card, struct qeth_switch_info *sw_info); int qeth_query_card_info(struct qeth_card *card, diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c index c19cc3978e1f..9639938581f5 100644 --- a/drivers/s390/net/qeth_core_main.c +++ b/drivers/s390/net/qeth_core_main.c @@ -247,9 +247,6 @@ int qeth_realloc_buffer_pool(struct qeth_card *card, int bufcnt) { QETH_CARD_TEXT(card, 2, "realcbp"); - if (card->state != CARD_STATE_DOWN) - return -EPERM; - /* TODO: steel/add buffers from/to a running card's buffer pool (?) */ qeth_clear_working_pool_list(card); qeth_free_buffer_pool(card); @@ -748,8 +745,8 @@ static void qeth_issue_next_read_cb(struct qeth_card *card, goto out; } - if (IS_IPA(iob->data)) { - cmd = (struct qeth_ipa_cmd *) PDU_ENCAPSULATION(iob->data); + cmd = __ipa_reply(iob); + if (cmd) { cmd = qeth_check_ipa_data(card, cmd); if (!cmd) goto out; @@ -758,17 +755,12 @@ static void qeth_issue_next_read_cb(struct qeth_card *card, card->osn_info.assist_cb(card->dev, cmd); goto out; } - } else { - /* non-IPA commands should only flow during initialization */ - if (card->state != CARD_STATE_DOWN) - goto out; } /* match against pending cmd requests */ spin_lock_irqsave(&card->lock, flags); list_for_each_entry(tmp, &card->cmd_waiter_list, list) { - if (!IS_IPA(tmp->data) || - __ipa_cmd(tmp)->hdr.seqno == cmd->hdr.seqno) { + if (tmp->match && tmp->match(tmp, iob)) { request = tmp; /* take the object outside the lock */ qeth_get_cmd(request); @@ -823,7 +815,8 @@ static int qeth_set_thread_start_bit(struct qeth_card *card, return 0; } -void qeth_clear_thread_start_bit(struct qeth_card *card, unsigned long thread) +static void qeth_clear_thread_start_bit(struct qeth_card *card, + unsigned long thread) { unsigned long flags; @@ -832,9 +825,9 @@ void qeth_clear_thread_start_bit(struct qeth_card *card, unsigned long thread) spin_unlock_irqrestore(&card->thread_mask_lock, flags); wake_up(&card->wait_q); } -EXPORT_SYMBOL_GPL(qeth_clear_thread_start_bit); -void qeth_clear_thread_running_bit(struct qeth_card *card, unsigned long thread) +static void qeth_clear_thread_running_bit(struct qeth_card *card, + unsigned long thread) { unsigned long flags; @@ -843,7 +836,6 @@ void qeth_clear_thread_running_bit(struct qeth_card *card, unsigned long thread) spin_unlock_irqrestore(&card->thread_mask_lock, flags); wake_up_all(&card->wait_q); } -EXPORT_SYMBOL_GPL(qeth_clear_thread_running_bit); static int __qeth_do_run_thread(struct qeth_card *card, unsigned long thread) { @@ -864,7 +856,7 @@ static int __qeth_do_run_thread(struct qeth_card *card, unsigned long thread) return rc; } -int qeth_do_run_thread(struct qeth_card *card, unsigned long thread) +static int qeth_do_run_thread(struct qeth_card *card, unsigned long thread) { int rc = 0; @@ -872,7 +864,6 @@ int qeth_do_run_thread(struct qeth_card *card, unsigned long thread) (rc = __qeth_do_run_thread(card, thread)) >= 0); return rc; } -EXPORT_SYMBOL_GPL(qeth_do_run_thread); void qeth_schedule_recovery(struct qeth_card *card) { @@ -880,7 +871,6 @@ void qeth_schedule_recovery(struct qeth_card *card) if (qeth_set_thread_start_bit(card, QETH_RECOVER_THREAD) == 0) schedule_work(&card->kernel_thread_starter); } -EXPORT_SYMBOL_GPL(qeth_schedule_recovery); static int qeth_get_problem(struct qeth_card *card, struct ccw_device *cdev, struct irb *irb) @@ -1287,6 +1277,7 @@ static int qeth_do_start_thread(struct qeth_card *card, unsigned long thread) return rc; } +static int qeth_do_reset(void *data); static void qeth_start_kernel_thread(struct work_struct *work) { struct task_struct *ts; @@ -1298,8 +1289,7 @@ static void qeth_start_kernel_thread(struct work_struct *work) card->write.state != CH_STATE_UP) return; if (qeth_do_start_thread(card, QETH_RECOVER_THREAD)) { - ts = kthread_run(card->discipline->recover, (void *)card, - "qeth_recover"); + ts = kthread_run(qeth_do_reset, card, "qeth_recover"); if (IS_ERR(ts)) { qeth_clear_thread_start_bit(card, QETH_RECOVER_THREAD); qeth_clear_thread_running_bit(card, @@ -1690,6 +1680,13 @@ static void qeth_mpc_finalize_cmd(struct qeth_card *card, iob->callback = qeth_release_buffer_cb; } +static bool qeth_mpc_match_reply(struct qeth_cmd_buffer *iob, + struct qeth_cmd_buffer *reply) +{ + /* MPC cmds are issued strictly in sequence. */ + return !IS_IPA(reply->data); +} + static struct qeth_cmd_buffer *qeth_mpc_alloc_cmd(struct qeth_card *card, void *data, unsigned int data_length) @@ -1704,6 +1701,7 @@ static struct qeth_cmd_buffer *qeth_mpc_alloc_cmd(struct qeth_card *card, qeth_setup_ccw(__ccw_from_cmd(iob), CCW_CMD_WRITE, 0, data_length, iob->data); iob->finalize = qeth_mpc_finalize_cmd; + iob->match = qeth_mpc_match_reply; return iob; } @@ -2665,7 +2663,7 @@ static unsigned int qeth_tx_select_bulk_max(struct qeth_card *card, return card->ssqd.mmwc ? card->ssqd.mmwc : 1; } -int qeth_init_qdio_queues(struct qeth_card *card) +static int qeth_init_qdio_queues(struct qeth_card *card) { unsigned int i; int rc; @@ -2713,7 +2711,6 @@ int qeth_init_qdio_queues(struct qeth_card *card) } return 0; } -EXPORT_SYMBOL_GPL(qeth_init_qdio_queues); static void qeth_ipa_finalize_cmd(struct qeth_card *card, struct qeth_cmd_buffer *iob) @@ -2725,7 +2722,9 @@ static void qeth_ipa_finalize_cmd(struct qeth_card *card, } void qeth_prepare_ipa_cmd(struct qeth_card *card, struct qeth_cmd_buffer *iob, - u16 cmd_length) + u16 cmd_length, + bool (*match)(struct qeth_cmd_buffer *iob, + struct qeth_cmd_buffer *reply)) { u8 prot_type = qeth_mpc_select_prot_type(card); u16 total_length = iob->length; @@ -2733,6 +2732,7 @@ void qeth_prepare_ipa_cmd(struct qeth_card *card, struct qeth_cmd_buffer *iob, qeth_setup_ccw(__ccw_from_cmd(iob), CCW_CMD_WRITE, 0, total_length, iob->data); iob->finalize = qeth_ipa_finalize_cmd; + iob->match = match; memcpy(iob->data, IPA_PDU_HEADER, IPA_PDU_HEADER_SIZE); memcpy(QETH_IPA_PDU_LEN_TOTAL(iob->data), &total_length, 2); @@ -2745,6 +2745,14 @@ void qeth_prepare_ipa_cmd(struct qeth_card *card, struct qeth_cmd_buffer *iob, } EXPORT_SYMBOL_GPL(qeth_prepare_ipa_cmd); +static bool qeth_ipa_match_reply(struct qeth_cmd_buffer *iob, + struct qeth_cmd_buffer *reply) +{ + struct qeth_ipa_cmd *ipa_reply = __ipa_reply(reply); + + return ipa_reply && (__ipa_cmd(iob)->hdr.seqno == ipa_reply->hdr.seqno); +} + struct qeth_cmd_buffer *qeth_ipa_alloc_cmd(struct qeth_card *card, enum qeth_ipa_cmds cmd_code, enum qeth_prot_versions prot, @@ -2760,7 +2768,7 @@ struct qeth_cmd_buffer *qeth_ipa_alloc_cmd(struct qeth_card *card, if (!iob) return NULL; - qeth_prepare_ipa_cmd(card, iob, data_length); + qeth_prepare_ipa_cmd(card, iob, data_length, qeth_ipa_match_reply); hdr = &__ipa_cmd(iob)->hdr; hdr->command = cmd_code; @@ -3096,7 +3104,6 @@ int qeth_hw_trap(struct qeth_card *card, enum qeth_diags_trap_action action) } return qeth_send_ipa_cmd(card, iob, qeth_hw_trap_cb, NULL); } -EXPORT_SYMBOL_GPL(qeth_hw_trap); static int qeth_check_qdio_errors(struct qeth_card *card, struct qdio_buffer *buf, @@ -5026,6 +5033,12 @@ retriable: if (rc) goto out; + rc = qeth_init_qdio_queues(card); + if (rc) { + QETH_CARD_TEXT_(card, 2, "9err%d", rc); + goto out; + } + return 0; out: dev_warn(&card->gdev->dev, "The qeth device driver failed to recover " @@ -5036,6 +5049,89 @@ out: } EXPORT_SYMBOL_GPL(qeth_core_hardsetup_card); +static int qeth_set_online(struct qeth_card *card) +{ + int rc; + + mutex_lock(&card->discipline_mutex); + mutex_lock(&card->conf_mutex); + QETH_CARD_TEXT(card, 2, "setonlin"); + + rc = card->discipline->set_online(card); + + mutex_unlock(&card->conf_mutex); + mutex_unlock(&card->discipline_mutex); + + return rc; +} + +int qeth_set_offline(struct qeth_card *card, bool resetting) +{ + int rc, rc2, rc3; + + mutex_lock(&card->discipline_mutex); + mutex_lock(&card->conf_mutex); + QETH_CARD_TEXT(card, 3, "setoffl"); + + if ((!resetting && card->info.hwtrap) || card->info.hwtrap == 2) { + qeth_hw_trap(card, QETH_DIAGS_TRAP_DISARM); + card->info.hwtrap = 1; + } + + rtnl_lock(); + card->info.open_when_online = card->dev->flags & IFF_UP; + dev_close(card->dev); + netif_device_detach(card->dev); + netif_carrier_off(card->dev); + rtnl_unlock(); + + card->discipline->set_offline(card); + + rc = qeth_stop_channel(&card->data); + rc2 = qeth_stop_channel(&card->write); + rc3 = qeth_stop_channel(&card->read); + if (!rc) + rc = (rc2) ? rc2 : rc3; + if (rc) + QETH_CARD_TEXT_(card, 2, "1err%d", rc); + qdio_free(CARD_DDEV(card)); + + /* let user_space know that device is offline */ + kobject_uevent(&card->gdev->dev.kobj, KOBJ_CHANGE); + + mutex_unlock(&card->conf_mutex); + mutex_unlock(&card->discipline_mutex); + return 0; +} +EXPORT_SYMBOL_GPL(qeth_set_offline); + +static int qeth_do_reset(void *data) +{ + struct qeth_card *card = data; + int rc; + + QETH_CARD_TEXT(card, 2, "recover1"); + if (!qeth_do_run_thread(card, QETH_RECOVER_THREAD)) + return 0; + QETH_CARD_TEXT(card, 2, "recover2"); + dev_warn(&card->gdev->dev, + "A recovery process has been started for the device\n"); + + qeth_set_offline(card, true); + rc = qeth_set_online(card); + if (!rc) { + dev_info(&card->gdev->dev, + "Device successfully recovered!\n"); + } else { + ccwgroup_set_offline(card->gdev); + dev_warn(&card->gdev->dev, + "The qeth device driver failed to recover an error on the device\n"); + } + qeth_clear_thread_start_bit(card, QETH_RECOVER_THREAD); + qeth_clear_thread_running_bit(card, QETH_RECOVER_THREAD); + return 0; +} + #if IS_ENABLED(CONFIG_QETH_L3) static void qeth_l3_rebuild_skb(struct qeth_card *card, struct sk_buff *skb, struct qeth_hdr *hdr) @@ -5972,7 +6068,8 @@ static int qeth_core_set_online(struct ccwgroup_device *gdev) goto err; } } - rc = card->discipline->set_online(gdev); + + rc = qeth_set_online(card); err: return rc; } @@ -5980,7 +6077,8 @@ err: static int qeth_core_set_offline(struct ccwgroup_device *gdev) { struct qeth_card *card = dev_get_drvdata(&gdev->dev); - return card->discipline->set_offline(gdev); + + return qeth_set_offline(card, false); } static void qeth_core_shutdown(struct ccwgroup_device *gdev) @@ -6003,7 +6101,7 @@ static int qeth_suspend(struct ccwgroup_device *gdev) if (gdev->state == CCWGROUP_OFFLINE) return 0; - card->discipline->set_offline(gdev); + qeth_set_offline(card, false); return 0; } @@ -6012,7 +6110,7 @@ static int qeth_resume(struct ccwgroup_device *gdev) struct qeth_card *card = dev_get_drvdata(&gdev->dev); int rc; - rc = card->discipline->set_online(gdev); + rc = qeth_set_online(card); qeth_set_allowed_threads(card, 0xffffffff, 0); if (rc) diff --git a/drivers/s390/net/qeth_core_sys.c b/drivers/s390/net/qeth_core_sys.c index 7bd86027f559..2bd9993aa60b 100644 --- a/drivers/s390/net/qeth_core_sys.c +++ b/drivers/s390/net/qeth_core_sys.c @@ -24,8 +24,6 @@ static ssize_t qeth_dev_state_show(struct device *dev, switch (card->state) { case CARD_STATE_DOWN: return sprintf(buf, "DOWN\n"); - case CARD_STATE_HARDSETUP: - return sprintf(buf, "HARDSETUP\n"); case CARD_STATE_SOFTSETUP: if (card->dev->flags & IFF_UP) return sprintf(buf, "UP (LAN %s)\n", diff --git a/drivers/s390/net/qeth_l2.h b/drivers/s390/net/qeth_l2.h index ddc615b431a8..adf25c9fd2b3 100644 --- a/drivers/s390/net/qeth_l2.h +++ b/drivers/s390/net/qeth_l2.h @@ -13,7 +13,6 @@ extern const struct attribute_group *qeth_l2_attr_groups[]; int qeth_l2_create_device_attributes(struct device *); void qeth_l2_remove_device_attributes(struct device *); -void qeth_l2_setup_bridgeport_attrs(struct qeth_card *card); int qeth_bridgeport_query_ports(struct qeth_card *card, enum qeth_sbp_roles *role, enum qeth_sbp_states *state); diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c index 7175b5d8a23c..692bd2623401 100644 --- a/drivers/s390/net/qeth_l2_main.c +++ b/drivers/s390/net/qeth_l2_main.c @@ -24,7 +24,6 @@ #include "qeth_core.h" #include "qeth_l2.h" -static int qeth_l2_set_offline(struct ccwgroup_device *); static void qeth_bridgeport_query_support(struct qeth_card *card); static void qeth_bridge_state_change(struct qeth_card *card, struct qeth_ipa_cmd *cmd); @@ -284,15 +283,12 @@ static void qeth_l2_stop_card(struct qeth_card *card) if (card->state == CARD_STATE_SOFTSETUP) { qeth_clear_ipacmd_list(card); - card->state = CARD_STATE_HARDSETUP; - } - if (card->state == CARD_STATE_HARDSETUP) { qeth_drain_output_queues(card); - qeth_clear_working_pool_list(card); card->state = CARD_STATE_DOWN; } qeth_qdio_clear_card(card, 0); + qeth_clear_working_pool_list(card); flush_workqueue(card->event_wq); card->info.mac_bits &= ~QETH_LAYER2_MAC_REGISTERED; card->info.promisc_mode = 0; @@ -610,7 +606,7 @@ static void qeth_l2_remove_device(struct ccwgroup_device *cgdev) wait_event(card->wait_q, qeth_threads_running(card, 0xffffffff) == 0); if (cgdev->state == CCWGROUP_ONLINE) - qeth_l2_set_offline(cgdev); + qeth_set_offline(card, false); cancel_work_sync(&card->close_dev_work); if (qeth_netdev_is_registered(card->dev)) @@ -728,17 +724,31 @@ static void qeth_l2_trace_features(struct qeth_card *card) sizeof(card->options.vnicc.sup_chars)); } -static int qeth_l2_set_online(struct ccwgroup_device *gdev) +static void qeth_l2_setup_bridgeport_attrs(struct qeth_card *card) { - struct qeth_card *card = dev_get_drvdata(&gdev->dev); + if (!card->options.sbp.reflect_promisc && + card->options.sbp.role != QETH_SBP_ROLE_NONE) { + /* Conditional to avoid spurious error messages */ + qeth_bridgeport_setrole(card, card->options.sbp.role); + /* Let the callback function refresh the stored role value. */ + qeth_bridgeport_query_ports(card, &card->options.sbp.role, + NULL); + } + if (card->options.sbp.hostnotification) { + if (qeth_bridgeport_an_set(card, 1)) + card->options.sbp.hostnotification = 0; + } else { + qeth_bridgeport_an_set(card, 0); + } +} + +static int qeth_l2_set_online(struct qeth_card *card) +{ + struct ccwgroup_device *gdev = card->gdev; struct net_device *dev = card->dev; int rc = 0; bool carrier_ok; - mutex_lock(&card->discipline_mutex); - mutex_lock(&card->conf_mutex); - QETH_CARD_TEXT(card, 2, "setonlin"); - rc = qeth_core_hardsetup_card(card, &carrier_ok); if (rc) { QETH_CARD_TEXT_(card, 2, "2err%04x", rc); @@ -748,9 +758,11 @@ static int qeth_l2_set_online(struct ccwgroup_device *gdev) mutex_lock(&card->sbp_lock); qeth_bridgeport_query_support(card); - if (card->options.sbp.supported_funcs) + if (card->options.sbp.supported_funcs) { + qeth_l2_setup_bridgeport_attrs(card); dev_info(&card->gdev->dev, - "The device represents a Bridge Capable Port\n"); + "The device represents a Bridge Capable Port\n"); + } mutex_unlock(&card->sbp_lock); qeth_l2_register_dev_addr(card); @@ -761,20 +773,11 @@ static int qeth_l2_set_online(struct ccwgroup_device *gdev) qeth_trace_features(card); qeth_l2_trace_features(card); - qeth_l2_setup_bridgeport_attrs(card); - - card->state = CARD_STATE_HARDSETUP; qeth_print_status_message(card); /* softsetup */ QETH_CARD_TEXT(card, 2, "softsetp"); - rc = qeth_init_qdio_queues(card); - if (rc) { - QETH_CARD_TEXT_(card, 2, "6err%d", rc); - rc = -ENODEV; - goto out_remove; - } card->state = CARD_STATE_SOFTSETUP; qeth_set_allowed_threads(card, 0xffffffff, 0); @@ -801,8 +804,6 @@ static int qeth_l2_set_online(struct ccwgroup_device *gdev) } /* let user_space know that device is online */ kobject_uevent(&gdev->dev.kobj, KOBJ_CHANGE); - mutex_unlock(&card->conf_mutex); - mutex_unlock(&card->discipline_mutex); return 0; out_remove: @@ -811,81 +812,12 @@ out_remove: qeth_stop_channel(&card->write); qeth_stop_channel(&card->read); qdio_free(CARD_DDEV(card)); - - mutex_unlock(&card->conf_mutex); - mutex_unlock(&card->discipline_mutex); return rc; } -static int __qeth_l2_set_offline(struct ccwgroup_device *cgdev, - int recovery_mode) +static void qeth_l2_set_offline(struct qeth_card *card) { - struct qeth_card *card = dev_get_drvdata(&cgdev->dev); - int rc = 0, rc2 = 0, rc3 = 0; - - mutex_lock(&card->discipline_mutex); - mutex_lock(&card->conf_mutex); - QETH_CARD_TEXT(card, 3, "setoffl"); - - if ((!recovery_mode && card->info.hwtrap) || card->info.hwtrap == 2) { - qeth_hw_trap(card, QETH_DIAGS_TRAP_DISARM); - card->info.hwtrap = 1; - } - - rtnl_lock(); - card->info.open_when_online = card->dev->flags & IFF_UP; - dev_close(card->dev); - netif_device_detach(card->dev); - netif_carrier_off(card->dev); - rtnl_unlock(); - qeth_l2_stop_card(card); - rc = qeth_stop_channel(&card->data); - rc2 = qeth_stop_channel(&card->write); - rc3 = qeth_stop_channel(&card->read); - if (!rc) - rc = (rc2) ? rc2 : rc3; - if (rc) - QETH_CARD_TEXT_(card, 2, "1err%d", rc); - qdio_free(CARD_DDEV(card)); - - /* let user_space know that device is offline */ - kobject_uevent(&cgdev->dev.kobj, KOBJ_CHANGE); - mutex_unlock(&card->conf_mutex); - mutex_unlock(&card->discipline_mutex); - return 0; -} - -static int qeth_l2_set_offline(struct ccwgroup_device *cgdev) -{ - return __qeth_l2_set_offline(cgdev, 0); -} - -static int qeth_l2_recover(void *ptr) -{ - struct qeth_card *card; - int rc = 0; - - card = (struct qeth_card *) ptr; - QETH_CARD_TEXT(card, 2, "recover1"); - if (!qeth_do_run_thread(card, QETH_RECOVER_THREAD)) - return 0; - QETH_CARD_TEXT(card, 2, "recover2"); - dev_warn(&card->gdev->dev, - "A recovery process has been started for the device\n"); - __qeth_l2_set_offline(card->gdev, 1); - rc = qeth_l2_set_online(card->gdev); - if (!rc) - dev_info(&card->gdev->dev, - "Device successfully recovered!\n"); - else { - ccwgroup_set_offline(card->gdev); - dev_warn(&card->gdev->dev, "The qeth device driver " - "failed to recover an error on the device\n"); - } - qeth_clear_thread_start_bit(card, QETH_RECOVER_THREAD); - qeth_clear_thread_running_bit(card, QETH_RECOVER_THREAD); - return 0; } static int __init qeth_l2_init(void) @@ -922,7 +854,6 @@ static int qeth_l2_control_event(struct qeth_card *card, struct qeth_discipline qeth_l2_discipline = { .devtype = &qeth_l2_devtype, - .recover = qeth_l2_recover, .setup = qeth_l2_probe_device, .remove = qeth_l2_remove_device, .set_online = qeth_l2_set_online, @@ -961,7 +892,8 @@ int qeth_osn_assist(struct net_device *dev, void *data, int data_len) if (!iob) return -ENOMEM; - qeth_prepare_ipa_cmd(card, iob, (u16) data_len); + qeth_prepare_ipa_cmd(card, iob, (u16) data_len, NULL); + memcpy(__ipa_cmd(iob), data, data_len); iob->callback = qeth_osn_assist_cb; return qeth_send_ipa_cmd(card, iob, NULL, NULL); diff --git a/drivers/s390/net/qeth_l2_sys.c b/drivers/s390/net/qeth_l2_sys.c index 7fa325cf6f8d..86bcae992f72 100644 --- a/drivers/s390/net/qeth_l2_sys.c +++ b/drivers/s390/net/qeth_l2_sys.c @@ -246,40 +246,6 @@ static struct attribute_group qeth_l2_bridgeport_attr_group = { .attrs = qeth_l2_bridgeport_attrs, }; -/** - * qeth_l2_setup_bridgeport_attrs() - set/restore attrs when turning online. - * @card: qeth_card structure pointer - * - * Note: this function is called with conf_mutex held by the caller - */ -void qeth_l2_setup_bridgeport_attrs(struct qeth_card *card) -{ - int rc; - - if (!card) - return; - if (!card->options.sbp.supported_funcs) - return; - - mutex_lock(&card->sbp_lock); - if (!card->options.sbp.reflect_promisc && - card->options.sbp.role != QETH_SBP_ROLE_NONE) { - /* Conditional to avoid spurious error messages */ - qeth_bridgeport_setrole(card, card->options.sbp.role); - /* Let the callback function refresh the stored role value. */ - qeth_bridgeport_query_ports(card, - &card->options.sbp.role, NULL); - } - if (card->options.sbp.hostnotification) { - rc = qeth_bridgeport_an_set(card, 1); - if (rc) - card->options.sbp.hostnotification = 0; - } else { - qeth_bridgeport_an_set(card, 0); - } - mutex_unlock(&card->sbp_lock); -} - /* VNIC CHARS support */ /* convert sysfs attr name to VNIC characteristic */ diff --git a/drivers/s390/net/qeth_l3_main.c b/drivers/s390/net/qeth_l3_main.c index 6c0bffd11138..317d56647a4a 100644 --- a/drivers/s390/net/qeth_l3_main.c +++ b/drivers/s390/net/qeth_l3_main.c @@ -37,8 +37,6 @@ #include "qeth_l3.h" - -static int qeth_l3_set_offline(struct ccwgroup_device *); static int qeth_l3_register_addr_entry(struct qeth_card *, struct qeth_ipaddr *); static int qeth_l3_deregister_addr_entry(struct qeth_card *, @@ -903,7 +901,7 @@ out: return rc; } -static int qeth_l3_start_ipassists(struct qeth_card *card) +static void qeth_l3_start_ipassists(struct qeth_card *card) { QETH_CARD_TEXT(card, 3, "strtipas"); @@ -913,7 +911,6 @@ static int qeth_l3_start_ipassists(struct qeth_card *card) qeth_l3_start_ipa_multicast(card); /* go on*/ qeth_l3_start_ipa_ipv6(card); /* go on*/ qeth_l3_start_ipa_broadcast(card); /* go on*/ - return 0; } static int qeth_l3_iqd_read_initial_mac_cb(struct qeth_card *card, @@ -1180,15 +1177,12 @@ static void qeth_l3_stop_card(struct qeth_card *card) if (card->state == CARD_STATE_SOFTSETUP) { qeth_l3_clear_ip_htable(card, 1); qeth_clear_ipacmd_list(card); - card->state = CARD_STATE_HARDSETUP; - } - if (card->state == CARD_STATE_HARDSETUP) { qeth_drain_output_queues(card); - qeth_clear_working_pool_list(card); card->state = CARD_STATE_DOWN; } qeth_qdio_clear_card(card, 0); + qeth_clear_working_pool_list(card); flush_workqueue(card->event_wq); card->info.promisc_mode = 0; } @@ -2044,7 +2038,7 @@ static void qeth_l3_remove_device(struct ccwgroup_device *cgdev) wait_event(card->wait_q, qeth_threads_running(card, 0xffffffff) == 0); if (cgdev->state == CCWGROUP_ONLINE) - qeth_l3_set_offline(cgdev); + qeth_set_offline(card, false); cancel_work_sync(&card->close_dev_work); if (qeth_netdev_is_registered(card->dev)) @@ -2056,17 +2050,13 @@ static void qeth_l3_remove_device(struct ccwgroup_device *cgdev) qeth_l3_clear_ipato_list(card); } -static int qeth_l3_set_online(struct ccwgroup_device *gdev) +static int qeth_l3_set_online(struct qeth_card *card) { - struct qeth_card *card = dev_get_drvdata(&gdev->dev); + struct ccwgroup_device *gdev = card->gdev; struct net_device *dev = card->dev; int rc = 0; bool carrier_ok; - mutex_lock(&card->discipline_mutex); - mutex_lock(&card->conf_mutex); - QETH_CARD_TEXT(card, 2, "setonlin"); - rc = qeth_core_hardsetup_card(card, &carrier_ok); if (rc) { QETH_CARD_TEXT_(card, 2, "2err%04x", rc); @@ -2074,7 +2064,6 @@ static int qeth_l3_set_online(struct ccwgroup_device *gdev) goto out_remove; } - card->state = CARD_STATE_HARDSETUP; qeth_print_status_message(card); /* softsetup */ @@ -2084,11 +2073,8 @@ static int qeth_l3_set_online(struct ccwgroup_device *gdev) if (rc) QETH_CARD_TEXT_(card, 2, "2err%04x", rc); if (!card->options.sniffer) { - rc = qeth_l3_start_ipassists(card); - if (rc) { - QETH_CARD_TEXT_(card, 2, "3err%d", rc); - goto out_remove; - } + qeth_l3_start_ipassists(card); + rc = qeth_l3_setrouting_v4(card); if (rc) QETH_CARD_TEXT_(card, 2, "4err%04x", rc); @@ -2097,12 +2083,6 @@ static int qeth_l3_set_online(struct ccwgroup_device *gdev) QETH_CARD_TEXT_(card, 2, "5err%04x", rc); } - rc = qeth_init_qdio_queues(card); - if (rc) { - QETH_CARD_TEXT_(card, 2, "6err%d", rc); - rc = -ENODEV; - goto out_remove; - } card->state = CARD_STATE_SOFTSETUP; qeth_set_allowed_threads(card, 0xffffffff, 0); @@ -2131,8 +2111,6 @@ static int qeth_l3_set_online(struct ccwgroup_device *gdev) qeth_trace_features(card); /* let user_space know that device is online */ kobject_uevent(&gdev->dev.kobj, KOBJ_CHANGE); - mutex_unlock(&card->conf_mutex); - mutex_unlock(&card->discipline_mutex); return 0; out_remove: qeth_l3_stop_card(card); @@ -2140,82 +2118,12 @@ out_remove: qeth_stop_channel(&card->write); qeth_stop_channel(&card->read); qdio_free(CARD_DDEV(card)); - - mutex_unlock(&card->conf_mutex); - mutex_unlock(&card->discipline_mutex); return rc; } -static int __qeth_l3_set_offline(struct ccwgroup_device *cgdev, - int recovery_mode) +static void qeth_l3_set_offline(struct qeth_card *card) { - struct qeth_card *card = dev_get_drvdata(&cgdev->dev); - int rc = 0, rc2 = 0, rc3 = 0; - - mutex_lock(&card->discipline_mutex); - mutex_lock(&card->conf_mutex); - QETH_CARD_TEXT(card, 3, "setoffl"); - - if ((!recovery_mode && card->info.hwtrap) || card->info.hwtrap == 2) { - qeth_hw_trap(card, QETH_DIAGS_TRAP_DISARM); - card->info.hwtrap = 1; - } - - rtnl_lock(); - card->info.open_when_online = card->dev->flags & IFF_UP; - dev_close(card->dev); - netif_device_detach(card->dev); - netif_carrier_off(card->dev); - rtnl_unlock(); - qeth_l3_stop_card(card); - rc = qeth_stop_channel(&card->data); - rc2 = qeth_stop_channel(&card->write); - rc3 = qeth_stop_channel(&card->read); - if (!rc) - rc = (rc2) ? rc2 : rc3; - if (rc) - QETH_CARD_TEXT_(card, 2, "1err%d", rc); - qdio_free(CARD_DDEV(card)); - - /* let user_space know that device is offline */ - kobject_uevent(&cgdev->dev.kobj, KOBJ_CHANGE); - mutex_unlock(&card->conf_mutex); - mutex_unlock(&card->discipline_mutex); - return 0; -} - -static int qeth_l3_set_offline(struct ccwgroup_device *cgdev) -{ - return __qeth_l3_set_offline(cgdev, 0); -} - -static int qeth_l3_recover(void *ptr) -{ - struct qeth_card *card; - int rc = 0; - - card = (struct qeth_card *) ptr; - QETH_CARD_TEXT(card, 2, "recover1"); - QETH_CARD_HEX(card, 2, &card, sizeof(void *)); - if (!qeth_do_run_thread(card, QETH_RECOVER_THREAD)) - return 0; - QETH_CARD_TEXT(card, 2, "recover2"); - dev_warn(&card->gdev->dev, - "A recovery process has been started for the device\n"); - __qeth_l3_set_offline(card->gdev, 1); - rc = qeth_l3_set_online(card->gdev); - if (!rc) - dev_info(&card->gdev->dev, - "Device successfully recovered!\n"); - else { - ccwgroup_set_offline(card->gdev); - dev_warn(&card->gdev->dev, "The qeth device driver " - "failed to recover an error on the device\n"); - } - qeth_clear_thread_start_bit(card, QETH_RECOVER_THREAD); - qeth_clear_thread_running_bit(card, QETH_RECOVER_THREAD); - return 0; } /* Returns zero if the command is successfully "consumed" */ @@ -2227,7 +2135,6 @@ static int qeth_l3_control_event(struct qeth_card *card, struct qeth_discipline qeth_l3_discipline = { .devtype = &qeth_l3_devtype, - .recover = qeth_l3_recover, .setup = qeth_l3_probe_device, .remove = qeth_l3_remove_device, .set_online = qeth_l3_set_online, diff --git a/drivers/tee/optee/Kconfig b/drivers/tee/optee/Kconfig index d1ad512e1708..3ca71e3812ed 100644 --- a/drivers/tee/optee/Kconfig +++ b/drivers/tee/optee/Kconfig @@ -3,6 +3,7 @@ config OPTEE tristate "OP-TEE" depends on HAVE_ARM_SMCCC + depends on MMU help This implements the OP-TEE Trusted Execution Environment (TEE) driver. diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c index f639dde2a679..ba4d8f375b3c 100644 --- a/fs/btrfs/dev-replace.c +++ b/fs/btrfs/dev-replace.c @@ -500,11 +500,8 @@ static int btrfs_dev_replace_start(struct btrfs_fs_info *fs_info, &dev_replace->scrub_progress, 0, 1); ret = btrfs_dev_replace_finishing(fs_info, ret); - if (ret == -EINPROGRESS) { + if (ret == -EINPROGRESS) ret = BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS; - } else if (ret != -ECANCELED) { - WARN_ON(ret); - } return ret; diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index 21de630b0730..fd266a2d15ec 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -3577,17 +3577,27 @@ int scrub_enumerate_chunks(struct scrub_ctx *sctx, * This can easily boost the amount of SYSTEM chunks if cleaner * thread can't be triggered fast enough, and use up all space * of btrfs_super_block::sys_chunk_array + * + * While for dev replace, we need to try our best to mark block + * group RO, to prevent race between: + * - Write duplication + * Contains latest data + * - Scrub copy + * Contains data from commit tree + * + * If target block group is not marked RO, nocow writes can + * be overwritten by scrub copy, causing data corruption. + * So for dev-replace, it's not allowed to continue if a block + * group is not RO. */ - ret = btrfs_inc_block_group_ro(cache, false); - scrub_pause_off(fs_info); - + ret = btrfs_inc_block_group_ro(cache, sctx->is_dev_replace); if (ret == 0) { ro_set = 1; - } else if (ret == -ENOSPC) { + } else if (ret == -ENOSPC && !sctx->is_dev_replace) { /* * btrfs_inc_block_group_ro return -ENOSPC when it * failed in creating new chunk for metadata. - * It is not a problem for scrub/replace, because + * It is not a problem for scrub, because * metadata are always cowed, and our scrub paused * commit_transactions. */ @@ -3596,9 +3606,22 @@ int scrub_enumerate_chunks(struct scrub_ctx *sctx, btrfs_warn(fs_info, "failed setting block group ro: %d", ret); btrfs_put_block_group(cache); + scrub_pause_off(fs_info); break; } + /* + * Now the target block is marked RO, wait for nocow writes to + * finish before dev-replace. + * COW is fine, as COW never overwrites extents in commit tree. + */ + if (sctx->is_dev_replace) { + btrfs_wait_nocow_writers(cache); + btrfs_wait_ordered_roots(fs_info, U64_MAX, cache->start, + cache->length); + } + + scrub_pause_off(fs_info); down_write(&dev_replace->rwsem); dev_replace->cursor_right = found_key.offset + length; dev_replace->cursor_left = found_key.offset; diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 374db1bd57d1..145d46ba25ae 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -708,8 +708,10 @@ void ceph_mdsc_release_request(struct kref *kref) /* avoid calling iput_final() in mds dispatch threads */ ceph_async_iput(req->r_inode); } - if (req->r_parent) + if (req->r_parent) { ceph_put_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN); + ceph_async_iput(req->r_parent); + } ceph_async_iput(req->r_target_inode); if (req->r_dentry) dput(req->r_dentry); @@ -2676,8 +2678,10 @@ int ceph_mdsc_submit_request(struct ceph_mds_client *mdsc, struct inode *dir, /* take CAP_PIN refs for r_inode, r_parent, r_old_dentry */ if (req->r_inode) ceph_get_cap_refs(ceph_inode(req->r_inode), CEPH_CAP_PIN); - if (req->r_parent) + if (req->r_parent) { ceph_get_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN); + ihold(req->r_parent); + } if (req->r_old_dentry_dir) ceph_get_cap_refs(ceph_inode(req->r_old_dentry_dir), CEPH_CAP_PIN); diff --git a/fs/io_uring.c b/fs/io_uring.c index 187dd94fd6b1..5953d7f13690 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -4463,13 +4463,15 @@ static int io_sqe_files_update(struct io_ring_ctx *ctx, void __user *arg, return -EINVAL; if (copy_from_user(&up, arg, sizeof(up))) return -EFAULT; + if (up.resv) + return -EINVAL; if (check_add_overflow(up.offset, nr_args, &done)) return -EOVERFLOW; if (done > ctx->nr_user_files) return -EINVAL; done = 0; - fds = (__s32 __user *) up.fds; + fds = u64_to_user_ptr(up.fds); while (nr_args) { struct fixed_file_table *table; unsigned index; diff --git a/fs/readdir.c b/fs/readdir.c index d26d5ea4de7b..de2eceffdee8 100644 --- a/fs/readdir.c +++ b/fs/readdir.c @@ -102,10 +102,14 @@ EXPORT_SYMBOL(iterate_dir); * filename length, and the above "soft error" worry means * that it's probably better left alone until we have that * issue clarified. + * + * Note the PATH_MAX check - it's arbitrary but the real + * kernel limit on a possible path component, not NAME_MAX, + * which is the technical standard limit. */ static int verify_dirent_name(const char *name, int len) { - if (!len) + if (len <= 0 || len >= PATH_MAX) return -EIO; if (memchr(name, '/', len)) return -EIO; @@ -206,7 +210,7 @@ struct linux_dirent { struct getdents_callback { struct dir_context ctx; struct linux_dirent __user * current_dir; - struct linux_dirent __user * previous; + int prev_reclen; int count; int error; }; @@ -214,12 +218,13 @@ struct getdents_callback { static int filldir(struct dir_context *ctx, const char *name, int namlen, loff_t offset, u64 ino, unsigned int d_type) { - struct linux_dirent __user * dirent; + struct linux_dirent __user *dirent, *prev; struct getdents_callback *buf = container_of(ctx, struct getdents_callback, ctx); unsigned long d_ino; int reclen = ALIGN(offsetof(struct linux_dirent, d_name) + namlen + 2, sizeof(long)); + int prev_reclen; buf->error = verify_dirent_name(name, namlen); if (unlikely(buf->error)) @@ -232,28 +237,24 @@ static int filldir(struct dir_context *ctx, const char *name, int namlen, buf->error = -EOVERFLOW; return -EOVERFLOW; } - dirent = buf->previous; - if (dirent && signal_pending(current)) + prev_reclen = buf->prev_reclen; + if (prev_reclen && signal_pending(current)) return -EINTR; - - /* - * Note! This range-checks 'previous' (which may be NULL). - * The real range was checked in getdents - */ - if (!user_access_begin(dirent, sizeof(*dirent))) - goto efault; - if (dirent) - unsafe_put_user(offset, &dirent->d_off, efault_end); dirent = buf->current_dir; + prev = (void __user *) dirent - prev_reclen; + if (!user_access_begin(prev, reclen + prev_reclen)) + goto efault; + + /* This might be 'dirent->d_off', but if so it will get overwritten */ + unsafe_put_user(offset, &prev->d_off, efault_end); unsafe_put_user(d_ino, &dirent->d_ino, efault_end); unsafe_put_user(reclen, &dirent->d_reclen, efault_end); unsafe_put_user(d_type, (char __user *) dirent + reclen - 1, efault_end); unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end); user_access_end(); - buf->previous = dirent; - dirent = (void __user *)dirent + reclen; - buf->current_dir = dirent; + buf->current_dir = (void __user *)dirent + reclen; + buf->prev_reclen = reclen; buf->count -= reclen; return 0; efault_end: @@ -267,7 +268,6 @@ SYSCALL_DEFINE3(getdents, unsigned int, fd, struct linux_dirent __user *, dirent, unsigned int, count) { struct fd f; - struct linux_dirent __user * lastdirent; struct getdents_callback buf = { .ctx.actor = filldir, .count = count, @@ -285,8 +285,10 @@ SYSCALL_DEFINE3(getdents, unsigned int, fd, error = iterate_dir(f.file, &buf.ctx); if (error >= 0) error = buf.error; - lastdirent = buf.previous; - if (lastdirent) { + if (buf.prev_reclen) { + struct linux_dirent __user * lastdirent; + lastdirent = (void __user *)buf.current_dir - buf.prev_reclen; + if (put_user(buf.ctx.pos, &lastdirent->d_off)) error = -EFAULT; else @@ -299,7 +301,7 @@ SYSCALL_DEFINE3(getdents, unsigned int, fd, struct getdents_callback64 { struct dir_context ctx; struct linux_dirent64 __user * current_dir; - struct linux_dirent64 __user * previous; + int prev_reclen; int count; int error; }; @@ -307,11 +309,12 @@ struct getdents_callback64 { static int filldir64(struct dir_context *ctx, const char *name, int namlen, loff_t offset, u64 ino, unsigned int d_type) { - struct linux_dirent64 __user *dirent; + struct linux_dirent64 __user *dirent, *prev; struct getdents_callback64 *buf = container_of(ctx, struct getdents_callback64, ctx); int reclen = ALIGN(offsetof(struct linux_dirent64, d_name) + namlen + 1, sizeof(u64)); + int prev_reclen; buf->error = verify_dirent_name(name, namlen); if (unlikely(buf->error)) @@ -319,30 +322,27 @@ static int filldir64(struct dir_context *ctx, const char *name, int namlen, buf->error = -EINVAL; /* only used if we fail.. */ if (reclen > buf->count) return -EINVAL; - dirent = buf->previous; - if (dirent && signal_pending(current)) + prev_reclen = buf->prev_reclen; + if (prev_reclen && signal_pending(current)) return -EINTR; - - /* - * Note! This range-checks 'previous' (which may be NULL). - * The real range was checked in getdents - */ - if (!user_access_begin(dirent, sizeof(*dirent))) - goto efault; - if (dirent) - unsafe_put_user(offset, &dirent->d_off, efault_end); dirent = buf->current_dir; + prev = (void __user *)dirent - prev_reclen; + if (!user_access_begin(prev, reclen + prev_reclen)) + goto efault; + + /* This might be 'dirent->d_off', but if so it will get overwritten */ + unsafe_put_user(offset, &prev->d_off, efault_end); unsafe_put_user(ino, &dirent->d_ino, efault_end); unsafe_put_user(reclen, &dirent->d_reclen, efault_end); unsafe_put_user(d_type, &dirent->d_type, efault_end); unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end); user_access_end(); - buf->previous = dirent; - dirent = (void __user *)dirent + reclen; - buf->current_dir = dirent; + buf->prev_reclen = reclen; + buf->current_dir = (void __user *)dirent + reclen; buf->count -= reclen; return 0; + efault_end: user_access_end(); efault: @@ -354,7 +354,6 @@ int ksys_getdents64(unsigned int fd, struct linux_dirent64 __user *dirent, unsigned int count) { struct fd f; - struct linux_dirent64 __user * lastdirent; struct getdents_callback64 buf = { .ctx.actor = filldir64, .count = count, @@ -372,9 +371,11 @@ int ksys_getdents64(unsigned int fd, struct linux_dirent64 __user *dirent, error = iterate_dir(f.file, &buf.ctx); if (error >= 0) error = buf.error; - lastdirent = buf.previous; - if (lastdirent) { + if (buf.prev_reclen) { + struct linux_dirent64 __user * lastdirent; typeof(lastdirent->d_off) d_off = buf.ctx.pos; + + lastdirent = (void __user *) buf.current_dir - buf.prev_reclen; if (__put_user(d_off, &lastdirent->d_off)) error = -EFAULT; else diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c index 62b40df36c98..28b241cd6987 100644 --- a/fs/reiserfs/xattr.c +++ b/fs/reiserfs/xattr.c @@ -319,8 +319,12 @@ static int reiserfs_for_each_xattr(struct inode *inode, out_dir: dput(dir); out: - /* -ENODATA isn't an error */ - if (err == -ENODATA) + /* + * -ENODATA: this object doesn't have any xattrs + * -EOPNOTSUPP: this file system doesn't have xattrs enabled on disk. + * Neither are errors + */ + if (err == -ENODATA || err == -EOPNOTSUPP) err = 0; return err; } diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h index ff335b22f23c..f0f3a9fffa6a 100644 --- a/include/linux/bitmap.h +++ b/include/linux/bitmap.h @@ -53,6 +53,7 @@ * bitmap_find_next_zero_area_off(buf, len, pos, n, mask) as above * bitmap_shift_right(dst, src, n, nbits) *dst = *src >> n * bitmap_shift_left(dst, src, n, nbits) *dst = *src << n + * bitmap_cut(dst, src, first, n, nbits) Cut n bits from first, copy rest * bitmap_replace(dst, old, new, mask, nbits) *dst = (*old & ~(*mask)) | (*new & *mask) * bitmap_remap(dst, src, old, new, nbits) *dst = map(old, new)(src) * bitmap_bitremap(oldbit, old, new, nbits) newbit = map(old, new)(oldbit) @@ -133,6 +134,9 @@ extern void __bitmap_shift_right(unsigned long *dst, const unsigned long *src, unsigned int shift, unsigned int nbits); extern void __bitmap_shift_left(unsigned long *dst, const unsigned long *src, unsigned int shift, unsigned int nbits); +extern void bitmap_cut(unsigned long *dst, const unsigned long *src, + unsigned int first, unsigned int cut, + unsigned int nbits); extern int __bitmap_and(unsigned long *dst, const unsigned long *bitmap1, const unsigned long *bitmap2, unsigned int nbits); extern void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1, diff --git a/include/linux/etherdevice.h b/include/linux/etherdevice.h index f6564b572d77..8801f1f986e5 100644 --- a/include/linux/etherdevice.h +++ b/include/linux/etherdevice.h @@ -43,7 +43,6 @@ __be16 eth_header_parse_protocol(const struct sk_buff *skb); int eth_prepare_mac_addr_change(struct net_device *dev, void *p); void eth_commit_mac_addr_change(struct net_device *dev, void *p); int eth_mac_addr(struct net_device *dev, void *p); -int eth_change_mtu(struct net_device *dev, int new_mtu); int eth_validate_addr(struct net_device *dev); struct net_device *alloc_etherdev_mqs(int sizeof_priv, unsigned int txqs, diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h index 4b19c544c59a..34d050bb1ae6 100644 --- a/include/linux/netdev_features.h +++ b/include/linux/netdev_features.h @@ -53,8 +53,9 @@ enum { NETIF_F_GSO_ESP_BIT, /* ... ESP with TSO */ NETIF_F_GSO_UDP_BIT, /* ... UFO, deprecated except tuntap */ NETIF_F_GSO_UDP_L4_BIT, /* ... UDP payload GSO (not UFO) */ + NETIF_F_GSO_FRAGLIST_BIT, /* ... Fraglist GSO */ /**/NETIF_F_GSO_LAST = /* last bit, see GSO_MASK */ - NETIF_F_GSO_UDP_L4_BIT, + NETIF_F_GSO_FRAGLIST_BIT, NETIF_F_FCOE_CRC_BIT, /* FCoE CRC32 */ NETIF_F_SCTP_CRC_BIT, /* SCTP checksum offload */ @@ -80,6 +81,7 @@ enum { NETIF_F_GRO_HW_BIT, /* Hardware Generic receive offload */ NETIF_F_HW_TLS_RECORD_BIT, /* Offload TLS record */ + NETIF_F_GRO_FRAGLIST_BIT, /* Fraglist GRO */ /* * Add your fresh new feature above and remember to update @@ -150,6 +152,8 @@ enum { #define NETIF_F_GSO_UDP_L4 __NETIF_F(GSO_UDP_L4) #define NETIF_F_HW_TLS_TX __NETIF_F(HW_TLS_TX) #define NETIF_F_HW_TLS_RX __NETIF_F(HW_TLS_RX) +#define NETIF_F_GRO_FRAGLIST __NETIF_F(GRO_FRAGLIST) +#define NETIF_F_GSO_FRAGLIST __NETIF_F(GSO_FRAGLIST) /* Finds the next feature with the highest number of the range of start till 0. */ @@ -226,6 +230,9 @@ static inline int find_next_netdev_feature(u64 feature, unsigned long start) /* changeable features with no special hardware requirements */ #define NETIF_F_SOFT_FEATURES (NETIF_F_GSO | NETIF_F_GRO) +/* Changeable features with no special hardware requirements that defaults to off. */ +#define NETIF_F_SOFT_FEATURES_OFF NETIF_F_GRO_FRAGLIST + #define NETIF_F_VLAN_FEATURES (NETIF_F_HW_VLAN_CTAG_FILTER | \ NETIF_F_HW_VLAN_CTAG_RX | \ NETIF_F_HW_VLAN_CTAG_TX | \ diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 5ec3537fbdb1..a9c6b5c61d27 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -850,6 +850,7 @@ enum tc_setup_type { TC_SETUP_QDISC_TAPRIO, TC_SETUP_FT, TC_SETUP_QDISC_ETS, + TC_SETUP_QDISC_TBF, }; /* These structures hold the attributes of bpf state that are being passed @@ -938,6 +939,11 @@ struct netdev_name_node { int netdev_name_node_alt_create(struct net_device *dev, const char *name); int netdev_name_node_alt_destroy(struct net_device *dev, const char *name); +struct netdev_net_notifier { + struct list_head list; + struct notifier_block *nb; +}; + /* * This structure defines the management hooks for network devices. * The following hooks can be defined; unless noted otherwise, they are @@ -1792,6 +1798,10 @@ enum netdev_priv_flags { * * @wol_enabled: Wake-on-LAN is enabled * + * @net_notifier_list: List of per-net netdev notifier block + * that follow this device when it is moved + * to another network namespace. + * * FIXME: cleanup struct net_device such that network protocol info * moves out. */ @@ -2084,6 +2094,8 @@ struct net_device { struct lock_class_key addr_list_lock_key; bool proto_down; unsigned wol_enabled:1; + + struct list_head net_notifier_list; }; #define to_net_dev(d) container_of(d, struct net_device, dev) @@ -2325,7 +2337,8 @@ struct napi_gro_cb { /* Number of gro_receive callbacks this packet already went through */ u8 recursion_counter:4; - /* 1 bit hole */ + /* GRO is done by frag_list pointer chaining. */ + u8 is_flist:1; /* used to support CHECKSUM_COMPLETE for tunneling protocols */ __wsum csum; @@ -2527,6 +2540,12 @@ int unregister_netdevice_notifier(struct notifier_block *nb); int register_netdevice_notifier_net(struct net *net, struct notifier_block *nb); int unregister_netdevice_notifier_net(struct net *net, struct notifier_block *nb); +int register_netdevice_notifier_dev_net(struct net_device *dev, + struct notifier_block *nb, + struct netdev_net_notifier *nn); +int unregister_netdevice_notifier_dev_net(struct net_device *dev, + struct notifier_block *nb, + struct netdev_net_notifier *nn); struct netdev_notifier_info { struct net_device *dev; @@ -2693,6 +2712,7 @@ struct net_device *dev_get_by_napi_id(unsigned int napi_id); int netdev_get_name(struct net *net, char *name, int ifindex); int dev_restart(struct net_device *dev); int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb); +int skb_gro_receive_list(struct sk_buff *p, struct sk_buff *skb); static inline unsigned int skb_gro_offset(const struct sk_buff *skb) { @@ -3704,6 +3724,8 @@ int dev_set_alias(struct net_device *, const char *, size_t); int dev_get_alias(const struct net_device *, char *, size_t); int dev_change_net_namespace(struct net_device *, struct net *, const char *); int __dev_set_mtu(struct net_device *, int); +int dev_validate_mtu(struct net_device *dev, int mtu, + struct netlink_ext_ack *extack); int dev_set_mtu_ext(struct net_device *dev, int mtu, struct netlink_ext_ack *extack); int dev_set_mtu(struct net_device *, int); @@ -3891,22 +3913,48 @@ void netif_device_attach(struct net_device *dev); */ enum { - NETIF_MSG_DRV = 0x0001, - NETIF_MSG_PROBE = 0x0002, - NETIF_MSG_LINK = 0x0004, - NETIF_MSG_TIMER = 0x0008, - NETIF_MSG_IFDOWN = 0x0010, - NETIF_MSG_IFUP = 0x0020, - NETIF_MSG_RX_ERR = 0x0040, - NETIF_MSG_TX_ERR = 0x0080, - NETIF_MSG_TX_QUEUED = 0x0100, - NETIF_MSG_INTR = 0x0200, - NETIF_MSG_TX_DONE = 0x0400, - NETIF_MSG_RX_STATUS = 0x0800, - NETIF_MSG_PKTDATA = 0x1000, - NETIF_MSG_HW = 0x2000, - NETIF_MSG_WOL = 0x4000, + NETIF_MSG_DRV_BIT, + NETIF_MSG_PROBE_BIT, + NETIF_MSG_LINK_BIT, + NETIF_MSG_TIMER_BIT, + NETIF_MSG_IFDOWN_BIT, + NETIF_MSG_IFUP_BIT, + NETIF_MSG_RX_ERR_BIT, + NETIF_MSG_TX_ERR_BIT, + NETIF_MSG_TX_QUEUED_BIT, + NETIF_MSG_INTR_BIT, + NETIF_MSG_TX_DONE_BIT, + NETIF_MSG_RX_STATUS_BIT, + NETIF_MSG_PKTDATA_BIT, + NETIF_MSG_HW_BIT, + NETIF_MSG_WOL_BIT, + + /* When you add a new bit above, update netif_msg_class_names array + * in net/ethtool/common.c + */ + NETIF_MSG_CLASS_COUNT, }; +/* Both ethtool_ops interface and internal driver implementation use u32 */ +static_assert(NETIF_MSG_CLASS_COUNT <= 32); + +#define __NETIF_MSG_BIT(bit) ((u32)1 << (bit)) +#define __NETIF_MSG(name) __NETIF_MSG_BIT(NETIF_MSG_ ## name ## _BIT) + +#define NETIF_MSG_DRV __NETIF_MSG(DRV) +#define NETIF_MSG_PROBE __NETIF_MSG(PROBE) +#define NETIF_MSG_LINK __NETIF_MSG(LINK) +#define NETIF_MSG_TIMER __NETIF_MSG(TIMER) +#define NETIF_MSG_IFDOWN __NETIF_MSG(IFDOWN) +#define NETIF_MSG_IFUP __NETIF_MSG(IFUP) +#define NETIF_MSG_RX_ERR __NETIF_MSG(RX_ERR) +#define NETIF_MSG_TX_ERR __NETIF_MSG(TX_ERR) +#define NETIF_MSG_TX_QUEUED __NETIF_MSG(TX_QUEUED) +#define NETIF_MSG_INTR __NETIF_MSG(INTR) +#define NETIF_MSG_TX_DONE __NETIF_MSG(TX_DONE) +#define NETIF_MSG_RX_STATUS __NETIF_MSG(RX_STATUS) +#define NETIF_MSG_PKTDATA __NETIF_MSG(PKTDATA) +#define NETIF_MSG_HW __NETIF_MSG(HW) +#define NETIF_MSG_WOL __NETIF_MSG(WOL) #define netif_msg_drv(p) ((p)->msg_enable & NETIF_MSG_DRV) #define netif_msg_probe(p) ((p)->msg_enable & NETIF_MSG_PROBE) @@ -4567,6 +4615,7 @@ static inline bool net_gso_ok(netdev_features_t features, int gso_type) BUILD_BUG_ON(SKB_GSO_ESP != (NETIF_F_GSO_ESP >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_UDP != (NETIF_F_GSO_UDP >> NETIF_F_GSO_SHIFT)); BUILD_BUG_ON(SKB_GSO_UDP_L4 != (NETIF_F_GSO_UDP_L4 >> NETIF_F_GSO_SHIFT)); + BUILD_BUG_ON(SKB_GSO_FRAGLIST != (NETIF_F_GSO_FRAGLIST >> NETIF_F_GSO_SHIFT)); return (features & feature) == feature; } diff --git a/include/linux/netfilter/ipset/ip_set.h b/include/linux/netfilter/ipset/ip_set.h index 4d8b1eaf7708..908d38dbcb91 100644 --- a/include/linux/netfilter/ipset/ip_set.h +++ b/include/linux/netfilter/ipset/ip_set.h @@ -426,13 +426,6 @@ ip6addrptr(const struct sk_buff *skb, bool src, struct in6_addr *addr) sizeof(*addr)); } -/* Calculate the bytes required to store the inclusive range of a-b */ -static inline int -bitmap_bytes(u32 a, u32 b) -{ - return 4 * ((((b - a + 8) / 8) + 3) / 4); -} - /* How often should the gc be run by default */ #define IPSET_GC_TIME (3 * 60) diff --git a/include/linux/netfilter/nfnetlink.h b/include/linux/netfilter/nfnetlink.h index cf09ab37b45b..851425c3178f 100644 --- a/include/linux/netfilter/nfnetlink.h +++ b/include/linux/netfilter/nfnetlink.h @@ -31,7 +31,7 @@ struct nfnetlink_subsystem { const struct nfnl_callback *cb; /* callback for individual types */ struct module *owner; int (*commit)(struct net *net, struct sk_buff *skb); - int (*abort)(struct net *net, struct sk_buff *skb); + int (*abort)(struct net *net, struct sk_buff *skb, bool autoload); void (*cleanup)(struct net *net); bool (*valid_genid)(struct net *net, u32 genid); }; diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 26beae7db264..3d13a4b717e9 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -592,6 +592,8 @@ enum { SKB_GSO_UDP = 1 << 16, SKB_GSO_UDP_L4 = 1 << 17, + + SKB_GSO_FRAGLIST = 1 << 18, }; #if BITS_PER_LONG > 32 @@ -3533,6 +3535,8 @@ void skb_scrub_packet(struct sk_buff *skb, bool xnet); bool skb_gso_validate_network_len(const struct sk_buff *skb, unsigned int mtu); bool skb_gso_validate_mac_len(const struct sk_buff *skb, unsigned int len); struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features); +struct sk_buff *skb_segment_list(struct sk_buff *skb, netdev_features_t features, + unsigned int offset); struct sk_buff *skb_vlan_untag(struct sk_buff *skb); int skb_ensure_writable(struct sk_buff *skb, int write_len); int __skb_vlan_pop(struct sk_buff *skb, u16 *vlan_tci); diff --git a/include/linux/tcp.h b/include/linux/tcp.h index ca6f01531e64..1cf73e6f85ca 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -78,6 +78,27 @@ struct tcp_sack_block { #define TCP_SACK_SEEN (1 << 0) /*1 = peer is SACK capable, */ #define TCP_DSACK_SEEN (1 << 2) /*1 = DSACK was received from peer*/ +#if IS_ENABLED(CONFIG_MPTCP) +struct mptcp_options_received { + u64 sndr_key; + u64 rcvr_key; + u64 data_ack; + u64 data_seq; + u32 subflow_seq; + u16 data_len; + u8 mp_capable : 1, + mp_join : 1, + dss : 1; + u8 use_map:1, + dsn64:1, + data_fin:1, + use_ack:1, + ack64:1, + mpc_map:1, + __unused:2; +}; +#endif + struct tcp_options_received { /* PAWS/RTTM data */ int ts_recent_stamp;/* Time we stored ts_recent (for aging) */ @@ -95,6 +116,9 @@ struct tcp_options_received { u8 num_sacks; /* Number of SACK blocks */ u16 user_mss; /* mss requested by user in ioctl */ u16 mss_clamp; /* Maximal mss, negotiated at connection setup */ +#if IS_ENABLED(CONFIG_MPTCP) + struct mptcp_options_received mptcp; +#endif }; static inline void tcp_clear_options(struct tcp_options_received *rx_opt) @@ -104,6 +128,11 @@ static inline void tcp_clear_options(struct tcp_options_received *rx_opt) #if IS_ENABLED(CONFIG_SMC) rx_opt->smc_ok = 0; #endif +#if IS_ENABLED(CONFIG_MPTCP) + rx_opt->mptcp.mp_capable = 0; + rx_opt->mptcp.mp_join = 0; + rx_opt->mptcp.dss = 0; +#endif } /* This is the max number of SACKS that we'll generate and process. It's safe @@ -119,6 +148,9 @@ struct tcp_request_sock { const struct tcp_request_sock_ops *af_specific; u64 snt_synack; /* first SYNACK sent time */ bool tfo_listener; +#if IS_ENABLED(CONFIG_MPTCP) + bool is_mptcp; +#endif u32 txhash; u32 rcv_isn; u32 snt_isn; @@ -354,6 +386,8 @@ struct tcp_sock { #define BPF_SOCK_OPS_TEST_FLAG(TP, ARG) 0 #endif + u16 timeout_rehash; /* Timeout-triggered rehash attempts */ + u32 rcv_ooopack; /* Received out-of-order packets, for tcpinfo */ /* Receiver side RTT estimation */ @@ -379,6 +413,9 @@ struct tcp_sock { u32 mtu_info; /* We received an ICMP_FRAG_NEEDED / ICMPV6_PKT_TOOBIG * while socket was owned by user. */ +#if IS_ENABLED(CONFIG_MPTCP) + bool is_mptcp; +#endif #ifdef CONFIG_TCP_MD5SIG /* TCP AF-Specific parts; only used by MD5 Signature support so far */ diff --git a/include/linux/xarray.h b/include/linux/xarray.h index 86eecbd98e84..f73e1775ded0 100644 --- a/include/linux/xarray.h +++ b/include/linux/xarray.h @@ -417,6 +417,36 @@ static inline bool xa_marked(const struct xarray *xa, xa_mark_t mark) } /** + * xa_for_each_range() - Iterate over a portion of an XArray. + * @xa: XArray. + * @index: Index of @entry. + * @entry: Entry retrieved from array. + * @start: First index to retrieve from array. + * @last: Last index to retrieve from array. + * + * During the iteration, @entry will have the value of the entry stored + * in @xa at @index. You may modify @index during the iteration if you + * want to skip or reprocess indices. It is safe to modify the array + * during the iteration. At the end of the iteration, @entry will be set + * to NULL and @index will have a value less than or equal to max. + * + * xa_for_each_range() is O(n.log(n)) while xas_for_each() is O(n). You have + * to handle your own locking with xas_for_each(), and if you have to unlock + * after each iteration, it will also end up being O(n.log(n)). + * xa_for_each_range() will spin if it hits a retry entry; if you intend to + * see retry entries, you should use the xas_for_each() iterator instead. + * The xas_for_each() iterator will expand into more inline code than + * xa_for_each_range(). + * + * Context: Any context. Takes and releases the RCU lock. + */ +#define xa_for_each_range(xa, index, entry, start, last) \ + for (index = start, \ + entry = xa_find(xa, &index, last, XA_PRESENT); \ + entry; \ + entry = xa_find_after(xa, &index, last, XA_PRESENT)) + +/** * xa_for_each_start() - Iterate over a portion of an XArray. * @xa: XArray. * @index: Index of @entry. @@ -439,11 +469,8 @@ static inline bool xa_marked(const struct xarray *xa, xa_mark_t mark) * * Context: Any context. Takes and releases the RCU lock. */ -#define xa_for_each_start(xa, index, entry, start) \ - for (index = start, \ - entry = xa_find(xa, &index, ULONG_MAX, XA_PRESENT); \ - entry; \ - entry = xa_find_after(xa, &index, ULONG_MAX, XA_PRESENT)) +#define xa_for_each_start(xa, index, entry, start) \ + xa_for_each_range(xa, index, entry, start, ULONG_MAX) /** * xa_for_each() - Iterate over present entries in an XArray. @@ -508,6 +535,14 @@ static inline bool xa_marked(const struct xarray *xa, xa_mark_t mark) spin_lock_irqsave(&(xa)->xa_lock, flags) #define xa_unlock_irqrestore(xa, flags) \ spin_unlock_irqrestore(&(xa)->xa_lock, flags) +#define xa_lock_nested(xa, subclass) \ + spin_lock_nested(&(xa)->xa_lock, subclass) +#define xa_lock_bh_nested(xa, subclass) \ + spin_lock_bh_nested(&(xa)->xa_lock, subclass) +#define xa_lock_irq_nested(xa, subclass) \ + spin_lock_irq_nested(&(xa)->xa_lock, subclass) +#define xa_lock_irqsave_nested(xa, flags, subclass) \ + spin_lock_irqsave_nested(&(xa)->xa_lock, flags, subclass) /* * Versions of the normal API which require the caller to hold the diff --git a/include/net/bluetooth/bluetooth.h b/include/net/bluetooth/bluetooth.h index fabee6db0abb..e42bb8e03c09 100644 --- a/include/net/bluetooth/bluetooth.h +++ b/include/net/bluetooth/bluetooth.h @@ -129,6 +129,8 @@ void bt_warn(const char *fmt, ...); __printf(1, 2) void bt_err(const char *fmt, ...); __printf(1, 2) +void bt_warn_ratelimited(const char *fmt, ...); +__printf(1, 2) void bt_err_ratelimited(const char *fmt, ...); #define BT_INFO(fmt, ...) bt_info(fmt "\n", ##__VA_ARGS__) @@ -136,8 +138,6 @@ void bt_err_ratelimited(const char *fmt, ...); #define BT_ERR(fmt, ...) bt_err(fmt "\n", ##__VA_ARGS__) #define BT_DBG(fmt, ...) pr_debug(fmt "\n", ##__VA_ARGS__) -#define BT_ERR_RATELIMITED(fmt, ...) bt_err_ratelimited(fmt "\n", ##__VA_ARGS__) - #define bt_dev_info(hdev, fmt, ...) \ BT_INFO("%s: " fmt, (hdev)->name, ##__VA_ARGS__) #define bt_dev_warn(hdev, fmt, ...) \ @@ -147,8 +147,10 @@ void bt_err_ratelimited(const char *fmt, ...); #define bt_dev_dbg(hdev, fmt, ...) \ BT_DBG("%s: " fmt, (hdev)->name, ##__VA_ARGS__) +#define bt_dev_warn_ratelimited(hdev, fmt, ...) \ + bt_warn_ratelimited("%s: " fmt, (hdev)->name, ##__VA_ARGS__) #define bt_dev_err_ratelimited(hdev, fmt, ...) \ - BT_ERR_RATELIMITED("%s: " fmt, (hdev)->name, ##__VA_ARGS__) + bt_err_ratelimited("%s: " fmt, (hdev)->name, ##__VA_ARGS__) /* Connection and socket states */ enum { diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h index 5bc1e30dedde..6293bdd7d862 100644 --- a/include/net/bluetooth/hci.h +++ b/include/net/bluetooth/hci.h @@ -27,6 +27,7 @@ #define HCI_MAX_ACL_SIZE 1024 #define HCI_MAX_SCO_SIZE 255 +#define HCI_MAX_ISO_SIZE 251 #define HCI_MAX_EVENT_SIZE 260 #define HCI_MAX_FRAME_SIZE (HCI_MAX_ACL_SIZE + 4) @@ -303,6 +304,7 @@ enum { #define HCI_ACLDATA_PKT 0x02 #define HCI_SCODATA_PKT 0x03 #define HCI_EVENT_PKT 0x04 +#define HCI_ISODATA_PKT 0x05 #define HCI_DIAG_PKT 0xf0 #define HCI_VENDOR_PKT 0xff @@ -352,6 +354,15 @@ enum { #define ACL_ACTIVE_BCAST 0x04 #define ACL_PICO_BCAST 0x08 +/* ISO PB flags */ +#define ISO_START 0x00 +#define ISO_CONT 0x01 +#define ISO_SINGLE 0x02 +#define ISO_END 0x03 + +/* ISO TS flags */ +#define ISO_TS 0x01 + /* Baseband links */ #define SCO_LINK 0x00 #define ACL_LINK 0x01 @@ -359,6 +370,7 @@ enum { /* Low Energy links do not have defined link type. Use invented one */ #define LE_LINK 0x80 #define AMP_LINK 0x81 +#define ISO_LINK 0x82 #define INVALID_LINK 0xff /* LMP features */ @@ -440,6 +452,8 @@ enum { #define HCI_LE_PHY_2M 0x01 #define HCI_LE_PHY_CODED 0x08 #define HCI_LE_CHAN_SEL_ALG2 0x40 +#define HCI_LE_CIS_MASTER 0x10 +#define HCI_LE_CIS_SLAVE 0x20 /* Connection modes */ #define HCI_CM_ACTIVE 0x0000 @@ -1718,6 +1732,86 @@ struct hci_cp_le_set_adv_set_rand_addr { bdaddr_t bdaddr; } __packed; +#define HCI_OP_LE_READ_BUFFER_SIZE_V2 0x2060 +struct hci_rp_le_read_buffer_size_v2 { + __u8 status; + __le16 acl_mtu; + __u8 acl_max_pkt; + __le16 iso_mtu; + __u8 iso_max_pkt; +} __packed; + +#define HCI_OP_LE_READ_ISO_TX_SYNC 0x2061 +struct hci_cp_le_read_iso_tx_sync { + __le16 handle; +} __packed; + +struct hci_rp_le_read_iso_tx_sync { + __u8 status; + __le16 handle; + __le16 seq; + __le32 imestamp; + __u8 offset[3]; +} __packed; + +#define HCI_OP_LE_SET_CIG_PARAMS 0x2062 +struct hci_cis_params { + __u8 cis_id; + __le16 m_sdu; + __le16 s_sdu; + __u8 m_phy; + __u8 s_phy; + __u8 m_rtn; + __u8 s_rtn; +} __packed; + +struct hci_cp_le_set_cig_params { + __u8 cig_id; + __u8 m_interval[3]; + __u8 s_interval[3]; + __u8 sca; + __u8 packing; + __u8 framing; + __le16 m_latency; + __le16 s_latency; + __u8 num_cis; + struct hci_cis_params cis[0]; +} __packed; + +struct hci_rp_le_set_cig_params { + __u8 status; + __u8 cig_id; + __u8 num_handles; + __le16 handle[0]; +} __packed; + +#define HCI_OP_LE_CREATE_CIS 0x2064 +struct hci_cis { + __le16 cis_handle; + __le16 acl_handle; +} __packed; + +struct hci_cp_le_create_cis { + __u8 num_cis; + struct hci_cis cis[0]; +} __packed; + +#define HCI_OP_LE_REMOVE_CIG 0x2065 +struct hci_cp_le_remove_cig { + __u8 cig_id; +} __packed; + +#define HCI_OP_LE_ACCEPT_CIS 0x2066 +struct hci_cp_le_accept_cis { + __le16 handle; +} __packed; + +#define HCI_OP_LE_REJECT_CIS 0x2067 +struct hci_cp_le_reject_cis { + __le16 handle; + __u8 reason; +} __packed; + /* ---- HCI Events ---- */ #define HCI_EV_INQUIRY_COMPLETE 0x01 @@ -2186,6 +2280,14 @@ struct hci_ev_le_direct_adv_info { __s8 rssi; } __packed; +#define HCI_EV_LE_PHY_UPDATE_COMPLETE 0x0c +struct hci_ev_le_phy_update_complete { + __u8 status; + __le16 handle; + __u8 tx_phy; + __u8 rx_phy; +} __packed; + #define HCI_EV_LE_EXT_ADV_REPORT 0x0d struct hci_ev_le_ext_adv_report { __le16 evt_type; @@ -2226,6 +2328,34 @@ struct hci_evt_le_ext_adv_set_term { __u8 num_evts; } __packed; +#define HCI_EVT_LE_CIS_ESTABLISHED 0x19 +struct hci_evt_le_cis_established { + __u8 status; + __le16 handle; + __u8 cig_sync_delay[3]; + __u8 cis_sync_delay[3]; + __u8 m_latency[3]; + __u8 s_latency[3]; + __u8 m_phy; + __u8 s_phy; + __u8 nse; + __u8 m_bn; + __u8 s_bn; + __u8 m_ft; + __u8 s_ft; + __le16 m_mtu; + __le16 s_mtu; + __le16 interval; +} __packed; + +#define HCI_EVT_LE_CIS_REQ 0x1a +struct hci_evt_le_cis_req { + __le16 acl_handle; + __le16 cis_handle; + __u8 cig_id; + __u8 cis_id; +} __packed; + #define HCI_EV_VENDOR 0xff /* Internal events generated by Bluetooth stack */ @@ -2254,6 +2384,7 @@ struct hci_ev_si_security { #define HCI_EVENT_HDR_SIZE 2 #define HCI_ACL_HDR_SIZE 4 #define HCI_SCO_HDR_SIZE 3 +#define HCI_ISO_HDR_SIZE 4 struct hci_command_hdr { __le16 opcode; /* OCF & OGF */ @@ -2275,6 +2406,30 @@ struct hci_sco_hdr { __u8 dlen; } __packed; +struct hci_iso_hdr { + __le16 handle; + __le16 dlen; + __u8 data[0]; +} __packed; + +/* ISO data packet status flags */ +#define HCI_ISO_STATUS_VALID 0x00 +#define HCI_ISO_STATUS_INVALID 0x01 +#define HCI_ISO_STATUS_NOP 0x02 + +#define HCI_ISO_DATA_HDR_SIZE 4 +struct hci_iso_data_hdr { + __le16 sn; + __le16 slen; +}; + +#define HCI_ISO_TS_DATA_HDR_SIZE 8 +struct hci_iso_ts_data_hdr { + __le32 ts; + __le16 sn; + __le16 slen; +}; + static inline struct hci_event_hdr *hci_event_hdr(const struct sk_buff *skb) { return (struct hci_event_hdr *) skb->data; @@ -2300,4 +2455,14 @@ static inline struct hci_sco_hdr *hci_sco_hdr(const struct sk_buff *skb) #define hci_handle(h) (h & 0x0fff) #define hci_flags(h) (h >> 12) +/* ISO handle and flags pack/unpack */ +#define hci_iso_flags_pb(f) (f & 0x0003) +#define hci_iso_flags_ts(f) ((f >> 2) & 0x0001) +#define hci_iso_flags_pack(pb, ts) ((pb & 0x03) | ((ts & 0x01) << 2)) + +/* ISO data length and flags pack/unpack */ +#define hci_iso_data_len_pack(h, f) ((__u16) ((h) | ((f) << 14))) +#define hci_iso_data_len(h) ((h) & 0x3fff) +#define hci_iso_data_flags(h) ((h) >> 14) + #endif /* __HCI_H */ diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h index b689aceb636b..89ecf0a80aa1 100644 --- a/include/net/bluetooth/hci_core.h +++ b/include/net/bluetooth/hci_core.h @@ -118,6 +118,13 @@ struct bt_uuid { u8 svc_hint; }; +struct blocked_key { + struct list_head list; + struct rcu_head rcu; + u8 type; + u8 val[16]; +}; + struct smp_csrk { bdaddr_t bdaddr; u8 bdaddr_type; @@ -397,6 +404,7 @@ struct hci_dev { struct list_head le_conn_params; struct list_head pend_le_conns; struct list_head pend_le_reports; + struct list_head blocked_keys; struct hci_dev_stats stat; @@ -493,6 +501,8 @@ struct hci_conn { __u16 le_supv_timeout; __u8 le_adv_data[HCI_MAX_AD_LENGTH]; __u8 le_adv_data_len; + __u8 le_tx_phy; + __u8 le_rx_phy; __s8 rssi; __s8 tx_power; __s8 max_tx_power; @@ -1121,6 +1131,8 @@ struct smp_irk *hci_find_irk_by_addr(struct hci_dev *hdev, bdaddr_t *bdaddr, struct smp_irk *hci_add_irk(struct hci_dev *hdev, bdaddr_t *bdaddr, u8 addr_type, u8 val[16], bdaddr_t *rpa); void hci_remove_irk(struct hci_dev *hdev, bdaddr_t *bdaddr, u8 addr_type); +bool hci_is_blocked_key(struct hci_dev *hdev, u8 type, u8 val[16]); +void hci_blocked_keys_clear(struct hci_dev *hdev); void hci_smp_irks_clear(struct hci_dev *hdev); bool hci_bdaddr_is_paired(struct hci_dev *hdev, bdaddr_t *bdaddr, u8 type); diff --git a/include/net/bluetooth/hci_mon.h b/include/net/bluetooth/hci_mon.h index 240786b04a46..2d5fcda1bcd0 100644 --- a/include/net/bluetooth/hci_mon.h +++ b/include/net/bluetooth/hci_mon.h @@ -49,6 +49,8 @@ struct hci_mon_hdr { #define HCI_MON_CTRL_CLOSE 15 #define HCI_MON_CTRL_COMMAND 16 #define HCI_MON_CTRL_EVENT 17 +#define HCI_MON_ISO_TX_PKT 18 +#define HCI_MON_ISO_RX_PKT 19 struct hci_mon_new_index { __u8 type; diff --git a/include/net/bluetooth/mgmt.h b/include/net/bluetooth/mgmt.h index 9cee7ddc6741..a90666af05bd 100644 --- a/include/net/bluetooth/mgmt.h +++ b/include/net/bluetooth/mgmt.h @@ -654,6 +654,23 @@ struct mgmt_cp_set_phy_confguration { } __packed; #define MGMT_SET_PHY_CONFIGURATION_SIZE 4 +#define MGMT_OP_SET_BLOCKED_KEYS 0x0046 + +#define HCI_BLOCKED_KEY_TYPE_LINKKEY 0x00 +#define HCI_BLOCKED_KEY_TYPE_LTK 0x01 +#define HCI_BLOCKED_KEY_TYPE_IRK 0x02 + +struct mgmt_blocked_key_info { + __u8 type; + __u8 val[16]; +} __packed; + +struct mgmt_cp_set_blocked_keys { + __le16 key_count; + struct mgmt_blocked_key_info keys[0]; +} __packed; +#define MGMT_OP_SET_BLOCKED_KEYS_SIZE 2 + #define MGMT_EV_CMD_COMPLETE 0x0001 struct mgmt_ev_cmd_complete { __le16 opcode; diff --git a/include/net/devlink.h b/include/net/devlink.h index 5e46c24bb6e6..ce5cea428fdc 100644 --- a/include/net/devlink.h +++ b/include/net/devlink.h @@ -487,6 +487,8 @@ enum devlink_param_generic_id { #define DEVLINK_INFO_VERSION_GENERIC_FW_NCSI "fw.ncsi" /* FW parameter set id */ #define DEVLINK_INFO_VERSION_GENERIC_FW_PSID "fw.psid" +/* RoCE FW version */ +#define DEVLINK_INFO_VERSION_GENERIC_FW_ROCE "fw.roce" struct devlink_region; struct devlink_info_req; diff --git a/include/net/ipv6.h b/include/net/ipv6.h index 4e95f6df508c..cec1a54401f2 100644 --- a/include/net/ipv6.h +++ b/include/net/ipv6.h @@ -1113,6 +1113,9 @@ int inet6_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg); int inet6_hash_connect(struct inet_timewait_death_row *death_row, struct sock *sk); +int inet6_sendmsg(struct socket *sock, struct msghdr *msg, size_t size); +int inet6_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, + int flags); /* * reassembly.c diff --git a/include/net/mptcp.h b/include/net/mptcp.h index 0573ae75c3db..27627e2d1bc2 100644 --- a/include/net/mptcp.h +++ b/include/net/mptcp.h @@ -9,6 +9,7 @@ #define __NET_MPTCP_H #include <linux/skbuff.h> +#include <linux/tcp.h> #include <linux/types.h> /* MPTCP sk_buff extension data */ @@ -22,12 +23,49 @@ struct mptcp_ext { data_fin:1, use_ack:1, ack64:1, - __unused:3; + mpc_map:1, + __unused:2; /* one byte hole */ }; +struct mptcp_out_options { +#if IS_ENABLED(CONFIG_MPTCP) + u16 suboptions; + u64 sndr_key; + u64 rcvr_key; + struct mptcp_ext ext_copy; +#endif +}; + #ifdef CONFIG_MPTCP +void mptcp_init(void); + +static inline bool sk_is_mptcp(const struct sock *sk) +{ + return tcp_sk(sk)->is_mptcp; +} + +static inline bool rsk_is_mptcp(const struct request_sock *req) +{ + return tcp_rsk(req)->is_mptcp; +} + +void mptcp_parse_option(const struct sk_buff *skb, const unsigned char *ptr, + int opsize, struct tcp_options_received *opt_rx); +bool mptcp_syn_options(struct sock *sk, const struct sk_buff *skb, + unsigned int *size, struct mptcp_out_options *opts); +void mptcp_rcv_synsent(struct sock *sk); +bool mptcp_synack_options(const struct request_sock *req, unsigned int *size, + struct mptcp_out_options *opts); +bool mptcp_established_options(struct sock *sk, struct sk_buff *skb, + unsigned int *size, unsigned int remaining, + struct mptcp_out_options *opts); +void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb, + struct tcp_options_received *opt_rx); + +void mptcp_write_options(__be32 *ptr, struct mptcp_out_options *opts); + /* move the skb extension owership, with the assumption that 'to' is * newly allocated */ @@ -70,6 +108,59 @@ static inline bool mptcp_skb_can_collapse(const struct sk_buff *to, #else +static inline void mptcp_init(void) +{ +} + +static inline bool sk_is_mptcp(const struct sock *sk) +{ + return false; +} + +static inline bool rsk_is_mptcp(const struct request_sock *req) +{ + return false; +} + +static inline void mptcp_parse_option(const struct sk_buff *skb, + const unsigned char *ptr, int opsize, + struct tcp_options_received *opt_rx) +{ +} + +static inline bool mptcp_syn_options(struct sock *sk, const struct sk_buff *skb, + unsigned int *size, + struct mptcp_out_options *opts) +{ + return false; +} + +static inline void mptcp_rcv_synsent(struct sock *sk) +{ +} + +static inline bool mptcp_synack_options(const struct request_sock *req, + unsigned int *size, + struct mptcp_out_options *opts) +{ + return false; +} + +static inline bool mptcp_established_options(struct sock *sk, + struct sk_buff *skb, + unsigned int *size, + unsigned int remaining, + struct mptcp_out_options *opts) +{ + return false; +} + +static inline void mptcp_incoming_options(struct sock *sk, + struct sk_buff *skb, + struct tcp_options_received *opt_rx) +{ +} + static inline void mptcp_skb_ext_move(struct sk_buff *to, const struct sk_buff *from) { @@ -82,4 +173,16 @@ static inline bool mptcp_skb_can_collapse(const struct sk_buff *to, } #endif /* CONFIG_MPTCP */ + +void mptcp_handle_ipv6_mapped(struct sock *sk, bool mapped); + +#if IS_ENABLED(CONFIG_MPTCP_IPV6) +int mptcpv6_init(void); +#elif IS_ENABLED(CONFIG_IPV6) +static inline int mptcpv6_init(void) +{ + return 0; +} +#endif + #endif /* __NET_MPTCP_H */ diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index fe7c50acc681..4170c033d461 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -231,6 +231,7 @@ struct nft_userdata { * struct nft_set_elem - generic representation of set elements * * @key: element key + * @key_end: closing element key * @priv: element private data and extensions */ struct nft_set_elem { @@ -238,6 +239,10 @@ struct nft_set_elem { u32 buf[NFT_DATA_VALUE_MAXLEN / sizeof(u32)]; struct nft_data val; } key; + union { + u32 buf[NFT_DATA_VALUE_MAXLEN / sizeof(u32)]; + struct nft_data val; + } key_end; void *priv; }; @@ -259,11 +264,15 @@ struct nft_set_iter { * @klen: key length * @dlen: data length * @size: number of set elements + * @field_len: length of each field in concatenation, bytes + * @field_count: number of concatenated fields in element */ struct nft_set_desc { unsigned int klen; unsigned int dlen; unsigned int size; + u8 field_len[NFT_REG32_COUNT]; + u8 field_count; }; /** @@ -404,6 +413,8 @@ void nft_unregister_set(struct nft_set_type *type); * @dtype: data type (verdict or numeric type defined by userspace) * @objtype: object type (see NFT_OBJECT_* definitions) * @size: maximum set size + * @field_len: length of each field in concatenation, bytes + * @field_count: number of concatenated fields in element * @use: number of rules references to this set * @nelems: number of elements * @ndeact: number of deactivated elements queued for removal @@ -430,6 +441,8 @@ struct nft_set { u32 dtype; u32 objtype; u32 size; + u8 field_len[NFT_REG32_COUNT]; + u8 field_count; u32 use; atomic_t nelems; u32 ndeact; @@ -502,6 +515,7 @@ void nf_tables_destroy_set(const struct nft_ctx *ctx, struct nft_set *set); * enum nft_set_extensions - set extension type IDs * * @NFT_SET_EXT_KEY: element key + * @NFT_SET_EXT_KEY_END: upper bound element key, for ranges * @NFT_SET_EXT_DATA: mapping data * @NFT_SET_EXT_FLAGS: element flags * @NFT_SET_EXT_TIMEOUT: element timeout @@ -513,6 +527,7 @@ void nf_tables_destroy_set(const struct nft_ctx *ctx, struct nft_set *set); */ enum nft_set_extensions { NFT_SET_EXT_KEY, + NFT_SET_EXT_KEY_END, NFT_SET_EXT_DATA, NFT_SET_EXT_FLAGS, NFT_SET_EXT_TIMEOUT, @@ -606,6 +621,11 @@ static inline struct nft_data *nft_set_ext_key(const struct nft_set_ext *ext) return nft_set_ext(ext, NFT_SET_EXT_KEY); } +static inline struct nft_data *nft_set_ext_key_end(const struct nft_set_ext *ext) +{ + return nft_set_ext(ext, NFT_SET_EXT_KEY_END); +} + static inline struct nft_data *nft_set_ext_data(const struct nft_set_ext *ext) { return nft_set_ext(ext, NFT_SET_EXT_DATA); @@ -655,7 +675,7 @@ static inline struct nft_object **nft_set_ext_obj(const struct nft_set_ext *ext) void *nft_set_elem_init(const struct nft_set *set, const struct nft_set_ext_tmpl *tmpl, - const u32 *key, const u32 *data, + const u32 *key, const u32 *key_end, const u32 *data, u64 timeout, u64 expiration, gfp_t gfp); void nft_set_elem_destroy(const struct nft_set *set, void *elem, bool destroy_expr); diff --git a/include/net/netfilter/nf_tables_core.h b/include/net/netfilter/nf_tables_core.h index 2656155b4069..29e7e1021267 100644 --- a/include/net/netfilter/nf_tables_core.h +++ b/include/net/netfilter/nf_tables_core.h @@ -74,6 +74,7 @@ extern struct nft_set_type nft_set_hash_type; extern struct nft_set_type nft_set_hash_fast_type; extern struct nft_set_type nft_set_rbtree_type; extern struct nft_set_type nft_set_bitmap_type; +extern struct nft_set_type nft_set_pipapo_type; struct nft_expr; struct nft_regs; diff --git a/include/net/netns/nftables.h b/include/net/netns/nftables.h index 286fd960896f..a1a8d45adb42 100644 --- a/include/net/netns/nftables.h +++ b/include/net/netns/nftables.h @@ -7,6 +7,7 @@ struct netns_nftables { struct list_head tables; struct list_head commit_list; + struct list_head module_list; struct mutex commit_mutex; unsigned int base_seq; u8 gencursor; diff --git a/include/net/pie.h b/include/net/pie.h new file mode 100644 index 000000000000..fd5a37cb7993 --- /dev/null +++ b/include/net/pie.h @@ -0,0 +1,138 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef __NET_SCHED_PIE_H +#define __NET_SCHED_PIE_H + +#include <linux/ktime.h> +#include <linux/skbuff.h> +#include <linux/types.h> +#include <net/inet_ecn.h> +#include <net/pkt_sched.h> + +#define MAX_PROB U64_MAX +#define DTIME_INVALID U64_MAX +#define QUEUE_THRESHOLD 16384 +#define DQCOUNT_INVALID -1 +#define PIE_SCALE 8 + +/** + * struct pie_params - contains pie parameters + * @target: target delay in pschedtime + * @tudpate: interval at which drop probability is calculated + * @limit: total number of packets that can be in the queue + * @alpha: parameter to control drop probability + * @beta: parameter to control drop probability + * @ecn: is ECN marking of packets enabled + * @bytemode: is drop probability scaled based on pkt size + * @dq_rate_estimator: is Little's law used for qdelay calculation + */ +struct pie_params { + psched_time_t target; + u32 tupdate; + u32 limit; + u32 alpha; + u32 beta; + u8 ecn; + u8 bytemode; + u8 dq_rate_estimator; +}; + +/** + * struct pie_vars - contains pie variables + * @qdelay: current queue delay + * @qdelay_old: queue delay in previous qdelay calculation + * @burst_time: burst time allowance + * @dq_tstamp: timestamp at which dq rate was last calculated + * @prob: drop probability + * @accu_prob: accumulated drop probability + * @dq_count: number of bytes dequeued in a measurement cycle + * @avg_dq_rate: calculated average dq rate + * @qlen_old: queue length during previous qdelay calculation + * @accu_prob_overflows: number of times accu_prob overflows + */ +struct pie_vars { + psched_time_t qdelay; + psched_time_t qdelay_old; + psched_time_t burst_time; + psched_time_t dq_tstamp; + u64 prob; + u64 accu_prob; + u64 dq_count; + u32 avg_dq_rate; + u32 qlen_old; + u8 accu_prob_overflows; +}; + +/** + * struct pie_stats - contains pie stats + * @packets_in: total number of packets enqueued + * @dropped: packets dropped due to pie action + * @overlimit: packets dropped due to lack of space in queue + * @ecn_mark: packets marked with ECN + * @maxq: maximum queue size + */ +struct pie_stats { + u32 packets_in; + u32 dropped; + u32 overlimit; + u32 ecn_mark; + u32 maxq; +}; + +/** + * struct pie_skb_cb - contains private skb vars + * @enqueue_time: timestamp when the packet is enqueued + * @mem_usage: size of the skb during enqueue + */ +struct pie_skb_cb { + psched_time_t enqueue_time; + u32 mem_usage; +}; + +static inline void pie_params_init(struct pie_params *params) +{ + params->target = PSCHED_NS2TICKS(15 * NSEC_PER_MSEC); /* 15 ms */ + params->tupdate = usecs_to_jiffies(15 * USEC_PER_MSEC); /* 15 ms */ + params->limit = 1000; + params->alpha = 2; + params->beta = 20; + params->ecn = false; + params->bytemode = false; + params->dq_rate_estimator = false; +} + +static inline void pie_vars_init(struct pie_vars *vars) +{ + vars->burst_time = PSCHED_NS2TICKS(150 * NSEC_PER_MSEC); /* 150 ms */ + vars->dq_tstamp = DTIME_INVALID; + vars->accu_prob = 0; + vars->dq_count = DQCOUNT_INVALID; + vars->avg_dq_rate = 0; + vars->accu_prob_overflows = 0; +} + +static inline struct pie_skb_cb *get_pie_cb(const struct sk_buff *skb) +{ + qdisc_cb_private_validate(skb, sizeof(struct pie_skb_cb)); + return (struct pie_skb_cb *)qdisc_skb_cb(skb)->data; +} + +static inline psched_time_t pie_get_enqueue_time(const struct sk_buff *skb) +{ + return get_pie_cb(skb)->enqueue_time; +} + +static inline void pie_set_enqueue_time(struct sk_buff *skb) +{ + get_pie_cb(skb)->enqueue_time = psched_get_time(); +} + +bool pie_drop_early(struct Qdisc *sch, struct pie_params *params, + struct pie_vars *vars, u32 qlen, u32 packet_size); + +void pie_process_dequeue(struct sk_buff *skb, struct pie_params *params, + struct pie_vars *vars, u32 qlen); + +void pie_calculate_probability(struct pie_params *params, struct pie_vars *vars, + u32 qlen); + +#endif diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h index 47b115e2012a..a972244ab193 100644 --- a/include/net/pkt_cls.h +++ b/include/net/pkt_cls.h @@ -141,31 +141,38 @@ __cls_set_class(unsigned long *clp, unsigned long cl) return xchg(clp, cl); } -static inline unsigned long -cls_set_class(struct Qdisc *q, unsigned long *clp, unsigned long cl) +static inline void +__tcf_bind_filter(struct Qdisc *q, struct tcf_result *r, unsigned long base) { - unsigned long old_cl; + unsigned long cl; - sch_tree_lock(q); - old_cl = __cls_set_class(clp, cl); - sch_tree_unlock(q); - return old_cl; + cl = q->ops->cl_ops->bind_tcf(q, base, r->classid); + cl = __cls_set_class(&r->class, cl); + if (cl) + q->ops->cl_ops->unbind_tcf(q, cl); } static inline void tcf_bind_filter(struct tcf_proto *tp, struct tcf_result *r, unsigned long base) { struct Qdisc *q = tp->chain->block->q; - unsigned long cl; /* Check q as it is not set for shared blocks. In that case, * setting class is not supported. */ if (!q) return; - cl = q->ops->cl_ops->bind_tcf(q, base, r->classid); - cl = cls_set_class(q, &r->class, cl); - if (cl) + sch_tree_lock(q); + __tcf_bind_filter(q, r, base); + sch_tree_unlock(q); +} + +static inline void +__tcf_unbind_filter(struct Qdisc *q, struct tcf_result *r) +{ + unsigned long cl; + + if ((cl = __cls_set_class(&r->class, 0)) != 0) q->ops->cl_ops->unbind_tcf(q, cl); } @@ -173,12 +180,10 @@ static inline void tcf_unbind_filter(struct tcf_proto *tp, struct tcf_result *r) { struct Qdisc *q = tp->chain->block->q; - unsigned long cl; if (!q) return; - if ((cl = __cls_set_class(&r->class, 0)) != 0) - q->ops->cl_ops->unbind_tcf(q, cl); + __tcf_unbind_filter(q, r); } struct tcf_exts { @@ -854,4 +859,26 @@ struct tc_ets_qopt_offload { }; }; +enum tc_tbf_command { + TC_TBF_REPLACE, + TC_TBF_DESTROY, + TC_TBF_STATS, +}; + +struct tc_tbf_qopt_offload_replace_params { + struct psched_ratecfg rate; + u32 max_size; + struct gnet_stats_queue *qstats; +}; + +struct tc_tbf_qopt_offload { + enum tc_tbf_command command; + u32 handle; + u32 parent; + union { + struct tc_tbf_qopt_offload_replace_params replace_params; + struct tc_qopt_offload_stats stats; + }; +}; + #endif diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index fceddf89592a..151208704ed2 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -318,7 +318,8 @@ struct tcf_proto_ops { void *type_data); void (*hw_del)(struct tcf_proto *tp, void *type_data); - void (*bind_class)(void *, u32, unsigned long); + void (*bind_class)(void *, u32, unsigned long, + void *, unsigned long); void * (*tmplt_create)(struct net *net, struct tcf_chain *chain, struct nlattr **tca, diff --git a/include/net/udp.h b/include/net/udp.h index bad74f780831..44e0e52b585c 100644 --- a/include/net/udp.h +++ b/include/net/udp.h @@ -167,7 +167,7 @@ typedef struct sock *(*udp_lookup_t)(struct sk_buff *skb, __be16 sport, __be16 dport); struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb, - struct udphdr *uh, udp_lookup_t lookup); + struct udphdr *uh, struct sock *sk); int udp_gro_complete(struct sk_buff *skb, int nhoff, udp_lookup_t lookup); struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb, diff --git a/include/trace/events/xen.h b/include/trace/events/xen.h index 9a0e8af21310..a5ccfa67bc5c 100644 --- a/include/trace/events/xen.h +++ b/include/trace/events/xen.h @@ -66,7 +66,11 @@ TRACE_EVENT(xen_mc_callback, TP_PROTO(xen_mc_callback_fn_t fn, void *data), TP_ARGS(fn, data), TP_STRUCT__entry( - __field(xen_mc_callback_fn_t, fn) + /* + * Use field_struct to avoid is_signed_type() + * comparison of a function pointer. + */ + __field_struct(xen_mc_callback_fn_t, fn) __field(void *, data) ), TP_fast_assign( diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h index 116bcbf09c74..4295ebfa2f91 100644 --- a/include/uapi/linux/ethtool.h +++ b/include/uapi/linux/ethtool.h @@ -594,6 +594,8 @@ struct ethtool_pauseparam { * @ETH_SS_PHY_STATS: Statistic names, for use with %ETHTOOL_GPHYSTATS * @ETH_SS_PHY_TUNABLES: PHY tunable names * @ETH_SS_LINK_MODES: link mode names + * @ETH_SS_MSG_CLASSES: debug message class names + * @ETH_SS_WOL_MODES: wake-on-lan modes */ enum ethtool_stringset { ETH_SS_TEST = 0, @@ -606,6 +608,8 @@ enum ethtool_stringset { ETH_SS_PHY_STATS, ETH_SS_PHY_TUNABLES, ETH_SS_LINK_MODES, + ETH_SS_MSG_CLASSES, + ETH_SS_WOL_MODES, /* add new constants above here */ ETH_SS_COUNT @@ -1693,6 +1697,8 @@ static inline int ethtool_validate_duplex(__u8 duplex) #define WAKE_MAGICSECURE (1 << 6) /* only meaningful if WAKE_MAGIC */ #define WAKE_FILTER (1 << 7) +#define WOL_MODE_COUNT 8 + /* L2-L4 network traffic flow types */ #define TCP_V4_FLOW 0x01 /* hash or spec (tcp_ip4_spec) */ #define UDP_V4_FLOW 0x02 /* hash or spec (udp_ip4_spec) */ diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index 02f82f42a889..7e0b460f872c 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -20,6 +20,10 @@ enum { ETHTOOL_MSG_LINKMODES_GET, ETHTOOL_MSG_LINKMODES_SET, ETHTOOL_MSG_LINKSTATE_GET, + ETHTOOL_MSG_DEBUG_GET, + ETHTOOL_MSG_DEBUG_SET, + ETHTOOL_MSG_WOL_GET, + ETHTOOL_MSG_WOL_SET, /* add new constants above here */ __ETHTOOL_MSG_USER_CNT, @@ -35,6 +39,10 @@ enum { ETHTOOL_MSG_LINKMODES_GET_REPLY, ETHTOOL_MSG_LINKMODES_NTF, ETHTOOL_MSG_LINKSTATE_GET_REPLY, + ETHTOOL_MSG_DEBUG_GET_REPLY, + ETHTOOL_MSG_DEBUG_NTF, + ETHTOOL_MSG_WOL_GET_REPLY, + ETHTOOL_MSG_WOL_NTF, /* add new constants above here */ __ETHTOOL_MSG_KERNEL_CNT, @@ -195,6 +203,31 @@ enum { ETHTOOL_A_LINKSTATE_MAX = __ETHTOOL_A_LINKSTATE_CNT - 1 }; +/* DEBUG */ + +enum { + ETHTOOL_A_DEBUG_UNSPEC, + ETHTOOL_A_DEBUG_HEADER, /* nest - _A_HEADER_* */ + ETHTOOL_A_DEBUG_MSGMASK, /* bitset */ + + /* add new constants above here */ + __ETHTOOL_A_DEBUG_CNT, + ETHTOOL_A_DEBUG_MAX = __ETHTOOL_A_DEBUG_CNT - 1 +}; + +/* WOL */ + +enum { + ETHTOOL_A_WOL_UNSPEC, + ETHTOOL_A_WOL_HEADER, /* nest - _A_HEADER_* */ + ETHTOOL_A_WOL_MODES, /* bitset */ + ETHTOOL_A_WOL_SOPASS, /* binary */ + + /* add new constants above here */ + __ETHTOOL_A_WOL_CNT, + ETHTOOL_A_WOL_MAX = __ETHTOOL_A_WOL_CNT - 1 +}; + /* generic netlink info */ #define ETHTOOL_GENL_NAME "ethtool" #define ETHTOOL_GENL_VERSION 1 diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h index ac38f0b674b8..42f7ca38ad80 100644 --- a/include/uapi/linux/if_bridge.h +++ b/include/uapi/linux/if_bridge.h @@ -130,6 +130,7 @@ enum { #define BRIDGE_VLAN_INFO_RANGE_BEGIN (1<<3) /* VLAN is start of vlan range */ #define BRIDGE_VLAN_INFO_RANGE_END (1<<4) /* VLAN is end of vlan range */ #define BRIDGE_VLAN_INFO_BRENTRY (1<<5) /* Global bridge VLAN entry */ +#define BRIDGE_VLAN_INFO_ONLY_OPTS (1<<6) /* Skip create/delete/flags */ struct bridge_vlan_info { __u16 flags; @@ -190,6 +191,7 @@ enum { BRIDGE_VLANDB_ENTRY_UNSPEC, BRIDGE_VLANDB_ENTRY_INFO, BRIDGE_VLANDB_ENTRY_RANGE, + BRIDGE_VLANDB_ENTRY_STATE, __BRIDGE_VLANDB_ENTRY_MAX, }; #define BRIDGE_VLANDB_ENTRY_MAX (__BRIDGE_VLANDB_ENTRY_MAX - 1) diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index a3300e1b9a01..55cfcb71606d 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -178,7 +178,8 @@ struct io_uring_params { struct io_uring_files_update { __u32 offset; - __s32 *fds; + __u32 resv; + __aligned_u64 /* __s32 * */ fds; }; #endif diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h index 261864736b26..065218a20bb7 100644 --- a/include/uapi/linux/netfilter/nf_tables.h +++ b/include/uapi/linux/netfilter/nf_tables.h @@ -48,6 +48,7 @@ enum nft_registers { #define NFT_REG_SIZE 16 #define NFT_REG32_SIZE 4 +#define NFT_REG32_COUNT (NFT_REG32_15 - NFT_REG32_00 + 1) /** * enum nft_verdicts - nf_tables internal verdicts @@ -301,15 +302,29 @@ enum nft_set_policies { * enum nft_set_desc_attributes - set element description * * @NFTA_SET_DESC_SIZE: number of elements in set (NLA_U32) + * @NFTA_SET_DESC_CONCAT: description of field concatenation (NLA_NESTED) */ enum nft_set_desc_attributes { NFTA_SET_DESC_UNSPEC, NFTA_SET_DESC_SIZE, + NFTA_SET_DESC_CONCAT, __NFTA_SET_DESC_MAX }; #define NFTA_SET_DESC_MAX (__NFTA_SET_DESC_MAX - 1) /** + * enum nft_set_field_attributes - attributes of concatenated fields + * + * @NFTA_SET_FIELD_LEN: length of single field, in bits (NLA_U32) + */ +enum nft_set_field_attributes { + NFTA_SET_FIELD_UNSPEC, + NFTA_SET_FIELD_LEN, + __NFTA_SET_FIELD_MAX +}; +#define NFTA_SET_FIELD_MAX (__NFTA_SET_FIELD_MAX - 1) + +/** * enum nft_set_attributes - nf_tables set netlink attributes * * @NFTA_SET_TABLE: table name (NLA_STRING) @@ -370,6 +385,7 @@ enum nft_set_elem_flags { * @NFTA_SET_ELEM_USERDATA: user data (NLA_BINARY) * @NFTA_SET_ELEM_EXPR: expression (NLA_NESTED: nft_expr_attributes) * @NFTA_SET_ELEM_OBJREF: stateful object reference (NLA_STRING) + * @NFTA_SET_ELEM_KEY_END: closing key value (NLA_NESTED: nft_data) */ enum nft_set_elem_attributes { NFTA_SET_ELEM_UNSPEC, @@ -382,6 +398,7 @@ enum nft_set_elem_attributes { NFTA_SET_ELEM_EXPR, NFTA_SET_ELEM_PAD, NFTA_SET_ELEM_OBJREF, + NFTA_SET_ELEM_KEY_END, __NFTA_SET_ELEM_MAX }; #define NFTA_SET_ELEM_MAX (__NFTA_SET_ELEM_MAX - 1) diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h index bf5a5b1dfb0b..bbe791b24168 100644 --- a/include/uapi/linux/pkt_sched.h +++ b/include/uapi/linux/pkt_sched.h @@ -971,6 +971,37 @@ struct tc_pie_xstats { __u32 ecn_mark; /* packets marked with ecn*/ }; +/* FQ PIE */ +enum { + TCA_FQ_PIE_UNSPEC, + TCA_FQ_PIE_LIMIT, + TCA_FQ_PIE_FLOWS, + TCA_FQ_PIE_TARGET, + TCA_FQ_PIE_TUPDATE, + TCA_FQ_PIE_ALPHA, + TCA_FQ_PIE_BETA, + TCA_FQ_PIE_QUANTUM, + TCA_FQ_PIE_MEMORY_LIMIT, + TCA_FQ_PIE_ECN_PROB, + TCA_FQ_PIE_ECN, + TCA_FQ_PIE_BYTEMODE, + TCA_FQ_PIE_DQ_RATE_ESTIMATOR, + __TCA_FQ_PIE_MAX +}; +#define TCA_FQ_PIE_MAX (__TCA_FQ_PIE_MAX - 1) + +struct tc_fq_pie_xstats { + __u32 packets_in; /* total number of packets enqueued */ + __u32 dropped; /* packets dropped due to fq_pie_action */ + __u32 overlimit; /* dropped due to lack of space in queue */ + __u32 overmemory; /* dropped due to lack of memory in queue */ + __u32 ecn_mark; /* packets marked with ecn */ + __u32 new_flow_count; /* count of new flows created by packets */ + __u32 new_flows_len; /* count of flows in new list */ + __u32 old_flows_len; /* count of flows in old list */ + __u32 memory_usage; /* total memory across all queues */ +}; + /* CBS */ struct tc_cbs_qopt { __u8 offload; diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h index 7eee233e78d2..7d91f4debc48 100644 --- a/include/uapi/linux/snmp.h +++ b/include/uapi/linux/snmp.h @@ -285,6 +285,8 @@ enum LINUX_MIB_TCPRCVQDROP, /* TCPRcvQDrop */ LINUX_MIB_TCPWQUEUETOOBIG, /* TCPWqueueTooBig */ LINUX_MIB_TCPFASTOPENPASSIVEALTKEY, /* TCPFastOpenPassiveAltKey */ + LINUX_MIB_TCPTIMEOUTREHASH, /* TCPTimeoutRehash */ + LINUX_MIB_TCPDUPLICATEDATAREHASH, /* TCPDuplicateDataRehash */ __LINUX_MIB_MAX }; diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h index d87184e673ca..fd9eb8f6bcae 100644 --- a/include/uapi/linux/tcp.h +++ b/include/uapi/linux/tcp.h @@ -311,6 +311,7 @@ enum { TCP_NLA_DSACK_DUPS, /* DSACK blocks received */ TCP_NLA_REORD_SEEN, /* reordering events seen */ TCP_NLA_SRTT, /* smoothed RTT in usecs */ + TCP_NLA_TIMEOUT_REHASH, /* Timeout-triggered rehash attempts */ }; /* for TCP_MD5SIG socket option */ diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c index 26b9168321e7..d65f2d5ab694 100644 --- a/kernel/power/snapshot.c +++ b/kernel/power/snapshot.c @@ -1147,24 +1147,24 @@ void free_basic_memory_bitmaps(void) void clear_free_pages(void) { -#ifdef CONFIG_PAGE_POISONING_ZERO struct memory_bitmap *bm = free_pages_map; unsigned long pfn; if (WARN_ON(!(free_pages_map))) return; - memory_bm_position_reset(bm); - pfn = memory_bm_next_pfn(bm); - while (pfn != BM_END_OF_MAP) { - if (pfn_valid(pfn)) - clear_highpage(pfn_to_page(pfn)); - + if (IS_ENABLED(CONFIG_PAGE_POISONING_ZERO) || want_init_on_free()) { + memory_bm_position_reset(bm); pfn = memory_bm_next_pfn(bm); + while (pfn != BM_END_OF_MAP) { + if (pfn_valid(pfn)) + clear_highpage(pfn_to_page(pfn)); + + pfn = memory_bm_next_pfn(bm); + } + memory_bm_position_reset(bm); + pr_info("free pages cleared after restore\n"); } - memory_bm_position_reset(bm); - pr_info("free pages cleared after restore\n"); -#endif /* PAGE_POISONING_ZERO */ } /** diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index ddb7e7f5fe8d..5b6ee4aadc26 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -9420,6 +9420,11 @@ __init static int tracing_set_default_clock(void) { /* sched_clock_stable() is determined in late_initcall */ if (!trace_boot_clock && !sched_clock_stable()) { + if (security_locked_down(LOCKDOWN_TRACEFS)) { + pr_warn("Can not set tracing clock due to lockdown\n"); + return -EPERM; + } + printk(KERN_WARNING "Unstable clock detected, switching default tracing clock to \"global\"\n" "If you want to keep using the local clock, then add:\n" diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c index f62de5f43e79..6ac35b9e195d 100644 --- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -116,6 +116,7 @@ struct hist_field { struct ftrace_event_field *field; unsigned long flags; hist_field_fn_t fn; + unsigned int ref; unsigned int size; unsigned int offset; unsigned int is_signed; @@ -1766,11 +1767,13 @@ static struct hist_field *find_var(struct hist_trigger_data *hist_data, struct event_trigger_data *test; struct hist_field *hist_field; + lockdep_assert_held(&event_mutex); + hist_field = find_var_field(hist_data, var_name); if (hist_field) return hist_field; - list_for_each_entry_rcu(test, &file->triggers, list) { + list_for_each_entry(test, &file->triggers, list) { if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) { test_data = test->private_data; hist_field = find_var_field(test_data, var_name); @@ -1820,7 +1823,9 @@ static struct hist_field *find_file_var(struct trace_event_file *file, struct event_trigger_data *test; struct hist_field *hist_field; - list_for_each_entry_rcu(test, &file->triggers, list) { + lockdep_assert_held(&event_mutex); + + list_for_each_entry(test, &file->triggers, list) { if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) { test_data = test->private_data; hist_field = find_var_field(test_data, var_name); @@ -2423,8 +2428,16 @@ static int contains_operator(char *str) return field_op; } +static void get_hist_field(struct hist_field *hist_field) +{ + hist_field->ref++; +} + static void __destroy_hist_field(struct hist_field *hist_field) { + if (--hist_field->ref > 1) + return; + kfree(hist_field->var.name); kfree(hist_field->name); kfree(hist_field->type); @@ -2466,6 +2479,8 @@ static struct hist_field *create_hist_field(struct hist_trigger_data *hist_data, if (!hist_field) return NULL; + hist_field->ref = 1; + hist_field->hist_data = hist_data; if (flags & HIST_FIELD_FL_EXPR || flags & HIST_FIELD_FL_ALIAS) @@ -2661,6 +2676,17 @@ static struct hist_field *create_var_ref(struct hist_trigger_data *hist_data, { unsigned long flags = HIST_FIELD_FL_VAR_REF; struct hist_field *ref_field; + int i; + + /* Check if the variable already exists */ + for (i = 0; i < hist_data->n_var_refs; i++) { + ref_field = hist_data->var_refs[i]; + if (ref_field->var.idx == var_field->var.idx && + ref_field->var.hist_data == var_field->hist_data) { + get_hist_field(ref_field); + return ref_field; + } + } ref_field = create_hist_field(var_field->hist_data, NULL, flags, NULL); if (ref_field) { @@ -3115,7 +3141,9 @@ static char *find_trigger_filter(struct hist_trigger_data *hist_data, { struct event_trigger_data *test; - list_for_each_entry_rcu(test, &file->triggers, list) { + lockdep_assert_held(&event_mutex); + + list_for_each_entry(test, &file->triggers, list) { if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) { if (test->private_data == hist_data) return test->filter_str; @@ -3166,9 +3194,11 @@ find_compatible_hist(struct hist_trigger_data *target_hist_data, struct event_trigger_data *test; unsigned int n_keys; + lockdep_assert_held(&event_mutex); + n_keys = target_hist_data->n_fields - target_hist_data->n_vals; - list_for_each_entry_rcu(test, &file->triggers, list) { + list_for_each_entry(test, &file->triggers, list) { if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) { hist_data = test->private_data; @@ -5528,7 +5558,7 @@ static int hist_show(struct seq_file *m, void *v) goto out_unlock; } - list_for_each_entry_rcu(data, &event_file->triggers, list) { + list_for_each_entry(data, &event_file->triggers, list) { if (data->cmd_ops->trigger_type == ETT_EVENT_HIST) hist_trigger_show(m, data, n++); } @@ -5921,7 +5951,9 @@ static int hist_register_trigger(char *glob, struct event_trigger_ops *ops, if (hist_data->attrs->name && !named_data) goto new; - list_for_each_entry_rcu(test, &file->triggers, list) { + lockdep_assert_held(&event_mutex); + + list_for_each_entry(test, &file->triggers, list) { if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) { if (!hist_trigger_match(data, test, named_data, false)) continue; @@ -6005,10 +6037,12 @@ static bool have_hist_trigger_match(struct event_trigger_data *data, struct event_trigger_data *test, *named_data = NULL; bool match = false; + lockdep_assert_held(&event_mutex); + if (hist_data->attrs->name) named_data = find_named_trigger(hist_data->attrs->name); - list_for_each_entry_rcu(test, &file->triggers, list) { + list_for_each_entry(test, &file->triggers, list) { if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) { if (hist_trigger_match(data, test, named_data, false)) { match = true; @@ -6026,10 +6060,12 @@ static bool hist_trigger_check_refs(struct event_trigger_data *data, struct hist_trigger_data *hist_data = data->private_data; struct event_trigger_data *test, *named_data = NULL; + lockdep_assert_held(&event_mutex); + if (hist_data->attrs->name) named_data = find_named_trigger(hist_data->attrs->name); - list_for_each_entry_rcu(test, &file->triggers, list) { + list_for_each_entry(test, &file->triggers, list) { if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) { if (!hist_trigger_match(data, test, named_data, false)) continue; @@ -6051,10 +6087,12 @@ static void hist_unregister_trigger(char *glob, struct event_trigger_ops *ops, struct event_trigger_data *test, *named_data = NULL; bool unregistered = false; + lockdep_assert_held(&event_mutex); + if (hist_data->attrs->name) named_data = find_named_trigger(hist_data->attrs->name); - list_for_each_entry_rcu(test, &file->triggers, list) { + list_for_each_entry(test, &file->triggers, list) { if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) { if (!hist_trigger_match(data, test, named_data, false)) continue; @@ -6080,7 +6118,9 @@ static bool hist_file_check_refs(struct trace_event_file *file) struct hist_trigger_data *hist_data; struct event_trigger_data *test; - list_for_each_entry_rcu(test, &file->triggers, list) { + lockdep_assert_held(&event_mutex); + + list_for_each_entry(test, &file->triggers, list) { if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) { hist_data = test->private_data; if (check_var_refs(hist_data)) @@ -6323,7 +6363,8 @@ hist_enable_trigger(struct event_trigger_data *data, void *rec, struct enable_trigger_data *enable_data = data->private_data; struct event_trigger_data *test; - list_for_each_entry_rcu(test, &enable_data->file->triggers, list) { + list_for_each_entry_rcu(test, &enable_data->file->triggers, list, + lockdep_is_held(&event_mutex)) { if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) { if (enable_data->enable) test->paused = false; diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c index 2cd53ca21b51..40106fff06a4 100644 --- a/kernel/trace/trace_events_trigger.c +++ b/kernel/trace/trace_events_trigger.c @@ -501,7 +501,9 @@ void update_cond_flag(struct trace_event_file *file) struct event_trigger_data *data; bool set_cond = false; - list_for_each_entry_rcu(data, &file->triggers, list) { + lockdep_assert_held(&event_mutex); + + list_for_each_entry(data, &file->triggers, list) { if (data->filter || event_command_post_trigger(data->cmd_ops) || event_command_needs_rec(data->cmd_ops)) { set_cond = true; @@ -536,7 +538,9 @@ static int register_trigger(char *glob, struct event_trigger_ops *ops, struct event_trigger_data *test; int ret = 0; - list_for_each_entry_rcu(test, &file->triggers, list) { + lockdep_assert_held(&event_mutex); + + list_for_each_entry(test, &file->triggers, list) { if (test->cmd_ops->trigger_type == data->cmd_ops->trigger_type) { ret = -EEXIST; goto out; @@ -581,7 +585,9 @@ static void unregister_trigger(char *glob, struct event_trigger_ops *ops, struct event_trigger_data *data; bool unregistered = false; - list_for_each_entry_rcu(data, &file->triggers, list) { + lockdep_assert_held(&event_mutex); + + list_for_each_entry(data, &file->triggers, list) { if (data->cmd_ops->trigger_type == test->cmd_ops->trigger_type) { unregistered = true; list_del_rcu(&data->list); @@ -1497,7 +1503,9 @@ int event_enable_register_trigger(char *glob, struct event_trigger_data *test; int ret = 0; - list_for_each_entry_rcu(test, &file->triggers, list) { + lockdep_assert_held(&event_mutex); + + list_for_each_entry(test, &file->triggers, list) { test_enable_data = test->private_data; if (test_enable_data && (test->cmd_ops->trigger_type == @@ -1537,7 +1545,9 @@ void event_enable_unregister_trigger(char *glob, struct event_trigger_data *data; bool unregistered = false; - list_for_each_entry_rcu(data, &file->triggers, list) { + lockdep_assert_held(&event_mutex); + + list_for_each_entry(data, &file->triggers, list) { enable_data = data->private_data; if (enable_data && (data->cmd_ops->trigger_type == diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c index 7f890262c8a3..3f54dc2f6e1c 100644 --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -290,7 +290,7 @@ static struct trace_kprobe *alloc_trace_kprobe(const char *group, INIT_HLIST_NODE(&tk->rp.kp.hlist); INIT_LIST_HEAD(&tk->rp.kp.list); - ret = trace_probe_init(&tk->tp, event, group); + ret = trace_probe_init(&tk->tp, event, group, false); if (ret < 0) goto error; diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c index 905b10af5d5c..9ae87be422f2 100644 --- a/kernel/trace/trace_probe.c +++ b/kernel/trace/trace_probe.c @@ -984,15 +984,19 @@ void trace_probe_cleanup(struct trace_probe *tp) } int trace_probe_init(struct trace_probe *tp, const char *event, - const char *group) + const char *group, bool alloc_filter) { struct trace_event_call *call; + size_t size = sizeof(struct trace_probe_event); int ret = 0; if (!event || !group) return -EINVAL; - tp->event = kzalloc(sizeof(struct trace_probe_event), GFP_KERNEL); + if (alloc_filter) + size += sizeof(struct trace_uprobe_filter); + + tp->event = kzalloc(size, GFP_KERNEL); if (!tp->event) return -ENOMEM; diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h index 4ee703728aec..a0ff9e200ef6 100644 --- a/kernel/trace/trace_probe.h +++ b/kernel/trace/trace_probe.h @@ -223,6 +223,12 @@ struct probe_arg { const struct fetch_type *type; /* Type of this argument */ }; +struct trace_uprobe_filter { + rwlock_t rwlock; + int nr_systemwide; + struct list_head perf_events; +}; + /* Event call and class holder */ struct trace_probe_event { unsigned int flags; /* For TP_FLAG_* */ @@ -230,6 +236,7 @@ struct trace_probe_event { struct trace_event_call call; struct list_head files; struct list_head probes; + struct trace_uprobe_filter filter[0]; }; struct trace_probe { @@ -322,7 +329,7 @@ static inline bool trace_probe_has_single_file(struct trace_probe *tp) } int trace_probe_init(struct trace_probe *tp, const char *event, - const char *group); + const char *group, bool alloc_filter); void trace_probe_cleanup(struct trace_probe *tp); int trace_probe_append(struct trace_probe *tp, struct trace_probe *to); void trace_probe_unlink(struct trace_probe *tp); diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c index 352073d36585..2619bc5ed520 100644 --- a/kernel/trace/trace_uprobe.c +++ b/kernel/trace/trace_uprobe.c @@ -34,12 +34,6 @@ struct uprobe_trace_entry_head { #define DATAOF_TRACE_ENTRY(entry, is_return) \ ((void*)(entry) + SIZEOF_TRACE_ENTRY(is_return)) -struct trace_uprobe_filter { - rwlock_t rwlock; - int nr_systemwide; - struct list_head perf_events; -}; - static int trace_uprobe_create(int argc, const char **argv); static int trace_uprobe_show(struct seq_file *m, struct dyn_event *ev); static int trace_uprobe_release(struct dyn_event *ev); @@ -60,7 +54,6 @@ static struct dyn_event_operations trace_uprobe_ops = { */ struct trace_uprobe { struct dyn_event devent; - struct trace_uprobe_filter filter; struct uprobe_consumer consumer; struct path path; struct inode *inode; @@ -351,7 +344,7 @@ alloc_trace_uprobe(const char *group, const char *event, int nargs, bool is_ret) if (!tu) return ERR_PTR(-ENOMEM); - ret = trace_probe_init(&tu->tp, event, group); + ret = trace_probe_init(&tu->tp, event, group, true); if (ret < 0) goto error; @@ -359,7 +352,7 @@ alloc_trace_uprobe(const char *group, const char *event, int nargs, bool is_ret) tu->consumer.handler = uprobe_dispatcher; if (is_ret) tu->consumer.ret_handler = uretprobe_dispatcher; - init_trace_uprobe_filter(&tu->filter); + init_trace_uprobe_filter(tu->tp.event->filter); return tu; error: @@ -1067,13 +1060,14 @@ static void __probe_event_disable(struct trace_probe *tp) struct trace_probe *pos; struct trace_uprobe *tu; + tu = container_of(tp, struct trace_uprobe, tp); + WARN_ON(!uprobe_filter_is_empty(tu->tp.event->filter)); + list_for_each_entry(pos, trace_probe_probe_list(tp), list) { tu = container_of(pos, struct trace_uprobe, tp); if (!tu->inode) continue; - WARN_ON(!uprobe_filter_is_empty(&tu->filter)); - uprobe_unregister(tu->inode, tu->offset, &tu->consumer); tu->inode = NULL; } @@ -1108,7 +1102,7 @@ static int probe_event_enable(struct trace_event_call *call, } tu = container_of(tp, struct trace_uprobe, tp); - WARN_ON(!uprobe_filter_is_empty(&tu->filter)); + WARN_ON(!uprobe_filter_is_empty(tu->tp.event->filter)); if (enabled) return 0; @@ -1205,39 +1199,39 @@ __uprobe_perf_filter(struct trace_uprobe_filter *filter, struct mm_struct *mm) } static inline bool -uprobe_filter_event(struct trace_uprobe *tu, struct perf_event *event) +trace_uprobe_filter_event(struct trace_uprobe_filter *filter, + struct perf_event *event) { - return __uprobe_perf_filter(&tu->filter, event->hw.target->mm); + return __uprobe_perf_filter(filter, event->hw.target->mm); } -static int uprobe_perf_close(struct trace_uprobe *tu, struct perf_event *event) +static bool trace_uprobe_filter_remove(struct trace_uprobe_filter *filter, + struct perf_event *event) { bool done; - write_lock(&tu->filter.rwlock); + write_lock(&filter->rwlock); if (event->hw.target) { list_del(&event->hw.tp_list); - done = tu->filter.nr_systemwide || + done = filter->nr_systemwide || (event->hw.target->flags & PF_EXITING) || - uprobe_filter_event(tu, event); + trace_uprobe_filter_event(filter, event); } else { - tu->filter.nr_systemwide--; - done = tu->filter.nr_systemwide; + filter->nr_systemwide--; + done = filter->nr_systemwide; } - write_unlock(&tu->filter.rwlock); + write_unlock(&filter->rwlock); - if (!done) - return uprobe_apply(tu->inode, tu->offset, &tu->consumer, false); - - return 0; + return done; } -static int uprobe_perf_open(struct trace_uprobe *tu, struct perf_event *event) +/* This returns true if the filter always covers target mm */ +static bool trace_uprobe_filter_add(struct trace_uprobe_filter *filter, + struct perf_event *event) { bool done; - int err; - write_lock(&tu->filter.rwlock); + write_lock(&filter->rwlock); if (event->hw.target) { /* * event->parent != NULL means copy_process(), we can avoid @@ -1247,28 +1241,21 @@ static int uprobe_perf_open(struct trace_uprobe *tu, struct perf_event *event) * attr.enable_on_exec means that exec/mmap will install the * breakpoints we need. */ - done = tu->filter.nr_systemwide || + done = filter->nr_systemwide || event->parent || event->attr.enable_on_exec || - uprobe_filter_event(tu, event); - list_add(&event->hw.tp_list, &tu->filter.perf_events); + trace_uprobe_filter_event(filter, event); + list_add(&event->hw.tp_list, &filter->perf_events); } else { - done = tu->filter.nr_systemwide; - tu->filter.nr_systemwide++; + done = filter->nr_systemwide; + filter->nr_systemwide++; } - write_unlock(&tu->filter.rwlock); + write_unlock(&filter->rwlock); - err = 0; - if (!done) { - err = uprobe_apply(tu->inode, tu->offset, &tu->consumer, true); - if (err) - uprobe_perf_close(tu, event); - } - return err; + return done; } -static int uprobe_perf_multi_call(struct trace_event_call *call, - struct perf_event *event, - int (*op)(struct trace_uprobe *tu, struct perf_event *event)) +static int uprobe_perf_close(struct trace_event_call *call, + struct perf_event *event) { struct trace_probe *pos, *tp; struct trace_uprobe *tu; @@ -1278,25 +1265,59 @@ static int uprobe_perf_multi_call(struct trace_event_call *call, if (WARN_ON_ONCE(!tp)) return -ENODEV; + tu = container_of(tp, struct trace_uprobe, tp); + if (trace_uprobe_filter_remove(tu->tp.event->filter, event)) + return 0; + list_for_each_entry(pos, trace_probe_probe_list(tp), list) { tu = container_of(pos, struct trace_uprobe, tp); - ret = op(tu, event); + ret = uprobe_apply(tu->inode, tu->offset, &tu->consumer, false); if (ret) break; } return ret; } + +static int uprobe_perf_open(struct trace_event_call *call, + struct perf_event *event) +{ + struct trace_probe *pos, *tp; + struct trace_uprobe *tu; + int err = 0; + + tp = trace_probe_primary_from_call(call); + if (WARN_ON_ONCE(!tp)) + return -ENODEV; + + tu = container_of(tp, struct trace_uprobe, tp); + if (trace_uprobe_filter_add(tu->tp.event->filter, event)) + return 0; + + list_for_each_entry(pos, trace_probe_probe_list(tp), list) { + err = uprobe_apply(tu->inode, tu->offset, &tu->consumer, true); + if (err) { + uprobe_perf_close(call, event); + break; + } + } + + return err; +} + static bool uprobe_perf_filter(struct uprobe_consumer *uc, enum uprobe_filter_ctx ctx, struct mm_struct *mm) { + struct trace_uprobe_filter *filter; struct trace_uprobe *tu; int ret; tu = container_of(uc, struct trace_uprobe, consumer); - read_lock(&tu->filter.rwlock); - ret = __uprobe_perf_filter(&tu->filter, mm); - read_unlock(&tu->filter.rwlock); + filter = tu->tp.event->filter; + + read_lock(&filter->rwlock); + ret = __uprobe_perf_filter(filter, mm); + read_unlock(&filter->rwlock); return ret; } @@ -1419,10 +1440,10 @@ trace_uprobe_register(struct trace_event_call *event, enum trace_reg type, return 0; case TRACE_REG_PERF_OPEN: - return uprobe_perf_multi_call(event, data, uprobe_perf_open); + return uprobe_perf_open(event, data); case TRACE_REG_PERF_CLOSE: - return uprobe_perf_multi_call(event, data, uprobe_perf_close); + return uprobe_perf_close(event, data); #endif default: diff --git a/lib/bitmap.c b/lib/bitmap.c index 4250519d7d1c..6e175fbd69a9 100644 --- a/lib/bitmap.c +++ b/lib/bitmap.c @@ -168,6 +168,72 @@ void __bitmap_shift_left(unsigned long *dst, const unsigned long *src, } EXPORT_SYMBOL(__bitmap_shift_left); +/** + * bitmap_cut() - remove bit region from bitmap and right shift remaining bits + * @dst: destination bitmap, might overlap with src + * @src: source bitmap + * @first: start bit of region to be removed + * @cut: number of bits to remove + * @nbits: bitmap size, in bits + * + * Set the n-th bit of @dst iff the n-th bit of @src is set and + * n is less than @first, or the m-th bit of @src is set for any + * m such that @first <= n < nbits, and m = n + @cut. + * + * In pictures, example for a big-endian 32-bit architecture: + * + * @src: + * 31 63 + * | | + * 10000000 11000001 11110010 00010101 10000000 11000001 01110010 00010101 + * | | | | + * 16 14 0 32 + * + * if @cut is 3, and @first is 14, bits 14-16 in @src are cut and @dst is: + * + * 31 63 + * | | + * 10110000 00011000 00110010 00010101 00010000 00011000 00101110 01000010 + * | | | + * 14 (bit 17 0 32 + * from @src) + * + * Note that @dst and @src might overlap partially or entirely. + * + * This is implemented in the obvious way, with a shift and carry + * step for each moved bit. Optimisation is left as an exercise + * for the compiler. + */ +void bitmap_cut(unsigned long *dst, const unsigned long *src, + unsigned int first, unsigned int cut, unsigned int nbits) +{ + unsigned int len = BITS_TO_LONGS(nbits); + unsigned long keep = 0, carry; + int i; + + memmove(dst, src, len * sizeof(*dst)); + + if (first % BITS_PER_LONG) { + keep = src[first / BITS_PER_LONG] & + (~0UL >> (BITS_PER_LONG - first % BITS_PER_LONG)); + } + + while (cut--) { + for (i = first / BITS_PER_LONG; i < len; i++) { + if (i < len - 1) + carry = dst[i + 1] & 1UL; + else + carry = 0; + + dst[i] = (dst[i] >> 1) | (carry << (BITS_PER_LONG - 1)); + } + } + + dst[first / BITS_PER_LONG] &= ~0UL << (first % BITS_PER_LONG); + dst[first / BITS_PER_LONG] |= keep; +} +EXPORT_SYMBOL(bitmap_cut); + int __bitmap_and(unsigned long *dst, const unsigned long *bitmap1, const unsigned long *bitmap2, unsigned int bits) { diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c index dccb95af6003..706020b06617 100644 --- a/lib/strncpy_from_user.c +++ b/lib/strncpy_from_user.c @@ -30,13 +30,6 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src, const struct word_at_a_time constants = WORD_AT_A_TIME_CONSTANTS; unsigned long res = 0; - /* - * Truncate 'max' to the user-specified limit, so that - * we only have one limit we need to check in the loop - */ - if (max > count) - max = count; - if (IS_UNALIGNED(src, dst)) goto byte_at_a_time; @@ -114,6 +107,13 @@ long strncpy_from_user(char *dst, const char __user *src, long count) unsigned long max = max_addr - src_addr; long retval; + /* + * Truncate 'max' to the user-specified limit, so that + * we only have one limit we need to check in the loop + */ + if (max > count) + max = count; + kasan_check_write(dst, count); check_object_size(dst, count, false); if (user_access_begin(src, max)) { diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c index 6c0005d5dd5c..41670d4a5816 100644 --- a/lib/strnlen_user.c +++ b/lib/strnlen_user.c @@ -27,13 +27,6 @@ static inline long do_strnlen_user(const char __user *src, unsigned long count, unsigned long c; /* - * Truncate 'max' to the user-specified limit, so that - * we only have one limit we need to check in the loop - */ - if (max > count) - max = count; - - /* * Do everything aligned. But that means that we * need to also expand the maximum.. */ @@ -109,6 +102,13 @@ long strnlen_user(const char __user *str, long count) unsigned long max = max_addr - src_addr; long retval; + /* + * Truncate 'max' to the user-specified limit, so that + * we only have one limit we need to check in the loop + */ + if (max > count) + max = count; + if (user_access_begin(str, max)) { retval = do_strnlen_user(str, count, max); user_access_end(); diff --git a/lib/test_xarray.c b/lib/test_xarray.c index 7df4f7f395bf..55c14e8c8859 100644 --- a/lib/test_xarray.c +++ b/lib/test_xarray.c @@ -2,6 +2,7 @@ /* * test_xarray.c: Test the XArray API * Copyright (c) 2017-2018 Microsoft Corporation + * Copyright (c) 2019-2020 Oracle * Author: Matthew Wilcox <willy@infradead.org> */ @@ -902,28 +903,34 @@ static noinline void check_store_iter(struct xarray *xa) XA_BUG_ON(xa, !xa_empty(xa)); } -static noinline void check_multi_find(struct xarray *xa) +static noinline void check_multi_find_1(struct xarray *xa, unsigned order) { #ifdef CONFIG_XARRAY_MULTI + unsigned long multi = 3 << order; + unsigned long next = 4 << order; unsigned long index; - xa_store_order(xa, 12, 2, xa_mk_value(12), GFP_KERNEL); - XA_BUG_ON(xa, xa_store_index(xa, 16, GFP_KERNEL) != NULL); + xa_store_order(xa, multi, order, xa_mk_value(multi), GFP_KERNEL); + XA_BUG_ON(xa, xa_store_index(xa, next, GFP_KERNEL) != NULL); + XA_BUG_ON(xa, xa_store_index(xa, next + 1, GFP_KERNEL) != NULL); index = 0; XA_BUG_ON(xa, xa_find(xa, &index, ULONG_MAX, XA_PRESENT) != - xa_mk_value(12)); - XA_BUG_ON(xa, index != 12); - index = 13; + xa_mk_value(multi)); + XA_BUG_ON(xa, index != multi); + index = multi + 1; XA_BUG_ON(xa, xa_find(xa, &index, ULONG_MAX, XA_PRESENT) != - xa_mk_value(12)); - XA_BUG_ON(xa, (index < 12) || (index >= 16)); + xa_mk_value(multi)); + XA_BUG_ON(xa, (index < multi) || (index >= next)); XA_BUG_ON(xa, xa_find_after(xa, &index, ULONG_MAX, XA_PRESENT) != - xa_mk_value(16)); - XA_BUG_ON(xa, index != 16); - - xa_erase_index(xa, 12); - xa_erase_index(xa, 16); + xa_mk_value(next)); + XA_BUG_ON(xa, index != next); + XA_BUG_ON(xa, xa_find_after(xa, &index, next, XA_PRESENT) != NULL); + XA_BUG_ON(xa, index != next); + + xa_erase_index(xa, multi); + xa_erase_index(xa, next); + xa_erase_index(xa, next + 1); XA_BUG_ON(xa, !xa_empty(xa)); #endif } @@ -1046,12 +1053,33 @@ static noinline void check_find_3(struct xarray *xa) xa_destroy(xa); } +static noinline void check_find_4(struct xarray *xa) +{ + unsigned long index = 0; + void *entry; + + xa_store_index(xa, ULONG_MAX, GFP_KERNEL); + + entry = xa_find_after(xa, &index, ULONG_MAX, XA_PRESENT); + XA_BUG_ON(xa, entry != xa_mk_index(ULONG_MAX)); + + entry = xa_find_after(xa, &index, ULONG_MAX, XA_PRESENT); + XA_BUG_ON(xa, entry); + + xa_erase_index(xa, ULONG_MAX); +} + static noinline void check_find(struct xarray *xa) { + unsigned i; + check_find_1(xa); check_find_2(xa); check_find_3(xa); - check_multi_find(xa); + check_find_4(xa); + + for (i = 2; i < 10; i++) + check_multi_find_1(xa, i); check_multi_find_2(xa); } @@ -1132,6 +1160,27 @@ static noinline void check_move_tiny(struct xarray *xa) XA_BUG_ON(xa, !xa_empty(xa)); } +static noinline void check_move_max(struct xarray *xa) +{ + XA_STATE(xas, xa, 0); + + xa_store_index(xa, ULONG_MAX, GFP_KERNEL); + rcu_read_lock(); + XA_BUG_ON(xa, xas_find(&xas, ULONG_MAX) != xa_mk_index(ULONG_MAX)); + XA_BUG_ON(xa, xas_find(&xas, ULONG_MAX) != NULL); + rcu_read_unlock(); + + xas_set(&xas, 0); + rcu_read_lock(); + XA_BUG_ON(xa, xas_find(&xas, ULONG_MAX) != xa_mk_index(ULONG_MAX)); + xas_pause(&xas); + XA_BUG_ON(xa, xas_find(&xas, ULONG_MAX) != NULL); + rcu_read_unlock(); + + xa_erase_index(xa, ULONG_MAX); + XA_BUG_ON(xa, !xa_empty(xa)); +} + static noinline void check_move_small(struct xarray *xa, unsigned long idx) { XA_STATE(xas, xa, 0); @@ -1240,6 +1289,7 @@ static noinline void check_move(struct xarray *xa) xa_destroy(xa); check_move_tiny(xa); + check_move_max(xa); for (i = 0; i < 16; i++) check_move_small(xa, 1UL << i); diff --git a/lib/xarray.c b/lib/xarray.c index 1237c213f52b..1d9fab7db8da 100644 --- a/lib/xarray.c +++ b/lib/xarray.c @@ -1,7 +1,8 @@ // SPDX-License-Identifier: GPL-2.0+ /* * XArray implementation - * Copyright (c) 2017 Microsoft Corporation + * Copyright (c) 2017-2018 Microsoft Corporation + * Copyright (c) 2018-2020 Oracle * Author: Matthew Wilcox <willy@infradead.org> */ @@ -967,6 +968,7 @@ void xas_pause(struct xa_state *xas) if (xas_invalid(xas)) return; + xas->xa_node = XAS_RESTART; if (node) { unsigned int offset = xas->xa_offset; while (++offset < XA_CHUNK_SIZE) { @@ -974,10 +976,11 @@ void xas_pause(struct xa_state *xas) break; } xas->xa_index += (offset - xas->xa_offset) << node->shift; + if (xas->xa_index == 0) + xas->xa_node = XAS_BOUNDS; } else { xas->xa_index++; } - xas->xa_node = XAS_RESTART; } EXPORT_SYMBOL_GPL(xas_pause); @@ -1079,13 +1082,15 @@ void *xas_find(struct xa_state *xas, unsigned long max) { void *entry; - if (xas_error(xas)) + if (xas_error(xas) || xas->xa_node == XAS_BOUNDS) return NULL; + if (xas->xa_index > max) + return set_bounds(xas); if (!xas->xa_node) { xas->xa_index = 1; return set_bounds(xas); - } else if (xas_top(xas->xa_node)) { + } else if (xas->xa_node == XAS_RESTART) { entry = xas_load(xas); if (entry || xas_not_node(xas->xa_node)) return entry; @@ -1150,6 +1155,8 @@ void *xas_find_marked(struct xa_state *xas, unsigned long max, xa_mark_t mark) if (xas_error(xas)) return NULL; + if (xas->xa_index > max) + goto max; if (!xas->xa_node) { xas->xa_index = 1; @@ -1824,6 +1831,17 @@ void *xa_find(struct xarray *xa, unsigned long *indexp, } EXPORT_SYMBOL(xa_find); +static bool xas_sibling(struct xa_state *xas) +{ + struct xa_node *node = xas->xa_node; + unsigned long mask; + + if (!node) + return false; + mask = (XA_CHUNK_SIZE << node->shift) - 1; + return (xas->xa_index & mask) > (xas->xa_offset << node->shift); +} + /** * xa_find_after() - Search the XArray for a present entry. * @xa: XArray. @@ -1847,21 +1865,20 @@ void *xa_find_after(struct xarray *xa, unsigned long *indexp, XA_STATE(xas, xa, *indexp + 1); void *entry; + if (xas.xa_index == 0) + return NULL; + rcu_read_lock(); for (;;) { if ((__force unsigned int)filter < XA_MAX_MARKS) entry = xas_find_marked(&xas, max, filter); else entry = xas_find(&xas, max); - if (xas.xa_node == XAS_BOUNDS) + + if (xas_invalid(&xas)) break; - if (xas.xa_shift) { - if (xas.xa_index & ((1UL << xas.xa_shift) - 1)) - continue; - } else { - if (xas.xa_offset < (xas.xa_index & XA_CHUNK_MASK)) - continue; - } + if (xas_sibling(&xas)) + continue; if (!xas_retry(&xas, entry)) break; } diff --git a/net/Kconfig b/net/Kconfig index 54916b7adb9b..b0937a700f01 100644 --- a/net/Kconfig +++ b/net/Kconfig @@ -91,6 +91,7 @@ if INET source "net/ipv4/Kconfig" source "net/ipv6/Kconfig" source "net/netlabel/Kconfig" +source "net/mptcp/Kconfig" endif # if INET diff --git a/net/Makefile b/net/Makefile index 848303d98d3d..07ea48160874 100644 --- a/net/Makefile +++ b/net/Makefile @@ -87,3 +87,4 @@ endif obj-$(CONFIG_QRTR) += qrtr/ obj-$(CONFIG_NET_NCSI) += ncsi/ obj-$(CONFIG_XDP_SOCKETS) += xdp/ +obj-$(CONFIG_MPTCP) += mptcp/ diff --git a/net/atm/atm_sysfs.c b/net/atm/atm_sysfs.c index 39b94ca5f65d..aa1b57161f3b 100644 --- a/net/atm/atm_sysfs.c +++ b/net/atm/atm_sysfs.c @@ -33,23 +33,17 @@ static ssize_t show_atmaddress(struct device *cdev, unsigned long flags; struct atm_dev *adev = to_atm_dev(cdev); struct atm_dev_addr *aaddr; - int bin[] = { 1, 2, 10, 6, 1 }, *fmt = bin; - int i, j, count = 0; + int count = 0; spin_lock_irqsave(&adev->lock, flags); list_for_each_entry(aaddr, &adev->local, entry) { - for (i = 0, j = 0; i < ATM_ESA_LEN; ++i, ++j) { - if (j == *fmt) { - count += scnprintf(buf + count, - PAGE_SIZE - count, "."); - ++fmt; - j = 0; - } - count += scnprintf(buf + count, - PAGE_SIZE - count, "%02x", - aaddr->addr.sas_addr.prv[i]); - } - count += scnprintf(buf + count, PAGE_SIZE - count, "\n"); + count += scnprintf(buf + count, PAGE_SIZE - count, + "%1phN.%2phN.%10phN.%6phN.%1phN\n", + &aaddr->addr.sas_addr.prv[0], + &aaddr->addr.sas_addr.prv[1], + &aaddr->addr.sas_addr.prv[3], + &aaddr->addr.sas_addr.prv[13], + &aaddr->addr.sas_addr.prv[19]); } spin_unlock_irqrestore(&adev->lock, flags); diff --git a/net/atm/lec.c b/net/atm/lec.c index b57368e70aab..25fa3a7b72bd 100644 --- a/net/atm/lec.c +++ b/net/atm/lec.c @@ -799,14 +799,9 @@ static const char *lec_arp_get_status_string(unsigned char status) static void lec_info(struct seq_file *seq, struct lec_arp_table *entry) { - int i; - - for (i = 0; i < ETH_ALEN; i++) - seq_printf(seq, "%2.2x", entry->mac_addr[i] & 0xff); - seq_printf(seq, " "); - for (i = 0; i < ATM_ESA_LEN; i++) - seq_printf(seq, "%2.2x", entry->atm_addr[i] & 0xff); - seq_printf(seq, " %s %4.4x", lec_arp_get_status_string(entry->status), + seq_printf(seq, "%pM ", entry->mac_addr); + seq_printf(seq, "%*phN ", ATM_ESA_LEN, entry->atm_addr); + seq_printf(seq, "%s %4.4x", lec_arp_get_status_string(entry->status), entry->flags & 0xffff); if (entry->vcc) seq_printf(seq, "%3d %3d ", entry->vcc->vpi, entry->vcc->vci); @@ -1354,7 +1349,7 @@ static void dump_arp_table(struct lec_priv *priv) { struct lec_arp_table *rulla; char buf[256]; - int i, j, offset; + int i, offset; pr_info("Dump %p:\n", priv); for (i = 0; i < LEC_ARP_TABLE_SIZE; i++) { @@ -1362,14 +1357,10 @@ static void dump_arp_table(struct lec_priv *priv) &priv->lec_arp_tables[i], next) { offset = 0; offset += sprintf(buf, "%d: %p\n", i, rulla); - offset += sprintf(buf + offset, "Mac: %pM", + offset += sprintf(buf + offset, "Mac: %pM ", rulla->mac_addr); - offset += sprintf(buf + offset, " Atm:"); - for (j = 0; j < ATM_ESA_LEN; j++) { - offset += sprintf(buf + offset, - "%2.2x ", - rulla->atm_addr[j] & 0xff); - } + offset += sprintf(buf + offset, "Atm: %*ph ", ATM_ESA_LEN, + rulla->atm_addr); offset += sprintf(buf + offset, "Vcc vpi:%d vci:%d, Recv_vcc vpi:%d vci:%d Last_used:%lx, Timestamp:%lx, No_tries:%d ", rulla->vcc ? rulla->vcc->vpi : 0, @@ -1392,12 +1383,9 @@ static void dump_arp_table(struct lec_priv *priv) pr_info("No forward\n"); hlist_for_each_entry(rulla, &priv->lec_no_forward, next) { offset = 0; - offset += sprintf(buf + offset, "Mac: %pM", rulla->mac_addr); - offset += sprintf(buf + offset, " Atm:"); - for (j = 0; j < ATM_ESA_LEN; j++) { - offset += sprintf(buf + offset, "%2.2x ", - rulla->atm_addr[j] & 0xff); - } + offset += sprintf(buf + offset, "Mac: %pM ", rulla->mac_addr); + offset += sprintf(buf + offset, "Atm: %*ph ", ATM_ESA_LEN, + rulla->atm_addr); offset += sprintf(buf + offset, "Vcc vpi:%d vci:%d, Recv_vcc vpi:%d vci:%d Last_used:%lx, Timestamp:%lx, No_tries:%d ", rulla->vcc ? rulla->vcc->vpi : 0, @@ -1417,12 +1405,9 @@ static void dump_arp_table(struct lec_priv *priv) pr_info("Empty ones\n"); hlist_for_each_entry(rulla, &priv->lec_arp_empty_ones, next) { offset = 0; - offset += sprintf(buf + offset, "Mac: %pM", rulla->mac_addr); - offset += sprintf(buf + offset, " Atm:"); - for (j = 0; j < ATM_ESA_LEN; j++) { - offset += sprintf(buf + offset, "%2.2x ", - rulla->atm_addr[j] & 0xff); - } + offset += sprintf(buf + offset, "Mac: %pM ", rulla->mac_addr); + offset += sprintf(buf + offset, "Atm: %*ph ", ATM_ESA_LEN, + rulla->atm_addr); offset += sprintf(buf + offset, "Vcc vpi:%d vci:%d, Recv_vcc vpi:%d vci:%d Last_used:%lx, Timestamp:%lx, No_tries:%d ", rulla->vcc ? rulla->vcc->vpi : 0, @@ -1442,12 +1427,9 @@ static void dump_arp_table(struct lec_priv *priv) pr_info("Multicast Forward VCCs\n"); hlist_for_each_entry(rulla, &priv->mcast_fwds, next) { offset = 0; - offset += sprintf(buf + offset, "Mac: %pM", rulla->mac_addr); - offset += sprintf(buf + offset, " Atm:"); - for (j = 0; j < ATM_ESA_LEN; j++) { - offset += sprintf(buf + offset, "%2.2x ", - rulla->atm_addr[j] & 0xff); - } + offset += sprintf(buf + offset, "Mac: %pM ", rulla->mac_addr); + offset += sprintf(buf + offset, "Atm: %*ph ", ATM_ESA_LEN, + rulla->atm_addr); offset += sprintf(buf + offset, "Vcc vpi:%d vci:%d, Recv_vcc vpi:%d vci:%d Last_used:%lx, Timestamp:%lx, No_tries:%d ", rulla->vcc ? rulla->vcc->vpi : 0, @@ -1973,17 +1955,8 @@ lec_vcc_added(struct lec_priv *priv, const struct atmlec_ioc *ioc_data, * Vcc which we don't want to make default vcc, * attach it anyway. */ - pr_debug("LEC_ARP:Attaching data direct, not default: %2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x\n", - ioc_data->atm_addr[0], ioc_data->atm_addr[1], - ioc_data->atm_addr[2], ioc_data->atm_addr[3], - ioc_data->atm_addr[4], ioc_data->atm_addr[5], - ioc_data->atm_addr[6], ioc_data->atm_addr[7], - ioc_data->atm_addr[8], ioc_data->atm_addr[9], - ioc_data->atm_addr[10], ioc_data->atm_addr[11], - ioc_data->atm_addr[12], ioc_data->atm_addr[13], - ioc_data->atm_addr[14], ioc_data->atm_addr[15], - ioc_data->atm_addr[16], ioc_data->atm_addr[17], - ioc_data->atm_addr[18], ioc_data->atm_addr[19]); + pr_debug("LEC_ARP:Attaching data direct, not default: %*phN\n", + ATM_ESA_LEN, ioc_data->atm_addr); entry = make_entry(priv, bus_mac); if (entry == NULL) goto out; @@ -1999,17 +1972,8 @@ lec_vcc_added(struct lec_priv *priv, const struct atmlec_ioc *ioc_data, dump_arp_table(priv); goto out; } - pr_debug("LEC_ARP:Attaching data direct, default: %2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x%2.2x\n", - ioc_data->atm_addr[0], ioc_data->atm_addr[1], - ioc_data->atm_addr[2], ioc_data->atm_addr[3], - ioc_data->atm_addr[4], ioc_data->atm_addr[5], - ioc_data->atm_addr[6], ioc_data->atm_addr[7], - ioc_data->atm_addr[8], ioc_data->atm_addr[9], - ioc_data->atm_addr[10], ioc_data->atm_addr[11], - ioc_data->atm_addr[12], ioc_data->atm_addr[13], - ioc_data->atm_addr[14], ioc_data->atm_addr[15], - ioc_data->atm_addr[16], ioc_data->atm_addr[17], - ioc_data->atm_addr[18], ioc_data->atm_addr[19]); + pr_debug("LEC_ARP:Attaching data direct, default: %*phN\n", + ATM_ESA_LEN, ioc_data->atm_addr); for (i = 0; i < LEC_ARP_TABLE_SIZE; i++) { hlist_for_each_entry(entry, &priv->lec_arp_tables[i], next) { diff --git a/net/atm/proc.c b/net/atm/proc.c index d79221fd4dae..c31896707313 100644 --- a/net/atm/proc.c +++ b/net/atm/proc.c @@ -134,8 +134,7 @@ static void vcc_seq_stop(struct seq_file *seq, void *v) static void *vcc_seq_next(struct seq_file *seq, void *v, loff_t *pos) { v = vcc_walk(seq, 1); - if (v) - (*pos)++; + (*pos)++; return v; } diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c index 9e19d5a3aac8..cbbc34a006d1 100644 --- a/net/bluetooth/hci_core.c +++ b/net/bluetooth/hci_core.c @@ -2311,6 +2311,33 @@ void hci_smp_irks_clear(struct hci_dev *hdev) } } +void hci_blocked_keys_clear(struct hci_dev *hdev) +{ + struct blocked_key *b; + + list_for_each_entry_rcu(b, &hdev->blocked_keys, list) { + list_del_rcu(&b->list); + kfree_rcu(b, rcu); + } +} + +bool hci_is_blocked_key(struct hci_dev *hdev, u8 type, u8 val[16]) +{ + bool blocked = false; + struct blocked_key *b; + + rcu_read_lock(); + list_for_each_entry(b, &hdev->blocked_keys, list) { + if (b->type == type && !memcmp(b->val, val, sizeof(b->val))) { + blocked = true; + break; + } + } + + rcu_read_unlock(); + return blocked; +} + struct link_key *hci_find_link_key(struct hci_dev *hdev, bdaddr_t *bdaddr) { struct link_key *k; @@ -2319,6 +2346,16 @@ struct link_key *hci_find_link_key(struct hci_dev *hdev, bdaddr_t *bdaddr) list_for_each_entry_rcu(k, &hdev->link_keys, list) { if (bacmp(bdaddr, &k->bdaddr) == 0) { rcu_read_unlock(); + + if (hci_is_blocked_key(hdev, + HCI_BLOCKED_KEY_TYPE_LINKKEY, + k->val)) { + bt_dev_warn_ratelimited(hdev, + "Link key blocked for %pMR", + &k->bdaddr); + return NULL; + } + return k; } } @@ -2387,6 +2424,15 @@ struct smp_ltk *hci_find_ltk(struct hci_dev *hdev, bdaddr_t *bdaddr, if (smp_ltk_is_sc(k) || ltk_role(k->type) == role) { rcu_read_unlock(); + + if (hci_is_blocked_key(hdev, HCI_BLOCKED_KEY_TYPE_LTK, + k->val)) { + bt_dev_warn_ratelimited(hdev, + "LTK blocked for %pMR", + &k->bdaddr); + return NULL; + } + return k; } } @@ -2397,31 +2443,42 @@ struct smp_ltk *hci_find_ltk(struct hci_dev *hdev, bdaddr_t *bdaddr, struct smp_irk *hci_find_irk_by_rpa(struct hci_dev *hdev, bdaddr_t *rpa) { + struct smp_irk *irk_to_return = NULL; struct smp_irk *irk; rcu_read_lock(); list_for_each_entry_rcu(irk, &hdev->identity_resolving_keys, list) { if (!bacmp(&irk->rpa, rpa)) { - rcu_read_unlock(); - return irk; + irk_to_return = irk; + goto done; } } list_for_each_entry_rcu(irk, &hdev->identity_resolving_keys, list) { if (smp_irk_matches(hdev, irk->val, rpa)) { bacpy(&irk->rpa, rpa); - rcu_read_unlock(); - return irk; + irk_to_return = irk; + goto done; } } + +done: + if (irk_to_return && hci_is_blocked_key(hdev, HCI_BLOCKED_KEY_TYPE_IRK, + irk_to_return->val)) { + bt_dev_warn_ratelimited(hdev, "Identity key blocked for %pMR", + &irk_to_return->bdaddr); + irk_to_return = NULL; + } + rcu_read_unlock(); - return NULL; + return irk_to_return; } struct smp_irk *hci_find_irk_by_addr(struct hci_dev *hdev, bdaddr_t *bdaddr, u8 addr_type) { + struct smp_irk *irk_to_return = NULL; struct smp_irk *irk; /* Identity Address must be public or static random */ @@ -2432,13 +2489,23 @@ struct smp_irk *hci_find_irk_by_addr(struct hci_dev *hdev, bdaddr_t *bdaddr, list_for_each_entry_rcu(irk, &hdev->identity_resolving_keys, list) { if (addr_type == irk->addr_type && bacmp(bdaddr, &irk->bdaddr) == 0) { - rcu_read_unlock(); - return irk; + irk_to_return = irk; + goto done; } } + +done: + + if (irk_to_return && hci_is_blocked_key(hdev, HCI_BLOCKED_KEY_TYPE_IRK, + irk_to_return->val)) { + bt_dev_warn_ratelimited(hdev, "Identity key blocked for %pMR", + &irk_to_return->bdaddr); + irk_to_return = NULL; + } + rcu_read_unlock(); - return NULL; + return irk_to_return; } struct link_key *hci_add_link_key(struct hci_dev *hdev, struct hci_conn *conn, @@ -3244,6 +3311,7 @@ struct hci_dev *hci_alloc_dev(void) INIT_LIST_HEAD(&hdev->pend_le_reports); INIT_LIST_HEAD(&hdev->conn_hash.list); INIT_LIST_HEAD(&hdev->adv_instances); + INIT_LIST_HEAD(&hdev->blocked_keys); INIT_WORK(&hdev->rx_work, hci_rx_work); INIT_WORK(&hdev->cmd_work, hci_cmd_work); @@ -3443,6 +3511,7 @@ void hci_unregister_dev(struct hci_dev *hdev) hci_bdaddr_list_clear(&hdev->le_resolv_list); hci_conn_params_clear_all(hdev); hci_discovery_filter_clear(hdev); + hci_blocked_keys_clear(hdev); hci_dev_unlock(hdev); hci_dev_put(hdev); @@ -3496,7 +3565,8 @@ int hci_recv_frame(struct hci_dev *hdev, struct sk_buff *skb) if (hci_skb_pkt_type(skb) != HCI_EVENT_PKT && hci_skb_pkt_type(skb) != HCI_ACLDATA_PKT && - hci_skb_pkt_type(skb) != HCI_SCODATA_PKT) { + hci_skb_pkt_type(skb) != HCI_SCODATA_PKT && + hci_skb_pkt_type(skb) != HCI_ISODATA_PKT) { kfree_skb(skb); return -EINVAL; } @@ -4218,15 +4288,10 @@ static void hci_sched_le(struct hci_dev *hdev) if (!hci_conn_num(hdev, LE_LINK)) return; - if (!hci_dev_test_flag(hdev, HCI_UNCONFIGURED)) { - /* LE tx timeout must be longer than maximum - * link supervision timeout (40.9 seconds) */ - if (!hdev->le_cnt && hdev->le_pkts && - time_after(jiffies, hdev->le_last_tx + HZ * 45)) - hci_link_tx_to(hdev, LE_LINK); - } - cnt = hdev->le_pkts ? hdev->le_cnt : hdev->acl_cnt; + + __check_timeout(hdev, cnt); + tmp = cnt; while (cnt && (chan = hci_chan_sent(hdev, LE_LINK, "e))) { u32 priority = (skb_peek(&chan->data_q))->priority; @@ -4479,6 +4544,7 @@ static void hci_rx_work(struct work_struct *work) switch (hci_skb_pkt_type(skb)) { case HCI_ACLDATA_PKT: case HCI_SCODATA_PKT: + case HCI_ISODATA_PKT: kfree_skb(skb); continue; } diff --git a/net/bluetooth/hci_debugfs.c b/net/bluetooth/hci_debugfs.c index 402e2cc54044..6b1314c738b8 100644 --- a/net/bluetooth/hci_debugfs.c +++ b/net/bluetooth/hci_debugfs.c @@ -26,6 +26,7 @@ #include <net/bluetooth/bluetooth.h> #include <net/bluetooth/hci_core.h> +#include "smp.h" #include "hci_debugfs.h" #define DEFINE_QUIRK_ATTRIBUTE(__name, __quirk) \ @@ -152,6 +153,21 @@ static int blacklist_show(struct seq_file *f, void *p) DEFINE_SHOW_ATTRIBUTE(blacklist); +static int blocked_keys_show(struct seq_file *f, void *p) +{ + struct hci_dev *hdev = f->private; + struct blocked_key *key; + + rcu_read_lock(); + list_for_each_entry_rcu(key, &hdev->blocked_keys, list) + seq_printf(f, "%u %*phN\n", key->type, 16, key->val); + rcu_read_unlock(); + + return 0; +} + +DEFINE_SHOW_ATTRIBUTE(blocked_keys); + static int uuids_show(struct seq_file *f, void *p) { struct hci_dev *hdev = f->private; @@ -308,6 +324,8 @@ void hci_debugfs_create_common(struct hci_dev *hdev) &device_list_fops); debugfs_create_file("blacklist", 0444, hdev->debugfs, hdev, &blacklist_fops); + debugfs_create_file("blocked_keys", 0444, hdev->debugfs, hdev, + &blocked_keys_fops); debugfs_create_file("uuids", 0444, hdev->debugfs, hdev, &uuids_fops); debugfs_create_file("remote_oob", 0400, hdev->debugfs, hdev, &remote_oob_fops); @@ -972,6 +990,62 @@ static int adv_max_interval_get(void *data, u64 *val) DEFINE_SIMPLE_ATTRIBUTE(adv_max_interval_fops, adv_max_interval_get, adv_max_interval_set, "%llu\n"); +static int min_key_size_set(void *data, u64 val) +{ + struct hci_dev *hdev = data; + + if (val > hdev->le_max_key_size || val < SMP_MIN_ENC_KEY_SIZE) + return -EINVAL; + + hci_dev_lock(hdev); + hdev->le_min_key_size = val; + hci_dev_unlock(hdev); + + return 0; +} + +static int min_key_size_get(void *data, u64 *val) +{ + struct hci_dev *hdev = data; + + hci_dev_lock(hdev); + *val = hdev->le_min_key_size; + hci_dev_unlock(hdev); + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(min_key_size_fops, min_key_size_get, + min_key_size_set, "%llu\n"); + +static int max_key_size_set(void *data, u64 val) +{ + struct hci_dev *hdev = data; + + if (val > SMP_MAX_ENC_KEY_SIZE || val < hdev->le_min_key_size) + return -EINVAL; + + hci_dev_lock(hdev); + hdev->le_max_key_size = val; + hci_dev_unlock(hdev); + + return 0; +} + +static int max_key_size_get(void *data, u64 *val) +{ + struct hci_dev *hdev = data; + + hci_dev_lock(hdev); + *val = hdev->le_max_key_size; + hci_dev_unlock(hdev); + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(max_key_size_fops, max_key_size_get, + max_key_size_set, "%llu\n"); + static int auth_payload_timeout_set(void *data, u64 val) { struct hci_dev *hdev = data; @@ -1054,6 +1128,10 @@ void hci_debugfs_create_le(struct hci_dev *hdev) &adv_max_interval_fops); debugfs_create_u16("discov_interleaved_timeout", 0644, hdev->debugfs, &hdev->discov_interleaved_timeout); + debugfs_create_file("min_key_size", 0644, hdev->debugfs, hdev, + &min_key_size_fops); + debugfs_create_file("max_key_size", 0644, hdev->debugfs, hdev, + &max_key_size_fops); debugfs_create_file("auth_payload_timeout", 0644, hdev->debugfs, hdev, &auth_payload_timeout_fops); diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c index c1d3a303d97f..6ddc4a74a5e4 100644 --- a/net/bluetooth/hci_event.c +++ b/net/bluetooth/hci_event.c @@ -5451,7 +5451,7 @@ static void hci_le_adv_report_evt(struct hci_dev *hdev, struct sk_buff *skb) hci_dev_unlock(hdev); } -static u8 ext_evt_type_to_legacy(u16 evt_type) +static u8 ext_evt_type_to_legacy(struct hci_dev *hdev, u16 evt_type) { if (evt_type & LE_EXT_ADV_LEGACY_PDU) { switch (evt_type) { @@ -5468,10 +5468,7 @@ static u8 ext_evt_type_to_legacy(u16 evt_type) return LE_ADV_SCAN_RSP; } - BT_ERR_RATELIMITED("Unknown advertising packet type: 0x%02x", - evt_type); - - return LE_ADV_INVALID; + goto invalid; } if (evt_type & LE_EXT_ADV_CONN_IND) { @@ -5491,8 +5488,9 @@ static u8 ext_evt_type_to_legacy(u16 evt_type) evt_type & LE_EXT_ADV_DIRECT_IND) return LE_ADV_NONCONN_IND; - BT_ERR_RATELIMITED("Unknown advertising packet type: 0x%02x", - evt_type); +invalid: + bt_dev_err_ratelimited(hdev, "Unknown advertising packet type: 0x%02x", + evt_type); return LE_ADV_INVALID; } @@ -5510,7 +5508,7 @@ static void hci_le_ext_adv_report_evt(struct hci_dev *hdev, struct sk_buff *skb) u16 evt_type; evt_type = __le16_to_cpu(ev->evt_type); - legacy_evt_type = ext_evt_type_to_legacy(evt_type); + legacy_evt_type = ext_evt_type_to_legacy(hdev, evt_type); if (legacy_evt_type != LE_ADV_INVALID) { process_adv_report(hdev, legacy_evt_type, &ev->bdaddr, ev->bdaddr_type, NULL, 0, ev->rssi, @@ -5720,6 +5718,29 @@ static void hci_le_direct_adv_report_evt(struct hci_dev *hdev, hci_dev_unlock(hdev); } +static void hci_le_phy_update_evt(struct hci_dev *hdev, struct sk_buff *skb) +{ + struct hci_ev_le_phy_update_complete *ev = (void *) skb->data; + struct hci_conn *conn; + + BT_DBG("%s status 0x%2.2x", hdev->name, ev->status); + + if (!ev->status) + return; + + hci_dev_lock(hdev); + + conn = hci_conn_hash_lookup_handle(hdev, __le16_to_cpu(ev->handle)); + if (!conn) + goto unlock; + + conn->le_tx_phy = ev->tx_phy; + conn->le_rx_phy = ev->rx_phy; + +unlock: + hci_dev_unlock(hdev); +} + static void hci_le_meta_evt(struct hci_dev *hdev, struct sk_buff *skb) { struct hci_ev_le_meta *le_ev = (void *) skb->data; @@ -5755,6 +5776,10 @@ static void hci_le_meta_evt(struct hci_dev *hdev, struct sk_buff *skb) hci_le_direct_adv_report_evt(hdev, skb); break; + case HCI_EV_LE_PHY_UPDATE_COMPLETE: + hci_le_phy_update_evt(hdev, skb); + break; + case HCI_EV_LE_EXT_ADV_REPORT: hci_le_ext_adv_report_evt(hdev, skb); break; diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c index 5d0ed28c0d3a..9c4a093f8960 100644 --- a/net/bluetooth/hci_sock.c +++ b/net/bluetooth/hci_sock.c @@ -211,7 +211,8 @@ void hci_send_to_sock(struct hci_dev *hdev, struct sk_buff *skb) if (hci_skb_pkt_type(skb) != HCI_COMMAND_PKT && hci_skb_pkt_type(skb) != HCI_EVENT_PKT && hci_skb_pkt_type(skb) != HCI_ACLDATA_PKT && - hci_skb_pkt_type(skb) != HCI_SCODATA_PKT) + hci_skb_pkt_type(skb) != HCI_SCODATA_PKT && + hci_skb_pkt_type(skb) != HCI_ISODATA_PKT) continue; if (is_filtered_packet(sk, skb)) continue; @@ -220,7 +221,8 @@ void hci_send_to_sock(struct hci_dev *hdev, struct sk_buff *skb) continue; if (hci_skb_pkt_type(skb) != HCI_EVENT_PKT && hci_skb_pkt_type(skb) != HCI_ACLDATA_PKT && - hci_skb_pkt_type(skb) != HCI_SCODATA_PKT) + hci_skb_pkt_type(skb) != HCI_SCODATA_PKT && + hci_skb_pkt_type(skb) != HCI_ISODATA_PKT) continue; } else { /* Don't send frame to other channel types */ @@ -324,6 +326,12 @@ void hci_send_to_monitor(struct hci_dev *hdev, struct sk_buff *skb) else opcode = cpu_to_le16(HCI_MON_SCO_TX_PKT); break; + case HCI_ISODATA_PKT: + if (bt_cb(skb)->incoming) + opcode = cpu_to_le16(HCI_MON_ISO_RX_PKT); + else + opcode = cpu_to_le16(HCI_MON_ISO_TX_PKT); + break; case HCI_DIAG_PKT: opcode = cpu_to_le16(HCI_MON_VENDOR_DIAG); break; @@ -831,6 +839,8 @@ static int hci_sock_release(struct socket *sock) if (!sk) return 0; + lock_sock(sk); + switch (hci_pi(sk)->channel) { case HCI_CHANNEL_MONITOR: atomic_dec(&monitor_promisc); @@ -878,6 +888,7 @@ static int hci_sock_release(struct socket *sock) skb_queue_purge(&sk->sk_receive_queue); skb_queue_purge(&sk->sk_write_queue); + release_sock(sk); sock_put(sk); return 0; } @@ -1762,7 +1773,8 @@ static int hci_sock_sendmsg(struct socket *sock, struct msghdr *msg, */ if (hci_skb_pkt_type(skb) != HCI_COMMAND_PKT && hci_skb_pkt_type(skb) != HCI_ACLDATA_PKT && - hci_skb_pkt_type(skb) != HCI_SCODATA_PKT) { + hci_skb_pkt_type(skb) != HCI_SCODATA_PKT && + hci_skb_pkt_type(skb) != HCI_ISODATA_PKT) { err = -EINVAL; goto drop; } @@ -1806,7 +1818,8 @@ static int hci_sock_sendmsg(struct socket *sock, struct msghdr *msg, } if (hci_skb_pkt_type(skb) != HCI_ACLDATA_PKT && - hci_skb_pkt_type(skb) != HCI_SCODATA_PKT) { + hci_skb_pkt_type(skb) != HCI_SCODATA_PKT && + hci_skb_pkt_type(skb) != HCI_ISODATA_PKT) { err = -EINVAL; goto drop; } diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c index a845786258a0..195459a1e53e 100644 --- a/net/bluetooth/l2cap_core.c +++ b/net/bluetooth/l2cap_core.c @@ -1289,6 +1289,9 @@ static void l2cap_le_connect(struct l2cap_chan *chan) if (test_and_set_bit(FLAG_LE_CONN_REQ_SENT, &chan->flags)) return; + if (!chan->imtu) + chan->imtu = chan->conn->mtu; + l2cap_le_flowctl_init(chan, 0); req.psm = chan->psm; @@ -3226,6 +3229,49 @@ static inline void l2cap_txwin_setup(struct l2cap_chan *chan) chan->ack_win = chan->tx_win; } +static void l2cap_mtu_auto(struct l2cap_chan *chan) +{ + struct hci_conn *conn = chan->conn->hcon; + + chan->imtu = L2CAP_DEFAULT_MIN_MTU; + + /* The 2-DH1 packet has between 2 and 56 information bytes + * (including the 2-byte payload header) + */ + if (!(conn->pkt_type & HCI_2DH1)) + chan->imtu = 54; + + /* The 3-DH1 packet has between 2 and 85 information bytes + * (including the 2-byte payload header) + */ + if (!(conn->pkt_type & HCI_3DH1)) + chan->imtu = 83; + + /* The 2-DH3 packet has between 2 and 369 information bytes + * (including the 2-byte payload header) + */ + if (!(conn->pkt_type & HCI_2DH3)) + chan->imtu = 367; + + /* The 3-DH3 packet has between 2 and 554 information bytes + * (including the 2-byte payload header) + */ + if (!(conn->pkt_type & HCI_3DH3)) + chan->imtu = 552; + + /* The 2-DH5 packet has between 2 and 681 information bytes + * (including the 2-byte payload header) + */ + if (!(conn->pkt_type & HCI_2DH5)) + chan->imtu = 679; + + /* The 3-DH5 packet has between 2 and 1023 information bytes + * (including the 2-byte payload header) + */ + if (!(conn->pkt_type & HCI_3DH5)) + chan->imtu = 1021; +} + static int l2cap_build_conf_req(struct l2cap_chan *chan, void *data, size_t data_size) { struct l2cap_conf_req *req = data; @@ -3255,8 +3301,12 @@ static int l2cap_build_conf_req(struct l2cap_chan *chan, void *data, size_t data } done: - if (chan->imtu != L2CAP_DEFAULT_MTU) - l2cap_add_conf_opt(&ptr, L2CAP_CONF_MTU, 2, chan->imtu, endptr - ptr); + if (chan->imtu != L2CAP_DEFAULT_MTU) { + if (!chan->imtu) + l2cap_mtu_auto(chan); + l2cap_add_conf_opt(&ptr, L2CAP_CONF_MTU, 2, chan->imtu, + endptr - ptr); + } switch (chan->mode) { case L2CAP_MODE_BASIC: @@ -5031,7 +5081,6 @@ static inline int l2cap_move_channel_req(struct l2cap_conn *conn, chan->move_role = L2CAP_MOVE_ROLE_RESPONDER; l2cap_move_setup(chan); chan->move_id = req->dest_amp_id; - icid = chan->dcid; if (req->dest_amp_id == AMP_ID_BREDR) { /* Moving to BR/EDR */ diff --git a/net/bluetooth/lib.c b/net/bluetooth/lib.c index 63e65d9b4b24..c09e0a3a0ed9 100644 --- a/net/bluetooth/lib.c +++ b/net/bluetooth/lib.c @@ -183,6 +183,22 @@ void bt_err(const char *format, ...) } EXPORT_SYMBOL(bt_err); +void bt_warn_ratelimited(const char *format, ...) +{ + struct va_format vaf; + va_list args; + + va_start(args, format); + + vaf.fmt = format; + vaf.va = &args; + + pr_warn_ratelimited("%pV", &vaf); + + va_end(args); +} +EXPORT_SYMBOL(bt_warn_ratelimited); + void bt_err_ratelimited(const char *format, ...) { struct va_format vaf; diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c index acb7c6d5643f..3074363c68df 100644 --- a/net/bluetooth/mgmt.c +++ b/net/bluetooth/mgmt.c @@ -38,7 +38,7 @@ #include "mgmt_util.h" #define MGMT_VERSION 1 -#define MGMT_REVISION 14 +#define MGMT_REVISION 15 static const u16 mgmt_commands[] = { MGMT_OP_READ_INDEX_LIST, @@ -106,6 +106,7 @@ static const u16 mgmt_commands[] = { MGMT_OP_START_LIMITED_DISCOVERY, MGMT_OP_READ_EXT_INFO, MGMT_OP_SET_APPEARANCE, + MGMT_OP_SET_BLOCKED_KEYS, }; static const u16 mgmt_events[] = { @@ -175,7 +176,7 @@ static const u16 mgmt_untrusted_events[] = { "\x00\x00\x00\x00\x00\x00\x00\x00" /* HCI to MGMT error code conversion table */ -static u8 mgmt_status_table[] = { +static const u8 mgmt_status_table[] = { MGMT_STATUS_SUCCESS, MGMT_STATUS_UNKNOWN_COMMAND, /* Unknown Command */ MGMT_STATUS_NOT_CONNECTED, /* No Connection */ @@ -2341,6 +2342,14 @@ static int load_link_keys(struct sock *sk, struct hci_dev *hdev, void *data, for (i = 0; i < key_count; i++) { struct mgmt_link_key_info *key = &cp->keys[i]; + if (hci_is_blocked_key(hdev, + HCI_BLOCKED_KEY_TYPE_LINKKEY, + key->val)) { + bt_dev_warn(hdev, "Skipping blocked link key for %pMR", + &key->addr.bdaddr); + continue; + } + /* Always ignore debug keys and require a new pairing if * the user wants to use them. */ @@ -3282,7 +3291,7 @@ static int set_appearance(struct sock *sk, struct hci_dev *hdev, void *data, u16 len) { struct mgmt_cp_set_appearance *cp = data; - u16 apperance; + u16 appearance; int err; BT_DBG(""); @@ -3291,12 +3300,12 @@ static int set_appearance(struct sock *sk, struct hci_dev *hdev, void *data, return mgmt_cmd_status(sk, hdev->id, MGMT_OP_SET_APPEARANCE, MGMT_STATUS_NOT_SUPPORTED); - apperance = le16_to_cpu(cp->appearance); + appearance = le16_to_cpu(cp->appearance); hci_dev_lock(hdev); - if (hdev->appearance != apperance) { - hdev->appearance = apperance; + if (hdev->appearance != appearance) { + hdev->appearance = appearance; if (hci_dev_test_flag(hdev, HCI_LE_ADV)) adv_expire(hdev, MGMT_ADV_FLAG_APPEARANCE); @@ -3531,6 +3540,55 @@ unlock: return err; } +static int set_blocked_keys(struct sock *sk, struct hci_dev *hdev, void *data, + u16 len) +{ + int err = MGMT_STATUS_SUCCESS; + struct mgmt_cp_set_blocked_keys *keys = data; + const u16 max_key_count = ((U16_MAX - sizeof(*keys)) / + sizeof(struct mgmt_blocked_key_info)); + u16 key_count, expected_len; + int i; + + BT_DBG("request for %s", hdev->name); + + key_count = __le16_to_cpu(keys->key_count); + if (key_count > max_key_count) { + bt_dev_err(hdev, "too big key_count value %u", key_count); + return mgmt_cmd_status(sk, hdev->id, MGMT_OP_SET_BLOCKED_KEYS, + MGMT_STATUS_INVALID_PARAMS); + } + + expected_len = struct_size(keys, keys, key_count); + if (expected_len != len) { + bt_dev_err(hdev, "expected %u bytes, got %u bytes", + expected_len, len); + return mgmt_cmd_status(sk, hdev->id, MGMT_OP_SET_BLOCKED_KEYS, + MGMT_STATUS_INVALID_PARAMS); + } + + hci_dev_lock(hdev); + + hci_blocked_keys_clear(hdev); + + for (i = 0; i < keys->key_count; ++i) { + struct blocked_key *b = kzalloc(sizeof(*b), GFP_KERNEL); + + if (!b) { + err = MGMT_STATUS_NO_RESOURCES; + break; + } + + b->type = keys->keys[i].type; + memcpy(b->val, keys->keys[i].val, sizeof(b->val)); + list_add_rcu(&b->list, &hdev->blocked_keys); + } + hci_dev_unlock(hdev); + + return mgmt_cmd_complete(sk, hdev->id, MGMT_OP_SET_BLOCKED_KEYS, + err, NULL, 0); +} + static void read_local_oob_data_complete(struct hci_dev *hdev, u8 status, u16 opcode, struct sk_buff *skb) { @@ -5051,6 +5109,14 @@ static int load_irks(struct sock *sk, struct hci_dev *hdev, void *cp_data, for (i = 0; i < irk_count; i++) { struct mgmt_irk_info *irk = &cp->irks[i]; + if (hci_is_blocked_key(hdev, + HCI_BLOCKED_KEY_TYPE_IRK, + irk->val)) { + bt_dev_warn(hdev, "Skipping blocked IRK for %pMR", + &irk->addr.bdaddr); + continue; + } + hci_add_irk(hdev, &irk->addr.bdaddr, le_addr_type(irk->addr.type), irk->val, BDADDR_ANY); @@ -5134,6 +5200,14 @@ static int load_long_term_keys(struct sock *sk, struct hci_dev *hdev, struct mgmt_ltk_info *key = &cp->keys[i]; u8 type, authenticated; + if (hci_is_blocked_key(hdev, + HCI_BLOCKED_KEY_TYPE_LTK, + key->val)) { + bt_dev_warn(hdev, "Skipping blocked LTK for %pMR", + &key->addr.bdaddr); + continue; + } + switch (key->type) { case MGMT_LTK_UNAUTHENTICATED: authenticated = 0x00; @@ -6914,6 +6988,8 @@ static const struct hci_mgmt_handler mgmt_handlers[] = { { set_appearance, MGMT_SET_APPEARANCE_SIZE }, { get_phy_configuration, MGMT_GET_PHY_CONFIGURATION_SIZE }, { set_phy_configuration, MGMT_SET_PHY_CONFIGURATION_SIZE }, + { set_blocked_keys, MGMT_OP_SET_BLOCKED_KEYS_SIZE, + HCI_MGMT_VAR_LEN }, }; void mgmt_index_added(struct hci_dev *hdev) diff --git a/net/bluetooth/smp.c b/net/bluetooth/smp.c index 6b42be4b5861..204f14f8b507 100644 --- a/net/bluetooth/smp.c +++ b/net/bluetooth/smp.c @@ -2453,6 +2453,15 @@ static int smp_cmd_encrypt_info(struct l2cap_conn *conn, struct sk_buff *skb) if (skb->len < sizeof(*rp)) return SMP_INVALID_PARAMS; + /* Pairing is aborted if any blocked keys are distributed */ + if (hci_is_blocked_key(conn->hcon->hdev, HCI_BLOCKED_KEY_TYPE_LTK, + rp->ltk)) { + bt_dev_warn_ratelimited(conn->hcon->hdev, + "LTK blocked for %pMR", + &conn->hcon->dst); + return SMP_INVALID_PARAMS; + } + SMP_ALLOW_CMD(smp, SMP_CMD_MASTER_IDENT); skb_pull(skb, sizeof(*rp)); @@ -2509,6 +2518,15 @@ static int smp_cmd_ident_info(struct l2cap_conn *conn, struct sk_buff *skb) if (skb->len < sizeof(*info)) return SMP_INVALID_PARAMS; + /* Pairing is aborted if any blocked keys are distributed */ + if (hci_is_blocked_key(conn->hcon->hdev, HCI_BLOCKED_KEY_TYPE_IRK, + info->irk)) { + bt_dev_warn_ratelimited(conn->hcon->hdev, + "Identity key blocked for %pMR", + &conn->hcon->dst); + return SMP_INVALID_PARAMS; + } + SMP_ALLOW_CMD(smp, SMP_CMD_IDENT_ADDR_INFO); skb_pull(skb, sizeof(*info)); @@ -3355,94 +3373,6 @@ static const struct file_operations force_bredr_smp_fops = { .llseek = default_llseek, }; -static ssize_t le_min_key_size_read(struct file *file, - char __user *user_buf, - size_t count, loff_t *ppos) -{ - struct hci_dev *hdev = file->private_data; - char buf[4]; - - snprintf(buf, sizeof(buf), "%2u\n", hdev->le_min_key_size); - - return simple_read_from_buffer(user_buf, count, ppos, buf, strlen(buf)); -} - -static ssize_t le_min_key_size_write(struct file *file, - const char __user *user_buf, - size_t count, loff_t *ppos) -{ - struct hci_dev *hdev = file->private_data; - char buf[32]; - size_t buf_size = min(count, (sizeof(buf) - 1)); - u8 key_size; - - if (copy_from_user(buf, user_buf, buf_size)) - return -EFAULT; - - buf[buf_size] = '\0'; - - sscanf(buf, "%hhu", &key_size); - - if (key_size > hdev->le_max_key_size || - key_size < SMP_MIN_ENC_KEY_SIZE) - return -EINVAL; - - hdev->le_min_key_size = key_size; - - return count; -} - -static const struct file_operations le_min_key_size_fops = { - .open = simple_open, - .read = le_min_key_size_read, - .write = le_min_key_size_write, - .llseek = default_llseek, -}; - -static ssize_t le_max_key_size_read(struct file *file, - char __user *user_buf, - size_t count, loff_t *ppos) -{ - struct hci_dev *hdev = file->private_data; - char buf[4]; - - snprintf(buf, sizeof(buf), "%2u\n", hdev->le_max_key_size); - - return simple_read_from_buffer(user_buf, count, ppos, buf, strlen(buf)); -} - -static ssize_t le_max_key_size_write(struct file *file, - const char __user *user_buf, - size_t count, loff_t *ppos) -{ - struct hci_dev *hdev = file->private_data; - char buf[32]; - size_t buf_size = min(count, (sizeof(buf) - 1)); - u8 key_size; - - if (copy_from_user(buf, user_buf, buf_size)) - return -EFAULT; - - buf[buf_size] = '\0'; - - sscanf(buf, "%hhu", &key_size); - - if (key_size > SMP_MAX_ENC_KEY_SIZE || - key_size < hdev->le_min_key_size) - return -EINVAL; - - hdev->le_max_key_size = key_size; - - return count; -} - -static const struct file_operations le_max_key_size_fops = { - .open = simple_open, - .read = le_max_key_size_read, - .write = le_max_key_size_write, - .llseek = default_llseek, -}; - int smp_register(struct hci_dev *hdev) { struct l2cap_chan *chan; @@ -3467,11 +3397,6 @@ int smp_register(struct hci_dev *hdev) hdev->smp_data = chan; - debugfs_create_file("le_min_key_size", 0644, hdev->debugfs, hdev, - &le_min_key_size_fops); - debugfs_create_file("le_max_key_size", 0644, hdev->debugfs, hdev, - &le_max_key_size_fops); - /* If the controller does not support BR/EDR Secure Connections * feature, then the BR/EDR SMP channel shall not be present. * diff --git a/net/bridge/Makefile b/net/bridge/Makefile index ac9ef337f0fa..49da7ae6f077 100644 --- a/net/bridge/Makefile +++ b/net/bridge/Makefile @@ -20,7 +20,7 @@ obj-$(CONFIG_BRIDGE_NETFILTER) += br_netfilter.o bridge-$(CONFIG_BRIDGE_IGMP_SNOOPING) += br_multicast.o br_mdb.o -bridge-$(CONFIG_BRIDGE_VLAN_FILTERING) += br_vlan.o br_vlan_tunnel.o +bridge-$(CONFIG_BRIDGE_VLAN_FILTERING) += br_vlan.o br_vlan_tunnel.o br_vlan_options.o bridge-$(CONFIG_NET_SWITCHDEV) += br_switchdev.o diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c index fb38add21b37..dc3d2c1dd9d5 100644 --- a/net/bridge/br_device.c +++ b/net/bridge/br_device.c @@ -32,6 +32,7 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev) struct net_bridge_mdb_entry *mdst; struct pcpu_sw_netstats *brstats = this_cpu_ptr(br->stats); const struct nf_br_ops *nf_ops; + u8 state = BR_STATE_FORWARDING; const unsigned char *dest; struct ethhdr *eth; u16 vid = 0; @@ -56,7 +57,7 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev) eth = eth_hdr(skb); skb_pull(skb, ETH_HLEN); - if (!br_allowed_ingress(br, br_vlan_group_rcu(br), skb, &vid)) + if (!br_allowed_ingress(br, br_vlan_group_rcu(br), skb, &vid, &state)) goto out; if (IS_ENABLED(CONFIG_INET) && diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c index 86637000f275..7629b63f6f30 100644 --- a/net/bridge/br_forward.c +++ b/net/bridge/br_forward.c @@ -25,7 +25,7 @@ static inline int should_deliver(const struct net_bridge_port *p, vg = nbp_vlan_group_rcu(p); return ((p->flags & BR_HAIRPIN_MODE) || skb->dev != p->dev) && - br_allowed_egress(vg, skb) && p->state == BR_STATE_FORWARDING && + p->state == BR_STATE_FORWARDING && br_allowed_egress(vg, skb) && nbp_switchdev_allowed_egress(p, skb) && !br_skb_isolated(p, skb); } diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 8944ceb47fe9..fcc260840028 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -76,11 +76,14 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb bool local_rcv, mcast_hit = false; struct net_bridge *br; u16 vid = 0; + u8 state; if (!p || p->state == BR_STATE_DISABLED) goto drop; - if (!br_allowed_ingress(p->br, nbp_vlan_group_rcu(p), skb, &vid)) + state = p->state; + if (!br_allowed_ingress(p->br, nbp_vlan_group_rcu(p), skb, &vid, + &state)) goto out; nbp_switchdev_frame_mark(p, skb); @@ -103,7 +106,7 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb } } - if (p->state == BR_STATE_LEARNING) + if (state == BR_STATE_LEARNING) goto drop; BR_INPUT_SKB_CB(skb)->brdev = br->dev; diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index a6226ff2f0cc..5153ffe79a01 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -113,6 +113,7 @@ enum { * @vid: VLAN id * @flags: bridge vlan flags * @priv_flags: private (in-kernel) bridge vlan flags + * @state: STP state (e.g. blocking, learning, forwarding) * @stats: per-cpu VLAN statistics * @br: if MASTER flag set, this points to a bridge struct * @port: if MASTER flag unset, this points to a port struct @@ -133,6 +134,7 @@ struct net_bridge_vlan { u16 vid; u16 flags; u16 priv_flags; + u8 state; struct br_vlan_stats __percpu *stats; union { struct net_bridge *br; @@ -157,6 +159,7 @@ struct net_bridge_vlan { * @vlan_list: sorted VLAN entry list * @num_vlans: number of total VLAN entries * @pvid: PVID VLAN id + * @pvid_state: PVID's STP state (e.g. forwarding, learning, blocking) * * IMPORTANT: Be careful when checking if there're VLAN entries using list * primitives because the bridge can have entries in its list which @@ -170,6 +173,7 @@ struct net_bridge_vlan_group { struct list_head vlan_list; u16 num_vlans; u16 pvid; + u8 pvid_state; }; /* bridge fdb flags */ @@ -935,7 +939,7 @@ static inline int br_multicast_igmp_type(const struct sk_buff *skb) #ifdef CONFIG_BRIDGE_VLAN_FILTERING bool br_allowed_ingress(const struct net_bridge *br, struct net_bridge_vlan_group *vg, struct sk_buff *skb, - u16 *vid); + u16 *vid, u8 *state); bool br_allowed_egress(struct net_bridge_vlan_group *vg, const struct sk_buff *skb); bool br_should_learn(struct net_bridge_port *p, struct sk_buff *skb, u16 *vid); @@ -976,6 +980,8 @@ void br_vlan_notify(const struct net_bridge *br, const struct net_bridge_port *p, u16 vid, u16 vid_range, int cmd); +bool br_vlan_can_enter_range(const struct net_bridge_vlan *v_curr, + const struct net_bridge_vlan *range_end); static inline struct net_bridge_vlan_group *br_vlan_group( const struct net_bridge *br) @@ -1035,7 +1041,7 @@ static inline u16 br_vlan_flags(const struct net_bridge_vlan *v, u16 pvid) static inline bool br_allowed_ingress(const struct net_bridge *br, struct net_bridge_vlan_group *vg, struct sk_buff *skb, - u16 *vid) + u16 *vid, u8 *state) { return true; } @@ -1191,6 +1197,55 @@ static inline void br_vlan_notify(const struct net_bridge *br, } #endif +/* br_vlan_options.c */ +#ifdef CONFIG_BRIDGE_VLAN_FILTERING +bool br_vlan_opts_eq(const struct net_bridge_vlan *v1, + const struct net_bridge_vlan *v2); +bool br_vlan_opts_fill(struct sk_buff *skb, const struct net_bridge_vlan *v); +size_t br_vlan_opts_nl_size(void); +int br_vlan_process_options(const struct net_bridge *br, + const struct net_bridge_port *p, + struct net_bridge_vlan *range_start, + struct net_bridge_vlan *range_end, + struct nlattr **tb, + struct netlink_ext_ack *extack); + +/* vlan state manipulation helpers using *_ONCE to annotate lock-free access */ +static inline u8 br_vlan_get_state(const struct net_bridge_vlan *v) +{ + return READ_ONCE(v->state); +} + +static inline void br_vlan_set_state(struct net_bridge_vlan *v, u8 state) +{ + WRITE_ONCE(v->state, state); +} + +static inline u8 br_vlan_get_pvid_state(const struct net_bridge_vlan_group *vg) +{ + return READ_ONCE(vg->pvid_state); +} + +static inline void br_vlan_set_pvid_state(struct net_bridge_vlan_group *vg, + u8 state) +{ + WRITE_ONCE(vg->pvid_state, state); +} + +/* learn_allow is true at ingress and false at egress */ +static inline bool br_vlan_state_allowed(u8 state, bool learn_allow) +{ + switch (state) { + case BR_STATE_LEARNING: + return learn_allow; + case BR_STATE_FORWARDING: + return true; + default: + return false; + } +} +#endif + struct nf_br_ops { int (*br_dev_xmit_hook)(struct sk_buff *skb); }; diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c index e4f7dd10c3f8..6b5deca08b89 100644 --- a/net/bridge/br_vlan.c +++ b/net/bridge/br_vlan.c @@ -34,13 +34,15 @@ static struct net_bridge_vlan *br_vlan_lookup(struct rhashtable *tbl, u16 vid) return rhashtable_lookup_fast(tbl, &vid, br_vlan_rht_params); } -static bool __vlan_add_pvid(struct net_bridge_vlan_group *vg, u16 vid) +static bool __vlan_add_pvid(struct net_bridge_vlan_group *vg, + const struct net_bridge_vlan *v) { - if (vg->pvid == vid) + if (vg->pvid == v->vid) return false; smp_wmb(); - vg->pvid = vid; + br_vlan_set_pvid_state(vg, v->state); + vg->pvid = v->vid; return true; } @@ -69,7 +71,7 @@ static bool __vlan_add_flags(struct net_bridge_vlan *v, u16 flags) vg = nbp_vlan_group(v->port); if (flags & BRIDGE_VLAN_INFO_PVID) - ret = __vlan_add_pvid(vg, v->vid); + ret = __vlan_add_pvid(vg, v); else ret = __vlan_delete_pvid(vg, v->vid); @@ -293,6 +295,9 @@ static int __vlan_add(struct net_bridge_vlan *v, u16 flags, vg->num_vlans++; } + /* set the state before publishing */ + v->state = BR_STATE_FORWARDING; + err = rhashtable_lookup_insert_fast(&vg->vlan_hash, &v->vnode, br_vlan_rht_params); if (err) @@ -466,7 +471,8 @@ out: /* Called under RCU */ static bool __allowed_ingress(const struct net_bridge *br, struct net_bridge_vlan_group *vg, - struct sk_buff *skb, u16 *vid) + struct sk_buff *skb, u16 *vid, + u8 *state) { struct br_vlan_stats *stats; struct net_bridge_vlan *v; @@ -532,13 +538,25 @@ static bool __allowed_ingress(const struct net_bridge *br, skb->vlan_tci |= pvid; /* if stats are disabled we can avoid the lookup */ - if (!br_opt_get(br, BROPT_VLAN_STATS_ENABLED)) - return true; + if (!br_opt_get(br, BROPT_VLAN_STATS_ENABLED)) { + if (*state == BR_STATE_FORWARDING) { + *state = br_vlan_get_pvid_state(vg); + return br_vlan_state_allowed(*state, true); + } else { + return true; + } + } } v = br_vlan_find(vg, *vid); if (!v || !br_vlan_should_use(v)) goto drop; + if (*state == BR_STATE_FORWARDING) { + *state = br_vlan_get_state(v); + if (!br_vlan_state_allowed(*state, true)) + goto drop; + } + if (br_opt_get(br, BROPT_VLAN_STATS_ENABLED)) { stats = this_cpu_ptr(v->stats); u64_stats_update_begin(&stats->syncp); @@ -556,7 +574,7 @@ drop: bool br_allowed_ingress(const struct net_bridge *br, struct net_bridge_vlan_group *vg, struct sk_buff *skb, - u16 *vid) + u16 *vid, u8 *state) { /* If VLAN filtering is disabled on the bridge, all packets are * permitted. @@ -566,7 +584,7 @@ bool br_allowed_ingress(const struct net_bridge *br, return true; } - return __allowed_ingress(br, vg, skb, vid); + return __allowed_ingress(br, vg, skb, vid, state); } /* Called under RCU. */ @@ -582,7 +600,8 @@ bool br_allowed_egress(struct net_bridge_vlan_group *vg, br_vlan_get_tag(skb, &vid); v = br_vlan_find(vg, vid); - if (v && br_vlan_should_use(v)) + if (v && br_vlan_should_use(v) && + br_vlan_state_allowed(br_vlan_get_state(v), false)) return true; return false; @@ -593,6 +612,7 @@ bool br_should_learn(struct net_bridge_port *p, struct sk_buff *skb, u16 *vid) { struct net_bridge_vlan_group *vg; struct net_bridge *br = p->br; + struct net_bridge_vlan *v; /* If filtering was disabled at input, let it pass. */ if (!br_opt_get(br, BROPT_VLAN_ENABLED)) @@ -607,13 +627,15 @@ bool br_should_learn(struct net_bridge_port *p, struct sk_buff *skb, u16 *vid) if (!*vid) { *vid = br_get_pvid(vg); - if (!*vid) + if (!*vid || + !br_vlan_state_allowed(br_vlan_get_pvid_state(vg), true)) return false; return true; } - if (br_vlan_find(vg, *vid)) + v = br_vlan_find(vg, *vid); + if (v && br_vlan_state_allowed(br_vlan_get_state(v), true)) return true; return false; @@ -1547,7 +1569,9 @@ void br_vlan_port_event(struct net_bridge_port *p, unsigned long event) } } +/* v_opts is used to dump the options which must be equal in the whole range */ static bool br_vlan_fill_vids(struct sk_buff *skb, u16 vid, u16 vid_range, + const struct net_bridge_vlan *v_opts, u16 flags) { struct bridge_vlan_info info; @@ -1572,6 +1596,9 @@ static bool br_vlan_fill_vids(struct sk_buff *skb, u16 vid, u16 vid_range, nla_put_u16(skb, BRIDGE_VLANDB_ENTRY_RANGE, vid_range)) goto out_err; + if (v_opts && !br_vlan_opts_fill(skb, v_opts)) + goto out_err; + nla_nest_end(skb, nest); return true; @@ -1586,7 +1613,8 @@ static size_t rtnl_vlan_nlmsg_size(void) return NLMSG_ALIGN(sizeof(struct br_vlan_msg)) + nla_total_size(0) /* BRIDGE_VLANDB_ENTRY */ + nla_total_size(sizeof(u16)) /* BRIDGE_VLANDB_ENTRY_RANGE */ - + nla_total_size(sizeof(struct bridge_vlan_info)); /* BRIDGE_VLANDB_ENTRY_INFO */ + + nla_total_size(sizeof(struct bridge_vlan_info)) /* BRIDGE_VLANDB_ENTRY_INFO */ + + br_vlan_opts_nl_size(); /* bridge vlan options */ } void br_vlan_notify(const struct net_bridge *br, @@ -1595,7 +1623,7 @@ void br_vlan_notify(const struct net_bridge *br, int cmd) { struct net_bridge_vlan_group *vg; - struct net_bridge_vlan *v; + struct net_bridge_vlan *v = NULL; struct br_vlan_msg *bvm; struct nlmsghdr *nlh; struct sk_buff *skb; @@ -1647,7 +1675,7 @@ void br_vlan_notify(const struct net_bridge *br, goto out_kfree; } - if (!br_vlan_fill_vids(skb, vid, vid_range, flags)) + if (!br_vlan_fill_vids(skb, vid, vid_range, v, flags)) goto out_err; nlmsg_end(skb, nlh); @@ -1661,11 +1689,12 @@ out_kfree: } /* check if v_curr can enter a range ending in range_end */ -static bool br_vlan_can_enter_range(const struct net_bridge_vlan *v_curr, - const struct net_bridge_vlan *range_end) +bool br_vlan_can_enter_range(const struct net_bridge_vlan *v_curr, + const struct net_bridge_vlan *range_end) { return v_curr->vid - range_end->vid == 1 && - range_end->flags == v_curr->flags; + range_end->flags == v_curr->flags && + br_vlan_opts_eq(v_curr, range_end); } static int br_vlan_dump_dev(const struct net_device *dev, @@ -1729,7 +1758,8 @@ static int br_vlan_dump_dev(const struct net_device *dev, u16 flags = br_vlan_flags(range_start, pvid); if (!br_vlan_fill_vids(skb, range_start->vid, - range_end->vid, flags)) { + range_end->vid, range_start, + flags)) { err = -EMSGSIZE; break; } @@ -1748,7 +1778,7 @@ static int br_vlan_dump_dev(const struct net_device *dev, */ if (!err && range_start && !br_vlan_fill_vids(skb, range_start->vid, range_end->vid, - br_vlan_flags(range_start, pvid))) + range_start, br_vlan_flags(range_start, pvid))) err = -EMSGSIZE; cb->args[1] = err ? idx : 0; @@ -1808,6 +1838,7 @@ static const struct nla_policy br_vlan_db_policy[BRIDGE_VLANDB_ENTRY_MAX + 1] = [BRIDGE_VLANDB_ENTRY_INFO] = { .type = NLA_EXACT_LEN, .len = sizeof(struct bridge_vlan_info) }, [BRIDGE_VLANDB_ENTRY_RANGE] = { .type = NLA_U16 }, + [BRIDGE_VLANDB_ENTRY_STATE] = { .type = NLA_U8 }, }; static int br_vlan_rtm_process_one(struct net_device *dev, @@ -1816,11 +1847,11 @@ static int br_vlan_rtm_process_one(struct net_device *dev, { struct bridge_vlan_info *vinfo, vrange_end, *vinfo_last = NULL; struct nlattr *tb[BRIDGE_VLANDB_ENTRY_MAX + 1]; + bool changed = false, skip_processing = false; struct net_bridge_vlan_group *vg; struct net_bridge_port *p = NULL; int err = 0, cmdmap = 0; struct net_bridge *br; - bool changed = false; if (netif_is_bridge_master(dev)) { br = netdev_priv(dev); @@ -1874,16 +1905,43 @@ static int br_vlan_rtm_process_one(struct net_device *dev, switch (cmd) { case RTM_NEWVLAN: cmdmap = RTM_SETLINK; + skip_processing = !!(vinfo->flags & BRIDGE_VLAN_INFO_ONLY_OPTS); break; case RTM_DELVLAN: cmdmap = RTM_DELLINK; break; } - err = br_process_vlan_info(br, p, cmdmap, vinfo, &vinfo_last, &changed, - extack); - if (changed) - br_ifinfo_notify(cmdmap, br, p); + if (!skip_processing) { + struct bridge_vlan_info *tmp_last = vinfo_last; + + /* br_process_vlan_info may overwrite vinfo_last */ + err = br_process_vlan_info(br, p, cmdmap, vinfo, &tmp_last, + &changed, extack); + + /* notify first if anything changed */ + if (changed) + br_ifinfo_notify(cmdmap, br, p); + + if (err) + return err; + } + + /* deal with options */ + if (cmd == RTM_NEWVLAN) { + struct net_bridge_vlan *range_start, *range_end; + + if (vinfo_last) { + range_start = br_vlan_find(vg, vinfo_last->vid); + range_end = br_vlan_find(vg, vinfo->vid); + } else { + range_start = br_vlan_find(vg, vinfo->vid); + range_end = range_start; + } + + err = br_vlan_process_options(br, p, range_start, range_end, + tb, extack); + } return err; } diff --git a/net/bridge/br_vlan_options.c b/net/bridge/br_vlan_options.c new file mode 100644 index 000000000000..cd2eb194eb98 --- /dev/null +++ b/net/bridge/br_vlan_options.c @@ -0,0 +1,160 @@ +// SPDX-License-Identifier: GPL-2.0-only +// Copyright (c) 2020, Nikolay Aleksandrov <nikolay@cumulusnetworks.com> +#include <linux/kernel.h> +#include <linux/netdevice.h> +#include <linux/rtnetlink.h> +#include <linux/slab.h> + +#include "br_private.h" + +/* check if the options between two vlans are equal */ +bool br_vlan_opts_eq(const struct net_bridge_vlan *v1, + const struct net_bridge_vlan *v2) +{ + return v1->state == v2->state; +} + +bool br_vlan_opts_fill(struct sk_buff *skb, const struct net_bridge_vlan *v) +{ + return !nla_put_u8(skb, BRIDGE_VLANDB_ENTRY_STATE, + br_vlan_get_state(v)); +} + +size_t br_vlan_opts_nl_size(void) +{ + return nla_total_size(sizeof(u8)); /* BRIDGE_VLANDB_ENTRY_STATE */ +} + +static int br_vlan_modify_state(struct net_bridge_vlan_group *vg, + struct net_bridge_vlan *v, + u8 state, + bool *changed, + struct netlink_ext_ack *extack) +{ + struct net_bridge *br; + + ASSERT_RTNL(); + + if (state > BR_STATE_BLOCKING) { + NL_SET_ERR_MSG_MOD(extack, "Invalid vlan state"); + return -EINVAL; + } + + if (br_vlan_is_brentry(v)) + br = v->br; + else + br = v->port->br; + + if (br->stp_enabled == BR_KERNEL_STP) { + NL_SET_ERR_MSG_MOD(extack, "Can't modify vlan state when using kernel STP"); + return -EBUSY; + } + + if (v->state == state) + return 0; + + if (v->vid == br_get_pvid(vg)) + br_vlan_set_pvid_state(vg, state); + + br_vlan_set_state(v, state); + *changed = true; + + return 0; +} + +static int br_vlan_process_one_opts(const struct net_bridge *br, + const struct net_bridge_port *p, + struct net_bridge_vlan_group *vg, + struct net_bridge_vlan *v, + struct nlattr **tb, + bool *changed, + struct netlink_ext_ack *extack) +{ + int err; + + *changed = false; + if (tb[BRIDGE_VLANDB_ENTRY_STATE]) { + u8 state = nla_get_u8(tb[BRIDGE_VLANDB_ENTRY_STATE]); + + err = br_vlan_modify_state(vg, v, state, changed, extack); + if (err) + return err; + } + + return 0; +} + +int br_vlan_process_options(const struct net_bridge *br, + const struct net_bridge_port *p, + struct net_bridge_vlan *range_start, + struct net_bridge_vlan *range_end, + struct nlattr **tb, + struct netlink_ext_ack *extack) +{ + struct net_bridge_vlan *v, *curr_start = NULL, *curr_end = NULL; + struct net_bridge_vlan_group *vg; + int vid, err = 0; + u16 pvid; + + if (p) + vg = nbp_vlan_group(p); + else + vg = br_vlan_group(br); + + if (!range_start || !br_vlan_should_use(range_start)) { + NL_SET_ERR_MSG_MOD(extack, "Vlan range start doesn't exist, can't process options"); + return -ENOENT; + } + if (!range_end || !br_vlan_should_use(range_end)) { + NL_SET_ERR_MSG_MOD(extack, "Vlan range end doesn't exist, can't process options"); + return -ENOENT; + } + + pvid = br_get_pvid(vg); + for (vid = range_start->vid; vid <= range_end->vid; vid++) { + bool changed = false; + + v = br_vlan_find(vg, vid); + if (!v || !br_vlan_should_use(v)) { + NL_SET_ERR_MSG_MOD(extack, "Vlan in range doesn't exist, can't process options"); + err = -ENOENT; + break; + } + + err = br_vlan_process_one_opts(br, p, vg, v, tb, &changed, + extack); + if (err) + break; + + if (changed) { + /* vlan options changed, check for range */ + if (!curr_start) { + curr_start = v; + curr_end = v; + continue; + } + + if (v->vid == pvid || + !br_vlan_can_enter_range(v, curr_end)) { + br_vlan_notify(br, p, curr_start->vid, + curr_end->vid, RTM_NEWVLAN); + curr_start = v; + } + curr_end = v; + } else { + /* nothing changed and nothing to notify yet */ + if (!curr_start) + continue; + + br_vlan_notify(br, p, curr_start->vid, curr_end->vid, + RTM_NEWVLAN); + curr_start = NULL; + curr_end = NULL; + } + } + if (curr_start) + br_vlan_notify(br, p, curr_start->vid, curr_end->vid, + RTM_NEWVLAN); + + return err; +} diff --git a/net/caif/caif_usb.c b/net/caif/caif_usb.c index 76bd67891fb3..a0116b9503d9 100644 --- a/net/caif/caif_usb.c +++ b/net/caif/caif_usb.c @@ -62,7 +62,7 @@ static int cfusbl_transmit(struct cflayer *layr, struct cfpkt *pkt) hpad = (info->hdr_len + CFUSB_PAD_DESCR_SZ) & (CFUSB_ALIGNMENT - 1); if (skb_headroom(skb) < ETH_HLEN + CFUSB_PAD_DESCR_SZ + hpad) { - pr_warn("Headroom to small\n"); + pr_warn("Headroom too small\n"); kfree_skb(skb); return -EIO; } diff --git a/net/core/dev.c b/net/core/dev.c index 4dcc1b390667..38bc35da39f7 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1764,7 +1764,6 @@ EXPORT_SYMBOL(register_netdevice_notifier); int unregister_netdevice_notifier(struct notifier_block *nb) { - struct net_device *dev; struct net *net; int err; @@ -1775,16 +1774,9 @@ int unregister_netdevice_notifier(struct notifier_block *nb) if (err) goto unlock; - for_each_net(net) { - for_each_netdev(net, dev) { - if (dev->flags & IFF_UP) { - call_netdevice_notifier(nb, NETDEV_GOING_DOWN, - dev); - call_netdevice_notifier(nb, NETDEV_DOWN, dev); - } - call_netdevice_notifier(nb, NETDEV_UNREGISTER, dev); - } - } + for_each_net(net) + call_netdevice_unregister_net_notifiers(nb, net); + unlock: rtnl_unlock(); up_write(&pernet_ops_rwsem); @@ -1792,6 +1784,42 @@ unlock: } EXPORT_SYMBOL(unregister_netdevice_notifier); +static int __register_netdevice_notifier_net(struct net *net, + struct notifier_block *nb, + bool ignore_call_fail) +{ + int err; + + err = raw_notifier_chain_register(&net->netdev_chain, nb); + if (err) + return err; + if (dev_boot_phase) + return 0; + + err = call_netdevice_register_net_notifiers(nb, net); + if (err && !ignore_call_fail) + goto chain_unregister; + + return 0; + +chain_unregister: + raw_notifier_chain_unregister(&net->netdev_chain, nb); + return err; +} + +static int __unregister_netdevice_notifier_net(struct net *net, + struct notifier_block *nb) +{ + int err; + + err = raw_notifier_chain_unregister(&net->netdev_chain, nb); + if (err) + return err; + + call_netdevice_unregister_net_notifiers(nb, net); + return 0; +} + /** * register_netdevice_notifier_net - register a per-netns network notifier block * @net: network namespace @@ -1812,23 +1840,9 @@ int register_netdevice_notifier_net(struct net *net, struct notifier_block *nb) int err; rtnl_lock(); - err = raw_notifier_chain_register(&net->netdev_chain, nb); - if (err) - goto unlock; - if (dev_boot_phase) - goto unlock; - - err = call_netdevice_register_net_notifiers(nb, net); - if (err) - goto chain_unregister; - -unlock: + err = __register_netdevice_notifier_net(net, nb, false); rtnl_unlock(); return err; - -chain_unregister: - raw_notifier_chain_unregister(&netdev_chain, nb); - goto unlock; } EXPORT_SYMBOL(register_netdevice_notifier_net); @@ -1854,17 +1868,53 @@ int unregister_netdevice_notifier_net(struct net *net, int err; rtnl_lock(); - err = raw_notifier_chain_unregister(&net->netdev_chain, nb); - if (err) - goto unlock; + err = __unregister_netdevice_notifier_net(net, nb); + rtnl_unlock(); + return err; +} +EXPORT_SYMBOL(unregister_netdevice_notifier_net); - call_netdevice_unregister_net_notifiers(nb, net); +int register_netdevice_notifier_dev_net(struct net_device *dev, + struct notifier_block *nb, + struct netdev_net_notifier *nn) +{ + int err; -unlock: + rtnl_lock(); + err = __register_netdevice_notifier_net(dev_net(dev), nb, false); + if (!err) { + nn->nb = nb; + list_add(&nn->list, &dev->net_notifier_list); + } rtnl_unlock(); return err; } -EXPORT_SYMBOL(unregister_netdevice_notifier_net); +EXPORT_SYMBOL(register_netdevice_notifier_dev_net); + +int unregister_netdevice_notifier_dev_net(struct net_device *dev, + struct notifier_block *nb, + struct netdev_net_notifier *nn) +{ + int err; + + rtnl_lock(); + list_del(&nn->list); + err = __unregister_netdevice_notifier_net(dev_net(dev), nb); + rtnl_unlock(); + return err; +} +EXPORT_SYMBOL(unregister_netdevice_notifier_dev_net); + +static void move_netdevice_notifiers_dev_net(struct net_device *dev, + struct net *net) +{ + struct netdev_net_notifier *nn; + + list_for_each_entry(nn, &dev->net_notifier_list, list) { + __unregister_netdevice_notifier_net(dev_net(dev), nn->nb); + __register_netdevice_notifier_net(net, nn->nb, true); + } +} /** * call_netdevice_notifiers_info - call all network notifier blocks @@ -3249,7 +3299,7 @@ struct sk_buff *__skb_gso_segment(struct sk_buff *skb, segs = skb_mac_gso_segment(skb, features); - if (unlikely(skb_needs_check(skb, tx_path) && !IS_ERR(segs))) + if (segs != skb && unlikely(skb_needs_check(skb, tx_path) && !IS_ERR(segs))) skb_warn_bad_offload(skb); return segs; @@ -5489,9 +5539,29 @@ static void flush_all_backlogs(void) put_online_cpus(); } +/* Pass the currently batched GRO_NORMAL SKBs up to the stack. */ +static void gro_normal_list(struct napi_struct *napi) +{ + if (!napi->rx_count) + return; + netif_receive_skb_list_internal(&napi->rx_list); + INIT_LIST_HEAD(&napi->rx_list); + napi->rx_count = 0; +} + +/* Queue one GRO_NORMAL SKB up for list processing. If batch size exceeded, + * pass the whole batch up to the stack. + */ +static void gro_normal_one(struct napi_struct *napi, struct sk_buff *skb) +{ + list_add_tail(&skb->list, &napi->rx_list); + if (++napi->rx_count >= gro_normal_batch) + gro_normal_list(napi); +} + INDIRECT_CALLABLE_DECLARE(int inet_gro_complete(struct sk_buff *, int)); INDIRECT_CALLABLE_DECLARE(int ipv6_gro_complete(struct sk_buff *, int)); -static int napi_gro_complete(struct sk_buff *skb) +static int napi_gro_complete(struct napi_struct *napi, struct sk_buff *skb) { struct packet_offload *ptype; __be16 type = skb->protocol; @@ -5524,7 +5594,8 @@ static int napi_gro_complete(struct sk_buff *skb) } out: - return netif_receive_skb_internal(skb); + gro_normal_one(napi, skb); + return NET_RX_SUCCESS; } static void __napi_gro_flush_chain(struct napi_struct *napi, u32 index, @@ -5537,7 +5608,7 @@ static void __napi_gro_flush_chain(struct napi_struct *napi, u32 index, if (flush_old && NAPI_GRO_CB(skb)->age == jiffies) return; skb_list_del_init(skb); - napi_gro_complete(skb); + napi_gro_complete(napi, skb); napi->gro_hash[index].count--; } @@ -5639,7 +5710,7 @@ static void gro_pull_from_frag0(struct sk_buff *skb, int grow) } } -static void gro_flush_oldest(struct list_head *head) +static void gro_flush_oldest(struct napi_struct *napi, struct list_head *head) { struct sk_buff *oldest; @@ -5655,7 +5726,7 @@ static void gro_flush_oldest(struct list_head *head) * SKB to the chain. */ skb_list_del_init(oldest); - napi_gro_complete(oldest); + napi_gro_complete(napi, oldest); } INDIRECT_CALLABLE_DECLARE(struct sk_buff *inet_gro_receive(struct list_head *, @@ -5731,7 +5802,7 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff if (pp) { skb_list_del_init(pp); - napi_gro_complete(pp); + napi_gro_complete(napi, pp); napi->gro_hash[hash].count--; } @@ -5742,7 +5813,7 @@ static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff goto normal; if (unlikely(napi->gro_hash[hash].count >= MAX_GRO_SKBS)) { - gro_flush_oldest(gro_head); + gro_flush_oldest(napi, gro_head); } else { napi->gro_hash[hash].count++; } @@ -5800,26 +5871,6 @@ struct packet_offload *gro_find_complete_by_type(__be16 type) } EXPORT_SYMBOL(gro_find_complete_by_type); -/* Pass the currently batched GRO_NORMAL SKBs up to the stack. */ -static void gro_normal_list(struct napi_struct *napi) -{ - if (!napi->rx_count) - return; - netif_receive_skb_list_internal(&napi->rx_list); - INIT_LIST_HEAD(&napi->rx_list); - napi->rx_count = 0; -} - -/* Queue one GRO_NORMAL SKB up for list processing. If batch size exceeded, - * pass the whole batch up to the stack. - */ -static void gro_normal_one(struct napi_struct *napi, struct sk_buff *skb) -{ - list_add_tail(&skb->list, &napi->rx_list); - if (++napi->rx_count >= gro_normal_batch) - gro_normal_list(napi); -} - static void napi_skb_free_stolen_head(struct sk_buff *skb) { skb_dst_drop(skb); @@ -6198,8 +6249,6 @@ bool napi_complete_done(struct napi_struct *n, int work_done) NAPIF_STATE_IN_BUSY_POLL))) return false; - gro_normal_list(n); - if (n->gro_bitmask) { unsigned long timeout = 0; @@ -6215,6 +6264,9 @@ bool napi_complete_done(struct napi_struct *n, int work_done) hrtimer_start(&n->timer, ns_to_ktime(timeout), HRTIMER_MODE_REL_PINNED); } + + gro_normal_list(n); + if (unlikely(!list_empty(&n->poll_list))) { /* If n->poll_list is not empty, we need to mask irqs */ local_irq_save(flags); @@ -6546,8 +6598,6 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll) goto out_unlock; } - gro_normal_list(n); - if (n->gro_bitmask) { /* flush too old packets * If HZ < 1000, flush all packets. @@ -6555,6 +6605,8 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll) napi_gro_flush(n, HZ >= 1000); } + gro_normal_list(n); + /* Some drivers may have called napi_schedule * prior to exhausting their budget. */ @@ -8192,6 +8244,22 @@ int __dev_set_mtu(struct net_device *dev, int new_mtu) } EXPORT_SYMBOL(__dev_set_mtu); +int dev_validate_mtu(struct net_device *dev, int new_mtu, + struct netlink_ext_ack *extack) +{ + /* MTU must be positive, and in range */ + if (new_mtu < 0 || new_mtu < dev->min_mtu) { + NL_SET_ERR_MSG(extack, "mtu less than device minimum"); + return -EINVAL; + } + + if (dev->max_mtu > 0 && new_mtu > dev->max_mtu) { + NL_SET_ERR_MSG(extack, "mtu greater than device maximum"); + return -EINVAL; + } + return 0; +} + /** * dev_set_mtu_ext - Change maximum transfer unit * @dev: device @@ -8208,16 +8276,9 @@ int dev_set_mtu_ext(struct net_device *dev, int new_mtu, if (new_mtu == dev->mtu) return 0; - /* MTU must be positive, and in range */ - if (new_mtu < 0 || new_mtu < dev->min_mtu) { - NL_SET_ERR_MSG(extack, "mtu less than device minimum"); - return -EINVAL; - } - - if (dev->max_mtu > 0 && new_mtu > dev->max_mtu) { - NL_SET_ERR_MSG(extack, "mtu greater than device maximum"); - return -EINVAL; - } + err = dev_validate_mtu(dev, new_mtu, extack); + if (err) + return err; if (!netif_device_present(dev)) return -ENODEV; @@ -9272,7 +9333,7 @@ int register_netdevice(struct net_device *dev) /* Transfer changeable features to wanted_features and enable * software offloads (GSO and GRO). */ - dev->hw_features |= NETIF_F_SOFT_FEATURES; + dev->hw_features |= (NETIF_F_SOFT_FEATURES | NETIF_F_SOFT_FEATURES_OFF); dev->features |= NETIF_F_SOFT_FEATURES; if (dev->netdev_ops->ndo_udp_tunnel_add) { @@ -9317,8 +9378,10 @@ int register_netdevice(struct net_device *dev) goto err_uninit; ret = netdev_register_kobject(dev); - if (ret) + if (ret) { + dev->reg_state = NETREG_UNREGISTERED; goto err_uninit; + } dev->reg_state = NETREG_REGISTERED; __netdev_update_features(dev); @@ -9765,6 +9828,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name, INIT_LIST_HEAD(&dev->adj_list.lower); INIT_LIST_HEAD(&dev->ptype_all); INIT_LIST_HEAD(&dev->ptype_specific); + INIT_LIST_HEAD(&dev->net_notifier_list); #ifdef CONFIG_NET_SCHED hash_init(dev->qdisc_hash); #endif @@ -10028,6 +10092,9 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char kobject_uevent(&dev->dev.kobj, KOBJ_REMOVE); netdev_adjacent_del_links(dev); + /* Move per-net netdevice notifiers that are following the netdevice */ + move_netdevice_notifiers_dev_net(dev, net); + /* Actually switch the network namespace */ dev_net_set(dev, net); dev->ifindex = new_ifindex; diff --git a/net/core/devlink.c b/net/core/devlink.c index 64367eeb21e6..ca1df0ec3c97 100644 --- a/net/core/devlink.c +++ b/net/core/devlink.c @@ -4843,6 +4843,93 @@ devlink_health_reporter_destroy(struct devlink_health_reporter *reporter) } EXPORT_SYMBOL_GPL(devlink_health_reporter_destroy); +static int +devlink_nl_health_reporter_fill(struct sk_buff *msg, + struct devlink *devlink, + struct devlink_health_reporter *reporter, + enum devlink_command cmd, u32 portid, + u32 seq, int flags) +{ + struct nlattr *reporter_attr; + void *hdr; + + hdr = genlmsg_put(msg, portid, seq, &devlink_nl_family, flags, cmd); + if (!hdr) + return -EMSGSIZE; + + if (devlink_nl_put_handle(msg, devlink)) + goto genlmsg_cancel; + + reporter_attr = nla_nest_start_noflag(msg, + DEVLINK_ATTR_HEALTH_REPORTER); + if (!reporter_attr) + goto genlmsg_cancel; + if (nla_put_string(msg, DEVLINK_ATTR_HEALTH_REPORTER_NAME, + reporter->ops->name)) + goto reporter_nest_cancel; + if (nla_put_u8(msg, DEVLINK_ATTR_HEALTH_REPORTER_STATE, + reporter->health_state)) + goto reporter_nest_cancel; + if (nla_put_u64_64bit(msg, DEVLINK_ATTR_HEALTH_REPORTER_ERR_COUNT, + reporter->error_count, DEVLINK_ATTR_PAD)) + goto reporter_nest_cancel; + if (nla_put_u64_64bit(msg, DEVLINK_ATTR_HEALTH_REPORTER_RECOVER_COUNT, + reporter->recovery_count, DEVLINK_ATTR_PAD)) + goto reporter_nest_cancel; + if (reporter->ops->recover && + nla_put_u64_64bit(msg, DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD, + reporter->graceful_period, + DEVLINK_ATTR_PAD)) + goto reporter_nest_cancel; + if (reporter->ops->recover && + nla_put_u8(msg, DEVLINK_ATTR_HEALTH_REPORTER_AUTO_RECOVER, + reporter->auto_recover)) + goto reporter_nest_cancel; + if (reporter->dump_fmsg && + nla_put_u64_64bit(msg, DEVLINK_ATTR_HEALTH_REPORTER_DUMP_TS, + jiffies_to_msecs(reporter->dump_ts), + DEVLINK_ATTR_PAD)) + goto reporter_nest_cancel; + if (reporter->dump_fmsg && + nla_put_u64_64bit(msg, DEVLINK_ATTR_HEALTH_REPORTER_DUMP_TS_NS, + reporter->dump_real_ts, DEVLINK_ATTR_PAD)) + goto reporter_nest_cancel; + + nla_nest_end(msg, reporter_attr); + genlmsg_end(msg, hdr); + return 0; + +reporter_nest_cancel: + nla_nest_end(msg, reporter_attr); +genlmsg_cancel: + genlmsg_cancel(msg, hdr); + return -EMSGSIZE; +} + +static void devlink_recover_notify(struct devlink_health_reporter *reporter, + enum devlink_command cmd) +{ + struct sk_buff *msg; + int err; + + WARN_ON(cmd != DEVLINK_CMD_HEALTH_REPORTER_RECOVER); + + msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); + if (!msg) + return; + + err = devlink_nl_health_reporter_fill(msg, reporter->devlink, + reporter, cmd, 0, 0, 0); + if (err) { + nlmsg_free(msg); + return; + } + + genlmsg_multicast_netns(&devlink_nl_family, + devlink_net(reporter->devlink), + msg, 0, DEVLINK_MCGRP_CONFIG, GFP_KERNEL); +} + void devlink_health_reporter_recovery_done(struct devlink_health_reporter *reporter) { @@ -4869,6 +4956,7 @@ devlink_health_reporter_recover(struct devlink_health_reporter *reporter, devlink_health_reporter_recovery_done(reporter); reporter->health_state = DEVLINK_HEALTH_REPORTER_STATE_HEALTHY; + devlink_recover_notify(reporter, DEVLINK_CMD_HEALTH_REPORTER_RECOVER); return 0; } @@ -4935,6 +5023,7 @@ int devlink_health_report(struct devlink_health_reporter *reporter, reporter->error_count++; prev_health_state = reporter->health_state; reporter->health_state = DEVLINK_HEALTH_REPORTER_STATE_ERROR; + devlink_recover_notify(reporter, DEVLINK_CMD_HEALTH_REPORTER_RECOVER); /* abort if the previous error wasn't recovered */ if (reporter->auto_recover && @@ -5017,93 +5106,6 @@ devlink_health_reporter_put(struct devlink_health_reporter *reporter) refcount_dec(&reporter->refcount); } -static int -devlink_nl_health_reporter_fill(struct sk_buff *msg, - struct devlink *devlink, - struct devlink_health_reporter *reporter, - enum devlink_command cmd, u32 portid, - u32 seq, int flags) -{ - struct nlattr *reporter_attr; - void *hdr; - - hdr = genlmsg_put(msg, portid, seq, &devlink_nl_family, flags, cmd); - if (!hdr) - return -EMSGSIZE; - - if (devlink_nl_put_handle(msg, devlink)) - goto genlmsg_cancel; - - reporter_attr = nla_nest_start_noflag(msg, - DEVLINK_ATTR_HEALTH_REPORTER); - if (!reporter_attr) - goto genlmsg_cancel; - if (nla_put_string(msg, DEVLINK_ATTR_HEALTH_REPORTER_NAME, - reporter->ops->name)) - goto reporter_nest_cancel; - if (nla_put_u8(msg, DEVLINK_ATTR_HEALTH_REPORTER_STATE, - reporter->health_state)) - goto reporter_nest_cancel; - if (nla_put_u64_64bit(msg, DEVLINK_ATTR_HEALTH_REPORTER_ERR_COUNT, - reporter->error_count, DEVLINK_ATTR_PAD)) - goto reporter_nest_cancel; - if (nla_put_u64_64bit(msg, DEVLINK_ATTR_HEALTH_REPORTER_RECOVER_COUNT, - reporter->recovery_count, DEVLINK_ATTR_PAD)) - goto reporter_nest_cancel; - if (reporter->ops->recover && - nla_put_u64_64bit(msg, DEVLINK_ATTR_HEALTH_REPORTER_GRACEFUL_PERIOD, - reporter->graceful_period, - DEVLINK_ATTR_PAD)) - goto reporter_nest_cancel; - if (reporter->ops->recover && - nla_put_u8(msg, DEVLINK_ATTR_HEALTH_REPORTER_AUTO_RECOVER, - reporter->auto_recover)) - goto reporter_nest_cancel; - if (reporter->dump_fmsg && - nla_put_u64_64bit(msg, DEVLINK_ATTR_HEALTH_REPORTER_DUMP_TS, - jiffies_to_msecs(reporter->dump_ts), - DEVLINK_ATTR_PAD)) - goto reporter_nest_cancel; - if (reporter->dump_fmsg && - nla_put_u64_64bit(msg, DEVLINK_ATTR_HEALTH_REPORTER_DUMP_TS_NS, - reporter->dump_real_ts, DEVLINK_ATTR_PAD)) - goto reporter_nest_cancel; - - nla_nest_end(msg, reporter_attr); - genlmsg_end(msg, hdr); - return 0; - -reporter_nest_cancel: - nla_nest_end(msg, reporter_attr); -genlmsg_cancel: - genlmsg_cancel(msg, hdr); - return -EMSGSIZE; -} - -static void devlink_recover_notify(struct devlink_health_reporter *reporter, - enum devlink_command cmd) -{ - struct sk_buff *msg; - int err; - - WARN_ON(cmd != DEVLINK_CMD_HEALTH_REPORTER_RECOVER); - - msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); - if (!msg) - return; - - err = devlink_nl_health_reporter_fill(msg, reporter->devlink, - reporter, cmd, 0, 0, 0); - if (err) { - nlmsg_free(msg); - return; - } - - genlmsg_multicast_netns(&devlink_nl_family, - devlink_net(reporter->devlink), - msg, 0, DEVLINK_MCGRP_CONFIG, GFP_KERNEL); -} - void devlink_health_reporter_state_update(struct devlink_health_reporter *reporter, enum devlink_health_reporter_state state) diff --git a/net/core/neighbour.c b/net/core/neighbour.c index 920784a9b7ff..789a73aa7bd8 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -3290,6 +3290,7 @@ static void *neigh_stat_seq_next(struct seq_file *seq, void *v, loff_t *pos) *pos = cpu+1; return per_cpu_ptr(tbl->stats, cpu); } + (*pos)++; return NULL; } diff --git a/net/core/pktgen.c b/net/core/pktgen.c index 890be1b4877e..294bfcf0ce0e 100644 --- a/net/core/pktgen.c +++ b/net/core/pktgen.c @@ -323,10 +323,6 @@ struct pktgen_dev { struct in6_addr max_in6_daddr; struct in6_addr min_in6_saddr; struct in6_addr max_in6_saddr; - u64 max_in6_h; - u64 max_in6_l; - u64 min_in6_h; - u64 min_in6_l; /* If we're doing ranges, random or incremental, then this * defines the min/max for those ranges. @@ -1359,59 +1355,6 @@ static ssize_t pktgen_if_write(struct file *file, sprintf(pg_result, "OK: dst6_max=%s", buf); return count; } - if (!strcmp(name, "src6_min")) { - len = strn_len(&user_buffer[i], sizeof(buf) - 1); - if (len < 0) - return len; - - pkt_dev->flags |= F_IPV6; - - if (copy_from_user(buf, &user_buffer[i], len)) - return -EFAULT; - buf[len] = 0; - - in6_pton(buf, -1, pkt_dev->min_in6_saddr.s6_addr, -1, NULL); - snprintf(buf, sizeof(buf), "%pI6c", &pkt_dev->min_in6_saddr); - - memcpy(&pkt_dev->min_in6_h, pkt_dev->min_in6_saddr.s6_addr, 8); - memcpy(&pkt_dev->min_in6_l, pkt_dev->min_in6_saddr.s6_addr + 8, 8); - pkt_dev->min_in6_h = be64_to_cpu(pkt_dev->min_in6_h); - pkt_dev->min_in6_l = be64_to_cpu(pkt_dev->min_in6_l); - - pkt_dev->cur_in6_saddr = pkt_dev->min_in6_saddr; - if (debug) - pr_debug("src6_min set to: %s\n", buf); - - i += len; - sprintf(pg_result, "OK: src6_min=%s", buf); - return count; - } - if (!strcmp(name, "src6_max")) { - len = strn_len(&user_buffer[i], sizeof(buf) - 1); - if (len < 0) - return len; - - pkt_dev->flags |= F_IPV6; - - if (copy_from_user(buf, &user_buffer[i], len)) - return -EFAULT; - buf[len] = 0; - - in6_pton(buf, -1, pkt_dev->max_in6_saddr.s6_addr, -1, NULL); - snprintf(buf, sizeof(buf), "%pI6c", &pkt_dev->max_in6_saddr); - - memcpy(&pkt_dev->max_in6_h, pkt_dev->max_in6_saddr.s6_addr, 8); - memcpy(&pkt_dev->max_in6_l, pkt_dev->max_in6_saddr.s6_addr + 8, 8); - pkt_dev->max_in6_h = be64_to_cpu(pkt_dev->max_in6_h); - pkt_dev->max_in6_l = be64_to_cpu(pkt_dev->max_in6_l); - - if (debug) - pr_debug("src6_max set to: %s\n", buf); - - i += len; - sprintf(pg_result, "OK: src6_max=%s", buf); - return count; - } if (!strcmp(name, "src6")) { len = strn_len(&user_buffer[i], sizeof(buf) - 1); if (len < 0) @@ -2343,45 +2286,6 @@ static void set_cur_queue_map(struct pktgen_dev *pkt_dev) pkt_dev->cur_queue_map = pkt_dev->cur_queue_map % pkt_dev->odev->real_num_tx_queues; } -/* generate ipv6 source addr */ -static void set_src_in6_addr(struct pktgen_dev *pkt_dev) -{ - u64 min6, max6, rand, i; - struct in6_addr addr6; - __be64 addr_l, *t; - - min6 = pkt_dev->min_in6_l; - max6 = pkt_dev->max_in6_l; - - /* only generate source address in least significant 64 bits range - * most significant 64 bits must be equal - */ - if (pkt_dev->max_in6_h != pkt_dev->min_in6_h || min6 >= max6) - return; - - addr6 = pkt_dev->min_in6_saddr; - t = (__be64 *)addr6.s6_addr + 1; - - if (pkt_dev->flags & F_IPSRC_RND) { - do { - prandom_bytes(&rand, sizeof(rand)); - rand = rand % (max6 - min6) + min6; - addr_l = cpu_to_be64(rand); - memcpy(t, &addr_l, 8); - } while (ipv6_addr_loopback(&addr6) || - ipv6_addr_v4mapped(&addr6) || - ipv6_addr_is_multicast(&addr6)); - } else { - addr6 = pkt_dev->cur_in6_saddr; - i = be64_to_cpu(*t); - if (++i > max6) - i = min6; - addr_l = cpu_to_be64(i); - memcpy(t, &addr_l, 8); - } - pkt_dev->cur_in6_saddr = addr6; -} - /* Increment/randomize headers according to flags and current values * for IP src/dest, UDP src/dst port, MAC-Addr src/dst */ @@ -2550,8 +2454,6 @@ static void mod_cur_headers(struct pktgen_dev *pkt_dev) } } else { /* IPV6 * */ - set_src_in6_addr(pkt_dev); - if (!ipv6_addr_any(&pkt_dev->min_in6_daddr)) { int i; diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 20bc406f3871..cdad6ed532c4 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -3053,8 +3053,17 @@ struct net_device *rtnl_create_link(struct net *net, const char *ifname, dev->rtnl_link_ops = ops; dev->rtnl_link_state = RTNL_LINK_INITIALIZING; - if (tb[IFLA_MTU]) - dev->mtu = nla_get_u32(tb[IFLA_MTU]); + if (tb[IFLA_MTU]) { + u32 mtu = nla_get_u32(tb[IFLA_MTU]); + int err; + + err = dev_validate_mtu(dev, mtu, extack); + if (err) { + free_netdev(dev); + return ERR_PTR(err); + } + dev->mtu = mtu; + } if (tb[IFLA_ADDRESS]) { memcpy(dev->dev_addr, nla_data(tb[IFLA_ADDRESS]), nla_len(tb[IFLA_ADDRESS])); diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 48a7029529c9..864cb9e9622f 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -3639,6 +3639,97 @@ static inline skb_frag_t skb_head_frag_to_page_desc(struct sk_buff *frag_skb) return head_frag; } +struct sk_buff *skb_segment_list(struct sk_buff *skb, + netdev_features_t features, + unsigned int offset) +{ + struct sk_buff *list_skb = skb_shinfo(skb)->frag_list; + unsigned int tnl_hlen = skb_tnl_header_len(skb); + unsigned int delta_truesize = 0; + unsigned int delta_len = 0; + struct sk_buff *tail = NULL; + struct sk_buff *nskb; + + skb_push(skb, -skb_network_offset(skb) + offset); + + skb_shinfo(skb)->frag_list = NULL; + + do { + nskb = list_skb; + list_skb = list_skb->next; + + if (!tail) + skb->next = nskb; + else + tail->next = nskb; + + tail = nskb; + + delta_len += nskb->len; + delta_truesize += nskb->truesize; + + skb_push(nskb, -skb_network_offset(nskb) + offset); + + __copy_skb_header(nskb, skb); + + skb_headers_offset_update(nskb, skb_headroom(nskb) - skb_headroom(skb)); + skb_copy_from_linear_data_offset(skb, -tnl_hlen, + nskb->data - tnl_hlen, + offset + tnl_hlen); + + if (skb_needs_linearize(nskb, features) && + __skb_linearize(nskb)) + goto err_linearize; + + } while (list_skb); + + skb->truesize = skb->truesize - delta_truesize; + skb->data_len = skb->data_len - delta_len; + skb->len = skb->len - delta_len; + + skb_gso_reset(skb); + + skb->prev = tail; + + if (skb_needs_linearize(skb, features) && + __skb_linearize(skb)) + goto err_linearize; + + skb_get(skb); + + return skb; + +err_linearize: + kfree_skb_list(skb->next); + skb->next = NULL; + return ERR_PTR(-ENOMEM); +} +EXPORT_SYMBOL_GPL(skb_segment_list); + +int skb_gro_receive_list(struct sk_buff *p, struct sk_buff *skb) +{ + if (unlikely(p->len + skb->len >= 65536)) + return -E2BIG; + + if (NAPI_GRO_CB(p)->last == p) + skb_shinfo(p)->frag_list = skb; + else + NAPI_GRO_CB(p)->last->next = skb; + + skb_pull(skb, skb_gro_offset(skb)); + + NAPI_GRO_CB(p)->last = skb; + NAPI_GRO_CB(p)->count++; + p->data_len += skb->len; + p->truesize += skb->truesize; + p->len += skb->len; + + NAPI_GRO_CB(skb)->same_flow = 1; + + return 0; +} +EXPORT_SYMBOL_GPL(skb_gro_receive_list); + /** * skb_segment - Perform protocol segmentation on skb. * @head_skb: buffer to segment diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 3866d7e20c07..ded2d5227678 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -594,8 +594,6 @@ EXPORT_SYMBOL_GPL(sk_psock_destroy); void sk_psock_drop(struct sock *sk, struct sk_psock *psock) { - sock_owned_by_me(sk); - sk_psock_cork_free(psock); sk_psock_zap_ingress(psock); diff --git a/net/core/sock_reuseport.c b/net/core/sock_reuseport.c index f19f179538b9..91e9f2223c39 100644 --- a/net/core/sock_reuseport.c +++ b/net/core/sock_reuseport.c @@ -107,7 +107,6 @@ static struct sock_reuseport *reuseport_grow(struct sock_reuseport *reuse) if (!more_reuse) return NULL; - more_reuse->max_socks = more_socks_size; more_reuse->num_socks = reuse->num_socks; more_reuse->prog = reuse->prog; more_reuse->reuseport_id = reuse->reuseport_id; diff --git a/net/core/utils.c b/net/core/utils.c index 6b6e51db9f3b..1f31a39236d5 100644 --- a/net/core/utils.c +++ b/net/core/utils.c @@ -438,6 +438,23 @@ void inet_proto_csum_replace4(__sum16 *sum, struct sk_buff *skb, } EXPORT_SYMBOL(inet_proto_csum_replace4); +/** + * inet_proto_csum_replace16 - update layer 4 header checksum field + * @sum: Layer 4 header checksum field + * @skb: sk_buff for the packet + * @from: old IPv6 address + * @to: new IPv6 address + * @pseudohdr: True if layer 4 header checksum includes pseudoheader + * + * Update layer 4 header as per the update in IPv6 src/dst address. + * + * There is no need to update skb->csum in this function, because update in two + * fields a.) IPv6 src/dst address and b.) L4 header checksum cancels each other + * for skb->csum calculation. Whereas inet_proto_csum_replace4 function needs to + * update skb->csum, because update in 3 fields a.) IPv4 src/dst address, + * b.) IPv4 Header checksum and c.) L4 header checksum results in same diff as + * L4 Header checksum for skb->csum calculation. + */ void inet_proto_csum_replace16(__sum16 *sum, struct sk_buff *skb, const __be32 *from, const __be32 *to, bool pseudohdr) @@ -449,9 +466,6 @@ void inet_proto_csum_replace16(__sum16 *sum, struct sk_buff *skb, if (skb->ip_summed != CHECKSUM_PARTIAL) { *sum = csum_fold(csum_partial(diff, sizeof(diff), ~csum_unfold(*sum))); - if (skb->ip_summed == CHECKSUM_COMPLETE && pseudohdr) - skb->csum = ~csum_partial(diff, sizeof(diff), - ~skb->csum); } else if (pseudohdr) *sum = ~csum_fold(csum_partial(diff, sizeof(diff), csum_unfold(*sum))); diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c index c6d81f2baf4e..e7c30b472034 100644 --- a/net/dsa/dsa2.c +++ b/net/dsa/dsa2.c @@ -849,6 +849,19 @@ static int dsa_switch_parse(struct dsa_switch *ds, struct dsa_chip_data *cd) return dsa_switch_parse_ports(ds, cd); } +static void dsa_switch_release_ports(struct dsa_switch *ds) +{ + struct dsa_switch_tree *dst = ds->dst; + struct dsa_port *dp, *next; + + list_for_each_entry_safe(dp, next, &dst->ports, list) { + if (dp->ds != ds) + continue; + list_del(&dp->list); + kfree(dp); + } +} + static int dsa_switch_probe(struct dsa_switch *ds) { struct dsa_switch_tree *dst; @@ -865,12 +878,17 @@ static int dsa_switch_probe(struct dsa_switch *ds) if (!ds->num_ports) return -EINVAL; - if (np) + if (np) { err = dsa_switch_parse_of(ds, np); - else if (pdata) + if (err) + dsa_switch_release_ports(ds); + } else if (pdata) { err = dsa_switch_parse(ds, pdata); - else + if (err) + dsa_switch_release_ports(ds); + } else { err = -ENODEV; + } if (err) return err; @@ -878,8 +896,10 @@ static int dsa_switch_probe(struct dsa_switch *ds) dst = ds->dst; dsa_tree_get(dst); err = dsa_tree_setup(dst); - if (err) + if (err) { + dsa_switch_release_ports(ds); dsa_tree_put(dst); + } return err; } @@ -900,15 +920,9 @@ EXPORT_SYMBOL_GPL(dsa_register_switch); static void dsa_switch_remove(struct dsa_switch *ds) { struct dsa_switch_tree *dst = ds->dst; - struct dsa_port *dp, *next; dsa_tree_teardown(dst); - - list_for_each_entry_safe(dp, next, &dst->ports, list) { - list_del(&dp->list); - kfree(dp); - } - + dsa_switch_release_ports(ds); dsa_tree_put(dst); } diff --git a/net/ethernet/eth.c b/net/ethernet/eth.c index 9040fe55e0f5..c8b903302ff2 100644 --- a/net/ethernet/eth.c +++ b/net/ethernet/eth.c @@ -335,22 +335,6 @@ int eth_mac_addr(struct net_device *dev, void *p) } EXPORT_SYMBOL(eth_mac_addr); -/** - * eth_change_mtu - set new MTU size - * @dev: network device - * @new_mtu: new Maximum Transfer Unit - * - * Allow changing MTU size. Needs to be overridden for devices - * supporting jumbo frames. - */ -int eth_change_mtu(struct net_device *dev, int new_mtu) -{ - netdev_warn(dev, "%s is deprecated\n", __func__); - dev->mtu = new_mtu; - return 0; -} -EXPORT_SYMBOL(eth_change_mtu); - int eth_validate_addr(struct net_device *dev) { if (!is_valid_ether_addr(dev->dev_addr)) diff --git a/net/ethtool/Makefile b/net/ethtool/Makefile index 9a1332fb0cc6..424545a4aaec 100644 --- a/net/ethtool/Makefile +++ b/net/ethtool/Makefile @@ -5,4 +5,4 @@ obj-y += ioctl.o common.o obj-$(CONFIG_ETHTOOL_NETLINK) += ethtool_nl.o ethtool_nl-y := netlink.o bitset.o strset.o linkinfo.o linkmodes.o \ - linkstate.o + linkstate.o debug.o wol.o diff --git a/net/ethtool/common.c b/net/ethtool/common.c index e621b1694d2f..636ec6d5110e 100644 --- a/net/ethtool/common.c +++ b/net/ethtool/common.c @@ -59,6 +59,7 @@ const char netdev_features_strings[NETDEV_FEATURE_COUNT][ETH_GSTRING_LEN] = { [NETIF_F_HW_TLS_RECORD_BIT] = "tls-hw-record", [NETIF_F_HW_TLS_TX_BIT] = "tls-hw-tx-offload", [NETIF_F_HW_TLS_RX_BIT] = "tls-hw-rx-offload", + [NETIF_F_GRO_FRAGLIST_BIT] = "rx-gro-list", }; const char @@ -170,6 +171,37 @@ const char link_mode_names[][ETH_GSTRING_LEN] = { }; static_assert(ARRAY_SIZE(link_mode_names) == __ETHTOOL_LINK_MODE_MASK_NBITS); +const char netif_msg_class_names[][ETH_GSTRING_LEN] = { + [NETIF_MSG_DRV_BIT] = "drv", + [NETIF_MSG_PROBE_BIT] = "probe", + [NETIF_MSG_LINK_BIT] = "link", + [NETIF_MSG_TIMER_BIT] = "timer", + [NETIF_MSG_IFDOWN_BIT] = "ifdown", + [NETIF_MSG_IFUP_BIT] = "ifup", + [NETIF_MSG_RX_ERR_BIT] = "rx_err", + [NETIF_MSG_TX_ERR_BIT] = "tx_err", + [NETIF_MSG_TX_QUEUED_BIT] = "tx_queued", + [NETIF_MSG_INTR_BIT] = "intr", + [NETIF_MSG_TX_DONE_BIT] = "tx_done", + [NETIF_MSG_RX_STATUS_BIT] = "rx_status", + [NETIF_MSG_PKTDATA_BIT] = "pktdata", + [NETIF_MSG_HW_BIT] = "hw", + [NETIF_MSG_WOL_BIT] = "wol", +}; +static_assert(ARRAY_SIZE(netif_msg_class_names) == NETIF_MSG_CLASS_COUNT); + +const char wol_mode_names[][ETH_GSTRING_LEN] = { + [const_ilog2(WAKE_PHY)] = "phy", + [const_ilog2(WAKE_UCAST)] = "ucast", + [const_ilog2(WAKE_MCAST)] = "mcast", + [const_ilog2(WAKE_BCAST)] = "bcast", + [const_ilog2(WAKE_ARP)] = "arp", + [const_ilog2(WAKE_MAGIC)] = "magic", + [const_ilog2(WAKE_MAGICSECURE)] = "magicsecure", + [const_ilog2(WAKE_FILTER)] = "filter", +}; +static_assert(ARRAY_SIZE(wol_mode_names) == WOL_MODE_COUNT); + /* return false if legacy contained non-0 deprecated fields * maxtxpkt/maxrxpkt. rest of ksettings always updated */ diff --git a/net/ethtool/common.h b/net/ethtool/common.h index 5c5f7dc90cd4..40ba74e0b9bb 100644 --- a/net/ethtool/common.h +++ b/net/ethtool/common.h @@ -19,6 +19,8 @@ tunable_strings[__ETHTOOL_TUNABLE_COUNT][ETH_GSTRING_LEN]; extern const char phy_tunable_strings[__ETHTOOL_PHY_TUNABLE_COUNT][ETH_GSTRING_LEN]; extern const char link_mode_names[][ETH_GSTRING_LEN]; +extern const char netif_msg_class_names[][ETH_GSTRING_LEN]; +extern const char wol_mode_names[][ETH_GSTRING_LEN]; int __ethtool_get_link(struct net_device *dev); diff --git a/net/ethtool/debug.c b/net/ethtool/debug.c new file mode 100644 index 000000000000..aaef4843e6ba --- /dev/null +++ b/net/ethtool/debug.c @@ -0,0 +1,134 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include "netlink.h" +#include "common.h" +#include "bitset.h" + +struct debug_req_info { + struct ethnl_req_info base; +}; + +struct debug_reply_data { + struct ethnl_reply_data base; + u32 msg_mask; +}; + +#define DEBUG_REPDATA(__reply_base) \ + container_of(__reply_base, struct debug_reply_data, base) + +static const struct nla_policy +debug_get_policy[ETHTOOL_A_DEBUG_MAX + 1] = { + [ETHTOOL_A_DEBUG_UNSPEC] = { .type = NLA_REJECT }, + [ETHTOOL_A_DEBUG_HEADER] = { .type = NLA_NESTED }, + [ETHTOOL_A_DEBUG_MSGMASK] = { .type = NLA_REJECT }, +}; + +static int debug_prepare_data(const struct ethnl_req_info *req_base, + struct ethnl_reply_data *reply_base, + struct genl_info *info) +{ + struct debug_reply_data *data = DEBUG_REPDATA(reply_base); + struct net_device *dev = reply_base->dev; + int ret; + + if (!dev->ethtool_ops->get_msglevel) + return -EOPNOTSUPP; + + ret = ethnl_ops_begin(dev); + if (ret < 0) + return ret; + data->msg_mask = dev->ethtool_ops->get_msglevel(dev); + ethnl_ops_complete(dev); + + return 0; +} + +static int debug_reply_size(const struct ethnl_req_info *req_base, + const struct ethnl_reply_data *reply_base) +{ + const struct debug_reply_data *data = DEBUG_REPDATA(reply_base); + bool compact = req_base->flags & ETHTOOL_FLAG_COMPACT_BITSETS; + + return ethnl_bitset32_size(&data->msg_mask, NULL, NETIF_MSG_CLASS_COUNT, + netif_msg_class_names, compact); +} + +static int debug_fill_reply(struct sk_buff *skb, + const struct ethnl_req_info *req_base, + const struct ethnl_reply_data *reply_base) +{ + const struct debug_reply_data *data = DEBUG_REPDATA(reply_base); + bool compact = req_base->flags & ETHTOOL_FLAG_COMPACT_BITSETS; + + return ethnl_put_bitset32(skb, ETHTOOL_A_DEBUG_MSGMASK, &data->msg_mask, + NULL, NETIF_MSG_CLASS_COUNT, + netif_msg_class_names, compact); +} + +const struct ethnl_request_ops ethnl_debug_request_ops = { + .request_cmd = ETHTOOL_MSG_DEBUG_GET, + .reply_cmd = ETHTOOL_MSG_DEBUG_GET_REPLY, + .hdr_attr = ETHTOOL_A_DEBUG_HEADER, + .max_attr = ETHTOOL_A_DEBUG_MAX, + .req_info_size = sizeof(struct debug_req_info), + .reply_data_size = sizeof(struct debug_reply_data), + .request_policy = debug_get_policy, + + .prepare_data = debug_prepare_data, + .reply_size = debug_reply_size, + .fill_reply = debug_fill_reply, +}; + +/* DEBUG_SET */ + +static const struct nla_policy +debug_set_policy[ETHTOOL_A_DEBUG_MAX + 1] = { + [ETHTOOL_A_DEBUG_UNSPEC] = { .type = NLA_REJECT }, + [ETHTOOL_A_DEBUG_HEADER] = { .type = NLA_NESTED }, + [ETHTOOL_A_DEBUG_MSGMASK] = { .type = NLA_NESTED }, +}; + +int ethnl_set_debug(struct sk_buff *skb, struct genl_info *info) +{ + struct nlattr *tb[ETHTOOL_A_DEBUG_MAX + 1]; + struct ethnl_req_info req_info = {}; + struct net_device *dev; + bool mod = false; + u32 msg_mask; + int ret; + + ret = nlmsg_parse(info->nlhdr, GENL_HDRLEN, tb, + ETHTOOL_A_DEBUG_MAX, debug_set_policy, + info->extack); + if (ret < 0) + return ret; + ret = ethnl_parse_header(&req_info, tb[ETHTOOL_A_DEBUG_HEADER], + genl_info_net(info), info->extack, true); + if (ret < 0) + return ret; + dev = req_info.dev; + if (!dev->ethtool_ops->get_msglevel || !dev->ethtool_ops->set_msglevel) + return -EOPNOTSUPP; + + rtnl_lock(); + ret = ethnl_ops_begin(dev); + if (ret < 0) + goto out_rtnl; + + msg_mask = dev->ethtool_ops->get_msglevel(dev); + ret = ethnl_update_bitset32(&msg_mask, NETIF_MSG_CLASS_COUNT, + tb[ETHTOOL_A_DEBUG_MSGMASK], + netif_msg_class_names, info->extack, &mod); + if (ret < 0 || !mod) + goto out_ops; + + dev->ethtool_ops->set_msglevel(dev, msg_mask); + ethtool_notify(dev, ETHTOOL_MSG_DEBUG_NTF, NULL); + +out_ops: + ethnl_ops_complete(dev); +out_rtnl: + rtnl_unlock(); + dev_put(dev); + return ret; +} diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c index 182bffbffa78..b987052d91ef 100644 --- a/net/ethtool/ioctl.c +++ b/net/ethtool/ioctl.c @@ -17,6 +17,7 @@ #include <linux/phy.h> #include <linux/bitops.h> #include <linux/uaccess.h> +#include <linux/vermagic.h> #include <linux/vmalloc.h> #include <linux/sfp.h> #include <linux/slab.h> @@ -655,6 +656,7 @@ static noinline_for_stack int ethtool_get_drvinfo(struct net_device *dev, memset(&info, 0, sizeof(info)); info.cmd = ETHTOOL_GDRVINFO; + strlcpy(info.version, UTS_RELEASE, sizeof(info.version)); if (ops->get_drvinfo) { ops->get_drvinfo(dev, &info); } else if (dev->dev.parent && dev->dev.parent->driver) { @@ -1316,6 +1318,7 @@ static int ethtool_set_wol(struct net_device *dev, char __user *useraddr) return ret; dev->wol_enabled = !!wol.wolopts; + ethtool_notify(dev, ETHTOOL_MSG_WOL_NTF, NULL); return 0; } @@ -2545,6 +2548,8 @@ int dev_ethtool(struct net *net, struct ifreq *ifr) case ETHTOOL_SMSGLVL: rc = ethtool_set_value_void(dev, useraddr, dev->ethtool_ops->set_msglevel); + if (!rc) + ethtool_notify(dev, ETHTOOL_MSG_DEBUG_NTF, NULL); break; case ETHTOOL_GEEE: rc = ethtool_get_eee(dev, useraddr); diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c index 86b79f9bc08d..180c194fab07 100644 --- a/net/ethtool/netlink.c +++ b/net/ethtool/netlink.c @@ -134,11 +134,12 @@ nla_put_failure: /** * ethnl_reply_init() - Create skb for a reply and fill device identification - * @payload: payload length (without netlink and genetlink header) - * @dev: device the reply is about (may be null) - * @cmd: ETHTOOL_MSG_* message type for reply - * @info: genetlink info of the received packet we respond to - * @ehdrp: place to store payload pointer returned by genlmsg_new() + * @payload: payload length (without netlink and genetlink header) + * @dev: device the reply is about (may be null) + * @cmd: ETHTOOL_MSG_* message type for reply + * @hdr_attrtype: attribute type for common header + * @info: genetlink info of the received packet we respond to + * @ehdrp: place to store payload pointer returned by genlmsg_new() * * Return: pointer to allocated skb on success, NULL on error */ @@ -188,10 +189,11 @@ static int ethnl_multicast(struct sk_buff *skb, struct net_device *dev) /** * struct ethnl_dump_ctx - context structure for generic dumpit() callback - * @ops: request ops of currently processed message type - * @req_info: parsed request header of processed request - * @pos_hash: saved iteration position - hashbucket - * @pos_idx: saved iteration position - index + * @ops: request ops of currently processed message type + * @req_info: parsed request header of processed request + * @reply_data: data needed to compose the reply + * @pos_hash: saved iteration position - hashbucket + * @pos_idx: saved iteration position - index * * These parameters are kept in struct netlink_callback as context preserved * between iterations. They are initialized by ethnl_default_start() and used @@ -211,6 +213,8 @@ ethnl_default_requests[__ETHTOOL_MSG_USER_CNT] = { [ETHTOOL_MSG_LINKINFO_GET] = ðnl_linkinfo_request_ops, [ETHTOOL_MSG_LINKMODES_GET] = ðnl_linkmodes_request_ops, [ETHTOOL_MSG_LINKSTATE_GET] = ðnl_linkstate_request_ops, + [ETHTOOL_MSG_DEBUG_GET] = ðnl_debug_request_ops, + [ETHTOOL_MSG_WOL_GET] = ðnl_wol_request_ops, }; static struct ethnl_dump_ctx *ethnl_dump_context(struct netlink_callback *cb) @@ -268,9 +272,9 @@ out: /** * ethnl_init_reply_data() - Initialize reply data for GET request - * @req_info: pointer to embedded struct ethnl_req_info - * @ops: instance of struct ethnl_request_ops describing the layout - * @dev: network device to initialize the reply for + * @reply_data: pointer to embedded struct ethnl_reply_data + * @ops: instance of struct ethnl_request_ops describing the layout + * @dev: network device to initialize the reply for * * Fills the reply data part with zeros and sets the dev member. Must be called * before calling the ->fill_reply() callback (for each iteration when handling @@ -521,6 +525,8 @@ static const struct ethnl_request_ops * ethnl_default_notify_ops[ETHTOOL_MSG_KERNEL_MAX + 1] = { [ETHTOOL_MSG_LINKINFO_NTF] = ðnl_linkinfo_request_ops, [ETHTOOL_MSG_LINKMODES_NTF] = ðnl_linkmodes_request_ops, + [ETHTOOL_MSG_DEBUG_NTF] = ðnl_debug_request_ops, + [ETHTOOL_MSG_WOL_NTF] = ðnl_wol_request_ops, }; /* default notification handler */ @@ -604,6 +610,8 @@ typedef void (*ethnl_notify_handler_t)(struct net_device *dev, unsigned int cmd, static const ethnl_notify_handler_t ethnl_notify_handlers[] = { [ETHTOOL_MSG_LINKINFO_NTF] = ethnl_default_notify, [ETHTOOL_MSG_LINKMODES_NTF] = ethnl_default_notify, + [ETHTOOL_MSG_DEBUG_NTF] = ethnl_default_notify, + [ETHTOOL_MSG_WOL_NTF] = ethnl_default_notify, }; void ethtool_notify(struct net_device *dev, unsigned int cmd, const void *data) @@ -662,6 +670,31 @@ static const struct genl_ops ethtool_genl_ops[] = { .dumpit = ethnl_default_dumpit, .done = ethnl_default_done, }, + { + .cmd = ETHTOOL_MSG_DEBUG_GET, + .doit = ethnl_default_doit, + .start = ethnl_default_start, + .dumpit = ethnl_default_dumpit, + .done = ethnl_default_done, + }, + { + .cmd = ETHTOOL_MSG_DEBUG_SET, + .flags = GENL_UNS_ADMIN_PERM, + .doit = ethnl_set_debug, + }, + { + .cmd = ETHTOOL_MSG_WOL_GET, + .flags = GENL_UNS_ADMIN_PERM, + .doit = ethnl_default_doit, + .start = ethnl_default_start, + .dumpit = ethnl_default_dumpit, + .done = ethnl_default_done, + }, + { + .cmd = ETHTOOL_MSG_WOL_SET, + .flags = GENL_UNS_ADMIN_PERM, + .doit = ethnl_set_wol, + }, }; static const struct genl_multicast_group ethtool_nl_mcgrps[] = { diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h index da9d6521a4eb..60efd87686ad 100644 --- a/net/ethtool/netlink.h +++ b/net/ethtool/netlink.h @@ -334,8 +334,12 @@ extern const struct ethnl_request_ops ethnl_strset_request_ops; extern const struct ethnl_request_ops ethnl_linkinfo_request_ops; extern const struct ethnl_request_ops ethnl_linkmodes_request_ops; extern const struct ethnl_request_ops ethnl_linkstate_request_ops; +extern const struct ethnl_request_ops ethnl_debug_request_ops; +extern const struct ethnl_request_ops ethnl_wol_request_ops; int ethnl_set_linkinfo(struct sk_buff *skb, struct genl_info *info); int ethnl_set_linkmodes(struct sk_buff *skb, struct genl_info *info); +int ethnl_set_debug(struct sk_buff *skb, struct genl_info *info); +int ethnl_set_wol(struct sk_buff *skb, struct genl_info *info); #endif /* _NET_ETHTOOL_NETLINK_H */ diff --git a/net/ethtool/strset.c b/net/ethtool/strset.c index 82a059c13c1c..8e5911887b4c 100644 --- a/net/ethtool/strset.c +++ b/net/ethtool/strset.c @@ -50,6 +50,16 @@ static const struct strset_info info_template[] = { .count = __ETHTOOL_LINK_MODE_MASK_NBITS, .strings = link_mode_names, }, + [ETH_SS_MSG_CLASSES] = { + .per_dev = false, + .count = NETIF_MSG_CLASS_COUNT, + .strings = netif_msg_class_names, + }, + [ETH_SS_WOL_MODES] = { + .per_dev = false, + .count = WOL_MODE_COUNT, + .strings = wol_mode_names, + }, }; struct strset_req_info { @@ -85,6 +95,7 @@ get_stringset_policy[ETHTOOL_A_STRINGSET_MAX + 1] = { /** * strset_include() - test if a string set should be included in reply + * @info: parsed client request * @data: pointer to request data structure * @id: id of string set to check (ETH_SS_* constants) */ diff --git a/net/ethtool/wol.c b/net/ethtool/wol.c new file mode 100644 index 000000000000..e1b8a65b64c4 --- /dev/null +++ b/net/ethtool/wol.c @@ -0,0 +1,177 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include "netlink.h" +#include "common.h" +#include "bitset.h" + +struct wol_req_info { + struct ethnl_req_info base; +}; + +struct wol_reply_data { + struct ethnl_reply_data base; + struct ethtool_wolinfo wol; + bool show_sopass; +}; + +#define WOL_REPDATA(__reply_base) \ + container_of(__reply_base, struct wol_reply_data, base) + +static const struct nla_policy +wol_get_policy[ETHTOOL_A_WOL_MAX + 1] = { + [ETHTOOL_A_WOL_UNSPEC] = { .type = NLA_REJECT }, + [ETHTOOL_A_WOL_HEADER] = { .type = NLA_NESTED }, + [ETHTOOL_A_WOL_MODES] = { .type = NLA_REJECT }, + [ETHTOOL_A_WOL_SOPASS] = { .type = NLA_REJECT }, +}; + +static int wol_prepare_data(const struct ethnl_req_info *req_base, + struct ethnl_reply_data *reply_base, + struct genl_info *info) +{ + struct wol_reply_data *data = WOL_REPDATA(reply_base); + struct net_device *dev = reply_base->dev; + int ret; + + if (!dev->ethtool_ops->get_wol) + return -EOPNOTSUPP; + + ret = ethnl_ops_begin(dev); + if (ret < 0) + return ret; + dev->ethtool_ops->get_wol(dev, &data->wol); + ethnl_ops_complete(dev); + /* do not include password in notifications */ + data->show_sopass = info && (data->wol.supported & WAKE_MAGICSECURE); + + return 0; +} + +static int wol_reply_size(const struct ethnl_req_info *req_base, + const struct ethnl_reply_data *reply_base) +{ + bool compact = req_base->flags & ETHTOOL_FLAG_COMPACT_BITSETS; + const struct wol_reply_data *data = WOL_REPDATA(reply_base); + int len; + + len = ethnl_bitset32_size(&data->wol.wolopts, &data->wol.supported, + WOL_MODE_COUNT, wol_mode_names, compact); + if (len < 0) + return len; + if (data->show_sopass) + len += nla_total_size(sizeof(data->wol.sopass)); + + return len; +} + +static int wol_fill_reply(struct sk_buff *skb, + const struct ethnl_req_info *req_base, + const struct ethnl_reply_data *reply_base) +{ + bool compact = req_base->flags & ETHTOOL_FLAG_COMPACT_BITSETS; + const struct wol_reply_data *data = WOL_REPDATA(reply_base); + int ret; + + ret = ethnl_put_bitset32(skb, ETHTOOL_A_WOL_MODES, &data->wol.wolopts, + &data->wol.supported, WOL_MODE_COUNT, + wol_mode_names, compact); + if (ret < 0) + return ret; + if (data->show_sopass && + nla_put(skb, ETHTOOL_A_WOL_SOPASS, sizeof(data->wol.sopass), + data->wol.sopass)) + return -EMSGSIZE; + + return 0; +} + +const struct ethnl_request_ops ethnl_wol_request_ops = { + .request_cmd = ETHTOOL_MSG_WOL_GET, + .reply_cmd = ETHTOOL_MSG_WOL_GET_REPLY, + .hdr_attr = ETHTOOL_A_WOL_HEADER, + .max_attr = ETHTOOL_A_WOL_MAX, + .req_info_size = sizeof(struct wol_req_info), + .reply_data_size = sizeof(struct wol_reply_data), + .request_policy = wol_get_policy, + + .prepare_data = wol_prepare_data, + .reply_size = wol_reply_size, + .fill_reply = wol_fill_reply, +}; + +/* WOL_SET */ + +static const struct nla_policy +wol_set_policy[ETHTOOL_A_WOL_MAX + 1] = { + [ETHTOOL_A_WOL_UNSPEC] = { .type = NLA_REJECT }, + [ETHTOOL_A_WOL_HEADER] = { .type = NLA_NESTED }, + [ETHTOOL_A_WOL_MODES] = { .type = NLA_NESTED }, + [ETHTOOL_A_WOL_SOPASS] = { .type = NLA_BINARY, + .len = SOPASS_MAX }, +}; + +int ethnl_set_wol(struct sk_buff *skb, struct genl_info *info) +{ + struct ethtool_wolinfo wol = { .cmd = ETHTOOL_GWOL }; + struct nlattr *tb[ETHTOOL_A_WOL_MAX + 1]; + struct ethnl_req_info req_info = {}; + struct net_device *dev; + bool mod = false; + int ret; + + ret = nlmsg_parse(info->nlhdr, GENL_HDRLEN, tb, ETHTOOL_A_WOL_MAX, + wol_set_policy, info->extack); + if (ret < 0) + return ret; + ret = ethnl_parse_header(&req_info, tb[ETHTOOL_A_WOL_HEADER], + genl_info_net(info), info->extack, true); + if (ret < 0) + return ret; + dev = req_info.dev; + if (!dev->ethtool_ops->get_wol || !dev->ethtool_ops->set_wol) + return -EOPNOTSUPP; + + rtnl_lock(); + ret = ethnl_ops_begin(dev); + if (ret < 0) + goto out_rtnl; + + dev->ethtool_ops->get_wol(dev, &wol); + ret = ethnl_update_bitset32(&wol.wolopts, WOL_MODE_COUNT, + tb[ETHTOOL_A_WOL_MODES], wol_mode_names, + info->extack, &mod); + if (ret < 0) + goto out_ops; + if (wol.wolopts & ~wol.supported) { + NL_SET_ERR_MSG_ATTR(info->extack, tb[ETHTOOL_A_WOL_MODES], + "cannot enable unsupported WoL mode"); + ret = -EINVAL; + goto out_ops; + } + if (tb[ETHTOOL_A_WOL_SOPASS]) { + if (!(wol.supported & WAKE_MAGICSECURE)) { + NL_SET_ERR_MSG_ATTR(info->extack, + tb[ETHTOOL_A_WOL_SOPASS], + "magicsecure not supported, cannot set password"); + ret = -EINVAL; + goto out_ops; + } + ethnl_update_binary(wol.sopass, sizeof(wol.sopass), + tb[ETHTOOL_A_WOL_SOPASS], &mod); + } + + if (!mod) + goto out_ops; + ret = dev->ethtool_ops->set_wol(dev, &wol); + if (ret) + goto out_ops; + dev->wol_enabled = !!wol.wolopts; + ethtool_notify(dev, ETHTOOL_MSG_WOL_NTF, NULL); + +out_ops: + ethnl_ops_complete(dev); +out_rtnl: + rtnl_unlock(); + dev_put(dev); + return ret; +} diff --git a/net/ipv4/esp4_offload.c b/net/ipv4/esp4_offload.c index 0e4a7cf6bc87..e2e219c7854a 100644 --- a/net/ipv4/esp4_offload.c +++ b/net/ipv4/esp4_offload.c @@ -57,6 +57,8 @@ static struct sk_buff *esp4_gro_receive(struct list_head *head, if (!x) goto out_reset; + skb->mark = xfrm_smark_get(skb->mark, x); + sp->xvec[sp->len++] = x; sp->olen++; diff --git a/net/ipv4/fou.c b/net/ipv4/fou.c index 30fa771d382a..dcc79ff54b41 100644 --- a/net/ipv4/fou.c +++ b/net/ipv4/fou.c @@ -662,8 +662,8 @@ static const struct nla_policy fou_nl_policy[FOU_ATTR_MAX + 1] = { [FOU_ATTR_REMCSUM_NOPARTIAL] = { .type = NLA_FLAG, }, [FOU_ATTR_LOCAL_V4] = { .type = NLA_U32, }, [FOU_ATTR_PEER_V4] = { .type = NLA_U32, }, - [FOU_ATTR_LOCAL_V6] = { .type = sizeof(struct in6_addr), }, - [FOU_ATTR_PEER_V6] = { .type = sizeof(struct in6_addr), }, + [FOU_ATTR_LOCAL_V6] = { .len = sizeof(struct in6_addr), }, + [FOU_ATTR_PEER_V6] = { .len = sizeof(struct in6_addr), }, [FOU_ATTR_PEER_PORT] = { .type = NLA_U16, }, [FOU_ATTR_IFINDEX] = { .type = NLA_S32, }, }; diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index 0fe2a5d3e258..74e1d964a615 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -1236,10 +1236,8 @@ int ip_tunnel_init(struct net_device *dev) iph->version = 4; iph->ihl = 5; - if (tunnel->collect_md) { - dev->features |= NETIF_F_NETNS_LOCAL; + if (tunnel->collect_md) netif_keep_dst(dev); - } return 0; } EXPORT_SYMBOL_GPL(ip_tunnel_init); diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c index e90b600c7a25..37cddd18f282 100644 --- a/net/ipv4/ip_vti.c +++ b/net/ipv4/ip_vti.c @@ -187,8 +187,17 @@ static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev, int mtu; if (!dst) { - dev->stats.tx_carrier_errors++; - goto tx_error_icmp; + struct rtable *rt; + + fl->u.ip4.flowi4_oif = dev->ifindex; + fl->u.ip4.flowi4_flags |= FLOWI_FLAG_ANYSRC; + rt = __ip_route_output_key(dev_net(dev), &fl->u.ip4); + if (IS_ERR(rt)) { + dev->stats.tx_carrier_errors++; + goto tx_error_icmp; + } + dst = &rt->dst; + skb_dst_set(skb, dst); } dst_hold(dst); diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c index 511eaa94e2d1..d072c326dd64 100644 --- a/net/ipv4/nexthop.c +++ b/net/ipv4/nexthop.c @@ -321,7 +321,9 @@ static size_t nh_nlmsg_size_single(struct nexthop *nh) static size_t nh_nlmsg_size(struct nexthop *nh) { - size_t sz = nla_total_size(4); /* NHA_ID */ + size_t sz = NLMSG_ALIGN(sizeof(struct nhmsg)); + + sz += nla_total_size(4); /* NHA_ID */ if (nh->is_group) sz += nh_nlmsg_size_grp(nh); diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c index cc90243ccf76..2580303249e2 100644 --- a/net/ipv4/proc.c +++ b/net/ipv4/proc.c @@ -289,6 +289,8 @@ static const struct snmp_mib snmp4_net_list[] = { SNMP_MIB_ITEM("TCPRcvQDrop", LINUX_MIB_TCPRCVQDROP), SNMP_MIB_ITEM("TCPWqueueTooBig", LINUX_MIB_TCPWQUEUETOOBIG), SNMP_MIB_ITEM("TCPFastOpenPassiveAltKey", LINUX_MIB_TCPFASTOPENPASSIVEALTKEY), + SNMP_MIB_ITEM("TcpTimeoutRehash", LINUX_MIB_TCPTIMEOUTREHASH), + SNMP_MIB_ITEM("TcpDuplicateDataRehash", LINUX_MIB_TCPDUPLICATEDATAREHASH), SNMP_MIB_SENTINEL }; diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 2010888e68ca..d5c57b3f77d5 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -271,6 +271,7 @@ static void *rt_cpu_seq_next(struct seq_file *seq, void *v, loff_t *pos) *pos = cpu+1; return &per_cpu(rt_cache_stat, cpu); } + (*pos)++; return NULL; } diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 6711a97de3ce..484485ae74c2 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -271,6 +271,7 @@ #include <net/icmp.h> #include <net/inet_common.h> #include <net/tcp.h> +#include <net/mptcp.h> #include <net/xfrm.h> #include <net/ip.h> #include <net/sock.h> @@ -2524,6 +2525,7 @@ static void tcp_rtx_queue_purge(struct sock *sk) { struct rb_node *p = rb_first(&sk->tcp_rtx_queue); + tcp_sk(sk)->highest_sack = NULL; while (p) { struct sk_buff *skb = rb_to_skb(p); @@ -2614,7 +2616,6 @@ int tcp_disconnect(struct sock *sk, int flags) WRITE_ONCE(tp->write_seq, seq); icsk->icsk_backoff = 0; - tp->snd_cwnd = 2; icsk->icsk_probes_out = 0; icsk->icsk_rto = TCP_TIMEOUT_INIT; tp->snd_ssthresh = TCP_INFINITE_SSTHRESH; @@ -3336,6 +3337,7 @@ static size_t tcp_opt_stats_get_size(void) nla_total_size(sizeof(u32)) + /* TCP_NLA_DSACK_DUPS */ nla_total_size(sizeof(u32)) + /* TCP_NLA_REORD_SEEN */ nla_total_size(sizeof(u32)) + /* TCP_NLA_SRTT */ + nla_total_size(sizeof(u16)) + /* TCP_NLA_TIMEOUT_REHASH */ 0; } @@ -3390,6 +3392,7 @@ struct sk_buff *tcp_get_timestamping_opt_stats(const struct sock *sk) nla_put_u32(stats, TCP_NLA_DSACK_DUPS, tp->dsack_dups); nla_put_u32(stats, TCP_NLA_REORD_SEEN, tp->reord_seen); nla_put_u32(stats, TCP_NLA_SRTT, tp->srtt_us >> 3); + nla_put_u16(stats, TCP_NLA_TIMEOUT_REHASH, tp->timeout_rehash); return stats; } @@ -4021,4 +4024,5 @@ void __init tcp_init(void) tcp_metrics_init(); BUG_ON(tcp_register_congestion_control(&tcp_reno) != 0); tcp_tasklet_init(); + mptcp_init(); } diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c index a6545ef0d27b..6c4d79baff26 100644 --- a/net/ipv4/tcp_bbr.c +++ b/net/ipv4/tcp_bbr.c @@ -779,8 +779,7 @@ static void bbr_update_bw(struct sock *sk, const struct rate_sample *rs) * bandwidth sample. Delivered is in packets and interval_us in uS and * ratio will be <<1 for most connections. So delivered is first scaled. */ - bw = (u64)rs->delivered * BW_UNIT; - do_div(bw, rs->interval_us); + bw = div64_long((u64)rs->delivered * BW_UNIT, rs->interval_us); /* If this sample is application-limited, it is likely to have a very * low delivered count that represents application behavior rather than diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 358365598216..e8b840a4767e 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -79,6 +79,7 @@ #include <trace/events/tcp.h> #include <linux/jump_label_ratelimit.h> #include <net/busy_poll.h> +#include <net/mptcp.h> int sysctl_tcp_max_orphans __read_mostly = NR_FILE; @@ -3164,6 +3165,7 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack, tp->retransmit_skb_hint = NULL; if (unlikely(skb == tp->lost_skb_hint)) tp->lost_skb_hint = NULL; + tcp_highest_sack_replace(sk, skb, next); tcp_rtx_queue_unlink_and_free(skb, sk); } @@ -3924,6 +3926,10 @@ void tcp_parse_options(const struct net *net, */ break; #endif + case TCPOPT_MPTCP: + mptcp_parse_option(skb, ptr, opsize, opt_rx); + break; + case TCPOPT_FASTOPEN: tcp_parse_fastopen_option( opsize - TCPOLEN_FASTOPEN_BASE, @@ -4265,8 +4271,10 @@ static void tcp_rcv_spurious_retrans(struct sock *sk, const struct sk_buff *skb) * The receiver remembers and reflects via DSACKs. Leverage the * DSACK state and change the txhash to re-route speculatively. */ - if (TCP_SKB_CB(skb)->seq == tcp_sk(sk)->duplicate_sack[0].start_seq) + if (TCP_SKB_CB(skb)->seq == tcp_sk(sk)->duplicate_sack[0].start_seq) { sk_rethink_txhash(sk); + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPDUPLICATEDATAREHASH); + } } static void tcp_send_dupack(struct sock *sk, const struct sk_buff *skb) @@ -4765,6 +4773,9 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb) bool fragstolen; int eaten; + if (sk_is_mptcp(sk)) + mptcp_incoming_options(sk, skb, &tp->rx_opt); + if (TCP_SKB_CB(skb)->seq == TCP_SKB_CB(skb)->end_seq) { __kfree_skb(skb); return; @@ -5973,6 +5984,9 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, tcp_sync_mss(sk, icsk->icsk_pmtu_cookie); tcp_initialize_rcv_mss(sk); + if (sk_is_mptcp(sk)) + mptcp_rcv_synsent(sk); + /* Remember, tcp_poll() does not lock socket! * Change state from SYN-SENT only after copied_seq * is initialized. */ @@ -6338,8 +6352,11 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb) case TCP_CLOSE_WAIT: case TCP_CLOSING: case TCP_LAST_ACK: - if (!before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt)) + if (!before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt)) { + if (sk_is_mptcp(sk)) + mptcp_incoming_options(sk, skb, &tp->rx_opt); break; + } /* fall through */ case TCP_FIN_WAIT1: case TCP_FIN_WAIT2: @@ -6595,6 +6612,9 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops, tcp_rsk(req)->af_specific = af_ops; tcp_rsk(req)->ts_off = 0; +#if IS_ENABLED(CONFIG_MPTCP) + tcp_rsk(req)->is_mptcp = 0; +#endif tcp_clear_options(&tmp_opt); tmp_opt.mss_clamp = af_ops->mss_clamp; diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 786978cb2db7..306e25d743e8 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -38,6 +38,7 @@ #define pr_fmt(fmt) "TCP: " fmt #include <net/tcp.h> +#include <net/mptcp.h> #include <linux/compiler.h> #include <linux/gfp.h> @@ -414,6 +415,7 @@ static inline bool tcp_urg_mode(const struct tcp_sock *tp) #define OPTION_WSCALE (1 << 3) #define OPTION_FAST_OPEN_COOKIE (1 << 8) #define OPTION_SMC (1 << 9) +#define OPTION_MPTCP (1 << 10) static void smc_options_write(__be32 *ptr, u16 *options) { @@ -439,8 +441,17 @@ struct tcp_out_options { __u8 *hash_location; /* temporary pointer, overloaded */ __u32 tsval, tsecr; /* need to include OPTION_TS */ struct tcp_fastopen_cookie *fastopen_cookie; /* Fast open cookie */ + struct mptcp_out_options mptcp; }; +static void mptcp_options_write(__be32 *ptr, struct tcp_out_options *opts) +{ +#if IS_ENABLED(CONFIG_MPTCP) + if (unlikely(OPTION_MPTCP & opts->options)) + mptcp_write_options(ptr, &opts->mptcp); +#endif +} + /* Write previously computed TCP options to the packet. * * Beware: Something in the Internet is very sensitive to the ordering of @@ -549,6 +560,8 @@ static void tcp_options_write(__be32 *ptr, struct tcp_sock *tp, } smc_options_write(ptr, &options); + + mptcp_options_write(ptr, opts); } static void smc_set_option(const struct tcp_sock *tp, @@ -584,6 +597,22 @@ static void smc_set_option_cond(const struct tcp_sock *tp, #endif } +static void mptcp_set_option_cond(const struct request_sock *req, + struct tcp_out_options *opts, + unsigned int *remaining) +{ + if (rsk_is_mptcp(req)) { + unsigned int size; + + if (mptcp_synack_options(req, &size, &opts->mptcp)) { + if (*remaining >= size) { + opts->options |= OPTION_MPTCP; + *remaining -= size; + } + } + } +} + /* Compute TCP options for SYN packets. This is not the final * network wire format yet. */ @@ -653,6 +682,15 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb, smc_set_option(tp, opts, &remaining); + if (sk_is_mptcp(sk)) { + unsigned int size; + + if (mptcp_syn_options(sk, skb, &size, &opts->mptcp)) { + opts->options |= OPTION_MPTCP; + remaining -= size; + } + } + return MAX_TCP_OPTION_SPACE - remaining; } @@ -714,6 +752,8 @@ static unsigned int tcp_synack_options(const struct sock *sk, } } + mptcp_set_option_cond(req, opts, &remaining); + smc_set_option_cond(tcp_sk(sk), ireq, opts, &remaining); return MAX_TCP_OPTION_SPACE - remaining; @@ -751,6 +791,23 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb size += TCPOLEN_TSTAMP_ALIGNED; } + /* MPTCP options have precedence over SACK for the limited TCP + * option space because a MPTCP connection would be forced to + * fall back to regular TCP if a required multipath option is + * missing. SACK still gets a chance to use whatever space is + * left. + */ + if (sk_is_mptcp(sk)) { + unsigned int remaining = MAX_TCP_OPTION_SPACE - size; + unsigned int opt_size = 0; + + if (mptcp_established_options(sk, skb, &opt_size, remaining, + &opts->mptcp)) { + opts->options |= OPTION_MPTCP; + size += opt_size; + } + } + eff_sacks = tp->rx_opt.num_sacks + tp->rx_opt.dsack; if (unlikely(eff_sacks)) { const unsigned int remaining = MAX_TCP_OPTION_SPACE - size; @@ -3236,6 +3293,7 @@ int tcp_send_synack(struct sock *sk) if (!nskb) return -ENOMEM; INIT_LIST_HEAD(&nskb->tcp_tsorted_anchor); + tcp_highest_sack_replace(sk, skb, nskb); tcp_rtx_queue_unlink_and_free(skb, sk); __skb_header_release(nskb); tcp_rbtree_insert(&sk->tcp_rtx_queue, nskb); diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c index 1097b438befe..c3f26dcd6704 100644 --- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -223,6 +223,9 @@ static int tcp_write_timeout(struct sock *sk) dst_negative_advice(sk); } else { sk_rethink_txhash(sk); + tp->timeout_rehash++; + __NET_INC_STATS(sock_net(sk), + LINUX_MIB_TCPTIMEOUTREHASH); } retry_until = icsk->icsk_syn_retries ? : net->ipv4.sysctl_tcp_syn_retries; expired = icsk->icsk_retransmits >= retry_until; @@ -234,6 +237,9 @@ static int tcp_write_timeout(struct sock *sk) dst_negative_advice(sk); } else { sk_rethink_txhash(sk); + tp->timeout_rehash++; + __NET_INC_STATS(sock_net(sk), + LINUX_MIB_TCPTIMEOUTREHASH); } retry_until = net->ipv4.sysctl_tcp_retries2; diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index e4fd4408b775..db76b9609299 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1368,7 +1368,8 @@ static void udp_rmem_release(struct sock *sk, int size, int partial, if (likely(partial)) { up->forward_deficit += size; size = up->forward_deficit; - if (size < (sk->sk_rcvbuf >> 2)) + if (size < (sk->sk_rcvbuf >> 2) && + !skb_queue_empty(&up->reader_queue)) return; } else { size += up->forward_deficit; diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c index b25e42100ceb..1a98583a79f4 100644 --- a/net/ipv4/udp_offload.c +++ b/net/ipv4/udp_offload.c @@ -184,6 +184,20 @@ out_unlock: } EXPORT_SYMBOL(skb_udp_tunnel_segment); +static struct sk_buff *__udp_gso_segment_list(struct sk_buff *skb, + netdev_features_t features) +{ + unsigned int mss = skb_shinfo(skb)->gso_size; + + skb = skb_segment_list(skb, features, skb_mac_header_len(skb)); + if (IS_ERR(skb)) + return skb; + + udp_hdr(skb)->len = htons(sizeof(struct udphdr) + mss); + + return skb; +} + struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb, netdev_features_t features) { @@ -196,6 +210,9 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb, __sum16 check; __be16 newlen; + if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST) + return __udp_gso_segment_list(gso_skb, features); + mss = skb_shinfo(gso_skb)->gso_size; if (gso_skb->len <= sizeof(*uh) + mss) return ERR_PTR(-EINVAL); @@ -354,6 +371,7 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head, struct udphdr *uh2; struct sk_buff *p; unsigned int ulen; + int ret = 0; /* requires non zero csum, for symmetry with GSO */ if (!uh->check) { @@ -369,7 +387,6 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head, } /* pull encapsulating udp header */ skb_gro_pull(skb, sizeof(struct udphdr)); - skb_gro_postpull_rcsum(skb, uh, sizeof(struct udphdr)); list_for_each_entry(p, head, list) { if (!NAPI_GRO_CB(p)->same_flow) @@ -383,14 +400,40 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head, continue; } + if (NAPI_GRO_CB(skb)->is_flist != NAPI_GRO_CB(p)->is_flist) { + NAPI_GRO_CB(skb)->flush = 1; + return p; + } + /* Terminate the flow on len mismatch or if it grow "too much". * Under small packet flood GRO count could elsewhere grow a lot * leading to excessive truesize values. * On len mismatch merge the first packet shorter than gso_size, * otherwise complete the GRO packet. */ - if (ulen > ntohs(uh2->len) || skb_gro_receive(p, skb) || - ulen != ntohs(uh2->len) || + if (ulen > ntohs(uh2->len)) { + pp = p; + } else { + if (NAPI_GRO_CB(skb)->is_flist) { + if (!pskb_may_pull(skb, skb_gro_offset(skb))) { + NAPI_GRO_CB(skb)->flush = 1; + return NULL; + } + if ((skb->ip_summed != p->ip_summed) || + (skb->csum_level != p->csum_level)) { + NAPI_GRO_CB(skb)->flush = 1; + return NULL; + } + ret = skb_gro_receive_list(p, skb); + } else { + skb_gro_postpull_rcsum(skb, uh, + sizeof(struct udphdr)); + + ret = skb_gro_receive(p, skb); + } + } + + if (ret || ulen != ntohs(uh2->len) || NAPI_GRO_CB(p)->count >= UDP_GRO_CNT_MAX) pp = p; @@ -401,36 +444,29 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head, return NULL; } -INDIRECT_CALLABLE_DECLARE(struct sock *udp6_lib_lookup_skb(struct sk_buff *skb, - __be16 sport, __be16 dport)); struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb, - struct udphdr *uh, udp_lookup_t lookup) + struct udphdr *uh, struct sock *sk) { struct sk_buff *pp = NULL; struct sk_buff *p; struct udphdr *uh2; unsigned int off = skb_gro_offset(skb); int flush = 1; - struct sock *sk; - rcu_read_lock(); - sk = INDIRECT_CALL_INET(lookup, udp6_lib_lookup_skb, - udp4_lib_lookup_skb, skb, uh->source, uh->dest); - if (!sk) - goto out_unlock; + if (skb->dev->features & NETIF_F_GRO_FRAGLIST) + NAPI_GRO_CB(skb)->is_flist = sk ? !udp_sk(sk)->gro_enabled: 1; - if (udp_sk(sk)->gro_enabled) { + if ((sk && udp_sk(sk)->gro_enabled) || NAPI_GRO_CB(skb)->is_flist) { pp = call_gro_receive(udp_gro_receive_segment, head, skb); - rcu_read_unlock(); return pp; } - if (NAPI_GRO_CB(skb)->encap_mark || + if (!sk || NAPI_GRO_CB(skb)->encap_mark || (skb->ip_summed != CHECKSUM_PARTIAL && NAPI_GRO_CB(skb)->csum_cnt == 0 && !NAPI_GRO_CB(skb)->csum_valid) || !udp_sk(sk)->gro_receive) - goto out_unlock; + goto out; /* mark that this skb passed once through the tunnel gro layer */ NAPI_GRO_CB(skb)->encap_mark = 1; @@ -457,8 +493,7 @@ struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb, skb_gro_postpull_rcsum(skb, uh, sizeof(struct udphdr)); pp = call_gro_receive_sk(udp_sk(sk)->gro_receive, sk, head, skb); -out_unlock: - rcu_read_unlock(); +out: skb_gro_flush_final(skb, pp, flush); return pp; } @@ -468,8 +503,10 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *udp4_gro_receive(struct list_head *head, struct sk_buff *skb) { struct udphdr *uh = udp_gro_udphdr(skb); + struct sk_buff *pp; + struct sock *sk; - if (unlikely(!uh) || !static_branch_unlikely(&udp_encap_needed_key)) + if (unlikely(!uh)) goto flush; /* Don't bother verifying checksum if we're going to flush anyway. */ @@ -484,7 +521,11 @@ struct sk_buff *udp4_gro_receive(struct list_head *head, struct sk_buff *skb) inet_gro_compute_pseudo); skip: NAPI_GRO_CB(skb)->is_ipv6 = 0; - return udp_gro_receive(head, skb, uh, udp4_lib_lookup_skb); + rcu_read_lock(); + sk = static_branch_unlikely(&udp_encap_needed_key) ? udp4_lib_lookup_skb(skb, uh->source, uh->dest) : NULL; + pp = udp_gro_receive(head, skb, uh, sk); + rcu_read_unlock(); + return pp; flush: NAPI_GRO_CB(skb)->flush = 1; @@ -517,9 +558,7 @@ int udp_gro_complete(struct sk_buff *skb, int nhoff, rcu_read_lock(); sk = INDIRECT_CALL_INET(lookup, udp6_lib_lookup_skb, udp4_lib_lookup_skb, skb, uh->source, uh->dest); - if (sk && udp_sk(sk)->gro_enabled) { - err = udp_gro_complete_segment(skb); - } else if (sk && udp_sk(sk)->gro_complete) { + if (sk && udp_sk(sk)->gro_complete) { skb_shinfo(skb)->gso_type = uh->check ? SKB_GSO_UDP_TUNNEL_CSUM : SKB_GSO_UDP_TUNNEL; @@ -529,6 +568,8 @@ int udp_gro_complete(struct sk_buff *skb, int nhoff, skb->encapsulation = 1; err = udp_sk(sk)->gro_complete(sk, skb, nhoff + sizeof(struct udphdr)); + } else { + err = udp_gro_complete_segment(skb); } rcu_read_unlock(); @@ -544,6 +585,23 @@ INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff) const struct iphdr *iph = ip_hdr(skb); struct udphdr *uh = (struct udphdr *)(skb->data + nhoff); + if (NAPI_GRO_CB(skb)->is_flist) { + uh->len = htons(skb->len - nhoff); + + skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4); + skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count; + + if (skb->ip_summed == CHECKSUM_UNNECESSARY) { + if (skb->csum_level < SKB_MAX_CSUM_LEVEL) + skb->csum_level++; + } else { + skb->ip_summed = CHECKSUM_UNNECESSARY; + skb->csum_level = 0; + } + + return 0; + } + if (uh->check) uh->check = ~udp_v4_check(skb->len - nhoff, iph->saddr, iph->daddr, 0); diff --git a/net/ipv6/esp6_offload.c b/net/ipv6/esp6_offload.c index e31626ffccd1..fd535053245b 100644 --- a/net/ipv6/esp6_offload.c +++ b/net/ipv6/esp6_offload.c @@ -79,6 +79,8 @@ static struct sk_buff *esp6_gro_receive(struct list_head *head, if (!x) goto out_reset; + skb->mark = xfrm_smark_get(skb->mark, x); + sp->xvec[sp->len++] = x; sp->olen++; diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index b1e9a10e1133..58fbde244381 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -2571,14 +2571,13 @@ static void *ipv6_route_seq_next(struct seq_file *seq, void *v, loff_t *pos) struct net *net = seq_file_net(seq); struct ipv6_route_iter *iter = seq->private; + ++(*pos); if (!v) goto iter_table; n = rcu_dereference_bh(((struct fib6_info *)v)->fib6_next); - if (n) { - ++*pos; + if (n) return n; - } iter_table: ipv6_route_check_sernum(iter); @@ -2586,8 +2585,6 @@ iter_table: r = fib6_walk_continue(&iter->w); spin_unlock_bh(&iter->tbl->tb6_lock); if (r > 0) { - if (v) - ++*pos; return iter->w.leaf; } else if (r < 0) { fib6_walker_unlink(net, &iter->w); diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c index ee968d980746..55bfc5149d0c 100644 --- a/net/ipv6/ip6_gre.c +++ b/net/ipv6/ip6_gre.c @@ -1466,7 +1466,6 @@ static int ip6gre_tunnel_init_common(struct net_device *dev) dev->mtu -= 8; if (tunnel->parms.collect_md) { - dev->features |= NETIF_F_NETNS_LOCAL; netif_keep_dst(dev); } ip6gre_tnl_init_features(dev); @@ -1894,7 +1893,6 @@ static void ip6gre_tap_setup(struct net_device *dev) dev->needs_free_netdev = true; dev->priv_destructor = ip6gre_dev_free; - dev->features |= NETIF_F_NETNS_LOCAL; dev->priv_flags &= ~IFF_TX_SKB_SHARING; dev->priv_flags |= IFF_LIVE_ADDR_CHANGE; netif_keep_dst(dev); @@ -2197,7 +2195,6 @@ static void ip6erspan_tap_setup(struct net_device *dev) dev->needs_free_netdev = true; dev->priv_destructor = ip6gre_dev_free; - dev->features |= NETIF_F_NETNS_LOCAL; dev->priv_flags &= ~IFF_TX_SKB_SHARING; dev->priv_flags |= IFF_LIVE_ADDR_CHANGE; netif_keep_dst(dev); diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index 2f376dbc37d5..b5dd20c4599b 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -1877,10 +1877,8 @@ static int ip6_tnl_dev_init(struct net_device *dev) if (err) return err; ip6_tnl_link_config(t); - if (t->parms.collect_md) { - dev->features |= NETIF_F_NETNS_LOCAL; + if (t->parms.collect_md) netif_keep_dst(dev); - } return 0; } diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c index 6f08b760c2a7..524006aa0d78 100644 --- a/net/ipv6/ip6_vti.c +++ b/net/ipv6/ip6_vti.c @@ -449,8 +449,17 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl) int err = -1; int mtu; - if (!dst) - goto tx_err_link_failure; + if (!dst) { + fl->u.ip6.flowi6_oif = dev->ifindex; + fl->u.ip6.flowi6_flags |= FLOWI_FLAG_ANYSRC; + dst = ip6_route_output(dev_net(dev), NULL, &fl->u.ip6); + if (dst->error) { + dst_release(dst); + dst = NULL; + goto tx_err_link_failure; + } + skb_dst_set(skb, dst); + } dst_hold(dst); dst = xfrm_lookup(t->net, dst, fl, NULL, 0); diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c index 85a5447a3e8d..7cbc19731997 100644 --- a/net/ipv6/seg6_local.c +++ b/net/ipv6/seg6_local.c @@ -23,6 +23,7 @@ #include <net/addrconf.h> #include <net/ip6_route.h> #include <net/dst_cache.h> +#include <net/ip_tunnels.h> #ifdef CONFIG_IPV6_SEG6_HMAC #include <net/seg6_hmac.h> #endif @@ -135,7 +136,8 @@ static bool decap_and_validate(struct sk_buff *skb, int proto) skb_reset_network_header(skb); skb_reset_transport_header(skb); - skb->encapsulation = 0; + if (iptunnel_pull_offloads(skb)) + return false; return true; } diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 5b5260103b65..33a578a3eb3a 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -238,6 +238,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr, sin.sin_addr.s_addr = usin->sin6_addr.s6_addr32[3]; icsk->icsk_af_ops = &ipv6_mapped; + if (sk_is_mptcp(sk)) + mptcp_handle_ipv6_mapped(sk, true); sk->sk_backlog_rcv = tcp_v4_do_rcv; #ifdef CONFIG_TCP_MD5SIG tp->af_specific = &tcp_sock_ipv6_mapped_specific; @@ -248,6 +250,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr, if (err) { icsk->icsk_ext_hdr_len = exthdrlen; icsk->icsk_af_ops = &ipv6_specific; + if (sk_is_mptcp(sk)) + mptcp_handle_ipv6_mapped(sk, false); sk->sk_backlog_rcv = tcp_v6_do_rcv; #ifdef CONFIG_TCP_MD5SIG tp->af_specific = &tcp_sock_ipv6_specific; @@ -1203,6 +1207,8 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff * newnp->saddr = newsk->sk_v6_rcv_saddr; inet_csk(newsk)->icsk_af_ops = &ipv6_mapped; + if (sk_is_mptcp(newsk)) + mptcp_handle_ipv6_mapped(newsk, true); newsk->sk_backlog_rcv = tcp_v4_do_rcv; #ifdef CONFIG_TCP_MD5SIG newtp->af_specific = &tcp_sock_ipv6_mapped_specific; @@ -2163,9 +2169,16 @@ int __init tcpv6_init(void) ret = register_pernet_subsys(&tcpv6_net_ops); if (ret) goto out_tcpv6_protosw; + + ret = mptcpv6_init(); + if (ret) + goto out_tcpv6_pernet_subsys; + out: return ret; +out_tcpv6_pernet_subsys: + unregister_pernet_subsys(&tcpv6_net_ops); out_tcpv6_protosw: inet6_unregister_protosw(&tcpv6_protosw); out_tcpv6_protocol: diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c index f0d5fc27d0b5..584157a07759 100644 --- a/net/ipv6/udp_offload.c +++ b/net/ipv6/udp_offload.c @@ -115,8 +115,10 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *udp6_gro_receive(struct list_head *head, struct sk_buff *skb) { struct udphdr *uh = udp_gro_udphdr(skb); + struct sk_buff *pp; + struct sock *sk; - if (unlikely(!uh) || !static_branch_unlikely(&udpv6_encap_needed_key)) + if (unlikely(!uh)) goto flush; /* Don't bother verifying checksum if we're going to flush anyway. */ @@ -132,7 +134,11 @@ struct sk_buff *udp6_gro_receive(struct list_head *head, struct sk_buff *skb) skip: NAPI_GRO_CB(skb)->is_ipv6 = 1; - return udp_gro_receive(head, skb, uh, udp6_lib_lookup_skb); + rcu_read_lock(); + sk = static_branch_unlikely(&udpv6_encap_needed_key) ? udp6_lib_lookup_skb(skb, uh->source, uh->dest) : NULL; + pp = udp_gro_receive(head, skb, uh, sk); + rcu_read_unlock(); + return pp; flush: NAPI_GRO_CB(skb)->flush = 1; @@ -144,6 +150,23 @@ INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff) const struct ipv6hdr *ipv6h = ipv6_hdr(skb); struct udphdr *uh = (struct udphdr *)(skb->data + nhoff); + if (NAPI_GRO_CB(skb)->is_flist) { + uh->len = htons(skb->len - nhoff); + + skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4); + skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count; + + if (skb->ip_summed == CHECKSUM_UNNECESSARY) { + if (skb->csum_level < SKB_MAX_CSUM_LEVEL) + skb->csum_level++; + } else { + skb->ip_summed = CHECKSUM_UNNECESSARY; + skb->csum_level = 0; + } + + return 0; + } + if (uh->check) uh->check = ~udp_v6_check(skb->len - nhoff, &ipv6h->saddr, &ipv6h->daddr, 0); diff --git a/net/mptcp/Kconfig b/net/mptcp/Kconfig new file mode 100644 index 000000000000..5db56d2218c5 --- /dev/null +++ b/net/mptcp/Kconfig @@ -0,0 +1,26 @@ + +config MPTCP + bool "MPTCP: Multipath TCP" + depends on INET + select SKB_EXTENSIONS + select CRYPTO_LIB_SHA256 + help + Multipath TCP (MPTCP) connections send and receive data over multiple + subflows in order to utilize multiple network paths. Each subflow + uses the TCP protocol, and TCP options carry header information for + MPTCP. + +config MPTCP_IPV6 + bool "MPTCP: IPv6 support for Multipath TCP" + depends on MPTCP + select IPV6 + default y + +config MPTCP_HMAC_TEST + bool "Tests for MPTCP HMAC implementation" + default n + help + This option enable boot time self-test for the HMAC implementation + used by the MPTCP code + + Say N if you are unsure. diff --git a/net/mptcp/Makefile b/net/mptcp/Makefile new file mode 100644 index 000000000000..4e98d9edfd0a --- /dev/null +++ b/net/mptcp/Makefile @@ -0,0 +1,4 @@ +# SPDX-License-Identifier: GPL-2.0 +obj-$(CONFIG_MPTCP) += mptcp.o + +mptcp-y := protocol.o subflow.o options.o token.o crypto.o ctrl.o diff --git a/net/mptcp/crypto.c b/net/mptcp/crypto.c new file mode 100644 index 000000000000..40d1bb18fd60 --- /dev/null +++ b/net/mptcp/crypto.c @@ -0,0 +1,152 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Multipath TCP cryptographic functions + * Copyright (c) 2017 - 2019, Intel Corporation. + * + * Note: This code is based on mptcp_ctrl.c, mptcp_ipv4.c, and + * mptcp_ipv6 from multipath-tcp.org, authored by: + * + * Sébastien Barré <sebastien.barre@uclouvain.be> + * Christoph Paasch <christoph.paasch@uclouvain.be> + * Jaakko Korkeaniemi <jaakko.korkeaniemi@aalto.fi> + * Gregory Detal <gregory.detal@uclouvain.be> + * Fabien Duchêne <fabien.duchene@uclouvain.be> + * Andreas Seelinger <Andreas.Seelinger@rwth-aachen.de> + * Lavkesh Lahngir <lavkesh51@gmail.com> + * Andreas Ripke <ripke@neclab.eu> + * Vlad Dogaru <vlad.dogaru@intel.com> + * Octavian Purdila <octavian.purdila@intel.com> + * John Ronan <jronan@tssg.org> + * Catalin Nicutar <catalin.nicutar@gmail.com> + * Brandon Heller <brandonh@stanford.edu> + */ + +#include <linux/kernel.h> +#include <crypto/sha.h> +#include <asm/unaligned.h> + +#include "protocol.h" + +#define SHA256_DIGEST_WORDS (SHA256_DIGEST_SIZE / 4) + +void mptcp_crypto_key_sha(u64 key, u32 *token, u64 *idsn) +{ + __be32 mptcp_hashed_key[SHA256_DIGEST_WORDS]; + __be64 input = cpu_to_be64(key); + struct sha256_state state; + + sha256_init(&state); + sha256_update(&state, (__force u8 *)&input, sizeof(input)); + sha256_final(&state, (u8 *)mptcp_hashed_key); + + if (token) + *token = be32_to_cpu(mptcp_hashed_key[0]); + if (idsn) + *idsn = be64_to_cpu(*((__be64 *)&mptcp_hashed_key[6])); +} + +void mptcp_crypto_hmac_sha(u64 key1, u64 key2, u32 nonce1, u32 nonce2, + void *hmac) +{ + u8 input[SHA256_BLOCK_SIZE + SHA256_DIGEST_SIZE]; + __be32 mptcp_hashed_key[SHA256_DIGEST_WORDS]; + __be32 *hash_out = (__force __be32 *)hmac; + struct sha256_state state; + u8 key1be[8]; + u8 key2be[8]; + int i; + + put_unaligned_be64(key1, key1be); + put_unaligned_be64(key2, key2be); + + /* Generate key xored with ipad */ + memset(input, 0x36, SHA_MESSAGE_BYTES); + for (i = 0; i < 8; i++) + input[i] ^= key1be[i]; + for (i = 0; i < 8; i++) + input[i + 8] ^= key2be[i]; + + put_unaligned_be32(nonce1, &input[SHA256_BLOCK_SIZE]); + put_unaligned_be32(nonce2, &input[SHA256_BLOCK_SIZE + 4]); + + sha256_init(&state); + sha256_update(&state, input, SHA256_BLOCK_SIZE + 8); + + /* emit sha256(K1 || msg) on the second input block, so we can + * reuse 'input' for the last hashing + */ + sha256_final(&state, &input[SHA256_BLOCK_SIZE]); + + /* Prepare second part of hmac */ + memset(input, 0x5C, SHA_MESSAGE_BYTES); + for (i = 0; i < 8; i++) + input[i] ^= key1be[i]; + for (i = 0; i < 8; i++) + input[i + 8] ^= key2be[i]; + + sha256_init(&state); + sha256_update(&state, input, SHA256_BLOCK_SIZE + SHA256_DIGEST_SIZE); + sha256_final(&state, (u8 *)mptcp_hashed_key); + + /* takes only first 160 bits */ + for (i = 0; i < 5; i++) + hash_out[i] = mptcp_hashed_key[i]; +} + +#ifdef CONFIG_MPTCP_HMAC_TEST +struct test_cast { + char *key; + char *msg; + char *result; +}; + +/* we can't reuse RFC 4231 test vectors, as we have constraint on the + * input and key size, and we truncate the output. + */ +static struct test_cast tests[] = { + { + .key = "0b0b0b0b0b0b0b0b", + .msg = "48692054", + .result = "8385e24fb4235ac37556b6b886db106284a1da67", + }, + { + .key = "aaaaaaaaaaaaaaaa", + .msg = "dddddddd", + .result = "2c5e219164ff1dca1c4a92318d847bb6b9d44492", + }, + { + .key = "0102030405060708", + .msg = "cdcdcdcd", + .result = "e73b9ba9969969cefb04aa0d6df18ec2fcc075b6", + }, +}; + +static int __init test_mptcp_crypto(void) +{ + char hmac[20], hmac_hex[41]; + u32 nonce1, nonce2; + u64 key1, key2; + int i, j; + + for (i = 0; i < ARRAY_SIZE(tests); ++i) { + /* mptcp hmap will convert to be before computing the hmac */ + key1 = be64_to_cpu(*((__be64 *)&tests[i].key[0])); + key2 = be64_to_cpu(*((__be64 *)&tests[i].key[8])); + nonce1 = be32_to_cpu(*((__be32 *)&tests[i].msg[0])); + nonce2 = be32_to_cpu(*((__be32 *)&tests[i].msg[4])); + + mptcp_crypto_hmac_sha(key1, key2, nonce1, nonce2, hmac); + for (j = 0; j < 20; ++j) + sprintf(&hmac_hex[j << 1], "%02x", hmac[j] & 0xff); + hmac_hex[40] = 0; + + if (memcmp(hmac_hex, tests[i].result, 40)) + pr_err("test %d failed, got %s expected %s", i, + hmac_hex, tests[i].result); + else + pr_info("test %d [ ok ]", i); + } + return 0; +} + +late_initcall(test_mptcp_crypto); +#endif diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c new file mode 100644 index 000000000000..8e39585d37f3 --- /dev/null +++ b/net/mptcp/ctrl.c @@ -0,0 +1,130 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Multipath TCP + * + * Copyright (c) 2019, Tessares SA. + */ + +#include <linux/sysctl.h> + +#include <net/net_namespace.h> +#include <net/netns/generic.h> + +#include "protocol.h" + +#define MPTCP_SYSCTL_PATH "net/mptcp" + +static int mptcp_pernet_id; +struct mptcp_pernet { + struct ctl_table_header *ctl_table_hdr; + + int mptcp_enabled; +}; + +static struct mptcp_pernet *mptcp_get_pernet(struct net *net) +{ + return net_generic(net, mptcp_pernet_id); +} + +int mptcp_is_enabled(struct net *net) +{ + return mptcp_get_pernet(net)->mptcp_enabled; +} + +static struct ctl_table mptcp_sysctl_table[] = { + { + .procname = "enabled", + .maxlen = sizeof(int), + .mode = 0644, + /* users with CAP_NET_ADMIN or root (not and) can change this + * value, same as other sysctl or the 'net' tree. + */ + .proc_handler = proc_dointvec, + }, + {} +}; + +static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet) +{ + pernet->mptcp_enabled = 1; +} + +static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet) +{ + struct ctl_table_header *hdr; + struct ctl_table *table; + + table = mptcp_sysctl_table; + if (!net_eq(net, &init_net)) { + table = kmemdup(table, sizeof(mptcp_sysctl_table), GFP_KERNEL); + if (!table) + goto err_alloc; + } + + table[0].data = &pernet->mptcp_enabled; + + hdr = register_net_sysctl(net, MPTCP_SYSCTL_PATH, table); + if (!hdr) + goto err_reg; + + pernet->ctl_table_hdr = hdr; + + return 0; + +err_reg: + if (!net_eq(net, &init_net)) + kfree(table); +err_alloc: + return -ENOMEM; +} + +static void mptcp_pernet_del_table(struct mptcp_pernet *pernet) +{ + struct ctl_table *table = pernet->ctl_table_hdr->ctl_table_arg; + + unregister_net_sysctl_table(pernet->ctl_table_hdr); + + kfree(table); +} + +static int __net_init mptcp_net_init(struct net *net) +{ + struct mptcp_pernet *pernet = mptcp_get_pernet(net); + + mptcp_pernet_set_defaults(pernet); + + return mptcp_pernet_new_table(net, pernet); +} + +/* Note: the callback will only be called per extra netns */ +static void __net_exit mptcp_net_exit(struct net *net) +{ + struct mptcp_pernet *pernet = mptcp_get_pernet(net); + + mptcp_pernet_del_table(pernet); +} + +static struct pernet_operations mptcp_pernet_ops = { + .init = mptcp_net_init, + .exit = mptcp_net_exit, + .id = &mptcp_pernet_id, + .size = sizeof(struct mptcp_pernet), +}; + +void __init mptcp_init(void) +{ + mptcp_proto_init(); + + if (register_pernet_subsys(&mptcp_pernet_ops) < 0) + panic("Failed to register MPTCP pernet subsystem.\n"); +} + +#if IS_ENABLED(CONFIG_MPTCP_IPV6) +int __init mptcpv6_init(void) +{ + int err; + + err = mptcp_proto_v6_init(); + + return err; +} +#endif diff --git a/net/mptcp/options.c b/net/mptcp/options.c new file mode 100644 index 000000000000..45acd877bef3 --- /dev/null +++ b/net/mptcp/options.c @@ -0,0 +1,586 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Multipath TCP + * + * Copyright (c) 2017 - 2019, Intel Corporation. + */ + +#include <linux/kernel.h> +#include <net/tcp.h> +#include <net/mptcp.h> +#include "protocol.h" + +static bool mptcp_cap_flag_sha256(u8 flags) +{ + return (flags & MPTCP_CAP_FLAG_MASK) == MPTCP_CAP_HMAC_SHA256; +} + +void mptcp_parse_option(const struct sk_buff *skb, const unsigned char *ptr, + int opsize, struct tcp_options_received *opt_rx) +{ + struct mptcp_options_received *mp_opt = &opt_rx->mptcp; + u8 subtype = *ptr >> 4; + int expected_opsize; + u8 version; + u8 flags; + + switch (subtype) { + case MPTCPOPT_MP_CAPABLE: + /* strict size checking */ + if (!(TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN)) { + if (skb->len > tcp_hdr(skb)->doff << 2) + expected_opsize = TCPOLEN_MPTCP_MPC_ACK_DATA; + else + expected_opsize = TCPOLEN_MPTCP_MPC_ACK; + } else { + if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_ACK) + expected_opsize = TCPOLEN_MPTCP_MPC_SYNACK; + else + expected_opsize = TCPOLEN_MPTCP_MPC_SYN; + } + if (opsize != expected_opsize) + break; + + /* try to be gentle vs future versions on the initial syn */ + version = *ptr++ & MPTCP_VERSION_MASK; + if (opsize != TCPOLEN_MPTCP_MPC_SYN) { + if (version != MPTCP_SUPPORTED_VERSION) + break; + } else if (version < MPTCP_SUPPORTED_VERSION) { + break; + } + + flags = *ptr++; + if (!mptcp_cap_flag_sha256(flags) || + (flags & MPTCP_CAP_EXTENSIBILITY)) + break; + + /* RFC 6824, Section 3.1: + * "For the Checksum Required bit (labeled "A"), if either + * host requires the use of checksums, checksums MUST be used. + * In other words, the only way for checksums not to be used + * is if both hosts in their SYNs set A=0." + * + * Section 3.3.0: + * "If a checksum is not present when its use has been + * negotiated, the receiver MUST close the subflow with a RST as + * it is considered broken." + * + * We don't implement DSS checksum - fall back to TCP. + */ + if (flags & MPTCP_CAP_CHECKSUM_REQD) + break; + + mp_opt->mp_capable = 1; + if (opsize >= TCPOLEN_MPTCP_MPC_SYNACK) { + mp_opt->sndr_key = get_unaligned_be64(ptr); + ptr += 8; + } + if (opsize >= TCPOLEN_MPTCP_MPC_ACK) { + mp_opt->rcvr_key = get_unaligned_be64(ptr); + ptr += 8; + } + if (opsize == TCPOLEN_MPTCP_MPC_ACK_DATA) { + /* Section 3.1.: + * "the data parameters in a MP_CAPABLE are semantically + * equivalent to those in a DSS option and can be used + * interchangeably." + */ + mp_opt->dss = 1; + mp_opt->use_map = 1; + mp_opt->mpc_map = 1; + mp_opt->data_len = get_unaligned_be16(ptr); + ptr += 2; + } + pr_debug("MP_CAPABLE version=%x, flags=%x, optlen=%d sndr=%llu, rcvr=%llu len=%d", + version, flags, opsize, mp_opt->sndr_key, + mp_opt->rcvr_key, mp_opt->data_len); + break; + + case MPTCPOPT_DSS: + pr_debug("DSS"); + ptr++; + + /* we must clear 'mpc_map' be able to detect MP_CAPABLE + * map vs DSS map in mptcp_incoming_options(), and reconstruct + * map info accordingly + */ + mp_opt->mpc_map = 0; + flags = (*ptr++) & MPTCP_DSS_FLAG_MASK; + mp_opt->data_fin = (flags & MPTCP_DSS_DATA_FIN) != 0; + mp_opt->dsn64 = (flags & MPTCP_DSS_DSN64) != 0; + mp_opt->use_map = (flags & MPTCP_DSS_HAS_MAP) != 0; + mp_opt->ack64 = (flags & MPTCP_DSS_ACK64) != 0; + mp_opt->use_ack = (flags & MPTCP_DSS_HAS_ACK); + + pr_debug("data_fin=%d dsn64=%d use_map=%d ack64=%d use_ack=%d", + mp_opt->data_fin, mp_opt->dsn64, + mp_opt->use_map, mp_opt->ack64, + mp_opt->use_ack); + + expected_opsize = TCPOLEN_MPTCP_DSS_BASE; + + if (mp_opt->use_ack) { + if (mp_opt->ack64) + expected_opsize += TCPOLEN_MPTCP_DSS_ACK64; + else + expected_opsize += TCPOLEN_MPTCP_DSS_ACK32; + } + + if (mp_opt->use_map) { + if (mp_opt->dsn64) + expected_opsize += TCPOLEN_MPTCP_DSS_MAP64; + else + expected_opsize += TCPOLEN_MPTCP_DSS_MAP32; + } + + /* RFC 6824, Section 3.3: + * If a checksum is present, but its use had + * not been negotiated in the MP_CAPABLE handshake, + * the checksum field MUST be ignored. + */ + if (opsize != expected_opsize && + opsize != expected_opsize + TCPOLEN_MPTCP_DSS_CHECKSUM) + break; + + mp_opt->dss = 1; + + if (mp_opt->use_ack) { + if (mp_opt->ack64) { + mp_opt->data_ack = get_unaligned_be64(ptr); + ptr += 8; + } else { + mp_opt->data_ack = get_unaligned_be32(ptr); + ptr += 4; + } + + pr_debug("data_ack=%llu", mp_opt->data_ack); + } + + if (mp_opt->use_map) { + if (mp_opt->dsn64) { + mp_opt->data_seq = get_unaligned_be64(ptr); + ptr += 8; + } else { + mp_opt->data_seq = get_unaligned_be32(ptr); + ptr += 4; + } + + mp_opt->subflow_seq = get_unaligned_be32(ptr); + ptr += 4; + + mp_opt->data_len = get_unaligned_be16(ptr); + ptr += 2; + + pr_debug("data_seq=%llu subflow_seq=%u data_len=%u", + mp_opt->data_seq, mp_opt->subflow_seq, + mp_opt->data_len); + } + + break; + + default: + break; + } +} + +void mptcp_get_options(const struct sk_buff *skb, + struct tcp_options_received *opt_rx) +{ + const unsigned char *ptr; + const struct tcphdr *th = tcp_hdr(skb); + int length = (th->doff * 4) - sizeof(struct tcphdr); + + ptr = (const unsigned char *)(th + 1); + + while (length > 0) { + int opcode = *ptr++; + int opsize; + + switch (opcode) { + case TCPOPT_EOL: + return; + case TCPOPT_NOP: /* Ref: RFC 793 section 3.1 */ + length--; + continue; + default: + opsize = *ptr++; + if (opsize < 2) /* "silly options" */ + return; + if (opsize > length) + return; /* don't parse partial options */ + if (opcode == TCPOPT_MPTCP) + mptcp_parse_option(skb, ptr, opsize, opt_rx); + ptr += opsize - 2; + length -= opsize; + } + } +} + +bool mptcp_syn_options(struct sock *sk, const struct sk_buff *skb, + unsigned int *size, struct mptcp_out_options *opts) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); + + /* we will use snd_isn to detect first pkt [re]transmission + * in mptcp_established_options_mp() + */ + subflow->snd_isn = TCP_SKB_CB(skb)->end_seq; + if (subflow->request_mptcp) { + pr_debug("local_key=%llu", subflow->local_key); + opts->suboptions = OPTION_MPTCP_MPC_SYN; + opts->sndr_key = subflow->local_key; + *size = TCPOLEN_MPTCP_MPC_SYN; + return true; + } + return false; +} + +void mptcp_rcv_synsent(struct sock *sk) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); + struct tcp_sock *tp = tcp_sk(sk); + + pr_debug("subflow=%p", subflow); + if (subflow->request_mptcp && tp->rx_opt.mptcp.mp_capable) { + subflow->mp_capable = 1; + subflow->can_ack = 1; + subflow->remote_key = tp->rx_opt.mptcp.sndr_key; + } else { + tcp_sk(sk)->is_mptcp = 0; + } +} + +static bool mptcp_established_options_mp(struct sock *sk, struct sk_buff *skb, + unsigned int *size, + unsigned int remaining, + struct mptcp_out_options *opts) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); + struct mptcp_ext *mpext; + unsigned int data_len; + + pr_debug("subflow=%p fourth_ack=%d seq=%x:%x remaining=%d", subflow, + subflow->fourth_ack, subflow->snd_isn, + skb ? TCP_SKB_CB(skb)->seq : 0, remaining); + + if (subflow->mp_capable && !subflow->fourth_ack && skb && + subflow->snd_isn == TCP_SKB_CB(skb)->seq) { + /* When skb is not available, we better over-estimate the + * emitted options len. A full DSS option is longer than + * TCPOLEN_MPTCP_MPC_ACK_DATA, so let's the caller try to fit + * that. + */ + mpext = mptcp_get_ext(skb); + data_len = mpext ? mpext->data_len : 0; + + /* we will check ext_copy.data_len in mptcp_write_options() to + * discriminate between TCPOLEN_MPTCP_MPC_ACK_DATA and + * TCPOLEN_MPTCP_MPC_ACK + */ + opts->ext_copy.data_len = data_len; + opts->suboptions = OPTION_MPTCP_MPC_ACK; + opts->sndr_key = subflow->local_key; + opts->rcvr_key = subflow->remote_key; + + /* Section 3.1. + * The MP_CAPABLE option is carried on the SYN, SYN/ACK, and ACK + * packets that start the first subflow of an MPTCP connection, + * as well as the first packet that carries data + */ + if (data_len > 0) + *size = ALIGN(TCPOLEN_MPTCP_MPC_ACK_DATA, 4); + else + *size = TCPOLEN_MPTCP_MPC_ACK; + + pr_debug("subflow=%p, local_key=%llu, remote_key=%llu map_len=%d", + subflow, subflow->local_key, subflow->remote_key, + data_len); + + return true; + } + return false; +} + +static void mptcp_write_data_fin(struct mptcp_subflow_context *subflow, + struct mptcp_ext *ext) +{ + ext->data_fin = 1; + + if (!ext->use_map) { + /* RFC6824 requires a DSS mapping with specific values + * if DATA_FIN is set but no data payload is mapped + */ + ext->use_map = 1; + ext->dsn64 = 1; + ext->data_seq = mptcp_sk(subflow->conn)->write_seq; + ext->subflow_seq = 0; + ext->data_len = 1; + } else { + /* If there's an existing DSS mapping, DATA_FIN consumes + * 1 additional byte of mapping space. + */ + ext->data_len++; + } +} + +static bool mptcp_established_options_dss(struct sock *sk, struct sk_buff *skb, + unsigned int *size, + unsigned int remaining, + struct mptcp_out_options *opts) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); + unsigned int dss_size = 0; + struct mptcp_ext *mpext; + struct mptcp_sock *msk; + unsigned int ack_size; + bool ret = false; + u8 tcp_fin; + + if (skb) { + mpext = mptcp_get_ext(skb); + tcp_fin = TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN; + } else { + mpext = NULL; + tcp_fin = 0; + } + + if (!skb || (mpext && mpext->use_map) || tcp_fin) { + unsigned int map_size; + + map_size = TCPOLEN_MPTCP_DSS_BASE + TCPOLEN_MPTCP_DSS_MAP64; + + remaining -= map_size; + dss_size = map_size; + if (mpext) + opts->ext_copy = *mpext; + + if (skb && tcp_fin && + subflow->conn->sk_state != TCP_ESTABLISHED) + mptcp_write_data_fin(subflow, &opts->ext_copy); + ret = true; + } + + opts->ext_copy.use_ack = 0; + msk = mptcp_sk(subflow->conn); + if (!msk || !READ_ONCE(msk->can_ack)) { + *size = ALIGN(dss_size, 4); + return ret; + } + + ack_size = TCPOLEN_MPTCP_DSS_ACK64; + + /* Add kind/length/subtype/flag overhead if mapping is not populated */ + if (dss_size == 0) + ack_size += TCPOLEN_MPTCP_DSS_BASE; + + dss_size += ack_size; + + opts->ext_copy.data_ack = msk->ack_seq; + opts->ext_copy.ack64 = 1; + opts->ext_copy.use_ack = 1; + + *size = ALIGN(dss_size, 4); + return true; +} + +bool mptcp_established_options(struct sock *sk, struct sk_buff *skb, + unsigned int *size, unsigned int remaining, + struct mptcp_out_options *opts) +{ + unsigned int opt_size = 0; + bool ret = false; + + if (mptcp_established_options_mp(sk, skb, &opt_size, remaining, opts)) + ret = true; + else if (mptcp_established_options_dss(sk, skb, &opt_size, remaining, + opts)) + ret = true; + + /* we reserved enough space for the above options, and exceeding the + * TCP option space would be fatal + */ + if (WARN_ON_ONCE(opt_size > remaining)) + return false; + + *size += opt_size; + remaining -= opt_size; + + return ret; +} + +bool mptcp_synack_options(const struct request_sock *req, unsigned int *size, + struct mptcp_out_options *opts) +{ + struct mptcp_subflow_request_sock *subflow_req = mptcp_subflow_rsk(req); + + if (subflow_req->mp_capable) { + opts->suboptions = OPTION_MPTCP_MPC_SYNACK; + opts->sndr_key = subflow_req->local_key; + *size = TCPOLEN_MPTCP_MPC_SYNACK; + pr_debug("subflow_req=%p, local_key=%llu", + subflow_req, subflow_req->local_key); + return true; + } + return false; +} + +static bool check_fourth_ack(struct mptcp_subflow_context *subflow, + struct sk_buff *skb, + struct mptcp_options_received *mp_opt) +{ + /* here we can process OoO, in-window pkts, only in-sequence 4th ack + * are relevant + */ + if (likely(subflow->fourth_ack || + TCP_SKB_CB(skb)->seq != subflow->ssn_offset + 1)) + return true; + + if (mp_opt->use_ack) + subflow->fourth_ack = 1; + + if (subflow->can_ack) + return true; + + /* If the first established packet does not contain MP_CAPABLE + data + * then fallback to TCP + */ + if (!mp_opt->mp_capable) { + subflow->mp_capable = 0; + tcp_sk(mptcp_subflow_tcp_sock(subflow))->is_mptcp = 0; + return false; + } + subflow->remote_key = mp_opt->sndr_key; + subflow->can_ack = 1; + return true; +} + +void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb, + struct tcp_options_received *opt_rx) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); + struct mptcp_options_received *mp_opt; + struct mptcp_ext *mpext; + + mp_opt = &opt_rx->mptcp; + if (!check_fourth_ack(subflow, skb, mp_opt)) + return; + + if (!mp_opt->dss) + return; + + mpext = skb_ext_add(skb, SKB_EXT_MPTCP); + if (!mpext) + return; + + memset(mpext, 0, sizeof(*mpext)); + + if (mp_opt->use_map) { + if (mp_opt->mpc_map) { + /* this is an MP_CAPABLE carrying MPTCP data + * we know this map the first chunk of data + */ + mptcp_crypto_key_sha(subflow->remote_key, NULL, + &mpext->data_seq); + mpext->data_seq++; + mpext->subflow_seq = 1; + mpext->dsn64 = 1; + mpext->mpc_map = 1; + } else { + mpext->data_seq = mp_opt->data_seq; + mpext->subflow_seq = mp_opt->subflow_seq; + mpext->dsn64 = mp_opt->dsn64; + } + mpext->data_len = mp_opt->data_len; + mpext->use_map = 1; + } + + if (mp_opt->use_ack) { + mpext->data_ack = mp_opt->data_ack; + mpext->use_ack = 1; + mpext->ack64 = mp_opt->ack64; + } + + mpext->data_fin = mp_opt->data_fin; +} + +void mptcp_write_options(__be32 *ptr, struct mptcp_out_options *opts) +{ + if ((OPTION_MPTCP_MPC_SYN | OPTION_MPTCP_MPC_SYNACK | + OPTION_MPTCP_MPC_ACK) & opts->suboptions) { + u8 len; + + if (OPTION_MPTCP_MPC_SYN & opts->suboptions) + len = TCPOLEN_MPTCP_MPC_SYN; + else if (OPTION_MPTCP_MPC_SYNACK & opts->suboptions) + len = TCPOLEN_MPTCP_MPC_SYNACK; + else if (opts->ext_copy.data_len) + len = TCPOLEN_MPTCP_MPC_ACK_DATA; + else + len = TCPOLEN_MPTCP_MPC_ACK; + + *ptr++ = htonl((TCPOPT_MPTCP << 24) | (len << 16) | + (MPTCPOPT_MP_CAPABLE << 12) | + (MPTCP_SUPPORTED_VERSION << 8) | + MPTCP_CAP_HMAC_SHA256); + + if (!((OPTION_MPTCP_MPC_SYNACK | OPTION_MPTCP_MPC_ACK) & + opts->suboptions)) + goto mp_capable_done; + + put_unaligned_be64(opts->sndr_key, ptr); + ptr += 2; + if (!((OPTION_MPTCP_MPC_ACK) & opts->suboptions)) + goto mp_capable_done; + + put_unaligned_be64(opts->rcvr_key, ptr); + ptr += 2; + if (!opts->ext_copy.data_len) + goto mp_capable_done; + + put_unaligned_be32(opts->ext_copy.data_len << 16 | + TCPOPT_NOP << 8 | TCPOPT_NOP, ptr); + ptr += 1; + } + +mp_capable_done: + if (opts->ext_copy.use_ack || opts->ext_copy.use_map) { + struct mptcp_ext *mpext = &opts->ext_copy; + u8 len = TCPOLEN_MPTCP_DSS_BASE; + u8 flags = 0; + + if (mpext->use_ack) { + len += TCPOLEN_MPTCP_DSS_ACK64; + flags = MPTCP_DSS_HAS_ACK | MPTCP_DSS_ACK64; + } + + if (mpext->use_map) { + len += TCPOLEN_MPTCP_DSS_MAP64; + + /* Use only 64-bit mapping flags for now, add + * support for optional 32-bit mappings later. + */ + flags |= MPTCP_DSS_HAS_MAP | MPTCP_DSS_DSN64; + if (mpext->data_fin) + flags |= MPTCP_DSS_DATA_FIN; + } + + *ptr++ = htonl((TCPOPT_MPTCP << 24) | + (len << 16) | + (MPTCPOPT_DSS << 12) | + (flags)); + + if (mpext->use_ack) { + put_unaligned_be64(mpext->data_ack, ptr); + ptr += 2; + } + + if (mpext->use_map) { + put_unaligned_be64(mpext->data_seq, ptr); + ptr += 2; + put_unaligned_be32(mpext->subflow_seq, ptr); + ptr += 1; + put_unaligned_be32(mpext->data_len << 16 | + TCPOPT_NOP << 8 | TCPOPT_NOP, ptr); + } + } +} diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c new file mode 100644 index 000000000000..39fdca79ce90 --- /dev/null +++ b/net/mptcp/protocol.c @@ -0,0 +1,1276 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Multipath TCP + * + * Copyright (c) 2017 - 2019, Intel Corporation. + */ + +#define pr_fmt(fmt) "MPTCP: " fmt + +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/netdevice.h> +#include <linux/sched/signal.h> +#include <linux/atomic.h> +#include <net/sock.h> +#include <net/inet_common.h> +#include <net/inet_hashtables.h> +#include <net/protocol.h> +#include <net/tcp.h> +#if IS_ENABLED(CONFIG_MPTCP_IPV6) +#include <net/transp_v6.h> +#endif +#include <net/mptcp.h> +#include "protocol.h" + +#define MPTCP_SAME_STATE TCP_MAX_STATES + +static void __mptcp_close(struct sock *sk, long timeout); + +static const struct proto_ops *tcp_proto_ops(struct sock *sk) +{ +#if IS_ENABLED(CONFIG_IPV6) + if (sk->sk_family == AF_INET6) + return &inet6_stream_ops; +#endif + return &inet_stream_ops; +} + +/* MP_CAPABLE handshake failed, convert msk to plain tcp, replacing + * socket->sk and stream ops and destroying msk + * return the msk socket, as we can't access msk anymore after this function + * completes + * Called with msk lock held, releases such lock before returning + */ +static struct socket *__mptcp_fallback_to_tcp(struct mptcp_sock *msk, + struct sock *ssk) +{ + struct mptcp_subflow_context *subflow; + struct socket *sock; + struct sock *sk; + + sk = (struct sock *)msk; + sock = sk->sk_socket; + subflow = mptcp_subflow_ctx(ssk); + + /* detach the msk socket */ + list_del_init(&subflow->node); + sock_orphan(sk); + sock->sk = NULL; + + /* socket is now TCP */ + lock_sock(ssk); + sock_graft(ssk, sock); + if (subflow->conn) { + /* We can't release the ULP data on a live socket, + * restore the tcp callback + */ + mptcp_subflow_tcp_fallback(ssk, subflow); + sock_put(subflow->conn); + subflow->conn = NULL; + } + release_sock(ssk); + sock->ops = tcp_proto_ops(ssk); + + /* destroy the left-over msk sock */ + __mptcp_close(sk, 0); + return sock; +} + +/* If msk has an initial subflow socket, and the MP_CAPABLE handshake has not + * completed yet or has failed, return the subflow socket. + * Otherwise return NULL. + */ +static struct socket *__mptcp_nmpc_socket(const struct mptcp_sock *msk) +{ + if (!msk->subflow || READ_ONCE(msk->can_ack)) + return NULL; + + return msk->subflow; +} + +static bool __mptcp_needs_tcp_fallback(const struct mptcp_sock *msk) +{ + return msk->first && !sk_is_mptcp(msk->first); +} + +/* if the mp_capable handshake has failed, it fallbacks msk to plain TCP, + * releases the socket lock and returns a reference to the now TCP socket. + * Otherwise returns NULL + */ +static struct socket *__mptcp_tcp_fallback(struct mptcp_sock *msk) +{ + sock_owned_by_me((const struct sock *)msk); + + if (likely(!__mptcp_needs_tcp_fallback(msk))) + return NULL; + + if (msk->subflow) { + /* the first subflow is an active connection, discart the + * paired socket + */ + msk->subflow->sk = NULL; + sock_release(msk->subflow); + msk->subflow = NULL; + } + + return __mptcp_fallback_to_tcp(msk, msk->first); +} + +static bool __mptcp_can_create_subflow(const struct mptcp_sock *msk) +{ + return !msk->first; +} + +static struct socket *__mptcp_socket_create(struct mptcp_sock *msk, int state) +{ + struct mptcp_subflow_context *subflow; + struct sock *sk = (struct sock *)msk; + struct socket *ssock; + int err; + + ssock = __mptcp_nmpc_socket(msk); + if (ssock) + goto set_state; + + if (!__mptcp_can_create_subflow(msk)) + return ERR_PTR(-EINVAL); + + err = mptcp_subflow_create_socket(sk, &ssock); + if (err) + return ERR_PTR(err); + + msk->first = ssock->sk; + msk->subflow = ssock; + subflow = mptcp_subflow_ctx(ssock->sk); + list_add(&subflow->node, &msk->conn_list); + subflow->request_mptcp = 1; + +set_state: + if (state != MPTCP_SAME_STATE) + inet_sk_state_store(sk, state); + return ssock; +} + +static struct sock *mptcp_subflow_get(const struct mptcp_sock *msk) +{ + struct mptcp_subflow_context *subflow; + + sock_owned_by_me((const struct sock *)msk); + + mptcp_for_each_subflow(msk, subflow) { + return mptcp_subflow_tcp_sock(subflow); + } + + return NULL; +} + +static bool mptcp_ext_cache_refill(struct mptcp_sock *msk) +{ + if (!msk->cached_ext) + msk->cached_ext = __skb_ext_alloc(); + + return !!msk->cached_ext; +} + +static struct sock *mptcp_subflow_recv_lookup(const struct mptcp_sock *msk) +{ + struct mptcp_subflow_context *subflow; + struct sock *sk = (struct sock *)msk; + + sock_owned_by_me(sk); + + mptcp_for_each_subflow(msk, subflow) { + if (subflow->data_avail) + return mptcp_subflow_tcp_sock(subflow); + } + + return NULL; +} + +static inline bool mptcp_skb_can_collapse_to(const struct mptcp_sock *msk, + const struct sk_buff *skb, + const struct mptcp_ext *mpext) +{ + if (!tcp_skb_can_collapse_to(skb)) + return false; + + /* can collapse only if MPTCP level sequence is in order */ + return mpext && mpext->data_seq + mpext->data_len == msk->write_seq; +} + +static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk, + struct msghdr *msg, long *timeo, int *pmss_now, + int *ps_goal) +{ + int mss_now, avail_size, size_goal, ret; + struct mptcp_sock *msk = mptcp_sk(sk); + struct mptcp_ext *mpext = NULL; + struct sk_buff *skb, *tail; + bool can_collapse = false; + struct page_frag *pfrag; + size_t psize; + + /* use the mptcp page cache so that we can easily move the data + * from one substream to another, but do per subflow memory accounting + */ + pfrag = sk_page_frag(sk); + while (!sk_page_frag_refill(ssk, pfrag) || + !mptcp_ext_cache_refill(msk)) { + ret = sk_stream_wait_memory(ssk, timeo); + if (ret) + return ret; + if (unlikely(__mptcp_needs_tcp_fallback(msk))) + return 0; + } + + /* compute copy limit */ + mss_now = tcp_send_mss(ssk, &size_goal, msg->msg_flags); + *pmss_now = mss_now; + *ps_goal = size_goal; + avail_size = size_goal; + skb = tcp_write_queue_tail(ssk); + if (skb) { + mpext = skb_ext_find(skb, SKB_EXT_MPTCP); + + /* Limit the write to the size available in the + * current skb, if any, so that we create at most a new skb. + * Explicitly tells TCP internals to avoid collapsing on later + * queue management operation, to avoid breaking the ext <-> + * SSN association set here + */ + can_collapse = (size_goal - skb->len > 0) && + mptcp_skb_can_collapse_to(msk, skb, mpext); + if (!can_collapse) + TCP_SKB_CB(skb)->eor = 1; + else + avail_size = size_goal - skb->len; + } + psize = min_t(size_t, pfrag->size - pfrag->offset, avail_size); + + /* Copy to page */ + pr_debug("left=%zu", msg_data_left(msg)); + psize = copy_page_from_iter(pfrag->page, pfrag->offset, + min_t(size_t, msg_data_left(msg), psize), + &msg->msg_iter); + pr_debug("left=%zu", msg_data_left(msg)); + if (!psize) + return -EINVAL; + + /* tell the TCP stack to delay the push so that we can safely + * access the skb after the sendpages call + */ + ret = do_tcp_sendpages(ssk, pfrag->page, pfrag->offset, psize, + msg->msg_flags | MSG_SENDPAGE_NOTLAST); + if (ret <= 0) + return ret; + if (unlikely(ret < psize)) + iov_iter_revert(&msg->msg_iter, psize - ret); + + /* if the tail skb extension is still the cached one, collapsing + * really happened. Note: we can't check for 'same skb' as the sk_buff + * hdr on tail can be transmitted, freed and re-allocated by the + * do_tcp_sendpages() call + */ + tail = tcp_write_queue_tail(ssk); + if (mpext && tail && mpext == skb_ext_find(tail, SKB_EXT_MPTCP)) { + WARN_ON_ONCE(!can_collapse); + mpext->data_len += ret; + goto out; + } + + skb = tcp_write_queue_tail(ssk); + mpext = __skb_ext_set(skb, SKB_EXT_MPTCP, msk->cached_ext); + msk->cached_ext = NULL; + + memset(mpext, 0, sizeof(*mpext)); + mpext->data_seq = msk->write_seq; + mpext->subflow_seq = mptcp_subflow_ctx(ssk)->rel_write_seq; + mpext->data_len = ret; + mpext->use_map = 1; + mpext->dsn64 = 1; + + pr_debug("data_seq=%llu subflow_seq=%u data_len=%u dsn64=%d", + mpext->data_seq, mpext->subflow_seq, mpext->data_len, + mpext->dsn64); + +out: + pfrag->offset += ret; + msk->write_seq += ret; + mptcp_subflow_ctx(ssk)->rel_write_seq += ret; + + return ret; +} + +static void ssk_check_wmem(struct mptcp_sock *msk, struct sock *ssk) +{ + struct socket *sock; + + if (likely(sk_stream_is_writeable(ssk))) + return; + + sock = READ_ONCE(ssk->sk_socket); + + if (sock) { + clear_bit(MPTCP_SEND_SPACE, &msk->flags); + smp_mb__after_atomic(); + /* set NOSPACE only after clearing SEND_SPACE flag */ + set_bit(SOCK_NOSPACE, &sock->flags); + } +} + +static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) +{ + int mss_now = 0, size_goal = 0, ret = 0; + struct mptcp_sock *msk = mptcp_sk(sk); + struct socket *ssock; + size_t copied = 0; + struct sock *ssk; + long timeo; + + if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL)) + return -EOPNOTSUPP; + + lock_sock(sk); + ssock = __mptcp_tcp_fallback(msk); + if (unlikely(ssock)) { +fallback: + pr_debug("fallback passthrough"); + ret = sock_sendmsg(ssock, msg); + return ret >= 0 ? ret + copied : (copied ? copied : ret); + } + + timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT); + + ssk = mptcp_subflow_get(msk); + if (!ssk) { + release_sock(sk); + return -ENOTCONN; + } + + pr_debug("conn_list->subflow=%p", ssk); + + lock_sock(ssk); + while (msg_data_left(msg)) { + ret = mptcp_sendmsg_frag(sk, ssk, msg, &timeo, &mss_now, + &size_goal); + if (ret < 0) + break; + if (ret == 0 && unlikely(__mptcp_needs_tcp_fallback(msk))) { + release_sock(ssk); + ssock = __mptcp_tcp_fallback(msk); + goto fallback; + } + + copied += ret; + } + + if (copied) { + ret = copied; + tcp_push(ssk, msg->msg_flags, mss_now, tcp_sk(ssk)->nonagle, + size_goal); + } + + ssk_check_wmem(msk, ssk); + release_sock(ssk); + release_sock(sk); + return ret; +} + +int mptcp_read_actor(read_descriptor_t *desc, struct sk_buff *skb, + unsigned int offset, size_t len) +{ + struct mptcp_read_arg *arg = desc->arg.data; + size_t copy_len; + + copy_len = min(desc->count, len); + + if (likely(arg->msg)) { + int err; + + err = skb_copy_datagram_msg(skb, offset, arg->msg, copy_len); + if (err) { + pr_debug("error path"); + desc->error = err; + return err; + } + } else { + pr_debug("Flushing skb payload"); + } + + desc->count -= copy_len; + + pr_debug("consumed %zu bytes, %zu left", copy_len, desc->count); + return copy_len; +} + +static void mptcp_wait_data(struct sock *sk, long *timeo) +{ + DEFINE_WAIT_FUNC(wait, woken_wake_function); + struct mptcp_sock *msk = mptcp_sk(sk); + + add_wait_queue(sk_sleep(sk), &wait); + sk_set_bit(SOCKWQ_ASYNC_WAITDATA, sk); + + sk_wait_event(sk, timeo, + test_and_clear_bit(MPTCP_DATA_READY, &msk->flags), &wait); + + sk_clear_bit(SOCKWQ_ASYNC_WAITDATA, sk); + remove_wait_queue(sk_sleep(sk), &wait); +} + +static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, + int nonblock, int flags, int *addr_len) +{ + struct mptcp_sock *msk = mptcp_sk(sk); + struct mptcp_subflow_context *subflow; + bool more_data_avail = false; + struct mptcp_read_arg arg; + read_descriptor_t desc; + bool wait_data = false; + struct socket *ssock; + struct tcp_sock *tp; + bool done = false; + struct sock *ssk; + int copied = 0; + int target; + long timeo; + + if (msg->msg_flags & ~(MSG_WAITALL | MSG_DONTWAIT)) + return -EOPNOTSUPP; + + lock_sock(sk); + ssock = __mptcp_tcp_fallback(msk); + if (unlikely(ssock)) { +fallback: + pr_debug("fallback-read subflow=%p", + mptcp_subflow_ctx(ssock->sk)); + copied = sock_recvmsg(ssock, msg, flags); + return copied; + } + + arg.msg = msg; + desc.arg.data = &arg; + desc.error = 0; + + timeo = sock_rcvtimeo(sk, nonblock); + + len = min_t(size_t, len, INT_MAX); + target = sock_rcvlowat(sk, flags & MSG_WAITALL, len); + + while (!done) { + u32 map_remaining; + int bytes_read; + + ssk = mptcp_subflow_recv_lookup(msk); + pr_debug("msk=%p ssk=%p", msk, ssk); + if (!ssk) + goto wait_for_data; + + subflow = mptcp_subflow_ctx(ssk); + tp = tcp_sk(ssk); + + lock_sock(ssk); + do { + /* try to read as much data as available */ + map_remaining = subflow->map_data_len - + mptcp_subflow_get_map_offset(subflow); + desc.count = min_t(size_t, len - copied, map_remaining); + pr_debug("reading %zu bytes, copied %d", desc.count, + copied); + bytes_read = tcp_read_sock(ssk, &desc, + mptcp_read_actor); + if (bytes_read < 0) { + if (!copied) + copied = bytes_read; + done = true; + goto next; + } + + pr_debug("msk ack_seq=%llx -> %llx", msk->ack_seq, + msk->ack_seq + bytes_read); + msk->ack_seq += bytes_read; + copied += bytes_read; + if (copied >= len) { + done = true; + goto next; + } + if (tp->urg_data && tp->urg_seq == tp->copied_seq) { + pr_err("Urgent data present, cannot proceed"); + done = true; + goto next; + } +next: + more_data_avail = mptcp_subflow_data_available(ssk); + } while (more_data_avail && !done); + release_sock(ssk); + continue; + +wait_for_data: + more_data_avail = false; + + /* only the master socket status is relevant here. The exit + * conditions mirror closely tcp_recvmsg() + */ + if (copied >= target) + break; + + if (copied) { + if (sk->sk_err || + sk->sk_state == TCP_CLOSE || + (sk->sk_shutdown & RCV_SHUTDOWN) || + !timeo || + signal_pending(current)) + break; + } else { + if (sk->sk_err) { + copied = sock_error(sk); + break; + } + + if (sk->sk_shutdown & RCV_SHUTDOWN) + break; + + if (sk->sk_state == TCP_CLOSE) { + copied = -ENOTCONN; + break; + } + + if (!timeo) { + copied = -EAGAIN; + break; + } + + if (signal_pending(current)) { + copied = sock_intr_errno(timeo); + break; + } + } + + pr_debug("block timeout %ld", timeo); + wait_data = true; + mptcp_wait_data(sk, &timeo); + if (unlikely(__mptcp_tcp_fallback(msk))) + goto fallback; + } + + if (more_data_avail) { + if (!test_bit(MPTCP_DATA_READY, &msk->flags)) + set_bit(MPTCP_DATA_READY, &msk->flags); + } else if (!wait_data) { + clear_bit(MPTCP_DATA_READY, &msk->flags); + + /* .. race-breaker: ssk might get new data after last + * data_available() returns false. + */ + ssk = mptcp_subflow_recv_lookup(msk); + if (unlikely(ssk)) + set_bit(MPTCP_DATA_READY, &msk->flags); + } + + release_sock(sk); + return copied; +} + +/* subflow sockets can be either outgoing (connect) or incoming + * (accept). + * + * Outgoing subflows use in-kernel sockets. + * Incoming subflows do not have their own 'struct socket' allocated, + * so we need to use tcp_close() after detaching them from the mptcp + * parent socket. + */ +static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk, + struct mptcp_subflow_context *subflow, + long timeout) +{ + struct socket *sock = READ_ONCE(ssk->sk_socket); + + list_del(&subflow->node); + + if (sock && sock != sk->sk_socket) { + /* outgoing subflow */ + sock_release(sock); + } else { + /* incoming subflow */ + tcp_close(ssk, timeout); + } +} + +static int __mptcp_init_sock(struct sock *sk) +{ + struct mptcp_sock *msk = mptcp_sk(sk); + + INIT_LIST_HEAD(&msk->conn_list); + __set_bit(MPTCP_SEND_SPACE, &msk->flags); + + msk->first = NULL; + + return 0; +} + +static int mptcp_init_sock(struct sock *sk) +{ + if (!mptcp_is_enabled(sock_net(sk))) + return -ENOPROTOOPT; + + return __mptcp_init_sock(sk); +} + +static void mptcp_subflow_shutdown(struct sock *ssk, int how) +{ + lock_sock(ssk); + + switch (ssk->sk_state) { + case TCP_LISTEN: + if (!(how & RCV_SHUTDOWN)) + break; + /* fall through */ + case TCP_SYN_SENT: + tcp_disconnect(ssk, O_NONBLOCK); + break; + default: + ssk->sk_shutdown |= how; + tcp_shutdown(ssk, how); + break; + } + + /* Wake up anyone sleeping in poll. */ + ssk->sk_state_change(ssk); + release_sock(ssk); +} + +/* Called with msk lock held, releases such lock before returning */ +static void __mptcp_close(struct sock *sk, long timeout) +{ + struct mptcp_subflow_context *subflow, *tmp; + struct mptcp_sock *msk = mptcp_sk(sk); + + mptcp_token_destroy(msk->token); + inet_sk_state_store(sk, TCP_CLOSE); + + list_for_each_entry_safe(subflow, tmp, &msk->conn_list, node) { + struct sock *ssk = mptcp_subflow_tcp_sock(subflow); + + __mptcp_close_ssk(sk, ssk, subflow, timeout); + } + + if (msk->cached_ext) + __skb_ext_put(msk->cached_ext); + release_sock(sk); + sk_common_release(sk); +} + +static void mptcp_close(struct sock *sk, long timeout) +{ + lock_sock(sk); + __mptcp_close(sk, timeout); +} + +static void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk) +{ +#if IS_ENABLED(CONFIG_MPTCP_IPV6) + const struct ipv6_pinfo *ssk6 = inet6_sk(ssk); + struct ipv6_pinfo *msk6 = inet6_sk(msk); + + msk->sk_v6_daddr = ssk->sk_v6_daddr; + msk->sk_v6_rcv_saddr = ssk->sk_v6_rcv_saddr; + + if (msk6 && ssk6) { + msk6->saddr = ssk6->saddr; + msk6->flow_label = ssk6->flow_label; + } +#endif + + inet_sk(msk)->inet_num = inet_sk(ssk)->inet_num; + inet_sk(msk)->inet_dport = inet_sk(ssk)->inet_dport; + inet_sk(msk)->inet_sport = inet_sk(ssk)->inet_sport; + inet_sk(msk)->inet_daddr = inet_sk(ssk)->inet_daddr; + inet_sk(msk)->inet_saddr = inet_sk(ssk)->inet_saddr; + inet_sk(msk)->inet_rcv_saddr = inet_sk(ssk)->inet_rcv_saddr; +} + +static struct sock *mptcp_accept(struct sock *sk, int flags, int *err, + bool kern) +{ + struct mptcp_sock *msk = mptcp_sk(sk); + struct socket *listener; + struct sock *newsk; + + listener = __mptcp_nmpc_socket(msk); + if (WARN_ON_ONCE(!listener)) { + *err = -EINVAL; + return NULL; + } + + pr_debug("msk=%p, listener=%p", msk, mptcp_subflow_ctx(listener->sk)); + newsk = inet_csk_accept(listener->sk, flags, err, kern); + if (!newsk) + return NULL; + + pr_debug("msk=%p, subflow is mptcp=%d", msk, sk_is_mptcp(newsk)); + + if (sk_is_mptcp(newsk)) { + struct mptcp_subflow_context *subflow; + struct sock *new_mptcp_sock; + struct sock *ssk = newsk; + u64 ack_seq; + + subflow = mptcp_subflow_ctx(newsk); + lock_sock(sk); + + local_bh_disable(); + new_mptcp_sock = sk_clone_lock(sk, GFP_ATOMIC); + if (!new_mptcp_sock) { + *err = -ENOBUFS; + local_bh_enable(); + release_sock(sk); + mptcp_subflow_shutdown(newsk, SHUT_RDWR + 1); + tcp_close(newsk, 0); + return NULL; + } + + __mptcp_init_sock(new_mptcp_sock); + + msk = mptcp_sk(new_mptcp_sock); + msk->local_key = subflow->local_key; + msk->token = subflow->token; + msk->subflow = NULL; + msk->first = newsk; + + mptcp_token_update_accept(newsk, new_mptcp_sock); + + msk->write_seq = subflow->idsn + 1; + if (subflow->can_ack) { + msk->can_ack = true; + msk->remote_key = subflow->remote_key; + mptcp_crypto_key_sha(msk->remote_key, NULL, &ack_seq); + ack_seq++; + msk->ack_seq = ack_seq; + } + newsk = new_mptcp_sock; + mptcp_copy_inaddrs(newsk, ssk); + list_add(&subflow->node, &msk->conn_list); + + /* will be fully established at mptcp_stream_accept() + * completion. + */ + inet_sk_state_store(new_mptcp_sock, TCP_SYN_RECV); + bh_unlock_sock(new_mptcp_sock); + local_bh_enable(); + release_sock(sk); + + /* the subflow can already receive packet, avoid racing with + * the receive path and process the pending ones + */ + lock_sock(ssk); + subflow->rel_write_seq = 1; + subflow->tcp_sock = ssk; + subflow->conn = new_mptcp_sock; + if (unlikely(!skb_queue_empty(&ssk->sk_receive_queue))) + mptcp_subflow_data_available(ssk); + release_sock(ssk); + } + + return newsk; +} + +static void mptcp_destroy(struct sock *sk) +{ +} + +static int mptcp_setsockopt(struct sock *sk, int level, int optname, + char __user *uoptval, unsigned int optlen) +{ + struct mptcp_sock *msk = mptcp_sk(sk); + char __kernel *optval; + int ret = -EOPNOTSUPP; + struct socket *ssock; + + /* will be treated as __user in tcp_setsockopt */ + optval = (char __kernel __force *)uoptval; + + pr_debug("msk=%p", msk); + + /* @@ the meaning of setsockopt() when the socket is connected and + * there are multiple subflows is not defined. + */ + lock_sock(sk); + ssock = __mptcp_socket_create(msk, MPTCP_SAME_STATE); + if (!IS_ERR(ssock)) { + pr_debug("subflow=%p", ssock->sk); + ret = kernel_setsockopt(ssock, level, optname, optval, optlen); + } + release_sock(sk); + + return ret; +} + +static int mptcp_getsockopt(struct sock *sk, int level, int optname, + char __user *uoptval, int __user *uoption) +{ + struct mptcp_sock *msk = mptcp_sk(sk); + char __kernel *optval; + int ret = -EOPNOTSUPP; + int __kernel *option; + struct socket *ssock; + + /* will be treated as __user in tcp_getsockopt */ + optval = (char __kernel __force *)uoptval; + option = (int __kernel __force *)uoption; + + pr_debug("msk=%p", msk); + + /* @@ the meaning of getsockopt() when the socket is connected and + * there are multiple subflows is not defined. + */ + lock_sock(sk); + ssock = __mptcp_socket_create(msk, MPTCP_SAME_STATE); + if (!IS_ERR(ssock)) { + pr_debug("subflow=%p", ssock->sk); + ret = kernel_getsockopt(ssock, level, optname, optval, option); + } + release_sock(sk); + + return ret; +} + +static int mptcp_get_port(struct sock *sk, unsigned short snum) +{ + struct mptcp_sock *msk = mptcp_sk(sk); + struct socket *ssock; + + ssock = __mptcp_nmpc_socket(msk); + pr_debug("msk=%p, subflow=%p", msk, ssock); + if (WARN_ON_ONCE(!ssock)) + return -EINVAL; + + return inet_csk_get_port(ssock->sk, snum); +} + +void mptcp_finish_connect(struct sock *ssk) +{ + struct mptcp_subflow_context *subflow; + struct mptcp_sock *msk; + struct sock *sk; + u64 ack_seq; + + subflow = mptcp_subflow_ctx(ssk); + + if (!subflow->mp_capable) + return; + + sk = subflow->conn; + msk = mptcp_sk(sk); + + pr_debug("msk=%p, token=%u", sk, subflow->token); + + mptcp_crypto_key_sha(subflow->remote_key, NULL, &ack_seq); + ack_seq++; + subflow->map_seq = ack_seq; + subflow->map_subflow_seq = 1; + subflow->rel_write_seq = 1; + + /* the socket is not connected yet, no msk/subflow ops can access/race + * accessing the field below + */ + WRITE_ONCE(msk->remote_key, subflow->remote_key); + WRITE_ONCE(msk->local_key, subflow->local_key); + WRITE_ONCE(msk->token, subflow->token); + WRITE_ONCE(msk->write_seq, subflow->idsn + 1); + WRITE_ONCE(msk->ack_seq, ack_seq); + WRITE_ONCE(msk->can_ack, 1); +} + +static void mptcp_sock_graft(struct sock *sk, struct socket *parent) +{ + write_lock_bh(&sk->sk_callback_lock); + rcu_assign_pointer(sk->sk_wq, &parent->wq); + sk_set_socket(sk, parent); + sk->sk_uid = SOCK_INODE(parent)->i_uid; + write_unlock_bh(&sk->sk_callback_lock); +} + +static bool mptcp_memory_free(const struct sock *sk, int wake) +{ + struct mptcp_sock *msk = mptcp_sk(sk); + + return wake ? test_bit(MPTCP_SEND_SPACE, &msk->flags) : true; +} + +static struct proto mptcp_prot = { + .name = "MPTCP", + .owner = THIS_MODULE, + .init = mptcp_init_sock, + .close = mptcp_close, + .accept = mptcp_accept, + .setsockopt = mptcp_setsockopt, + .getsockopt = mptcp_getsockopt, + .shutdown = tcp_shutdown, + .destroy = mptcp_destroy, + .sendmsg = mptcp_sendmsg, + .recvmsg = mptcp_recvmsg, + .hash = inet_hash, + .unhash = inet_unhash, + .get_port = mptcp_get_port, + .stream_memory_free = mptcp_memory_free, + .obj_size = sizeof(struct mptcp_sock), + .no_autobind = true, +}; + +static int mptcp_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) +{ + struct mptcp_sock *msk = mptcp_sk(sock->sk); + struct socket *ssock; + int err; + + lock_sock(sock->sk); + ssock = __mptcp_socket_create(msk, MPTCP_SAME_STATE); + if (IS_ERR(ssock)) { + err = PTR_ERR(ssock); + goto unlock; + } + + err = ssock->ops->bind(ssock, uaddr, addr_len); + if (!err) + mptcp_copy_inaddrs(sock->sk, ssock->sk); + +unlock: + release_sock(sock->sk); + return err; +} + +static int mptcp_stream_connect(struct socket *sock, struct sockaddr *uaddr, + int addr_len, int flags) +{ + struct mptcp_sock *msk = mptcp_sk(sock->sk); + struct socket *ssock; + int err; + + lock_sock(sock->sk); + ssock = __mptcp_socket_create(msk, TCP_SYN_SENT); + if (IS_ERR(ssock)) { + err = PTR_ERR(ssock); + goto unlock; + } + +#ifdef CONFIG_TCP_MD5SIG + /* no MPTCP if MD5SIG is enabled on this socket or we may run out of + * TCP option space. + */ + if (rcu_access_pointer(tcp_sk(ssock->sk)->md5sig_info)) + mptcp_subflow_ctx(ssock->sk)->request_mptcp = 0; +#endif + + err = ssock->ops->connect(ssock, uaddr, addr_len, flags); + inet_sk_state_store(sock->sk, inet_sk_state_load(ssock->sk)); + mptcp_copy_inaddrs(sock->sk, ssock->sk); + +unlock: + release_sock(sock->sk); + return err; +} + +static int mptcp_v4_getname(struct socket *sock, struct sockaddr *uaddr, + int peer) +{ + if (sock->sk->sk_prot == &tcp_prot) { + /* we are being invoked from __sys_accept4, after + * mptcp_accept() has just accepted a non-mp-capable + * flow: sk is a tcp_sk, not an mptcp one. + * + * Hand the socket over to tcp so all further socket ops + * bypass mptcp. + */ + sock->ops = &inet_stream_ops; + } + + return inet_getname(sock, uaddr, peer); +} + +#if IS_ENABLED(CONFIG_MPTCP_IPV6) +static int mptcp_v6_getname(struct socket *sock, struct sockaddr *uaddr, + int peer) +{ + if (sock->sk->sk_prot == &tcpv6_prot) { + /* we are being invoked from __sys_accept4 after + * mptcp_accept() has accepted a non-mp-capable + * subflow: sk is a tcp_sk, not mptcp. + * + * Hand the socket over to tcp so all further + * socket ops bypass mptcp. + */ + sock->ops = &inet6_stream_ops; + } + + return inet6_getname(sock, uaddr, peer); +} +#endif + +static int mptcp_listen(struct socket *sock, int backlog) +{ + struct mptcp_sock *msk = mptcp_sk(sock->sk); + struct socket *ssock; + int err; + + pr_debug("msk=%p", msk); + + lock_sock(sock->sk); + ssock = __mptcp_socket_create(msk, TCP_LISTEN); + if (IS_ERR(ssock)) { + err = PTR_ERR(ssock); + goto unlock; + } + + err = ssock->ops->listen(ssock, backlog); + inet_sk_state_store(sock->sk, inet_sk_state_load(ssock->sk)); + if (!err) + mptcp_copy_inaddrs(sock->sk, ssock->sk); + +unlock: + release_sock(sock->sk); + return err; +} + +static bool is_tcp_proto(const struct proto *p) +{ +#if IS_ENABLED(CONFIG_MPTCP_IPV6) + return p == &tcp_prot || p == &tcpv6_prot; +#else + return p == &tcp_prot; +#endif +} + +static int mptcp_stream_accept(struct socket *sock, struct socket *newsock, + int flags, bool kern) +{ + struct mptcp_sock *msk = mptcp_sk(sock->sk); + struct socket *ssock; + int err; + + pr_debug("msk=%p", msk); + + lock_sock(sock->sk); + if (sock->sk->sk_state != TCP_LISTEN) + goto unlock_fail; + + ssock = __mptcp_nmpc_socket(msk); + if (!ssock) + goto unlock_fail; + + sock_hold(ssock->sk); + release_sock(sock->sk); + + err = ssock->ops->accept(sock, newsock, flags, kern); + if (err == 0 && !is_tcp_proto(newsock->sk->sk_prot)) { + struct mptcp_sock *msk = mptcp_sk(newsock->sk); + struct mptcp_subflow_context *subflow; + + /* set ssk->sk_socket of accept()ed flows to mptcp socket. + * This is needed so NOSPACE flag can be set from tcp stack. + */ + list_for_each_entry(subflow, &msk->conn_list, node) { + struct sock *ssk = mptcp_subflow_tcp_sock(subflow); + + if (!ssk->sk_socket) + mptcp_sock_graft(ssk, newsock); + } + + inet_sk_state_store(newsock->sk, TCP_ESTABLISHED); + } + + sock_put(ssock->sk); + return err; + +unlock_fail: + release_sock(sock->sk); + return -EINVAL; +} + +static __poll_t mptcp_poll(struct file *file, struct socket *sock, + struct poll_table_struct *wait) +{ + struct sock *sk = sock->sk; + struct mptcp_sock *msk; + struct socket *ssock; + __poll_t mask = 0; + + msk = mptcp_sk(sk); + lock_sock(sk); + ssock = __mptcp_nmpc_socket(msk); + if (ssock) { + mask = ssock->ops->poll(file, ssock, wait); + release_sock(sk); + return mask; + } + + release_sock(sk); + sock_poll_wait(file, sock, wait); + lock_sock(sk); + ssock = __mptcp_tcp_fallback(msk); + if (unlikely(ssock)) + return ssock->ops->poll(file, ssock, NULL); + + if (test_bit(MPTCP_DATA_READY, &msk->flags)) + mask = EPOLLIN | EPOLLRDNORM; + if (sk_stream_is_writeable(sk) && + test_bit(MPTCP_SEND_SPACE, &msk->flags)) + mask |= EPOLLOUT | EPOLLWRNORM; + if (sk->sk_shutdown & RCV_SHUTDOWN) + mask |= EPOLLIN | EPOLLRDNORM | EPOLLRDHUP; + + release_sock(sk); + + return mask; +} + +static int mptcp_shutdown(struct socket *sock, int how) +{ + struct mptcp_sock *msk = mptcp_sk(sock->sk); + struct mptcp_subflow_context *subflow; + int ret = 0; + + pr_debug("sk=%p, how=%d", msk, how); + + lock_sock(sock->sk); + + if (how == SHUT_WR || how == SHUT_RDWR) + inet_sk_state_store(sock->sk, TCP_FIN_WAIT1); + + how++; + + if ((how & ~SHUTDOWN_MASK) || !how) { + ret = -EINVAL; + goto out_unlock; + } + + if (sock->state == SS_CONNECTING) { + if ((1 << sock->sk->sk_state) & + (TCPF_SYN_SENT | TCPF_SYN_RECV | TCPF_CLOSE)) + sock->state = SS_DISCONNECTING; + else + sock->state = SS_CONNECTED; + } + + mptcp_for_each_subflow(msk, subflow) { + struct sock *tcp_sk = mptcp_subflow_tcp_sock(subflow); + + mptcp_subflow_shutdown(tcp_sk, how); + } + +out_unlock: + release_sock(sock->sk); + + return ret; +} + +static const struct proto_ops mptcp_stream_ops = { + .family = PF_INET, + .owner = THIS_MODULE, + .release = inet_release, + .bind = mptcp_bind, + .connect = mptcp_stream_connect, + .socketpair = sock_no_socketpair, + .accept = mptcp_stream_accept, + .getname = mptcp_v4_getname, + .poll = mptcp_poll, + .ioctl = inet_ioctl, + .gettstamp = sock_gettstamp, + .listen = mptcp_listen, + .shutdown = mptcp_shutdown, + .setsockopt = sock_common_setsockopt, + .getsockopt = sock_common_getsockopt, + .sendmsg = inet_sendmsg, + .recvmsg = inet_recvmsg, + .mmap = sock_no_mmap, + .sendpage = inet_sendpage, +#ifdef CONFIG_COMPAT + .compat_setsockopt = compat_sock_common_setsockopt, + .compat_getsockopt = compat_sock_common_getsockopt, +#endif +}; + +static struct inet_protosw mptcp_protosw = { + .type = SOCK_STREAM, + .protocol = IPPROTO_MPTCP, + .prot = &mptcp_prot, + .ops = &mptcp_stream_ops, + .flags = INET_PROTOSW_ICSK, +}; + +void mptcp_proto_init(void) +{ + mptcp_prot.h.hashinfo = tcp_prot.h.hashinfo; + + mptcp_subflow_init(); + + if (proto_register(&mptcp_prot, 1) != 0) + panic("Failed to register MPTCP proto.\n"); + + inet_register_protosw(&mptcp_protosw); +} + +#if IS_ENABLED(CONFIG_MPTCP_IPV6) +static const struct proto_ops mptcp_v6_stream_ops = { + .family = PF_INET6, + .owner = THIS_MODULE, + .release = inet6_release, + .bind = mptcp_bind, + .connect = mptcp_stream_connect, + .socketpair = sock_no_socketpair, + .accept = mptcp_stream_accept, + .getname = mptcp_v6_getname, + .poll = mptcp_poll, + .ioctl = inet6_ioctl, + .gettstamp = sock_gettstamp, + .listen = mptcp_listen, + .shutdown = mptcp_shutdown, + .setsockopt = sock_common_setsockopt, + .getsockopt = sock_common_getsockopt, + .sendmsg = inet6_sendmsg, + .recvmsg = inet6_recvmsg, + .mmap = sock_no_mmap, + .sendpage = inet_sendpage, +#ifdef CONFIG_COMPAT + .compat_setsockopt = compat_sock_common_setsockopt, + .compat_getsockopt = compat_sock_common_getsockopt, +#endif +}; + +static struct proto mptcp_v6_prot; + +static void mptcp_v6_destroy(struct sock *sk) +{ + mptcp_destroy(sk); + inet6_destroy_sock(sk); +} + +static struct inet_protosw mptcp_v6_protosw = { + .type = SOCK_STREAM, + .protocol = IPPROTO_MPTCP, + .prot = &mptcp_v6_prot, + .ops = &mptcp_v6_stream_ops, + .flags = INET_PROTOSW_ICSK, +}; + +int mptcp_proto_v6_init(void) +{ + int err; + + mptcp_v6_prot = mptcp_prot; + strcpy(mptcp_v6_prot.name, "MPTCPv6"); + mptcp_v6_prot.slab = NULL; + mptcp_v6_prot.destroy = mptcp_v6_destroy; + mptcp_v6_prot.obj_size = sizeof(struct mptcp_sock) + + sizeof(struct ipv6_pinfo); + + err = proto_register(&mptcp_v6_prot, 1); + if (err) + return err; + + err = inet6_register_protosw(&mptcp_v6_protosw); + if (err) + proto_unregister(&mptcp_v6_prot); + + return err; +} +#endif diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h new file mode 100644 index 000000000000..8a99a2930284 --- /dev/null +++ b/net/mptcp/protocol.h @@ -0,0 +1,240 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Multipath TCP + * + * Copyright (c) 2017 - 2019, Intel Corporation. + */ + +#ifndef __MPTCP_PROTOCOL_H +#define __MPTCP_PROTOCOL_H + +#include <linux/random.h> +#include <net/tcp.h> +#include <net/inet_connection_sock.h> + +#define MPTCP_SUPPORTED_VERSION 1 + +/* MPTCP option bits */ +#define OPTION_MPTCP_MPC_SYN BIT(0) +#define OPTION_MPTCP_MPC_SYNACK BIT(1) +#define OPTION_MPTCP_MPC_ACK BIT(2) + +/* MPTCP option subtypes */ +#define MPTCPOPT_MP_CAPABLE 0 +#define MPTCPOPT_MP_JOIN 1 +#define MPTCPOPT_DSS 2 +#define MPTCPOPT_ADD_ADDR 3 +#define MPTCPOPT_RM_ADDR 4 +#define MPTCPOPT_MP_PRIO 5 +#define MPTCPOPT_MP_FAIL 6 +#define MPTCPOPT_MP_FASTCLOSE 7 + +/* MPTCP suboption lengths */ +#define TCPOLEN_MPTCP_MPC_SYN 4 +#define TCPOLEN_MPTCP_MPC_SYNACK 12 +#define TCPOLEN_MPTCP_MPC_ACK 20 +#define TCPOLEN_MPTCP_MPC_ACK_DATA 22 +#define TCPOLEN_MPTCP_DSS_BASE 4 +#define TCPOLEN_MPTCP_DSS_ACK32 4 +#define TCPOLEN_MPTCP_DSS_ACK64 8 +#define TCPOLEN_MPTCP_DSS_MAP32 10 +#define TCPOLEN_MPTCP_DSS_MAP64 14 +#define TCPOLEN_MPTCP_DSS_CHECKSUM 2 + +/* MPTCP MP_CAPABLE flags */ +#define MPTCP_VERSION_MASK (0x0F) +#define MPTCP_CAP_CHECKSUM_REQD BIT(7) +#define MPTCP_CAP_EXTENSIBILITY BIT(6) +#define MPTCP_CAP_HMAC_SHA256 BIT(0) +#define MPTCP_CAP_FLAG_MASK (0x3F) + +/* MPTCP DSS flags */ +#define MPTCP_DSS_DATA_FIN BIT(4) +#define MPTCP_DSS_DSN64 BIT(3) +#define MPTCP_DSS_HAS_MAP BIT(2) +#define MPTCP_DSS_ACK64 BIT(1) +#define MPTCP_DSS_HAS_ACK BIT(0) +#define MPTCP_DSS_FLAG_MASK (0x1F) + +/* MPTCP socket flags */ +#define MPTCP_DATA_READY BIT(0) +#define MPTCP_SEND_SPACE BIT(1) + +/* MPTCP connection sock */ +struct mptcp_sock { + /* inet_connection_sock must be the first member */ + struct inet_connection_sock sk; + u64 local_key; + u64 remote_key; + u64 write_seq; + u64 ack_seq; + u32 token; + unsigned long flags; + bool can_ack; + struct list_head conn_list; + struct skb_ext *cached_ext; /* for the next sendmsg */ + struct socket *subflow; /* outgoing connect/listener/!mp_capable */ + struct sock *first; +}; + +#define mptcp_for_each_subflow(__msk, __subflow) \ + list_for_each_entry(__subflow, &((__msk)->conn_list), node) + +static inline struct mptcp_sock *mptcp_sk(const struct sock *sk) +{ + return (struct mptcp_sock *)sk; +} + +struct mptcp_subflow_request_sock { + struct tcp_request_sock sk; + u16 mp_capable : 1, + mp_join : 1, + backup : 1, + remote_key_valid : 1; + u64 local_key; + u64 remote_key; + u64 idsn; + u32 token; + u32 ssn_offset; +}; + +static inline struct mptcp_subflow_request_sock * +mptcp_subflow_rsk(const struct request_sock *rsk) +{ + return (struct mptcp_subflow_request_sock *)rsk; +} + +/* MPTCP subflow context */ +struct mptcp_subflow_context { + struct list_head node;/* conn_list of subflows */ + u64 local_key; + u64 remote_key; + u64 idsn; + u64 map_seq; + u32 snd_isn; + u32 token; + u32 rel_write_seq; + u32 map_subflow_seq; + u32 ssn_offset; + u32 map_data_len; + u32 request_mptcp : 1, /* send MP_CAPABLE */ + mp_capable : 1, /* remote is MPTCP capable */ + fourth_ack : 1, /* send initial DSS */ + conn_finished : 1, + map_valid : 1, + mpc_map : 1, + data_avail : 1, + rx_eof : 1, + can_ack : 1; /* only after processing the remote a key */ + + struct sock *tcp_sock; /* tcp sk backpointer */ + struct sock *conn; /* parent mptcp_sock */ + const struct inet_connection_sock_af_ops *icsk_af_ops; + void (*tcp_data_ready)(struct sock *sk); + void (*tcp_state_change)(struct sock *sk); + void (*tcp_write_space)(struct sock *sk); + + struct rcu_head rcu; +}; + +static inline struct mptcp_subflow_context * +mptcp_subflow_ctx(const struct sock *sk) +{ + struct inet_connection_sock *icsk = inet_csk(sk); + + /* Use RCU on icsk_ulp_data only for sock diag code */ + return (__force struct mptcp_subflow_context *)icsk->icsk_ulp_data; +} + +static inline struct sock * +mptcp_subflow_tcp_sock(const struct mptcp_subflow_context *subflow) +{ + return subflow->tcp_sock; +} + +static inline u64 +mptcp_subflow_get_map_offset(const struct mptcp_subflow_context *subflow) +{ + return tcp_sk(mptcp_subflow_tcp_sock(subflow))->copied_seq - + subflow->ssn_offset - + subflow->map_subflow_seq; +} + +static inline u64 +mptcp_subflow_get_mapped_dsn(const struct mptcp_subflow_context *subflow) +{ + return subflow->map_seq + mptcp_subflow_get_map_offset(subflow); +} + +int mptcp_is_enabled(struct net *net); +bool mptcp_subflow_data_available(struct sock *sk); +void mptcp_subflow_init(void); +int mptcp_subflow_create_socket(struct sock *sk, struct socket **new_sock); + +static inline void mptcp_subflow_tcp_fallback(struct sock *sk, + struct mptcp_subflow_context *ctx) +{ + sk->sk_data_ready = ctx->tcp_data_ready; + sk->sk_state_change = ctx->tcp_state_change; + sk->sk_write_space = ctx->tcp_write_space; + + inet_csk(sk)->icsk_af_ops = ctx->icsk_af_ops; +} + +extern const struct inet_connection_sock_af_ops ipv4_specific; +#if IS_ENABLED(CONFIG_MPTCP_IPV6) +extern const struct inet_connection_sock_af_ops ipv6_specific; +#endif + +void mptcp_proto_init(void); +#if IS_ENABLED(CONFIG_MPTCP_IPV6) +int mptcp_proto_v6_init(void); +#endif + +struct mptcp_read_arg { + struct msghdr *msg; +}; + +int mptcp_read_actor(read_descriptor_t *desc, struct sk_buff *skb, + unsigned int offset, size_t len); + +void mptcp_get_options(const struct sk_buff *skb, + struct tcp_options_received *opt_rx); + +void mptcp_finish_connect(struct sock *sk); + +int mptcp_token_new_request(struct request_sock *req); +void mptcp_token_destroy_request(u32 token); +int mptcp_token_new_connect(struct sock *sk); +int mptcp_token_new_accept(u32 token); +void mptcp_token_update_accept(struct sock *sk, struct sock *conn); +void mptcp_token_destroy(u32 token); + +void mptcp_crypto_key_sha(u64 key, u32 *token, u64 *idsn); +static inline void mptcp_crypto_key_gen_sha(u64 *key, u32 *token, u64 *idsn) +{ + /* we might consider a faster version that computes the key as a + * hash of some information available in the MPTCP socket. Use + * random data at the moment, as it's probably the safest option + * in case multiple sockets are opened in different namespaces at + * the same time. + */ + get_random_bytes(key, sizeof(u64)); + mptcp_crypto_key_sha(*key, token, idsn); +} + +void mptcp_crypto_hmac_sha(u64 key1, u64 key2, u32 nonce1, u32 nonce2, + void *hash_out); + +static inline struct mptcp_ext *mptcp_get_ext(struct sk_buff *skb) +{ + return (struct mptcp_ext *)skb_ext_find(skb, SKB_EXT_MPTCP); +} + +static inline bool before64(__u64 seq1, __u64 seq2) +{ + return (__s64)(seq1 - seq2) < 0; +} + +#define after64(seq2, seq1) before64(seq1, seq2) + +#endif /* __MPTCP_PROTOCOL_H */ diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c new file mode 100644 index 000000000000..1662e1178949 --- /dev/null +++ b/net/mptcp/subflow.c @@ -0,0 +1,860 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Multipath TCP + * + * Copyright (c) 2017 - 2019, Intel Corporation. + */ + +#define pr_fmt(fmt) "MPTCP: " fmt + +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/netdevice.h> +#include <net/sock.h> +#include <net/inet_common.h> +#include <net/inet_hashtables.h> +#include <net/protocol.h> +#include <net/tcp.h> +#if IS_ENABLED(CONFIG_MPTCP_IPV6) +#include <net/ip6_route.h> +#endif +#include <net/mptcp.h> +#include "protocol.h" + +static int subflow_rebuild_header(struct sock *sk) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); + int err = 0; + + if (subflow->request_mptcp && !subflow->token) { + pr_debug("subflow=%p", sk); + err = mptcp_token_new_connect(sk); + } + + if (err) + return err; + + return subflow->icsk_af_ops->rebuild_header(sk); +} + +static void subflow_req_destructor(struct request_sock *req) +{ + struct mptcp_subflow_request_sock *subflow_req = mptcp_subflow_rsk(req); + + pr_debug("subflow_req=%p", subflow_req); + + if (subflow_req->mp_capable) + mptcp_token_destroy_request(subflow_req->token); + tcp_request_sock_ops.destructor(req); +} + +static void subflow_init_req(struct request_sock *req, + const struct sock *sk_listener, + struct sk_buff *skb) +{ + struct mptcp_subflow_context *listener = mptcp_subflow_ctx(sk_listener); + struct mptcp_subflow_request_sock *subflow_req = mptcp_subflow_rsk(req); + struct tcp_options_received rx_opt; + + pr_debug("subflow_req=%p, listener=%p", subflow_req, listener); + + memset(&rx_opt.mptcp, 0, sizeof(rx_opt.mptcp)); + mptcp_get_options(skb, &rx_opt); + + subflow_req->mp_capable = 0; + subflow_req->remote_key_valid = 0; + +#ifdef CONFIG_TCP_MD5SIG + /* no MPTCP if MD5SIG is enabled on this socket or we may run out of + * TCP option space. + */ + if (rcu_access_pointer(tcp_sk(sk_listener)->md5sig_info)) + return; +#endif + + if (rx_opt.mptcp.mp_capable && listener->request_mptcp) { + int err; + + err = mptcp_token_new_request(req); + if (err == 0) + subflow_req->mp_capable = 1; + + subflow_req->ssn_offset = TCP_SKB_CB(skb)->seq; + } +} + +static void subflow_v4_init_req(struct request_sock *req, + const struct sock *sk_listener, + struct sk_buff *skb) +{ + tcp_rsk(req)->is_mptcp = 1; + + tcp_request_sock_ipv4_ops.init_req(req, sk_listener, skb); + + subflow_init_req(req, sk_listener, skb); +} + +#if IS_ENABLED(CONFIG_MPTCP_IPV6) +static void subflow_v6_init_req(struct request_sock *req, + const struct sock *sk_listener, + struct sk_buff *skb) +{ + tcp_rsk(req)->is_mptcp = 1; + + tcp_request_sock_ipv6_ops.init_req(req, sk_listener, skb); + + subflow_init_req(req, sk_listener, skb); +} +#endif + +static void subflow_finish_connect(struct sock *sk, const struct sk_buff *skb) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); + + subflow->icsk_af_ops->sk_rx_dst_set(sk, skb); + + if (subflow->conn && !subflow->conn_finished) { + pr_debug("subflow=%p, remote_key=%llu", mptcp_subflow_ctx(sk), + subflow->remote_key); + mptcp_finish_connect(sk); + subflow->conn_finished = 1; + + if (skb) { + pr_debug("synack seq=%u", TCP_SKB_CB(skb)->seq); + subflow->ssn_offset = TCP_SKB_CB(skb)->seq; + } + } +} + +static struct request_sock_ops subflow_request_sock_ops; +static struct tcp_request_sock_ops subflow_request_sock_ipv4_ops; + +static int subflow_v4_conn_request(struct sock *sk, struct sk_buff *skb) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); + + pr_debug("subflow=%p", subflow); + + /* Never answer to SYNs sent to broadcast or multicast */ + if (skb_rtable(skb)->rt_flags & (RTCF_BROADCAST | RTCF_MULTICAST)) + goto drop; + + return tcp_conn_request(&subflow_request_sock_ops, + &subflow_request_sock_ipv4_ops, + sk, skb); +drop: + tcp_listendrop(sk); + return 0; +} + +#if IS_ENABLED(CONFIG_MPTCP_IPV6) +static struct tcp_request_sock_ops subflow_request_sock_ipv6_ops; +static struct inet_connection_sock_af_ops subflow_v6_specific; +static struct inet_connection_sock_af_ops subflow_v6m_specific; + +static int subflow_v6_conn_request(struct sock *sk, struct sk_buff *skb) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); + + pr_debug("subflow=%p", subflow); + + if (skb->protocol == htons(ETH_P_IP)) + return subflow_v4_conn_request(sk, skb); + + if (!ipv6_unicast_destination(skb)) + goto drop; + + return tcp_conn_request(&subflow_request_sock_ops, + &subflow_request_sock_ipv6_ops, sk, skb); + +drop: + tcp_listendrop(sk); + return 0; /* don't send reset */ +} +#endif + +static struct sock *subflow_syn_recv_sock(const struct sock *sk, + struct sk_buff *skb, + struct request_sock *req, + struct dst_entry *dst, + struct request_sock *req_unhash, + bool *own_req) +{ + struct mptcp_subflow_context *listener = mptcp_subflow_ctx(sk); + struct mptcp_subflow_request_sock *subflow_req; + struct tcp_options_received opt_rx; + struct sock *child; + + pr_debug("listener=%p, req=%p, conn=%p", listener, req, listener->conn); + + /* if the sk is MP_CAPABLE, we try to fetch the client key */ + subflow_req = mptcp_subflow_rsk(req); + if (subflow_req->mp_capable) { + if (TCP_SKB_CB(skb)->seq != subflow_req->ssn_offset + 1) { + /* here we can receive and accept an in-window, + * out-of-order pkt, which will not carry the MP_CAPABLE + * opt even on mptcp enabled paths + */ + goto create_child; + } + + opt_rx.mptcp.mp_capable = 0; + mptcp_get_options(skb, &opt_rx); + if (opt_rx.mptcp.mp_capable) { + subflow_req->remote_key = opt_rx.mptcp.sndr_key; + subflow_req->remote_key_valid = 1; + } else { + subflow_req->mp_capable = 0; + } + } + +create_child: + child = listener->icsk_af_ops->syn_recv_sock(sk, skb, req, dst, + req_unhash, own_req); + + if (child && *own_req) { + struct mptcp_subflow_context *ctx = mptcp_subflow_ctx(child); + + /* we have null ctx on TCP fallback, not fatal on MPC + * handshake + */ + if (!ctx) + return child; + + if (ctx->mp_capable) { + if (mptcp_token_new_accept(ctx->token)) + goto close_child; + } + } + + return child; + +close_child: + pr_debug("closing child socket"); + tcp_send_active_reset(child, GFP_ATOMIC); + inet_csk_prepare_forced_close(child); + tcp_done(child); + return NULL; +} + +static struct inet_connection_sock_af_ops subflow_specific; + +enum mapping_status { + MAPPING_OK, + MAPPING_INVALID, + MAPPING_EMPTY, + MAPPING_DATA_FIN +}; + +static u64 expand_seq(u64 old_seq, u16 old_data_len, u64 seq) +{ + if ((u32)seq == (u32)old_seq) + return old_seq; + + /* Assume map covers data not mapped yet. */ + return seq | ((old_seq + old_data_len + 1) & GENMASK_ULL(63, 32)); +} + +static void warn_bad_map(struct mptcp_subflow_context *subflow, u32 ssn) +{ + WARN_ONCE(1, "Bad mapping: ssn=%d map_seq=%d map_data_len=%d", + ssn, subflow->map_subflow_seq, subflow->map_data_len); +} + +static bool skb_is_fully_mapped(struct sock *ssk, struct sk_buff *skb) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk); + unsigned int skb_consumed; + + skb_consumed = tcp_sk(ssk)->copied_seq - TCP_SKB_CB(skb)->seq; + if (WARN_ON_ONCE(skb_consumed >= skb->len)) + return true; + + return skb->len - skb_consumed <= subflow->map_data_len - + mptcp_subflow_get_map_offset(subflow); +} + +static bool validate_mapping(struct sock *ssk, struct sk_buff *skb) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk); + u32 ssn = tcp_sk(ssk)->copied_seq - subflow->ssn_offset; + + if (unlikely(before(ssn, subflow->map_subflow_seq))) { + /* Mapping covers data later in the subflow stream, + * currently unsupported. + */ + warn_bad_map(subflow, ssn); + return false; + } + if (unlikely(!before(ssn, subflow->map_subflow_seq + + subflow->map_data_len))) { + /* Mapping does covers past subflow data, invalid */ + warn_bad_map(subflow, ssn + skb->len); + return false; + } + return true; +} + +static enum mapping_status get_mapping_status(struct sock *ssk) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk); + struct mptcp_ext *mpext; + struct sk_buff *skb; + u16 data_len; + u64 map_seq; + + skb = skb_peek(&ssk->sk_receive_queue); + if (!skb) + return MAPPING_EMPTY; + + mpext = mptcp_get_ext(skb); + if (!mpext || !mpext->use_map) { + if (!subflow->map_valid && !skb->len) { + /* the TCP stack deliver 0 len FIN pkt to the receive + * queue, that is the only 0len pkts ever expected here, + * and we can admit no mapping only for 0 len pkts + */ + if (!(TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)) + WARN_ONCE(1, "0len seq %d:%d flags %x", + TCP_SKB_CB(skb)->seq, + TCP_SKB_CB(skb)->end_seq, + TCP_SKB_CB(skb)->tcp_flags); + sk_eat_skb(ssk, skb); + return MAPPING_EMPTY; + } + + if (!subflow->map_valid) + return MAPPING_INVALID; + + goto validate_seq; + } + + pr_debug("seq=%llu is64=%d ssn=%u data_len=%u data_fin=%d", + mpext->data_seq, mpext->dsn64, mpext->subflow_seq, + mpext->data_len, mpext->data_fin); + + data_len = mpext->data_len; + if (data_len == 0) { + pr_err("Infinite mapping not handled"); + return MAPPING_INVALID; + } + + if (mpext->data_fin == 1) { + if (data_len == 1) { + pr_debug("DATA_FIN with no payload"); + if (subflow->map_valid) { + /* A DATA_FIN might arrive in a DSS + * option before the previous mapping + * has been fully consumed. Continue + * handling the existing mapping. + */ + skb_ext_del(skb, SKB_EXT_MPTCP); + return MAPPING_OK; + } else { + return MAPPING_DATA_FIN; + } + } + + /* Adjust for DATA_FIN using 1 byte of sequence space */ + data_len--; + } + + if (!mpext->dsn64) { + map_seq = expand_seq(subflow->map_seq, subflow->map_data_len, + mpext->data_seq); + pr_debug("expanded seq=%llu", subflow->map_seq); + } else { + map_seq = mpext->data_seq; + } + + if (subflow->map_valid) { + /* Allow replacing only with an identical map */ + if (subflow->map_seq == map_seq && + subflow->map_subflow_seq == mpext->subflow_seq && + subflow->map_data_len == data_len) { + skb_ext_del(skb, SKB_EXT_MPTCP); + return MAPPING_OK; + } + + /* If this skb data are fully covered by the current mapping, + * the new map would need caching, which is not supported + */ + if (skb_is_fully_mapped(ssk, skb)) + return MAPPING_INVALID; + + /* will validate the next map after consuming the current one */ + return MAPPING_OK; + } + + subflow->map_seq = map_seq; + subflow->map_subflow_seq = mpext->subflow_seq; + subflow->map_data_len = data_len; + subflow->map_valid = 1; + subflow->mpc_map = mpext->mpc_map; + pr_debug("new map seq=%llu subflow_seq=%u data_len=%u", + subflow->map_seq, subflow->map_subflow_seq, + subflow->map_data_len); + +validate_seq: + /* we revalidate valid mapping on new skb, because we must ensure + * the current skb is completely covered by the available mapping + */ + if (!validate_mapping(ssk, skb)) + return MAPPING_INVALID; + + skb_ext_del(skb, SKB_EXT_MPTCP); + return MAPPING_OK; +} + +static bool subflow_check_data_avail(struct sock *ssk) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk); + enum mapping_status status; + struct mptcp_sock *msk; + struct sk_buff *skb; + + pr_debug("msk=%p ssk=%p data_avail=%d skb=%p", subflow->conn, ssk, + subflow->data_avail, skb_peek(&ssk->sk_receive_queue)); + if (subflow->data_avail) + return true; + + if (!subflow->conn) + return false; + + msk = mptcp_sk(subflow->conn); + for (;;) { + u32 map_remaining; + size_t delta; + u64 ack_seq; + u64 old_ack; + + status = get_mapping_status(ssk); + pr_debug("msk=%p ssk=%p status=%d", msk, ssk, status); + if (status == MAPPING_INVALID) { + ssk->sk_err = EBADMSG; + goto fatal; + } + + if (status != MAPPING_OK) + return false; + + skb = skb_peek(&ssk->sk_receive_queue); + if (WARN_ON_ONCE(!skb)) + return false; + + /* if msk lacks the remote key, this subflow must provide an + * MP_CAPABLE-based mapping + */ + if (unlikely(!READ_ONCE(msk->can_ack))) { + if (!subflow->mpc_map) { + ssk->sk_err = EBADMSG; + goto fatal; + } + WRITE_ONCE(msk->remote_key, subflow->remote_key); + WRITE_ONCE(msk->ack_seq, subflow->map_seq); + WRITE_ONCE(msk->can_ack, true); + } + + old_ack = READ_ONCE(msk->ack_seq); + ack_seq = mptcp_subflow_get_mapped_dsn(subflow); + pr_debug("msk ack_seq=%llx subflow ack_seq=%llx", old_ack, + ack_seq); + if (ack_seq == old_ack) + break; + + /* only accept in-sequence mapping. Old values are spurious + * retransmission; we can hit "future" values on active backup + * subflow switch, we relay on retransmissions to get + * in-sequence data. + * Cuncurrent subflows support will require subflow data + * reordering + */ + map_remaining = subflow->map_data_len - + mptcp_subflow_get_map_offset(subflow); + if (before64(ack_seq, old_ack)) + delta = min_t(size_t, old_ack - ack_seq, map_remaining); + else + delta = min_t(size_t, ack_seq - old_ack, map_remaining); + + /* discard mapped data */ + pr_debug("discarding %zu bytes, current map len=%d", delta, + map_remaining); + if (delta) { + struct mptcp_read_arg arg = { + .msg = NULL, + }; + read_descriptor_t desc = { + .count = delta, + .arg.data = &arg, + }; + int ret; + + ret = tcp_read_sock(ssk, &desc, mptcp_read_actor); + if (ret < 0) { + ssk->sk_err = -ret; + goto fatal; + } + if (ret < delta) + return false; + if (delta == map_remaining) + subflow->map_valid = 0; + } + } + return true; + +fatal: + /* fatal protocol error, close the socket */ + /* This barrier is coupled with smp_rmb() in tcp_poll() */ + smp_wmb(); + ssk->sk_error_report(ssk); + tcp_set_state(ssk, TCP_CLOSE); + tcp_send_active_reset(ssk, GFP_ATOMIC); + return false; +} + +bool mptcp_subflow_data_available(struct sock *sk) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); + struct sk_buff *skb; + + /* check if current mapping is still valid */ + if (subflow->map_valid && + mptcp_subflow_get_map_offset(subflow) >= subflow->map_data_len) { + subflow->map_valid = 0; + subflow->data_avail = 0; + + pr_debug("Done with mapping: seq=%u data_len=%u", + subflow->map_subflow_seq, + subflow->map_data_len); + } + + if (!subflow_check_data_avail(sk)) { + subflow->data_avail = 0; + return false; + } + + skb = skb_peek(&sk->sk_receive_queue); + subflow->data_avail = skb && + before(tcp_sk(sk)->copied_seq, TCP_SKB_CB(skb)->end_seq); + return subflow->data_avail; +} + +static void subflow_data_ready(struct sock *sk) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); + struct sock *parent = subflow->conn; + + if (!parent || !subflow->mp_capable) { + subflow->tcp_data_ready(sk); + + if (parent) + parent->sk_data_ready(parent); + return; + } + + if (mptcp_subflow_data_available(sk)) { + set_bit(MPTCP_DATA_READY, &mptcp_sk(parent)->flags); + + parent->sk_data_ready(parent); + } +} + +static void subflow_write_space(struct sock *sk) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); + struct sock *parent = subflow->conn; + + sk_stream_write_space(sk); + if (parent && sk_stream_is_writeable(sk)) { + set_bit(MPTCP_SEND_SPACE, &mptcp_sk(parent)->flags); + smp_mb__after_atomic(); + /* set SEND_SPACE before sk_stream_write_space clears NOSPACE */ + sk_stream_write_space(parent); + } +} + +static struct inet_connection_sock_af_ops * +subflow_default_af_ops(struct sock *sk) +{ +#if IS_ENABLED(CONFIG_MPTCP_IPV6) + if (sk->sk_family == AF_INET6) + return &subflow_v6_specific; +#endif + return &subflow_specific; +} + +void mptcp_handle_ipv6_mapped(struct sock *sk, bool mapped) +{ +#if IS_ENABLED(CONFIG_MPTCP_IPV6) + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); + struct inet_connection_sock *icsk = inet_csk(sk); + struct inet_connection_sock_af_ops *target; + + target = mapped ? &subflow_v6m_specific : subflow_default_af_ops(sk); + + pr_debug("subflow=%p family=%d ops=%p target=%p mapped=%d", + subflow, sk->sk_family, icsk->icsk_af_ops, target, mapped); + + if (likely(icsk->icsk_af_ops == target)) + return; + + subflow->icsk_af_ops = icsk->icsk_af_ops; + icsk->icsk_af_ops = target; +#endif +} + +int mptcp_subflow_create_socket(struct sock *sk, struct socket **new_sock) +{ + struct mptcp_subflow_context *subflow; + struct net *net = sock_net(sk); + struct socket *sf; + int err; + + err = sock_create_kern(net, sk->sk_family, SOCK_STREAM, IPPROTO_TCP, + &sf); + if (err) + return err; + + lock_sock(sf->sk); + + /* kernel sockets do not by default acquire net ref, but TCP timer + * needs it. + */ + sf->sk->sk_net_refcnt = 1; + get_net(net); + this_cpu_add(*net->core.sock_inuse, 1); + err = tcp_set_ulp(sf->sk, "mptcp"); + release_sock(sf->sk); + + if (err) + return err; + + subflow = mptcp_subflow_ctx(sf->sk); + pr_debug("subflow=%p", subflow); + + *new_sock = sf; + sock_hold(sk); + subflow->conn = sk; + + return 0; +} + +static struct mptcp_subflow_context *subflow_create_ctx(struct sock *sk, + gfp_t priority) +{ + struct inet_connection_sock *icsk = inet_csk(sk); + struct mptcp_subflow_context *ctx; + + ctx = kzalloc(sizeof(*ctx), priority); + if (!ctx) + return NULL; + + rcu_assign_pointer(icsk->icsk_ulp_data, ctx); + INIT_LIST_HEAD(&ctx->node); + + pr_debug("subflow=%p", ctx); + + ctx->tcp_sock = sk; + + return ctx; +} + +static void __subflow_state_change(struct sock *sk) +{ + struct socket_wq *wq; + + rcu_read_lock(); + wq = rcu_dereference(sk->sk_wq); + if (skwq_has_sleeper(wq)) + wake_up_interruptible_all(&wq->wait); + rcu_read_unlock(); +} + +static bool subflow_is_done(const struct sock *sk) +{ + return sk->sk_shutdown & RCV_SHUTDOWN || sk->sk_state == TCP_CLOSE; +} + +static void subflow_state_change(struct sock *sk) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); + struct sock *parent = READ_ONCE(subflow->conn); + + __subflow_state_change(sk); + + /* as recvmsg() does not acquire the subflow socket for ssk selection + * a fin packet carrying a DSS can be unnoticed if we don't trigger + * the data available machinery here. + */ + if (parent && subflow->mp_capable && mptcp_subflow_data_available(sk)) { + set_bit(MPTCP_DATA_READY, &mptcp_sk(parent)->flags); + + parent->sk_data_ready(parent); + } + + if (parent && !(parent->sk_shutdown & RCV_SHUTDOWN) && + !subflow->rx_eof && subflow_is_done(sk)) { + subflow->rx_eof = 1; + parent->sk_shutdown |= RCV_SHUTDOWN; + __subflow_state_change(parent); + } +} + +static int subflow_ulp_init(struct sock *sk) +{ + struct inet_connection_sock *icsk = inet_csk(sk); + struct mptcp_subflow_context *ctx; + struct tcp_sock *tp = tcp_sk(sk); + int err = 0; + + /* disallow attaching ULP to a socket unless it has been + * created with sock_create_kern() + */ + if (!sk->sk_kern_sock) { + err = -EOPNOTSUPP; + goto out; + } + + ctx = subflow_create_ctx(sk, GFP_KERNEL); + if (!ctx) { + err = -ENOMEM; + goto out; + } + + pr_debug("subflow=%p, family=%d", ctx, sk->sk_family); + + tp->is_mptcp = 1; + ctx->icsk_af_ops = icsk->icsk_af_ops; + icsk->icsk_af_ops = subflow_default_af_ops(sk); + ctx->tcp_data_ready = sk->sk_data_ready; + ctx->tcp_state_change = sk->sk_state_change; + ctx->tcp_write_space = sk->sk_write_space; + sk->sk_data_ready = subflow_data_ready; + sk->sk_write_space = subflow_write_space; + sk->sk_state_change = subflow_state_change; +out: + return err; +} + +static void subflow_ulp_release(struct sock *sk) +{ + struct mptcp_subflow_context *ctx = mptcp_subflow_ctx(sk); + + if (!ctx) + return; + + if (ctx->conn) + sock_put(ctx->conn); + + kfree_rcu(ctx, rcu); +} + +static void subflow_ulp_fallback(struct sock *sk, + struct mptcp_subflow_context *old_ctx) +{ + struct inet_connection_sock *icsk = inet_csk(sk); + + mptcp_subflow_tcp_fallback(sk, old_ctx); + icsk->icsk_ulp_ops = NULL; + rcu_assign_pointer(icsk->icsk_ulp_data, NULL); + tcp_sk(sk)->is_mptcp = 0; +} + +static void subflow_ulp_clone(const struct request_sock *req, + struct sock *newsk, + const gfp_t priority) +{ + struct mptcp_subflow_request_sock *subflow_req = mptcp_subflow_rsk(req); + struct mptcp_subflow_context *old_ctx = mptcp_subflow_ctx(newsk); + struct mptcp_subflow_context *new_ctx; + + if (!subflow_req->mp_capable) { + subflow_ulp_fallback(newsk, old_ctx); + return; + } + + new_ctx = subflow_create_ctx(newsk, priority); + if (!new_ctx) { + subflow_ulp_fallback(newsk, old_ctx); + return; + } + + /* see comments in subflow_syn_recv_sock(), MPTCP connection is fully + * established only after we receive the remote key + */ + new_ctx->conn_finished = 1; + new_ctx->icsk_af_ops = old_ctx->icsk_af_ops; + new_ctx->tcp_data_ready = old_ctx->tcp_data_ready; + new_ctx->tcp_state_change = old_ctx->tcp_state_change; + new_ctx->tcp_write_space = old_ctx->tcp_write_space; + new_ctx->mp_capable = 1; + new_ctx->fourth_ack = subflow_req->remote_key_valid; + new_ctx->can_ack = subflow_req->remote_key_valid; + new_ctx->remote_key = subflow_req->remote_key; + new_ctx->local_key = subflow_req->local_key; + new_ctx->token = subflow_req->token; + new_ctx->ssn_offset = subflow_req->ssn_offset; + new_ctx->idsn = subflow_req->idsn; +} + +static struct tcp_ulp_ops subflow_ulp_ops __read_mostly = { + .name = "mptcp", + .owner = THIS_MODULE, + .init = subflow_ulp_init, + .release = subflow_ulp_release, + .clone = subflow_ulp_clone, +}; + +static int subflow_ops_init(struct request_sock_ops *subflow_ops) +{ + subflow_ops->obj_size = sizeof(struct mptcp_subflow_request_sock); + subflow_ops->slab_name = "request_sock_subflow"; + + subflow_ops->slab = kmem_cache_create(subflow_ops->slab_name, + subflow_ops->obj_size, 0, + SLAB_ACCOUNT | + SLAB_TYPESAFE_BY_RCU, + NULL); + if (!subflow_ops->slab) + return -ENOMEM; + + subflow_ops->destructor = subflow_req_destructor; + + return 0; +} + +void mptcp_subflow_init(void) +{ + subflow_request_sock_ops = tcp_request_sock_ops; + if (subflow_ops_init(&subflow_request_sock_ops) != 0) + panic("MPTCP: failed to init subflow request sock ops\n"); + + subflow_request_sock_ipv4_ops = tcp_request_sock_ipv4_ops; + subflow_request_sock_ipv4_ops.init_req = subflow_v4_init_req; + + subflow_specific = ipv4_specific; + subflow_specific.conn_request = subflow_v4_conn_request; + subflow_specific.syn_recv_sock = subflow_syn_recv_sock; + subflow_specific.sk_rx_dst_set = subflow_finish_connect; + subflow_specific.rebuild_header = subflow_rebuild_header; + +#if IS_ENABLED(CONFIG_MPTCP_IPV6) + subflow_request_sock_ipv6_ops = tcp_request_sock_ipv6_ops; + subflow_request_sock_ipv6_ops.init_req = subflow_v6_init_req; + + subflow_v6_specific = ipv6_specific; + subflow_v6_specific.conn_request = subflow_v6_conn_request; + subflow_v6_specific.syn_recv_sock = subflow_syn_recv_sock; + subflow_v6_specific.sk_rx_dst_set = subflow_finish_connect; + subflow_v6_specific.rebuild_header = subflow_rebuild_header; + + subflow_v6m_specific = subflow_v6_specific; + subflow_v6m_specific.queue_xmit = ipv4_specific.queue_xmit; + subflow_v6m_specific.send_check = ipv4_specific.send_check; + subflow_v6m_specific.net_header_len = ipv4_specific.net_header_len; + subflow_v6m_specific.mtu_reduced = ipv4_specific.mtu_reduced; + subflow_v6m_specific.net_frag_header_len = 0; +#endif + + if (tcp_register_ulp(&subflow_ulp_ops) != 0) + panic("MPTCP: failed to register subflows to ULP\n"); +} diff --git a/net/mptcp/token.c b/net/mptcp/token.c new file mode 100644 index 000000000000..84d887806090 --- /dev/null +++ b/net/mptcp/token.c @@ -0,0 +1,195 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Multipath TCP token management + * Copyright (c) 2017 - 2019, Intel Corporation. + * + * Note: This code is based on mptcp_ctrl.c from multipath-tcp.org, + * authored by: + * + * Sébastien Barré <sebastien.barre@uclouvain.be> + * Christoph Paasch <christoph.paasch@uclouvain.be> + * Jaakko Korkeaniemi <jaakko.korkeaniemi@aalto.fi> + * Gregory Detal <gregory.detal@uclouvain.be> + * Fabien Duchêne <fabien.duchene@uclouvain.be> + * Andreas Seelinger <Andreas.Seelinger@rwth-aachen.de> + * Lavkesh Lahngir <lavkesh51@gmail.com> + * Andreas Ripke <ripke@neclab.eu> + * Vlad Dogaru <vlad.dogaru@intel.com> + * Octavian Purdila <octavian.purdila@intel.com> + * John Ronan <jronan@tssg.org> + * Catalin Nicutar <catalin.nicutar@gmail.com> + * Brandon Heller <brandonh@stanford.edu> + */ + +#define pr_fmt(fmt) "MPTCP: " fmt + +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/radix-tree.h> +#include <linux/ip.h> +#include <linux/tcp.h> +#include <net/sock.h> +#include <net/inet_common.h> +#include <net/protocol.h> +#include <net/mptcp.h> +#include "protocol.h" + +static RADIX_TREE(token_tree, GFP_ATOMIC); +static RADIX_TREE(token_req_tree, GFP_ATOMIC); +static DEFINE_SPINLOCK(token_tree_lock); +static int token_used __read_mostly; + +/** + * mptcp_token_new_request - create new key/idsn/token for subflow_request + * @req - the request socket + * + * This function is called when a new mptcp connection is coming in. + * + * It creates a unique token to identify the new mptcp connection, + * a secret local key and the initial data sequence number (idsn). + * + * Returns 0 on success. + */ +int mptcp_token_new_request(struct request_sock *req) +{ + struct mptcp_subflow_request_sock *subflow_req = mptcp_subflow_rsk(req); + int err; + + while (1) { + u32 token; + + mptcp_crypto_key_gen_sha(&subflow_req->local_key, + &subflow_req->token, + &subflow_req->idsn); + pr_debug("req=%p local_key=%llu, token=%u, idsn=%llu\n", + req, subflow_req->local_key, subflow_req->token, + subflow_req->idsn); + + token = subflow_req->token; + spin_lock_bh(&token_tree_lock); + if (!radix_tree_lookup(&token_req_tree, token) && + !radix_tree_lookup(&token_tree, token)) + break; + spin_unlock_bh(&token_tree_lock); + } + + err = radix_tree_insert(&token_req_tree, + subflow_req->token, &token_used); + spin_unlock_bh(&token_tree_lock); + return err; +} + +/** + * mptcp_token_new_connect - create new key/idsn/token for subflow + * @sk - the socket that will initiate a connection + * + * This function is called when a new outgoing mptcp connection is + * initiated. + * + * It creates a unique token to identify the new mptcp connection, + * a secret local key and the initial data sequence number (idsn). + * + * On success, the mptcp connection can be found again using + * the computed token at a later time, this is needed to process + * join requests. + * + * returns 0 on success. + */ +int mptcp_token_new_connect(struct sock *sk) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); + struct sock *mptcp_sock = subflow->conn; + int err; + + while (1) { + u32 token; + + mptcp_crypto_key_gen_sha(&subflow->local_key, &subflow->token, + &subflow->idsn); + + pr_debug("ssk=%p, local_key=%llu, token=%u, idsn=%llu\n", + sk, subflow->local_key, subflow->token, subflow->idsn); + + token = subflow->token; + spin_lock_bh(&token_tree_lock); + if (!radix_tree_lookup(&token_req_tree, token) && + !radix_tree_lookup(&token_tree, token)) + break; + spin_unlock_bh(&token_tree_lock); + } + err = radix_tree_insert(&token_tree, subflow->token, mptcp_sock); + spin_unlock_bh(&token_tree_lock); + + return err; +} + +/** + * mptcp_token_new_accept - insert token for later processing + * @token: the token to insert to the tree + * + * Called when a SYN packet creates a new logical connection, i.e. + * is not a join request. + * + * We don't have an mptcp socket yet at that point. + * This is paired with mptcp_token_update_accept, called on accept(). + */ +int mptcp_token_new_accept(u32 token) +{ + int err; + + spin_lock_bh(&token_tree_lock); + err = radix_tree_insert(&token_tree, token, &token_used); + spin_unlock_bh(&token_tree_lock); + + return err; +} + +/** + * mptcp_token_update_accept - update token to map to mptcp socket + * @conn: the new struct mptcp_sock + * @sk: the initial subflow for this mptcp socket + * + * Called when the first mptcp socket is created on accept to + * refresh the dummy mapping (done to reserve the token) with + * the mptcp_socket structure that wasn't allocated before. + */ +void mptcp_token_update_accept(struct sock *sk, struct sock *conn) +{ + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); + void __rcu **slot; + + spin_lock_bh(&token_tree_lock); + slot = radix_tree_lookup_slot(&token_tree, subflow->token); + WARN_ON_ONCE(!slot); + if (slot) { + WARN_ON_ONCE(rcu_access_pointer(*slot) != &token_used); + radix_tree_replace_slot(&token_tree, slot, conn); + } + spin_unlock_bh(&token_tree_lock); +} + +/** + * mptcp_token_destroy_request - remove mptcp connection/token + * @token - token of mptcp connection to remove + * + * Remove not-yet-fully-established incoming connection identified + * by @token. + */ +void mptcp_token_destroy_request(u32 token) +{ + spin_lock_bh(&token_tree_lock); + radix_tree_delete(&token_req_tree, token); + spin_unlock_bh(&token_tree_lock); +} + +/** + * mptcp_token_destroy - remove mptcp connection/token + * @token - token of mptcp connection to remove + * + * Remove the connection identified by @token. + */ +void mptcp_token_destroy(u32 token) +{ + spin_lock_bh(&token_tree_lock); + radix_tree_delete(&token_tree, token); + spin_unlock_bh(&token_tree_lock); +} diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile index 5e9b2eb24349..3f572e5a975e 100644 --- a/net/netfilter/Makefile +++ b/net/netfilter/Makefile @@ -81,7 +81,8 @@ nf_tables-objs := nf_tables_core.o nf_tables_api.o nft_chain_filter.o \ nft_chain_route.o nf_tables_offload.o nf_tables_set-objs := nf_tables_set_core.o \ - nft_set_hash.o nft_set_bitmap.o nft_set_rbtree.o + nft_set_hash.o nft_set_bitmap.o nft_set_rbtree.o \ + nft_set_pipapo.o obj-$(CONFIG_NF_TABLES) += nf_tables.o obj-$(CONFIG_NF_TABLES_SET) += nf_tables_set.o diff --git a/net/netfilter/ipset/ip_set_bitmap_gen.h b/net/netfilter/ipset/ip_set_bitmap_gen.h index 077a2cb65fcb..26ab0e9612d8 100644 --- a/net/netfilter/ipset/ip_set_bitmap_gen.h +++ b/net/netfilter/ipset/ip_set_bitmap_gen.h @@ -75,7 +75,7 @@ mtype_flush(struct ip_set *set) if (set->extensions & IPSET_EXT_DESTROY) mtype_ext_cleanup(set); - memset(map->members, 0, map->memsize); + bitmap_zero(map->members, map->elements); set->elements = 0; set->ext_size = 0; } diff --git a/net/netfilter/ipset/ip_set_bitmap_ip.c b/net/netfilter/ipset/ip_set_bitmap_ip.c index abe8f77d7d23..0a2196f59106 100644 --- a/net/netfilter/ipset/ip_set_bitmap_ip.c +++ b/net/netfilter/ipset/ip_set_bitmap_ip.c @@ -37,7 +37,7 @@ MODULE_ALIAS("ip_set_bitmap:ip"); /* Type structure */ struct bitmap_ip { - void *members; /* the set members */ + unsigned long *members; /* the set members */ u32 first_ip; /* host byte order, included in range */ u32 last_ip; /* host byte order, included in range */ u32 elements; /* number of max elements in the set */ @@ -220,7 +220,7 @@ init_map_ip(struct ip_set *set, struct bitmap_ip *map, u32 first_ip, u32 last_ip, u32 elements, u32 hosts, u8 netmask) { - map->members = ip_set_alloc(map->memsize); + map->members = bitmap_zalloc(elements, GFP_KERNEL | __GFP_NOWARN); if (!map->members) return false; map->first_ip = first_ip; @@ -322,7 +322,7 @@ bitmap_ip_create(struct net *net, struct ip_set *set, struct nlattr *tb[], if (!map) return -ENOMEM; - map->memsize = bitmap_bytes(0, elements - 1); + map->memsize = BITS_TO_LONGS(elements) * sizeof(unsigned long); set->variant = &bitmap_ip; if (!init_map_ip(set, map, first_ip, last_ip, elements, hosts, netmask)) { diff --git a/net/netfilter/ipset/ip_set_bitmap_ipmac.c b/net/netfilter/ipset/ip_set_bitmap_ipmac.c index b618713297da..739e343efaf6 100644 --- a/net/netfilter/ipset/ip_set_bitmap_ipmac.c +++ b/net/netfilter/ipset/ip_set_bitmap_ipmac.c @@ -42,7 +42,7 @@ enum { /* Type structure */ struct bitmap_ipmac { - void *members; /* the set members */ + unsigned long *members; /* the set members */ u32 first_ip; /* host byte order, included in range */ u32 last_ip; /* host byte order, included in range */ u32 elements; /* number of max elements in the set */ @@ -299,7 +299,7 @@ static bool init_map_ipmac(struct ip_set *set, struct bitmap_ipmac *map, u32 first_ip, u32 last_ip, u32 elements) { - map->members = ip_set_alloc(map->memsize); + map->members = bitmap_zalloc(elements, GFP_KERNEL | __GFP_NOWARN); if (!map->members) return false; map->first_ip = first_ip; @@ -360,7 +360,7 @@ bitmap_ipmac_create(struct net *net, struct ip_set *set, struct nlattr *tb[], if (!map) return -ENOMEM; - map->memsize = bitmap_bytes(0, elements - 1); + map->memsize = BITS_TO_LONGS(elements) * sizeof(unsigned long); set->variant = &bitmap_ipmac; if (!init_map_ipmac(set, map, first_ip, last_ip, elements)) { kfree(map); diff --git a/net/netfilter/ipset/ip_set_bitmap_port.c b/net/netfilter/ipset/ip_set_bitmap_port.c index 23d6095cb196..b49978dd810d 100644 --- a/net/netfilter/ipset/ip_set_bitmap_port.c +++ b/net/netfilter/ipset/ip_set_bitmap_port.c @@ -30,7 +30,7 @@ MODULE_ALIAS("ip_set_bitmap:port"); /* Type structure */ struct bitmap_port { - void *members; /* the set members */ + unsigned long *members; /* the set members */ u16 first_port; /* host byte order, included in range */ u16 last_port; /* host byte order, included in range */ u32 elements; /* number of max elements in the set */ @@ -231,7 +231,7 @@ static bool init_map_port(struct ip_set *set, struct bitmap_port *map, u16 first_port, u16 last_port) { - map->members = ip_set_alloc(map->memsize); + map->members = bitmap_zalloc(map->elements, GFP_KERNEL | __GFP_NOWARN); if (!map->members) return false; map->first_port = first_port; @@ -271,7 +271,7 @@ bitmap_port_create(struct net *net, struct ip_set *set, struct nlattr *tb[], return -ENOMEM; map->elements = elements; - map->memsize = bitmap_bytes(0, map->elements); + map->memsize = BITS_TO_LONGS(elements) * sizeof(unsigned long); set->variant = &bitmap_port; if (!init_map_port(set, map, first_port, last_port)) { kfree(map); diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c index 8dc892a9dc91..605e0f68f8bd 100644 --- a/net/netfilter/ipvs/ip_vs_sync.c +++ b/net/netfilter/ipvs/ip_vs_sync.c @@ -1239,7 +1239,7 @@ static void ip_vs_process_message(struct netns_ipvs *ipvs, __u8 *buffer, p = msg_end; if (p + sizeof(s->v4) > buffer+buflen) { - IP_VS_ERR_RL("BACKUP, Dropping buffer, to small\n"); + IP_VS_ERR_RL("BACKUP, Dropping buffer, too small\n"); return; } s = (union ip_vs_sync_conn *)p; diff --git a/net/netfilter/nf_conntrack_proto_sctp.c b/net/netfilter/nf_conntrack_proto_sctp.c index 0399ae8f1188..4f897b14b606 100644 --- a/net/netfilter/nf_conntrack_proto_sctp.c +++ b/net/netfilter/nf_conntrack_proto_sctp.c @@ -114,7 +114,7 @@ static const u8 sctp_conntracks[2][11][SCTP_CONNTRACK_MAX] = { { /* ORIGINAL */ /* sNO, sCL, sCW, sCE, sES, sSS, sSR, sSA, sHS, sHA */ -/* init */ {sCW, sCW, sCW, sCE, sES, sSS, sSR, sSA, sCW, sHA}, +/* init */ {sCL, sCL, sCW, sCE, sES, sSS, sSR, sSA, sCW, sHA}, /* init_ack */ {sCL, sCL, sCW, sCE, sES, sSS, sSR, sSA, sCL, sHA}, /* abort */ {sCL, sCL, sCL, sCL, sCL, sCL, sCL, sCL, sCL, sCL}, /* shutdown */ {sCL, sCL, sCW, sCE, sSS, sSS, sSR, sSA, sCL, sSS}, @@ -130,7 +130,7 @@ static const u8 sctp_conntracks[2][11][SCTP_CONNTRACK_MAX] = { /* REPLY */ /* sNO, sCL, sCW, sCE, sES, sSS, sSR, sSA, sHS, sHA */ /* init */ {sIV, sCL, sCW, sCE, sES, sSS, sSR, sSA, sIV, sHA},/* INIT in sCL Big TODO */ -/* init_ack */ {sIV, sCL, sCW, sCE, sES, sSS, sSR, sSA, sIV, sHA}, +/* init_ack */ {sIV, sCW, sCW, sCE, sES, sSS, sSR, sSA, sIV, sHA}, /* abort */ {sIV, sCL, sCL, sCL, sCL, sCL, sCL, sCL, sIV, sCL}, /* shutdown */ {sIV, sCL, sCW, sCE, sSR, sSS, sSR, sSA, sIV, sSR}, /* shutdown_ack */ {sIV, sCL, sCW, sCE, sES, sSA, sSA, sSA, sIV, sHA}, @@ -316,7 +316,7 @@ sctp_new(struct nf_conn *ct, const struct sk_buff *skb, ct->proto.sctp.vtag[IP_CT_DIR_REPLY] = sh->vtag; } - ct->proto.sctp.state = new_state; + ct->proto.sctp.state = SCTP_CONNTRACK_NONE; } return true; diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 65f51a2e9c2a..d1318bdf49ca 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -553,47 +553,70 @@ static inline u64 nf_tables_alloc_handle(struct nft_table *table) static const struct nft_chain_type *chain_type[NFPROTO_NUMPROTO][NFT_CHAIN_T_MAX]; static const struct nft_chain_type * +__nft_chain_type_get(u8 family, enum nft_chain_types type) +{ + if (family >= NFPROTO_NUMPROTO || + type >= NFT_CHAIN_T_MAX) + return NULL; + + return chain_type[family][type]; +} + +static const struct nft_chain_type * __nf_tables_chain_type_lookup(const struct nlattr *nla, u8 family) { + const struct nft_chain_type *type; int i; for (i = 0; i < NFT_CHAIN_T_MAX; i++) { - if (chain_type[family][i] != NULL && - !nla_strcmp(nla, chain_type[family][i]->name)) - return chain_type[family][i]; + type = __nft_chain_type_get(family, i); + if (!type) + continue; + if (!nla_strcmp(nla, type->name)) + return type; } return NULL; } -/* - * Loading a module requires dropping mutex that guards the transaction. - * A different client might race to start a new transaction meanwhile. Zap the - * list of pending transaction and then restore it once the mutex is grabbed - * again. Users of this function return EAGAIN which implicitly triggers the - * transaction abort path to clean up the list of pending transactions. - */ +struct nft_module_request { + struct list_head list; + char module[MODULE_NAME_LEN]; + bool done; +}; + #ifdef CONFIG_MODULES -static void nft_request_module(struct net *net, const char *fmt, ...) +static int nft_request_module(struct net *net, const char *fmt, ...) { char module_name[MODULE_NAME_LEN]; - LIST_HEAD(commit_list); + struct nft_module_request *req; va_list args; int ret; - list_splice_init(&net->nft.commit_list, &commit_list); - va_start(args, fmt); ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args); va_end(args); if (ret >= MODULE_NAME_LEN) - return; + return 0; - mutex_unlock(&net->nft.commit_mutex); - request_module("%s", module_name); - mutex_lock(&net->nft.commit_mutex); + list_for_each_entry(req, &net->nft.module_list, list) { + if (!strcmp(req->module, module_name)) { + if (req->done) + return 0; - WARN_ON_ONCE(!list_empty(&net->nft.commit_list)); - list_splice(&commit_list, &net->nft.commit_list); + /* A request to load this module already exists. */ + return -EAGAIN; + } + } + + req = kmalloc(sizeof(*req), GFP_KERNEL); + if (!req) + return -ENOMEM; + + req->done = false; + strlcpy(req->module, module_name, MODULE_NAME_LEN); + list_add_tail(&req->list, &net->nft.module_list); + + return -EAGAIN; } #endif @@ -617,10 +640,9 @@ nf_tables_chain_type_lookup(struct net *net, const struct nlattr *nla, lockdep_nfnl_nft_mutex_not_held(); #ifdef CONFIG_MODULES if (autoload) { - nft_request_module(net, "nft-chain-%u-%.*s", family, - nla_len(nla), (const char *)nla_data(nla)); - type = __nf_tables_chain_type_lookup(nla, family); - if (type != NULL) + if (nft_request_module(net, "nft-chain-%u-%.*s", family, + nla_len(nla), + (const char *)nla_data(nla)) == -EAGAIN) return ERR_PTR(-EAGAIN); } #endif @@ -1162,11 +1184,8 @@ static void nf_tables_table_destroy(struct nft_ctx *ctx) void nft_register_chain_type(const struct nft_chain_type *ctype) { - if (WARN_ON(ctype->family >= NFPROTO_NUMPROTO)) - return; - nfnl_lock(NFNL_SUBSYS_NFTABLES); - if (WARN_ON(chain_type[ctype->family][ctype->type] != NULL)) { + if (WARN_ON(__nft_chain_type_get(ctype->family, ctype->type))) { nfnl_unlock(NFNL_SUBSYS_NFTABLES); return; } @@ -1768,7 +1787,10 @@ static int nft_chain_parse_hook(struct net *net, hook->num = ntohl(nla_get_be32(ha[NFTA_HOOK_HOOKNUM])); hook->priority = ntohl(nla_get_be32(ha[NFTA_HOOK_PRIORITY])); - type = chain_type[family][NFT_CHAIN_T_DEFAULT]; + type = __nft_chain_type_get(family, NFT_CHAIN_T_DEFAULT); + if (!type) + return -EOPNOTSUPP; + if (nla[NFTA_CHAIN_TYPE]) { type = nf_tables_chain_type_lookup(net, nla[NFTA_CHAIN_TYPE], family, autoload); @@ -2328,9 +2350,8 @@ static const struct nft_expr_type *__nft_expr_type_get(u8 family, static int nft_expr_type_request_module(struct net *net, u8 family, struct nlattr *nla) { - nft_request_module(net, "nft-expr-%u-%.*s", family, - nla_len(nla), (char *)nla_data(nla)); - if (__nft_expr_type_get(family, nla)) + if (nft_request_module(net, "nft-expr-%u-%.*s", family, + nla_len(nla), (char *)nla_data(nla)) == -EAGAIN) return -EAGAIN; return 0; @@ -2356,9 +2377,9 @@ static const struct nft_expr_type *nft_expr_type_get(struct net *net, if (nft_expr_type_request_module(net, family, nla) == -EAGAIN) return ERR_PTR(-EAGAIN); - nft_request_module(net, "nft-expr-%.*s", - nla_len(nla), (char *)nla_data(nla)); - if (__nft_expr_type_get(family, nla)) + if (nft_request_module(net, "nft-expr-%.*s", + nla_len(nla), + (char *)nla_data(nla)) == -EAGAIN) return ERR_PTR(-EAGAIN); } #endif @@ -2449,9 +2470,10 @@ static int nf_tables_expr_parse(const struct nft_ctx *ctx, err = PTR_ERR(ops); #ifdef CONFIG_MODULES if (err == -EAGAIN) - nft_expr_type_request_module(ctx->net, - ctx->family, - tb[NFTA_EXPR_NAME]); + if (nft_expr_type_request_module(ctx->net, + ctx->family, + tb[NFTA_EXPR_NAME]) != -EAGAIN) + err = -ENOENT; #endif goto err1; } @@ -3288,8 +3310,7 @@ nft_select_set_ops(const struct nft_ctx *ctx, lockdep_nfnl_nft_mutex_not_held(); #ifdef CONFIG_MODULES if (list_empty(&nf_tables_set_types)) { - nft_request_module(ctx->net, "nft-set"); - if (!list_empty(&nf_tables_set_types)) + if (nft_request_module(ctx->net, "nft-set") == -EAGAIN) return ERR_PTR(-EAGAIN); } #endif @@ -3370,6 +3391,7 @@ static const struct nla_policy nft_set_policy[NFTA_SET_MAX + 1] = { static const struct nla_policy nft_set_desc_policy[NFTA_SET_DESC_MAX + 1] = { [NFTA_SET_DESC_SIZE] = { .type = NLA_U32 }, + [NFTA_SET_DESC_CONCAT] = { .type = NLA_NESTED }, }; static int nft_ctx_init_from_setattr(struct nft_ctx *ctx, struct net *net, @@ -3536,6 +3558,33 @@ static __be64 nf_jiffies64_to_msecs(u64 input) return cpu_to_be64(jiffies64_to_msecs(input)); } +static int nf_tables_fill_set_concat(struct sk_buff *skb, + const struct nft_set *set) +{ + struct nlattr *concat, *field; + int i; + + concat = nla_nest_start_noflag(skb, NFTA_SET_DESC_CONCAT); + if (!concat) + return -ENOMEM; + + for (i = 0; i < set->field_count; i++) { + field = nla_nest_start_noflag(skb, NFTA_LIST_ELEM); + if (!field) + return -ENOMEM; + + if (nla_put_be32(skb, NFTA_SET_FIELD_LEN, + htonl(set->field_len[i]))) + return -ENOMEM; + + nla_nest_end(skb, field); + } + + nla_nest_end(skb, concat); + + return 0; +} + static int nf_tables_fill_set(struct sk_buff *skb, const struct nft_ctx *ctx, const struct nft_set *set, u16 event, u16 flags) { @@ -3599,11 +3648,17 @@ static int nf_tables_fill_set(struct sk_buff *skb, const struct nft_ctx *ctx, goto nla_put_failure; desc = nla_nest_start_noflag(skb, NFTA_SET_DESC); + if (desc == NULL) goto nla_put_failure; if (set->size && nla_put_be32(skb, NFTA_SET_DESC_SIZE, htonl(set->size))) goto nla_put_failure; + + if (set->field_count > 1 && + nf_tables_fill_set_concat(skb, set)) + goto nla_put_failure; + nla_nest_end(skb, desc); nlmsg_end(skb, nlh); @@ -3776,6 +3831,53 @@ err: return err; } +static const struct nla_policy nft_concat_policy[NFTA_SET_FIELD_MAX + 1] = { + [NFTA_SET_FIELD_LEN] = { .type = NLA_U32 }, +}; + +static int nft_set_desc_concat_parse(const struct nlattr *attr, + struct nft_set_desc *desc) +{ + struct nlattr *tb[NFTA_SET_FIELD_MAX + 1]; + u32 len; + int err; + + err = nla_parse_nested_deprecated(tb, NFTA_SET_FIELD_MAX, attr, + nft_concat_policy, NULL); + if (err < 0) + return err; + + if (!tb[NFTA_SET_FIELD_LEN]) + return -EINVAL; + + len = ntohl(nla_get_be32(tb[NFTA_SET_FIELD_LEN])); + + if (len * BITS_PER_BYTE / 32 > NFT_REG32_COUNT) + return -E2BIG; + + desc->field_len[desc->field_count++] = len; + + return 0; +} + +static int nft_set_desc_concat(struct nft_set_desc *desc, + const struct nlattr *nla) +{ + struct nlattr *attr; + int rem, err; + + nla_for_each_nested(attr, nla, rem) { + if (nla_type(attr) != NFTA_LIST_ELEM) + return -EINVAL; + + err = nft_set_desc_concat_parse(attr, desc); + if (err < 0) + return err; + } + + return 0; +} + static int nf_tables_set_desc_parse(struct nft_set_desc *desc, const struct nlattr *nla) { @@ -3789,8 +3891,10 @@ static int nf_tables_set_desc_parse(struct nft_set_desc *desc, if (da[NFTA_SET_DESC_SIZE] != NULL) desc->size = ntohl(nla_get_be32(da[NFTA_SET_DESC_SIZE])); + if (da[NFTA_SET_DESC_CONCAT]) + err = nft_set_desc_concat(desc, da[NFTA_SET_DESC_CONCAT]); - return 0; + return err; } static int nf_tables_newset(struct net *net, struct sock *nlsk, @@ -3813,6 +3917,7 @@ static int nf_tables_newset(struct net *net, struct sock *nlsk, unsigned char *udata; u16 udlen; int err; + int i; if (nla[NFTA_SET_TABLE] == NULL || nla[NFTA_SET_NAME] == NULL || @@ -3991,6 +4096,10 @@ static int nf_tables_newset(struct net *net, struct sock *nlsk, set->gc_int = gc_int; set->handle = nf_tables_alloc_handle(table); + set->field_count = desc.field_count; + for (i = 0; i < desc.field_count; i++) + set->field_len[i] = desc.field_len[i]; + err = ops->init(set, &desc, nla); if (err < 0) goto err3; @@ -4194,6 +4303,9 @@ const struct nft_set_ext_type nft_set_ext_types[] = { .len = sizeof(struct nft_userdata), .align = __alignof__(struct nft_userdata), }, + [NFT_SET_EXT_KEY_END] = { + .align = __alignof__(u32), + }, }; EXPORT_SYMBOL_GPL(nft_set_ext_types); @@ -4212,6 +4324,7 @@ static const struct nla_policy nft_set_elem_policy[NFTA_SET_ELEM_MAX + 1] = { [NFTA_SET_ELEM_EXPR] = { .type = NLA_NESTED }, [NFTA_SET_ELEM_OBJREF] = { .type = NLA_STRING, .len = NFT_OBJ_MAXNAMELEN - 1 }, + [NFTA_SET_ELEM_KEY_END] = { .type = NLA_NESTED }, }; static const struct nla_policy nft_set_elem_list_policy[NFTA_SET_ELEM_LIST_MAX + 1] = { @@ -4261,6 +4374,11 @@ static int nf_tables_fill_setelem(struct sk_buff *skb, NFT_DATA_VALUE, set->klen) < 0) goto nla_put_failure; + if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END) && + nft_data_dump(skb, NFTA_SET_ELEM_KEY_END, nft_set_ext_key_end(ext), + NFT_DATA_VALUE, set->klen) < 0) + goto nla_put_failure; + if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA) && nft_data_dump(skb, NFTA_SET_ELEM_DATA, nft_set_ext_data(ext), set->dtype == NFT_DATA_VERDICT ? NFT_DATA_VERDICT : NFT_DATA_VALUE, @@ -4503,11 +4621,28 @@ static int nft_setelem_parse_flags(const struct nft_set *set, return 0; } +static int nft_setelem_parse_key(struct nft_ctx *ctx, struct nft_set *set, + struct nft_data *key, struct nlattr *attr) +{ + struct nft_data_desc desc; + int err; + + err = nft_data_init(ctx, key, NFT_DATA_VALUE_MAXLEN, &desc, attr); + if (err < 0) + return err; + + if (desc.type != NFT_DATA_VALUE || desc.len != set->klen) { + nft_data_release(key, desc.type); + return -EINVAL; + } + + return 0; +} + static int nft_get_set_elem(struct nft_ctx *ctx, struct nft_set *set, const struct nlattr *attr) { struct nlattr *nla[NFTA_SET_ELEM_MAX + 1]; - struct nft_data_desc desc; struct nft_set_elem elem; struct sk_buff *skb; uint32_t flags = 0; @@ -4526,15 +4661,16 @@ static int nft_get_set_elem(struct nft_ctx *ctx, struct nft_set *set, if (err < 0) return err; - err = nft_data_init(ctx, &elem.key.val, sizeof(elem.key), &desc, - nla[NFTA_SET_ELEM_KEY]); + err = nft_setelem_parse_key(ctx, set, &elem.key.val, + nla[NFTA_SET_ELEM_KEY]); if (err < 0) return err; - err = -EINVAL; - if (desc.type != NFT_DATA_VALUE || desc.len != set->klen) { - nft_data_release(&elem.key.val, desc.type); - return err; + if (nla[NFTA_SET_ELEM_KEY_END]) { + err = nft_setelem_parse_key(ctx, set, &elem.key_end.val, + nla[NFTA_SET_ELEM_KEY_END]); + if (err < 0) + return err; } priv = set->ops->get(ctx->net, set, &elem, flags); @@ -4662,8 +4798,8 @@ static struct nft_trans *nft_trans_elem_alloc(struct nft_ctx *ctx, void *nft_set_elem_init(const struct nft_set *set, const struct nft_set_ext_tmpl *tmpl, - const u32 *key, const u32 *data, - u64 timeout, u64 expiration, gfp_t gfp) + const u32 *key, const u32 *key_end, + const u32 *data, u64 timeout, u64 expiration, gfp_t gfp) { struct nft_set_ext *ext; void *elem; @@ -4676,6 +4812,8 @@ void *nft_set_elem_init(const struct nft_set *set, nft_set_ext_init(ext, tmpl); memcpy(nft_set_ext_key(ext), key, set->klen); + if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END)) + memcpy(nft_set_ext_key_end(ext), key_end, set->klen); if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA)) memcpy(nft_set_ext_data(ext), data, set->dlen); if (nft_set_ext_exists(ext, NFT_SET_EXT_EXPIRATION)) { @@ -4735,13 +4873,13 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, { struct nlattr *nla[NFTA_SET_ELEM_MAX + 1]; u8 genmask = nft_genmask_next(ctx->net); - struct nft_data_desc d1, d2; struct nft_set_ext_tmpl tmpl; struct nft_set_ext *ext, *ext2; struct nft_set_elem elem; struct nft_set_binding *binding; struct nft_object *obj = NULL; struct nft_userdata *udata; + struct nft_data_desc desc; struct nft_data data; enum nft_registers dreg; struct nft_trans *trans; @@ -4807,15 +4945,22 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, return err; } - err = nft_data_init(ctx, &elem.key.val, sizeof(elem.key), &d1, - nla[NFTA_SET_ELEM_KEY]); + err = nft_setelem_parse_key(ctx, set, &elem.key.val, + nla[NFTA_SET_ELEM_KEY]); if (err < 0) - goto err1; - err = -EINVAL; - if (d1.type != NFT_DATA_VALUE || d1.len != set->klen) - goto err2; + return err; + + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen); + + if (nla[NFTA_SET_ELEM_KEY_END]) { + err = nft_setelem_parse_key(ctx, set, &elem.key_end.val, + nla[NFTA_SET_ELEM_KEY_END]); + if (err < 0) + goto err_parse_key; + + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY_END, set->klen); + } - nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, d1.len); if (timeout > 0) { nft_set_ext_add(&tmpl, NFT_SET_EXT_EXPIRATION); if (timeout != set->timeout) @@ -4825,27 +4970,27 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, if (nla[NFTA_SET_ELEM_OBJREF] != NULL) { if (!(set->flags & NFT_SET_OBJECT)) { err = -EINVAL; - goto err2; + goto err_parse_key_end; } obj = nft_obj_lookup(ctx->net, ctx->table, nla[NFTA_SET_ELEM_OBJREF], set->objtype, genmask); if (IS_ERR(obj)) { err = PTR_ERR(obj); - goto err2; + goto err_parse_key_end; } nft_set_ext_add(&tmpl, NFT_SET_EXT_OBJREF); } if (nla[NFTA_SET_ELEM_DATA] != NULL) { - err = nft_data_init(ctx, &data, sizeof(data), &d2, + err = nft_data_init(ctx, &data, sizeof(data), &desc, nla[NFTA_SET_ELEM_DATA]); if (err < 0) - goto err2; + goto err_parse_key_end; err = -EINVAL; - if (set->dtype != NFT_DATA_VERDICT && d2.len != set->dlen) - goto err3; + if (set->dtype != NFT_DATA_VERDICT && desc.len != set->dlen) + goto err_parse_data; dreg = nft_type_to_reg(set->dtype); list_for_each_entry(binding, &set->bindings, list) { @@ -4861,18 +5006,18 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, err = nft_validate_register_store(&bind_ctx, dreg, &data, - d2.type, d2.len); + desc.type, desc.len); if (err < 0) - goto err3; + goto err_parse_data; - if (d2.type == NFT_DATA_VERDICT && + if (desc.type == NFT_DATA_VERDICT && (data.verdict.code == NFT_GOTO || data.verdict.code == NFT_JUMP)) nft_validate_state_update(ctx->net, NFT_VALIDATE_NEED); } - nft_set_ext_add_length(&tmpl, NFT_SET_EXT_DATA, d2.len); + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_DATA, desc.len); } /* The full maximum length of userdata can exceed the maximum @@ -4888,10 +5033,11 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, } err = -ENOMEM; - elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, data.data, + elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, + elem.key_end.val.data, data.data, timeout, expiration, GFP_KERNEL); if (elem.priv == NULL) - goto err3; + goto err_parse_data; ext = nft_set_elem_ext(set, elem.priv); if (flags) @@ -4908,7 +5054,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, trans = nft_trans_elem_alloc(ctx, NFT_MSG_NEWSETELEM, set); if (trans == NULL) - goto err4; + goto err_trans; ext->genmask = nft_genmask_cur(ctx->net) | NFT_SET_ELEM_BUSY_MASK; err = set->ops->insert(ctx->net, set, &elem, &ext2); @@ -4919,7 +5065,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, nft_set_ext_exists(ext, NFT_SET_EXT_OBJREF) ^ nft_set_ext_exists(ext2, NFT_SET_EXT_OBJREF)) { err = -EBUSY; - goto err5; + goto err_element_clash; } if ((nft_set_ext_exists(ext, NFT_SET_EXT_DATA) && nft_set_ext_exists(ext2, NFT_SET_EXT_DATA) && @@ -4932,33 +5078,35 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, else if (!(nlmsg_flags & NLM_F_EXCL)) err = 0; } - goto err5; + goto err_element_clash; } if (set->size && !atomic_add_unless(&set->nelems, 1, set->size + set->ndeact)) { err = -ENFILE; - goto err6; + goto err_set_full; } nft_trans_elem(trans) = elem; list_add_tail(&trans->list, &ctx->net->nft.commit_list); return 0; -err6: +err_set_full: set->ops->remove(ctx->net, set, &elem); -err5: +err_element_clash: kfree(trans); -err4: +err_trans: if (obj) obj->use--; kfree(elem.priv); -err3: +err_parse_data: if (nla[NFTA_SET_ELEM_DATA] != NULL) - nft_data_release(&data, d2.type); -err2: - nft_data_release(&elem.key.val, d1.type); -err1: + nft_data_release(&data, desc.type); +err_parse_key_end: + nft_data_release(&elem.key_end.val, NFT_DATA_VALUE); +err_parse_key: + nft_data_release(&elem.key.val, NFT_DATA_VALUE); + return err; } @@ -5053,7 +5201,6 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, { struct nlattr *nla[NFTA_SET_ELEM_MAX + 1]; struct nft_set_ext_tmpl tmpl; - struct nft_data_desc desc; struct nft_set_elem elem; struct nft_set_ext *ext; struct nft_trans *trans; @@ -5064,11 +5211,10 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, err = nla_parse_nested_deprecated(nla, NFTA_SET_ELEM_MAX, attr, nft_set_elem_policy, NULL); if (err < 0) - goto err1; + return err; - err = -EINVAL; if (nla[NFTA_SET_ELEM_KEY] == NULL) - goto err1; + return -EINVAL; nft_set_ext_prepare(&tmpl); @@ -5078,37 +5224,41 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, if (flags != 0) nft_set_ext_add(&tmpl, NFT_SET_EXT_FLAGS); - err = nft_data_init(ctx, &elem.key.val, sizeof(elem.key), &desc, - nla[NFTA_SET_ELEM_KEY]); + err = nft_setelem_parse_key(ctx, set, &elem.key.val, + nla[NFTA_SET_ELEM_KEY]); if (err < 0) - goto err1; + return err; - err = -EINVAL; - if (desc.type != NFT_DATA_VALUE || desc.len != set->klen) - goto err2; + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen); - nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, desc.len); + if (nla[NFTA_SET_ELEM_KEY_END]) { + err = nft_setelem_parse_key(ctx, set, &elem.key_end.val, + nla[NFTA_SET_ELEM_KEY_END]); + if (err < 0) + return err; + + nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY_END, set->klen); + } err = -ENOMEM; - elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, NULL, 0, - 0, GFP_KERNEL); + elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, + elem.key_end.val.data, NULL, 0, 0, + GFP_KERNEL); if (elem.priv == NULL) - goto err2; + goto fail_elem; ext = nft_set_elem_ext(set, elem.priv); if (flags) *nft_set_ext_flags(ext) = flags; trans = nft_trans_elem_alloc(ctx, NFT_MSG_DELSETELEM, set); - if (trans == NULL) { - err = -ENOMEM; - goto err3; - } + if (trans == NULL) + goto fail_trans; priv = set->ops->deactivate(ctx->net, set, &elem); if (priv == NULL) { err = -ENOENT; - goto err4; + goto fail_ops; } kfree(elem.priv); elem.priv = priv; @@ -5119,13 +5269,12 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, list_add_tail(&trans->list, &ctx->net->nft.commit_list); return 0; -err4: +fail_ops: kfree(trans); -err3: +fail_trans: kfree(elem.priv); -err2: - nft_data_release(&elem.key.val, desc.type); -err1: +fail_elem: + nft_data_release(&elem.key.val, NFT_DATA_VALUE); return err; } @@ -5415,8 +5564,7 @@ nft_obj_type_get(struct net *net, u32 objtype) lockdep_nfnl_nft_mutex_not_held(); #ifdef CONFIG_MODULES if (type == NULL) { - nft_request_module(net, "nft-obj-%u", objtype); - if (__nft_obj_type_get(objtype)) + if (nft_request_module(net, "nft-obj-%u", objtype) == -EAGAIN) return ERR_PTR(-EAGAIN); } #endif @@ -5989,8 +6137,7 @@ nft_flowtable_type_get(struct net *net, u8 family) lockdep_nfnl_nft_mutex_not_held(); #ifdef CONFIG_MODULES if (type == NULL) { - nft_request_module(net, "nf-flowtable-%u", family); - if (__nft_flowtable_type_get(family)) + if (nft_request_module(net, "nf-flowtable-%u", family) == -EAGAIN) return ERR_PTR(-EAGAIN); } #endif @@ -6992,6 +7139,18 @@ static void nft_chain_del(struct nft_chain *chain) list_del_rcu(&chain->list); } +static void nf_tables_module_autoload_cleanup(struct net *net) +{ + struct nft_module_request *req, *next; + + WARN_ON_ONCE(!list_empty(&net->nft.commit_list)); + list_for_each_entry_safe(req, next, &net->nft.module_list, list) { + WARN_ON_ONCE(!req->done); + list_del(&req->list); + kfree(req); + } +} + static void nf_tables_commit_release(struct net *net) { struct nft_trans *trans; @@ -7004,6 +7163,7 @@ static void nf_tables_commit_release(struct net *net) * to prevent expensive synchronize_rcu() in commit phase. */ if (list_empty(&net->nft.commit_list)) { + nf_tables_module_autoload_cleanup(net); mutex_unlock(&net->nft.commit_mutex); return; } @@ -7018,6 +7178,7 @@ static void nf_tables_commit_release(struct net *net) list_splice_tail_init(&net->nft.commit_list, &nf_tables_destroy_list); spin_unlock(&nf_tables_destroy_list_lock); + nf_tables_module_autoload_cleanup(net); mutex_unlock(&net->nft.commit_mutex); schedule_work(&trans_destroy_work); @@ -7209,6 +7370,26 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb) return 0; } +static void nf_tables_module_autoload(struct net *net) +{ + struct nft_module_request *req, *next; + LIST_HEAD(module_list); + + list_splice_init(&net->nft.module_list, &module_list); + mutex_unlock(&net->nft.commit_mutex); + list_for_each_entry_safe(req, next, &module_list, list) { + if (req->done) { + list_del(&req->list); + kfree(req); + } else { + request_module("%s", req->module); + req->done = true; + } + } + mutex_lock(&net->nft.commit_mutex); + list_splice(&module_list, &net->nft.module_list); +} + static void nf_tables_abort_release(struct nft_trans *trans) { switch (trans->msg_type) { @@ -7238,7 +7419,7 @@ static void nf_tables_abort_release(struct nft_trans *trans) kfree(trans); } -static int __nf_tables_abort(struct net *net) +static int __nf_tables_abort(struct net *net, bool autoload) { struct nft_trans *trans, *next; struct nft_trans_elem *te; @@ -7360,6 +7541,11 @@ static int __nf_tables_abort(struct net *net) nf_tables_abort_release(trans); } + if (autoload) + nf_tables_module_autoload(net); + else + nf_tables_module_autoload_cleanup(net); + return 0; } @@ -7368,9 +7554,9 @@ static void nf_tables_cleanup(struct net *net) nft_validate_state_update(net, NFT_VALIDATE_SKIP); } -static int nf_tables_abort(struct net *net, struct sk_buff *skb) +static int nf_tables_abort(struct net *net, struct sk_buff *skb, bool autoload) { - int ret = __nf_tables_abort(net); + int ret = __nf_tables_abort(net, autoload); mutex_unlock(&net->nft.commit_mutex); @@ -7965,6 +8151,7 @@ static int __net_init nf_tables_init_net(struct net *net) { INIT_LIST_HEAD(&net->nft.tables); INIT_LIST_HEAD(&net->nft.commit_list); + INIT_LIST_HEAD(&net->nft.module_list); mutex_init(&net->nft.commit_mutex); net->nft.base_seq = 1; net->nft.validate_state = NFT_VALIDATE_SKIP; @@ -7976,7 +8163,7 @@ static void __net_exit nf_tables_exit_net(struct net *net) { mutex_lock(&net->nft.commit_mutex); if (!list_empty(&net->nft.commit_list)) - __nf_tables_abort(net); + __nf_tables_abort(net, false); __nft_release_tables(net); mutex_unlock(&net->nft.commit_mutex); WARN_ON_ONCE(!list_empty(&net->nft.tables)); diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c index a9ea29afb09f..2bb28483af22 100644 --- a/net/netfilter/nf_tables_offload.c +++ b/net/netfilter/nf_tables_offload.c @@ -564,7 +564,7 @@ static void nft_indr_block_cb(struct net_device *dev, mutex_lock(&net->nft.commit_mutex); chain = __nft_offload_get_chain(dev); - if (chain) { + if (chain && chain->flags & NFT_CHAIN_HW_OFFLOAD) { struct nft_base_chain *basechain; basechain = nft_base_chain(chain); diff --git a/net/netfilter/nf_tables_set_core.c b/net/netfilter/nf_tables_set_core.c index a9fce8d10051..586b621007eb 100644 --- a/net/netfilter/nf_tables_set_core.c +++ b/net/netfilter/nf_tables_set_core.c @@ -9,12 +9,14 @@ static int __init nf_tables_set_module_init(void) nft_register_set(&nft_set_rhash_type); nft_register_set(&nft_set_bitmap_type); nft_register_set(&nft_set_rbtree_type); + nft_register_set(&nft_set_pipapo_type); return 0; } static void __exit nf_tables_set_module_exit(void) { + nft_unregister_set(&nft_set_pipapo_type); nft_unregister_set(&nft_set_rbtree_type); nft_unregister_set(&nft_set_bitmap_type); nft_unregister_set(&nft_set_rhash_type); diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c index 4abbb452cf6c..99127e2d95a8 100644 --- a/net/netfilter/nfnetlink.c +++ b/net/netfilter/nfnetlink.c @@ -476,7 +476,7 @@ ack: } done: if (status & NFNL_BATCH_REPLAY) { - ss->abort(net, oskb); + ss->abort(net, oskb, true); nfnl_err_reset(&err_list); kfree_skb(skb); module_put(ss->owner); @@ -487,11 +487,11 @@ done: status |= NFNL_BATCH_REPLAY; goto done; } else if (err) { - ss->abort(net, oskb); + ss->abort(net, oskb, false); netlink_ack(oskb, nlmsg_hdr(oskb), err, NULL); } } else { - ss->abort(net, oskb); + ss->abort(net, oskb, false); } if (ss->cleanup) ss->cleanup(net); diff --git a/net/netfilter/nft_dynset.c b/net/netfilter/nft_dynset.c index 8887295414dc..683785225a3e 100644 --- a/net/netfilter/nft_dynset.c +++ b/net/netfilter/nft_dynset.c @@ -54,7 +54,7 @@ static void *nft_dynset_new(struct nft_set *set, const struct nft_expr *expr, timeout = priv->timeout ? : set->timeout; elem = nft_set_elem_init(set, &priv->tmpl, - ®s->data[priv->sreg_key], + ®s->data[priv->sreg_key], NULL, ®s->data[priv->sreg_data], timeout, 0, GFP_ATOMIC); if (elem == NULL) diff --git a/net/netfilter/nft_osf.c b/net/netfilter/nft_osf.c index f54d6ae15bb1..b42247aa48a9 100644 --- a/net/netfilter/nft_osf.c +++ b/net/netfilter/nft_osf.c @@ -61,6 +61,9 @@ static int nft_osf_init(const struct nft_ctx *ctx, int err; u8 ttl; + if (!tb[NFTA_OSF_DREG]) + return -EINVAL; + if (tb[NFTA_OSF_TTL]) { ttl = nla_get_u8(tb[NFTA_OSF_TTL]); if (ttl > 2) diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c new file mode 100644 index 000000000000..f0cb1e13af50 --- /dev/null +++ b/net/netfilter/nft_set_pipapo.c @@ -0,0 +1,2102 @@ +// SPDX-License-Identifier: GPL-2.0-only + +/* PIPAPO: PIle PAcket POlicies: set for arbitrary concatenations of ranges + * + * Copyright (c) 2019-2020 Red Hat GmbH + * + * Author: Stefano Brivio <sbrivio@redhat.com> + */ + +/** + * DOC: Theory of Operation + * + * + * Problem + * ------- + * + * Match packet bytes against entries composed of ranged or non-ranged packet + * field specifiers, mapping them to arbitrary references. For example: + * + * :: + * + * --- fields ---> + * | [net],[port],[net]... => [reference] + * entries [net],[port],[net]... => [reference] + * | [net],[port],[net]... => [reference] + * V ... + * + * where [net] fields can be IP ranges or netmasks, and [port] fields are port + * ranges. Arbitrary packet fields can be matched. + * + * + * Algorithm Overview + * ------------------ + * + * This algorithm is loosely inspired by [Ligatti 2010], and fundamentally + * relies on the consideration that every contiguous range in a space of b bits + * can be converted into b * 2 netmasks, from Theorem 3 in [Rottenstreich 2010], + * as also illustrated in Section 9 of [Kogan 2014]. + * + * Classification against a number of entries, that require matching given bits + * of a packet field, is performed by grouping those bits in sets of arbitrary + * size, and classifying packet bits one group at a time. + * + * Example: + * to match the source port (16 bits) of a packet, we can divide those 16 bits + * in 4 groups of 4 bits each. Given the entry: + * 0000 0001 0101 1001 + * and a packet with source port: + * 0000 0001 1010 1001 + * first and second groups match, but the third doesn't. We conclude that the + * packet doesn't match the given entry. + * + * Translate the set to a sequence of lookup tables, one per field. Each table + * has two dimensions: bit groups to be matched for a single packet field, and + * all the possible values of said groups (buckets). Input entries are + * represented as one or more rules, depending on the number of composing + * netmasks for the given field specifier, and a group match is indicated as a + * set bit, with number corresponding to the rule index, in all the buckets + * whose value matches the entry for a given group. + * + * Rules are mapped between fields through an array of x, n pairs, with each + * item mapping a matched rule to one or more rules. The position of the pair in + * the array indicates the matched rule to be mapped to the next field, x + * indicates the first rule index in the next field, and n the amount of + * next-field rules the current rule maps to. + * + * The mapping array for the last field maps to the desired references. + * + * To match, we perform table lookups using the values of grouped packet bits, + * and use a sequence of bitwise operations to progressively evaluate rule + * matching. + * + * A stand-alone, reference implementation, also including notes about possible + * future optimisations, is available at: + * https://pipapo.lameexcu.se/ + * + * Insertion + * --------- + * + * - For each packet field: + * + * - divide the b packet bits we want to classify into groups of size t, + * obtaining ceil(b / t) groups + * + * Example: match on destination IP address, with t = 4: 32 bits, 8 groups + * of 4 bits each + * + * - allocate a lookup table with one column ("bucket") for each possible + * value of a group, and with one row for each group + * + * Example: 8 groups, 2^4 buckets: + * + * :: + * + * bucket + * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 + * 0 + * 1 + * 2 + * 3 + * 4 + * 5 + * 6 + * 7 + * + * - map the bits we want to classify for the current field, for a given + * entry, to a single rule for non-ranged and netmask set items, and to one + * or multiple rules for ranges. Ranges are expanded to composing netmasks + * by pipapo_expand(). + * + * Example: 2 entries, 10.0.0.5:1024 and 192.168.1.0-192.168.2.1:2048 + * - rule #0: 10.0.0.5 + * - rule #1: 192.168.1.0/24 + * - rule #2: 192.168.2.0/31 + * + * - insert references to the rules in the lookup table, selecting buckets + * according to bit values of a rule in the given group. This is done by + * pipapo_insert(). + * + * Example: given: + * - rule #0: 10.0.0.5 mapping to buckets + * < 0 10 0 0 0 0 0 5 > + * - rule #1: 192.168.1.0/24 mapping to buckets + * < 12 0 10 8 0 1 < 0..15 > < 0..15 > > + * - rule #2: 192.168.2.0/31 mapping to buckets + * < 12 0 10 8 0 2 0 < 0..1 > > + * + * these bits are set in the lookup table: + * + * :: + * + * bucket + * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 + * 0 0 1,2 + * 1 1,2 0 + * 2 0 1,2 + * 3 0 1,2 + * 4 0,1,2 + * 5 0 1 2 + * 6 0,1,2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 + * 7 1,2 1,2 1 1 1 0,1 1 1 1 1 1 1 1 1 1 1 + * + * - if this is not the last field in the set, fill a mapping array that maps + * rules from the lookup table to rules belonging to the same entry in + * the next lookup table, done by pipapo_map(). + * + * Note that as rules map to contiguous ranges of rules, given how netmask + * expansion and insertion is performed, &union nft_pipapo_map_bucket stores + * this information as pairs of first rule index, rule count. + * + * Example: 2 entries, 10.0.0.5:1024 and 192.168.1.0-192.168.2.1:2048, + * given lookup table #0 for field 0 (see example above): + * + * :: + * + * bucket + * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 + * 0 0 1,2 + * 1 1,2 0 + * 2 0 1,2 + * 3 0 1,2 + * 4 0,1,2 + * 5 0 1 2 + * 6 0,1,2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 + * 7 1,2 1,2 1 1 1 0,1 1 1 1 1 1 1 1 1 1 1 + * + * and lookup table #1 for field 1 with: + * - rule #0: 1024 mapping to buckets + * < 0 0 4 0 > + * - rule #1: 2048 mapping to buckets + * < 0 0 5 0 > + * + * :: + * + * bucket + * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 + * 0 0,1 + * 1 0,1 + * 2 0 1 + * 3 0,1 + * + * we need to map rules for 10.0.0.5 in lookup table #0 (rule #0) to 1024 + * in lookup table #1 (rule #0) and rules for 192.168.1.0-192.168.2.1 + * (rules #1, #2) to 2048 in lookup table #2 (rule #1): + * + * :: + * + * rule indices in current field: 0 1 2 + * map to rules in next field: 0 1 1 + * + * - if this is the last field in the set, fill a mapping array that maps + * rules from the last lookup table to element pointers, also done by + * pipapo_map(). + * + * Note that, in this implementation, we have two elements (start, end) for + * each entry. The pointer to the end element is stored in this array, and + * the pointer to the start element is linked from it. + * + * Example: entry 10.0.0.5:1024 has a corresponding &struct nft_pipapo_elem + * pointer, 0x66, and element for 192.168.1.0-192.168.2.1:2048 is at 0x42. + * From the rules of lookup table #1 as mapped above: + * + * :: + * + * rule indices in last field: 0 1 + * map to elements: 0x42 0x66 + * + * + * Matching + * -------- + * + * We use a result bitmap, with the size of a single lookup table bucket, to + * represent the matching state that applies at every algorithm step. This is + * done by pipapo_lookup(). + * + * - For each packet field: + * + * - start with an all-ones result bitmap (res_map in pipapo_lookup()) + * + * - perform a lookup into the table corresponding to the current field, + * for each group, and at every group, AND the current result bitmap with + * the value from the lookup table bucket + * + * :: + * + * Example: 192.168.1.5 < 12 0 10 8 0 1 0 5 >, with lookup table from + * insertion examples. + * Lookup table buckets are at least 3 bits wide, we'll assume 8 bits for + * convenience in this example. Initial result bitmap is 0xff, the steps + * below show the value of the result bitmap after each group is processed: + * + * bucket + * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 + * 0 0 1,2 + * result bitmap is now: 0xff & 0x6 [bucket 12] = 0x6 + * + * 1 1,2 0 + * result bitmap is now: 0x6 & 0x6 [bucket 0] = 0x6 + * + * 2 0 1,2 + * result bitmap is now: 0x6 & 0x6 [bucket 10] = 0x6 + * + * 3 0 1,2 + * result bitmap is now: 0x6 & 0x6 [bucket 8] = 0x6 + * + * 4 0,1,2 + * result bitmap is now: 0x6 & 0x7 [bucket 0] = 0x6 + * + * 5 0 1 2 + * result bitmap is now: 0x6 & 0x2 [bucket 1] = 0x2 + * + * 6 0,1,2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 + * result bitmap is now: 0x2 & 0x7 [bucket 0] = 0x2 + * + * 7 1,2 1,2 1 1 1 0,1 1 1 1 1 1 1 1 1 1 1 + * final result bitmap for this field is: 0x2 & 0x3 [bucket 5] = 0x2 + * + * - at the next field, start with a new, all-zeroes result bitmap. For each + * bit set in the previous result bitmap, fill the new result bitmap + * (fill_map in pipapo_lookup()) with the rule indices from the + * corresponding buckets of the mapping field for this field, done by + * pipapo_refill() + * + * Example: with mapping table from insertion examples, with the current + * result bitmap from the previous example, 0x02: + * + * :: + * + * rule indices in current field: 0 1 2 + * map to rules in next field: 0 1 1 + * + * the new result bitmap will be 0x02: rule 1 was set, and rule 1 will be + * set. + * + * We can now extend this example to cover the second iteration of the step + * above (lookup and AND bitmap): assuming the port field is + * 2048 < 0 0 5 0 >, with starting result bitmap 0x2, and lookup table + * for "port" field from pre-computation example: + * + * :: + * + * bucket + * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 + * 0 0,1 + * 1 0,1 + * 2 0 1 + * 3 0,1 + * + * operations are: 0x2 & 0x3 [bucket 0] & 0x3 [bucket 0] & 0x2 [bucket 5] + * & 0x3 [bucket 0], resulting bitmap is 0x2. + * + * - if this is the last field in the set, look up the value from the mapping + * array corresponding to the final result bitmap + * + * Example: 0x2 resulting bitmap from 192.168.1.5:2048, mapping array for + * last field from insertion example: + * + * :: + * + * rule indices in last field: 0 1 + * map to elements: 0x42 0x66 + * + * the matching element is at 0x42. + * + * + * References + * ---------- + * + * [Ligatti 2010] + * A Packet-classification Algorithm for Arbitrary Bitmask Rules, with + * Automatic Time-space Tradeoffs + * Jay Ligatti, Josh Kuhn, and Chris Gage. + * Proceedings of the IEEE International Conference on Computer + * Communication Networks (ICCCN), August 2010. + * http://www.cse.usf.edu/~ligatti/papers/grouper-conf.pdf + * + * [Rottenstreich 2010] + * Worst-Case TCAM Rule Expansion + * Ori Rottenstreich and Isaac Keslassy. + * 2010 Proceedings IEEE INFOCOM, San Diego, CA, 2010. + * http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.212.4592&rep=rep1&type=pdf + * + * [Kogan 2014] + * SAX-PAC (Scalable And eXpressive PAcket Classification) + * Kirill Kogan, Sergey Nikolenko, Ori Rottenstreich, William Culhane, + * and Patrick Eugster. + * Proceedings of the 2014 ACM conference on SIGCOMM, August 2014. + * http://www.sigcomm.org/sites/default/files/ccr/papers/2014/August/2619239-2626294.pdf + */ + +#include <linux/kernel.h> +#include <linux/init.h> +#include <linux/log2.h> +#include <linux/module.h> +#include <linux/netlink.h> +#include <linux/netfilter.h> +#include <linux/netfilter/nf_tables.h> +#include <net/netfilter/nf_tables_core.h> +#include <uapi/linux/netfilter/nf_tables.h> +#include <net/ipv6.h> /* For the maximum length of a field */ +#include <linux/bitmap.h> +#include <linux/bitops.h> + +/* Count of concatenated fields depends on count of 32-bit nftables registers */ +#define NFT_PIPAPO_MAX_FIELDS NFT_REG32_COUNT + +/* Largest supported field size */ +#define NFT_PIPAPO_MAX_BYTES (sizeof(struct in6_addr)) +#define NFT_PIPAPO_MAX_BITS (NFT_PIPAPO_MAX_BYTES * BITS_PER_BYTE) + +/* Number of bits to be grouped together in lookup table buckets, arbitrary */ +#define NFT_PIPAPO_GROUP_BITS 4 +#define NFT_PIPAPO_GROUPS_PER_BYTE (BITS_PER_BYTE / NFT_PIPAPO_GROUP_BITS) + +/* Fields are padded to 32 bits in input registers */ +#define NFT_PIPAPO_GROUPS_PADDED_SIZE(x) \ + (round_up((x) / NFT_PIPAPO_GROUPS_PER_BYTE, sizeof(u32))) +#define NFT_PIPAPO_GROUPS_PADDING(x) \ + (NFT_PIPAPO_GROUPS_PADDED_SIZE((x)) - (x) / NFT_PIPAPO_GROUPS_PER_BYTE) + +/* Number of buckets, given by 2 ^ n, with n grouped bits */ +#define NFT_PIPAPO_BUCKETS (1 << NFT_PIPAPO_GROUP_BITS) + +/* Each n-bit range maps to up to n * 2 rules */ +#define NFT_PIPAPO_MAP_NBITS (const_ilog2(NFT_PIPAPO_MAX_BITS * 2)) + +/* Use the rest of mapping table buckets for rule indices, but it makes no sense + * to exceed 32 bits + */ +#if BITS_PER_LONG == 64 +#define NFT_PIPAPO_MAP_TOBITS 32 +#else +#define NFT_PIPAPO_MAP_TOBITS (BITS_PER_LONG - NFT_PIPAPO_MAP_NBITS) +#endif + +/* ...which gives us the highest allowed index for a rule */ +#define NFT_PIPAPO_RULE0_MAX ((1UL << (NFT_PIPAPO_MAP_TOBITS - 1)) \ + - (1UL << NFT_PIPAPO_MAP_NBITS)) + +#define nft_pipapo_for_each_field(field, index, match) \ + for ((field) = (match)->f, (index) = 0; \ + (index) < (match)->field_count; \ + (index)++, (field)++) + +/** + * union nft_pipapo_map_bucket - Bucket of mapping table + * @to: First rule number (in next field) this rule maps to + * @n: Number of rules (in next field) this rule maps to + * @e: If there's no next field, pointer to element this rule maps to + */ +union nft_pipapo_map_bucket { + struct { +#if BITS_PER_LONG == 64 + static_assert(NFT_PIPAPO_MAP_TOBITS <= 32); + u32 to; + + static_assert(NFT_PIPAPO_MAP_NBITS <= 32); + u32 n; +#else + unsigned long to:NFT_PIPAPO_MAP_TOBITS; + unsigned long n:NFT_PIPAPO_MAP_NBITS; +#endif + }; + struct nft_pipapo_elem *e; +}; + +/** + * struct nft_pipapo_field - Lookup, mapping tables and related data for a field + * @groups: Amount of 4-bit groups + * @rules: Number of inserted rules + * @bsize: Size of each bucket in lookup table, in longs + * @lt: Lookup table: 'groups' rows of NFT_PIPAPO_BUCKETS buckets + * @mt: Mapping table: one bucket per rule + */ +struct nft_pipapo_field { + int groups; + unsigned long rules; + size_t bsize; + unsigned long *lt; + union nft_pipapo_map_bucket *mt; +}; + +/** + * struct nft_pipapo_match - Data used for lookup and matching + * @field_count Amount of fields in set + * @scratch: Preallocated per-CPU maps for partial matching results + * @bsize_max: Maximum lookup table bucket size of all fields, in longs + * @rcu Matching data is swapped on commits + * @f: Fields, with lookup and mapping tables + */ +struct nft_pipapo_match { + int field_count; + unsigned long * __percpu *scratch; + size_t bsize_max; + struct rcu_head rcu; + struct nft_pipapo_field f[0]; +}; + +/* Current working bitmap index, toggled between field matches */ +static DEFINE_PER_CPU(bool, nft_pipapo_scratch_index); + +/** + * struct nft_pipapo - Representation of a set + * @match: Currently in-use matching data + * @clone: Copy where pending insertions and deletions are kept + * @groups: Total amount of 4-bit groups for fields in this set + * @width: Total bytes to be matched for one packet, including padding + * @dirty: Working copy has pending insertions or deletions + * @last_gc: Timestamp of last garbage collection run, jiffies + */ +struct nft_pipapo { + struct nft_pipapo_match __rcu *match; + struct nft_pipapo_match *clone; + int groups; + int width; + bool dirty; + unsigned long last_gc; +}; + +struct nft_pipapo_elem; + +/** + * struct nft_pipapo_elem - API-facing representation of single set element + * @ext: nftables API extensions + */ +struct nft_pipapo_elem { + struct nft_set_ext ext; +}; + +/** + * pipapo_refill() - For each set bit, set bits from selected mapping table item + * @map: Bitmap to be scanned for set bits + * @len: Length of bitmap in longs + * @rules: Number of rules in field + * @dst: Destination bitmap + * @mt: Mapping table containing bit set specifiers + * @match_only: Find a single bit and return, don't fill + * + * Iteration over set bits with __builtin_ctzl(): Daniel Lemire, public domain. + * + * For each bit set in map, select the bucket from mapping table with index + * corresponding to the position of the bit set. Use start bit and amount of + * bits specified in bucket to fill region in dst. + * + * Return: -1 on no match, bit position on 'match_only', 0 otherwise. + */ +static int pipapo_refill(unsigned long *map, int len, int rules, + unsigned long *dst, union nft_pipapo_map_bucket *mt, + bool match_only) +{ + unsigned long bitset; + int k, ret = -1; + + for (k = 0; k < len; k++) { + bitset = map[k]; + while (bitset) { + unsigned long t = bitset & -bitset; + int r = __builtin_ctzl(bitset); + int i = k * BITS_PER_LONG + r; + + if (unlikely(i >= rules)) { + map[k] = 0; + return -1; + } + + if (unlikely(match_only)) { + bitmap_clear(map, i, 1); + return i; + } + + ret = 0; + + bitmap_set(dst, mt[i].to, mt[i].n); + + bitset ^= t; + } + map[k] = 0; + } + + return ret; +} + +/** + * nft_pipapo_lookup() - Lookup function + * @net: Network namespace + * @set: nftables API set representation + * @elem: nftables API element representation containing key data + * @ext: nftables API extension pointer, filled with matching reference + * + * For more details, see DOC: Theory of Operation. + * + * Return: true on match, false otherwise. + */ +static bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set, + const u32 *key, const struct nft_set_ext **ext) +{ + struct nft_pipapo *priv = nft_set_priv(set); + unsigned long *res_map, *fill_map; + u8 genmask = nft_genmask_cur(net); + const u8 *rp = (const u8 *)key; + struct nft_pipapo_match *m; + struct nft_pipapo_field *f; + bool map_index; + int i; + + local_bh_disable(); + + map_index = raw_cpu_read(nft_pipapo_scratch_index); + + m = rcu_dereference(priv->match); + + if (unlikely(!m || !*raw_cpu_ptr(m->scratch))) + goto out; + + res_map = *raw_cpu_ptr(m->scratch) + (map_index ? m->bsize_max : 0); + fill_map = *raw_cpu_ptr(m->scratch) + (map_index ? 0 : m->bsize_max); + + memset(res_map, 0xff, m->bsize_max * sizeof(*res_map)); + + nft_pipapo_for_each_field(f, i, m) { + bool last = i == m->field_count - 1; + unsigned long *lt = f->lt; + int b, group; + + /* For each 4-bit group: select lookup table bucket depending on + * packet bytes value, then AND bucket value + */ + for (group = 0; group < f->groups; group += 2) { + u8 v; + + v = *rp >> 4; + __bitmap_and(res_map, res_map, lt + v * f->bsize, + f->bsize * BITS_PER_LONG); + lt += f->bsize * NFT_PIPAPO_BUCKETS; + + v = *rp & 0x0f; + rp++; + __bitmap_and(res_map, res_map, lt + v * f->bsize, + f->bsize * BITS_PER_LONG); + lt += f->bsize * NFT_PIPAPO_BUCKETS; + } + + /* Now populate the bitmap for the next field, unless this is + * the last field, in which case return the matched 'ext' + * pointer if any. + * + * Now res_map contains the matching bitmap, and fill_map is the + * bitmap for the next field. + */ +next_match: + b = pipapo_refill(res_map, f->bsize, f->rules, fill_map, f->mt, + last); + if (b < 0) { + raw_cpu_write(nft_pipapo_scratch_index, map_index); + local_bh_enable(); + + return false; + } + + if (last) { + *ext = &f->mt[b].e->ext; + if (unlikely(nft_set_elem_expired(*ext) || + !nft_set_elem_active(*ext, genmask))) + goto next_match; + + /* Last field: we're just returning the key without + * filling the initial bitmap for the next field, so the + * current inactive bitmap is clean and can be reused as + * *next* bitmap (not initial) for the next packet. + */ + raw_cpu_write(nft_pipapo_scratch_index, map_index); + local_bh_enable(); + + return true; + } + + /* Swap bitmap indices: res_map is the initial bitmap for the + * next field, and fill_map is guaranteed to be all-zeroes at + * this point. + */ + map_index = !map_index; + swap(res_map, fill_map); + + rp += NFT_PIPAPO_GROUPS_PADDING(f->groups); + } + +out: + local_bh_enable(); + return false; +} + +/** + * pipapo_get() - Get matching element reference given key data + * @net: Network namespace + * @set: nftables API set representation + * @data: Key data to be matched against existing elements + * @genmask: If set, check that element is active in given genmask + * + * This is essentially the same as the lookup function, except that it matches + * key data against the uncommitted copy and doesn't use preallocated maps for + * bitmap results. + * + * Return: pointer to &struct nft_pipapo_elem on match, error pointer otherwise. + */ +static struct nft_pipapo_elem *pipapo_get(const struct net *net, + const struct nft_set *set, + const u8 *data, u8 genmask) +{ + struct nft_pipapo_elem *ret = ERR_PTR(-ENOENT); + struct nft_pipapo *priv = nft_set_priv(set); + struct nft_pipapo_match *m = priv->clone; + unsigned long *res_map, *fill_map = NULL; + struct nft_pipapo_field *f; + int i; + + res_map = kmalloc_array(m->bsize_max, sizeof(*res_map), GFP_ATOMIC); + if (!res_map) { + ret = ERR_PTR(-ENOMEM); + goto out; + } + + fill_map = kcalloc(m->bsize_max, sizeof(*res_map), GFP_ATOMIC); + if (!fill_map) { + ret = ERR_PTR(-ENOMEM); + goto out; + } + + memset(res_map, 0xff, m->bsize_max * sizeof(*res_map)); + + nft_pipapo_for_each_field(f, i, m) { + bool last = i == m->field_count - 1; + unsigned long *lt = f->lt; + int b, group; + + /* For each 4-bit group: select lookup table bucket depending on + * packet bytes value, then AND bucket value + */ + for (group = 0; group < f->groups; group++) { + u8 v; + + if (group % 2) { + v = *data & 0x0f; + data++; + } else { + v = *data >> 4; + } + __bitmap_and(res_map, res_map, lt + v * f->bsize, + f->bsize * BITS_PER_LONG); + + lt += f->bsize * NFT_PIPAPO_BUCKETS; + } + + /* Now populate the bitmap for the next field, unless this is + * the last field, in which case return the matched 'ext' + * pointer if any. + * + * Now res_map contains the matching bitmap, and fill_map is the + * bitmap for the next field. + */ +next_match: + b = pipapo_refill(res_map, f->bsize, f->rules, fill_map, f->mt, + last); + if (b < 0) + goto out; + + if (last) { + if (nft_set_elem_expired(&f->mt[b].e->ext) || + (genmask && + !nft_set_elem_active(&f->mt[b].e->ext, genmask))) + goto next_match; + + ret = f->mt[b].e; + goto out; + } + + data += NFT_PIPAPO_GROUPS_PADDING(f->groups); + + /* Swap bitmap indices: fill_map will be the initial bitmap for + * the next field (i.e. the new res_map), and res_map is + * guaranteed to be all-zeroes at this point, ready to be filled + * according to the next mapping table. + */ + swap(res_map, fill_map); + } + +out: + kfree(fill_map); + kfree(res_map); + return ret; +} + +/** + * nft_pipapo_get() - Get matching element reference given key data + * @net: Network namespace + * @set: nftables API set representation + * @elem: nftables API element representation containing key data + * @flags: Unused + */ +void *nft_pipapo_get(const struct net *net, const struct nft_set *set, + const struct nft_set_elem *elem, unsigned int flags) +{ + return pipapo_get(net, set, (const u8 *)elem->key.val.data, + nft_genmask_cur(net)); +} + +/** + * pipapo_resize() - Resize lookup or mapping table, or both + * @f: Field containing lookup and mapping tables + * @old_rules: Previous amount of rules in field + * @rules: New amount of rules + * + * Increase, decrease or maintain tables size depending on new amount of rules, + * and copy data over. In case the new size is smaller, throw away data for + * highest-numbered rules. + * + * Return: 0 on success, -ENOMEM on allocation failure. + */ +static int pipapo_resize(struct nft_pipapo_field *f, int old_rules, int rules) +{ + long *new_lt = NULL, *new_p, *old_lt = f->lt, *old_p; + union nft_pipapo_map_bucket *new_mt, *old_mt = f->mt; + size_t new_bucket_size, copy; + int group, bucket; + + new_bucket_size = DIV_ROUND_UP(rules, BITS_PER_LONG); + + if (new_bucket_size == f->bsize) + goto mt; + + if (new_bucket_size > f->bsize) + copy = f->bsize; + else + copy = new_bucket_size; + + new_lt = kvzalloc(f->groups * NFT_PIPAPO_BUCKETS * new_bucket_size * + sizeof(*new_lt), GFP_KERNEL); + if (!new_lt) + return -ENOMEM; + + new_p = new_lt; + old_p = old_lt; + for (group = 0; group < f->groups; group++) { + for (bucket = 0; bucket < NFT_PIPAPO_BUCKETS; bucket++) { + memcpy(new_p, old_p, copy * sizeof(*new_p)); + new_p += copy; + old_p += copy; + + if (new_bucket_size > f->bsize) + new_p += new_bucket_size - f->bsize; + else + old_p += f->bsize - new_bucket_size; + } + } + +mt: + new_mt = kvmalloc(rules * sizeof(*new_mt), GFP_KERNEL); + if (!new_mt) { + kvfree(new_lt); + return -ENOMEM; + } + + memcpy(new_mt, f->mt, min(old_rules, rules) * sizeof(*new_mt)); + if (rules > old_rules) { + memset(new_mt + old_rules, 0, + (rules - old_rules) * sizeof(*new_mt)); + } + + if (new_lt) { + f->bsize = new_bucket_size; + f->lt = new_lt; + kvfree(old_lt); + } + + f->mt = new_mt; + kvfree(old_mt); + + return 0; +} + +/** + * pipapo_bucket_set() - Set rule bit in bucket given group and group value + * @f: Field containing lookup table + * @rule: Rule index + * @group: Group index + * @v: Value of bit group + */ +static void pipapo_bucket_set(struct nft_pipapo_field *f, int rule, int group, + int v) +{ + unsigned long *pos; + + pos = f->lt + f->bsize * NFT_PIPAPO_BUCKETS * group; + pos += f->bsize * v; + + __set_bit(rule, pos); +} + +/** + * pipapo_insert() - Insert new rule in field given input key and mask length + * @f: Field containing lookup table + * @k: Input key for classification, without nftables padding + * @mask_bits: Length of mask; matches field length for non-ranged entry + * + * Insert a new rule reference in lookup buckets corresponding to k and + * mask_bits. + * + * Return: 1 on success (one rule inserted), negative error code on failure. + */ +static int pipapo_insert(struct nft_pipapo_field *f, const uint8_t *k, + int mask_bits) +{ + int rule = f->rules++, group, ret; + + ret = pipapo_resize(f, f->rules - 1, f->rules); + if (ret) + return ret; + + for (group = 0; group < f->groups; group++) { + int i, v; + u8 mask; + + if (group % 2) + v = k[group / 2] & 0x0f; + else + v = k[group / 2] >> 4; + + if (mask_bits >= (group + 1) * 4) { + /* Not masked */ + pipapo_bucket_set(f, rule, group, v); + } else if (mask_bits <= group * 4) { + /* Completely masked */ + for (i = 0; i < NFT_PIPAPO_BUCKETS; i++) + pipapo_bucket_set(f, rule, group, i); + } else { + /* The mask limit falls on this group */ + mask = 0x0f >> (mask_bits - group * 4); + for (i = 0; i < NFT_PIPAPO_BUCKETS; i++) { + if ((i & ~mask) == (v & ~mask)) + pipapo_bucket_set(f, rule, group, i); + } + } + } + + return 1; +} + +/** + * pipapo_step_diff() - Check if setting @step bit in netmask would change it + * @base: Mask we are expanding + * @step: Step bit for given expansion step + * @len: Total length of mask space (set and unset bits), bytes + * + * Convenience function for mask expansion. + * + * Return: true if step bit changes mask (i.e. isn't set), false otherwise. + */ +static bool pipapo_step_diff(u8 *base, int step, int len) +{ + /* Network order, byte-addressed */ +#ifdef __BIG_ENDIAN__ + return !(BIT(step % BITS_PER_BYTE) & base[step / BITS_PER_BYTE]); +#else + return !(BIT(step % BITS_PER_BYTE) & + base[len - 1 - step / BITS_PER_BYTE]); +#endif +} + +/** + * pipapo_step_after_end() - Check if mask exceeds range end with given step + * @base: Mask we are expanding + * @end: End of range + * @step: Step bit for given expansion step, highest bit to be set + * @len: Total length of mask space (set and unset bits), bytes + * + * Convenience function for mask expansion. + * + * Return: true if mask exceeds range setting step bits, false otherwise. + */ +static bool pipapo_step_after_end(const u8 *base, const u8 *end, int step, + int len) +{ + u8 tmp[NFT_PIPAPO_MAX_BYTES]; + int i; + + memcpy(tmp, base, len); + + /* Network order, byte-addressed */ + for (i = 0; i <= step; i++) +#ifdef __BIG_ENDIAN__ + tmp[i / BITS_PER_BYTE] |= BIT(i % BITS_PER_BYTE); +#else + tmp[len - 1 - i / BITS_PER_BYTE] |= BIT(i % BITS_PER_BYTE); +#endif + + return memcmp(tmp, end, len) > 0; +} + +/** + * pipapo_base_sum() - Sum step bit to given len-sized netmask base with carry + * @base: Netmask base + * @step: Step bit to sum + * @len: Netmask length, bytes + */ +static void pipapo_base_sum(u8 *base, int step, int len) +{ + bool carry = false; + int i; + + /* Network order, byte-addressed */ +#ifdef __BIG_ENDIAN__ + for (i = step / BITS_PER_BYTE; i < len; i++) { +#else + for (i = len - 1 - step / BITS_PER_BYTE; i >= 0; i--) { +#endif + if (carry) + base[i]++; + else + base[i] += 1 << (step % BITS_PER_BYTE); + + if (base[i]) + break; + + carry = true; + } +} + +/** + * pipapo_expand() - Expand to composing netmasks, insert into lookup table + * @f: Field containing lookup table + * @start: Start of range + * @end: End of range + * @len: Length of value in bits + * + * Expand range to composing netmasks and insert corresponding rule references + * in lookup buckets. + * + * Return: number of inserted rules on success, negative error code on failure. + */ +static int pipapo_expand(struct nft_pipapo_field *f, + const u8 *start, const u8 *end, int len) +{ + int step, masks = 0, bytes = DIV_ROUND_UP(len, BITS_PER_BYTE); + u8 base[NFT_PIPAPO_MAX_BYTES]; + + memcpy(base, start, bytes); + while (memcmp(base, end, bytes) <= 0) { + int err; + + step = 0; + while (pipapo_step_diff(base, step, bytes)) { + if (pipapo_step_after_end(base, end, step, bytes)) + break; + + step++; + if (step >= len) { + if (!masks) { + pipapo_insert(f, base, 0); + masks = 1; + } + goto out; + } + } + + err = pipapo_insert(f, base, len - step); + + if (err < 0) + return err; + + masks++; + pipapo_base_sum(base, step, bytes); + } +out: + return masks; +} + +/** + * pipapo_map() - Insert rules in mapping tables, mapping them between fields + * @m: Matching data, including mapping table + * @map: Table of rule maps: array of first rule and amount of rules + * in next field a given rule maps to, for each field + * @ext: For last field, nft_set_ext pointer matching rules map to + */ +static void pipapo_map(struct nft_pipapo_match *m, + union nft_pipapo_map_bucket map[NFT_PIPAPO_MAX_FIELDS], + struct nft_pipapo_elem *e) +{ + struct nft_pipapo_field *f; + int i, j; + + for (i = 0, f = m->f; i < m->field_count - 1; i++, f++) { + for (j = 0; j < map[i].n; j++) { + f->mt[map[i].to + j].to = map[i + 1].to; + f->mt[map[i].to + j].n = map[i + 1].n; + } + } + + /* Last field: map to ext instead of mapping to next field */ + for (j = 0; j < map[i].n; j++) + f->mt[map[i].to + j].e = e; +} + +/** + * pipapo_realloc_scratch() - Reallocate scratch maps for partial match results + * @clone: Copy of matching data with pending insertions and deletions + * @bsize_max Maximum bucket size, scratch maps cover two buckets + * + * Return: 0 on success, -ENOMEM on failure. + */ +static int pipapo_realloc_scratch(struct nft_pipapo_match *clone, + unsigned long bsize_max) +{ + int i; + + for_each_possible_cpu(i) { + unsigned long *scratch; + + scratch = kzalloc_node(bsize_max * sizeof(*scratch) * 2, + GFP_KERNEL, cpu_to_node(i)); + if (!scratch) { + /* On failure, there's no need to undo previous + * allocations: this means that some scratch maps have + * a bigger allocated size now (this is only called on + * insertion), but the extra space won't be used by any + * CPU as new elements are not inserted and m->bsize_max + * is not updated. + */ + return -ENOMEM; + } + + kfree(*per_cpu_ptr(clone->scratch, i)); + + *per_cpu_ptr(clone->scratch, i) = scratch; + } + + return 0; +} + +/** + * nft_pipapo_insert() - Validate and insert ranged elements + * @net: Network namespace + * @set: nftables API set representation + * @elem: nftables API element representation containing key data + * @ext2: Filled with pointer to &struct nft_set_ext in inserted element + * + * Return: 0 on success, error pointer on failure. + */ +static int nft_pipapo_insert(const struct net *net, const struct nft_set *set, + const struct nft_set_elem *elem, + struct nft_set_ext **ext2) +{ + const struct nft_set_ext *ext = nft_set_elem_ext(set, elem->priv); + union nft_pipapo_map_bucket rulemap[NFT_PIPAPO_MAX_FIELDS]; + const u8 *start = (const u8 *)elem->key.val.data, *end; + struct nft_pipapo_elem *e = elem->priv, *dup; + struct nft_pipapo *priv = nft_set_priv(set); + struct nft_pipapo_match *m = priv->clone; + u8 genmask = nft_genmask_next(net); + struct nft_pipapo_field *f; + int i, bsize_max, err = 0; + + dup = pipapo_get(net, set, start, genmask); + if (PTR_ERR(dup) == -ENOENT) { + if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END)) { + end = (const u8 *)nft_set_ext_key_end(ext)->data; + dup = pipapo_get(net, set, end, nft_genmask_next(net)); + } else { + end = start; + } + } + + if (PTR_ERR(dup) != -ENOENT) { + if (IS_ERR(dup)) + return PTR_ERR(dup); + *ext2 = &dup->ext; + return -EEXIST; + } + + /* Validate */ + nft_pipapo_for_each_field(f, i, m) { + const u8 *start_p = start, *end_p = end; + + if (f->rules >= (unsigned long)NFT_PIPAPO_RULE0_MAX) + return -ENOSPC; + + if (memcmp(start_p, end_p, + f->groups / NFT_PIPAPO_GROUPS_PER_BYTE) > 0) + return -EINVAL; + + start_p += NFT_PIPAPO_GROUPS_PADDED_SIZE(f->groups); + end_p += NFT_PIPAPO_GROUPS_PADDED_SIZE(f->groups); + } + + /* Insert */ + priv->dirty = true; + + bsize_max = m->bsize_max; + + nft_pipapo_for_each_field(f, i, m) { + int ret; + + rulemap[i].to = f->rules; + + ret = memcmp(start, end, + f->groups / NFT_PIPAPO_GROUPS_PER_BYTE); + if (!ret) { + ret = pipapo_insert(f, start, + f->groups * NFT_PIPAPO_GROUP_BITS); + } else { + ret = pipapo_expand(f, start, end, + f->groups * NFT_PIPAPO_GROUP_BITS); + } + + if (f->bsize > bsize_max) + bsize_max = f->bsize; + + rulemap[i].n = ret; + + start += NFT_PIPAPO_GROUPS_PADDED_SIZE(f->groups); + end += NFT_PIPAPO_GROUPS_PADDED_SIZE(f->groups); + } + + if (!*this_cpu_ptr(m->scratch) || bsize_max > m->bsize_max) { + err = pipapo_realloc_scratch(m, bsize_max); + if (err) + return err; + + this_cpu_write(nft_pipapo_scratch_index, false); + + m->bsize_max = bsize_max; + } + + *ext2 = &e->ext; + + pipapo_map(m, rulemap, e); + + return 0; +} + +/** + * pipapo_clone() - Clone matching data to create new working copy + * @old: Existing matching data + * + * Return: copy of matching data passed as 'old', error pointer on failure + */ +static struct nft_pipapo_match *pipapo_clone(struct nft_pipapo_match *old) +{ + struct nft_pipapo_field *dst, *src; + struct nft_pipapo_match *new; + int i; + + new = kmalloc(sizeof(*new) + sizeof(*dst) * old->field_count, + GFP_KERNEL); + if (!new) + return ERR_PTR(-ENOMEM); + + new->field_count = old->field_count; + new->bsize_max = old->bsize_max; + + new->scratch = alloc_percpu(*new->scratch); + if (!new->scratch) + goto out_scratch; + + rcu_head_init(&new->rcu); + + src = old->f; + dst = new->f; + + for (i = 0; i < old->field_count; i++) { + memcpy(dst, src, offsetof(struct nft_pipapo_field, lt)); + + dst->lt = kvzalloc(src->groups * NFT_PIPAPO_BUCKETS * + src->bsize * sizeof(*dst->lt), + GFP_KERNEL); + if (!dst->lt) + goto out_lt; + + memcpy(dst->lt, src->lt, + src->bsize * sizeof(*dst->lt) * + src->groups * NFT_PIPAPO_BUCKETS); + + dst->mt = kvmalloc(src->rules * sizeof(*src->mt), GFP_KERNEL); + if (!dst->mt) + goto out_mt; + + memcpy(dst->mt, src->mt, src->rules * sizeof(*src->mt)); + src++; + dst++; + } + + return new; + +out_mt: + kvfree(dst->lt); +out_lt: + for (dst--; i > 0; i--) { + kvfree(dst->mt); + kvfree(dst->lt); + dst--; + } + free_percpu(new->scratch); +out_scratch: + kfree(new); + + return ERR_PTR(-ENOMEM); +} + +/** + * pipapo_rules_same_key() - Get number of rules originated from the same entry + * @f: Field containing mapping table + * @first: Index of first rule in set of rules mapping to same entry + * + * Using the fact that all rules in a field that originated from the same entry + * will map to the same set of rules in the next field, or to the same element + * reference, return the cardinality of the set of rules that originated from + * the same entry as the rule with index @first, @first rule included. + * + * In pictures: + * rules + * field #0 0 1 2 3 4 + * map to: 0 1 2-4 2-4 5-9 + * . . ....... . ... + * | | | | \ \ + * | | | | \ \ + * | | | | \ \ + * ' ' ' ' ' \ + * in field #1 0 1 2 3 4 5 ... + * + * if this is called for rule 2 on field #0, it will return 3, as also rules 2 + * and 3 in field 0 map to the same set of rules (2, 3, 4) in the next field. + * + * For the last field in a set, we can rely on associated entries to map to the + * same element references. + * + * Return: Number of rules that originated from the same entry as @first. + */ +static int pipapo_rules_same_key(struct nft_pipapo_field *f, int first) +{ + struct nft_pipapo_elem *e = NULL; /* Keep gcc happy */ + int r; + + for (r = first; r < f->rules; r++) { + if (r != first && e != f->mt[r].e) + return r - first; + + e = f->mt[r].e; + } + + if (r != first) + return r - first; + + return 0; +} + +/** + * pipapo_unmap() - Remove rules from mapping tables, renumber remaining ones + * @mt: Mapping array + * @rules: Original amount of rules in mapping table + * @start: First rule index to be removed + * @n: Amount of rules to be removed + * @to_offset: First rule index, in next field, this group of rules maps to + * @is_last: If this is the last field, delete reference from mapping array + * + * This is used to unmap rules from the mapping table for a single field, + * maintaining consistency and compactness for the existing ones. + * + * In pictures: let's assume that we want to delete rules 2 and 3 from the + * following mapping array: + * + * rules + * 0 1 2 3 4 + * map to: 4-10 4-10 11-15 11-15 16-18 + * + * the result will be: + * + * rules + * 0 1 2 + * map to: 4-10 4-10 11-13 + * + * for fields before the last one. In case this is the mapping table for the + * last field in a set, and rules map to pointers to &struct nft_pipapo_elem: + * + * rules + * 0 1 2 3 4 + * element pointers: 0x42 0x42 0x33 0x33 0x44 + * + * the result will be: + * + * rules + * 0 1 2 + * element pointers: 0x42 0x42 0x44 + */ +static void pipapo_unmap(union nft_pipapo_map_bucket *mt, int rules, + int start, int n, int to_offset, bool is_last) +{ + int i; + + memmove(mt + start, mt + start + n, (rules - start - n) * sizeof(*mt)); + memset(mt + rules - n, 0, n * sizeof(*mt)); + + if (is_last) + return; + + for (i = start; i < rules - n; i++) + mt[i].to -= to_offset; +} + +/** + * pipapo_drop() - Delete entry from lookup and mapping tables, given rule map + * @m: Matching data + * @rulemap Table of rule maps, arrays of first rule and amount of rules + * in next field a given entry maps to, for each field + * + * For each rule in lookup table buckets mapping to this set of rules, drop + * all bits set in lookup table mapping. In pictures, assuming we want to drop + * rules 0 and 1 from this lookup table: + * + * bucket + * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 + * 0 0 1,2 + * 1 1,2 0 + * 2 0 1,2 + * 3 0 1,2 + * 4 0,1,2 + * 5 0 1 2 + * 6 0,1,2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 + * 7 1,2 1,2 1 1 1 0,1 1 1 1 1 1 1 1 1 1 1 + * + * rule 2 becomes rule 0, and the result will be: + * + * bucket + * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 + * 0 0 + * 1 0 + * 2 0 + * 3 0 + * 4 0 + * 5 0 + * 6 0 + * 7 0 0 + * + * once this is done, call unmap() to drop all the corresponding rule references + * from mapping tables. + */ +static void pipapo_drop(struct nft_pipapo_match *m, + union nft_pipapo_map_bucket rulemap[]) +{ + struct nft_pipapo_field *f; + int i; + + nft_pipapo_for_each_field(f, i, m) { + int g; + + for (g = 0; g < f->groups; g++) { + unsigned long *pos; + int b; + + pos = f->lt + g * NFT_PIPAPO_BUCKETS * f->bsize; + + for (b = 0; b < NFT_PIPAPO_BUCKETS; b++) { + bitmap_cut(pos, pos, rulemap[i].to, + rulemap[i].n, + f->bsize * BITS_PER_LONG); + + pos += f->bsize; + } + } + + pipapo_unmap(f->mt, f->rules, rulemap[i].to, rulemap[i].n, + rulemap[i + 1].n, i == m->field_count - 1); + if (pipapo_resize(f, f->rules, f->rules - rulemap[i].n)) { + /* We can ignore this, a failure to shrink tables down + * doesn't make tables invalid. + */ + ; + } + f->rules -= rulemap[i].n; + } +} + +/** + * pipapo_gc() - Drop expired entries from set, destroy start and end elements + * @set: nftables API set representation + * @m: Matching data + */ +static void pipapo_gc(const struct nft_set *set, struct nft_pipapo_match *m) +{ + struct nft_pipapo *priv = nft_set_priv(set); + int rules_f0, first_rule = 0; + + while ((rules_f0 = pipapo_rules_same_key(m->f, first_rule))) { + union nft_pipapo_map_bucket rulemap[NFT_PIPAPO_MAX_FIELDS]; + struct nft_pipapo_field *f; + struct nft_pipapo_elem *e; + int i, start, rules_fx; + + start = first_rule; + rules_fx = rules_f0; + + nft_pipapo_for_each_field(f, i, m) { + rulemap[i].to = start; + rulemap[i].n = rules_fx; + + if (i < m->field_count - 1) { + rules_fx = f->mt[start].n; + start = f->mt[start].to; + } + } + + /* Pick the last field, and its last index */ + f--; + i--; + e = f->mt[rulemap[i].to].e; + if (nft_set_elem_expired(&e->ext) && + !nft_set_elem_mark_busy(&e->ext)) { + priv->dirty = true; + pipapo_drop(m, rulemap); + + rcu_barrier(); + nft_set_elem_destroy(set, e, true); + + /* And check again current first rule, which is now the + * first we haven't checked. + */ + } else { + first_rule += rules_f0; + } + } + + priv->last_gc = jiffies; +} + +/** + * pipapo_free_fields() - Free per-field tables contained in matching data + * @m: Matching data + */ +static void pipapo_free_fields(struct nft_pipapo_match *m) +{ + struct nft_pipapo_field *f; + int i; + + nft_pipapo_for_each_field(f, i, m) { + kvfree(f->lt); + kvfree(f->mt); + } +} + +/** + * pipapo_reclaim_match - RCU callback to free fields from old matching data + * @rcu: RCU head + */ +static void pipapo_reclaim_match(struct rcu_head *rcu) +{ + struct nft_pipapo_match *m; + int i; + + m = container_of(rcu, struct nft_pipapo_match, rcu); + + for_each_possible_cpu(i) + kfree(*per_cpu_ptr(m->scratch, i)); + + free_percpu(m->scratch); + + pipapo_free_fields(m); + + kfree(m); +} + +/** + * pipapo_commit() - Replace lookup data with current working copy + * @set: nftables API set representation + * + * While at it, check if we should perform garbage collection on the working + * copy before committing it for lookup, and don't replace the table if the + * working copy doesn't have pending changes. + * + * We also need to create a new working copy for subsequent insertions and + * deletions. + */ +static void pipapo_commit(const struct nft_set *set) +{ + struct nft_pipapo *priv = nft_set_priv(set); + struct nft_pipapo_match *new_clone, *old; + + if (time_after_eq(jiffies, priv->last_gc + nft_set_gc_interval(set))) + pipapo_gc(set, priv->clone); + + if (!priv->dirty) + return; + + new_clone = pipapo_clone(priv->clone); + if (IS_ERR(new_clone)) + return; + + priv->dirty = false; + + old = rcu_access_pointer(priv->match); + rcu_assign_pointer(priv->match, priv->clone); + if (old) + call_rcu(&old->rcu, pipapo_reclaim_match); + + priv->clone = new_clone; +} + +/** + * nft_pipapo_activate() - Mark element reference as active given key, commit + * @net: Network namespace + * @set: nftables API set representation + * @elem: nftables API element representation containing key data + * + * On insertion, elements are added to a copy of the matching data currently + * in use for lookups, and not directly inserted into current lookup data, so + * we'll take care of that by calling pipapo_commit() here. Both + * nft_pipapo_insert() and nft_pipapo_activate() are called once for each + * element, hence we can't purpose either one as a real commit operation. + */ +static void nft_pipapo_activate(const struct net *net, + const struct nft_set *set, + const struct nft_set_elem *elem) +{ + struct nft_pipapo_elem *e; + + e = pipapo_get(net, set, (const u8 *)elem->key.val.data, 0); + if (IS_ERR(e)) + return; + + nft_set_elem_change_active(net, set, &e->ext); + nft_set_elem_clear_busy(&e->ext); + + pipapo_commit(set); +} + +/** + * pipapo_deactivate() - Check that element is in set, mark as inactive + * @net: Network namespace + * @set: nftables API set representation + * @data: Input key data + * @ext: nftables API extension pointer, used to check for end element + * + * This is a convenience function that can be called from both + * nft_pipapo_deactivate() and nft_pipapo_flush(), as they are in fact the same + * operation. + * + * Return: deactivated element if found, NULL otherwise. + */ +static void *pipapo_deactivate(const struct net *net, const struct nft_set *set, + const u8 *data, const struct nft_set_ext *ext) +{ + struct nft_pipapo_elem *e; + + e = pipapo_get(net, set, data, nft_genmask_next(net)); + if (IS_ERR(e)) + return NULL; + + nft_set_elem_change_active(net, set, &e->ext); + + return e; +} + +/** + * nft_pipapo_deactivate() - Call pipapo_deactivate() to make element inactive + * @net: Network namespace + * @set: nftables API set representation + * @elem: nftables API element representation containing key data + * + * Return: deactivated element if found, NULL otherwise. + */ +static void *nft_pipapo_deactivate(const struct net *net, + const struct nft_set *set, + const struct nft_set_elem *elem) +{ + const struct nft_set_ext *ext = nft_set_elem_ext(set, elem->priv); + + return pipapo_deactivate(net, set, (const u8 *)elem->key.val.data, ext); +} + +/** + * nft_pipapo_flush() - Call pipapo_deactivate() to make element inactive + * @net: Network namespace + * @set: nftables API set representation + * @elem: nftables API element representation containing key data + * + * This is functionally the same as nft_pipapo_deactivate(), with a slightly + * different interface, and it's also called once for each element in a set + * being flushed, so we can't implement, strictly speaking, a flush operation, + * which would otherwise be as simple as allocating an empty copy of the + * matching data. + * + * Note that we could in theory do that, mark the set as flushed, and ignore + * subsequent calls, but we would leak all the elements after the first one, + * because they wouldn't then be freed as result of API calls. + * + * Return: true if element was found and deactivated. + */ +static bool nft_pipapo_flush(const struct net *net, const struct nft_set *set, + void *elem) +{ + struct nft_pipapo_elem *e = elem; + + return pipapo_deactivate(net, set, (const u8 *)nft_set_ext_key(&e->ext), + &e->ext); +} + +/** + * pipapo_get_boundaries() - Get byte interval for associated rules + * @f: Field including lookup table + * @first_rule: First rule (lowest index) + * @rule_count: Number of associated rules + * @left: Byte expression for left boundary (start of range) + * @right: Byte expression for right boundary (end of range) + * + * Given the first rule and amount of rules that originated from the same entry, + * build the original range associated with the entry, and calculate the length + * of the originating netmask. + * + * In pictures: + * + * bucket + * group 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 + * 0 1,2 + * 1 1,2 + * 2 1,2 + * 3 1,2 + * 4 1,2 + * 5 1 2 + * 6 1,2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 + * 7 1,2 1,2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 + * + * this is the lookup table corresponding to the IPv4 range + * 192.168.1.0-192.168.2.1, which was expanded to the two composing netmasks, + * rule #1: 192.168.1.0/24, and rule #2: 192.168.2.0/31. + * + * This function fills @left and @right with the byte values of the leftmost + * and rightmost bucket indices for the lowest and highest rule indices, + * respectively. If @first_rule is 1 and @rule_count is 2, we obtain, in + * nibbles: + * left: < 12, 0, 10, 8, 0, 1, 0, 0 > + * right: < 12, 0, 10, 8, 0, 2, 2, 1 > + * corresponding to bytes: + * left: < 192, 168, 1, 0 > + * right: < 192, 168, 2, 1 > + * with mask length irrelevant here, unused on return, as the range is already + * defined by its start and end points. The mask length is relevant for a single + * ranged entry instead: if @first_rule is 1 and @rule_count is 1, we ignore + * rule 2 above: @left becomes < 192, 168, 1, 0 >, @right becomes + * < 192, 168, 1, 255 >, and the mask length, calculated from the distances + * between leftmost and rightmost bucket indices for each group, would be 24. + * + * Return: mask length, in bits. + */ +static int pipapo_get_boundaries(struct nft_pipapo_field *f, int first_rule, + int rule_count, u8 *left, u8 *right) +{ + u8 *l = left, *r = right; + int g, mask_len = 0; + + for (g = 0; g < f->groups; g++) { + int b, x0, x1; + + x0 = -1; + x1 = -1; + for (b = 0; b < NFT_PIPAPO_BUCKETS; b++) { + unsigned long *pos; + + pos = f->lt + (g * NFT_PIPAPO_BUCKETS + b) * f->bsize; + if (test_bit(first_rule, pos) && x0 == -1) + x0 = b; + if (test_bit(first_rule + rule_count - 1, pos)) + x1 = b; + } + + if (g % 2) { + *(l++) |= x0 & 0x0f; + *(r++) |= x1 & 0x0f; + } else { + *l |= x0 << 4; + *r |= x1 << 4; + } + + if (x1 - x0 == 0) + mask_len += 4; + else if (x1 - x0 == 1) + mask_len += 3; + else if (x1 - x0 == 3) + mask_len += 2; + else if (x1 - x0 == 7) + mask_len += 1; + } + + return mask_len; +} + +/** + * pipapo_match_field() - Match rules against byte ranges + * @f: Field including the lookup table + * @first_rule: First of associated rules originating from same entry + * @rule_count: Amount of associated rules + * @start: Start of range to be matched + * @end: End of range to be matched + * + * Return: true on match, false otherwise. + */ +static bool pipapo_match_field(struct nft_pipapo_field *f, + int first_rule, int rule_count, + const u8 *start, const u8 *end) +{ + u8 right[NFT_PIPAPO_MAX_BYTES] = { 0 }; + u8 left[NFT_PIPAPO_MAX_BYTES] = { 0 }; + + pipapo_get_boundaries(f, first_rule, rule_count, left, right); + + return !memcmp(start, left, f->groups / NFT_PIPAPO_GROUPS_PER_BYTE) && + !memcmp(end, right, f->groups / NFT_PIPAPO_GROUPS_PER_BYTE); +} + +/** + * nft_pipapo_remove() - Remove element given key, commit + * @net: Network namespace + * @set: nftables API set representation + * @elem: nftables API element representation containing key data + * + * Similarly to nft_pipapo_activate(), this is used as commit operation by the + * API, but it's called once per element in the pending transaction, so we can't + * implement this as a single commit operation. Closest we can get is to remove + * the matched element here, if any, and commit the updated matching data. + */ +static void nft_pipapo_remove(const struct net *net, const struct nft_set *set, + const struct nft_set_elem *elem) +{ + const u8 *data = (const u8 *)elem->key.val.data; + struct nft_pipapo *priv = nft_set_priv(set); + struct nft_pipapo_match *m = priv->clone; + int rules_f0, first_rule = 0; + struct nft_pipapo_elem *e; + + e = pipapo_get(net, set, data, 0); + if (IS_ERR(e)) + return; + + while ((rules_f0 = pipapo_rules_same_key(m->f, first_rule))) { + union nft_pipapo_map_bucket rulemap[NFT_PIPAPO_MAX_FIELDS]; + const u8 *match_start, *match_end; + struct nft_pipapo_field *f; + int i, start, rules_fx; + + match_start = data; + match_end = (const u8 *)nft_set_ext_key_end(&e->ext)->data; + + start = first_rule; + rules_fx = rules_f0; + + nft_pipapo_for_each_field(f, i, m) { + if (!pipapo_match_field(f, start, rules_fx, + match_start, match_end)) + break; + + rulemap[i].to = start; + rulemap[i].n = rules_fx; + + rules_fx = f->mt[start].n; + start = f->mt[start].to; + + match_start += NFT_PIPAPO_GROUPS_PADDED_SIZE(f->groups); + match_end += NFT_PIPAPO_GROUPS_PADDED_SIZE(f->groups); + } + + if (i == m->field_count) { + priv->dirty = true; + pipapo_drop(m, rulemap); + pipapo_commit(set); + return; + } + + first_rule += rules_f0; + } +} + +/** + * nft_pipapo_walk() - Walk over elements + * @ctx: nftables API context + * @set: nftables API set representation + * @iter: Iterator + * + * As elements are referenced in the mapping array for the last field, directly + * scan that array: there's no need to follow rule mappings from the first + * field. + */ +static void nft_pipapo_walk(const struct nft_ctx *ctx, struct nft_set *set, + struct nft_set_iter *iter) +{ + struct nft_pipapo *priv = nft_set_priv(set); + struct nft_pipapo_match *m; + struct nft_pipapo_field *f; + int i, r; + + rcu_read_lock(); + m = rcu_dereference(priv->match); + + if (unlikely(!m)) + goto out; + + for (i = 0, f = m->f; i < m->field_count - 1; i++, f++) + ; + + for (r = 0; r < f->rules; r++) { + struct nft_pipapo_elem *e; + struct nft_set_elem elem; + + if (r < f->rules - 1 && f->mt[r + 1].e == f->mt[r].e) + continue; + + if (iter->count < iter->skip) + goto cont; + + e = f->mt[r].e; + if (nft_set_elem_expired(&e->ext)) + goto cont; + + elem.priv = e; + + iter->err = iter->fn(ctx, set, iter, &elem); + if (iter->err < 0) + goto out; + +cont: + iter->count++; + } + +out: + rcu_read_unlock(); +} + +/** + * nft_pipapo_privsize() - Return the size of private data for the set + * @nla: netlink attributes, ignored as size doesn't depend on them + * @desc: Set description, ignored as size doesn't depend on it + * + * Return: size of private data for this set implementation, in bytes + */ +static u64 nft_pipapo_privsize(const struct nlattr * const nla[], + const struct nft_set_desc *desc) +{ + return sizeof(struct nft_pipapo); +} + +/** + * nft_pipapo_estimate() - Estimate set size, space and lookup complexity + * @desc: Set description, element count and field description used here + * @features: Flags: NFT_SET_INTERVAL needs to be there + * @est: Storage for estimation data + * + * The size for this set type can vary dramatically, as it depends on the number + * of rules (composing netmasks) the entries expand to. We compute the worst + * case here. + * + * In general, for a non-ranged entry or a single composing netmask, we need + * one bit in each of the sixteen NFT_PIPAPO_BUCKETS, for each 4-bit group (that + * is, each input bit needs four bits of matching data), plus a bucket in the + * mapping table for each field. + * + * Return: true only for compatible range concatenations + */ +static bool nft_pipapo_estimate(const struct nft_set_desc *desc, u32 features, + struct nft_set_estimate *est) +{ + unsigned long entry_size; + int i; + + if (!(features & NFT_SET_INTERVAL) || desc->field_count <= 1) + return false; + + for (i = 0, entry_size = 0; i < desc->field_count; i++) { + unsigned long rules; + + if (desc->field_len[i] > NFT_PIPAPO_MAX_BYTES) + return false; + + /* Worst-case ranges for each concatenated field: each n-bit + * field can expand to up to n * 2 rules in each bucket, and + * each rule also needs a mapping bucket. + */ + rules = ilog2(desc->field_len[i] * BITS_PER_BYTE) * 2; + entry_size += rules * NFT_PIPAPO_BUCKETS / BITS_PER_BYTE; + entry_size += rules * sizeof(union nft_pipapo_map_bucket); + } + + /* Rules in lookup and mapping tables are needed for each entry */ + est->size = desc->size * entry_size; + if (est->size && div_u64(est->size, desc->size) != entry_size) + return false; + + est->size += sizeof(struct nft_pipapo) + + sizeof(struct nft_pipapo_match) * 2; + + est->size += sizeof(struct nft_pipapo_field) * desc->field_count; + + est->lookup = NFT_SET_CLASS_O_LOG_N; + + est->space = NFT_SET_CLASS_O_N; + + return true; +} + +/** + * nft_pipapo_init() - Initialise data for a set instance + * @set: nftables API set representation + * @desc: Set description + * @nla: netlink attributes + * + * Validate number and size of fields passed as NFTA_SET_DESC_CONCAT netlink + * attributes, initialise internal set parameters, current instance of matching + * data and a copy for subsequent insertions. + * + * Return: 0 on success, negative error code on failure. + */ +static int nft_pipapo_init(const struct nft_set *set, + const struct nft_set_desc *desc, + const struct nlattr * const nla[]) +{ + struct nft_pipapo *priv = nft_set_priv(set); + struct nft_pipapo_match *m; + struct nft_pipapo_field *f; + int err, i; + + if (desc->field_count > NFT_PIPAPO_MAX_FIELDS) + return -EINVAL; + + m = kmalloc(sizeof(*priv->match) + sizeof(*f) * desc->field_count, + GFP_KERNEL); + if (!m) + return -ENOMEM; + + m->field_count = desc->field_count; + m->bsize_max = 0; + + m->scratch = alloc_percpu(unsigned long *); + if (!m->scratch) { + err = -ENOMEM; + goto out_free; + } + for_each_possible_cpu(i) + *per_cpu_ptr(m->scratch, i) = NULL; + + rcu_head_init(&m->rcu); + + nft_pipapo_for_each_field(f, i, m) { + f->groups = desc->field_len[i] * NFT_PIPAPO_GROUPS_PER_BYTE; + priv->groups += f->groups; + + priv->width += round_up(desc->field_len[i], sizeof(u32)); + + f->bsize = 0; + f->rules = 0; + f->lt = NULL; + f->mt = NULL; + } + + /* Create an initial clone of matching data for next insertion */ + priv->clone = pipapo_clone(m); + if (IS_ERR(priv->clone)) { + err = PTR_ERR(priv->clone); + goto out_free; + } + + priv->dirty = false; + + rcu_assign_pointer(priv->match, m); + + return 0; + +out_free: + free_percpu(m->scratch); + kfree(m); + + return err; +} + +/** + * nft_pipapo_destroy() - Free private data for set and all committed elements + * @set: nftables API set representation + */ +static void nft_pipapo_destroy(const struct nft_set *set) +{ + struct nft_pipapo *priv = nft_set_priv(set); + struct nft_pipapo_match *m; + struct nft_pipapo_field *f; + int i, r, cpu; + + m = rcu_dereference_protected(priv->match, true); + if (m) { + rcu_barrier(); + + for (i = 0, f = m->f; i < m->field_count - 1; i++, f++) + ; + + for (r = 0; r < f->rules; r++) { + struct nft_pipapo_elem *e; + + if (r < f->rules - 1 && f->mt[r + 1].e == f->mt[r].e) + continue; + + e = f->mt[r].e; + + nft_set_elem_destroy(set, e, true); + } + + for_each_possible_cpu(cpu) + kfree(*per_cpu_ptr(m->scratch, cpu)); + free_percpu(m->scratch); + + pipapo_free_fields(m); + kfree(m); + priv->match = NULL; + } + + if (priv->clone) { + for_each_possible_cpu(cpu) + kfree(*per_cpu_ptr(priv->clone->scratch, cpu)); + free_percpu(priv->clone->scratch); + + pipapo_free_fields(priv->clone); + kfree(priv->clone); + priv->clone = NULL; + } +} + +/** + * nft_pipapo_gc_init() - Initialise garbage collection + * @set: nftables API set representation + * + * Instead of actually setting up a periodic work for garbage collection, as + * this operation requires a swap of matching data with the working copy, we'll + * do that opportunistically with other commit operations if the interval is + * elapsed, so we just need to set the current jiffies timestamp here. + */ +static void nft_pipapo_gc_init(const struct nft_set *set) +{ + struct nft_pipapo *priv = nft_set_priv(set); + + priv->last_gc = jiffies; +} + +struct nft_set_type nft_set_pipapo_type __read_mostly = { + .owner = THIS_MODULE, + .features = NFT_SET_INTERVAL | NFT_SET_MAP | NFT_SET_OBJECT | + NFT_SET_TIMEOUT, + .ops = { + .lookup = nft_pipapo_lookup, + .insert = nft_pipapo_insert, + .activate = nft_pipapo_activate, + .deactivate = nft_pipapo_deactivate, + .flush = nft_pipapo_flush, + .remove = nft_pipapo_remove, + .walk = nft_pipapo_walk, + .get = nft_pipapo_get, + .privsize = nft_pipapo_privsize, + .estimate = nft_pipapo_estimate, + .init = nft_pipapo_init, + .destroy = nft_pipapo_destroy, + .gc_init = nft_pipapo_gc_init, + .elemsize = offsetof(struct nft_pipapo_elem, ext), + }, +}; diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c index a9f804f7a04a..5000b938ab1e 100644 --- a/net/netfilter/nft_set_rbtree.c +++ b/net/netfilter/nft_set_rbtree.c @@ -466,6 +466,9 @@ static void nft_rbtree_destroy(const struct nft_set *set) static bool nft_rbtree_estimate(const struct nft_set_desc *desc, u32 features, struct nft_set_estimate *est) { + if (desc->field_count > 1) + return false; + if (desc->size) est->size = sizeof(struct nft_rbtree) + desc->size * sizeof(struct nft_rbtree_elem); diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c index 46b8ff24020d..1e8eeb044b07 100644 --- a/net/rose/af_rose.c +++ b/net/rose/af_rose.c @@ -1475,7 +1475,7 @@ static int __init rose_proto_init(void) int rc; if (rose_ndevs > 0x7FFFFFFF/sizeof(struct net_device *)) { - printk(KERN_ERR "ROSE: rose_proto_init - rose_ndevs parameter to large\n"); + printk(KERN_ERR "ROSE: rose_proto_init - rose_ndevs parameter too large\n"); rc = -EINVAL; goto out; } diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c index 86bd133b4fa0..96d54e5bf7bc 100644 --- a/net/rxrpc/input.c +++ b/net/rxrpc/input.c @@ -413,7 +413,7 @@ static void rxrpc_input_data(struct rxrpc_call *call, struct sk_buff *skb) { struct rxrpc_skb_priv *sp = rxrpc_skb(skb); enum rxrpc_call_state state; - unsigned int j; + unsigned int j, nr_subpackets; rxrpc_serial_t serial = sp->hdr.serial, ack_serial = 0; rxrpc_seq_t seq0 = sp->hdr.seq, hard_ack; bool immediate_ack = false, jumbo_bad = false; @@ -457,7 +457,8 @@ static void rxrpc_input_data(struct rxrpc_call *call, struct sk_buff *skb) call->ackr_prev_seq = seq0; hard_ack = READ_ONCE(call->rx_hard_ack); - if (sp->nr_subpackets > 1) { + nr_subpackets = sp->nr_subpackets; + if (nr_subpackets > 1) { if (call->nr_jumbo_bad > 3) { ack = RXRPC_ACK_NOSPACE; ack_serial = serial; @@ -465,11 +466,11 @@ static void rxrpc_input_data(struct rxrpc_call *call, struct sk_buff *skb) } } - for (j = 0; j < sp->nr_subpackets; j++) { + for (j = 0; j < nr_subpackets; j++) { rxrpc_serial_t serial = sp->hdr.serial + j; rxrpc_seq_t seq = seq0 + j; unsigned int ix = seq & RXRPC_RXTX_BUFF_MASK; - bool terminal = (j == sp->nr_subpackets - 1); + bool terminal = (j == nr_subpackets - 1); bool last = terminal && (sp->rx_flags & RXRPC_SKB_INCL_LAST); u8 flags, annotation = j; @@ -506,7 +507,7 @@ static void rxrpc_input_data(struct rxrpc_call *call, struct sk_buff *skb) } if (call->rxtx_buffer[ix]) { - rxrpc_input_dup_data(call, seq, sp->nr_subpackets > 1, + rxrpc_input_dup_data(call, seq, nr_subpackets > 1, &jumbo_bad); if (ack != RXRPC_ACK_DUPLICATE) { ack = RXRPC_ACK_DUPLICATE; @@ -564,6 +565,7 @@ static void rxrpc_input_data(struct rxrpc_call *call, struct sk_buff *skb) * ring. */ skb = NULL; + sp = NULL; } if (last) { diff --git a/net/sched/Kconfig b/net/sched/Kconfig index b1e7ec726958..edde0e519438 100644 --- a/net/sched/Kconfig +++ b/net/sched/Kconfig @@ -366,6 +366,19 @@ config NET_SCH_PIE If unsure, say N. +config NET_SCH_FQ_PIE + depends on NET_SCH_PIE + tristate "Flow Queue Proportional Integral controller Enhanced (FQ-PIE)" + help + Say Y here if you want to use the Flow Queue Proportional Integral + controller Enhanced (FQ-PIE) packet scheduling algorithm. + For more information, please see https://tools.ietf.org/html/rfc8033 + + To compile this driver as a module, choose M here: the module + will be called sch_fq_pie. + + If unsure, say N. + config NET_SCH_INGRESS tristate "Ingress/classifier-action Qdisc" depends on NET_CLS_ACT diff --git a/net/sched/Makefile b/net/sched/Makefile index bc8856b865ff..31c367a6cd09 100644 --- a/net/sched/Makefile +++ b/net/sched/Makefile @@ -59,6 +59,7 @@ obj-$(CONFIG_NET_SCH_CAKE) += sch_cake.o obj-$(CONFIG_NET_SCH_FQ) += sch_fq.o obj-$(CONFIG_NET_SCH_HHF) += sch_hhf.o obj-$(CONFIG_NET_SCH_PIE) += sch_pie.o +obj-$(CONFIG_NET_SCH_FQ_PIE) += sch_fq_pie.o obj-$(CONFIG_NET_SCH_CBS) += sch_cbs.o obj-$(CONFIG_NET_SCH_ETF) += sch_etf.o obj-$(CONFIG_NET_SCH_TAPRIO) += sch_taprio.o diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index 76e0d122616a..c2cdd0fc2e70 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -2055,9 +2055,8 @@ replay: &chain_info)); mutex_unlock(&chain->filter_chain_lock); - tp_new = tcf_proto_create(nla_data(tca[TCA_KIND]), - protocol, prio, chain, rtnl_held, - extack); + tp_new = tcf_proto_create(name, protocol, prio, chain, + rtnl_held, extack); if (IS_ERR(tp_new)) { err = PTR_ERR(tp_new); goto errout_tp; diff --git a/net/sched/cls_basic.c b/net/sched/cls_basic.c index 4aafbe3d435c..f256a7c69093 100644 --- a/net/sched/cls_basic.c +++ b/net/sched/cls_basic.c @@ -263,12 +263,17 @@ skip: } } -static void basic_bind_class(void *fh, u32 classid, unsigned long cl) +static void basic_bind_class(void *fh, u32 classid, unsigned long cl, void *q, + unsigned long base) { struct basic_filter *f = fh; - if (f && f->res.classid == classid) - f->res.class = cl; + if (f && f->res.classid == classid) { + if (cl) + __tcf_bind_filter(q, &f->res, base); + else + __tcf_unbind_filter(q, &f->res); + } } static int basic_dump(struct net *net, struct tcf_proto *tp, void *fh, diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c index 8229ed4a67be..6e3e63db0e01 100644 --- a/net/sched/cls_bpf.c +++ b/net/sched/cls_bpf.c @@ -631,12 +631,17 @@ nla_put_failure: return -1; } -static void cls_bpf_bind_class(void *fh, u32 classid, unsigned long cl) +static void cls_bpf_bind_class(void *fh, u32 classid, unsigned long cl, + void *q, unsigned long base) { struct cls_bpf_prog *prog = fh; - if (prog && prog->res.classid == classid) - prog->res.class = cl; + if (prog && prog->res.classid == classid) { + if (cl) + __tcf_bind_filter(q, &prog->res, base); + else + __tcf_unbind_filter(q, &prog->res); + } } static void cls_bpf_walk(struct tcf_proto *tp, struct tcf_walker *arg, diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c index b0f42e62dd76..f9c0d1e8d380 100644 --- a/net/sched/cls_flower.c +++ b/net/sched/cls_flower.c @@ -2765,12 +2765,17 @@ nla_put_failure: return -EMSGSIZE; } -static void fl_bind_class(void *fh, u32 classid, unsigned long cl) +static void fl_bind_class(void *fh, u32 classid, unsigned long cl, void *q, + unsigned long base) { struct cls_fl_filter *f = fh; - if (f && f->res.classid == classid) - f->res.class = cl; + if (f && f->res.classid == classid) { + if (cl) + __tcf_bind_filter(q, &f->res, base); + else + __tcf_unbind_filter(q, &f->res); + } } static bool fl_delete_empty(struct tcf_proto *tp) diff --git a/net/sched/cls_fw.c b/net/sched/cls_fw.c index c9496c920d6f..ec945294626a 100644 --- a/net/sched/cls_fw.c +++ b/net/sched/cls_fw.c @@ -419,12 +419,17 @@ nla_put_failure: return -1; } -static void fw_bind_class(void *fh, u32 classid, unsigned long cl) +static void fw_bind_class(void *fh, u32 classid, unsigned long cl, void *q, + unsigned long base) { struct fw_filter *f = fh; - if (f && f->res.classid == classid) - f->res.class = cl; + if (f && f->res.classid == classid) { + if (cl) + __tcf_bind_filter(q, &f->res, base); + else + __tcf_unbind_filter(q, &f->res); + } } static struct tcf_proto_ops cls_fw_ops __read_mostly = { diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c index 7fc2eb62aa98..039cc86974f4 100644 --- a/net/sched/cls_matchall.c +++ b/net/sched/cls_matchall.c @@ -393,12 +393,17 @@ nla_put_failure: return -1; } -static void mall_bind_class(void *fh, u32 classid, unsigned long cl) +static void mall_bind_class(void *fh, u32 classid, unsigned long cl, void *q, + unsigned long base) { struct cls_mall_head *head = fh; - if (head && head->res.classid == classid) - head->res.class = cl; + if (head && head->res.classid == classid) { + if (cl) + __tcf_bind_filter(q, &head->res, base); + else + __tcf_unbind_filter(q, &head->res); + } } static struct tcf_proto_ops cls_mall_ops __read_mostly = { diff --git a/net/sched/cls_route.c b/net/sched/cls_route.c index 2d9e0b4484ea..6f8786b06bde 100644 --- a/net/sched/cls_route.c +++ b/net/sched/cls_route.c @@ -641,12 +641,17 @@ nla_put_failure: return -1; } -static void route4_bind_class(void *fh, u32 classid, unsigned long cl) +static void route4_bind_class(void *fh, u32 classid, unsigned long cl, void *q, + unsigned long base) { struct route4_filter *f = fh; - if (f && f->res.classid == classid) - f->res.class = cl; + if (f && f->res.classid == classid) { + if (cl) + __tcf_bind_filter(q, &f->res, base); + else + __tcf_unbind_filter(q, &f->res); + } } static struct tcf_proto_ops cls_route4_ops __read_mostly = { diff --git a/net/sched/cls_rsvp.h b/net/sched/cls_rsvp.h index 2f3c03b25d5d..c22624131949 100644 --- a/net/sched/cls_rsvp.h +++ b/net/sched/cls_rsvp.h @@ -738,12 +738,17 @@ nla_put_failure: return -1; } -static void rsvp_bind_class(void *fh, u32 classid, unsigned long cl) +static void rsvp_bind_class(void *fh, u32 classid, unsigned long cl, void *q, + unsigned long base) { struct rsvp_filter *f = fh; - if (f && f->res.classid == classid) - f->res.class = cl; + if (f && f->res.classid == classid) { + if (cl) + __tcf_bind_filter(q, &f->res, base); + else + __tcf_unbind_filter(q, &f->res); + } } static struct tcf_proto_ops RSVP_OPS __read_mostly = { diff --git a/net/sched/cls_tcindex.c b/net/sched/cls_tcindex.c index e573e5a5c794..3d4a1280352f 100644 --- a/net/sched/cls_tcindex.c +++ b/net/sched/cls_tcindex.c @@ -654,12 +654,17 @@ nla_put_failure: return -1; } -static void tcindex_bind_class(void *fh, u32 classid, unsigned long cl) +static void tcindex_bind_class(void *fh, u32 classid, unsigned long cl, + void *q, unsigned long base) { struct tcindex_filter_result *r = fh; - if (r && r->res.classid == classid) - r->res.class = cl; + if (r && r->res.classid == classid) { + if (cl) + __tcf_bind_filter(q, &r->res, base); + else + __tcf_unbind_filter(q, &r->res); + } } static struct tcf_proto_ops cls_tcindex_ops __read_mostly = { diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c index a0e6fac613de..e15ff335953d 100644 --- a/net/sched/cls_u32.c +++ b/net/sched/cls_u32.c @@ -1255,12 +1255,17 @@ static int u32_reoffload(struct tcf_proto *tp, bool add, flow_setup_cb_t *cb, return 0; } -static void u32_bind_class(void *fh, u32 classid, unsigned long cl) +static void u32_bind_class(void *fh, u32 classid, unsigned long cl, void *q, + unsigned long base) { struct tc_u_knode *n = fh; - if (n && n->res.classid == classid) - n->res.class = cl; + if (n && n->res.classid == classid) { + if (cl) + __tcf_bind_filter(q, &n->res, base); + else + __tcf_unbind_filter(q, &n->res); + } } static int u32_dump(struct net *net, struct tcf_proto *tp, void *fh, diff --git a/net/sched/ematch.c b/net/sched/ematch.c index 8f2ad706784d..dd3b8c11a2e0 100644 --- a/net/sched/ematch.c +++ b/net/sched/ematch.c @@ -238,6 +238,9 @@ static int tcf_em_validate(struct tcf_proto *tp, goto errout; if (em->ops->change) { + err = -EINVAL; + if (em_hdr->flags & TCF_EM_SIMPLE) + goto errout; err = em->ops->change(net, data, data_len, em); if (err < 0) goto errout; @@ -263,12 +266,12 @@ static int tcf_em_validate(struct tcf_proto *tp, } em->data = (unsigned long) v; } + em->datalen = data_len; } } em->matchid = em_hdr->matchid; em->flags = em_hdr->flags; - em->datalen = data_len; em->net = net; err = 0; diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c index 1047825d9f48..50794125bf02 100644 --- a/net/sched/sch_api.c +++ b/net/sched/sch_api.c @@ -1891,8 +1891,9 @@ static int tclass_del_notify(struct net *net, struct tcf_bind_args { struct tcf_walker w; - u32 classid; + unsigned long base; unsigned long cl; + u32 classid; }; static int tcf_node_bind(struct tcf_proto *tp, void *n, struct tcf_walker *arg) @@ -1903,28 +1904,30 @@ static int tcf_node_bind(struct tcf_proto *tp, void *n, struct tcf_walker *arg) struct Qdisc *q = tcf_block_q(tp->chain->block); sch_tree_lock(q); - tp->ops->bind_class(n, a->classid, a->cl); + tp->ops->bind_class(n, a->classid, a->cl, q, a->base); sch_tree_unlock(q); } return 0; } -static void tc_bind_tclass(struct Qdisc *q, u32 portid, u32 clid, - unsigned long new_cl) +struct tc_bind_class_args { + struct qdisc_walker w; + unsigned long new_cl; + u32 portid; + u32 clid; +}; + +static int tc_bind_class_walker(struct Qdisc *q, unsigned long cl, + struct qdisc_walker *w) { + struct tc_bind_class_args *a = (struct tc_bind_class_args *)w; const struct Qdisc_class_ops *cops = q->ops->cl_ops; struct tcf_block *block; struct tcf_chain *chain; - unsigned long cl; - cl = cops->find(q, portid); - if (!cl) - return; - if (!cops->tcf_block) - return; block = cops->tcf_block(q, cl, NULL); if (!block) - return; + return 0; for (chain = tcf_get_next_chain(block, NULL); chain; chain = tcf_get_next_chain(block, chain)) { @@ -1935,11 +1938,29 @@ static void tc_bind_tclass(struct Qdisc *q, u32 portid, u32 clid, struct tcf_bind_args arg = {}; arg.w.fn = tcf_node_bind; - arg.classid = clid; - arg.cl = new_cl; + arg.classid = a->clid; + arg.base = cl; + arg.cl = a->new_cl; tp->ops->walk(tp, &arg.w, true); } } + + return 0; +} + +static void tc_bind_tclass(struct Qdisc *q, u32 portid, u32 clid, + unsigned long new_cl) +{ + const struct Qdisc_class_ops *cops = q->ops->cl_ops; + struct tc_bind_class_args args = {}; + + if (!cops->tcf_block) + return; + args.portid = portid; + args.clid = clid; + args.new_cl = new_cl; + args.w.fn = tc_bind_class_walker; + q->ops->cl_ops->walk(q, &args.w); } #else diff --git a/net/sched/sch_fq_pie.c b/net/sched/sch_fq_pie.c new file mode 100644 index 000000000000..bbd0dea6b6b9 --- /dev/null +++ b/net/sched/sch_fq_pie.c @@ -0,0 +1,562 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Flow Queue PIE discipline + * + * Copyright (C) 2019 Mohit P. Tahiliani <tahiliani@nitk.edu.in> + * Copyright (C) 2019 Sachin D. Patil <sdp.sachin@gmail.com> + * Copyright (C) 2019 V. Saicharan <vsaicharan1998@gmail.com> + * Copyright (C) 2019 Mohit Bhasi <mohitbhasi1998@gmail.com> + * Copyright (C) 2019 Leslie Monis <lesliemonis@gmail.com> + * Copyright (C) 2019 Gautam Ramakrishnan <gautamramk@gmail.com> + */ + +#include <linux/jhash.h> +#include <linux/sizes.h> +#include <linux/vmalloc.h> +#include <net/pkt_cls.h> +#include <net/pie.h> + +/* Flow Queue PIE + * + * Principles: + * - Packets are classified on flows. + * - This is a Stochastic model (as we use a hash, several flows might + * be hashed to the same slot) + * - Each flow has a PIE managed queue. + * - Flows are linked onto two (Round Robin) lists, + * so that new flows have priority on old ones. + * - For a given flow, packets are not reordered. + * - Drops during enqueue only. + * - ECN capability is off by default. + * - ECN threshold (if ECN is enabled) is at 10% by default. + * - Uses timestamps to calculate queue delay by default. + */ + +/** + * struct fq_pie_flow - contains data for each flow + * @vars: pie vars associated with the flow + * @deficit: number of remaining byte credits + * @backlog: size of data in the flow + * @qlen: number of packets in the flow + * @flowchain: flowchain for the flow + * @head: first packet in the flow + * @tail: last packet in the flow + */ +struct fq_pie_flow { + struct pie_vars vars; + s32 deficit; + u32 backlog; + u32 qlen; + struct list_head flowchain; + struct sk_buff *head; + struct sk_buff *tail; +}; + +struct fq_pie_sched_data { + struct tcf_proto __rcu *filter_list; /* optional external classifier */ + struct tcf_block *block; + struct fq_pie_flow *flows; + struct Qdisc *sch; + struct list_head old_flows; + struct list_head new_flows; + struct pie_params p_params; + u32 ecn_prob; + u32 flows_cnt; + u32 quantum; + u32 memory_limit; + u32 new_flow_count; + u32 memory_usage; + u32 overmemory; + struct pie_stats stats; + struct timer_list adapt_timer; +}; + +static unsigned int fq_pie_hash(const struct fq_pie_sched_data *q, + struct sk_buff *skb) +{ + return reciprocal_scale(skb_get_hash(skb), q->flows_cnt); +} + +static unsigned int fq_pie_classify(struct sk_buff *skb, struct Qdisc *sch, + int *qerr) +{ + struct fq_pie_sched_data *q = qdisc_priv(sch); + struct tcf_proto *filter; + struct tcf_result res; + int result; + + if (TC_H_MAJ(skb->priority) == sch->handle && + TC_H_MIN(skb->priority) > 0 && + TC_H_MIN(skb->priority) <= q->flows_cnt) + return TC_H_MIN(skb->priority); + + filter = rcu_dereference_bh(q->filter_list); + if (!filter) + return fq_pie_hash(q, skb) + 1; + + *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; + result = tcf_classify(skb, filter, &res, false); + if (result >= 0) { +#ifdef CONFIG_NET_CLS_ACT + switch (result) { + case TC_ACT_STOLEN: + case TC_ACT_QUEUED: + case TC_ACT_TRAP: + *qerr = NET_XMIT_SUCCESS | __NET_XMIT_STOLEN; + /* fall through */ + case TC_ACT_SHOT: + return 0; + } +#endif + if (TC_H_MIN(res.classid) <= q->flows_cnt) + return TC_H_MIN(res.classid); + } + return 0; +} + +/* add skb to flow queue (tail add) */ +static inline void flow_queue_add(struct fq_pie_flow *flow, + struct sk_buff *skb) +{ + if (!flow->head) + flow->head = skb; + else + flow->tail->next = skb; + flow->tail = skb; + skb->next = NULL; +} + +static int fq_pie_qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch, + struct sk_buff **to_free) +{ + struct fq_pie_sched_data *q = qdisc_priv(sch); + struct fq_pie_flow *sel_flow; + int uninitialized_var(ret); + u8 memory_limited = false; + u8 enqueue = false; + u32 pkt_len; + u32 idx; + + /* Classifies packet into corresponding flow */ + idx = fq_pie_classify(skb, sch, &ret); + sel_flow = &q->flows[idx]; + + /* Checks whether adding a new packet would exceed memory limit */ + get_pie_cb(skb)->mem_usage = skb->truesize; + memory_limited = q->memory_usage > q->memory_limit + skb->truesize; + + /* Checks if the qdisc is full */ + if (unlikely(qdisc_qlen(sch) >= sch->limit)) { + q->stats.overlimit++; + goto out; + } else if (unlikely(memory_limited)) { + q->overmemory++; + } + + if (!pie_drop_early(sch, &q->p_params, &sel_flow->vars, + sel_flow->backlog, skb->len)) { + enqueue = true; + } else if (q->p_params.ecn && + sel_flow->vars.prob <= (MAX_PROB / 100) * q->ecn_prob && + INET_ECN_set_ce(skb)) { + /* If packet is ecn capable, mark it if drop probability + * is lower than the parameter ecn_prob, else drop it. + */ + q->stats.ecn_mark++; + enqueue = true; + } + if (enqueue) { + /* Set enqueue time only when dq_rate_estimator is disabled. */ + if (!q->p_params.dq_rate_estimator) + pie_set_enqueue_time(skb); + + pkt_len = qdisc_pkt_len(skb); + q->stats.packets_in++; + q->memory_usage += skb->truesize; + sch->qstats.backlog += pkt_len; + sch->q.qlen++; + flow_queue_add(sel_flow, skb); + if (list_empty(&sel_flow->flowchain)) { + list_add_tail(&sel_flow->flowchain, &q->new_flows); + q->new_flow_count++; + sel_flow->deficit = q->quantum; + sel_flow->qlen = 0; + sel_flow->backlog = 0; + } + sel_flow->qlen++; + sel_flow->backlog += pkt_len; + return NET_XMIT_SUCCESS; + } +out: + q->stats.dropped++; + sel_flow->vars.accu_prob = 0; + sel_flow->vars.accu_prob_overflows = 0; + __qdisc_drop(skb, to_free); + qdisc_qstats_drop(sch); + return NET_XMIT_CN; +} + +static const struct nla_policy fq_pie_policy[TCA_FQ_PIE_MAX + 1] = { + [TCA_FQ_PIE_LIMIT] = {.type = NLA_U32}, + [TCA_FQ_PIE_FLOWS] = {.type = NLA_U32}, + [TCA_FQ_PIE_TARGET] = {.type = NLA_U32}, + [TCA_FQ_PIE_TUPDATE] = {.type = NLA_U32}, + [TCA_FQ_PIE_ALPHA] = {.type = NLA_U32}, + [TCA_FQ_PIE_BETA] = {.type = NLA_U32}, + [TCA_FQ_PIE_QUANTUM] = {.type = NLA_U32}, + [TCA_FQ_PIE_MEMORY_LIMIT] = {.type = NLA_U32}, + [TCA_FQ_PIE_ECN_PROB] = {.type = NLA_U32}, + [TCA_FQ_PIE_ECN] = {.type = NLA_U32}, + [TCA_FQ_PIE_BYTEMODE] = {.type = NLA_U32}, + [TCA_FQ_PIE_DQ_RATE_ESTIMATOR] = {.type = NLA_U32}, +}; + +static inline struct sk_buff *dequeue_head(struct fq_pie_flow *flow) +{ + struct sk_buff *skb = flow->head; + + flow->head = skb->next; + skb->next = NULL; + return skb; +} + +static struct sk_buff *fq_pie_qdisc_dequeue(struct Qdisc *sch) +{ + struct fq_pie_sched_data *q = qdisc_priv(sch); + struct sk_buff *skb = NULL; + struct fq_pie_flow *flow; + struct list_head *head; + u32 pkt_len; + +begin: + head = &q->new_flows; + if (list_empty(head)) { + head = &q->old_flows; + if (list_empty(head)) + return NULL; + } + + flow = list_first_entry(head, struct fq_pie_flow, flowchain); + /* Flow has exhausted all its credits */ + if (flow->deficit <= 0) { + flow->deficit += q->quantum; + list_move_tail(&flow->flowchain, &q->old_flows); + goto begin; + } + + if (flow->head) { + skb = dequeue_head(flow); + pkt_len = qdisc_pkt_len(skb); + sch->qstats.backlog -= pkt_len; + sch->q.qlen--; + qdisc_bstats_update(sch, skb); + } + + if (!skb) { + /* force a pass through old_flows to prevent starvation */ + if (head == &q->new_flows && !list_empty(&q->old_flows)) + list_move_tail(&flow->flowchain, &q->old_flows); + else + list_del_init(&flow->flowchain); + goto begin; + } + + flow->qlen--; + flow->deficit -= pkt_len; + flow->backlog -= pkt_len; + q->memory_usage -= get_pie_cb(skb)->mem_usage; + pie_process_dequeue(skb, &q->p_params, &flow->vars, flow->backlog); + return skb; +} + +static int fq_pie_change(struct Qdisc *sch, struct nlattr *opt, + struct netlink_ext_ack *extack) +{ + struct fq_pie_sched_data *q = qdisc_priv(sch); + struct nlattr *tb[TCA_FQ_PIE_MAX + 1]; + unsigned int len_dropped = 0; + unsigned int num_dropped = 0; + int err; + + if (!opt) + return -EINVAL; + + err = nla_parse_nested(tb, TCA_FQ_PIE_MAX, opt, fq_pie_policy, extack); + if (err < 0) + return err; + + sch_tree_lock(sch); + if (tb[TCA_FQ_PIE_LIMIT]) { + u32 limit = nla_get_u32(tb[TCA_FQ_PIE_LIMIT]); + + q->p_params.limit = limit; + sch->limit = limit; + } + if (tb[TCA_FQ_PIE_FLOWS]) { + if (q->flows) { + NL_SET_ERR_MSG_MOD(extack, + "Number of flows cannot be changed"); + goto flow_error; + } + q->flows_cnt = nla_get_u32(tb[TCA_FQ_PIE_FLOWS]); + if (!q->flows_cnt || q->flows_cnt > 65536) { + NL_SET_ERR_MSG_MOD(extack, + "Number of flows must be < 65536"); + goto flow_error; + } + } + + /* convert from microseconds to pschedtime */ + if (tb[TCA_FQ_PIE_TARGET]) { + /* target is in us */ + u32 target = nla_get_u32(tb[TCA_FQ_PIE_TARGET]); + + /* convert to pschedtime */ + q->p_params.target = + PSCHED_NS2TICKS((u64)target * NSEC_PER_USEC); + } + + /* tupdate is in jiffies */ + if (tb[TCA_FQ_PIE_TUPDATE]) + q->p_params.tupdate = + usecs_to_jiffies(nla_get_u32(tb[TCA_FQ_PIE_TUPDATE])); + + if (tb[TCA_FQ_PIE_ALPHA]) + q->p_params.alpha = nla_get_u32(tb[TCA_FQ_PIE_ALPHA]); + + if (tb[TCA_FQ_PIE_BETA]) + q->p_params.beta = nla_get_u32(tb[TCA_FQ_PIE_BETA]); + + if (tb[TCA_FQ_PIE_QUANTUM]) + q->quantum = nla_get_u32(tb[TCA_FQ_PIE_QUANTUM]); + + if (tb[TCA_FQ_PIE_MEMORY_LIMIT]) + q->memory_limit = nla_get_u32(tb[TCA_FQ_PIE_MEMORY_LIMIT]); + + if (tb[TCA_FQ_PIE_ECN_PROB]) + q->ecn_prob = nla_get_u32(tb[TCA_FQ_PIE_ECN_PROB]); + + if (tb[TCA_FQ_PIE_ECN]) + q->p_params.ecn = nla_get_u32(tb[TCA_FQ_PIE_ECN]); + + if (tb[TCA_FQ_PIE_BYTEMODE]) + q->p_params.bytemode = nla_get_u32(tb[TCA_FQ_PIE_BYTEMODE]); + + if (tb[TCA_FQ_PIE_DQ_RATE_ESTIMATOR]) + q->p_params.dq_rate_estimator = + nla_get_u32(tb[TCA_FQ_PIE_DQ_RATE_ESTIMATOR]); + + /* Drop excess packets if new limit is lower */ + while (sch->q.qlen > sch->limit) { + struct sk_buff *skb = fq_pie_qdisc_dequeue(sch); + + kfree_skb(skb); + len_dropped += qdisc_pkt_len(skb); + num_dropped += 1; + } + qdisc_tree_reduce_backlog(sch, num_dropped, len_dropped); + + sch_tree_unlock(sch); + return 0; + +flow_error: + sch_tree_unlock(sch); + return -EINVAL; +} + +static void fq_pie_timer(struct timer_list *t) +{ + struct fq_pie_sched_data *q = from_timer(q, t, adapt_timer); + struct Qdisc *sch = q->sch; + spinlock_t *root_lock; /* to lock qdisc for probability calculations */ + u16 idx; + + root_lock = qdisc_lock(qdisc_root_sleeping(sch)); + spin_lock(root_lock); + + for (idx = 0; idx < q->flows_cnt; idx++) + pie_calculate_probability(&q->p_params, &q->flows[idx].vars, + q->flows[idx].backlog); + + /* reset the timer to fire after 'tupdate' jiffies. */ + if (q->p_params.tupdate) + mod_timer(&q->adapt_timer, jiffies + q->p_params.tupdate); + + spin_unlock(root_lock); +} + +static int fq_pie_init(struct Qdisc *sch, struct nlattr *opt, + struct netlink_ext_ack *extack) +{ + struct fq_pie_sched_data *q = qdisc_priv(sch); + int err; + u16 idx; + + pie_params_init(&q->p_params); + sch->limit = 10 * 1024; + q->p_params.limit = sch->limit; + q->quantum = psched_mtu(qdisc_dev(sch)); + q->sch = sch; + q->ecn_prob = 10; + q->flows_cnt = 1024; + q->memory_limit = SZ_32M; + + INIT_LIST_HEAD(&q->new_flows); + INIT_LIST_HEAD(&q->old_flows); + + if (opt) { + err = fq_pie_change(sch, opt, extack); + + if (err) + return err; + } + + err = tcf_block_get(&q->block, &q->filter_list, sch, extack); + if (err) + goto init_failure; + + q->flows = kvcalloc(q->flows_cnt, sizeof(struct fq_pie_flow), + GFP_KERNEL); + if (!q->flows) { + err = -ENOMEM; + goto init_failure; + } + for (idx = 0; idx < q->flows_cnt; idx++) { + struct fq_pie_flow *flow = q->flows + idx; + + INIT_LIST_HEAD(&flow->flowchain); + pie_vars_init(&flow->vars); + } + + timer_setup(&q->adapt_timer, fq_pie_timer, 0); + mod_timer(&q->adapt_timer, jiffies + HZ / 2); + + return 0; + +init_failure: + q->flows_cnt = 0; + + return err; +} + +static int fq_pie_dump(struct Qdisc *sch, struct sk_buff *skb) +{ + struct fq_pie_sched_data *q = qdisc_priv(sch); + struct nlattr *opts; + + opts = nla_nest_start(skb, TCA_OPTIONS); + if (!opts) + return -EMSGSIZE; + + /* convert target from pschedtime to us */ + if (nla_put_u32(skb, TCA_FQ_PIE_LIMIT, sch->limit) || + nla_put_u32(skb, TCA_FQ_PIE_FLOWS, q->flows_cnt) || + nla_put_u32(skb, TCA_FQ_PIE_TARGET, + ((u32)PSCHED_TICKS2NS(q->p_params.target)) / + NSEC_PER_USEC) || + nla_put_u32(skb, TCA_FQ_PIE_TUPDATE, + jiffies_to_usecs(q->p_params.tupdate)) || + nla_put_u32(skb, TCA_FQ_PIE_ALPHA, q->p_params.alpha) || + nla_put_u32(skb, TCA_FQ_PIE_BETA, q->p_params.beta) || + nla_put_u32(skb, TCA_FQ_PIE_QUANTUM, q->quantum) || + nla_put_u32(skb, TCA_FQ_PIE_MEMORY_LIMIT, q->memory_limit) || + nla_put_u32(skb, TCA_FQ_PIE_ECN_PROB, q->ecn_prob) || + nla_put_u32(skb, TCA_FQ_PIE_ECN, q->p_params.ecn) || + nla_put_u32(skb, TCA_FQ_PIE_BYTEMODE, q->p_params.bytemode) || + nla_put_u32(skb, TCA_FQ_PIE_DQ_RATE_ESTIMATOR, + q->p_params.dq_rate_estimator)) + goto nla_put_failure; + + return nla_nest_end(skb, opts); + +nla_put_failure: + nla_nest_cancel(skb, opts); + return -EMSGSIZE; +} + +static int fq_pie_dump_stats(struct Qdisc *sch, struct gnet_dump *d) +{ + struct fq_pie_sched_data *q = qdisc_priv(sch); + struct tc_fq_pie_xstats st = { + .packets_in = q->stats.packets_in, + .overlimit = q->stats.overlimit, + .overmemory = q->overmemory, + .dropped = q->stats.dropped, + .ecn_mark = q->stats.ecn_mark, + .new_flow_count = q->new_flow_count, + .memory_usage = q->memory_usage, + }; + struct list_head *pos; + + sch_tree_lock(sch); + list_for_each(pos, &q->new_flows) + st.new_flows_len++; + + list_for_each(pos, &q->old_flows) + st.old_flows_len++; + sch_tree_unlock(sch); + + return gnet_stats_copy_app(d, &st, sizeof(st)); +} + +static void fq_pie_reset(struct Qdisc *sch) +{ + struct fq_pie_sched_data *q = qdisc_priv(sch); + u16 idx; + + INIT_LIST_HEAD(&q->new_flows); + INIT_LIST_HEAD(&q->old_flows); + for (idx = 0; idx < q->flows_cnt; idx++) { + struct fq_pie_flow *flow = q->flows + idx; + + /* Removes all packets from flow */ + rtnl_kfree_skbs(flow->head, flow->tail); + flow->head = NULL; + + INIT_LIST_HEAD(&flow->flowchain); + pie_vars_init(&flow->vars); + } + + sch->q.qlen = 0; + sch->qstats.backlog = 0; +} + +static void fq_pie_destroy(struct Qdisc *sch) +{ + struct fq_pie_sched_data *q = qdisc_priv(sch); + + tcf_block_put(q->block); + del_timer_sync(&q->adapt_timer); + kvfree(q->flows); +} + +static struct Qdisc_ops fq_pie_qdisc_ops __read_mostly = { + .id = "fq_pie", + .priv_size = sizeof(struct fq_pie_sched_data), + .enqueue = fq_pie_qdisc_enqueue, + .dequeue = fq_pie_qdisc_dequeue, + .peek = qdisc_peek_dequeued, + .init = fq_pie_init, + .destroy = fq_pie_destroy, + .reset = fq_pie_reset, + .change = fq_pie_change, + .dump = fq_pie_dump, + .dump_stats = fq_pie_dump_stats, + .owner = THIS_MODULE, +}; + +static int __init fq_pie_module_init(void) +{ + return register_qdisc(&fq_pie_qdisc_ops); +} + +static void __exit fq_pie_module_exit(void) +{ + unregister_qdisc(&fq_pie_qdisc_ops); +} + +module_init(fq_pie_module_init); +module_exit(fq_pie_module_exit); + +MODULE_DESCRIPTION("Flow Queue Proportional Integral controller Enhanced (FQ-PIE)"); +MODULE_AUTHOR("Mohit P. Tahiliani"); +MODULE_LICENSE("GPL"); diff --git a/net/sched/sch_pie.c b/net/sched/sch_pie.c index b0b0dc46af61..915bcdb59a9f 100644 --- a/net/sched/sch_pie.c +++ b/net/sched/sch_pie.c @@ -19,159 +19,76 @@ #include <linux/skbuff.h> #include <net/pkt_sched.h> #include <net/inet_ecn.h> - -#define QUEUE_THRESHOLD 16384 -#define DQCOUNT_INVALID -1 -#define DTIME_INVALID 0xffffffffffffffff -#define MAX_PROB 0xffffffffffffffff -#define PIE_SCALE 8 - -/* parameters used */ -struct pie_params { - psched_time_t target; /* user specified target delay in pschedtime */ - u32 tupdate; /* timer frequency (in jiffies) */ - u32 limit; /* number of packets that can be enqueued */ - u32 alpha; /* alpha and beta are between 0 and 32 */ - u32 beta; /* and are used for shift relative to 1 */ - bool ecn; /* true if ecn is enabled */ - bool bytemode; /* to scale drop early prob based on pkt size */ - u8 dq_rate_estimator; /* to calculate delay using Little's law */ -}; - -/* variables used */ -struct pie_vars { - u64 prob; /* probability but scaled by u64 limit. */ - psched_time_t burst_time; - psched_time_t qdelay; - psched_time_t qdelay_old; - u64 dq_count; /* measured in bytes */ - psched_time_t dq_tstamp; /* drain rate */ - u64 accu_prob; /* accumulated drop probability */ - u32 avg_dq_rate; /* bytes per pschedtime tick,scaled */ - u32 qlen_old; /* in bytes */ - u8 accu_prob_overflows; /* overflows of accu_prob */ -}; - -/* statistics gathering */ -struct pie_stats { - u32 packets_in; /* total number of packets enqueued */ - u32 dropped; /* packets dropped due to pie_action */ - u32 overlimit; /* dropped due to lack of space in queue */ - u32 maxq; /* maximum queue size */ - u32 ecn_mark; /* packets marked with ECN */ -}; +#include <net/pie.h> /* private data for the Qdisc */ struct pie_sched_data { - struct pie_params params; struct pie_vars vars; + struct pie_params params; struct pie_stats stats; struct timer_list adapt_timer; struct Qdisc *sch; }; -static void pie_params_init(struct pie_params *params) +bool pie_drop_early(struct Qdisc *sch, struct pie_params *params, + struct pie_vars *vars, u32 qlen, u32 packet_size) { - params->alpha = 2; - params->beta = 20; - params->tupdate = usecs_to_jiffies(15 * USEC_PER_MSEC); /* 15 ms */ - params->limit = 1000; /* default of 1000 packets */ - params->target = PSCHED_NS2TICKS(15 * NSEC_PER_MSEC); /* 15 ms */ - params->ecn = false; - params->bytemode = false; - params->dq_rate_estimator = false; -} - -/* private skb vars */ -struct pie_skb_cb { - psched_time_t enqueue_time; -}; - -static struct pie_skb_cb *get_pie_cb(const struct sk_buff *skb) -{ - qdisc_cb_private_validate(skb, sizeof(struct pie_skb_cb)); - return (struct pie_skb_cb *)qdisc_skb_cb(skb)->data; -} - -static psched_time_t pie_get_enqueue_time(const struct sk_buff *skb) -{ - return get_pie_cb(skb)->enqueue_time; -} - -static void pie_set_enqueue_time(struct sk_buff *skb) -{ - get_pie_cb(skb)->enqueue_time = psched_get_time(); -} - -static void pie_vars_init(struct pie_vars *vars) -{ - vars->dq_count = DQCOUNT_INVALID; - vars->dq_tstamp = DTIME_INVALID; - vars->accu_prob = 0; - vars->avg_dq_rate = 0; - /* default of 150 ms in pschedtime */ - vars->burst_time = PSCHED_NS2TICKS(150 * NSEC_PER_MSEC); - vars->accu_prob_overflows = 0; -} - -static bool drop_early(struct Qdisc *sch, u32 packet_size) -{ - struct pie_sched_data *q = qdisc_priv(sch); u64 rnd; - u64 local_prob = q->vars.prob; + u64 local_prob = vars->prob; u32 mtu = psched_mtu(qdisc_dev(sch)); /* If there is still burst allowance left skip random early drop */ - if (q->vars.burst_time > 0) + if (vars->burst_time > 0) return false; /* If current delay is less than half of target, and * if drop prob is low already, disable early_drop */ - if ((q->vars.qdelay < q->params.target / 2) && - (q->vars.prob < MAX_PROB / 5)) + if ((vars->qdelay < params->target / 2) && + (vars->prob < MAX_PROB / 5)) return false; - /* If we have fewer than 2 mtu-sized packets, disable drop_early, + /* If we have fewer than 2 mtu-sized packets, disable pie_drop_early, * similar to min_th in RED */ - if (sch->qstats.backlog < 2 * mtu) + if (qlen < 2 * mtu) return false; /* If bytemode is turned on, use packet size to compute new * probablity. Smaller packets will have lower drop prob in this case */ - if (q->params.bytemode && packet_size <= mtu) + if (params->bytemode && packet_size <= mtu) local_prob = (u64)packet_size * div_u64(local_prob, mtu); else - local_prob = q->vars.prob; + local_prob = vars->prob; if (local_prob == 0) { - q->vars.accu_prob = 0; - q->vars.accu_prob_overflows = 0; + vars->accu_prob = 0; + vars->accu_prob_overflows = 0; } - if (local_prob > MAX_PROB - q->vars.accu_prob) - q->vars.accu_prob_overflows++; + if (local_prob > MAX_PROB - vars->accu_prob) + vars->accu_prob_overflows++; - q->vars.accu_prob += local_prob; + vars->accu_prob += local_prob; - if (q->vars.accu_prob_overflows == 0 && - q->vars.accu_prob < (MAX_PROB / 100) * 85) + if (vars->accu_prob_overflows == 0 && + vars->accu_prob < (MAX_PROB / 100) * 85) return false; - if (q->vars.accu_prob_overflows == 8 && - q->vars.accu_prob >= MAX_PROB / 2) + if (vars->accu_prob_overflows == 8 && + vars->accu_prob >= MAX_PROB / 2) return true; prandom_bytes(&rnd, 8); if (rnd < local_prob) { - q->vars.accu_prob = 0; - q->vars.accu_prob_overflows = 0; + vars->accu_prob = 0; + vars->accu_prob_overflows = 0; return true; } return false; } +EXPORT_SYMBOL_GPL(pie_drop_early); static int pie_qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free) @@ -184,7 +101,8 @@ static int pie_qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch, goto out; } - if (!drop_early(sch, skb->len)) { + if (!pie_drop_early(sch, &q->params, &q->vars, sch->qstats.backlog, + skb->len)) { enqueue = true; } else if (q->params.ecn && (q->vars.prob <= MAX_PROB / 10) && INET_ECN_set_ce(skb)) { @@ -216,14 +134,14 @@ out: } static const struct nla_policy pie_policy[TCA_PIE_MAX + 1] = { - [TCA_PIE_TARGET] = {.type = NLA_U32}, - [TCA_PIE_LIMIT] = {.type = NLA_U32}, - [TCA_PIE_TUPDATE] = {.type = NLA_U32}, - [TCA_PIE_ALPHA] = {.type = NLA_U32}, - [TCA_PIE_BETA] = {.type = NLA_U32}, - [TCA_PIE_ECN] = {.type = NLA_U32}, - [TCA_PIE_BYTEMODE] = {.type = NLA_U32}, - [TCA_PIE_DQ_RATE_ESTIMATOR] = {.type = NLA_U32}, + [TCA_PIE_TARGET] = {.type = NLA_U32}, + [TCA_PIE_LIMIT] = {.type = NLA_U32}, + [TCA_PIE_TUPDATE] = {.type = NLA_U32}, + [TCA_PIE_ALPHA] = {.type = NLA_U32}, + [TCA_PIE_BETA] = {.type = NLA_U32}, + [TCA_PIE_ECN] = {.type = NLA_U32}, + [TCA_PIE_BYTEMODE] = {.type = NLA_U32}, + [TCA_PIE_DQ_RATE_ESTIMATOR] = {.type = NLA_U32}, }; static int pie_change(struct Qdisc *sch, struct nlattr *opt, @@ -296,26 +214,25 @@ static int pie_change(struct Qdisc *sch, struct nlattr *opt, return 0; } -static void pie_process_dequeue(struct Qdisc *sch, struct sk_buff *skb) +void pie_process_dequeue(struct sk_buff *skb, struct pie_params *params, + struct pie_vars *vars, u32 qlen) { - struct pie_sched_data *q = qdisc_priv(sch); - int qlen = sch->qstats.backlog; /* current queue size in bytes */ psched_time_t now = psched_get_time(); u32 dtime = 0; /* If dq_rate_estimator is disabled, calculate qdelay using the * packet timestamp. */ - if (!q->params.dq_rate_estimator) { - q->vars.qdelay = now - pie_get_enqueue_time(skb); + if (!params->dq_rate_estimator) { + vars->qdelay = now - pie_get_enqueue_time(skb); - if (q->vars.dq_tstamp != DTIME_INVALID) - dtime = now - q->vars.dq_tstamp; + if (vars->dq_tstamp != DTIME_INVALID) + dtime = now - vars->dq_tstamp; - q->vars.dq_tstamp = now; + vars->dq_tstamp = now; if (qlen == 0) - q->vars.qdelay = 0; + vars->qdelay = 0; if (dtime == 0) return; @@ -327,39 +244,39 @@ static void pie_process_dequeue(struct Qdisc *sch, struct sk_buff *skb) * we have enough packets to calculate the drain rate. Save * current time as dq_tstamp and start measurement cycle. */ - if (qlen >= QUEUE_THRESHOLD && q->vars.dq_count == DQCOUNT_INVALID) { - q->vars.dq_tstamp = psched_get_time(); - q->vars.dq_count = 0; + if (qlen >= QUEUE_THRESHOLD && vars->dq_count == DQCOUNT_INVALID) { + vars->dq_tstamp = psched_get_time(); + vars->dq_count = 0; } - /* Calculate the average drain rate from this value. If queue length - * has receded to a small value viz., <= QUEUE_THRESHOLD bytes,reset + /* Calculate the average drain rate from this value. If queue length + * has receded to a small value viz., <= QUEUE_THRESHOLD bytes, reset * the dq_count to -1 as we don't have enough packets to calculate the - * drain rate anymore The following if block is entered only when we + * drain rate anymore. The following if block is entered only when we * have a substantial queue built up (QUEUE_THRESHOLD bytes or more) * and we calculate the drain rate for the threshold here. dq_count is * in bytes, time difference in psched_time, hence rate is in * bytes/psched_time. */ - if (q->vars.dq_count != DQCOUNT_INVALID) { - q->vars.dq_count += skb->len; + if (vars->dq_count != DQCOUNT_INVALID) { + vars->dq_count += skb->len; - if (q->vars.dq_count >= QUEUE_THRESHOLD) { - u32 count = q->vars.dq_count << PIE_SCALE; + if (vars->dq_count >= QUEUE_THRESHOLD) { + u32 count = vars->dq_count << PIE_SCALE; - dtime = now - q->vars.dq_tstamp; + dtime = now - vars->dq_tstamp; if (dtime == 0) return; count = count / dtime; - if (q->vars.avg_dq_rate == 0) - q->vars.avg_dq_rate = count; + if (vars->avg_dq_rate == 0) + vars->avg_dq_rate = count; else - q->vars.avg_dq_rate = - (q->vars.avg_dq_rate - - (q->vars.avg_dq_rate >> 3)) + (count >> 3); + vars->avg_dq_rate = + (vars->avg_dq_rate - + (vars->avg_dq_rate >> 3)) + (count >> 3); /* If the queue has receded below the threshold, we hold * on to the last drain rate calculated, else we reset @@ -367,10 +284,10 @@ static void pie_process_dequeue(struct Qdisc *sch, struct sk_buff *skb) * packet is dequeued */ if (qlen < QUEUE_THRESHOLD) { - q->vars.dq_count = DQCOUNT_INVALID; + vars->dq_count = DQCOUNT_INVALID; } else { - q->vars.dq_count = 0; - q->vars.dq_tstamp = psched_get_time(); + vars->dq_count = 0; + vars->dq_tstamp = psched_get_time(); } goto burst_allowance_reduction; @@ -380,18 +297,18 @@ static void pie_process_dequeue(struct Qdisc *sch, struct sk_buff *skb) return; burst_allowance_reduction: - if (q->vars.burst_time > 0) { - if (q->vars.burst_time > dtime) - q->vars.burst_time -= dtime; + if (vars->burst_time > 0) { + if (vars->burst_time > dtime) + vars->burst_time -= dtime; else - q->vars.burst_time = 0; + vars->burst_time = 0; } } +EXPORT_SYMBOL_GPL(pie_process_dequeue); -static void calculate_probability(struct Qdisc *sch) +void pie_calculate_probability(struct pie_params *params, struct pie_vars *vars, + u32 qlen) { - struct pie_sched_data *q = qdisc_priv(sch); - u32 qlen = sch->qstats.backlog; /* queue size in bytes */ psched_time_t qdelay = 0; /* in pschedtime */ psched_time_t qdelay_old = 0; /* in pschedtime */ s64 delta = 0; /* determines the change in probability */ @@ -400,21 +317,21 @@ static void calculate_probability(struct Qdisc *sch) u32 power; bool update_prob = true; - if (q->params.dq_rate_estimator) { - qdelay_old = q->vars.qdelay; - q->vars.qdelay_old = q->vars.qdelay; + if (params->dq_rate_estimator) { + qdelay_old = vars->qdelay; + vars->qdelay_old = vars->qdelay; - if (q->vars.avg_dq_rate > 0) - qdelay = (qlen << PIE_SCALE) / q->vars.avg_dq_rate; + if (vars->avg_dq_rate > 0) + qdelay = (qlen << PIE_SCALE) / vars->avg_dq_rate; else qdelay = 0; } else { - qdelay = q->vars.qdelay; - qdelay_old = q->vars.qdelay_old; + qdelay = vars->qdelay; + qdelay_old = vars->qdelay_old; } - /* If qdelay is zero and qlen is not, it means qlen is very small, less - * than dequeue_rate, so we do not update probabilty in this round + /* If qdelay is zero and qlen is not, it means qlen is very small, + * so we do not update probabilty in this round. */ if (qdelay == 0 && qlen != 0) update_prob = false; @@ -426,18 +343,18 @@ static void calculate_probability(struct Qdisc *sch) * probability. alpha/beta are updated locally below by scaling down * by 16 to come to 0-2 range. */ - alpha = ((u64)q->params.alpha * (MAX_PROB / PSCHED_TICKS_PER_SEC)) >> 4; - beta = ((u64)q->params.beta * (MAX_PROB / PSCHED_TICKS_PER_SEC)) >> 4; + alpha = ((u64)params->alpha * (MAX_PROB / PSCHED_TICKS_PER_SEC)) >> 4; + beta = ((u64)params->beta * (MAX_PROB / PSCHED_TICKS_PER_SEC)) >> 4; /* We scale alpha and beta differently depending on how heavy the * congestion is. Please see RFC 8033 for details. */ - if (q->vars.prob < MAX_PROB / 10) { + if (vars->prob < MAX_PROB / 10) { alpha >>= 1; beta >>= 1; power = 100; - while (q->vars.prob < div_u64(MAX_PROB, power) && + while (vars->prob < div_u64(MAX_PROB, power) && power <= 1000000) { alpha >>= 2; beta >>= 2; @@ -446,14 +363,14 @@ static void calculate_probability(struct Qdisc *sch) } /* alpha and beta should be between 0 and 32, in multiples of 1/16 */ - delta += alpha * (u64)(qdelay - q->params.target); + delta += alpha * (u64)(qdelay - params->target); delta += beta * (u64)(qdelay - qdelay_old); - oldprob = q->vars.prob; + oldprob = vars->prob; /* to ensure we increase probability in steps of no more than 2% */ if (delta > (s64)(MAX_PROB / (100 / 2)) && - q->vars.prob >= MAX_PROB / 10) + vars->prob >= MAX_PROB / 10) delta = (MAX_PROB / 100) * 2; /* Non-linear drop: @@ -464,12 +381,12 @@ static void calculate_probability(struct Qdisc *sch) if (qdelay > (PSCHED_NS2TICKS(250 * NSEC_PER_MSEC))) delta += MAX_PROB / (100 / 2); - q->vars.prob += delta; + vars->prob += delta; if (delta > 0) { /* prevent overflow */ - if (q->vars.prob < oldprob) { - q->vars.prob = MAX_PROB; + if (vars->prob < oldprob) { + vars->prob = MAX_PROB; /* Prevent normalization error. If probability is at * maximum value already, we normalize it here, and * skip the check to do a non-linear drop in the next @@ -479,8 +396,8 @@ static void calculate_probability(struct Qdisc *sch) } } else { /* prevent underflow */ - if (q->vars.prob > oldprob) - q->vars.prob = 0; + if (vars->prob > oldprob) + vars->prob = 0; } /* Non-linear drop in probability: Reduce drop probability quickly if @@ -489,10 +406,10 @@ static void calculate_probability(struct Qdisc *sch) if (qdelay == 0 && qdelay_old == 0 && update_prob) /* Reduce drop probability to 98.4% */ - q->vars.prob -= q->vars.prob / 64u; + vars->prob -= vars->prob / 64; - q->vars.qdelay = qdelay; - q->vars.qlen_old = qlen; + vars->qdelay = qdelay; + vars->qlen_old = qlen; /* We restart the measurement cycle if the following conditions are met * 1. If the delay has been low for 2 consecutive Tupdate periods @@ -500,16 +417,17 @@ static void calculate_probability(struct Qdisc *sch) * 3. If average dq_rate_estimator is enabled, we have atleast one * estimate for the avg_dq_rate ie., is a non-zero value */ - if ((q->vars.qdelay < q->params.target / 2) && - (q->vars.qdelay_old < q->params.target / 2) && - q->vars.prob == 0 && - (!q->params.dq_rate_estimator || q->vars.avg_dq_rate > 0)) { - pie_vars_init(&q->vars); + if ((vars->qdelay < params->target / 2) && + (vars->qdelay_old < params->target / 2) && + vars->prob == 0 && + (!params->dq_rate_estimator || vars->avg_dq_rate > 0)) { + pie_vars_init(vars); } - if (!q->params.dq_rate_estimator) - q->vars.qdelay_old = qdelay; + if (!params->dq_rate_estimator) + vars->qdelay_old = qdelay; } +EXPORT_SYMBOL_GPL(pie_calculate_probability); static void pie_timer(struct timer_list *t) { @@ -518,7 +436,7 @@ static void pie_timer(struct timer_list *t) spinlock_t *root_lock = qdisc_lock(qdisc_root_sleeping(sch)); spin_lock(root_lock); - calculate_probability(sch); + pie_calculate_probability(&q->params, &q->vars, sch->qstats.backlog); /* reset the timer to fire after 'tupdate'. tupdate is in jiffies. */ if (q->params.tupdate) @@ -607,12 +525,13 @@ static int pie_dump_stats(struct Qdisc *sch, struct gnet_dump *d) static struct sk_buff *pie_qdisc_dequeue(struct Qdisc *sch) { + struct pie_sched_data *q = qdisc_priv(sch); struct sk_buff *skb = qdisc_dequeue_head(sch); if (!skb) return NULL; - pie_process_dequeue(sch, skb); + pie_process_dequeue(skb, &q->params, &q->vars, sch->qstats.backlog); return skb; } @@ -633,7 +552,7 @@ static void pie_destroy(struct Qdisc *sch) } static struct Qdisc_ops pie_qdisc_ops __read_mostly = { - .id = "pie", + .id = "pie", .priv_size = sizeof(struct pie_sched_data), .enqueue = pie_qdisc_enqueue, .dequeue = pie_qdisc_dequeue, diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c index 2cd94973795c..78e79029dc63 100644 --- a/net/sched/sch_tbf.c +++ b/net/sched/sch_tbf.c @@ -15,6 +15,7 @@ #include <linux/skbuff.h> #include <net/netlink.h> #include <net/sch_generic.h> +#include <net/pkt_cls.h> #include <net/pkt_sched.h> @@ -137,6 +138,52 @@ static u64 psched_ns_t2l(const struct psched_ratecfg *r, return len; } +static void tbf_offload_change(struct Qdisc *sch) +{ + struct tbf_sched_data *q = qdisc_priv(sch); + struct net_device *dev = qdisc_dev(sch); + struct tc_tbf_qopt_offload qopt; + + if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc) + return; + + qopt.command = TC_TBF_REPLACE; + qopt.handle = sch->handle; + qopt.parent = sch->parent; + qopt.replace_params.rate = q->rate; + qopt.replace_params.max_size = q->max_size; + qopt.replace_params.qstats = &sch->qstats; + + dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_TBF, &qopt); +} + +static void tbf_offload_destroy(struct Qdisc *sch) +{ + struct net_device *dev = qdisc_dev(sch); + struct tc_tbf_qopt_offload qopt; + + if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc) + return; + + qopt.command = TC_TBF_DESTROY; + qopt.handle = sch->handle; + qopt.parent = sch->parent; + dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_TBF, &qopt); +} + +static int tbf_offload_dump(struct Qdisc *sch) +{ + struct tc_tbf_qopt_offload qopt; + + qopt.command = TC_TBF_STATS; + qopt.handle = sch->handle; + qopt.parent = sch->parent; + qopt.stats.bstats = &sch->bstats; + qopt.stats.qstats = &sch->qstats; + + return qdisc_offload_dump_helper(sch, TC_SETUP_QDISC_TBF, &qopt); +} + /* GSO packet is too big, segment it so that tbf can transmit * each segment in time */ @@ -407,6 +454,8 @@ static int tbf_change(struct Qdisc *sch, struct nlattr *opt, sch_tree_unlock(sch); err = 0; + + tbf_offload_change(sch); done: return err; } @@ -432,6 +481,7 @@ static void tbf_destroy(struct Qdisc *sch) struct tbf_sched_data *q = qdisc_priv(sch); qdisc_watchdog_cancel(&q->watchdog); + tbf_offload_destroy(sch); qdisc_put(q->qdisc); } @@ -440,8 +490,12 @@ static int tbf_dump(struct Qdisc *sch, struct sk_buff *skb) struct tbf_sched_data *q = qdisc_priv(sch); struct nlattr *nest; struct tc_tbf_qopt opt; + int err; + + err = tbf_offload_dump(sch); + if (err) + return err; - sch->qstats.backlog = q->qdisc->qstats.backlog; nest = nla_nest_start_noflag(skb, TCA_OPTIONS); if (nest == NULL) goto nla_put_failure; diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c index 7ac1542feaf8..dc651a628dcf 100644 --- a/net/xfrm/xfrm_interface.c +++ b/net/xfrm/xfrm_interface.c @@ -268,9 +268,6 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl) int err = -1; int mtu; - if (!dst) - goto tx_err_link_failure; - dst_hold(dst); dst = xfrm_lookup_with_ifid(xi->net, dst, fl, NULL, 0, xi->p.if_id); if (IS_ERR(dst)) { @@ -297,7 +294,7 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl) mtu = dst_mtu(dst); if (!skb->ignore_df && skb->len > mtu) { - skb_dst_update_pmtu(skb, mtu); + skb_dst_update_pmtu_no_confirm(skb, mtu); if (skb->protocol == htons(ETH_P_IPV6)) { if (mtu < IPV6_MIN_MTU) @@ -343,6 +340,7 @@ static netdev_tx_t xfrmi_xmit(struct sk_buff *skb, struct net_device *dev) { struct xfrm_if *xi = netdev_priv(dev); struct net_device_stats *stats = &xi->dev->stats; + struct dst_entry *dst = skb_dst(skb); struct flowi fl; int ret; @@ -352,10 +350,33 @@ static netdev_tx_t xfrmi_xmit(struct sk_buff *skb, struct net_device *dev) case htons(ETH_P_IPV6): xfrm_decode_session(skb, &fl, AF_INET6); memset(IP6CB(skb), 0, sizeof(*IP6CB(skb))); + if (!dst) { + fl.u.ip6.flowi6_oif = dev->ifindex; + fl.u.ip6.flowi6_flags |= FLOWI_FLAG_ANYSRC; + dst = ip6_route_output(dev_net(dev), NULL, &fl.u.ip6); + if (dst->error) { + dst_release(dst); + stats->tx_carrier_errors++; + goto tx_err; + } + skb_dst_set(skb, dst); + } break; case htons(ETH_P_IP): xfrm_decode_session(skb, &fl, AF_INET); memset(IPCB(skb), 0, sizeof(*IPCB(skb))); + if (!dst) { + struct rtable *rt; + + fl.u.ip4.flowi4_oif = dev->ifindex; + fl.u.ip4.flowi4_flags |= FLOWI_FLAG_ANYSRC; + rt = __ip_route_output_key(dev_net(dev), &fl.u.ip4); + if (IS_ERR(rt)) { + stats->tx_carrier_errors++; + goto tx_err; + } + skb_dst_set(skb, &rt->dst); + } break; default: goto tx_err; @@ -563,12 +584,9 @@ static void xfrmi_dev_setup(struct net_device *dev) { dev->netdev_ops = &xfrmi_netdev_ops; dev->type = ARPHRD_NONE; - dev->hard_header_len = ETH_HLEN; - dev->min_header_len = ETH_HLEN; dev->mtu = ETH_DATA_LEN; dev->min_mtu = ETH_MIN_MTU; - dev->max_mtu = ETH_DATA_LEN; - dev->addr_len = ETH_ALEN; + dev->max_mtu = IP_MAX_MTU; dev->flags = IFF_NOARP; dev->needs_free_netdev = true; dev->priv_destructor = xfrmi_dev_free; diff --git a/scripts/recordmcount.c b/scripts/recordmcount.c index 612268eabef4..7225107a9aaf 100644 --- a/scripts/recordmcount.c +++ b/scripts/recordmcount.c @@ -38,6 +38,10 @@ #define R_AARCH64_ABS64 257 #endif +#define R_ARM_PC24 1 +#define R_ARM_THM_CALL 10 +#define R_ARM_CALL 28 + static int fd_map; /* File descriptor for file being modified. */ static int mmap_failed; /* Boolean flag. */ static char gpfx; /* prefix for global symbol name (sometimes '_') */ @@ -418,6 +422,18 @@ static char const *already_has_rel_mcount = "success"; /* our work here is done! #define RECORD_MCOUNT_64 #include "recordmcount.h" +static int arm_is_fake_mcount(Elf32_Rel const *rp) +{ + switch (ELF32_R_TYPE(w(rp->r_info))) { + case R_ARM_THM_CALL: + case R_ARM_CALL: + case R_ARM_PC24: + return 0; + } + + return 1; +} + /* 64-bit EM_MIPS has weird ELF64_Rela.r_info. * http://techpubs.sgi.com/library/manuals/4000/007-4658-001/pdf/007-4658-001.pdf * We interpret Table 29 Relocation Operation (Elf64_Rel, Elf64_Rela) [p.40] @@ -523,6 +539,7 @@ static int do_file(char const *const fname) altmcount = "__gnu_mcount_nc"; make_nop = make_nop_arm; rel_type_nop = R_ARM_NONE; + is_fake_mcount32 = arm_is_fake_mcount; gpfx = 0; break; case EM_AARCH64: diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile index b001c602414b..b45a8f77ee24 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -32,6 +32,7 @@ TARGETS += memory-hotplug TARGETS += mount TARGETS += mqueue TARGETS += net +TARGETS += net/mptcp TARGETS += netfilter TARGETS += networking/timestamping TARGETS += nsfs diff --git a/tools/testing/selftests/drivers/net/mlxsw/qos_lib.sh b/tools/testing/selftests/drivers/net/mlxsw/qos_lib.sh index a5937069ac16..faa51012cdac 100644 --- a/tools/testing/selftests/drivers/net/mlxsw/qos_lib.sh +++ b/tools/testing/selftests/drivers/net/mlxsw/qos_lib.sh @@ -1,29 +1,5 @@ # SPDX-License-Identifier: GPL-2.0 -humanize() -{ - local speed=$1; shift - - for unit in bps Kbps Mbps Gbps; do - if (($(echo "$speed < 1024" | bc))); then - break - fi - - speed=$(echo "scale=1; $speed / 1024" | bc) - done - - echo "$speed${unit}" -} - -rate() -{ - local t0=$1; shift - local t1=$1; shift - local interval=$1; shift - - echo $((8 * (t1 - t0) / interval)) -} - check_rate() { local rate=$1; shift diff --git a/tools/testing/selftests/drivers/net/mlxsw/sch_tbf_ets.sh b/tools/testing/selftests/drivers/net/mlxsw/sch_tbf_ets.sh new file mode 100755 index 000000000000..c6ce0b448bf3 --- /dev/null +++ b/tools/testing/selftests/drivers/net/mlxsw/sch_tbf_ets.sh @@ -0,0 +1,9 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +source qos_lib.sh +bail_on_lldpad + +lib_dir=$(dirname $0)/../../../net/forwarding +TCFLAGS=skip_sw +source $lib_dir/sch_tbf_ets.sh diff --git a/tools/testing/selftests/drivers/net/mlxsw/sch_tbf_prio.sh b/tools/testing/selftests/drivers/net/mlxsw/sch_tbf_prio.sh new file mode 100755 index 000000000000..8d245f331619 --- /dev/null +++ b/tools/testing/selftests/drivers/net/mlxsw/sch_tbf_prio.sh @@ -0,0 +1,9 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +source qos_lib.sh +bail_on_lldpad + +lib_dir=$(dirname $0)/../../../net/forwarding +TCFLAGS=skip_sw +source $lib_dir/sch_tbf_prio.sh diff --git a/tools/testing/selftests/drivers/net/mlxsw/sch_tbf_root.sh b/tools/testing/selftests/drivers/net/mlxsw/sch_tbf_root.sh new file mode 100755 index 000000000000..013886061f15 --- /dev/null +++ b/tools/testing/selftests/drivers/net/mlxsw/sch_tbf_root.sh @@ -0,0 +1,9 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +source qos_lib.sh +bail_on_lldpad + +lib_dir=$(dirname $0)/../../../net/forwarding +TCFLAGS=skip_sw +source $lib_dir/sch_tbf_root.sh diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh index 8dc5fac98cbc..2f5da414aaa7 100644 --- a/tools/testing/selftests/net/forwarding/lib.sh +++ b/tools/testing/selftests/net/forwarding/lib.sh @@ -248,6 +248,24 @@ busywait() done } +until_counter_is() +{ + local value=$1; shift + local current=$("$@") + + echo $((current)) + ((current >= value)) +} + +busywait_for_counter() +{ + local timeout=$1; shift + local delta=$1; shift + + local base=$("$@") + busywait "$timeout" until_counter_is $((base + delta)) "$@" +} + setup_wait_dev() { local dev=$1; shift @@ -575,9 +593,10 @@ tc_rule_stats_get() local dev=$1; shift local pref=$1; shift local dir=$1; shift + local selector=${1:-.packets}; shift tc -j -s filter show dev $dev ${dir:-ingress} pref $pref \ - | jq '.[1].options.actions[].stats.packets' + | jq ".[1].options.actions[].stats$selector" } ethtool_stats_get() @@ -588,6 +607,30 @@ ethtool_stats_get() ethtool -S $dev | grep "^ *$stat:" | head -n 1 | cut -d: -f2 } +humanize() +{ + local speed=$1; shift + + for unit in bps Kbps Mbps Gbps; do + if (($(echo "$speed < 1024" | bc))); then + break + fi + + speed=$(echo "scale=1; $speed / 1024" | bc) + done + + echo "$speed${unit}" +} + +rate() +{ + local t0=$1; shift + local t1=$1; shift + local interval=$1; shift + + echo $((8 * (t1 - t0) / interval)) +} + mac_get() { local if_name=$1 diff --git a/tools/testing/selftests/net/forwarding/sch_tbf_core.sh b/tools/testing/selftests/net/forwarding/sch_tbf_core.sh new file mode 100644 index 000000000000..d1f26cb7cd73 --- /dev/null +++ b/tools/testing/selftests/net/forwarding/sch_tbf_core.sh @@ -0,0 +1,233 @@ +# SPDX-License-Identifier: GPL-2.0 + +# This test sends a stream of traffic from H1 through a switch, to H2. On the +# egress port from the switch ($swp2), a shaper is installed. The test verifies +# that the rates on the port match the configured shaper. +# +# In order to test per-class shaping, $swp2 actually contains TBF under PRIO or +# ETS, with two different configurations. Traffic is prioritized using 802.1p. +# +# +-------------------------------------------+ +# | H1 | +# | + $h1.10 $h1.11 + | +# | | 192.0.2.1/28 192.0.2.17/28 | | +# | | | | +# | \______________ _____________/ | +# | \ / | +# | + $h1 | +# +---------------------|---------------------+ +# | +# +---------------------|---------------------+ +# | SW + $swp1 | +# | _______________/ \_______________ | +# | / \ | +# | +-|--------------+ +--------------|-+ | +# | | + $swp1.10 | | $swp1.11 + | | +# | | | | | | +# | | BR10 | | BR11 | | +# | | | | | | +# | | + $swp2.10 | | $swp2.11 + | | +# | +-|--------------+ +--------------|-+ | +# | \_______________ ______________/ | +# | \ / | +# | + $swp2 | +# +---------------------|---------------------+ +# | +# +---------------------|---------------------+ +# | H2 + $h2 | +# | ______________/ \______________ | +# | / \ | +# | | | | +# | + $h2.10 $h2.11 + | +# | 192.0.2.2/28 192.0.2.18/28 | +# +-------------------------------------------+ + +NUM_NETIFS=4 +CHECK_TC="yes" +source $lib_dir/lib.sh + +ipaddr() +{ + local host=$1; shift + local vlan=$1; shift + + echo 192.0.2.$((16 * (vlan - 10) + host)) +} + +host_create() +{ + local dev=$1; shift + local host=$1; shift + + simple_if_init $dev + mtu_set $dev 10000 + + vlan_create $dev 10 v$dev $(ipaddr $host 10)/28 + ip link set dev $dev.10 type vlan egress 0:0 + + vlan_create $dev 11 v$dev $(ipaddr $host 11)/28 + ip link set dev $dev.11 type vlan egress 0:1 +} + +host_destroy() +{ + local dev=$1; shift + + vlan_destroy $dev 11 + vlan_destroy $dev 10 + mtu_restore $dev + simple_if_fini $dev +} + +h1_create() +{ + host_create $h1 1 +} + +h1_destroy() +{ + host_destroy $h1 +} + +h2_create() +{ + host_create $h2 2 + + tc qdisc add dev $h2 clsact + tc filter add dev $h2 ingress pref 1010 prot 802.1q \ + flower $TCFLAGS vlan_id 10 action pass + tc filter add dev $h2 ingress pref 1011 prot 802.1q \ + flower $TCFLAGS vlan_id 11 action pass +} + +h2_destroy() +{ + tc qdisc del dev $h2 clsact + host_destroy $h2 +} + +switch_create() +{ + local intf + local vlan + + ip link add dev br10 type bridge + ip link add dev br11 type bridge + + for intf in $swp1 $swp2; do + ip link set dev $intf up + mtu_set $intf 10000 + + for vlan in 10 11; do + vlan_create $intf $vlan + ip link set dev $intf.$vlan master br$vlan + ip link set dev $intf.$vlan up + done + done + + for vlan in 10 11; do + ip link set dev $swp1.$vlan type vlan ingress 0:0 1:1 + done + + ip link set dev br10 up + ip link set dev br11 up +} + +switch_destroy() +{ + local intf + local vlan + + # A test may have been interrupted mid-run, with Qdisc installed. Delete + # it here. + tc qdisc del dev $swp2 root 2>/dev/null + + ip link set dev br11 down + ip link set dev br10 down + + for intf in $swp2 $swp1; do + for vlan in 11 10; do + ip link set dev $intf.$vlan down + ip link set dev $intf.$vlan nomaster + vlan_destroy $intf $vlan + done + + mtu_restore $intf + ip link set dev $intf down + done + + ip link del dev br11 + ip link del dev br10 +} + +setup_prepare() +{ + h1=${NETIFS[p1]} + swp1=${NETIFS[p2]} + + swp2=${NETIFS[p3]} + h2=${NETIFS[p4]} + + swp3=${NETIFS[p5]} + h3=${NETIFS[p6]} + + swp4=${NETIFS[p7]} + swp5=${NETIFS[p8]} + + h2_mac=$(mac_get $h2) + + vrf_prepare + + h1_create + h2_create + switch_create +} + +cleanup() +{ + pre_cleanup + + switch_destroy + h2_destroy + h1_destroy + + vrf_cleanup +} + +ping_ipv4() +{ + ping_test $h1.10 $(ipaddr 2 10) " vlan 10" + ping_test $h1.11 $(ipaddr 2 11) " vlan 11" +} + +tbf_get_counter() +{ + local vlan=$1; shift + + tc_rule_stats_get $h2 10$vlan ingress .bytes +} + +do_tbf_test() +{ + local vlan=$1; shift + local mbit=$1; shift + + start_traffic $h1.$vlan $(ipaddr 1 $vlan) $(ipaddr 2 $vlan) $h2_mac + sleep 5 # Wait for the burst to dwindle + + local t2=$(busywait_for_counter 1000 +1 tbf_get_counter $vlan) + sleep 10 + local t3=$(tbf_get_counter $vlan) + stop_traffic + + RET=0 + + # Note: TBF uses 10^6 Mbits, not 2^20 ones. + local er=$((mbit * 1000 * 1000)) + local nr=$(rate $t2 $t3 10) + local nr_pct=$((100 * (nr - er) / er)) + ((-5 <= nr_pct && nr_pct <= 5)) + check_err $? "Expected rate $(humanize $er), got $(humanize $nr), which is $nr_pct% off. Required accuracy is +-5%." + + log_test "TC $((vlan - 10)): TBF rate ${mbit}Mbit" +} diff --git a/tools/testing/selftests/net/forwarding/sch_tbf_ets.sh b/tools/testing/selftests/net/forwarding/sch_tbf_ets.sh new file mode 100755 index 000000000000..84fb6cab88e4 --- /dev/null +++ b/tools/testing/selftests/net/forwarding/sch_tbf_ets.sh @@ -0,0 +1,6 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +QDISC="ets strict" +: ${lib_dir:=.} +source $lib_dir/sch_tbf_etsprio.sh diff --git a/tools/testing/selftests/net/forwarding/sch_tbf_etsprio.sh b/tools/testing/selftests/net/forwarding/sch_tbf_etsprio.sh new file mode 100644 index 000000000000..8bd85da1905a --- /dev/null +++ b/tools/testing/selftests/net/forwarding/sch_tbf_etsprio.sh @@ -0,0 +1,39 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +ALL_TESTS=" + ping_ipv4 + tbf_test +" +source $lib_dir/sch_tbf_core.sh + +tbf_test_one() +{ + local bs=$1; shift + + tc qdisc replace dev $swp2 parent 10:3 handle 103: tbf \ + rate 400Mbit burst $bs limit 1M + tc qdisc replace dev $swp2 parent 10:2 handle 102: tbf \ + rate 800Mbit burst $bs limit 1M + + do_tbf_test 10 400 $bs + do_tbf_test 11 800 $bs +} + +tbf_test() +{ + # This test is used for both ETS and PRIO. Even though we only need two + # bands, PRIO demands a minimum of three. + tc qdisc add dev $swp2 root handle 10: $QDISC 3 priomap 2 1 0 + tbf_test_one 128K + tc qdisc del dev $swp2 root +} + +trap cleanup EXIT + +setup_prepare +setup_wait + +tests_run + +exit $EXIT_STATUS diff --git a/tools/testing/selftests/net/forwarding/sch_tbf_prio.sh b/tools/testing/selftests/net/forwarding/sch_tbf_prio.sh new file mode 100755 index 000000000000..9c8cb1cb9ba4 --- /dev/null +++ b/tools/testing/selftests/net/forwarding/sch_tbf_prio.sh @@ -0,0 +1,6 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +QDISC="prio bands" +: ${lib_dir:=.} +source $lib_dir/sch_tbf_etsprio.sh diff --git a/tools/testing/selftests/net/forwarding/sch_tbf_root.sh b/tools/testing/selftests/net/forwarding/sch_tbf_root.sh new file mode 100755 index 000000000000..72aa21ba88c7 --- /dev/null +++ b/tools/testing/selftests/net/forwarding/sch_tbf_root.sh @@ -0,0 +1,33 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +ALL_TESTS=" + ping_ipv4 + tbf_test +" +: ${lib_dir:=.} +source $lib_dir/sch_tbf_core.sh + +tbf_test_one() +{ + local bs=$1; shift + + tc qdisc replace dev $swp2 root handle 108: tbf \ + rate 400Mbit burst $bs limit 1M + do_tbf_test 10 400 $bs +} + +tbf_test() +{ + tbf_test_one 128K + tc qdisc del dev $swp2 root +} + +trap cleanup EXIT + +setup_prepare +setup_wait + +tests_run + +exit $EXIT_STATUS diff --git a/tools/testing/selftests/net/mptcp/.gitignore b/tools/testing/selftests/net/mptcp/.gitignore new file mode 100644 index 000000000000..d72f07642738 --- /dev/null +++ b/tools/testing/selftests/net/mptcp/.gitignore @@ -0,0 +1,2 @@ +mptcp_connect +*.pcap diff --git a/tools/testing/selftests/net/mptcp/Makefile b/tools/testing/selftests/net/mptcp/Makefile new file mode 100644 index 000000000000..93de52016dde --- /dev/null +++ b/tools/testing/selftests/net/mptcp/Makefile @@ -0,0 +1,13 @@ +# SPDX-License-Identifier: GPL-2.0 + +top_srcdir = ../../../../.. + +CFLAGS = -Wall -Wl,--no-as-needed -O2 -g + +TEST_PROGS := mptcp_connect.sh + +TEST_GEN_FILES = mptcp_connect + +EXTRA_CLEAN := *.pcap + +include ../../lib.mk diff --git a/tools/testing/selftests/net/mptcp/config b/tools/testing/selftests/net/mptcp/config new file mode 100644 index 000000000000..2499824d9e1c --- /dev/null +++ b/tools/testing/selftests/net/mptcp/config @@ -0,0 +1,4 @@ +CONFIG_MPTCP=y +CONFIG_MPTCP_IPV6=y +CONFIG_VETH=y +CONFIG_NET_SCH_NETEM=m diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c new file mode 100644 index 000000000000..a3dccd816ae4 --- /dev/null +++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c @@ -0,0 +1,832 @@ +// SPDX-License-Identifier: GPL-2.0 + +#define _GNU_SOURCE + +#include <errno.h> +#include <limits.h> +#include <fcntl.h> +#include <string.h> +#include <stdbool.h> +#include <stdint.h> +#include <stdio.h> +#include <stdlib.h> +#include <strings.h> +#include <unistd.h> + +#include <sys/poll.h> +#include <sys/sendfile.h> +#include <sys/stat.h> +#include <sys/socket.h> +#include <sys/types.h> +#include <sys/mman.h> + +#include <netdb.h> +#include <netinet/in.h> + +#include <linux/tcp.h> + +extern int optind; + +#ifndef IPPROTO_MPTCP +#define IPPROTO_MPTCP 262 +#endif +#ifndef TCP_ULP +#define TCP_ULP 31 +#endif + +static bool listen_mode; +static int poll_timeout; + +enum cfg_mode { + CFG_MODE_POLL, + CFG_MODE_MMAP, + CFG_MODE_SENDFILE, +}; + +static enum cfg_mode cfg_mode = CFG_MODE_POLL; +static const char *cfg_host; +static const char *cfg_port = "12000"; +static int cfg_sock_proto = IPPROTO_MPTCP; +static bool tcpulp_audit; +static int pf = AF_INET; +static int cfg_sndbuf; + +static void die_usage(void) +{ + fprintf(stderr, "Usage: mptcp_connect [-6] [-u] [-s MPTCP|TCP] [-p port] -m mode]" + "[ -l ] [ -t timeout ] connect_address\n"); + exit(1); +} + +static const char *getxinfo_strerr(int err) +{ + if (err == EAI_SYSTEM) + return strerror(errno); + + return gai_strerror(err); +} + +static void xgetnameinfo(const struct sockaddr *addr, socklen_t addrlen, + char *host, socklen_t hostlen, + char *serv, socklen_t servlen) +{ + int flags = NI_NUMERICHOST | NI_NUMERICSERV; + int err = getnameinfo(addr, addrlen, host, hostlen, serv, servlen, + flags); + + if (err) { + const char *errstr = getxinfo_strerr(err); + + fprintf(stderr, "Fatal: getnameinfo: %s\n", errstr); + exit(1); + } +} + +static void xgetaddrinfo(const char *node, const char *service, + const struct addrinfo *hints, + struct addrinfo **res) +{ + int err = getaddrinfo(node, service, hints, res); + + if (err) { + const char *errstr = getxinfo_strerr(err); + + fprintf(stderr, "Fatal: getaddrinfo(%s:%s): %s\n", + node ? node : "", service ? service : "", errstr); + exit(1); + } +} + +static void set_sndbuf(int fd, unsigned int size) +{ + int err; + + err = setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &size, sizeof(size)); + if (err) { + perror("set SO_SNDBUF"); + exit(1); + } +} + +static int sock_listen_mptcp(const char * const listenaddr, + const char * const port) +{ + int sock; + struct addrinfo hints = { + .ai_protocol = IPPROTO_TCP, + .ai_socktype = SOCK_STREAM, + .ai_flags = AI_PASSIVE | AI_NUMERICHOST + }; + + hints.ai_family = pf; + + struct addrinfo *a, *addr; + int one = 1; + + xgetaddrinfo(listenaddr, port, &hints, &addr); + hints.ai_family = pf; + + for (a = addr; a; a = a->ai_next) { + sock = socket(a->ai_family, a->ai_socktype, cfg_sock_proto); + if (sock < 0) + continue; + + if (-1 == setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &one, + sizeof(one))) + perror("setsockopt"); + + if (bind(sock, a->ai_addr, a->ai_addrlen) == 0) + break; /* success */ + + perror("bind"); + close(sock); + sock = -1; + } + + freeaddrinfo(addr); + + if (sock < 0) { + fprintf(stderr, "Could not create listen socket\n"); + return sock; + } + + if (listen(sock, 20)) { + perror("listen"); + close(sock); + return -1; + } + + return sock; +} + +static bool sock_test_tcpulp(const char * const remoteaddr, + const char * const port) +{ + struct addrinfo hints = { + .ai_protocol = IPPROTO_TCP, + .ai_socktype = SOCK_STREAM, + }; + struct addrinfo *a, *addr; + int sock = -1, ret = 0; + bool test_pass = false; + + hints.ai_family = AF_INET; + + xgetaddrinfo(remoteaddr, port, &hints, &addr); + for (a = addr; a; a = a->ai_next) { + sock = socket(a->ai_family, a->ai_socktype, IPPROTO_TCP); + if (sock < 0) { + perror("socket"); + continue; + } + ret = setsockopt(sock, IPPROTO_TCP, TCP_ULP, "mptcp", + sizeof("mptcp")); + if (ret == -1 && errno == EOPNOTSUPP) + test_pass = true; + close(sock); + + if (test_pass) + break; + if (!ret) + fprintf(stderr, + "setsockopt(TCP_ULP) returned 0\n"); + else + perror("setsockopt(TCP_ULP)"); + } + return test_pass; +} + +static int sock_connect_mptcp(const char * const remoteaddr, + const char * const port, int proto) +{ + struct addrinfo hints = { + .ai_protocol = IPPROTO_TCP, + .ai_socktype = SOCK_STREAM, + }; + struct addrinfo *a, *addr; + int sock = -1; + + hints.ai_family = pf; + + xgetaddrinfo(remoteaddr, port, &hints, &addr); + for (a = addr; a; a = a->ai_next) { + sock = socket(a->ai_family, a->ai_socktype, proto); + if (sock < 0) { + perror("socket"); + continue; + } + + if (connect(sock, a->ai_addr, a->ai_addrlen) == 0) + break; /* success */ + + perror("connect()"); + close(sock); + sock = -1; + } + + freeaddrinfo(addr); + return sock; +} + +static size_t do_rnd_write(const int fd, char *buf, const size_t len) +{ + unsigned int do_w; + ssize_t bw; + + do_w = rand() & 0xffff; + if (do_w == 0 || do_w > len) + do_w = len; + + bw = write(fd, buf, do_w); + if (bw < 0) + perror("write"); + + return bw; +} + +static size_t do_write(const int fd, char *buf, const size_t len) +{ + size_t offset = 0; + + while (offset < len) { + size_t written; + ssize_t bw; + + bw = write(fd, buf + offset, len - offset); + if (bw < 0) { + perror("write"); + return 0; + } + + written = (size_t)bw; + offset += written; + } + + return offset; +} + +static ssize_t do_rnd_read(const int fd, char *buf, const size_t len) +{ + size_t cap = rand(); + + cap &= 0xffff; + + if (cap == 0) + cap = 1; + else if (cap > len) + cap = len; + + return read(fd, buf, cap); +} + +static void set_nonblock(int fd) +{ + int flags = fcntl(fd, F_GETFL); + + if (flags == -1) + return; + + fcntl(fd, F_SETFL, flags | O_NONBLOCK); +} + +static int copyfd_io_poll(int infd, int peerfd, int outfd) +{ + struct pollfd fds = { + .fd = peerfd, + .events = POLLIN | POLLOUT, + }; + unsigned int woff = 0, wlen = 0; + char wbuf[8192]; + + set_nonblock(peerfd); + + for (;;) { + char rbuf[8192]; + ssize_t len; + + if (fds.events == 0) + break; + + switch (poll(&fds, 1, poll_timeout)) { + case -1: + if (errno == EINTR) + continue; + perror("poll"); + return 1; + case 0: + fprintf(stderr, "%s: poll timed out (events: " + "POLLIN %u, POLLOUT %u)\n", __func__, + fds.events & POLLIN, fds.events & POLLOUT); + return 2; + } + + if (fds.revents & POLLIN) { + len = do_rnd_read(peerfd, rbuf, sizeof(rbuf)); + if (len == 0) { + /* no more data to receive: + * peer has closed its write side + */ + fds.events &= ~POLLIN; + + if ((fds.events & POLLOUT) == 0) + /* and nothing more to send */ + break; + + /* Else, still have data to transmit */ + } else if (len < 0) { + perror("read"); + return 3; + } + + do_write(outfd, rbuf, len); + } + + if (fds.revents & POLLOUT) { + if (wlen == 0) { + woff = 0; + wlen = read(infd, wbuf, sizeof(wbuf)); + } + + if (wlen > 0) { + ssize_t bw; + + bw = do_rnd_write(peerfd, wbuf + woff, wlen); + if (bw < 0) + return 111; + + woff += bw; + wlen -= bw; + } else if (wlen == 0) { + /* We have no more data to send. */ + fds.events &= ~POLLOUT; + + if ((fds.events & POLLIN) == 0) + /* ... and peer also closed already */ + break; + + /* ... but we still receive. + * Close our write side. + */ + shutdown(peerfd, SHUT_WR); + } else { + if (errno == EINTR) + continue; + perror("read"); + return 4; + } + } + + if (fds.revents & (POLLERR | POLLNVAL)) { + fprintf(stderr, "Unexpected revents: " + "POLLERR/POLLNVAL(%x)\n", fds.revents); + return 5; + } + } + + close(peerfd); + return 0; +} + +static int do_recvfile(int infd, int outfd) +{ + ssize_t r; + + do { + char buf[16384]; + + r = do_rnd_read(infd, buf, sizeof(buf)); + if (r > 0) { + if (write(outfd, buf, r) != r) + break; + } else if (r < 0) { + perror("read"); + } + } while (r > 0); + + return (int)r; +} + +static int do_mmap(int infd, int outfd, unsigned int size) +{ + char *inbuf = mmap(NULL, size, PROT_READ, MAP_SHARED, infd, 0); + ssize_t ret = 0, off = 0; + size_t rem; + + if (inbuf == MAP_FAILED) { + perror("mmap"); + return 1; + } + + rem = size; + + while (rem > 0) { + ret = write(outfd, inbuf + off, rem); + + if (ret < 0) { + perror("write"); + break; + } + + off += ret; + rem -= ret; + } + + munmap(inbuf, size); + return rem; +} + +static int get_infd_size(int fd) +{ + struct stat sb; + ssize_t count; + int err; + + err = fstat(fd, &sb); + if (err < 0) { + perror("fstat"); + return -1; + } + + if ((sb.st_mode & S_IFMT) != S_IFREG) { + fprintf(stderr, "%s: stdin is not a regular file\n", __func__); + return -2; + } + + count = sb.st_size; + if (count > INT_MAX) { + fprintf(stderr, "File too large: %zu\n", count); + return -3; + } + + return (int)count; +} + +static int do_sendfile(int infd, int outfd, unsigned int count) +{ + while (count > 0) { + ssize_t r; + + r = sendfile(outfd, infd, NULL, count); + if (r < 0) { + perror("sendfile"); + return 3; + } + + count -= r; + } + + return 0; +} + +static int copyfd_io_mmap(int infd, int peerfd, int outfd, + unsigned int size) +{ + int err; + + if (listen_mode) { + err = do_recvfile(peerfd, outfd); + if (err) + return err; + + err = do_mmap(infd, peerfd, size); + } else { + err = do_mmap(infd, peerfd, size); + if (err) + return err; + + shutdown(peerfd, SHUT_WR); + + err = do_recvfile(peerfd, outfd); + } + + return err; +} + +static int copyfd_io_sendfile(int infd, int peerfd, int outfd, + unsigned int size) +{ + int err; + + if (listen_mode) { + err = do_recvfile(peerfd, outfd); + if (err) + return err; + + err = do_sendfile(infd, peerfd, size); + } else { + err = do_sendfile(infd, peerfd, size); + if (err) + return err; + err = do_recvfile(peerfd, outfd); + } + + return err; +} + +static int copyfd_io(int infd, int peerfd, int outfd) +{ + int file_size; + + switch (cfg_mode) { + case CFG_MODE_POLL: + return copyfd_io_poll(infd, peerfd, outfd); + case CFG_MODE_MMAP: + file_size = get_infd_size(infd); + if (file_size < 0) + return file_size; + return copyfd_io_mmap(infd, peerfd, outfd, file_size); + case CFG_MODE_SENDFILE: + file_size = get_infd_size(infd); + if (file_size < 0) + return file_size; + return copyfd_io_sendfile(infd, peerfd, outfd, file_size); + } + + fprintf(stderr, "Invalid mode %d\n", cfg_mode); + + die_usage(); + return 1; +} + +static void check_sockaddr(int pf, struct sockaddr_storage *ss, + socklen_t salen) +{ + struct sockaddr_in6 *sin6; + struct sockaddr_in *sin; + socklen_t wanted_size = 0; + + switch (pf) { + case AF_INET: + wanted_size = sizeof(*sin); + sin = (void *)ss; + if (!sin->sin_port) + fprintf(stderr, "accept: something wrong: ip connection from port 0"); + break; + case AF_INET6: + wanted_size = sizeof(*sin6); + sin6 = (void *)ss; + if (!sin6->sin6_port) + fprintf(stderr, "accept: something wrong: ipv6 connection from port 0"); + break; + default: + fprintf(stderr, "accept: Unknown pf %d, salen %u\n", pf, salen); + return; + } + + if (salen != wanted_size) + fprintf(stderr, "accept: size mismatch, got %d expected %d\n", + (int)salen, wanted_size); + + if (ss->ss_family != pf) + fprintf(stderr, "accept: pf mismatch, expect %d, ss_family is %d\n", + (int)ss->ss_family, pf); +} + +static void check_getpeername(int fd, struct sockaddr_storage *ss, socklen_t salen) +{ + struct sockaddr_storage peerss; + socklen_t peersalen = sizeof(peerss); + + if (getpeername(fd, (struct sockaddr *)&peerss, &peersalen) < 0) { + perror("getpeername"); + return; + } + + if (peersalen != salen) { + fprintf(stderr, "%s: %d vs %d\n", __func__, peersalen, salen); + return; + } + + if (memcmp(ss, &peerss, peersalen)) { + char a[INET6_ADDRSTRLEN]; + char b[INET6_ADDRSTRLEN]; + char c[INET6_ADDRSTRLEN]; + char d[INET6_ADDRSTRLEN]; + + xgetnameinfo((struct sockaddr *)ss, salen, + a, sizeof(a), b, sizeof(b)); + + xgetnameinfo((struct sockaddr *)&peerss, peersalen, + c, sizeof(c), d, sizeof(d)); + + fprintf(stderr, "%s: memcmp failure: accept %s vs peername %s, %s vs %s salen %d vs %d\n", + __func__, a, c, b, d, peersalen, salen); + } +} + +static void check_getpeername_connect(int fd) +{ + struct sockaddr_storage ss; + socklen_t salen = sizeof(ss); + char a[INET6_ADDRSTRLEN]; + char b[INET6_ADDRSTRLEN]; + + if (getpeername(fd, (struct sockaddr *)&ss, &salen) < 0) { + perror("getpeername"); + return; + } + + xgetnameinfo((struct sockaddr *)&ss, salen, + a, sizeof(a), b, sizeof(b)); + + if (strcmp(cfg_host, a) || strcmp(cfg_port, b)) + fprintf(stderr, "%s: %s vs %s, %s vs %s\n", __func__, + cfg_host, a, cfg_port, b); +} + +int main_loop_s(int listensock) +{ + struct sockaddr_storage ss; + struct pollfd polls; + socklen_t salen; + int remotesock; + + polls.fd = listensock; + polls.events = POLLIN; + + switch (poll(&polls, 1, poll_timeout)) { + case -1: + perror("poll"); + return 1; + case 0: + fprintf(stderr, "%s: timed out\n", __func__); + close(listensock); + return 2; + } + + salen = sizeof(ss); + remotesock = accept(listensock, (struct sockaddr *)&ss, &salen); + if (remotesock >= 0) { + check_sockaddr(pf, &ss, salen); + check_getpeername(remotesock, &ss, salen); + + return copyfd_io(0, remotesock, 1); + } + + perror("accept"); + + return 1; +} + +static void init_rng(void) +{ + int fd = open("/dev/urandom", O_RDONLY); + unsigned int foo; + + if (fd > 0) { + int ret = read(fd, &foo, sizeof(foo)); + + if (ret < 0) + srand(fd + foo); + close(fd); + } + + srand(foo); +} + +int main_loop(void) +{ + int fd; + + /* listener is ready. */ + fd = sock_connect_mptcp(cfg_host, cfg_port, cfg_sock_proto); + if (fd < 0) + return 2; + + check_getpeername_connect(fd); + + if (cfg_sndbuf) + set_sndbuf(fd, cfg_sndbuf); + + return copyfd_io(0, fd, 1); +} + +int parse_proto(const char *proto) +{ + if (!strcasecmp(proto, "MPTCP")) + return IPPROTO_MPTCP; + if (!strcasecmp(proto, "TCP")) + return IPPROTO_TCP; + + fprintf(stderr, "Unknown protocol: %s\n.", proto); + die_usage(); + + /* silence compiler warning */ + return 0; +} + +int parse_mode(const char *mode) +{ + if (!strcasecmp(mode, "poll")) + return CFG_MODE_POLL; + if (!strcasecmp(mode, "mmap")) + return CFG_MODE_MMAP; + if (!strcasecmp(mode, "sendfile")) + return CFG_MODE_SENDFILE; + + fprintf(stderr, "Unknown test mode: %s\n", mode); + fprintf(stderr, "Supported modes are:\n"); + fprintf(stderr, "\t\t\"poll\" - interleaved read/write using poll()\n"); + fprintf(stderr, "\t\t\"mmap\" - send entire input file (mmap+write), then read response (-l will read input first)\n"); + fprintf(stderr, "\t\t\"sendfile\" - send entire input file (sendfile), then read response (-l will read input first)\n"); + + die_usage(); + + /* silence compiler warning */ + return 0; +} + +int parse_sndbuf(const char *size) +{ + unsigned long s; + + errno = 0; + + s = strtoul(size, NULL, 0); + + if (errno) { + fprintf(stderr, "Invalid sndbuf size %s (%s)\n", + size, strerror(errno)); + die_usage(); + } + + if (s > INT_MAX) { + fprintf(stderr, "Invalid sndbuf size %s (%s)\n", + size, strerror(ERANGE)); + die_usage(); + } + + cfg_sndbuf = s; + + return 0; +} + +static void parse_opts(int argc, char **argv) +{ + int c; + + while ((c = getopt(argc, argv, "6lp:s:hut:m:b:")) != -1) { + switch (c) { + case 'l': + listen_mode = true; + break; + case 'p': + cfg_port = optarg; + break; + case 's': + cfg_sock_proto = parse_proto(optarg); + break; + case 'h': + die_usage(); + break; + case 'u': + tcpulp_audit = true; + break; + case '6': + pf = AF_INET6; + break; + case 't': + poll_timeout = atoi(optarg) * 1000; + if (poll_timeout <= 0) + poll_timeout = -1; + break; + case 'm': + cfg_mode = parse_mode(optarg); + break; + case 'b': + cfg_sndbuf = parse_sndbuf(optarg); + break; + } + } + + if (optind + 1 != argc) + die_usage(); + cfg_host = argv[optind]; + + if (strchr(cfg_host, ':')) + pf = AF_INET6; +} + +int main(int argc, char *argv[]) +{ + init_rng(); + + parse_opts(argc, argv); + + if (tcpulp_audit) + return sock_test_tcpulp(cfg_host, cfg_port) ? 0 : 1; + + if (listen_mode) { + int fd = sock_listen_mptcp(cfg_host, cfg_port); + + if (fd < 0) + return 1; + + if (cfg_sndbuf) + set_sndbuf(fd, cfg_sndbuf); + + return main_loop_s(fd); + } + + return main_loop(); +} diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.sh b/tools/testing/selftests/net/mptcp/mptcp_connect.sh new file mode 100755 index 000000000000..d573a0feb98d --- /dev/null +++ b/tools/testing/selftests/net/mptcp/mptcp_connect.sh @@ -0,0 +1,595 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +time_start=$(date +%s) + +optstring="b:d:e:l:r:h4cm:" +ret=0 +sin="" +sout="" +cin="" +cout="" +ksft_skip=4 +capture=false +timeout=30 +ipv6=true +ethtool_random_on=true +tc_delay="$((RANDOM%400))" +tc_loss=$((RANDOM%101)) +tc_reorder="" +testmode="" +sndbuf=0 +options_log=true + +if [ $tc_loss -eq 100 ];then + tc_loss=1% +elif [ $tc_loss -ge 10 ]; then + tc_loss=0.$tc_loss% +elif [ $tc_loss -ge 1 ]; then + tc_loss=0.0$tc_loss% +else + tc_loss="" +fi + +usage() { + echo "Usage: $0 [ -a ]" + echo -e "\t-d: tc/netem delay in milliseconds, e.g. \"-d 10\" (default random)" + echo -e "\t-l: tc/netem loss percentage, e.g. \"-l 0.02\" (default random)" + echo -e "\t-r: tc/netem reorder mode, e.g. \"-r 25% 50% gap 5\", use "-r 0" to disable reordering (default random)" + echo -e "\t-e: ethtool features to disable, e.g.: \"-e tso -e gso\" (default: randomly disable any of tso/gso/gro)" + echo -e "\t-4: IPv4 only: disable IPv6 tests (default: test both IPv4 and IPv6)" + echo -e "\t-c: capture packets for each test using tcpdump (default: no capture)" + echo -e "\t-b: set sndbuf value (default: use kernel default)" + echo -e "\t-m: test mode (poll, sendfile; default: poll)" +} + +while getopts "$optstring" option;do + case "$option" in + "h") + usage $0 + exit 0 + ;; + "d") + if [ $OPTARG -ge 0 ];then + tc_delay="$OPTARG" + else + echo "-d requires numeric argument, got \"$OPTARG\"" 1>&2 + exit 1 + fi + ;; + "e") + ethtool_args="$ethtool_args $OPTARG off" + ethtool_random_on=false + ;; + "l") + tc_loss="$OPTARG" + ;; + "r") + tc_reorder="$OPTARG" + ;; + "4") + ipv6=false + ;; + "c") + capture=true + ;; + "b") + if [ $OPTARG -ge 0 ];then + sndbuf="$OPTARG" + else + echo "-s requires numeric argument, got \"$OPTARG\"" 1>&2 + exit 1 + fi + ;; + "m") + testmode="$OPTARG" + ;; + "?") + usage $0 + exit 1 + ;; + esac +done + +sec=$(date +%s) +rndh=$(printf %x $sec)-$(mktemp -u XXXXXX) +ns1="ns1-$rndh" +ns2="ns2-$rndh" +ns3="ns3-$rndh" +ns4="ns4-$rndh" + +TEST_COUNT=0 + +cleanup() +{ + rm -f "$cin" "$cout" + rm -f "$sin" "$sout" + rm -f "$capout" + + local netns + for netns in "$ns1" "$ns2" "$ns3" "$ns4";do + ip netns del $netns + done +} + +ip -Version > /dev/null 2>&1 +if [ $? -ne 0 ];then + echo "SKIP: Could not run test without ip tool" + exit $ksft_skip +fi + +sin=$(mktemp) +sout=$(mktemp) +cin=$(mktemp) +cout=$(mktemp) +capout=$(mktemp) +trap cleanup EXIT + +for i in "$ns1" "$ns2" "$ns3" "$ns4";do + ip netns add $i || exit $ksft_skip + ip -net $i link set lo up +done + +# "$ns1" ns2 ns3 ns4 +# ns1eth2 ns2eth1 ns2eth3 ns3eth2 ns3eth4 ns4eth3 +# - drop 1% -> reorder 25% +# <- TSO off - + +ip link add ns1eth2 netns "$ns1" type veth peer name ns2eth1 netns "$ns2" +ip link add ns2eth3 netns "$ns2" type veth peer name ns3eth2 netns "$ns3" +ip link add ns3eth4 netns "$ns3" type veth peer name ns4eth3 netns "$ns4" + +ip -net "$ns1" addr add 10.0.1.1/24 dev ns1eth2 +ip -net "$ns1" addr add dead:beef:1::1/64 dev ns1eth2 nodad + +ip -net "$ns1" link set ns1eth2 up +ip -net "$ns1" route add default via 10.0.1.2 +ip -net "$ns1" route add default via dead:beef:1::2 + +ip -net "$ns2" addr add 10.0.1.2/24 dev ns2eth1 +ip -net "$ns2" addr add dead:beef:1::2/64 dev ns2eth1 nodad +ip -net "$ns2" link set ns2eth1 up + +ip -net "$ns2" addr add 10.0.2.1/24 dev ns2eth3 +ip -net "$ns2" addr add dead:beef:2::1/64 dev ns2eth3 nodad +ip -net "$ns2" link set ns2eth3 up +ip -net "$ns2" route add default via 10.0.2.2 +ip -net "$ns2" route add default via dead:beef:2::2 +ip netns exec "$ns2" sysctl -q net.ipv4.ip_forward=1 +ip netns exec "$ns2" sysctl -q net.ipv6.conf.all.forwarding=1 + +ip -net "$ns3" addr add 10.0.2.2/24 dev ns3eth2 +ip -net "$ns3" addr add dead:beef:2::2/64 dev ns3eth2 nodad +ip -net "$ns3" link set ns3eth2 up + +ip -net "$ns3" addr add 10.0.3.2/24 dev ns3eth4 +ip -net "$ns3" addr add dead:beef:3::2/64 dev ns3eth4 nodad +ip -net "$ns3" link set ns3eth4 up +ip -net "$ns3" route add default via 10.0.2.1 +ip -net "$ns3" route add default via dead:beef:2::1 +ip netns exec "$ns3" sysctl -q net.ipv4.ip_forward=1 +ip netns exec "$ns3" sysctl -q net.ipv6.conf.all.forwarding=1 + +ip -net "$ns4" addr add 10.0.3.1/24 dev ns4eth3 +ip -net "$ns4" addr add dead:beef:3::1/64 dev ns4eth3 nodad +ip -net "$ns4" link set ns4eth3 up +ip -net "$ns4" route add default via 10.0.3.2 +ip -net "$ns4" route add default via dead:beef:3::2 + +set_ethtool_flags() { + local ns="$1" + local dev="$2" + local flags="$3" + + ip netns exec $ns ethtool -K $dev $flags 2>/dev/null + [ $? -eq 0 ] && echo "INFO: set $ns dev $dev: ethtool -K $flags" +} + +set_random_ethtool_flags() { + local flags="" + local r=$RANDOM + + local pick1=$((r & 1)) + local pick2=$((r & 2)) + local pick3=$((r & 4)) + + [ $pick1 -ne 0 ] && flags="tso off" + [ $pick2 -ne 0 ] && flags="$flags gso off" + [ $pick3 -ne 0 ] && flags="$flags gro off" + + [ -z "$flags" ] && return + + set_ethtool_flags "$1" "$2" "$flags" +} + +if $ethtool_random_on;then + set_random_ethtool_flags "$ns3" ns3eth2 + set_random_ethtool_flags "$ns4" ns4eth3 +else + set_ethtool_flags "$ns3" ns3eth2 "$ethtool_args" + set_ethtool_flags "$ns4" ns4eth3 "$ethtool_args" +fi + +print_file_err() +{ + ls -l "$1" 1>&2 + echo "Trailing bytes are: " + tail -c 27 "$1" +} + +check_transfer() +{ + local in=$1 + local out=$2 + local what=$3 + + cmp "$in" "$out" > /dev/null 2>&1 + if [ $? -ne 0 ] ;then + echo "[ FAIL ] $what does not match (in, out):" + print_file_err "$in" + print_file_err "$out" + + return 1 + fi + + return 0 +} + +check_mptcp_disabled() +{ + local disabled_ns + disabled_ns="ns_disabled-$sech-$(mktemp -u XXXXXX)" + ip netns add ${disabled_ns} || exit $ksft_skip + + # net.mptcp.enabled should be enabled by default + if [ "$(ip netns exec ${disabled_ns} sysctl net.mptcp.enabled | awk '{ print $3 }')" -ne 1 ]; then + echo -e "net.mptcp.enabled sysctl is not 1 by default\t\t[ FAIL ]" + ret=1 + return 1 + fi + ip netns exec ${disabled_ns} sysctl -q net.mptcp.enabled=0 + + local err=0 + LANG=C ip netns exec ${disabled_ns} ./mptcp_connect -t $timeout -p 10000 -s MPTCP 127.0.0.1 < "$cin" 2>&1 | \ + grep -q "^socket: Protocol not available$" && err=1 + ip netns delete ${disabled_ns} + + if [ ${err} -eq 0 ]; then + echo -e "New MPTCP socket cannot be blocked via sysctl\t\t[ FAIL ]" + ret=1 + return 1 + fi + + echo -e "New MPTCP socket can be blocked via sysctl\t\t[ OK ]" + return 0 +} + +check_mptcp_ulp_setsockopt() +{ + local t retval + t="ns_ulp-$sech-$(mktemp -u XXXXXX)" + + ip netns add ${t} || exit $ksft_skip + if ! ip netns exec ${t} ./mptcp_connect -u -p 10000 -s TCP 127.0.0.1 2>&1; then + printf "setsockopt(..., TCP_ULP, \"mptcp\", ...) allowed\t[ FAIL ]\n" + retval=1 + ret=$retval + else + printf "setsockopt(..., TCP_ULP, \"mptcp\", ...) blocked\t[ OK ]\n" + retval=0 + fi + ip netns del ${t} + return $retval +} + +# $1: IP address +is_v6() +{ + [ -z "${1##*:*}" ] +} + +do_ping() +{ + local listener_ns="$1" + local connector_ns="$2" + local connect_addr="$3" + local ping_args="-q -c 1" + + if is_v6 "${connect_addr}"; then + $ipv6 || return 0 + ping_args="${ping_args} -6" + fi + + ip netns exec ${connector_ns} ping ${ping_args} $connect_addr >/dev/null + if [ $? -ne 0 ] ; then + echo "$listener_ns -> $connect_addr connectivity [ FAIL ]" 1>&2 + ret=1 + + return 1 + fi + + return 0 +} + +# $1: ns, $2: port +wait_local_port_listen() +{ + local listener_ns="${1}" + local port="${2}" + + local port_hex i + + port_hex="$(printf "%04X" "${port}")" + for i in $(seq 10); do + ip netns exec "${listener_ns}" cat /proc/net/tcp* | \ + awk "BEGIN {rc=1} {if (\$2 ~ /:${port_hex}\$/ && \$4 ~ /0A/) {rc=0; exit}} END {exit rc}" && + break + sleep 0.1 + done +} + +do_transfer() +{ + local listener_ns="$1" + local connector_ns="$2" + local cl_proto="$3" + local srv_proto="$4" + local connect_addr="$5" + local local_addr="$6" + local extra_args="" + + local port + port=$((10000+$TEST_COUNT)) + TEST_COUNT=$((TEST_COUNT+1)) + + if [ "$sndbuf" -gt 0 ]; then + extra_args="$extra_args -b $sndbuf" + fi + + if [ -n "$testmode" ]; then + extra_args="$extra_args -m $testmode" + fi + + if [ -n "$extra_args" ] && $options_log; then + options_log=false + echo "INFO: extra options: $extra_args" + fi + + :> "$cout" + :> "$sout" + :> "$capout" + + local addr_port + addr_port=$(printf "%s:%d" ${connect_addr} ${port}) + printf "%.3s %-5s -> %.3s (%-20s) %-5s\t" ${connector_ns} ${cl_proto} ${listener_ns} ${addr_port} ${srv_proto} + + if $capture; then + local capuser + if [ -z $SUDO_USER ] ; then + capuser="" + else + capuser="-Z $SUDO_USER" + fi + + local capfile="${listener_ns}-${connector_ns}-${cl_proto}-${srv_proto}-${connect_addr}.pcap" + + ip netns exec ${listener_ns} tcpdump -i any -s 65535 -B 32768 $capuser -w $capfile > "$capout" 2>&1 & + local cappid=$! + + sleep 1 + fi + + ip netns exec ${listener_ns} ./mptcp_connect -t $timeout -l -p $port -s ${srv_proto} $extra_args $local_addr < "$sin" > "$sout" & + local spid=$! + + wait_local_port_listen "${listener_ns}" "${port}" + + local start + start=$(date +%s%3N) + ip netns exec ${connector_ns} ./mptcp_connect -t $timeout -p $port -s ${cl_proto} $extra_args $connect_addr < "$cin" > "$cout" & + local cpid=$! + + wait $cpid + local retc=$? + wait $spid + local rets=$? + + local stop + stop=$(date +%s%3N) + + if $capture; then + sleep 1 + kill $cappid + fi + + local duration + duration=$((stop-start)) + duration=$(printf "(duration %05sms)" $duration) + if [ ${rets} -ne 0 ] || [ ${retc} -ne 0 ]; then + echo "$duration [ FAIL ] client exit code $retc, server $rets" 1>&2 + echo "\nnetns ${listener_ns} socket stat for $port:" 1>&2 + ip netns exec ${listener_ns} ss -nita 1>&2 -o "sport = :$port" + echo "\nnetns ${connector_ns} socket stat for $port:" 1>&2 + ip netns exec ${connector_ns} ss -nita 1>&2 -o "dport = :$port" + + cat "$capout" + return 1 + fi + + check_transfer $sin $cout "file received by client" + retc=$? + check_transfer $cin $sout "file received by server" + rets=$? + + if [ $retc -eq 0 ] && [ $rets -eq 0 ];then + echo "$duration [ OK ]" + cat "$capout" + return 0 + fi + + cat "$capout" + return 1 +} + +make_file() +{ + local name=$1 + local who=$2 + + local SIZE TSIZE + SIZE=$((RANDOM % (1024 * 8))) + TSIZE=$((SIZE * 1024)) + + dd if=/dev/urandom of="$name" bs=1024 count=$SIZE 2> /dev/null + + SIZE=$((RANDOM % 1024)) + SIZE=$((SIZE + 128)) + TSIZE=$((TSIZE + SIZE)) + dd if=/dev/urandom conv=notrunc of="$name" bs=1 count=$SIZE 2> /dev/null + echo -e "\nMPTCP_TEST_FILE_END_MARKER" >> "$name" + + echo "Created $name (size $TSIZE) containing data sent by $who" +} + +run_tests_lo() +{ + local listener_ns="$1" + local connector_ns="$2" + local connect_addr="$3" + local loopback="$4" + local lret=0 + + # skip if test programs are running inside same netns for subsequent runs. + if [ $loopback -eq 0 ] && [ ${listener_ns} = ${connector_ns} ]; then + return 0 + fi + + # skip if we don't want v6 + if ! $ipv6 && is_v6 "${connect_addr}"; then + return 0 + fi + + local local_addr + if is_v6 "${connect_addr}"; then + local_addr="::" + else + local_addr="0.0.0.0" + fi + + do_transfer ${listener_ns} ${connector_ns} MPTCP MPTCP ${connect_addr} ${local_addr} + lret=$? + if [ $lret -ne 0 ]; then + ret=$lret + return 1 + fi + + # don't bother testing fallback tcp except for loopback case. + if [ ${listener_ns} != ${connector_ns} ]; then + return 0 + fi + + do_transfer ${listener_ns} ${connector_ns} MPTCP TCP ${connect_addr} ${local_addr} + lret=$? + if [ $lret -ne 0 ]; then + ret=$lret + return 1 + fi + + do_transfer ${listener_ns} ${connector_ns} TCP MPTCP ${connect_addr} ${local_addr} + lret=$? + if [ $lret -ne 0 ]; then + ret=$lret + return 1 + fi + + return 0 +} + +run_tests() +{ + run_tests_lo $1 $2 $3 0 +} + +make_file "$cin" "client" +make_file "$sin" "server" + +check_mptcp_disabled + +check_mptcp_ulp_setsockopt + +echo "INFO: validating network environment with pings" +for sender in "$ns1" "$ns2" "$ns3" "$ns4";do + do_ping "$ns1" $sender 10.0.1.1 + do_ping "$ns1" $sender dead:beef:1::1 + + do_ping "$ns2" $sender 10.0.1.2 + do_ping "$ns2" $sender dead:beef:1::2 + do_ping "$ns2" $sender 10.0.2.1 + do_ping "$ns2" $sender dead:beef:2::1 + + do_ping "$ns3" $sender 10.0.2.2 + do_ping "$ns3" $sender dead:beef:2::2 + do_ping "$ns3" $sender 10.0.3.2 + do_ping "$ns3" $sender dead:beef:3::2 + + do_ping "$ns4" $sender 10.0.3.1 + do_ping "$ns4" $sender dead:beef:3::1 +done + +[ -n "$tc_loss" ] && tc -net "$ns2" qdisc add dev ns2eth3 root netem loss random $tc_loss +echo -n "INFO: Using loss of $tc_loss " +test "$tc_delay" -gt 0 && echo -n "delay $tc_delay ms " + +if [ -z "${tc_reorder}" ]; then + reorder1=$((RANDOM%10)) + reorder1=$((100 - reorder1)) + reorder2=$((RANDOM%100)) + + if [ $tc_delay -gt 0 ] && [ $reorder1 -lt 100 ] && [ $reorder2 -gt 0 ]; then + tc_reorder="reorder ${reorder1}% ${reorder2}%" + echo -n "$tc_reorder " + fi +elif [ "$tc_reorder" = "0" ];then + tc_reorder="" +elif [ "$tc_delay" -gt 0 ];then + # reordering requires some delay + tc_reorder="reorder $tc_reorder" + echo -n "$tc_reorder " +fi + +echo "on ns3eth4" + +tc -net "$ns3" qdisc add dev ns3eth4 root netem delay ${tc_delay}ms $tc_reorder + +for sender in $ns1 $ns2 $ns3 $ns4;do + run_tests_lo "$ns1" "$sender" 10.0.1.1 1 + if [ $ret -ne 0 ] ;then + echo "FAIL: Could not even run loopback test" 1>&2 + exit $ret + fi + run_tests_lo "$ns1" $sender dead:beef:1::1 1 + if [ $ret -ne 0 ] ;then + echo "FAIL: Could not even run loopback v6 test" 2>&1 + exit $ret + fi + + run_tests "$ns2" $sender 10.0.1.2 + run_tests "$ns2" $sender dead:beef:1::2 + run_tests "$ns2" $sender 10.0.2.1 + run_tests "$ns2" $sender dead:beef:2::1 + + run_tests "$ns3" $sender 10.0.2.2 + run_tests "$ns3" $sender dead:beef:2::2 + run_tests "$ns3" $sender 10.0.3.2 + run_tests "$ns3" $sender dead:beef:3::2 + + run_tests "$ns4" $sender 10.0.3.1 + run_tests "$ns4" $sender dead:beef:3::1 +done + +time_end=$(date +%s) +time_run=$((time_end-time_start)) + +echo "Time: ${time_run} seconds" + +exit $ret diff --git a/tools/testing/selftests/net/mptcp/settings b/tools/testing/selftests/net/mptcp/settings new file mode 100644 index 000000000000..026384c189c9 --- /dev/null +++ b/tools/testing/selftests/net/mptcp/settings @@ -0,0 +1 @@ +timeout=450 diff --git a/tools/testing/selftests/netfilter/Makefile b/tools/testing/selftests/netfilter/Makefile index de1032b5ddea..08194aa44006 100644 --- a/tools/testing/selftests/netfilter/Makefile +++ b/tools/testing/selftests/netfilter/Makefile @@ -2,6 +2,7 @@ # Makefile for netfilter selftests TEST_PROGS := nft_trans_stress.sh nft_nat.sh bridge_brouter.sh \ - conntrack_icmp_related.sh nft_flowtable.sh ipvs.sh + conntrack_icmp_related.sh nft_flowtable.sh ipvs.sh \ + nft_concat_range.sh include ../lib.mk diff --git a/tools/testing/selftests/netfilter/nft_concat_range.sh b/tools/testing/selftests/netfilter/nft_concat_range.sh new file mode 100755 index 000000000000..aca21dde102a --- /dev/null +++ b/tools/testing/selftests/netfilter/nft_concat_range.sh @@ -0,0 +1,1481 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0 +# +# nft_concat_range.sh - Tests for sets with concatenation of ranged fields +# +# Copyright (c) 2019 Red Hat GmbH +# +# Author: Stefano Brivio <sbrivio@redhat.com> +# +# shellcheck disable=SC2154,SC2034,SC2016,SC2030,SC2031 +# ^ Configuration and templates sourced with eval, counters reused in subshells + +KSELFTEST_SKIP=4 + +# Available test groups: +# - correctness: check that packets match given entries, and only those +# - concurrency: attempt races between insertion, deletion and lookup +# - timeout: check that packets match entries until they expire +# - performance: estimate matching rate, compare with rbtree and hash baselines +TESTS="correctness concurrency timeout" +[ "${quicktest}" != "1" ] && TESTS="${TESTS} performance" + +# Set types, defined by TYPE_ variables below +TYPES="net_port port_net net6_port port_proto net6_port_mac net6_port_mac_proto + net_port_net net_mac net_mac_icmp net6_mac_icmp net6_port_net6_port + net_port_mac_proto_net" + +# List of possible paths to pktgen script from kernel tree for performance tests +PKTGEN_SCRIPT_PATHS=" + ../../../samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh + pktgen/pktgen_bench_xmit_mode_netif_receive.sh" + +# Definition of set types: +# display display text for test report +# type_spec nftables set type specifier +# chain_spec nftables type specifier for rules mapping to set +# dst call sequence of format_*() functions for destination fields +# src call sequence of format_*() functions for source fields +# start initial integer used to generate addresses and ports +# count count of entries to generate and match +# src_delta number summed to destination generator for source fields +# tools list of tools for correctness and timeout tests, any can be used +# proto L4 protocol of test packets +# +# race_repeat race attempts per thread, 0 disables concurrency test for type +# flood_tools list of tools for concurrency tests, any can be used +# flood_proto L4 protocol of test packets for concurrency tests +# flood_spec nftables type specifier for concurrency tests +# +# perf_duration duration of single pktgen injection test +# perf_spec nftables type specifier for performance tests +# perf_dst format_*() functions for destination fields in performance test +# perf_src format_*() functions for source fields in performance test +# perf_entries number of set entries for performance test +# perf_proto L3 protocol of test packets +TYPE_net_port=" +display net,port +type_spec ipv4_addr . inet_service +chain_spec ip daddr . udp dport +dst addr4 port +src +start 1 +count 5 +src_delta 2000 +tools sendip nc bash +proto udp + +race_repeat 3 +flood_tools iperf3 iperf netperf +flood_proto udp +flood_spec ip daddr . udp dport + +perf_duration 5 +perf_spec ip daddr . udp dport +perf_dst addr4 port +perf_src +perf_entries 1000 +perf_proto ipv4 +" + +TYPE_port_net=" +display port,net +type_spec inet_service . ipv4_addr +chain_spec udp dport . ip daddr +dst port addr4 +src +start 1 +count 5 +src_delta 2000 +tools sendip nc bash +proto udp + +race_repeat 3 +flood_tools iperf3 iperf netperf +flood_proto udp +flood_spec udp dport . ip daddr + +perf_duration 5 +perf_spec udp dport . ip daddr +perf_dst port addr4 +perf_src +perf_entries 100 +perf_proto ipv4 +" + +TYPE_net6_port=" +display net6,port +type_spec ipv6_addr . inet_service +chain_spec ip6 daddr . udp dport +dst addr6 port +src +start 10 +count 5 +src_delta 2000 +tools sendip nc bash +proto udp6 + +race_repeat 3 +flood_tools iperf3 iperf netperf +flood_proto tcp6 +flood_spec ip6 daddr . udp dport + +perf_duration 5 +perf_spec ip6 daddr . udp dport +perf_dst addr6 port +perf_src +perf_entries 1000 +perf_proto ipv6 +" + +TYPE_port_proto=" +display port,proto +type_spec inet_service . inet_proto +chain_spec udp dport . meta l4proto +dst port proto +src +start 1 +count 5 +src_delta 2000 +tools sendip nc bash +proto udp + +race_repeat 0 + +perf_duration 5 +perf_spec udp dport . meta l4proto +perf_dst port proto +perf_src +perf_entries 30000 +perf_proto ipv4 +" + +TYPE_net6_port_mac=" +display net6,port,mac +type_spec ipv6_addr . inet_service . ether_addr +chain_spec ip6 daddr . udp dport . ether saddr +dst addr6 port +src mac +start 10 +count 5 +src_delta 2000 +tools sendip nc bash +proto udp6 + +race_repeat 0 + +perf_duration 5 +perf_spec ip6 daddr . udp dport . ether daddr +perf_dst addr6 port mac +perf_src +perf_entries 10 +perf_proto ipv6 +" + +TYPE_net6_port_mac_proto=" +display net6,port,mac,proto +type_spec ipv6_addr . inet_service . ether_addr . inet_proto +chain_spec ip6 daddr . udp dport . ether saddr . meta l4proto +dst addr6 port +src mac proto +start 10 +count 5 +src_delta 2000 +tools sendip nc bash +proto udp6 + +race_repeat 0 + +perf_duration 5 +perf_spec ip6 daddr . udp dport . ether daddr . meta l4proto +perf_dst addr6 port mac proto +perf_src +perf_entries 1000 +perf_proto ipv6 +" + +TYPE_net_port_net=" +display net,port,net +type_spec ipv4_addr . inet_service . ipv4_addr +chain_spec ip daddr . udp dport . ip saddr +dst addr4 port +src addr4 +start 1 +count 5 +src_delta 2000 +tools sendip nc bash +proto udp + +race_repeat 3 +flood_tools iperf3 iperf netperf +flood_proto tcp +flood_spec ip daddr . udp dport . ip saddr + +perf_duration 0 +" + +TYPE_net6_port_net6_port=" +display net6,port,net6,port +type_spec ipv6_addr . inet_service . ipv6_addr . inet_service +chain_spec ip6 daddr . udp dport . ip6 saddr . udp sport +dst addr6 port +src addr6 port +start 10 +count 5 +src_delta 2000 +tools sendip nc +proto udp6 + +race_repeat 3 +flood_tools iperf3 iperf netperf +flood_proto tcp6 +flood_spec ip6 daddr . tcp dport . ip6 saddr . tcp sport + +perf_duration 0 +" + +TYPE_net_port_mac_proto_net=" +display net,port,mac,proto,net +type_spec ipv4_addr . inet_service . ether_addr . inet_proto . ipv4_addr +chain_spec ip daddr . udp dport . ether saddr . meta l4proto . ip saddr +dst addr4 port +src mac proto addr4 +start 1 +count 5 +src_delta 2000 +tools sendip nc bash +proto udp + +race_repeat 0 + +perf_duration 0 +" + +TYPE_net_mac=" +display net,mac +type_spec ipv4_addr . ether_addr +chain_spec ip daddr . ether saddr +dst addr4 +src mac +start 1 +count 5 +src_delta 2000 +tools sendip nc bash +proto udp + +race_repeat 0 + +perf_duration 5 +perf_spec ip daddr . ether daddr +perf_dst addr4 mac +perf_src +perf_entries 1000 +perf_proto ipv4 +" + +TYPE_net_mac_icmp=" +display net,mac - ICMP +type_spec ipv4_addr . ether_addr +chain_spec ip daddr . ether saddr +dst addr4 +src mac +start 1 +count 5 +src_delta 2000 +tools ping +proto icmp + +race_repeat 0 + +perf_duration 0 +" + +TYPE_net6_mac_icmp=" +display net6,mac - ICMPv6 +type_spec ipv6_addr . ether_addr +chain_spec ip6 daddr . ether saddr +dst addr6 +src mac +start 10 +count 50 +src_delta 2000 +tools ping +proto icmp6 + +race_repeat 0 + +perf_duration 0 +" + +TYPE_net_port_proto_net=" +display net,port,proto,net +type_spec ipv4_addr . inet_service . inet_proto . ipv4_addr +chain_spec ip daddr . udp dport . meta l4proto . ip saddr +dst addr4 port proto +src addr4 +start 1 +count 5 +src_delta 2000 +tools sendip nc +proto udp + +race_repeat 3 +flood_tools iperf3 iperf netperf +flood_proto tcp +flood_spec ip daddr . tcp dport . meta l4proto . ip saddr + +perf_duration 0 +" + +# Set template for all tests, types and rules are filled in depending on test +set_template=' +flush ruleset + +table inet filter { + counter test { + packets 0 bytes 0 + } + + set test { + type ${type_spec} + flags interval,timeout + } + + chain input { + type filter hook prerouting priority 0; policy accept; + ${chain_spec} @test counter name \"test\" + } +} + +table netdev perf { + counter test { + packets 0 bytes 0 + } + + counter match { + packets 0 bytes 0 + } + + set test { + type ${type_spec} + flags interval + } + + set norange { + type ${type_spec} + } + + set noconcat { + type ${type_spec%% *} + flags interval + } + + chain test { + type filter hook ingress device veth_a priority 0; + } +} +' + +err_buf= +info_buf= + +# Append string to error buffer +err() { + err_buf="${err_buf}${1} +" +} + +# Append string to information buffer +info() { + info_buf="${info_buf}${1} +" +} + +# Flush error buffer to stdout +err_flush() { + printf "%s" "${err_buf}" + err_buf= +} + +# Flush information buffer to stdout +info_flush() { + printf "%s" "${info_buf}" + info_buf= +} + +# Setup veth pair: this namespace receives traffic, B generates it +setup_veth() { + ip netns add B + ip link add veth_a type veth peer name veth_b || return 1 + + ip link set veth_a up + ip link set veth_b netns B + + ip -n B link set veth_b up + + ip addr add dev veth_a 10.0.0.1 + ip route add default dev veth_a + + ip -6 addr add fe80::1/64 dev veth_a nodad + ip -6 addr add 2001:db8::1/64 dev veth_a nodad + ip -6 route add default dev veth_a + + ip -n B route add default dev veth_b + + ip -6 -n B addr add fe80::2/64 dev veth_b nodad + ip -6 -n B addr add 2001:db8::2/64 dev veth_b nodad + ip -6 -n B route add default dev veth_b + + B() { + ip netns exec B "$@" >/dev/null 2>&1 + } + + sleep 2 +} + +# Fill in set template and initialise set +setup_set() { + eval "echo \"${set_template}\"" | nft -f - +} + +# Check that at least one of the needed tools is available +check_tools() { + __tools= + for tool in ${tools}; do + if [ "${tool}" = "nc" ] && [ "${proto}" = "udp6" ] && \ + ! nc -u -w0 1.1.1.1 1 2>/dev/null; then + # Some GNU netcat builds might not support IPv6 + __tools="${__tools} netcat-openbsd" + continue + fi + __tools="${__tools} ${tool}" + + command -v "${tool}" >/dev/null && return 0 + done + err "need one of:${__tools}, skipping" && return 1 +} + +# Set up function to send ICMP packets +setup_send_icmp() { + send_icmp() { + B ping -c1 -W1 "${dst_addr4}" >/dev/null 2>&1 + } +} + +# Set up function to send ICMPv6 packets +setup_send_icmp6() { + if command -v ping6 >/dev/null; then + send_icmp6() { + ip -6 addr add "${dst_addr6}" dev veth_a nodad \ + 2>/dev/null + B ping6 -q -c1 -W1 "${dst_addr6}" + } + else + send_icmp6() { + ip -6 addr add "${dst_addr6}" dev veth_a nodad \ + 2>/dev/null + B ping -q -6 -c1 -W1 "${dst_addr6}" + } + fi +} + +# Set up function to send single UDP packets on IPv4 +setup_send_udp() { + if command -v sendip >/dev/null; then + send_udp() { + [ -n "${src_port}" ] && src_port="-us ${src_port}" + [ -n "${dst_port}" ] && dst_port="-ud ${dst_port}" + [ -n "${src_addr4}" ] && src_addr4="-is ${src_addr4}" + + # shellcheck disable=SC2086 # sendip needs split options + B sendip -p ipv4 -p udp ${src_addr4} ${src_port} \ + ${dst_port} "${dst_addr4}" + + src_port= + dst_port= + src_addr4= + } + elif command -v nc >/dev/null; then + if nc -u -w0 1.1.1.1 1 2>/dev/null; then + # OpenBSD netcat + nc_opt="-w0" + else + # GNU netcat + nc_opt="-q0" + fi + + send_udp() { + if [ -n "${src_addr4}" ]; then + B ip addr add "${src_addr4}" dev veth_b + __src_addr4="-s ${src_addr4}" + fi + ip addr add "${dst_addr4}" dev veth_a 2>/dev/null + [ -n "${src_port}" ] && src_port="-p ${src_port}" + + echo "" | B nc -u "${nc_opt}" "${__src_addr4}" \ + "${src_port}" "${dst_addr4}" "${dst_port}" + + src_addr4= + src_port= + } + elif [ -z "$(bash -c 'type -p')" ]; then + send_udp() { + ip addr add "${dst_addr4}" dev veth_a 2>/dev/null + if [ -n "${src_addr4}" ]; then + B ip addr add "${src_addr4}/16" dev veth_b + B ip route add default dev veth_b + fi + + B bash -c "echo > /dev/udp/${dst_addr4}/${dst_port}" + + if [ -n "${src_addr4}" ]; then + B ip addr del "${src_addr4}/16" dev veth_b + fi + src_addr4= + } + else + return 1 + fi +} + +# Set up function to send single UDP packets on IPv6 +setup_send_udp6() { + if command -v sendip >/dev/null; then + send_udp6() { + [ -n "${src_port}" ] && src_port="-us ${src_port}" + [ -n "${dst_port}" ] && dst_port="-ud ${dst_port}" + if [ -n "${src_addr6}" ]; then + src_addr6="-6s ${src_addr6}" + else + src_addr6="-6s 2001:db8::2" + fi + ip -6 addr add "${dst_addr6}" dev veth_a nodad \ + 2>/dev/null + + # shellcheck disable=SC2086 # this needs split options + B sendip -p ipv6 -p udp ${src_addr6} ${src_port} \ + ${dst_port} "${dst_addr6}" + + src_port= + dst_port= + src_addr6= + } + elif command -v nc >/dev/null && nc -u -w0 1.1.1.1 1 2>/dev/null; then + # GNU netcat might not work with IPv6, try next tool + send_udp6() { + ip -6 addr add "${dst_addr6}" dev veth_a nodad \ + 2>/dev/null + if [ -n "${src_addr6}" ]; then + B ip addr add "${src_addr6}" dev veth_b nodad + else + src_addr6="2001:db8::2" + fi + [ -n "${src_port}" ] && src_port="-p ${src_port}" + + # shellcheck disable=SC2086 # this needs split options + echo "" | B nc -u w0 "-s${src_addr6}" ${src_port} \ + ${dst_addr6} ${dst_port} + + src_addr6= + src_port= + } + elif [ -z "$(bash -c 'type -p')" ]; then + send_udp6() { + ip -6 addr add "${dst_addr6}" dev veth_a nodad \ + 2>/dev/null + B ip addr add "${src_addr6}" dev veth_b nodad + B bash -c "echo > /dev/udp/${dst_addr6}/${dst_port}" + ip -6 addr del "${dst_addr6}" dev veth_a 2>/dev/null + } + else + return 1 + fi +} + +# Set up function to send TCP traffic on IPv4 +setup_flood_tcp() { + if command -v iperf3 >/dev/null; then + flood_tcp() { + [ -n "${dst_port}" ] && dst_port="-p ${dst_port}" + if [ -n "${src_addr4}" ]; then + B ip addr add "${src_addr4}/16" dev veth_b + src_addr4="-B ${src_addr4}" + else + B ip addr add dev veth_b 10.0.0.2 + src_addr4="-B 10.0.0.2" + fi + if [ -n "${src_port}" ]; then + src_port="--cport ${src_port}" + fi + B ip route add default dev veth_b 2>/dev/null + ip addr add "${dst_addr4}" dev veth_a 2>/dev/null + + # shellcheck disable=SC2086 # this needs split options + iperf3 -s -DB "${dst_addr4}" ${dst_port} >/dev/null 2>&1 + sleep 2 + + # shellcheck disable=SC2086 # this needs split options + B iperf3 -c "${dst_addr4}" ${dst_port} ${src_port} \ + ${src_addr4} -l16 -t 1000 + + src_addr4= + src_port= + dst_port= + } + elif command -v iperf >/dev/null; then + flood_tcp() { + [ -n "${dst_port}" ] && dst_port="-p ${dst_port}" + if [ -n "${src_addr4}" ]; then + B ip addr add "${src_addr4}/16" dev veth_b + src_addr4="-B ${src_addr4}" + else + B ip addr add dev veth_b 10.0.0.2 2>/dev/null + src_addr4="-B 10.0.0.2" + fi + if [ -n "${src_port}" ]; then + src_addr4="${src_addr4}:${src_port}" + fi + B ip route add default dev veth_b + ip addr add "${dst_addr4}" dev veth_a 2>/dev/null + + # shellcheck disable=SC2086 # this needs split options + iperf -s -DB "${dst_addr4}" ${dst_port} >/dev/null 2>&1 + sleep 2 + + # shellcheck disable=SC2086 # this needs split options + B iperf -c "${dst_addr4}" ${dst_port} ${src_addr4} \ + -l20 -t 1000 + + src_addr4= + src_port= + dst_port= + } + elif command -v netperf >/dev/null; then + flood_tcp() { + [ -n "${dst_port}" ] && dst_port="-p ${dst_port}" + if [ -n "${src_addr4}" ]; then + B ip addr add "${src_addr4}/16" dev veth_b + else + B ip addr add dev veth_b 10.0.0.2 + src_addr4="10.0.0.2" + fi + if [ -n "${src_port}" ]; then + dst_port="${dst_port},${src_port}" + fi + B ip route add default dev veth_b + ip addr add "${dst_addr4}" dev veth_a 2>/dev/null + + # shellcheck disable=SC2086 # this needs split options + netserver -4 ${dst_port} -L "${dst_addr4}" \ + >/dev/null 2>&1 + sleep 2 + + # shellcheck disable=SC2086 # this needs split options + B netperf -4 -H "${dst_addr4}" ${dst_port} \ + -L "${src_addr4}" -l 1000 -t TCP_STREAM + + src_addr4= + src_port= + dst_port= + } + else + return 1 + fi +} + +# Set up function to send TCP traffic on IPv6 +setup_flood_tcp6() { + if command -v iperf3 >/dev/null; then + flood_tcp6() { + [ -n "${dst_port}" ] && dst_port="-p ${dst_port}" + if [ -n "${src_addr6}" ]; then + B ip addr add "${src_addr6}" dev veth_b nodad + src_addr6="-B ${src_addr6}" + else + src_addr6="-B 2001:db8::2" + fi + if [ -n "${src_port}" ]; then + src_port="--cport ${src_port}" + fi + B ip route add default dev veth_b + ip -6 addr add "${dst_addr6}" dev veth_a nodad \ + 2>/dev/null + + # shellcheck disable=SC2086 # this needs split options + iperf3 -s -DB "${dst_addr6}" ${dst_port} >/dev/null 2>&1 + sleep 2 + + # shellcheck disable=SC2086 # this needs split options + B iperf3 -c "${dst_addr6}" ${dst_port} \ + ${src_port} ${src_addr6} -l16 -t 1000 + + src_addr6= + src_port= + dst_port= + } + elif command -v iperf >/dev/null; then + flood_tcp6() { + [ -n "${dst_port}" ] && dst_port="-p ${dst_port}" + if [ -n "${src_addr6}" ]; then + B ip addr add "${src_addr6}" dev veth_b nodad + src_addr6="-B ${src_addr6}" + else + src_addr6="-B 2001:db8::2" + fi + if [ -n "${src_port}" ]; then + src_addr6="${src_addr6}:${src_port}" + fi + B ip route add default dev veth_b + ip -6 addr add "${dst_addr6}" dev veth_a nodad \ + 2>/dev/null + + # shellcheck disable=SC2086 # this needs split options + iperf -s -VDB "${dst_addr6}" ${dst_port} >/dev/null 2>&1 + sleep 2 + + # shellcheck disable=SC2086 # this needs split options + B iperf -c "${dst_addr6}" -V ${dst_port} \ + ${src_addr6} -l1 -t 1000 + + src_addr6= + src_port= + dst_port= + } + elif command -v netperf >/dev/null; then + flood_tcp6() { + [ -n "${dst_port}" ] && dst_port="-p ${dst_port}" + if [ -n "${src_addr6}" ]; then + B ip addr add "${src_addr6}" dev veth_b nodad + else + src_addr6="2001:db8::2" + fi + if [ -n "${src_port}" ]; then + dst_port="${dst_port},${src_port}" + fi + B ip route add default dev veth_b + ip -6 addr add "${dst_addr6}" dev veth_a nodad \ + 2>/dev/null + + # shellcheck disable=SC2086 # this needs split options + netserver -6 ${dst_port} -L "${dst_addr6}" \ + >/dev/null 2>&1 + sleep 2 + + # shellcheck disable=SC2086 # this needs split options + B netperf -6 -H "${dst_addr6}" ${dst_port} \ + -L "${src_addr6}" -l 1000 -t TCP_STREAM + + src_addr6= + src_port= + dst_port= + } + else + return 1 + fi +} + +# Set up function to send UDP traffic on IPv4 +setup_flood_udp() { + if command -v iperf3 >/dev/null; then + flood_udp() { + [ -n "${dst_port}" ] && dst_port="-p ${dst_port}" + if [ -n "${src_addr4}" ]; then + B ip addr add "${src_addr4}/16" dev veth_b + src_addr4="-B ${src_addr4}" + else + B ip addr add dev veth_b 10.0.0.2 2>/dev/null + src_addr4="-B 10.0.0.2" + fi + if [ -n "${src_port}" ]; then + src_port="--cport ${src_port}" + fi + B ip route add default dev veth_b + ip addr add "${dst_addr4}" dev veth_a 2>/dev/null + + # shellcheck disable=SC2086 # this needs split options + iperf3 -s -DB "${dst_addr4}" ${dst_port} + sleep 2 + + # shellcheck disable=SC2086 # this needs split options + B iperf3 -u -c "${dst_addr4}" -Z -b 100M -l16 -t1000 \ + ${dst_port} ${src_port} ${src_addr4} + + src_addr4= + src_port= + dst_port= + } + elif command -v iperf >/dev/null; then + flood_udp() { + [ -n "${dst_port}" ] && dst_port="-p ${dst_port}" + if [ -n "${src_addr4}" ]; then + B ip addr add "${src_addr4}/16" dev veth_b + src_addr4="-B ${src_addr4}" + else + B ip addr add dev veth_b 10.0.0.2 + src_addr4="-B 10.0.0.2" + fi + if [ -n "${src_port}" ]; then + src_addr4="${src_addr4}:${src_port}" + fi + B ip route add default dev veth_b + ip addr add "${dst_addr4}" dev veth_a 2>/dev/null + + # shellcheck disable=SC2086 # this needs split options + iperf -u -sDB "${dst_addr4}" ${dst_port} >/dev/null 2>&1 + sleep 2 + + # shellcheck disable=SC2086 # this needs split options + B iperf -u -c "${dst_addr4}" -b 100M -l1 -t1000 \ + ${dst_port} ${src_addr4} + + src_addr4= + src_port= + dst_port= + } + elif command -v netperf >/dev/null; then + flood_udp() { + [ -n "${dst_port}" ] && dst_port="-p ${dst_port}" + if [ -n "${src_addr4}" ]; then + B ip addr add "${src_addr4}/16" dev veth_b + else + B ip addr add dev veth_b 10.0.0.2 + src_addr4="10.0.0.2" + fi + if [ -n "${src_port}" ]; then + dst_port="${dst_port},${src_port}" + fi + B ip route add default dev veth_b + ip addr add "${dst_addr4}" dev veth_a 2>/dev/null + + # shellcheck disable=SC2086 # this needs split options + netserver -4 ${dst_port} -L "${dst_addr4}" \ + >/dev/null 2>&1 + sleep 2 + + # shellcheck disable=SC2086 # this needs split options + B netperf -4 -H "${dst_addr4}" ${dst_port} \ + -L "${src_addr4}" -l 1000 -t UDP_STREAM + + src_addr4= + src_port= + dst_port= + } + else + return 1 + fi +} + +# Find pktgen script and set up function to start pktgen injection +setup_perf() { + for pktgen_script_path in ${PKTGEN_SCRIPT_PATHS} __notfound; do + command -v "${pktgen_script_path}" >/dev/null && break + done + [ "${pktgen_script_path}" = "__notfound" ] && return 1 + + perf_ipv4() { + ${pktgen_script_path} -s80 \ + -i veth_a -d "${dst_addr4}" -p "${dst_port}" \ + -m "${dst_mac}" \ + -t $(($(nproc) / 5 + 1)) -b10000 -n0 2>/dev/null & + perf_pid=$! + } + perf_ipv6() { + IP6=6 ${pktgen_script_path} -s100 \ + -i veth_a -d "${dst_addr6}" -p "${dst_port}" \ + -m "${dst_mac}" \ + -t $(($(nproc) / 5 + 1)) -b10000 -n0 2>/dev/null & + perf_pid=$! + } +} + +# Clean up before each test +cleanup() { + nft reset counter inet filter test >/dev/null 2>&1 + nft flush ruleset >/dev/null 2>&1 + ip link del dummy0 2>/dev/null + ip route del default 2>/dev/null + ip -6 route del default 2>/dev/null + ip netns del B 2>/dev/null + ip link del veth_a 2>/dev/null + timeout= + killall iperf3 2>/dev/null + killall iperf 2>/dev/null + killall netperf 2>/dev/null + killall netserver 2>/dev/null + rm -f ${tmp} + sleep 2 +} + +# Entry point for setup functions +setup() { + if [ "$(id -u)" -ne 0 ]; then + echo " need to run as root" + exit ${KSELFTEST_SKIP} + fi + + cleanup + check_tools || return 1 + for arg do + if ! eval setup_"${arg}"; then + err " ${arg} not supported" + return 1 + fi + done +} + +# Format integer into IPv4 address, summing 10.0.0.5 (arbitrary) to it +format_addr4() { + a=$((${1} + 16777216 * 10 + 5)) + printf "%i.%i.%i.%i" \ + "$((a / 16777216))" "$((a % 16777216 / 65536))" \ + "$((a % 65536 / 256))" "$((a % 256))" +} + +# Format integer into IPv6 address, summing 2001:db8:: to it +format_addr6() { + printf "2001:db8::%04x:%04x" "$((${1} / 65536))" "$((${1} % 65536))" +} + +# Format integer into EUI-48 address, summing 00:01:00:00:00:00 to it +format_mac() { + printf "00:01:%02x:%02x:%02x:%02x" \ + "$((${1} / 16777216))" "$((${1} % 16777216 / 65536))" \ + "$((${1} % 65536 / 256))" "$((${1} % 256))" +} + +# Format integer into port, avoid 0 port +format_port() { + printf "%i" "$((${1} % 65534 + 1))" +} + +# Drop suffixed '6' from L4 protocol, if any +format_proto() { + printf "%s" "${proto}" | tr -d 6 +} + +# Format destination and source fields into nft concatenated type +format() { + __start= + __end= + __expr="{ " + + for f in ${dst}; do + [ "${__expr}" != "{ " ] && __expr="${__expr} . " + + __start="$(eval format_"${f}" "${start}")" + __end="$(eval format_"${f}" "${end}")" + + if [ "${f}" = "proto" ]; then + __expr="${__expr}${__start}" + else + __expr="${__expr}${__start}-${__end}" + fi + done + for f in ${src}; do + __expr="${__expr} . " + __start="$(eval format_"${f}" "${srcstart}")" + __end="$(eval format_"${f}" "${srcend}")" + + if [ "${f}" = "proto" ]; then + __expr="${__expr}${__start}" + else + __expr="${__expr}${__start}-${__end}" + fi + done + + if [ -n "${timeout}" ]; then + echo "${__expr} timeout ${timeout}s }" + else + echo "${__expr} }" + fi +} + +# Format destination and source fields into nft type, start element only +format_norange() { + __expr="{ " + + for f in ${dst}; do + [ "${__expr}" != "{ " ] && __expr="${__expr} . " + + __expr="${__expr}$(eval format_"${f}" "${start}")" + done + for f in ${src}; do + __expr="${__expr} . $(eval format_"${f}" "${start}")" + done + + echo "${__expr} }" +} + +# Format first destination field into nft type +format_noconcat() { + for f in ${dst}; do + __start="$(eval format_"${f}" "${start}")" + __end="$(eval format_"${f}" "${end}")" + + if [ "${f}" = "proto" ]; then + echo "{ ${__start} }" + else + echo "{ ${__start}-${__end} }" + fi + return + done +} + +# Add single entry to 'test' set in 'inet filter' table +add() { + if ! nft add element inet filter test "${1}"; then + err "Failed to add ${1} given ruleset:" + err "$(nft list ruleset -a)" + return 1 + fi +} + +# Format and output entries for sets in 'netdev perf' table +add_perf() { + if [ "${1}" = "test" ]; then + echo "add element netdev perf test $(format)" + elif [ "${1}" = "norange" ]; then + echo "add element netdev perf norange $(format_norange)" + elif [ "${1}" = "noconcat" ]; then + echo "add element netdev perf noconcat $(format_noconcat)" + fi +} + +# Add single entry to 'norange' set in 'netdev perf' table +add_perf_norange() { + if ! nft add element netdev perf norange "${1}"; then + err "Failed to add ${1} given ruleset:" + err "$(nft list ruleset -a)" + return 1 + fi +} + +# Add single entry to 'noconcat' set in 'netdev perf' table +add_perf_noconcat() { + if ! nft add element netdev perf noconcat "${1}"; then + err "Failed to add ${1} given ruleset:" + err "$(nft list ruleset -a)" + return 1 + fi +} + +# Delete single entry from set +del() { + if ! nft delete element inet filter test "${1}"; then + err "Failed to delete ${1} given ruleset:" + err "$(nft list ruleset -a)" + return 1 + fi +} + +# Return packet count from 'test' counter in 'inet filter' table +count_packets() { + found=0 + for token in $(nft list counter inet filter test); do + [ ${found} -eq 1 ] && echo "${token}" && return + [ "${token}" = "packets" ] && found=1 + done +} + +# Return packet count from 'test' counter in 'netdev perf' table +count_perf_packets() { + found=0 + for token in $(nft list counter netdev perf test); do + [ ${found} -eq 1 ] && echo "${token}" && return + [ "${token}" = "packets" ] && found=1 + done +} + +# Set MAC addresses, send traffic according to specifier +flood() { + ip link set veth_a address "$(format_mac "${1}")" + ip -n B link set veth_b address "$(format_mac "${2}")" + + for f in ${dst}; do + eval dst_"$f"=\$\(format_\$f "${1}"\) + done + for f in ${src}; do + eval src_"$f"=\$\(format_\$f "${2}"\) + done + eval flood_\$proto +} + +# Set MAC addresses, start pktgen injection +perf() { + dst_mac="$(format_mac "${1}")" + ip link set veth_a address "${dst_mac}" + + for f in ${dst}; do + eval dst_"$f"=\$\(format_\$f "${1}"\) + done + for f in ${src}; do + eval src_"$f"=\$\(format_\$f "${2}"\) + done + eval perf_\$perf_proto +} + +# Set MAC addresses, send single packet, check that it matches, reset counter +send_match() { + ip link set veth_a address "$(format_mac "${1}")" + ip -n B link set veth_b address "$(format_mac "${2}")" + + for f in ${dst}; do + eval dst_"$f"=\$\(format_\$f "${1}"\) + done + for f in ${src}; do + eval src_"$f"=\$\(format_\$f "${2}"\) + done + eval send_\$proto + if [ "$(count_packets)" != "1" ]; then + err "${proto} packet to:" + err " $(for f in ${dst}; do + eval format_\$f "${1}"; printf ' '; done)" + err "from:" + err " $(for f in ${src}; do + eval format_\$f "${2}"; printf ' '; done)" + err "should have matched ruleset:" + err "$(nft list ruleset -a)" + return 1 + fi + nft reset counter inet filter test >/dev/null +} + +# Set MAC addresses, send single packet, check that it doesn't match +send_nomatch() { + ip link set veth_a address "$(format_mac "${1}")" + ip -n B link set veth_b address "$(format_mac "${2}")" + + for f in ${dst}; do + eval dst_"$f"=\$\(format_\$f "${1}"\) + done + for f in ${src}; do + eval src_"$f"=\$\(format_\$f "${2}"\) + done + eval send_\$proto + if [ "$(count_packets)" != "0" ]; then + err "${proto} packet to:" + err " $(for f in ${dst}; do + eval format_\$f "${1}"; printf ' '; done)" + err "from:" + err " $(for f in ${src}; do + eval format_\$f "${2}"; printf ' '; done)" + err "should not have matched ruleset:" + err "$(nft list ruleset -a)" + return 1 + fi +} + +# Correctness test template: +# - add ranged element, check that packets match it +# - check that packets outside range don't match it +# - remove some elements, check that packets don't match anymore +test_correctness() { + setup veth send_"${proto}" set || return ${KSELFTEST_SKIP} + + range_size=1 + for i in $(seq "${start}" $((start + count))); do + end=$((start + range_size)) + + # Avoid negative or zero-sized port ranges + if [ $((end / 65534)) -gt $((start / 65534)) ]; then + start=${end} + end=$((end + 1)) + fi + srcstart=$((start + src_delta)) + srcend=$((end + src_delta)) + + add "$(format)" || return 1 + for j in $(seq ${start} $((range_size / 2 + 1)) ${end}); do + send_match "${j}" $((j + src_delta)) || return 1 + done + send_nomatch $((end + 1)) $((end + 1 + src_delta)) || return 1 + + # Delete elements now and then + if [ $((i % 3)) -eq 0 ]; then + del "$(format)" || return 1 + for j in $(seq ${start} \ + $((range_size / 2 + 1)) ${end}); do + send_nomatch "${j}" $((j + src_delta)) \ + || return 1 + done + fi + + range_size=$((range_size + 1)) + start=$((end + range_size)) + done +} + +# Concurrency test template: +# - add all the elements +# - start a thread for each physical thread that: +# - adds all the elements +# - flushes the set +# - adds all the elements +# - flushes the entire ruleset +# - adds the set back +# - adds all the elements +# - delete all the elements +test_concurrency() { + proto=${flood_proto} + tools=${flood_tools} + chain_spec=${flood_spec} + setup veth flood_"${proto}" set || return ${KSELFTEST_SKIP} + + range_size=1 + cstart=${start} + flood_pids= + for i in $(seq ${start} $((start + count))); do + end=$((start + range_size)) + srcstart=$((start + src_delta)) + srcend=$((end + src_delta)) + + add "$(format)" || return 1 + + flood "${i}" $((i + src_delta)) & flood_pids="${flood_pids} $!" + + range_size=$((range_size + 1)) + start=$((end + range_size)) + done + + sleep 10 + + pids= + for c in $(seq 1 "$(nproc)"); do ( + for r in $(seq 1 "${race_repeat}"); do + range_size=1 + + # $start needs to be local to this subshell + # shellcheck disable=SC2030 + start=${cstart} + for i in $(seq ${start} $((start + count))); do + end=$((start + range_size)) + srcstart=$((start + src_delta)) + srcend=$((end + src_delta)) + + add "$(format)" 2>/dev/null + + range_size=$((range_size + 1)) + start=$((end + range_size)) + done + + nft flush inet filter test 2>/dev/null + + range_size=1 + start=${cstart} + for i in $(seq ${start} $((start + count))); do + end=$((start + range_size)) + srcstart=$((start + src_delta)) + srcend=$((end + src_delta)) + + add "$(format)" 2>/dev/null + + range_size=$((range_size + 1)) + start=$((end + range_size)) + done + + nft flush ruleset + setup set 2>/dev/null + + range_size=1 + start=${cstart} + for i in $(seq ${start} $((start + count))); do + end=$((start + range_size)) + srcstart=$((start + src_delta)) + srcend=$((end + src_delta)) + + add "$(format)" 2>/dev/null + + range_size=$((range_size + 1)) + start=$((end + range_size)) + done + + range_size=1 + start=${cstart} + for i in $(seq ${start} $((start + count))); do + end=$((start + range_size)) + srcstart=$((start + src_delta)) + srcend=$((end + src_delta)) + + del "$(format)" 2>/dev/null + + range_size=$((range_size + 1)) + start=$((end + range_size)) + done + done + ) & pids="${pids} $!" + done + + # shellcheck disable=SC2046,SC2086 # word splitting wanted here + wait $(for pid in ${pids}; do echo ${pid}; done) + # shellcheck disable=SC2046,SC2086 + kill $(for pid in ${flood_pids}; do echo ${pid}; done) 2>/dev/null + # shellcheck disable=SC2046,SC2086 + wait $(for pid in ${flood_pids}; do echo ${pid}; done) 2>/dev/null + + return 0 +} + +# Timeout test template: +# - add all the elements with 3s timeout while checking that packets match +# - wait 3s after the last insertion, check that packets don't match any entry +test_timeout() { + setup veth send_"${proto}" set || return ${KSELFTEST_SKIP} + + timeout=3 + range_size=1 + for i in $(seq "${start}" $((start + count))); do + end=$((start + range_size)) + srcstart=$((start + src_delta)) + srcend=$((end + src_delta)) + + add "$(format)" || return 1 + + for j in $(seq ${start} $((range_size / 2 + 1)) ${end}); do + send_match "${j}" $((j + src_delta)) || return 1 + done + + range_size=$((range_size + 1)) + start=$((end + range_size)) + done + sleep 3 + for i in $(seq ${start} $((start + count))); do + end=$((start + range_size)) + srcstart=$((start + src_delta)) + srcend=$((end + src_delta)) + + for j in $(seq ${start} $((range_size / 2 + 1)) ${end}); do + send_nomatch "${j}" $((j + src_delta)) || return 1 + done + + range_size=$((range_size + 1)) + start=$((end + range_size)) + done +} + +# Performance test template: +# - add concatenated ranged entries +# - add non-ranged concatenated entries (for hash set matching rate baseline) +# - add ranged entries with first field only (for rbhash baseline) +# - start pktgen injection directly on device rx path of this namespace +# - measure drop only rate, hash and rbtree baselines, then matching rate +test_performance() { + chain_spec=${perf_spec} + dst="${perf_dst}" + src="${perf_src}" + setup veth perf set || return ${KSELFTEST_SKIP} + + first=${start} + range_size=1 + for set in test norange noconcat; do + start=${first} + for i in $(seq ${start} $((start + perf_entries))); do + end=$((start + range_size)) + srcstart=$((start + src_delta)) + srcend=$((end + src_delta)) + + if [ $((end / 65534)) -gt $((start / 65534)) ]; then + start=${end} + end=$((end + 1)) + elif [ ${start} -eq ${end} ]; then + end=$((start + 1)) + fi + + add_perf ${set} + + start=$((end + range_size)) + done > "${tmp}" + nft -f "${tmp}" + done + + perf $((end - 1)) ${srcstart} + + sleep 2 + + nft add rule netdev perf test counter name \"test\" drop + nft reset counter netdev perf test >/dev/null 2>&1 + sleep "${perf_duration}" + pps="$(printf %10s $(($(count_perf_packets) / perf_duration)))" + info " baseline (drop from netdev hook): ${pps}pps" + handle="$(nft -a list chain netdev perf test | grep counter)" + handle="${handle##* }" + nft delete rule netdev perf test handle "${handle}" + + nft add rule "netdev perf test ${chain_spec} @norange \ + counter name \"test\" drop" + nft reset counter netdev perf test >/dev/null 2>&1 + sleep "${perf_duration}" + pps="$(printf %10s $(($(count_perf_packets) / perf_duration)))" + info " baseline hash (non-ranged entries): ${pps}pps" + handle="$(nft -a list chain netdev perf test | grep counter)" + handle="${handle##* }" + nft delete rule netdev perf test handle "${handle}" + + nft add rule "netdev perf test ${chain_spec%%. *} @noconcat \ + counter name \"test\" drop" + nft reset counter netdev perf test >/dev/null 2>&1 + sleep "${perf_duration}" + pps="$(printf %10s $(($(count_perf_packets) / perf_duration)))" + info " baseline rbtree (match on first field only): ${pps}pps" + handle="$(nft -a list chain netdev perf test | grep counter)" + handle="${handle##* }" + nft delete rule netdev perf test handle "${handle}" + + nft add rule "netdev perf test ${chain_spec} @test \ + counter name \"test\" drop" + nft reset counter netdev perf test >/dev/null 2>&1 + sleep "${perf_duration}" + pps="$(printf %10s $(($(count_perf_packets) / perf_duration)))" + p5="$(printf %5s "${perf_entries}")" + info " set with ${p5} full, ranged entries: ${pps}pps" + kill "${perf_pid}" +} + +# Run everything in a separate network namespace +[ "${1}" != "run" ] && { unshare -n "${0}" run; exit $?; } +tmp="$(mktemp)" +trap cleanup EXIT + +# Entry point for test runs +passed=0 +for name in ${TESTS}; do + printf "TEST: %s\n" "${name}" + for type in ${TYPES}; do + eval desc=\$TYPE_"${type}" + IFS=' +' + for __line in ${desc}; do + # shellcheck disable=SC2086 + eval ${__line%% *}=\"${__line##* }\"; + done + IFS=' +' + + if [ "${name}" = "concurrency" ] && \ + [ "${race_repeat}" = "0" ]; then + continue + fi + if [ "${name}" = "performance" ] && \ + [ "${perf_duration}" = "0" ]; then + continue + fi + + printf " %-60s " "${display}" + eval test_"${name}" + ret=$? + + if [ $ret -eq 0 ]; then + printf "[ OK ]\n" + info_flush + passed=$((passed + 1)) + elif [ $ret -eq 1 ]; then + printf "[FAIL]\n" + err_flush + exit 1 + elif [ $ret -eq ${KSELFTEST_SKIP} ]; then + printf "[SKIP]\n" + err_flush + fi + done +done + +[ ${passed} -eq 0 ] && exit ${KSELFTEST_SKIP} |