summaryrefslogtreecommitdiff
path: root/include/linux/mlx5
AgeCommit message (Collapse)Author
2022-03-24Merge tag 'vfio-v5.18-rc1' of https://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO updates from Alex Williamson: - Introduce new device migration uAPI and implement device specific mlx5 vfio-pci variant driver supporting new protocol (Jason Gunthorpe, Yishai Hadas, Leon Romanovsky) - New HiSilicon acc vfio-pci variant driver, also supporting migration interface (Shameer Kolothum, Longfang Liu) - D3hot fixes for vfio-pci-core (Abhishek Sahu) - Document new vfio-pci variant driver acceptance criteria (Alex Williamson) - Fix UML build unresolved ioport_{un}map() functions (Alex Williamson) - Fix MAINTAINERS due to header movement (Lukas Bulwahn) * tag 'vfio-v5.18-rc1' of https://github.com/awilliam/linux-vfio: (31 commits) vfio-pci: Provide reviewers and acceptance criteria for variant drivers MAINTAINERS: adjust entry for header movement in hisilicon qm driver hisi_acc_vfio_pci: Use its own PCI reset_done error handler hisi_acc_vfio_pci: Add support for VFIO live migration crypto: hisilicon/qm: Set the VF QM state register hisi_acc_vfio_pci: Add helper to retrieve the struct pci_driver hisi_acc_vfio_pci: Restrict access to VF dev BAR2 migration region hisi_acc_vfio_pci: add new vfio_pci driver for HiSilicon ACC devices hisi_acc_qm: Move VF PCI device IDs to common header crypto: hisilicon/qm: Move few definitions to common header crypto: hisilicon/qm: Move the QM header to include/linux vfio/mlx5: Fix to not use 0 as NULL pointer PCI/IOV: Fix wrong kernel-doc identifier vfio/mlx5: Use its own PCI reset_done error handler vfio/pci: Expose vfio_pci_core_aer_err_detected() vfio/mlx5: Implement vfio_pci driver for mlx5 devices vfio/mlx5: Expose migration commands over mlx5 device vfio: Remove migration protocol v1 documentation vfio: Extend the device migration protocol with RUNNING_P2P vfio: Define device migration protocol v2 ...
2022-03-22net/mlx5e: Fix build warning, detected write beyond size of fieldSaeed Mahameed
When merged with Linus tree, the cited patch below will cause the following build warning: In function 'fortify_memset_chk', inlined from 'mlx5e_xmit_xdp_frame' at drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c:438:3: include/linux/fortify-string.h:242:25: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning] 242 | __write_overflow_field(p_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix that by grouping the fields to memeset in struct_group() to avoid the false alarm. Fixes: 9ded70fa1d81 ("net/mlx5e: Don't prefill WQEs in XDP SQ in the multi buffer mode") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Suggested-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20220322172224.31849-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17net/mlx5: Remove unused fill page array API functionTariq Toukan
mlx5_fill_page_array API function is not used. Remove it, reduce the number of exported functions. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-17net/mlx5: Remove unused exported contiguous coherent buffer allocation APITariq Toukan
All WQ types moved to using the fragmented allocation API for coherent memory. Contiguous API is not used anymore. Remove it, reduce the number of exported functions. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-10net/mlx5: Parse module mapping using mlx5_ifcGal Pressman
The assumption that the first byte in the module mapping dword is the module number shouldn't be hard-coded in the driver, but come from mlx5_ifc structs. While at it, fix the incorrect width for the 'rx_lane' and 'tx_lane' fields. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-10net/mlx5: Query the maximum MCIA register read size from firmwareGal Pressman
The MCIA register supports either 12 or 32 dwords, use the correct value by querying the capability from the MCAM register. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
net/dsa/dsa2.c commit afb3cc1a397d ("net: dsa: unlock the rtnl_mutex when dsa_master_setup() fails") commit e83d56537859 ("net: dsa: replay master state events in dsa_tree_{setup,teardown}_master") https://lore.kernel.org/all/20220307101436.7ae87da0@canb.auug.org.au/ drivers/net/ethernet/intel/ice/ice.h commit 97b0129146b1 ("ice: Fix error with handling of bonding MTU") commit 43113ff73453 ("ice: add TTY for GNSS module for E810T device") https://lore.kernel.org/all/20220310112843.3233bcf1@canb.auug.org.au/ drivers/staging/gdm724x/gdm_lte.c commit fc7f750dc9d1 ("staging: gdm724x: fix use after free in gdm_lte_rx()") commit 4bcc4249b4cf ("staging: Use netif_rx().") https://lore.kernel.org/all/20220308111043.1018a59d@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-09net/mlx5: DR, Add support for ConnectX-7 steeringYevgeny Kliteynik
Add support for a new SW format version that is implemented by ConnectX-7. Except for several differences, the STEv2 is identical to STEv1, so for most callbacks the STEv2 context struct will call STEv1 functions. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: DR, Add support for matching on Internet Header Length (IHL)Yevgeny Kliteynik
Add support for matching on new field - Internet Header Length (IHL). Signed-off-by: Muhammad Sammar <muhammads@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: Add debugfs counters for page commands failuresMoshe Shemesh
Add the following new debugfs counters for debug and verbosity: fw_pages_alloc_failed - number of pages FW requested but driver failed to allocate. give_pages_dropped - number of pages given to FW, but command give pages failed by FW. reclaim_pages_discard - number of pages which were about to reclaim back and FW failed the command. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: Add pages debugfsMoshe Shemesh
Add pages debugfs to expose the following counters for debuggability: fw_pages_total - How many pages were given to FW and not returned yet. vfs_pages - For SRIOV, how many pages were given to FW for virtual functions usage. host_pf_pages - For ECPF, how many pages were given to FW for external hosts physical functions usage. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: Move debugfs entries to separate structMoshe Shemesh
Move the debugfs entry pointers under priv to their own struct. Add get function for device debugfs root. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: Change release_all_pages cap bit locationMoshe Shemesh
mlx5 FW has changed release_all_pages cap bit by one bit offset to reflect a fix in the FW flow for release_all_pages. The driver should use the new bit to ensure it calls release_all_pages only if the FW fix is there. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: Add command failures data to debugfsMoshe Shemesh
Add new counters to command interface debugfs to count command failures. The following counters added: total_failed - number of times command failed (any kind of failure). failed_mbox_status - number of times command failed on bad status returned by FW. In addition, add data about last command failure to command interface debugfs: last_failed_errno - last command failed returned errno. last_failed_mbox_status - last bad status returned by FW. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5e: SHAMPO, reduce TIR indicationBen Ben-Ishay
SHAMPO is an RQ / WQ feature, an indication was added to the TIR in the first place to enforce suitability between connected TIR and RQ, this enforcement does not exist in current the Firmware implementation and was redundant in the first place. Fixes: 83439f3c37aa ("net/mlx5e: Add HW-GRO offload") Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: Fix size field in bufferx_reg structMohammad Kabat
According to HW spec the field "size" should be 16 bits in bufferx register. Fixes: e281682bf294 ("net/mlx5_core: HW data structs/types definitions cleanup") Signed-off-by: Mohammad Kabat <mohammadkab@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-28Merge branch 'mlx5-next' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5-next 2022-22-02 The following PR includes updates to mlx5-next branch: Headlines: ========== 1) Jakub cleans up unused static inline functions 2) I did some low level firmware command interface return status changes to provide the caller with full visibility on the error/status returned by the Firmware. 3) Use the new command interface in RDMA DEVX usecases to avoid flooding dmesg with some "expected" user error prone use cases. 4) Moshe also uses the new command interface to grab the specific error code from MFRL register command to provide the exact error reason for why SW reset couldn't perform internally in FW. 5) From Mark Bloch: Lag, drop packets in hardware when possible In active-backup mode the inactive interface's packets are dropped by the bond device. In switchdev where TC rules are offloaded to the FDB this can lead to packets being hit in the FDB where without offload they would have been dropped before reaching TC rules in the kernel. Create a drop rule to make sure packets on inactive ports are dropped before reaching the FDB. Listen on NETDEV_CHANGEUPPER / NETDEV_CHANGEINFODATA events and record the inactive state and offload accordingly. * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Add clarification on sync reset failure net/mlx5: Add reset_state field to MFRL register RDMA/mlx5: Use new command interface API net/mlx5: cmdif, Refactor error handling and reporting of async commands net/mlx5: Use mlx5_cmd_do() in core create_{cq,dct} net/mlx5: cmdif, Add new api for command execution net/mlx5: cmdif, cmd_check refactoring net/mlx5: cmdif, Return value improvements net/mlx5: Lag, offload active-backup drops to hardware net/mlx5: Lag, record inactive state of bond device net/mlx5: Lag, don't use magic numbers for ports net/mlx5: Lag, use local variable already defined to access E-Switch net/mlx5: E-switch, add drop rule support to ingress ACL net/mlx5: E-switch, remove special uplink ingress ACL handling net/mlx5: E-Switch, reserve and use same uplink metadata across ports net/mlx5: Add ability to insert to specific flow group mlx5: remove unused static inlines ==================== Link: https://lore.kernel.org/r/20220223233930.319301-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-27net/mlx5: Introduce migration bits and structuresYishai Hadas
Introduce migration IFC related stuff to enable migration commands. Link: https://lore.kernel.org/all/20220224142024.147653-7-yishaih@nvidia.com Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2022-02-27net/mlx5: Expose APIs to get/put the mlx5 core deviceYishai Hadas
Expose an API to get the mlx5 core device from a given VF PCI device if mlx5_core is its driver. Upon the get API we stay with the intf_state_mutex locked to make sure that the device can't be gone/unloaded till the caller will complete its job over the device, this expects to be for a short period of time for any flow that the lock is taken. Upon the put API we unlock the intf_state_mutex. The use case for those APIs is the migration flow of a VF over VFIO PCI. In that case the VF doesn't ride on mlx5_core, because the device is driving *two* different PCI devices, the PF owned by mlx5_core and the VF owned by the vfio driver. The mlx5_core of the PF is accessed only during the narrow window of the VF's ioctl that requires its services. This allows the PF driver to be more independent of the VF driver, so long as it doesn't reset the FW. Link: https://lore.kernel.org/all/20220224142024.147653-6-yishaih@nvidia.com Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2022-02-23net/mlx5: Add clarification on sync reset failureMoshe Shemesh
In case devlink reload action fw_activate failed in sync reset stage, use the new MFRL field reset_state to find why it failed and share this clarification with the user. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23net/mlx5: Add reset_state field to MFRL registerMoshe Shemesh
Add new field reset_state to MFRL register. This field expose current state of sync reset for fw update. This field enables sharing with the user more details on why fw activate failed in case it failed the sync reset stage. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23net/mlx5: cmdif, Refactor error handling and reporting of async commandsSaeed Mahameed
Same as the new mlx5_cmd_do API, report all information to callers and let them handle the error values and outbox parsing. The user callback status "work->user_callback(status)" is now similar to the error rc code returned from the blocking mlx5_cmd_do() version, and now is defined as follows: -EREMOTEIO : Command executed by FW, outbox.status != MLX5_CMD_STAT_OK. Caller must check FW outbox status. 0 : Command execution successful, outbox.status == MLX5_CMD_STAT_OK. < 0 : Command couldn't execute, FW or driver induced error. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23net/mlx5: Use mlx5_cmd_do() in core create_{cq,dct}Saeed Mahameed
mlx5_core_create_{cq/dct} functions are non-trivial mlx5 commands functions. They check command execution status themselves and hide valuable FW failure information. For mlx5_core/eth kernel user this is what we actually want, but for a devx/rdma user the hidden information is essential and should be propagated up to the caller, thus we convert these commands to use mlx5_cmd_do to return the FW/driver and command outbox status as is, and let the caller decide what to do with it. For kernel callers of mlx5_core_create_{cq/dct} or those who only care about the binary status (FAIL/SUCCESS) they must check status themselves via mlx5_cmd_check() to restore the current behavior. err = mlx5_create_cq(in, out) err = mlx5_cmd_check(err, in, out) if (err) // handle err For DEVX users and those who care about full visibility, They will just propagate the error to user space, and app can check if err == -EREMOTEIO, then outbox.{status,syndrome} are valid. API Note: mlx5_cmd_check() must be used by kernel users since it allows the driver to intercept the command execution status and return a driver simulated status in case of driver induced error handling or reset/recovery flows. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23net/mlx5: cmdif, Add new api for command executionSaeed Mahameed
Add mlx5_cmd_do. Unlike mlx5_cmd_exec, this function will not modify or translate outbox.status. The function will return: return = 0: Command was executed, outbox.status == MLX5_CMD_STAT_OK. return = -EREMOTEIO: Executed, outbox.status != MLX5_CMD_STAT_OK. return < 0: Command execution couldn't be performed by FW or driver. And document other mlx5_cmd_exec functions. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23net/mlx5: cmdif, cmd_check refactoringSaeed Mahameed
Do not mangle the command outbox in the internal low level cmd_exec and cmd_invoke functions. Instead return a proper unique error code and move the driver error checking to be at a higher level in mlx5_cmd_exec(). Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23net/mlx5: Add ability to insert to specific flow groupMark Bloch
If the flow table isn't an autogroup the upper driver has to create the flow groups explicitly. This information can't later be used when creating rules to insert into a specific flow group. Allow such use case. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23mlx5: remove unused static inlinesJakub Kicinski
mlx5 has some unused static inline helpers in include/ while at it also clean static inlines in the driver itself. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-27net/mlx5: Remove unused TIR modify bitmask enumsTariq Toukan
struct mlx5_ifc_modify_tir_bitmask_bits is used for the bitmask of MODIFY_TIR operations. Remove the unused bitmask enums. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-06net/mlx5: Introduce API for bulk request and release of IRQsShay Drory
Currently IRQs are requested one by one. To balance spreading IRQs among cpus using such scheme requires remembering cpu mask for the cpus used for a given device. This complicates the IRQ allocation scheme in subsequent patch. Hence, prepare the code for bulk IRQs allocation. This enables spreading IRQs among cpus in subsequent patch. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-31net/mlx5: DR, Ignore modify TTL if device doesn't support itYevgeny Kliteynik
When modifying TTL, packet's csum has to be recalculated. Due to HW issue in ConnectX-5, csum recalculation for modify TTL is supported through a work-around that is specifically enabled by configuration. If the work-around isn't enabled, ignore the modify TTL action rather than adding an unsupported action. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
2021-12-31net/mlx5: DR, Add support for matching on geneve_tlv_option_0_exist fieldYevgeny Kliteynik
Match on geneve_tlv_option_0_exist field on devices that support STEv1. Signed-off-by: Muhammad Sammar <muhammads@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
2021-12-31net/mlx5: Add misc5 flow table match parametersMuhammad Sammar
Add support for misc5 match parameter as per HW spec, this will allow matching on tunnel_header fields. Signed-off-by: Muhammad Sammar <muhammads@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
2021-12-31net/mlx5: DR, Fix lower case macro prefix "mlx5_" to "MLX5_"Yevgeny Kliteynik
Macros prefix should be capital letters - fix the prefix in mlx5_FLEX_PARSER_MPLS_OVER_UDP_ENABLED. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
2021-12-16Merge branch 'mlx5-next' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5-next branch 2021-12-15 Hi Dave, Jakub, Jason This pulls mlx5-next branch into net-next and rdma branches. All patches already reviewed on both rdma and netdev mailing lists. Please pull and let me know if there's any problem. 1) Add multiple FDB steering priorities [1] 2) Introduce HW bits needed to configure MAC list size of VF/SF. Required for ("net/mlx5: Memory optimizations") upcoming series [2]. [1] https://lore.kernel.org/netdev/20211201193621.9129-1-saeed@kernel.org/ [2] https://lore.kernel.org/lkml/20211208141722.13646-1-shayd@nvidia.com/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-15net/mlx5: Introduce log_max_current_uc_list_wr_supported bitShay Drory
Downstream patch will use this bit in order to know whether the device supports changing of max_uc_list. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-13net/mlx5: Separate FDB namespaceMaor Gottlieb
This patch doesn't add an additional namespaces, but just separates the naming to be used by each FDB user, bypass and kernel. Downstream patches will actually split this up and allow to have more than single priority for the bypass users. Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Acked-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-02net/mlx5: Dynamically resize flow counters query bufferAvihai Horon
The flow counters bulk query buffer is allocated once during mlx5_fc_init_stats(). For PFs and VFs this buffer usually takes a little more than 512KB of memory, which is aligned to the next power of 2, to 1MB. For SFs, this buffer is reduced and takes around 128 Bytes. The buffer size determines the maximum number of flow counters that can be queried at a time. Thus, having a bigger buffer can improve performance for users that need to query many flow counters. There are cases that don't use many flow counters and don't need a big buffer (e.g. SFs, VFs). Since this size is critical with large scale, in these cases the buffer size should be reduced. In order to reduce memory consumption while maintaining query performance, change the query buffer's allocation scheme to the following: - First allocate the buffer with small initial size. - If the number of counters surpasses the initial size, resize the buffer to the maximum size. The buffer only grows and isn't shrank, because users with many flow counters don't care about the buffer size and we don't want to add resize overhead if the current number of counters drops. This solution is preferable to the current one, which is less accurate and only addresses SFs. Signed-off-by: Avihai Horon <avihaih@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-30net/mlx5: Fix access to a non-supported registerAya Levin
Validate MRTC register is supported before triggering a delayed work which accesses it. Fixes: 5a1023deeed0 ("net/mlx5: Add periodic update of host time to firmware") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16net/mlx5: E-Switch, Fix resetting of encap mode when entering switchdevPaul Blakey
E-Switch encap mode is relevant only when in switchdev mode. The RDMA driver can query the encap configuration via mlx5_eswitch_get_encap_mode(). Make sure it returns the currently used mode and not the set one. This reverts the cited commit which reset the encap mode on entering switchdev and fixes the original issue properly. Fixes: 9a64144d683a ("net/mlx5: E-Switch, Fix default encap mode") Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-29net/mlx5: Allow skipping counter refresh on creationPaul Blakey
CT creates a counter for each CT rule, and for each such counter, fs_counters tries to queue mlx5_fc_stats_work() work again via mod_delayed_work(0) call to refresh all counters. This call has a large performance impact when reaching high insertion rate and accounts for ~8% of the insertion time when using software steering. Allow skipping the refresh of all counters during counter creation. Change CT to use this refresh skipping for it's counters. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-27Merge branch 'mlx5-next' of ↵Saeed Mahameed
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux into net-next Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26net/mlx5e: Add handle SHAMPO cqe supportKhalid Manaa
This patch adds the new CQE SHAMPO fields: - flush: indicates that we must close the current session and pass the SKB to the network stack. - match: indicates that the current packet matches the oppened session, the packet will be merge into the current SKB. - header_size: the size of the packet headers that written into the headers buffer. - header_entry_index: the entry index in the headers buffer. - data_offset: packets data offset in the WQE. Also new cqe handler is added to handle SHAMPO packets: - The new handler uses CQE SHAMPO fields to build the SKB. CQE's Flush and match fields are not used in this patch, packets are not merged in this patch. Signed-off-by: Khalid Manaa <khalidm@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26net/mlx5e: Add support to klm_umr_wqeBen Ben-Ishay
This commit adds the needed definitions for using the klm_umr_wqe. UMR stands for user-mode memory registration, is a mechanism to alter address translation properties of MKEY by posting WorkQueueElement aka WQE on send queue. MKEY stands for memory key, MKEY are used to describe a region in memory that can be later used by HW. KLM stands for {Key, Length, MemVa}, KLM_MKEY is indirect MKEY that enables to map multiple memory spaces with different sizes in unified MKEY. klm_umr_wqe is a UMR that use to update a KLM_MKEY. SHAMPO feature uses KLM_MKEY for memory registration of his header buffer. Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26net/mlx5e: Rename TIR lro functions to TIR packet merge functionsKhalid Manaa
This series introduces new packet merge type, therefore rename lro functions to packet merge to support the new merge type: - Generalize + rename mlx5e_build_tir_ctx_lro to mlx5e_build_tir_ctx_packet_merge. - Rename mlx5e_modify_tirs_lro to mlx5e_modify_tirs_packet_merge. - Rename lro bit in mlx5_ifc_modify_tir_bitmask_bits to packet_merge. - Rename lro_en in mlx5e_params to packet_merge_type type and combine packet_merge params into one struct mlx5e_packet_merge_param. Signed-off-by: Khalid Manaa <khalidm@nvidia.com> Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26net/mlx5: Add SHAMPO caps, HW bits and enumerationsBen Ben-Ishay
This commit adds SHAMPO bit to hca_cap and SHAMPO capabilities structure, SHAMPO related HW spec hardware fields and enumerations. SHAMPO stands for: split headers and merge payload offload. SHAMPO new fields: WQ: - headers_mkey: mkey that represents the headers buffer, where the packets headers will be written by the HW. - shampo_enable: flag to verify if the WQ supports SHAMPO feature. - log_reservation_size: the log of the reservation size where the data of the packet will be written by the HW. - log_max_num_of_packets_per_reservation: log of the maximum number of packets that can be written to the same reservation. - log_headers_entry_size: log of the header entry size of the headers buffer. - log_headers_buffer_entry_num: log of the entries number of the headers buffer. RQ: - shampo_no_match_alignment_granularity: the HW alignment granularity in case the received packet doesn't match the current session. - shampo_match_criteria_type: the type of match criteria. - reservation_timeout: the maximum time that the HW will hold the reservation. mlx5_ifc_shampo_cap_bits, the capabilities of the SHAMPO feature: - shampo_log_max_reservation_size: the maximum allowed value of the field WQ.log_reservation_size. - log_reservation_size: the minimum allowed value of the field WQ.log_reservation_size. - shampo_min_mss_size: the minimum payload size of packet that can open a new session or be merged to a session. - shampo_max_log_headers_entry_size: the maximum allowed value of the field WQ.log_headers_entry_size Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26net/mlx5e: Rename lro_timeout to packet_merge_timeoutBen Ben-Ishay
TIR stands for transport interface receive, the TIR object is responsible for performing all transport related operations on the receive side like packet processing, demultiplexing the packets to different RQ's, etc. lro_timeout is a field in the TIR that is used to set the timeout for lro session, this series introduces new packet merge type, therefore rename lro_timeout to packet_merge_timeout for all packet merge types. Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26net/mlx5: remove the recent devlink paramsJakub Kicinski
revert commit 46ae40b94d88 ("net/mlx5: Let user configure io_eq_size param") revert commit a6cb08daa3b4 ("net/mlx5: Let user configure event_eq_size param") revert commit 554604061979 ("net/mlx5: Let user configure max_macs param") The EQE parameters are applicable to more drivers, they should be configured via standard API, probably ethtool. Example of another driver needing something similar: https://lore.kernel.org/all/1633454136-14679-3-git-send-email-sbhatta@marvell.com/ The last param for "max_macs" is probably fine but the documentation is severely lacking. The meaning and implications for changing the param need to be stated. Link: https://lore.kernel.org/r/20211026152939.3125950-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25net/mlx5: Let user configure max_macs paramShay Drory
Currently, max_macs is taking 70Kbytes of memory per function. This size is not needed in all use cases, and is critical with large scale. Hence, allow user to configure the number of max_macs. For example, to reduce the number of max_macs to 1, execute:: $ devlink dev param set pci/0000:00:0b.0 name max_macs value 1 \ cmode driverinit $ devlink dev reload pci/0000:00:0b.0 Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Let user configure event_eq_size paramShay Drory
Event EQ is an EQ which received the notification of almost all the events generated by the NIC. Currently, each event EQ is taking 512KB of memory. This size is not needed in most use cases, and is critical with large scale. Hence, allow user to configure the size of the event EQ. For example to reduce event EQ size to 64, execute:: $ devlink resource set pci/0000:00:0b.0 path /event_eq_size/ size 64 $ devlink dev reload pci/0000:00:0b.0 Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25net/mlx5: Let user configure io_eq_size paramShay Drory
Currently, each I/O EQ is taking 128KB of memory. This size is not needed in all use cases, and is critical with large scale. Hence, allow user to configure the size of I/O EQs. For example, to reduce I/O EQ size to 64, execute: $ devlink resource set pci/0000:00:0b.0 path /io_eq_size/ size 64 $ devlink dev reload pci/0000:00:0b.0 Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>