summaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw
AgeCommit message (Collapse)Author
2013-11-08IB/mlx5: Simplify mlx5_ib_destroy_srqEli Cohen
Make use of destroy_srq_kernel() to clear SRQ resouces. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-11-08IB/mlx5: Fix overflow check in IB_WR_FAST_REG_MREli Cohen
Make sure not to overflow when reading the page list from struct ib_fast_reg_page_list. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-11-08IB/mlx5: Multithreaded create MREli Cohen
Use asynchronous commands to execute up to eight concurrent create MR commands. This is to fill memory caches faster so we keep consuming from there. Also, increase timeout for shrinking caches to five minutes. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-11-08IB/mlx5: Fix check of number of entries in create CQEli Cohen
Verify that the value is non negative before rounding up to power of 2. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-11-08IB/cxgb4: Fix formatting of physical addressBen Hutchings
Physical addresses may be wider than virtual addresses (e.g. on i386 with PAE) and must not be formatted with %p. Compile-tested only. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-11-07net/mlx4_core: Initialize all mailbox buffers to zero before useJack Morgenstein
To guarantee that all unused fields in all FW commands for both inboxes and outboxes are zeroed out, initialize the mailbox buffer to all zeroes. This is especially important for SRIOV comm-channel virtual commands (such as QUERY_FUNC_CAP), where if new fields are added to support new features, the driver can depend on older kernels passing zeroes in these fields. In addition to zeroing out the mailbox buffer at allocation time, all (now unnecessary) calls to memset by the callers of mlx4_alloc_cmd_mailbox() are removed. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04mlx4: Structures and init/teardown for VF resource quotasJack Morgenstein
This is step #1 for implementing SRIOV resource quotas for VFs. Quotas are implemented per resource type for VFs and the PF, to prevent any entity from simply grabbing all the resources for itself and leaving the other entities unable to obtain such resources. Resources which are allocated using quotas: QPs, CQs, SRQs, MPTs, MTTs, MAC, VLAN, and Counters. The quota system works as follows: Each entity (VF or PF) is given a max number of a given resource (its quota), and a guaranteed minimum number for each resource (starvation prevention). For QPs, CQs, SRQs, MPTs and MTTs: 50% of the available quantity for the resource is divided equally among the PF and all the active VFs (i.e., the number of VFs in the mlx4_core module parameter "num_vfs"). This 50% represents the "guaranteed minimum" pool. The other 50% is the "free pool", allocated on a first-come-first-serve basis. For each VF/PF, resources are first allocated from its "guaranteed-minimum" pool. When that pool is exhausted, the driver attempts to allocate from the resource "free-pool". The quota (i.e., max) for the VFs and the PF is: The free-pool amount (50% of the real max) + the guaranteed minimum For MACs: Guarantee 2 MACs per VF/PF per port. As a result, since we have only 128 MACs per port, reduce the allowable number of VFs from 64 to 63. Any remaining MACs are put into a free pool. For VLANs: For the PF, the per-port quota is 128 and guarantee is 64 (to allow the PF to register at least a VLAN per VF in VST mode). For the VFs, the per-port quota is 64 and the guarantee is 0. We assume that VGT VFs are trusted not to abuse the VLAN resource. For Counters: For all functions (PF and VFs), the quota is 128 and the guarantee is 0. In this patch, we define the needed structures, which are added to the resource-tracker struct. In addition, we do initialization for the resource quota, and adjust the query_device response to use quotas rather than resource maxima. As part of the implementation, we introduce a new field in mlx4_dev: quotas. This field holds the resource quotas used to report maxima to the upper layers (ib_core, via query_device). The HCA maxima of these values are passed to the VFs (via QUERY_HCA) so that they may continue to use these in handling QPs, CQs, SRQs and MPTs. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-21IB/core: Temporarily disable create_flow/destroy_flow uverbsYann Droneaud
The create_flow/destroy_flow uverbs and the associated extensions to the user-kernel verbs ABI are under review and are too experimental to freeze at this point. So userspace is not exposed to experimental features and an uinstable ABI, temporarily disable this for v3.12 (with a Kconfig option behind staging to reenable it if desired). The feature will be enabled after proper cleanup for v3.13. Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Link: http://marc.info/?i=cover.1381351016.git.ydroneaud@opteya.com Link: http://marc.info/?i=cover.1381177342.git.ydroneaud@opteya.com [ Add a Kconfig option to reenable these verbs. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-10-14Merge branch 'misc' into for-nextRoland Dreier
2013-10-14IB: Remove unnecessary semicolonsJoe Perches
These aren't necessary after switch blocks. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-10-10IB/mlx5: Ensure proper synchronization accessing memoryEli Cohen
Call mlx5_ib_populate_pas() before mapping the DMA buffer to ensure the hardware reads the values written by the CPU. Found by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-10-10IB/mlx5: Fix alignment of reg umr gather buffersEli Cohen
The hardware requires that gather buffers for UMR work requests be aligned to 2K. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-10-10IB/mlx5: Fix eq names to display nicely in /proc/interruptsSagi Grimberg
It's helpful for a driver to put the pci slot name in its interrupt names, so /proc/interrupts will show the pci slot of the device. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-10-10IB/mlx5: Fix opt param mask according to firmware specEli Cohen
Failed to configure opt mask to configure rre from init to rtr. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-10-10mlx5: Fix opt param mask for sq err to rts transitionEli Cohen
Add missing entry in the table for UC transport. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-10-10IB/mlx5: Disable atomic operationsEli Cohen
Currently Atomic operations don't work properly. Disable them for the time being. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-10-10IB/mlx5: Avoid async events on invalid port numberEli Cohen
On a single ported Connect-IB, its possible for the firmware to issue events on the non-existing 2nd port. Make sure to ignore events generated for such ports. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-10-10IB/mlx5: Decrease memory consumption of mr cachesEli Cohen
Change the logic so we do not allocate memory nor map the device before actually posting to the REG_UMR QP. In addition, unmap and free the memory after we get completion. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-10-10IB/mlx5: Fix memory leak in mlx5_ib_create_srqMoshe Lazer
The patch fixes the rollback in case of failure in creating SRQ. Signed-off-by: Moshe Lazer <moshel@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-10-10IB/mlx5: Flush cache workqueue before destroying itMoshe Lazer
Destroying the workqueue without flushing it first can lead to a case in which the kernel tries to push a delayed work to the workqueue which does not exist anymore. Signed-off-by: Moshe Lazer <moshel@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-10-10IB/mlx5: Fix send work queue size calculationEli Cohen
1. Make sure wqe_cnt does not exceed the limit published by firmware. 2. There is no requirement that the number of outstanding work requests will be a power of two. Remove the ilog2 in the calculation of sq.max_post to fix that. 3. Add case for IB_QPT_XRC_TGT in sq_overhead and return 0 as XRC target QPs do not have a send queue. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-10-04IB/qib: Drop qib_tune_pcie_caps() and qib_tune_pcie_coalesce() return valuesBjorn Helgaas
The callers of qib_tune_pcie_caps() and qib_tune_pcie_coalesce() don't check the return values, so this patch drops the return values altogether. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
2013-09-24IB/qib: Use pcie_set_mps() and pcie_get_mps() to simplify codeYijing Wang
Refactor qib_tune_pcie_caps(). Use pcie_get_mps(), pcie_set_mps(), pcie_get_readrq(), and pcie_set_readrq() to simplify the code. The PCI core caches the "PCIe Max Payload Size Supported" in pci_dev->pcie_mpss, so use that instead of pcie_capability_read_word(). Remove the unused val2fld() and fld2val(). Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
2013-09-24IB/qib: Use pci_is_root_bus() to check whether it is a root busYijing Wang
Use pci_is_root_bus() instead of "if (bus->parent)" statement for better readability. Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
2013-09-13Remove GENERIC_HARDIRQ config optionMartin Schwidefsky
After the last architecture switched to generic hard irqs the config options HAVE_GENERIC_HARDIRQS & GENERIC_HARDIRQS and the related code for !CONFIG_GENERIC_HARDIRQS can be removed. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-09-06Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree from Jiri Kosina: "The usual trivial updates all over the tree -- mostly typo fixes and documentation updates" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (52 commits) doc: Documentation/cputopology.txt fix typo treewide: Convert retrun typos to return Fix comment typo for init_cma_reserved_pageblock Documentation/trace: Correcting and extending tracepoint documentation mm/hotplug: fix a typo in Documentation/memory-hotplug.txt power: Documentation: Update s2ram link doc: fix a typo in Documentation/00-INDEX Documentation/printk-formats.txt: No casts needed for u64/s64 doc: Fix typo "is is" in Documentations treewide: Fix printks with 0x%# zram: doc fixes Documentation/kmemcheck: update kmemcheck documentation doc: documentation/hwspinlock.txt fix typo PM / Hibernate: add section for resume options doc: filesystems : Fix typo in Documentations/filesystems scsi/megaraid fixed several typos in comments ppc: init_32: Fix error typo "CONFIG_START_KERNEL" treewide: Add __GFP_NOWARN to k.alloc calls with v.alloc fallbacks page_isolation: Fix a comment typo in test_pages_isolated() doc: fix a typo about irq affinity ...
2013-09-03Merge branches 'cxgb4', 'flowsteer', 'ipoib', 'iser', 'mlx4', 'ocrdma' and ↵Roland Dreier
'qib' into for-next
2013-09-03RDMA/ocrdma: Fix compiler warning about int/pointer size mismatchRoland Dreier
Fix: drivers/infiniband/hw/ocrdma/ocrdma_verbs.c: In function 'ocrdma_build_fr': >> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c:1832:7: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] mr = (struct ocrdma_mr *)qp->dev->stag_arr[(hdr->lkey >> 8) & ^ drivers/infiniband/hw/ocrdma/ocrdma_verbs.c: In function 'ocrdma_alloc_frmr': >> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c:2661:64: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] dev->stag_arr[(mr->hwmr.lkey >> 8) & (OCRDMA_MAX_STAG - 1)] = (u64) mr; Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02IB/qib: Move COUNTER_MASK definition within qib_mad.h header guardsIra Weiny
Commit 36a8f01cd24b ("IB/qib: Add congestion control agent implementation") caused statements to leak pass the header guard. Fix this. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02RDMA/ocrdma: Fix passing wrong opcode to modify_srqNaresh Gottumukkala
Fix passing wrong opcode to ocrdma_modify_srq and query SRQ. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02RDMA/ocrdma: Fill PVID in UMC caseNaresh Gottumukkala
In UMC case, driver needs to fill PVID in the address vector template for UD traffic. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02RDMA/ocrdma: Add ABI versioning supportNaresh Gottumukkala
Add ABI versioning support between driver and userspace library. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02RDMA/ocrdma: Consider multiple SGES in case of DPPNaresh Gottumukkala
While posting inline DPP data, we are not considering multiple sges. Fix this. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02RDMA/ocrdma: Fix for displaying proper link speedNaresh Gottumukkala
Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02RDMA/ocrdma: Increase STAG array sizeNaresh Gottumukkala
1) Increase STAG Array size. 2) Max inline data size should be set to the same value used during QP creation 3) Set max_sge_rd to zero since we dont support RD transport in our adapters. 4) Max cqes reported in ibv_devinfo should be from QUERY_CONFIG. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02RDMA/ocrdma: Dont use PD 0 for userpace CQ DBNaresh Gottumukkala
Create_CQ verb doesn't provide a PD pointer. So, until now we are creating all (both userspace and kernel) CQ DB regions from PD0. This will result in mmapping PD0 to applications. A rogue userspace application can mess things up. Also more serious issues is even the be2net NIC uses PD0. This patch addresses this problem by: 1) Create a PD page for every userspace application when the alloc_ucontext is called. This will be destroyed in dealloc_ucontext. 2) All CQs for that context will use the PD allocated in ucontext. 3) The first create_PD call from application will result in returning the PD address from its ucontext (no new PD will be created). 4) For subsecquent create_pd calls from application, we create new PDs for the application. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02RDMA/ocrdma: FRMA code cleanupNaresh Gottumukkala
1) Fixed setting FR_MR bit for FRWR stag allocation 2) Access rights are passsed during FRWR stage and not during STAT allocation stage 3) FRWR WQE structure cleanup 4) Add QP level signaled bit. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02RDMA/ocrdma: For ERX2 irrespective of Qid, num_posted offset is 24Naresh Gottumukkala
1) All RQ doorbells are handled by ERX2 and doorbell->num_posted offset is constant to bit offset 24 for ERX2 irrspective of Q id. 2) Fixed RESET to INIT state change (from ERR->RST->INIT->RTR case). Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02RDMA/ocrdma: Fix to work with even a single MSI-X vectorNaresh Gottumukkala
There are cases like SRIOV where can get only one MSI-X vector allocated for RoCE. In that case we need to use the vector for both data plane and control plane. We need to use EQ create version V2. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02RDMA/ocrdma: Remove the MTU check based on Ethernet MTUNaresh Gottumukkala
Also increase MAX AH to 512. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02RDMA/ocrdma: Add support for fast register work requests (FRWR)Naresh Gottumukkala
Also get the max_srq value from query_config mailbox response. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02RDMA/ocrdma: Create IRD queue fixNaresh Gottumukkala
1) Fix ocrdma_get_num_posted_shift for upto 128 QPs. 2) Create for min of dev->max_wqe and requested wqe in create_qp. 3) As part of creating ird queue, populate with basic header templates. 4) Make sure all the DB memory allocated to userspace are page aligned. 5) Fix issue in checking the mmap local cache. 6) Some code cleanup. Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-28IB/mlx4: Add receive flow steering supportHadar Hen Zion
Implement ib_create_flow() and ib_destroy_flow(). Translate the verbs structures provided by the user to HW structures and call the MLX4_QP_FLOW_STEERING_ATTACH/DETACH firmware commands. On the ATTACH command completion, the firmware provides a 64-bit registration ID, which is placed into struct mlx4_ib_flow that wraps the instance of struct ib_flow which is retuned to caller. Later, this reg ID is used for detaching that flow from the firmware. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-20treewide: Add __GFP_NOWARN to k.alloc calls with v.alloc fallbacksJoe Perches
Don't emit OOM warnings when k.alloc calls fail when there there is a v.alloc immediately afterwards. Converted a kmalloc/vmalloc with memset to kzalloc/vzalloc. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-08-13RDMA/cxgb4: Issue RI.FINI before closing when entering TERMSteve Wise
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13RDMA/cxgb4: Advertise ~0ULL as max MR sizeSteve Wise
Lustre uses a advertised max MR size of ~0ULL to indicate it should use a dma_mr. Hence advertise max MR size as ~0ULL. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13RDMA/cxgb4: Always do GTS write if cidx_inc == CIDXINC_MASKSteve Wise
When polling, we do a GTS update if the accumulated cidx_inc == the CQ depth / 16. However, if the CQ is large enough, Cq depth / 16 exceeds the size of the field in the GTS word. So we also need to update if cidx_inc hits CIDXINC_MASK to avoid overflowing the field. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13RDMA/cxgb4: Set arp error handler for PASS_ACCEPT_RPL messagesSteve Wise
accept_cr() failed to set the arp error handler on a reused skb. This results in a kernel crash if the arp does indeed time out. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13RDMA/cxgb4: Fix accounting for unsignaled SQ WRs to deal with wrapSteve Wise
When determining how many WRs are completed with a signaled CQE, correctly deal with queue wraps. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-13RDMA/cxgb4: Fix QP flush logicSteve Wise
This patch makes following fixes in QP flush logic: - correctly flushes unsignaled WRs followed by a signaled WR - supports for flushing a CQ bound to multiple QPs - resets cidx_flush if a active queue starts getting HW CQEs again - marks WQ in error when we leave RTS. This was only being done for user queues, but we need it for kernel queues too so that post_send/post_recv will start returning the appropriate error synchronously - eats unsignaled read resp CQEs. HW always inserts CQEs so we must silently discard them if the read work request was unsignaled. - handles QP flushes with pending SW CQEs. The flush and out of order completion logic has a bug where if out of order completions are flushed but not yet polled by the consumer and the qp is then flushed then we end up inserting duplicate completions. - c4iw_flush_sq() should only flush wrs that have not already been flushed. Since we already track where in the SQ we've flushed via sq.cidx_flush, just start at that point and flush any remaining. This bug only caused a problem in the presence of unsignaled work requests. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> [ Fixed sparse warning due to htonl/ntohl confusion. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>