summaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw
AgeCommit message (Collapse)Author
2012-09-30IB/mlx4: SR-IOV multiplex and demultiplex MADsJack Morgenstein
Special QPs are paravirtualized. vHCAs are not given direct access to QP0/1. Rather, these QPs are operated by a special context hosted by the PF, which mediates access to/from vHCAs. This is done by opening a "tunnel" per vHCA port per QP0/1. A tunnel comprises a pair of UD QPs: a "Tunnel QP" in the PF-context and a "Proxy QP" in the vHCA. All vHCA MAD traffic must pass through the corresponding tunnel. vHCA QPs cannot be assigned to VL15 and are denied of the well-known QKey. Outgoing messages are "de-multiplexed" (i.e., directed to the wire via the real special QP). Incoming messages are "multiplexed" (i.e. steered by the PPF to the correct VF or to the PF) QP0 access is restricted to the PF vHCA. VF vHCAs also have (virtual) QP0s, but they never receive any SMPs and all SMPs sent are discarded. QP1 traffic is allowed for all vHCAs, but special care is required to bridge the gap between the host and network views. Specifically: - Transaction IDs are mapped to guarantee uniqueness among vHCAs - CM para-virtualization o Incoming requests are steered to the correct vHCA according to the embedded GID o Local communication IDs are mapped to ensure uniqueness among vHCAs (see the patch that adds CM paravirtualization.) - Multicast para-virtualization o The PF context aggregates membership state from all vHCAs o The SA is contacted only when the aggregate membership changes o If the aggregate does not change, the PF context will provide the requesting vHCA with the proper response. (see the patch that adds multicast group paravirtualization) Incoming MADs are steered according to: - the DGID If a GRH is present - the mapped transaction ID for response MADs - the embedded GID in CM requests - the remote communication ID in other CM messages Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-09-30mlx4: Implement QP paravirtualization and maintain phys_pkey_cache for smp_snoopJack Morgenstein
This requires: 1. Replacing the paravirtualized P_Key index (inserted by the guest) with the real P_Key index. 2. For UD QPs, placing the guest's true source GID index in the address path structure mgid field, and setting the ud_force_mgid bit so that the mgid is taken from the QP context and not from the WQE when posting sends. 3. For UC and RC QPs, placing the guest's true source GID index in the address path structure mgid field. 4. For tunnel and proxy QPs, setting the Q_Key value reserved for that proxy/tunnel pair. Since not all the above adjustments occur in all the QP transitions, the QP transitions require separate wrapper functions. Secondly, initialize the P_Key virtualization table to its default values: Master virtualized table is 1-1 with the real P_Key table, guest virtualized table has P_Key index 0 mapped to the real P_Key index 0, and all the other P_Key indices mapped to the reserved (invalid) P_Key at index 127. Finally, add logic in smp_snoop for maintaining the phys_P_Key_cache. and generating events on the master only if a P_Key actually changed. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-09-30IB/mlx4: Initialize SR-IOV IB support for slaves in master contextJack Morgenstein
Allocate SR-IOV paravirtualization resources and MAD demuxing contexts on the master. This has two parts. The first part is to initialize the structures to contain the contexts. This is done at master startup time in mlx4_ib_init_sriov(). The second part is to actually create the tunneling resources required on the master to support a slave. This is performed the master detects that a slave has started up (MLX4_DEV_EVENT_SLAVE_INIT event generated when a slave initializes its comm channel). For the master, there is no such startup event, so it creates its own tunneling resources when it starts up. In addition, the master also creates the real special QPs. The ib_core layer on the master causes creation of proxy special QPs, since the master is also paravirtualized at the ib_core layer. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-09-30IB/mlx4: SR-IOV IB context objects and proxy/tunnel SQP supportJack Morgenstein
1. Introduce the basic SR-IOV parvirtualization context objects for multiplexing and demultiplexing MADs. 2. Introduce support for the new proxy and tunnel QP types. This patch introduces the objects required by the master for managing QP paravirtualization for guests. struct mlx4_ib_sriov is created by the master only. It is a container for the following: 1. All the info required by the PPF to multiplex and de-multiplex MADs (including those from the PF). (struct mlx4_ib_demux_ctx demux) 2. All the info required to manage alias GUIDs (i.e., the GUID at index 0 that each guest perceives. In fact, this is not the GUID which is actually at index 0, but is, in fact, the GUID which is at index[<VF number>] in the physical table. 3. structures which are used to manage CM paravirtualization 4. structures for managing the real special QPs when running in SR-IOV mode. The real SQPs are controlled by the PPF in this case. All SQPs created and controlled by the ib core layer are proxy SQP. struct mlx4_ib_demux_ctx contains the information per port needed to manage paravirtualization: 1. All multicast paravirt info 2. All tunnel-qp paravirt info for the port. 3. GUID-table and GUID-prefix for the port 4. work queues. struct mlx4_ib_demux_pv_ctx contains all the info for managing the paravirtualized QPs for one slave/port. struct mlx4_ib_demux_pv_qp contains the info need to run an individual QP (either tunnel qp or real SQP). Note: We made use of the 2 most significant bits in enum mlx4_ib_qp_flags (based on enum ib_qp_create_flags in ib_verbs.h). We need these bits in the low-level driver for internal purposes. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-09-30IB/mlx4: Fill in sq_sig_type in query QPDotan Barak
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-09-30RDMA/cxgb4: Fix error handling in create_qp()Emil Goode
The variable ret is assigned return values in a couple of places, but its value is never returned. This patch makes use of the ret variable so that the caller get correct error codes returned. The following changes are also introduced: - The alloc_oc_sq function can return -ENOSYS or -ENOMEM so we want to get the return value from it. - Change the label names to improve readability. Signed-off-by: Emil Goode <emilgoode@gmail.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-09-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/team/team.c drivers/net/usb/qmi_wwan.c net/batman-adv/bat_iv_ogm.c net/ipv4/fib_frontend.c net/ipv4/route.c net/l2tp/l2tp_netlink.c The team, fib_frontend, route, and l2tp_netlink conflicts were simply overlapping changes. qmi_wwan and bat_iv_ogm were of the "use HEAD" variety. With help from Antonio Quartulli. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-25Merge remote-tracking branch 'tip/core/rcu' into next.2012.09.25bPaul E. McKenney
Resolved conflict in kernel/sched/core.c using Peter Zijlstra's approach from https://lkml.org/lkml/2012/9/5/585.
2012-09-25Merge remote-tracking branch 'tip/smp/hotplug' into next.2012.09.25bPaul E. McKenney
The conflicts between kernel/rcutree.h and kernel/rcutree_plugin.h were due to adjacent insertions and deletions, which were resolved by simply accepting the changes on both branches.
2012-09-21userns: Convert ipathfs to use GLOBAL_ROOT_UID and GLOBAL_ROOT_GIDEric W. Biederman
Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-09-17Merge remote-tracking branch 'pci/pci/gavin-window-alignment' into nextBenjamin Herrenschmidt
Merge Gavin patches from the PCI tree as subsequent powerpc patches are going to depend on them Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-14Merge branches 'cxgb4', 'ipoib', 'mlx4', 'ocrdma' and 'qib' into for-nextRoland Dreier
2012-09-14IB/qib: Fix failure of compliance test C14-024#06_LocalPortNumMike Marciniszyn
Commit 3236b2d469db ("IB/qib: MADs with misset M_Keys should return failure") introduced a return code assignment that unfortunately introduced an unconditional exit for the routine due to missing braces. This patch adds the braces to correct the original patch. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-09-14RDMA/ocrdma: Fix CQE expansion of unsignaled WQEParav Pandit
Fix CQE expansion of unsignaled WQE -- don't expand the CQE when the WQE index of the completed CQE matches with last pending WQE (tail) in the queue. Signed-off-by: Parav Pandit <parav.pandit@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-09-13Merge commit 'v3.6-rc5' into pci/gavin-window-alignmentBjorn Helgaas
* commit 'v3.6-rc5': (1098 commits) Linux 3.6-rc5 HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured Remove user-triggerable BUG from mpol_to_str xen/pciback: Fix proper FLR steps. uml: fix compile error in deliver_alarm() dj: memory scribble in logi_dj Fix order of arguments to compat_put_time[spec|val] xen: Use correct masking in xen_swiotlb_alloc_coherent. xen: fix logical error in tlb flushing xen/p2m: Fix one-off error in checking the P2M tree directory. powerpc: Don't use __put_user() in patch_instruction powerpc: Make sure IPI handlers see data written by IPI senders powerpc: Restore correct DSCR in context switch powerpc: Fix DSCR inheritance in copy_thread() powerpc: Keep thread.dscr and thread.dscr_inherit in sync powerpc: Update DSCR on all CPUs when writing sysfs dscr_default powerpc/powernv: Always go into nap mode when CPU is offline powerpc: Give hypervisor decrementer interrupts their own handler powerpc/vphn: Fix arch_update_cpu_topology() return value ARM: gemini: fix the gemini build ... Conflicts: drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c drivers/rapidio/devices/tsi721.c
2012-09-13Merge commit 'v3.6-rc5' into nextBjorn Helgaas
* commit 'v3.6-rc5': (1098 commits) Linux 3.6-rc5 HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured Remove user-triggerable BUG from mpol_to_str xen/pciback: Fix proper FLR steps. uml: fix compile error in deliver_alarm() dj: memory scribble in logi_dj Fix order of arguments to compat_put_time[spec|val] xen: Use correct masking in xen_swiotlb_alloc_coherent. xen: fix logical error in tlb flushing xen/p2m: Fix one-off error in checking the P2M tree directory. powerpc: Don't use __put_user() in patch_instruction powerpc: Make sure IPI handlers see data written by IPI senders powerpc: Restore correct DSCR in context switch powerpc: Fix DSCR inheritance in copy_thread() powerpc: Keep thread.dscr and thread.dscr_inherit in sync powerpc: Update DSCR on all CPUs when writing sysfs dscr_default powerpc/powernv: Always go into nap mode when CPU is offline powerpc: Give hypervisor decrementer interrupts their own handler powerpc/vphn: Fix arch_update_cpu_topology() return value ARM: gemini: fix the gemini build ... Conflicts: drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c drivers/rapidio/devices/tsi721.c
2012-09-12Merge branch 'pci/stephen-const' into nextBjorn Helgaas
* pci/stephen-const: make drivers with pci error handlers const scsi: make pci error handlers const netdev: make pci_error_handlers const PCI: Make pci_error_handlers const
2012-09-07RDMA/cxgb4: Move dereference below NULL testWei Yongjun
spatch with a semantic match is used to found this. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-09-07make drivers with pci error handlers constStephen Hemminger
Covers the rest of the uses of pci error handler. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-09-07Merge branch 'merge' into nextBenjamin Herrenschmidt
Brings in various bug fixes from 3.6-rcX
2012-09-05RDMA/cxgb4: Update RDMA/cxgb4 due to macro definition removal in cxgb4 driverVipul Pandya
cxgb4 driver has duplicate definitions of registers which will be removed. This patch updates the RDMA/cxgb4 driver accordingly. Signed-off-by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Reviewed-by: Sivakumar Subramani <sivasu@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-05IB/ehca: Remove uses of virt_to_abs() and abs_to_virt()Michael Ellerman
abs_to_virt() simply calls __va() and we'd like to get rid of it, so replace all abs_to_virt() uses with __va(). __va() returns void *, so when assigning to a pointer there's no need to cast. Similarly virt_to_abs() just calls __pa(). Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-05IB/ehca: Don't use phys_to_abs(), it's a nopMichael Ellerman
Since commit f5339277 phys_to_abs() is 100% a nop, so just remove it. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-08-23IB/qib: Use PCI Express Capability accessorsJiang Liu
Use PCI Express Capability access functions to simplify qib driver. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
2012-08-23IB/mthca: Use PCI Express Capability accessorsJiang Liu
Use PCI Express Capability access functions to simplify mthca driver. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Roland Dreier <roland@purestorage.com>
2012-08-20infiniband: ehca: Fix compiler warningsPaul E. McKenney
Fix comp_task() to return void to match smp_hotplug_thread's thread_fn member, and adjust indirection on the ->cpu_comp_threads per-CPU member of the ehca_comp_pool structure. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
2012-08-20infiniband: ehca: Fix while->do-while conversion typoPaul E. McKenney
This commit just adds a needed semicolon. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
2012-08-16Merge branches 'cma', 'ipoib', 'misc', 'mlx4', 'ocrdma', 'qib' and 'srp' ↵Roland Dreier
into for-next
2012-08-16IB/mlx4: Check iboe netdev pointer before dereferencing itKleber Sacilotto de Souza
Unlike other parts of the mlx4_ib code, the function build_mlx_header() doesn't check if the iboe netdev of the given port is valid before dereferencing it, which can cause a crash if the ethernet interface has already been taken down. Fix this by checking for a valid netdev pointer before using it to get the port MAC address. Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-08-15IB/qib: Fix error return code in qib_init_7322_variables()Julia Lawall
Convert a 0 error return code to a negative one, as returned elsewhere in the function. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ identifier ret; expression e,e1,e2,e3,e4,x; @@ ( if (\(ret != 0\|ret < 0\) || ...) { ... return ...; } | ret = 0 ) ... when != ret = e1 *x = \(kmalloc\|kzalloc\|kcalloc\|devm_kzalloc\|ioremap\|ioremap_nocache\|devm_ioremap\|devm_ioremap_nocache\)(...); ... when != x = e2 when != ret = e3 *if (x == NULL || ...) { ... when != ret = e4 * return ret; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-08-15IB: Fix typos in infiniband driversMasanari Iida
Correct spelling typos in comments in drivers/infiniband. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-08-13workqueue: use mod_delayed_work() instead of cancel + queueTejun Heo
Convert delayed_work users doing cancel_delayed_work() followed by queue_delayed_work() to mod_delayed_work(). Most conversions are straight-forward. Ones worth mentioning are, * drivers/edac/edac_mc.c: edac_mc_workq_setup() converted to always use mod_delayed_work() and cancel loop in edac_mc_reset_delay_period() is dropped. * drivers/platform/x86/thinkpad_acpi.c: No need to remember whether watchdog is active or not. @fan_watchdog_active and related code dropped. * drivers/power/charger-manager.c: Seemingly a lot of delayed_work_pending() abuse going on here. [delayed_]work_pending() are unsynchronized and racy when used like this. I converted one instance in fullbatt_handler(). Please conver the rest so that it invokes workqueue APIs for the intended target state rather than trying to game work item pending state transitions. e.g. if timer should be modified - call mod_delayed_work(), canceled - call cancel_delayed_work[_sync](). * drivers/thermal/thermal_sys.c: thermal_zone_device_set_polling() simplified. Note that round_jiffies() calls in this function are meaningless. round_jiffies() work on absolute jiffies not delta delay used by delayed_work. v2: Tomi pointed out that __cancel_delayed_work() users can't be safely converted to mod_delayed_work(). They could be calling it from irq context and if that happens while delayed_work_timer_fn() is running, it could deadlock. __cancel_delayed_work() users are dropped. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Anton Vorontsov <cbouatmailru@gmail.com> Acked-by: David Howells <dhowells@redhat.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Doug Thompson <dougthompson@xmission.com> Cc: David Airlie <airlied@linux.ie> Cc: Roland Dreier <roland@kernel.org> Cc: "John W. Linville" <linville@tuxdriver.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Len Brown <len.brown@intel.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Johannes Berg <johannes@sipsolutions.net>
2012-08-13infiniband: Ehca: Use hotplug thread infrastructureThomas Gleixner
Get rid of the hotplug notifiers and use the generic hotplug thread infrastructure. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20120716103948.775527032@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-08-10RDMA/ocrdma: Don't call vlan_dev_real_dev() for non-VLAN netdevsRoland Dreier
If CONFIG_VLAN_8021Q is not set, then vlan_dev_real_dev() just goes BUG(), so we shouldn't call it unless we're actually dealing with a VLAN netdev. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-08-10IB/mlx4: Fix possible deadlock on sm_lock spinlockJack Morgenstein
The sm_lock spinlock is taken in the process context by mlx4_ib_modify_device, and in the interrupt context by update_sm_ah, so we need to take that spinlock with irqsave, and release it with irqrestore. Lockdeps reports this as follows: [ INFO: inconsistent lock state ] 3.5.0+ #20 Not tainted inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage. swapper/0/0 [HC1[1]:SC0[0]:HE0:SE1] takes: (&(&ibdev->sm_lock)->rlock){?.+...}, at: [<ffffffffa028af1d>] update_sm_ah+0xad/0x100 [mlx4_ib] {HARDIRQ-ON-W} state was registered at: [<ffffffff810b84a0>] mark_irqflags+0x120/0x190 [<ffffffff810b9ce7>] __lock_acquire+0x307/0x4c0 [<ffffffff810b9f51>] lock_acquire+0xb1/0x150 [<ffffffff815523b1>] _raw_spin_lock+0x41/0x50 [<ffffffffa028d563>] mlx4_ib_modify_device+0x63/0x240 [mlx4_ib] [<ffffffffa026d1fc>] ib_modify_device+0x1c/0x20 [ib_core] [<ffffffffa026c353>] set_node_desc+0x83/0xc0 [ib_core] [<ffffffff8136a150>] dev_attr_store+0x20/0x30 [<ffffffff81201fd6>] sysfs_write_file+0xe6/0x170 [<ffffffff8118da38>] vfs_write+0xc8/0x190 [<ffffffff8118dc01>] sys_write+0x51/0x90 [<ffffffff8155b869>] system_call_fastpath+0x16/0x1b ... *** DEADLOCK *** 1 lock held by swapper/0/0: stack backtrace: Pid: 0, comm: swapper/0 Not tainted 3.5.0+ #20 Call Trace: <IRQ> [<ffffffff810b7bea>] print_usage_bug+0x18a/0x190 [<ffffffff810b7370>] ? print_irq_inversion_bug+0x210/0x210 [<ffffffff810b7fb2>] mark_lock_irq+0xf2/0x280 [<ffffffff810b8290>] mark_lock+0x150/0x240 [<ffffffff810b84ef>] mark_irqflags+0x16f/0x190 [<ffffffff810b9ce7>] __lock_acquire+0x307/0x4c0 [<ffffffffa028af1d>] ? update_sm_ah+0xad/0x100 [mlx4_ib] [<ffffffff810b9f51>] lock_acquire+0xb1/0x150 [<ffffffffa028af1d>] ? update_sm_ah+0xad/0x100 [mlx4_ib] [<ffffffff815523b1>] _raw_spin_lock+0x41/0x50 [<ffffffffa028af1d>] ? update_sm_ah+0xad/0x100 [mlx4_ib] [<ffffffffa026b2fa>] ? ib_create_ah+0x1a/0x40 [ib_core] [<ffffffffa028af1d>] update_sm_ah+0xad/0x100 [mlx4_ib] [<ffffffff810c27c3>] ? is_module_address+0x23/0x30 [<ffffffffa028b05b>] handle_port_mgmt_change_event+0xeb/0x150 [mlx4_ib] [<ffffffffa028c177>] mlx4_ib_event+0x117/0x160 [mlx4_ib] [<ffffffff81552501>] ? _raw_spin_lock_irqsave+0x61/0x70 [<ffffffffa022718c>] mlx4_dispatch_event+0x6c/0x90 [mlx4_core] [<ffffffffa0221b40>] mlx4_eq_int+0x500/0x950 [mlx4_core] Reported by: Or Gerlitz <ogerlitz@mellanox.com> Tested-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-30Merge branches 'cma', 'ipoib', 'ocrdma' and 'qib' into for-nextRoland Dreier
2012-07-29IB/qib: Fix size of cc_supported_table_entriesMike Marciniszyn
Commit 36a8f01cd24b ("IB/qib: Add congestion control agent implementation") tries to store the value 1984 in a u8, which leads to truncation. Fix this by making the member big enough. This bug was detected by a smatch warning. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Ramkrishna Vepa <ramkrishna.vepa@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-27RDMA/ocrdma: Fix check of GSI CQsRoland Dreier
It looks like one check was accidentally duplicated, and the other 3 checks were left out. This was detected by scripts/coccinelle/tests/doubletest.cocci: drivers/infiniband/hw/ocrdma/ocrdma_verbs.c:895:6-54: duplicated argument to && or || Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-24Merge tag 'rdma-for-3.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull InfiniBand/RDMA changes from Roland Dreier: - Updates to the qib low-level driver - First chunk of changes for SR-IOV support for mlx4 IB - RDMA CM support for IPv6-only binding - Other misc cleanups and fixes Fix up some add-add conflicts in include/linux/mlx4/device.h and drivers/net/ethernet/mellanox/mlx4/main.c * tag 'rdma-for-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (30 commits) IB/qib: checkpatch fixes IB/qib: Add congestion control agent implementation IB/qib: Reduce sdma_lock contention IB/qib: Fix an incorrect log message IB/qib: Fix QP RCU sparse warnings mlx4: Put physical GID and P_Key table sizes in mlx4_phys_caps struct and paravirtualize them mlx4_core: Allow guests to have IB ports mlx4_core: Implement mechanism for reserved Q_Keys net/mlx4_core: Free ICM table in case of error IB/cm: Destroy idr as part of the module init error flow mlx4_core: Remove double function declarations IB/mlx4: Fill the masked_atomic_cap attribute in query device IB/mthca: Fill in sq_sig_type in query QP IB/mthca: Warning about event for non-existent QPs should show event type IB/qib: Fix sparse RCU warnings in qib_keys.c net/mlx4_core: Initialize IB port capabilities for all slaves mlx4: Use port management change event instead of smp_snoop IB/qib: RCU locking for MR validation IB/qib: Avoid returning EBUSY from MR deregister IB/qib: Fix UC MR refs for immediate operations ...
2012-07-22Merge branches 'cma', 'cxgb4', 'misc', 'mlx4-sriov', 'mlx-cleanups', ↵Roland Dreier
'ocrdma' and 'qib' into for-linus
2012-07-19IB/qib: checkpatch fixesMike Marciniszyn
Elminate some simple_strto* usage. checkpatch also noted pr_ conversations, which have been done as recommended. The pr_fmt() define is used to shorten line length. Other multi-line string warnings are also elmininated. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-19IB/qib: Add congestion control agent implementationMike Marciniszyn
Add a congestion control agent in the driver that handles gets and sets from the congestion control manager in the fabric for the Performance Scale Messaging (PSM) library. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-19IB/qib: Reduce sdma_lock contentionMike Marciniszyn
Profiling has shown that sdma_lock is proving a bottleneck for performance. The situations include: - RDMA reads when krcvqs > 1 - post sends from multiple threads For RDMA read the current global qib_wq mechanism runs on all CPUs and contends for the sdma_lock when multiple RMDA read requests are fielded on differenct CPUs. For post sends, the direct call to qib_do_send() from multiple threads causes the contention. Since the sdma mechanism is per port, this fix converts the existing workqueue to a per port single thread workqueue to reduce the lock contention in the RDMA read case, and for any other case where the QP is scheduled via the workqueue mechanism from more than 1 CPU. For the post send case, This patch modifies the post send code to test for a non empty sdma engine. If the sdma is not idle the (now single thread) workqueue will be used to trigger the send engine instead of the direct call to qib_do_send(). Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-19IB/qib: Fix an incorrect log messageBetty Dall
There is a cut-and-paste typo in the function qib_pci_slot_reset() where it prints that the "link_reset" function is called rather than the "slot_reset" function. This makes the message misleading. Signed-off-by: Betty Dall <betty.dall@hp.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-19{NET,IB}/mlx4: Add rmap support to mlx4_assign_eqAmir Vadai
Enable callers of mlx4_assign_eq to supply a pointer to cpu_rmap. If supplied, the assigned IRQ is tracked using rmap infrastructure. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-17IB/qib: Fix QP RCU sparse warningsMike Marciniszyn
Commit af061a644a0e ("IB/qib: Use RCU for qpn lookup") introduced sparse warnings. This patch corrects those issues. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-11mlx4: Put physical GID and P_Key table sizes in mlx4_phys_caps struct and ↵Jack Morgenstein
paravirtualize them To allow easy paravirtualization of P_Key and GID table sizes, keep paravirtualized sizes in mlx4_dev->caps, but save the actual physical sizes from FW in struct: mlx4_dev->phys_cap. In addition, in SR-IOV mode, do the following: 1. Reduce reported P_Key table size by 1. This is done to reserve the highest P_Key index for internal use, for declaring an invalid P_Key in P_Key paravirtualization. We require a P_Key index which always contain an invalid P_Key value for this purpose (i.e., one which cannot be modified by the subnet manager). The way to do this is to reduce the P_Key table size reported to the subnet manager by 1, so that it will not attempt to access the P_Key at index #127. 2. Paravirtualize the GID table size to 1. Thus, each guest sees only a single GID (at its paravirtualized index 0). In addition, since we are paravirtualizing the GID table size to 1, we add paravirtualization of the master GID event here (i.e., we do not do ib_dispatch_event() for the GUID change event on the master, since its (only) GUID never changes). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-11IB/mlx4: Fill the masked_atomic_cap attribute in query deviceDotan Barak
When the user queries for device capabilities, fill in the masked_atomic_cap attribute with the real support level of atomic capabilities instead of using a hard coded value. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-11IB/mthca: Fill in sq_sig_type in query QPDotan Barak
The query QP code was didn't fill that attribute, do that. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-11IB/mthca: Warning about event for non-existent QPs should show event typeDotan Barak
Events received for non-existent QPs should generate a warning that includes the event type that was received. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>