diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2016-03-22 15:48:44 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2016-03-22 15:48:44 -0700 |
commit | b8ba4526832fcccba7f46e55ce9a8b79902bdcec (patch) | |
tree | 5f2fc306e9909c9936efc017bf2c8fde49d8c9bb /drivers/infiniband/sw/rdmavt/mcast.c | |
parent | 01cde1538e1dff4254e340f606177a870131a01f (diff) | |
parent | 520a07bff6fbb23cac905007d74c67058b189acb (diff) |
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull more rdma updates from Doug Ledford:
"Round two of 4.6 merge window patches.
This is a monster pull request. I held off on the hfi1 driver updates
(the hfi1 driver is intimately tied to the qib driver and the new
rdmavt software library that was created to help both of them) in my
first pull request. The hfi1/qib/rdmavt update is probably 90% of
this pull request. The hfi1 driver is being left in staging so that
it can be fixed up in regards to the API that Al and yourself didn't
like. Intel has agreed to do the work, but in the meantime, this
clears out 300+ patches in the backlog queue and brings my tree and
their tree closer to sync.
This also includes about 10 patches to the core and a few to mlx5 to
create an infrastructure for configuring SRIOV ports on IB devices.
That series includes one patch to the net core that we sent to netdev@
and Dave Miller with each of the three revisions to the series. We
didn't get any response to the patch, so we took that as implicit
approval.
Finally, this series includes Intel's new iWARP driver for their x722
cards. It's not nearly the beast as the hfi1 driver. It also has a
linux-next merge issue, but that has been resolved and it now passes
just fine.
Summary:
- A few minor core fixups needed for the next patch series
- The IB SRIOV series. This has bounced around for several versions.
Of note is the fact that the first patch in this series effects the
net core. It was directed to netdev and DaveM for each iteration
of the series (three versions total). Dave did not object, but did
not respond either. I've taken this as permission to move forward
with the series.
- The new Intel X722 iWARP driver
- A huge set of updates to the Intel hfi1 driver. Of particular
interest here is that we have left the driver in staging since it
still has an API that people object to. Intel is working on a fix,
but getting these patches in now helps keep me sane as the upstream
and Intel's trees were over 300 patches apart"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (362 commits)
IB/ipoib: Allow mcast packets from other VFs
IB/mlx5: Implement callbacks for manipulating VFs
net/mlx5_core: Implement modify HCA vport command
net/mlx5_core: Add VF param when querying vport counter
IB/ipoib: Add ndo operations for configuring VFs
IB/core: Add interfaces to control VF attributes
IB/core: Support accessing SA in virtualized environment
IB/core: Add subnet prefix to port info
IB/mlx5: Fix decision on using MAD_IFC
net/core: Add support for configuring VF GUIDs
IB/{core, ulp} Support above 32 possible device capability flags
IB/core: Replace setting the zero values in ib_uverbs_ex_query_device
net/mlx5_core: Introduce offload arithmetic hardware capabilities
net/mlx5_core: Refactor device capability function
net/mlx5_core: Fix caching ATOMIC endian mode capability
ib_srpt: fix a WARN_ON() message
i40iw: Replace the obsolete crypto hash interface with shash
IB/hfi1: Add SDMA cache eviction algorithm
IB/hfi1: Switch to using the pin query function
IB/hfi1: Specify mm when releasing pages
...
Diffstat (limited to 'drivers/infiniband/sw/rdmavt/mcast.c')
-rw-r--r-- | drivers/infiniband/sw/rdmavt/mcast.c | 417 |
1 files changed, 417 insertions, 0 deletions
diff --git a/drivers/infiniband/sw/rdmavt/mcast.c b/drivers/infiniband/sw/rdmavt/mcast.c new file mode 100644 index 000000000000..983d319ac976 --- /dev/null +++ b/drivers/infiniband/sw/rdmavt/mcast.c @@ -0,0 +1,417 @@ +/* + * Copyright(c) 2016 Intel Corporation. + * + * This file is provided under a dual BSD/GPLv2 license. When using or + * redistributing this file, you may do so under either license. + * + * GPL LICENSE SUMMARY + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of version 2 of the GNU General Public License as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * BSD LICENSE + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * - Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * - Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + */ + +#include <linux/slab.h> +#include <linux/sched.h> +#include <linux/rculist.h> +#include <rdma/rdma_vt.h> +#include <rdma/rdmavt_qp.h> + +#include "mcast.h" + +/** + * rvt_driver_mcast - init resources for multicast + * @rdi: rvt dev struct + * + * This is per device that registers with rdmavt + */ +void rvt_driver_mcast_init(struct rvt_dev_info *rdi) +{ + /* + * Anything that needs setup for multicast on a per driver or per rdi + * basis should be done in here. + */ + spin_lock_init(&rdi->n_mcast_grps_lock); +} + +/** + * mcast_qp_alloc - alloc a struct to link a QP to mcast GID struct + * @qp: the QP to link + */ +static struct rvt_mcast_qp *rvt_mcast_qp_alloc(struct rvt_qp *qp) +{ + struct rvt_mcast_qp *mqp; + + mqp = kmalloc(sizeof(*mqp), GFP_KERNEL); + if (!mqp) + goto bail; + + mqp->qp = qp; + atomic_inc(&qp->refcount); + +bail: + return mqp; +} + +static void rvt_mcast_qp_free(struct rvt_mcast_qp *mqp) +{ + struct rvt_qp *qp = mqp->qp; + + /* Notify hfi1_destroy_qp() if it is waiting. */ + if (atomic_dec_and_test(&qp->refcount)) + wake_up(&qp->wait); + + kfree(mqp); +} + +/** + * mcast_alloc - allocate the multicast GID structure + * @mgid: the multicast GID + * + * A list of QPs will be attached to this structure. + */ +static struct rvt_mcast *rvt_mcast_alloc(union ib_gid *mgid) +{ + struct rvt_mcast *mcast; + + mcast = kzalloc(sizeof(*mcast), GFP_KERNEL); + if (!mcast) + goto bail; + + mcast->mgid = *mgid; + INIT_LIST_HEAD(&mcast->qp_list); + init_waitqueue_head(&mcast->wait); + atomic_set(&mcast->refcount, 0); + +bail: + return mcast; +} + +static void rvt_mcast_free(struct rvt_mcast *mcast) +{ + struct rvt_mcast_qp *p, *tmp; + + list_for_each_entry_safe(p, tmp, &mcast->qp_list, list) + rvt_mcast_qp_free(p); + + kfree(mcast); +} + +/** + * rvt_mcast_find - search the global table for the given multicast GID + * @ibp: the IB port structure + * @mgid: the multicast GID to search for + * + * The caller is responsible for decrementing the reference count if found. + * + * Return: NULL if not found. + */ +struct rvt_mcast *rvt_mcast_find(struct rvt_ibport *ibp, union ib_gid *mgid) +{ + struct rb_node *n; + unsigned long flags; + struct rvt_mcast *found = NULL; + + spin_lock_irqsave(&ibp->lock, flags); + n = ibp->mcast_tree.rb_node; + while (n) { + int ret; + struct rvt_mcast *mcast; + + mcast = rb_entry(n, struct rvt_mcast, rb_node); + + ret = memcmp(mgid->raw, mcast->mgid.raw, + sizeof(union ib_gid)); + if (ret < 0) { + n = n->rb_left; + } else if (ret > 0) { + n = n->rb_right; + } else { + atomic_inc(&mcast->refcount); + found = mcast; + break; + } + } + spin_unlock_irqrestore(&ibp->lock, flags); + return found; +} +EXPORT_SYMBOL(rvt_mcast_find); + +/** + * mcast_add - insert mcast GID into table and attach QP struct + * @mcast: the mcast GID table + * @mqp: the QP to attach + * + * Return: zero if both were added. Return EEXIST if the GID was already in + * the table but the QP was added. Return ESRCH if the QP was already + * attached and neither structure was added. + */ +static int rvt_mcast_add(struct rvt_dev_info *rdi, struct rvt_ibport *ibp, + struct rvt_mcast *mcast, struct rvt_mcast_qp *mqp) +{ + struct rb_node **n = &ibp->mcast_tree.rb_node; + struct rb_node *pn = NULL; + int ret; + + spin_lock_irq(&ibp->lock); + + while (*n) { + struct rvt_mcast *tmcast; + struct rvt_mcast_qp *p; + + pn = *n; + tmcast = rb_entry(pn, struct rvt_mcast, rb_node); + + ret = memcmp(mcast->mgid.raw, tmcast->mgid.raw, + sizeof(union ib_gid)); + if (ret < 0) { + n = &pn->rb_left; + continue; + } + if (ret > 0) { + n = &pn->rb_right; + continue; + } + + /* Search the QP list to see if this is already there. */ + list_for_each_entry_rcu(p, &tmcast->qp_list, list) { + if (p->qp == mqp->qp) { + ret = ESRCH; + goto bail; + } + } + if (tmcast->n_attached == + rdi->dparms.props.max_mcast_qp_attach) { + ret = ENOMEM; + goto bail; + } + + tmcast->n_attached++; + + list_add_tail_rcu(&mqp->list, &tmcast->qp_list); + ret = EEXIST; + goto bail; + } + + spin_lock(&rdi->n_mcast_grps_lock); + if (rdi->n_mcast_grps_allocated == rdi->dparms.props.max_mcast_grp) { + spin_unlock(&rdi->n_mcast_grps_lock); + ret = ENOMEM; + goto bail; + } + + rdi->n_mcast_grps_allocated++; + spin_unlock(&rdi->n_mcast_grps_lock); + + mcast->n_attached++; + + list_add_tail_rcu(&mqp->list, &mcast->qp_list); + + atomic_inc(&mcast->refcount); + rb_link_node(&mcast->rb_node, pn, n); + rb_insert_color(&mcast->rb_node, &ibp->mcast_tree); + + ret = 0; + +bail: + spin_unlock_irq(&ibp->lock); + + return ret; +} + +/** + * rvt_attach_mcast - attach a qp to a multicast group + * @ibqp: Infiniband qp + * @igd: multicast guid + * @lid: multicast lid + * + * Return: 0 on success + */ +int rvt_attach_mcast(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) +{ + struct rvt_qp *qp = ibqp_to_rvtqp(ibqp); + struct rvt_dev_info *rdi = ib_to_rvt(ibqp->device); + struct rvt_ibport *ibp = rdi->ports[qp->port_num - 1]; + struct rvt_mcast *mcast; + struct rvt_mcast_qp *mqp; + int ret = -ENOMEM; + + if (ibqp->qp_num <= 1 || qp->state == IB_QPS_RESET) + return -EINVAL; + + /* + * Allocate data structures since its better to do this outside of + * spin locks and it will most likely be needed. + */ + mcast = rvt_mcast_alloc(gid); + if (!mcast) + return -ENOMEM; + + mqp = rvt_mcast_qp_alloc(qp); + if (!mqp) + goto bail_mcast; + + switch (rvt_mcast_add(rdi, ibp, mcast, mqp)) { + case ESRCH: + /* Neither was used: OK to attach the same QP twice. */ + ret = 0; + goto bail_mqp; + case EEXIST: /* The mcast wasn't used */ + ret = 0; + goto bail_mcast; + case ENOMEM: + /* Exceeded the maximum number of mcast groups. */ + ret = -ENOMEM; + goto bail_mqp; + default: + break; + } + + return 0; + +bail_mqp: + rvt_mcast_qp_free(mqp); + +bail_mcast: + rvt_mcast_free(mcast); + + return ret; +} + +/** + * rvt_detach_mcast - remove a qp from a multicast group + * @ibqp: Infiniband qp + * @igd: multicast guid + * @lid: multicast lid + * + * Return: 0 on success + */ +int rvt_detach_mcast(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) +{ + struct rvt_qp *qp = ibqp_to_rvtqp(ibqp); + struct rvt_dev_info *rdi = ib_to_rvt(ibqp->device); + struct rvt_ibport *ibp = rdi->ports[qp->port_num - 1]; + struct rvt_mcast *mcast = NULL; + struct rvt_mcast_qp *p, *tmp, *delp = NULL; + struct rb_node *n; + int last = 0; + int ret = 0; + + if (ibqp->qp_num <= 1 || qp->state == IB_QPS_RESET) + return -EINVAL; + + spin_lock_irq(&ibp->lock); + + /* Find the GID in the mcast table. */ + n = ibp->mcast_tree.rb_node; + while (1) { + if (!n) { + spin_unlock_irq(&ibp->lock); + return -EINVAL; + } + + mcast = rb_entry(n, struct rvt_mcast, rb_node); + ret = memcmp(gid->raw, mcast->mgid.raw, + sizeof(union ib_gid)); + if (ret < 0) + n = n->rb_left; + else if (ret > 0) + n = n->rb_right; + else + break; + } + + /* Search the QP list. */ + list_for_each_entry_safe(p, tmp, &mcast->qp_list, list) { + if (p->qp != qp) + continue; + /* + * We found it, so remove it, but don't poison the forward + * link until we are sure there are no list walkers. + */ + list_del_rcu(&p->list); + mcast->n_attached--; + delp = p; + + /* If this was the last attached QP, remove the GID too. */ + if (list_empty(&mcast->qp_list)) { + rb_erase(&mcast->rb_node, &ibp->mcast_tree); + last = 1; + } + break; + } + + spin_unlock_irq(&ibp->lock); + /* QP not attached */ + if (!delp) + return -EINVAL; + + /* + * Wait for any list walkers to finish before freeing the + * list element. + */ + wait_event(mcast->wait, atomic_read(&mcast->refcount) <= 1); + rvt_mcast_qp_free(delp); + + if (last) { + atomic_dec(&mcast->refcount); + wait_event(mcast->wait, !atomic_read(&mcast->refcount)); + rvt_mcast_free(mcast); + spin_lock_irq(&rdi->n_mcast_grps_lock); + rdi->n_mcast_grps_allocated--; + spin_unlock_irq(&rdi->n_mcast_grps_lock); + } + + return 0; +} + +/** + *rvt_mast_tree_empty - determine if any qps are attached to any mcast group + *@rdi: rvt dev struct + * + * Return: in use count + */ +int rvt_mcast_tree_empty(struct rvt_dev_info *rdi) +{ + int i; + int in_use = 0; + + for (i = 0; i < rdi->dparms.nports; i++) + if (rdi->ports[i]->mcast_tree.rb_node) + in_use++; + return in_use; +} |