summaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/hw/cxgb4/provider.c
diff options
context:
space:
mode:
authorSteve Wise <swise@opengridcomputing.com>2011-05-09 22:06:23 -0700
committerRoland Dreier <roland@purestorage.com>2011-05-09 22:06:23 -0700
commit2f25e9a540951ebd533b9b98d2259deb44b0b476 (patch)
treea2548d5cbcb9e26b679e9f783c78cf088f7b2123 /drivers/infiniband/hw/cxgb4/provider.c
parentd9594d990a528d4c444777d0f360bb50c6114825 (diff)
RDMA/cxgb4: EEH errors can hang the driver
A few more EEH fixes: c4iw_wait_for_reply(): detect fatal EEH condition on timeout and return an error. The iw_cxgb4 driver was only calling ib_deregister_device() on an EEH event followed by a ib_register_device() when the device was reinitialized. However, the RDMA core doesn't allow multiple iterations of register/deregister by the provider. See drivers/infiniband/core/sysfs.c: ib_device_unregister_sysfs() where the kobject ref is held until the device is deallocated in ib_deallocate_device(). Calling deregister adds this kobj reference, and then a subsequent register call will generate a WARN_ON() from the kobject subsystem because the kobject is being initialized but is already initialized with the ref held. So the provider must deregister and dealloc when resetting for an EEH event, then alloc/register to re-initialize. To do this, we cannot use the device ptr as our ULD handle since it will change with each reallocation. This commit adds a ULD context struct which is used as the ULD handle, and then contains the device pointer and other state needed. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
Diffstat (limited to 'drivers/infiniband/hw/cxgb4/provider.c')
-rw-r--r--drivers/infiniband/hw/cxgb4/provider.c2
1 files changed, 0 insertions, 2 deletions
diff --git a/drivers/infiniband/hw/cxgb4/provider.c b/drivers/infiniband/hw/cxgb4/provider.c
index f66dd8bf512..5b9e4220ca0 100644
--- a/drivers/infiniband/hw/cxgb4/provider.c
+++ b/drivers/infiniband/hw/cxgb4/provider.c
@@ -516,7 +516,6 @@ int c4iw_register_device(struct c4iw_dev *dev)
if (ret)
goto bail2;
}
- dev->registered = 1;
return 0;
bail2:
ib_unregister_device(&dev->ibdev);
@@ -535,6 +534,5 @@ void c4iw_unregister_device(struct c4iw_dev *dev)
c4iw_class_attributes[i]);
ib_unregister_device(&dev->ibdev);
kfree(dev->ibdev.iwcm);
- dev->registered = 0;
return;
}