summaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/hw/mthca/mthca_dev.h
AgeCommit message (Collapse)Author
2009-09-05IB/mthca: Don't allow userspace open while recovering from catastrophic errorJack Morgenstein
Userspace apps are supposed to release all ib device resources if they receive a fatal async event (IBV_EVENT_DEVICE_FATAL). However, the app has no way of knowing when the device has come back up, except to repeatedly attempt ibv_open_device() until it succeeds. However, currently there is no protection against the open succeeding while the device is in being removed following the fatal event. In this case, the open will succeed, but as a result the device waits in the middle of its removal until the new app releases its resources -- and the new app will not do so, since the open succeeded at a point following the fatal event generation. This patch adds an "active" flag to the device. The active flag is set to false (in the fatal event flow) before the "fatal" event is generated, so any subsequent ibv_dev_open() call to the device will fail until the device comes back up, thus preventing the above deadlock. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-27IB/mthca: Add module parameter for number of MTTs per segmentEli Cohen
The current MTT allocator uses kmalloc() to allocate a buffer for its buddy allocator, and thus is limited in the amount of MTT segments that it can control. As a result, the size of memory that can be registered is limited too. This patch uses a module parameter to control the number of MTT entries that each segment represents, allowing more memory to be registered with the same number of segments. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-22IB/mthca: Keep free count for MTT buddy allocatorRoland Dreier
MTT entries are allocated with a buddy allocator, which just keeps bitmaps for each level of the buddy table. However, all free space starts out at the highest order, and small allocations start scanning from the lowest order. When the lowest order tables have no free space, this can lead to scanning potentially millions of bits before finding a free entry at a higher order. We can avoid this by just keeping a count of how many free entries each order has, and skipping the bitmap scan when an order is completely empty. This provides a nice performance boost for a negligible increase in memory usage. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14IB/mthca: Remove "stop" flag for catastrophic error polling timerRoland Dreier
Since we use del_timer_sync() anyway, there's no need for an additional flag to tell the timer not to rearm. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14RDMA: Remove subversion $Id tagsRoland Dreier
They don't get updated by git and so they're worse than useless. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-18Convert asm/semaphore.h users to linux/semaphore.hMatthew Wilcox
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-16IB/mthca: Update module version and release dateJack Morgenstein
The ib_mthca driver has been stable for a while, so bump the version number to 1.0 to indicate this. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB/mthca: Formatting cleanupsRoland Dreier
Fix a few whitespace and other coding style problems. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-01-25IB/mthca: Remove MSI support as scheduledAdrian Bunk
Remove MSI support from the mthca driver, as scheduled. There is no reason to use MSI instead of MSI-X, since MSI-X performs better. No one has spoken up since MSI support was deprecated in commit f6be6fbe ("IB/mthca: Schedule MSI support for removal"), so apparently the MSI support is unused. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IB/mthca: Increase max number of QPs per multicast group to 56Roland Dreier
Increase the number of QPs allowed per multicast group from 8 to 56. This allows for one QP per core on 16-core systems, which are now quite common, and allows some space for future growth. This is basically the same patch that Jack Morgenstein <jackm@dev.mellanox.co.il> just supplied for mlx4. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-06IB: Return "maybe missed event" hint from ib_req_notify_cq()Roland Dreier
The semantics defined by the InfiniBand specification say that completion events are only generated when a completions is added to a completion queue (CQ) after completion notification is requested. In other words, this means that the following race is possible: while (CQ is not empty) ib_poll_cq(CQ); // new completion is added after while loop is exited ib_req_notify_cq(CQ); // no event is generated for the existing completion To close this race, the IB spec recommends doing another poll of the CQ after requesting notification. However, it is not always possible to arrange code this way (for example, we have found that NAPI for IPoIB cannot poll after requesting notification). Also, some hardware (eg Mellanox HCAs) actually will generate an event for completions added before the call to ib_req_notify_cq() -- which is allowed by the spec, since there's no way for any upper-layer consumer to know exactly when a completion was really added -- so the extra poll of the CQ is just a waste. Motivated by this, we add a new flag "IB_CQ_REPORT_MISSED_EVENTS" for ib_req_notify_cq() so that it can return a hint about whether the a completion may have been added before the request for notification. The return value of ib_req_notify_cq() is extended so: < 0 means an error occurred while requesting notification == 0 means notification was requested successfully, and if IB_CQ_REPORT_MISSED_EVENTS was passed in, then no events were missed and it is safe to wait for another event. > 0 is only returned if IB_CQ_REPORT_MISSED_EVENTS was passed in. It means that the consumer must poll the CQ again to make sure it is empty to avoid the race described above. We add a flag to enable this behavior rather than turning it on unconditionally, because checking for missed events may incur significant overhead for some low-level drivers, and consumers that don't care about the results of this test shouldn't be forced to pay for the test. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-12IB/mthca: Always fill MTTs from CPUMichael S. Tsirkin
Speed up memory registration by filling in MTTs directly when the CPU can write directly to the whole table (all mem-free cards, and to Tavor mode on 64-bit systems with the patch I posted earlier). This reduces the number of FW commands needed to register an MR by at least a factor of 2 and speeds up memory registration significantly. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/mthca: Recover from catastrophic errorsJack Morgenstein
Trigger device remove and then add when a catastrophic error is detected in hardware. This, in turn, will cause a device reset, which we hope will recover from the catastrophic condition. Since this might interefere with debugging the root cause, add a module option to suppress this behaviour. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/uverbs: Pass userspace data to modify_srq and modify_qp methodsRalph Campbell
Pass a struct ib_udata to the low-level driver's ->modify_srq() and ->modify_qp() methods, so that it can get to the device-specific data passed in by the userspace driver. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-09IB/mthca: Fix race in reference countingRoland Dreier
Fix races in in destroying various objects. If a destroy routine waits for an object to become free by doing wait_event(&obj->wait, !atomic_read(&obj->refcount)); /* now clean up and destroy the object */ and another place drops a reference to the object by doing if (atomic_dec_and_test(&obj->refcount)) wake_up(&obj->wait); then this is susceptible to a race where the wait_event() and final freeing of the object occur between the atomic_dec_and_test() and the wake_up(). And this is a use-after-free, since wake_up() will be called on part of the already-freed object. Fix this in mthca by replacing the atomic_t refcounts with plain old integers protected by a spinlock. This makes it possible to do the decrement of the reference count and the wake_up() so that it appears as a single atomic operation to the code waiting on the wait queue. While touching this code, also simplify mthca_cq_clean(): the CQ being cleaned cannot go away, because it still has a QP attached to it. So there's no reason to be paranoid and look up the CQ by number; it's perfectly safe to use the pointer that the callers already have. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-12IB/mthca: Fix max_srq_sge returned by ib_query_device for Tavor devicesJack Morgenstein
The driver allocates SRQ WQEs size with a power of 2 size both for Tavor and for memfree. For Tavor, however, the hardware only requires the WQE size to be a multiple of 16, not a power of 2, and the max number of scatter-gather allowed is reported accordingly by the firmware (and this is the value currently returned by ib_query_device() and ibv_query_device()). If the max number of scatter/gather entries reported by the FW is used when creating an SRQ, the creation will fail for Tavor, since the required WQE size will be increased to the next power of 2, which turns out to be larger than the device permitted max WQE size (which is not a power of 2). This patch reduces the reported SRQ max wqe size so that it can be used successfully in creating an SRQ on Tavor HCAs. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-10IB: simplify static rate encodingJack Morgenstein
Push translation of static rate to HCA format into low-level drivers, where it belongs. For static rate encoding, use encoding of rate field from IB standard PathRecord, with addition of value 0, for backwards compatibility with current usage. The changes are: - Add enum ib_rate to midlayer includes. - Get rid of static rate translation in IPoIB; just use static rate directly from Path and MulticastGroup records. - Update mthca driver to translate absolute static rate into the format used by hardware. This also fixes mthca's static rate handling for HCAs that are capable of 4X DDR. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-02IB/mthca: Always build debugging code unless CONFIG_EMBEDDED=yRoland Dreier
Change the mthca debugging trace output code so that it can enabled and disabled at runtime with the debug_level module parameter in sysfs. Also, don't allow CONFIG_INFINIBAND_MTHCA_DEBUG to be disabled unless CONFIG_EMBEDDED is selected. We want users (and especially distros) to have this turned on unless they really need to save space, because by the time we want debugging output, it's usually too late to rebuild a kernel. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Optimize large messages on Sinai HCAsEli Cohen
Sinai (one-port PCI Express) HCAs get improved throughput for messages bigger than 80 KB in DDR mode if memory keys are formatted in a specific way. The enhancement only works if the memory key table is smaller than 2^24 entries. For larger tables, the enhancement is off and a warning is printed (to avoid silent performance loss). Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Michael Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Implement query_ah methodJack Morgenstein
Implement query_ah (except for AVs which are in HCA memory). This is needed to implement RMPP duplicate session detection on sending side (extraction of DGID/DLID and GRH flag from address handle). Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Write FW commands through doorbell pageEli Cohen
This patch is checks whether the HCA supports posting FW commands through a doorbell page (user access region 0, or "UAR0"). If this is supported, the driver maps UAR0 and uses it for FW commands. This can be controlled by the value of a writable module parameter fw_cmd_doorbell. When the parameter is 0, the commands are posted through HCR using the old method; otherwise if HCA is capable commands go through UAR0. This use of UAR0 to post commands eliminates the need for polling the "go" bit prior to posting a new command. Since reading from a PCI device is much more expensive then issuing a posted write, it is expected that issuing FW commands this way will provide better CPU utilization. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Bump driver version and release dateRoland Dreier
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Support for query QP and SRQEli Cohen
Implement the query_qp and query_srq methods in mthca. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Add device-specific support for resizing CQsRoland Dreier
Add low-level driver support for resizing CQs (both kernel and userspace) to mthca. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Make functions that never fail return voidRoland Dreier
The function mthca_free_err_wqe() can never fail, so get rid of its return value. That means handle_error_cqe() doesn't have to check what mthca_free_err_wqe() returns, which means it can't fail either and doesn't have to return anything either. All this results in simpler source code and a slight object code improvement: add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-10 (-10) function old new delta mthca_free_err_wqe 83 81 -2 mthca_poll_cq 1758 1750 -8 Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-02-13IB/mthca: bump driver version and release dateRoland Dreier
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-30IB/mthca: Semaphore to mutex conversionsRoland Dreier
Convert semaphores to mutexes in mthca. Leave firmware command interface poll_sem and event_sem as semaphores. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-12IB/mthca: Initialize grh_present before using itMichael S. Tsirkin
build_mlx_header() was using sqp->ud_header.grh_present before it was initialized by mthca_read_ah(). Furthermore, header->grh_present is set by ib_ud_header_init, so there's no need to set it again in mthca_read_ah(). Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-08[PATCH] fix more missing includesTim Schmielau
Include fixes for 2.6.14-git11. Should allow to remove sched.h from module.h on i386, x86_64, arm, ia64, ppc, ppc64, and s390. Probably more to come since I haven't yet checked the other archs. Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-10[IB] uverbs: have kernel return QP capabilitiesJack Morgenstein
Move the computation of QP capabilities (max scatter/gather entries, max inline data, etc) into the kernel, and have the uverbs module return the values as part of the create QP response. This keeps precise knowledge of device limits in the low-level kernel driver. This requires an ABI bump, so while we're making changes, get rid of the max_sge parameter for the modify SRQ command -- it's not used and shouldn't be there. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-04[IB] mthca: report page size capabilityJack Morgenstein
Report the device's real page size capability in mthca_query_device(). Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-29[IB] mthca: report asynchronous CQ eventsMichael S. Tsirkin
Implement reporting asynchronous CQ events in Mellanox HCA driver. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-27[IB] mthca: first pass at catastrophic error reportingRoland Dreier
Add some initial support for detecting and reporting catastrophic errors reported by Mellanox HCAs. We start a periodic timer which polls the catastrophic error reporting buffer in device memory. If an error is detected, we dump the contents of the buffer for port-mortem debugging, and report a fatal asynchronous error to higher levels. In the future we can try to recover from these errors by resetting the device, but this will require some work in higher-level code as well. Let's get this in now, so that we at least get catastrophic errors reported in logs. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17[IB] mthca: Better limit checking and reportingJack Morgenstein
Check the sizes of CQs, QPs and SRQs when creating objects, and fail instead of creating too-big queues. Also return real limits instead of just plausible-sounding values from mthca_query_device(). Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17[IB] mthca: SRQ limit reached eventsRoland Dreier
Our hardware supports generating an event when the number of receives posted to a shared receive queue (SRQ) falls below a user-specified limit. Implement mthca_modify_srq() to arm the limit, and add code to handle dispatching SRQ events when they occur. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17[IB] mthca: Report correct atomic capabilityJack Morgenstein
Return correct atomic capability flag from mthca query function. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-08-26[PATCH] IB/mthca: Add SRQ implementationRoland Dreier
Add mthca support for shared receive queues (SRQs), including userspace SRQs. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-08-26[PATCH] IB/mthca: Factor out common queue alloc codeRoland Dreier
Clean up the allocation of memory for queues by factoring out the common code into mthca_buf_alloc() and mthca_buf_free(). Now CQs and QPs share the same queue allocation code, which we'll also use for SRQs. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-08-26[PATCH] IB/mthca: Use correct port width capability valueRoland Dreier
When we call the INIT_IB firmware command to bring up a port, use the actual port width capability returned by the QUERY_DEV_LIM command instead of always trying to enable both 1X and 4X. This fixes breakage seen when the firmware is build to allow 4X only. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-08-26[PATCH] IB/mthca: add HCA board ID to sysfs infoMichael S. Tsirkin
Add support for reporting HCA board ID returned from QUERY_ADAPTER firmware command through sysfs. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-08-26[PATCH] IB: sparse endianness cleanupSean Hefty
Fix sparse warnings. Use __be* where appropriate. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-08-26[PATCH] IB: Add copyright noticesRoland Dreier
Make some lawyers happy and add copyright notices for people who forgot to include them when they actually touched the code. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-07-07[PATCH] IB uverbs: add mthca user QP supportRoland Dreier
Add support for userspace queue pairs (QPs) to mthca. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07[PATCH] IB uverbs: add mthca user CQ supportRoland Dreier
Add support for userspace completion queues (CQs) to mthca. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07[PATCH] IB uverbs: add mthca user PD supportRoland Dreier
Add support for userspace protection domains (PDs) to mthca. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27[PATCH] IB/mthca: Bump versionRoland Dreier
It's about time for a version bump. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27[PATCH] IB/mthca: Align FW command mailboxes to 4KRoland Dreier
Future versions of Mellanox HCA firmware will require command mailboxes to be aligned to 4K. Support this by using a pci_pool to allocate all mailboxes. This has the added benefit of shrinking the source and text of mthca. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27[PATCH] IB/mthca: Split off MTT allocationRoland Dreier
Split allocation of MTT range from creation of MR. This will be useful for implementing shared memory regions and userspace verbs. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27[PATCH] IB/mthca: Add Sun copyright noticeTom Duffy
Add Sun copyright to files modified by Tom Duffy. Signed-off-by: Tom Duffy <tduffy@sun.com> Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16[PATCH] IB/mthca: add support for new MT25204 HCARoland Dreier
Decouple table of HCA features from exact HCA device type. Add a current FW version field so we can warn when someone is using old FW. Add support for new MT25204 HCA. Remove the warning about mem-free support, since it should be pretty solid at this point. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>