summaryrefslogtreecommitdiffstats
path: root/drivers
AgeCommit message (Collapse)Author
2011-05-24drbd: fix disconnect/reconnect loop, if ping-timeout == ping-intLars Ellenberg
If there is no replication traffic within the idle timeout (ping-int seconds), DRBD will send a P_PING, and adjust the timeout to ping-timeout. If there is no P_PING_ACK received within this ping-timeout, DRBD finally drops the connection, and tries to re-establish it. To decide which timeout was active, we compared the current timeout with the ping-timeout, and dropped the connection, if that was the case. By default, ping-int is 10 seconds, ping-timeout is 500 ms. Unfortunately, if you configure ping-timeout to be the same as ping-int, expiry of the idle-timeout had been mistaken for a missing ping ack, and caused an immediate reconnection attempt. Fix: Allow both timeouts to be equal, use a local variable to store which timeout is active. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: fix potential distributed deadlockLars Ellenberg
We limit ourselves to a configurable maximum number of pages used as temporary bio pages. If the configured "max_buffers" is not big enough to match the bandwidth of the respective deployment, a distributed deadlock could be triggered by e.g. fast online verify and heavy application IO. TCP connections would block on congestion, because both receivers would wait on pages to become available. Fortunately the respective senders in this case would be able to give back some pages already. So do that. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: Fix for application IO with the on-io-error=pass-on policyPhilipp Reisner
In case a write failes on the local disk, go into D_INCONSISTENT disk state. That causes future reads of that block to be shipped to the peer. Read retry remote was already in place. Actually the documentation needs to get fixed now. Since the application is still shielded from the error. (as long as we have only a single disk failing) The difference to detach is that we keep the disk. And therefore might keep all the other, still working sectors up to date. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-19Merge branches 'for-jens/xen-backend-fixes' and 'for-jens/xen-blkback-v3.3' ↵Jens Axboe
of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-2.6.40/drivers
2011-05-18xen/p2m/m2p/gnttab: Support GNTMAP_host_map in the M2P override.Konrad Rzeszutek Wilk
We only supported the M2P (and P2M) override only for the GNTMAP_contains_pte type mappings. Meaning that we grants operations would "contain the machine address of the PTE to update" If the flag is unset, then the grant operation is "contains a host virtual address". The latter case means that the Hypervisor takes care of updating our page table (specifically the PTE entry) with the guest's MFN. As such we should not try to do anything with the PTE. Previous to this patch we would try to clear the PTE which resulted in Xen hypervisor being upset with us: (XEN) mm.c:1066:d0 Attempt to implicitly unmap a granted PTE c0100000ccc59067 (XEN) domain_crash called from mm.c:1067 (XEN) Domain 0 (vcpu#0) crashed on cpu#3: (XEN) ----[ Xen-4.0-110228 x86_64 debug=y Not tainted ]---- and crashing us. This patch allows us to inhibit the PTE clearing in the PV guest if the GNTMAP_contains_pte is not set. On the m2p_remove_override path we provide the same parameter. Sadly in the grant-table driver we do not have a mechanism to tell m2p_remove_override whether to clear the PTE or not. Since the grant-table driver is used by user-space, we can safely assume that it operates only on PTE's. Hence the implementation for it to work on !GNTMAP_contains_pte returns -EOPNOTSUPP. In the future we can implement the support for this. It will require some extra accounting structure to keep track of the page[i], and the flag. [v1: Added documentation details, made it return -EOPNOTSUPP instead of trying to do a half-way implementation] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-18xen/blkback: don't fail empty barrier requestsJan Beulich
The sector number on empty barrier requests may (will?) be -1, which, given that it's being treated as unsigned 64-bit quantity, will almost always exceed the actual (virtual) disk's size. Inspired by Konrad's "When writting barriers set the sector number to zero...". While at it also add overflow checking to the math in vbd_translate(). Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-13xen/blkback: fix xenbus_transaction_start() hang caused by double ↵Laszlo Ersek
xenbus_transaction_end() vbd_resize() up_read()'s xs_state.suspend_mutex twice in a row via double xenbus_transaction_end() calls. The next down_read() in xenbus_transaction_start() (at eg. the next resize attempt) hangs. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=618317 Acked-by: Jan Beulich <jbeulich@novell.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12xen/blkback: Align the tabs on the structure.Konrad Rzeszutek Wilk
The recent changes caused this field of the structure to be offset a bit. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12xen/blkback: if log_stats is enabled print out the data.Konrad Rzeszutek Wilk
And not depend on the driver being built with -DDEBUG flag. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12xen/blkback: Add the prefix XEN in the common.h.Konrad Rzeszutek Wilk
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12xen/blkback: Prefix 'vbd' with 'xen' in structs and functions.Konrad Rzeszutek Wilk
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12xen/blkback: Change structure name blkif_st to xen_blkif.Konrad Rzeszutek Wilk
No need for that '_st' and xen_blkif is more apt. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12xen/blkback: Remove the unused typedefs.Konrad Rzeszutek Wilk
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12xen/blkback: Move include/xen/blkif.h into drivers/block/xen-blkback/common.hKonrad Rzeszutek Wilk
Not point of the blkif.h file. It is not used by the frontend. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12xen/blkback: Fixing some more of the cleanpatch.pl warnings.Konrad Rzeszutek Wilk
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12xen/blkback: Checkpatch.pl recommend against multiple assigments.Konrad Rzeszutek Wilk
CHECK: multiple assignments should be avoided Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12xen/blkback: Flesh out the description in the Kconfig.Konrad Rzeszutek Wilk
with more details. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12xen/blkback: Fix spelling mistakes.Konrad Rzeszutek Wilk
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12xen/blkback: Move blkif_get_x86_[32|64]_req to common.h in block/xen-blkback ↵Konrad Rzeszutek Wilk
dir. From the blkif.h header, which was exposed to the frontend. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12xen/blkback: Removing the debug_lvl option.Konrad Rzeszutek Wilk
It is not really used for anything. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12xen/blkback: Use the DRV_PFX in the pr_.. macros.Konrad Rzeszutek Wilk
To make it easier to read. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12xen/blkback: Make the DPRINTK uniform.Konrad Rzeszutek Wilk
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12xen/blkback: Change printk/DPRINTK to pr_.. type variant.Konrad Rzeszutek Wilk
And also make them uniform and prefix the message with 'xen-blkback'. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12xen-blkfront: Introduce BLKIF_OP_FLUSH_DISKCACHE support.Konrad Rzeszutek Wilk
If the backend supports the 'feature-flush-cache' mode, use that instead of the 'feature-barrier' support. Currently there are three backends that support the 'feature-flush-cache' mode: NetBSD, Solaris and Linux kernel. The 'flush' option is much light-weight version than the 'barrier' support so lets try to use as there are no filesystems in the kernel that use full barriers anymore. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12xen-blkfront: fix data size for xenbus_gather in blkfront_connectMarek Marczykowski
barrier variable is int, not long. This overflow caused another variable override: "err" (in PV code) and "binfo" (in xenlinux code - drivers/xen/blkfront/blkfront.c). The later caused incorrect device flags (RO/removable etc). Signed-off-by: Marek Marczykowski <marmarek@mimuw.edu.pl> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> [v1: Changed title] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-11xen/blkback: Fixed up comments and converted spaces to tabs.Konrad Rzeszutek Wilk
Suggested-by: Ian Campbell <Ian.Campbell@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-06cciss: fix compile issueJens Axboe
drivers/block/cciss.c: In function ‘cciss_send_reset’: drivers/block/cciss.c:2515:2: error: implicit declaration of function ‘fill_cmd’ drivers/block/cciss.c: At top level: drivers/block/cciss.c:2531:12: error: conflicting types for ‘fill_cmd’ drivers/block/cciss.c:2534:1: note: an argument type that has a default promotion can’t match an empty parameter name list declaration drivers/block/cciss.c:2515:18: note: previous implicit declaration of ‘fill_cmd’ was here make[1]: *** [drivers/block/cciss.o] Error 1 make: *** [drivers/block/cciss.o] Error 2 Move fill_cmd() to above where it is first used. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06cciss: add cciss_tape_cmds module paramterStephen M. Cameron
This is to allow number of commands reserved for use by SCSI tape drives and medium changers to be adjusted at driver load time via the kernel parameter cciss_tape_cmds, with a default value of 6, and a range of 2 - 16 inclusive. Previously, the driver limited the number of commands which could be queued to the SCSI half of the the driver to only 2. This is to fix the problem that if you had more than two tape drives, you couldn't, for example, erase or rewind them all at the same time. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06cciss: do not use bit 2 doorbell resetStephen M. Cameron
It causes NMIs which are undesirable at best, unsurvivable at worst. Prefer the soft reset instead. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06cciss: do not attempt PCI power management reset method if we know it won't ↵Stephen M. Cameron
work. Just go straight to the soft-reset method instead. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06cciss: remove superfluous sleeps around reset codeStephen M. Cameron
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06cciss: do soft reset if hard reset is brokenStephen M. Cameron
on driver load, if reset_devices is set, and the hard reset attempts fail, try to bring up the controller to the point that a command can be sent, and send it a soft reset command, then after the reset undo whatever driver initialization was done to get it to the point to take a command, and re-do it after the reset. This is to get kdump to work on all the "non-resettable" controllers (except 64xx controllers which can't be reset due to the potentially shared cache module.) Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06cciss: use new doorbell-bit-5 reset methodStephen M. Cameron
The bit-2-doorbell reset method seemed to cause (survivable) NMIs on some systems and (unsurvivable) IOCK NMIs on some G7 servers. Firmware guys implemented a new doorbell method to alleviate these problems triggered by bit 5 of the doorbell register. We want to use it if it's available. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06cciss: increase timeouts for post-reset no-opsStephen M. Cameron
Just to reduce the messages about timeouts that appear. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06cciss: clarify messages around reset behaviorStephen M. Cameron
When waiting for the board to become "not ready" don't print a message saying "waiting for board to become ready" (possibly followed by a message saying "failed waiting for board to become not ready". Instead, it should be "waiting for board to reset" and "failed waiting for board to reset." Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> " Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06cciss: increase time to wait for board reset to startStephen M. Cameron
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06cciss: get rid of message related magic numbersStephen M. Cameron
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06cciss: fix reply pool and block fetch table memory leaksStephen M. Cameron
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06cciss: factor out irq request codeStephen M. Cameron
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06cciss: factor out scatterlist allocation functionsStephen M. Cameron
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06cciss: factor out command pool allocation functionsStephen M. Cameron
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06cciss: do a better job of detecting controller reset failureStephen M. Cameron
Detect failure of controller reset by noticing if the 32 bytes of "driver version" we store on the hardware in the config table fail to get zeroed out. Previously we noticed if the controller did not transition to "simple mode", but this did not detect reset failure if the controller was already in simple mode prior to the reset attempt (e.g. due to module parameter hpsa_simple_mode=1). Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-06cciss: add readl after writel in interrupt mask setting codeStephen M. Cameron
This is to ensure the board interrupts are really off when these functions return. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-05xen/blkback: Fix up some of the comments.Konrad Rzeszutek Wilk
They had the wrong data or were in the wrong spot. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-05xen/blkback: Squash the checking for operation into dispatch_rw_block_ioKonrad Rzeszutek Wilk
We do a check for the operations right before calling dispatch_rw_block_io. And then we do the same check in dispatch_rw_block_io. This patch squashes those checks into the 'dispatch_rw_block_io' function. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-05xen/blkback: Add support for BLKIF_OP_FLUSH_DISKCACHE and drop ↵Konrad Rzeszutek Wilk
BLKIF_OP_WRITE_BARRIER. We drop the support for 'feature-barrier' and add in the support for the 'feature-flush-cache' if the real backend storage supports flushing. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-04-27Revert "xen/blkback: Move the plugging/unplugging to a higher level."Konrad Rzeszutek Wilk
This reverts commit 97961ef46b9b5a6a7c918a38b898a7b3e49869f4 b/c we lose about 15% performance if we do the unplugging and the end of the reading the ring buffer.
2011-04-26xen/blkback: Stick REQ_SYNC on WRITEs to deal with CFQ I/O scheduler.Konrad Rzeszutek Wilk
If one runs a simple fio request with random read/write with a 20%/80% ratio, the numbers are incredibly bad when using the CFQ scheduler. IOmeter | | | | 64K, randrw | NOOP | CFQ | deadline | randrwmix=80 | | | | --------------+-------+------+----------+ blkback |103/27 |32/10 | 102/27 | --------------+-------+------+----------+ QEMU qdisk |103/27 |102/27| 102/27 | The problem as explained by Vivek Goyal was: ".. that difference is that sync vs async requests. In the case of a kernel thread submitting IO, [..] all the WRITES might be being considered as async and will go in a different queue. If you mix those with some READS, they are always sync and will go in differnet queue. In presence of sync queue, CFQ will idle and choke up WRITES in an attempt to improve latencies of READs. In case of AIO [note: this is what QEMU qdisk is doing] , [..] it is direct IO and both READS and WRITES will be considered SYNC and will go in a single queue and no choking of WRITES will take place." The solution is quite simple, tack on REQ_SYNC (which is what the WRITE_ODIRECT macro points to) and the numbers go back up. Suggested-by: Vivek Goyal <vgoyal@redhat.com Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-04-26xen/blkback: Move the plugging/unplugging to a higher level.Konrad Rzeszutek Wilk
We used to the plug/unplug on the submit_bio. But that means if within a stream of WRITE, WRITE, WRITE,...,WRITE we have one READ, it could stall the pipeline (as the 'submio_bio' could trigger the unplug_fnc to be called and stall/sync when doing the READ). Instead we want to move the unplugging when the whole (or as a much as possible) ring buffer has been processed. This also eliminates us doing plug/unplug for each request. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-04-21block: don't block events on excl write for non-optical devicesTejun Heo
Disk event code automatically blocks events on excl write. This is primarily to avoid issuing polling commands while burning is in progress. This behavior doesn't fit other types of devices with removeable media where polling commands don't have adverse side effects and door locking usually doesn't exist. This patch introduces new genhd flag which controls the auto-blocking behavior and uses it to enable auto-blocking only on optical devices. Note for stable: 2.6.38 and later only Cc: stable@kernel.org Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>