summaryrefslogtreecommitdiffstats
path: root/drivers/block/drbd/drbd_main.c
AgeCommit message (Collapse)Author
2011-03-10drbd: Implemented real timeout checking for request processing timePhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: fix potential imbalance of ap_in_flightLars Ellenberg
When we receive a barrier ack, we walk the ring list of drbd requests in the transfer log of the respective epoch, do some housekeeping, and free those objects. We tried to keep epochs of mirrored and unmirrored drbd requests separate, and assert that no local-only requests are present in a barrier_acked epoch. It turns out that this has quite a number of corner cases and would add bloated code without functional benefit. We now revert the (insufficient) commits drbd: Fixed an issue with AHEAD -> SYNC_SOURCE transitions drbd: Ensure that an epoch contains only requests of one kind and instead fix the processing of barrier acks to cope with a mix of local-only and mirrored requests. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: silence some noisy log messages during disconnectLars Ellenberg
If we fail to send the information that we lost our disk, we have no connection, and no disk: no access to data anymore. That is either expected (deconfiguration), or there will be so much noise in the logs that "Sending state failed" is not useful at all. Drop it. If the reason for a shorter than expected receive was a signal, which we sent because we already decided to disconnect, these additional log messages are confusing and useless. This patch follows this pattern: - dev_warn(DEV, "short read expecting header on sock: r=%d\n", r); + if (!signal_pending(current)) + dev_warn(DEV, "short read expecting header on sock: r=%d\n", r); Also make them all dev_warn for consistency. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: describe bitmap locking for bulk operation in finer detailLars Ellenberg
Now that we do no longer in-place endian-swap the bitmap, we allow selected bitmap operations (testing bits, sometimes even settting bits) during some bulk operations. This caused us to hit a lot of FIXME asserts similar to FIXME asender in drbd_bm_count_bits, bitmap locked for 'write from resync_finished' by worker Which now is nonsense: looking at the bitmap is perfectly legal as long as it is not being resized. This cosmetic patch defines some flags to describe expectations in finer detail, so the asserts in e.g. bm_change_bits_to() can be skipped if appropriate. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: log UUIDs whenever they changeLars Ellenberg
All decisions about sync, sync direction, and wether or not to allow a connect or attach are based on our set of UUIDs to tag a data generation. Log changes to the UUIDs whenever they occur, logging "new current UUID P:Q:R:S" is more useful than "Creating new current UUID". Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: queue bitmap writeout more intelligentlyLars Ellenberg
The "lazy writeout" of cleared bitmap pages happens during resync, and should happen again once the resync finishes cleanly, or is aborted. If resync finished cleanly, or was aborted because of peer disk failure, we trigger the writeout from worker context in the after state change work. If resync was aborted because of connection failure, we should not immediately trigger bitmap writeout, but rather postpone the writeout to after the connection cleanup happened. We now do it in the receiver context from drbd_disconnect(). If resync was aborted because of local disk failure, well, there is nothing to write to anymore. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: don't pointlessly queue bitmap send, if we lost connectionLars Ellenberg
This is a minor optimization and cleanup, and also considerably reduces some harmless (but noisy) race with the connection cleanup code. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Ensure that an epoch contains only requests of one kindPhilipp Reisner
The assert in drbd_req.c:755 forces us to have only requests of one kind in an epoch. The two kinds we distinguish here are: local-only or mirrored. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Do not drop net config if sending in drbd_send_protocol() failsPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Work on the Ahead -> SyncSource transitionPhilipp Reisner
The test if rs_pending_cnt == 0 was too weak. Using Test for unacked_cnt == 0 instead. Moved that into the worker. Since unacked_cnt gets already increased when an P_RS_DATA_REQ comes in. Also using a timer to make Ahead -> SyncSource -> Ahead cycles slower... Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Do not full sync if a P_SYNC_UUID packet gets lostPhilipp Reisner
See also commit from 2009-08-15 "drbd_uuid_compare(): Do not full sync in case a P_SYNC_UUID packet gets lost." We saw cases where the History UUIDs where not as expected. So the detection of the special case did not trigger. With the sync UUID no longer being a random number, but deducible from the previous bitmap UUID, the detection of this special case becomes more reliable. The SyncUUID now is the previous bitmap UUID + 0x1000000000000. Rule 5a: Cs = H1p & H1p + Offset = Bp Connection was lost before SyncUUID Packet came through. Corrent (peer) UUIDs: Bp = H1p H1p = H2p H2p = 0 Become Sync target. Rule 7a: Cp = H1s & H1s + Offset = Bs Connection was lost before SyncUUID Packet came through. Correct (own) UUIDs: Bs = H1s H1s = H2s H2s = 0 Become Sync source. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Corrected off-by-one error in DRBD_MINOR_COUNT_MAXPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Cleaned up the resync timer logicPhilipp Reisner
Besides removed a few lines of code, this moves the inspection of the state from before the queuing process to after the queuing. I.e. more closely to the actual invocation of the work. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Fixed an issue with AHEAD -> SYNC_SOURCE transitionsPhilipp Reisner
Create a new barrier when leaving the AHEAD mode. Otherwise we trigger the assertion in req_mod(, barrier_acked) D_ASSERT(req->rq_state & RQ_NET_SENT); The new barrier is created by recycling the newest existing one. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: There might be a resync after unfreezing IO due to no disk [Bugz 332]Philipp Reisner
When on-no-data-accessible is set to suspend-io, also consider that a Primary, SyncTarget node losses its connection. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: improve on bitmap write out timingLars Ellenberg
Even though we now track the need for bitmap writeout per bitmap page, there is no need to trigger the writeout while a resync is going on. Once the resync is finished (or aborted), we trigger bitmap writeout anyways. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: spelling fix in log messageLars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: serialize sending of resync uuid with pending w_send_oosLars Ellenberg
To improve the latency of IO requests during bitmap exchange, we recently allowed writes while waiting for the bitmap, sending "set out-of-sync" information packets for any newly dirtied bits. We have to make sure that the new resync-uuid does not overtake these "set oos" packets. Once the resync-uuid is received, the sync target starts the resync process, and expects the bitmap to only be cleared, not re-set. If we use this protocol extension, we queue the generation and sending of the resync-uuid on the worker, which naturally serializes with all previously queued packets. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: fix potential dereference of NULL pointerLars Ellenberg
If drbd used to have crypto digest algorithms configured, then is being unconfigured (but not unloaded), it frees the algorithms, but does not reset the config. If it then is reconfigured to use the very same algorithm, it "forgot" to re-allocate the algorithms, thinking that the config has not changed in that aspect. It will then Oops on the first attempt to actually use those algorithms. Fix this by resetting the config to defaults after cleanup. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: move bitmap write from resync_finished to after_state_changeLars Ellenberg
We must not call it directly from resync_finished, as we may be in either receiver or worker context there. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: bitmap keep track of changes vs on-disk bitmapLars Ellenberg
When we set or clear bits in a bitmap page, also set a flag in the page->private pointer. This allows us to skip writes of unchanged pages. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Rename __inc_ap_bio_cond to may_inc_ap_bioAndreas Gruenbacher
The old name is confusing: the function does not increment anything. Also rename _inc_ap_bio_cond to inc_ap_bio_cond: there is no need for an underscore. Finally, make it clear that these functions return boolean values. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: send_bitmap_rle_or_plain: Get rid of ugly and useless enumAndreas Gruenbacher
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Use the standard bool, true, and false keywordsAndreas Gruenbacher
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Be more explicit about functions that return an enum drbd_state_rvAndreas Gruenbacher
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Rename enum drbd_state_ret_codes to enum drbd_state_rvAndreas Gruenbacher
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Rename enum drbd_ret_codes to enum drbd_ret_codeAndreas Gruenbacher
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Get rid of unnecessary macros (1)Andreas Gruenbacher
This macro doesn't save much code, but makes things a lot harder to read. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Rename drbd_make_request_26 to drbd_make_requestAndreas Gruenbacher
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Make sure that drbd_send() has sent the right number of bytesAndreas Gruenbacher
Reviewed-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2011-03-10drbd: remove /proc/drbd before unregistering from netlinkLars Ellenberg
There still exists a (theoretical) race on module unload, where /proc/drbd may still exist, but the netlink callback has been unregistered already, allowing drbdsetup to shout without listeners, and get no reply. Reorder remove_proc_entry and unregister of netlink callback. drbdsetup first checks for existence of the proc entry, and if that is missing, won't even try to contact the module. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Becoming sync target may not happen out of < C_WF_REPORT_PARAMSPhilipp Reisner
This patch is acutally a necessary addendum to the patch "fix for spurious full sync (becoming sync target looked like invalidate)" Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Starting with protocol 96 we can allow app-IO while receiving the bitmapPhilipp Reisner
* C_STARTING_SYNC_S, C_STARTING_SYNC_T In these states the bitmap gets written to disk. Locking out of app-IO is done by using the drbd_queue_bitmap_io() and drbd_bitmap_io() functions these days. It is no longer necessary to lock out app-IO based on the connection state. App-IO that may come in after the BITMAP_IO flag got cleared before the state transition to C_SYNC_(SOURCE|TARGET) does not get mirrored, sets a bit in the local bitmap, that is already set, therefore changes nothing. * C_WF_BITMAP_S In this state we send updates (P_OUT_OF_SYNC packets). With that we make sure they have the same number of bits when going into the C_SYNC_(SOURCE|TARGET) connection state. * C_UNCONNECTED: The receiver starts, no need to lock out IO. * C_DISCONNECTING: in drbd_disconnect() we had a wait_event() to wait until ap_bio_cnt reaches 0. Removed that. * C_TIMEOUT, C_BROKEN_PIPE, C_NETWORK_FAILURE C_PROTOCOL_ERROR, C_TEAR_DOWN: Same as C_DISCONNECTING * C_WF_REPORT_PARAMS: IO still possible since that is still like C_WF_CONNECTION. And we do not need to send barriers in C_WF_BITMAP_S connection state. Allow concurrent accesses to the bitmap when receiving the bitmap. Everything gets ORed anyways. A drbd_free_tl_hash() is in after_state_chg_work(). At that point all the work items of the last connections must have been processed. Introduced a call to drbd_free_tl_hash() into drbd_free_mdev() for paranoia reasons. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Improvements in sanitize_state()Philipp Reisner
The relevant change is that the state change to C_FW_BITMAP_S should implicitly change pdsk to C_CONSISTENT. (Think of it as C_OUTDATED, only without the guarantee that the peer has the outdated written to its meta data) At that opportunity I restructured the switch statement so that it gets evaluated every time. (Has declarative character) Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Fixed race condition in drbd_queue_bitmap_ioPhilipp Reisner
May only test for ap_bio_cnt == 0 under req_lock. It can increase only under req_lock. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: use test_and_set_bit() to decide if bm_io_work should be queuedPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: When proxy's buffer drained off go into regular resync modePhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: New packet for Ahead/Behind mode: P_OUT_OF_SYNCPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Implemented two new connection states Ahead/BehindPhilipp Reisner
In this connection mode, the ahead node no longer replicates application IO. The behind's disk becomes out dated. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Track the numbers of sectors in flightPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: properly use max_hw_sectors to limit the our bio sizeLars Ellenberg
To ease tracking of bios in some hash tables, we want it to not cross certain boundaries (128k, used to be 32k). We limit the maximum bio size using queue parameters. Historically some defines and variables we use there have been named max_segment_size, which was misguided. Rename them to max_bio_size, and use [blk_]queue_max_hw_sectors where appropriate. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: detect modification of in-flight buffersLars Ellenberg
With data-integrity digest enabled, double-check on the sending side for modifications by upper layers of buffers under write back, so we can tell it appart from corruption on the "wire". Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: use the resync controller for online-verify requests as wellLars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: improve online-verify progress trackingLars Ellenberg
For a partial (resumed) online-verify, initialize rs_total not to total bits, but to number of bits to check in this run, to match the meaning rs_total has for actual resync. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10block: kill off REQ_UNPLUGJens Axboe
With the plugging now being explicitly controlled by the submitter, callers need not pass down unplugging hints to the block layer. If they want to unplug, it's because they manually plugged on their own - in which case, they should just unplug at will. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10block: remove per-queue pluggingJens Axboe
Code has been converted over to the new explicit on-stack plugging, and delay users have been converted to use the new API for that. So lets kill off the old plugging along with aops->sync_page(). Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-11-27Merge branch 'cleanup-bd_claim' of ↵Jens Axboe
git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into for-2.6.38/core
2010-11-13block: make blkdev_get/put() handle exclusive accessTejun Heo
Over time, block layer has accumulated a set of APIs dealing with bdev open, close, claim and release. * blkdev_get/put() are the primary open and close functions. * bd_claim/release() deal with exclusive open. * open/close_bdev_exclusive() are combination of open and claim and the other way around, respectively. * bd_link/unlink_disk_holder() to create and remove holder/slave symlinks. * open_by_devnum() wraps bdget() + blkdev_get(). The interface is a bit confusing and the decoupling of open and claim makes it impossible to properly guarantee exclusive access as in-kernel open + claim sequence can disturb the existing exclusive open even before the block layer knows the current open if for another exclusive access. Reorganize the interface such that, * blkdev_get() is extended to include exclusive access management. @holder argument is added and, if is @FMODE_EXCL specified, it will gain exclusive access atomically w.r.t. other exclusive accesses. * blkdev_put() is similarly extended. It now takes @mode argument and if @FMODE_EXCL is set, it releases an exclusive access. Also, when the last exclusive claim is released, the holder/slave symlinks are removed automatically. * bd_claim/release() and close_bdev_exclusive() are no longer necessary and either made static or removed. * bd_link_disk_holder() remains the same but bd_unlink_disk_holder() is no longer necessary and removed. * open_bdev_exclusive() becomes a simple wrapper around lookup_bdev() and blkdev_get(). It also has an unexpected extra bdev_read_only() test which probably should be moved into blkdev_get(). * open_by_devnum() is modified to take @holder argument and pass it to blkdev_get(). Most of bdev open/close operations are unified into blkdev_get/put() and most exclusive accesses are tested atomically at the open time (as it should). This cleans up code and removes some, both valid and invalid, but unnecessary all the same, corner cases. open_bdev_exclusive() and open_by_devnum() can use further cleanup - rename to blkdev_get_by_path() and blkdev_get_by_devt() and drop special features. Well, let's leave them for another day. Most conversions are straight-forward. drbd conversion is a bit more involved as there was some reordering, but the logic should stay the same. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Neil Brown <neilb@suse.de> Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Philipp Reisner <philipp.reisner@linbit.com> Cc: Peter Osterlund <petero2@telia.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jan Kara <jack@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <joel.becker@oracle.com> Cc: Alex Elder <aelder@sgi.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: dm-devel@redhat.com Cc: drbd-dev@lists.linbit.com Cc: Leo Chen <leochen@broadcom.com> Cc: Scott Branden <sbranden@broadcom.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: Joern Engel <joern@logfs.org> Cc: reiserfs-devel@vger.kernel.org Cc: Alexander Viro <viro@zeniv.linux.org.uk>
2010-11-10Merge branch 'for-2.6.37/drivers' into for-linusJens Axboe
Conflicts: drivers/block/cciss.c Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-28drivers/block/drbd/drbd_main.c: fix error pathNicolas Kaiser
Failure to create drbd_ee_mempool appears not to get checked. Looks like a copy-and-paste problem to me. Signed-off-by: Nicolas Kaiser <nikai@nikai.net> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>