asmadeus/linux.git - The linux kernel

Age	Commit message (Collapse)	Author
2014-08-04	bcache: try to set b->parent properly	Slava Pestov
	bcache_flash_dev.ktest would reliably crash with 8k and 16k bucket size before; now it passes. Change-Id: Ib542232235e39298c3a7548fe52b645cabb823d1
2014-08-04	bcache: wait for buckets when allocating new btree root	Slava Pestov
	Tested: - sometimes bcache_tier test would hang on startup with a failure to allocate the btree root -- no longer seeing this Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-03-18	bcache: Kill unused freelist	Kent Overstreet
	This was originally added as at optimization that for various reasons isn't needed anymore, but it does add a lot of nasty corner cases (and it was responsible for some recently fixed bugs). Just get rid of it now. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-03-18	bcache: Rework btree cache reserve handling	Kent Overstreet
	This changes the bucket allocation reserves to use _real_ reserves - separate freelists - instead of watermarks, which if nothing else makes the current code saner to reason about and is going to be important in the future when we add support for multiple btrees. It also adds btree_check_reserve(), which checks (and locks) the reserves for both bucket allocation and memory allocation for btree nodes; the old code just kinda sorta assumed that since (e.g. for btree node splits) it had the root locked and that meant no other threads could try to make use of the same reserve; this technically should have been ok for memory allocation (we should always have a reserve for memory allocation (the btree node cache is used as a reserve and we preallocate it)), but multiple btrees will mean that locking the root won't be sufficient anymore, and for the bucket allocation reserve it was technically possible for the old code to deadlock. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-03-18	bcache: btree locking rework	Kent Overstreet
	Add a new lock, b->write_lock, which is required to actually modify - or write - a btree node; this lock is only held for short durations. This means we can write out a btree node without taking b->lock, which _is_ held for long durations - solving a deadlock when btree_flush_write() (from the journalling code) is called with a btree node locked. Right now just occurs in bch_btree_set_root(), but with an upcoming journalling rework is going to happen a lot more. This also turns b->lock is now more of a read/intent lock instead of a read/write lock - but not completely, since it still blocks readers. May turn it into a real intent lock at some point in the future. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-03-18	bcache: Fix another bug recovering from unclean shutdown	Kent Overstreet
	The on disk bucket gens are allowed to be out of date, when we reuse buckets that didn't have any live data in them. To deal with this, the initial gc has to update the bucket gen when we find a pointer gen newer than the bucket's gen. Unfortunately we weren't doing this for pointers in the journal that we're about to replay. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-01-08	bcache: Convert btree_iter to struct btree_keys	Kent Overstreet
	More work to disentangle bset.c from struct btree Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-01-08	bcache: Add struct btree_keys	Kent Overstreet
	Soon, bset.c won't need to depend on struct btree. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-01-08	bcache: Abstract out stuff needed for sorting	Kent Overstreet
	Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-01-08	bcache: Rename/shuffle various code around	Kent Overstreet
	More work to disentangle bset.c from the rest of the code: Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-01-08	bcache: Split out sort_extent_cmp()	Kent Overstreet
	Only use extent comparison for comparing extents, so we're not using START_KEY() on other key types (i.e. btree pointers) Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-01-08	bcache: Btree verify code improvements	Kent Overstreet
	Used this fixed code to find and fix the bug fixed by a4d885097b0ac0cd1337f171f2d4b83e946094d4. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-01-08	bcache: kill index()	Kent Overstreet
	That was a terrible name for a macro, add some better helpers to replace it. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-01-08	bcache: Rework allocator reserves	Kent Overstreet
	We need a reserve for allocating buckets for new btree nodes - and now that we've got multiple btrees, it really needs to be per btree. This reworks the reserves so we've got separate freelists for each reserve instead of watermarks, which seems to make things a bit cleaner, and it adds some code so that btree_split() can make sure the reserve is available before it starts. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-01-08	bcache: kill closure locking usage	Kent Overstreet
	Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Avoid deadlocking in garbage collection	Kent Overstreet
	Not a complete fix - we could still deadlock if btree_insert_node() has to split... Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Incremental gc	Kent Overstreet
	Big garbage collection rewrite; now, garbage collection uses the same mechanisms as used elsewhere for inserting/updating btree node pointers, instead of rewriting interior btree nodes in place. This makes the code significantly cleaner and less fragile, and means we can now make garbage collection incremental - it doesn't have to hold a write lock on the root of the btree for the entire duration of garbage collection. This means that there's less of a latency hit for doing garbage collection, which means we can gc more frequently (and do a better job of reclaiming from the cache), and we can coalesce across more btree nodes (improving our space efficiency). Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: bch_(btree\|extent)_ptr_invalid()	Kent Overstreet
	Trying to treat btree pointers and leaf node pointers the same way was a mistake - going to start being more explicit about the type of key/pointer we're dealing with. This is the first part of that refactoring; this patch shouldn't change any actual behaviour. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Don't bother with bucket refcount for btree node allocations	Kent Overstreet
	The bucket refcount (dropped with bkey_put()) is only needed to prevent the newly allocated bucket from being garbage collected until we've added a pointer to it somewhere. But for btree node allocations, the fact that we have btree nodes locked is enough to guard against races with garbage collection. Eventually the per bucket refcount is going to be replaced with something specific to bch_alloc_sectors(). Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Debug code improvements	Kent Overstreet
	Couple changes: * Consolidate bch_check_keys() and bch_check_key_order(), and move the checks that only check_key_order() could do to bch_btree_iter_next(). * Get rid of CONFIG_BCACHE_EDEBUG - now, all that code is compiled in when CONFIG_BCACHE_DEBUG is enabled, and there's now a sysfs file to flip on the EDEBUG checks at runtime. * Dropped an old not terribly useful check in rw_unlock(), and refactored/improved a some of the other debug code. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Convert bch_btree_insert() to bch_btree_map_leaf_nodes()	Kent Overstreet
	Last of the btree_map() conversions. Main visible effect is bch_btree_insert() is no longer taking a struct btree_op as an argument anymore - there's no fancy state machine stuff going on, it's just a normal function. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Kill op->replace	Kent Overstreet
	This is prep work for converting bch_btree_insert to bch_btree_map_leaf_nodes() - we have to convert all its arguments to actual arguments. Bunch of churn, but should be straightforward. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Kill op->cl	Kent Overstreet
	This isn't used for waiting asynchronously anymore - so this is a fairly trivial refactoring. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Prune struct btree_op	Kent Overstreet
	Eventual goal is for struct btree_op to contain only what is necessary for traversing the btree. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Convert bch_btree_read_async() to bch_btree_map_keys()	Kent Overstreet
	This is a fairly straightforward conversion, mostly reshuffling - op->lookup_done goes away, replaced by MAP_DONE/MAP_CONTINUE. And the code for handling cache hits and misses wasn't really btree code, so it gets moved to request.c. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Move some stuff to btree.c	Kent Overstreet
	With the new btree_map() functions, we don't need to export the stuff needed for traversing the btree anymore. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Add btree_map() functions	Kent Overstreet
	Lots of stuff has been open coding its own btree traversal - which is generally pretty simple code, but there are a few subtleties. This adds new new functions, bch_btree_map_nodes() and bch_btree_map_keys(), which do the traversal for you. Everything that's open coding btree traversal now (with the exception of garbage collection) is slowly going to be converted to these two functions; being able to write other code at a higher level of abstraction is a big improvement w.r.t. overall code quality. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Convert gc to a kthread	Kent Overstreet
	We needed a dedicated rescuer workqueue for gc anyways... and gc was conceptually a dedicated thread, just one that wasn't running all the time. Switch it to a dedicated thread to make the code a bit more straightforward. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Convert bucket_wait to wait_queue_head_t	Kent Overstreet
	At one point we did do fancy asynchronous waiting stuff with bucket_wait, but that's all gone (and bucket_wait is used a lot less than it used to be). So use the standard primitives. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Convert try_wait to wait_queue_head_t	Kent Overstreet
	We never waited on c->try_wait asynchronously, so just use the standard primitives. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Move keylist out of btree_op	Kent Overstreet
	Slowly working on pruning struct btree_op - the aim is for it to only contain things that are actually necessary for traversing the btree. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Refactor request_write()	Kent Overstreet
	Try to improve some of the naming a bit to be more consistent, and also improve the flow of control in request_write() a bit. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Add explicit keylist arg to btree_insert()	Kent Overstreet
	Some refactoring - better to explicitly pass stuff around instead of having it all in the "big bag of state", struct btree_op. Going to prune struct btree_op quite a bit over time. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Convert btree_insert_check_key() to btree_insert_node()	Kent Overstreet
	This was the main point of all this refactoring - now, btree_insert_check_key() won't fail just because the leaf node happened to be full. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Explicitly track btree node's parent	Kent Overstreet
	This is prep work for the reworked btree insertion code. The way we set b->parent is ugly and hacky... the problem is, when btree_split() or garbage collection splits or rewrites a btree node, the parent changes for all its (potentially already cached) children. I may change this later and add some code to look through the btree node cache and find all our cached child nodes and change the parent pointer then... Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10	bcache: Remove unnecessary check in should_split()	Kent Overstreet
	Checking i->seq was redundant, because since ages ago we always initialize the new bset when advancing b->written Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-07-01	bcache: Delete fuzz tester	Kent Overstreet
	This code has rotted and it hasn't been used in ages anyways. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-06-26	bcache: Write out full stripes	Kent Overstreet
	Now that we're tracking dirty data per stripe, we can add two optimizations for raid5/6: * If a stripe is already dirty, force writes to that stripe to writeback mode - to help build up full stripes of dirty data * When flushing dirty data, preferentially write out full stripes first if there are any. Signed-off-by: Kent Overstreet <koverstreet@google.com>
2013-06-26	bcache: Rip out pkey()/pbtree()	Kent Overstreet
	Old gcc doesnt like the struct hack, and it is kind of ugly. So finish off the work to convert pr_debug() statements to tracepoints, and delete pkey()/pbtree(). Signed-off-by: Kent Overstreet <koverstreet@google.com>
2013-06-26	bcache: Refactor btree io	Kent Overstreet
	The most significant change is that btree reads are now done synchronously, instead of asynchronously and doing the post read stuff from a workqueue. This was originally done because we can't block on IO under generic_make_request(). But - we already have a mechanism to punt cache lookups to workqueue if needed, so if we just use that we don't have to deal with the complexity of doing things asynchronously. The main benefit is this makes the locking situation saner; we can hold our write lock on the btree node until we're finished reading it, and we don't need that btree_node_read_done() flag anymore. Also, for writes, btree_write() was broken out into btree_node_write() and btree_leaf_dirty() - the old code with the boolean argument was dumb and confusing. The prio_blocked mechanism was improved a bit too, now the only counter is in struct btree_write, we don't mess with transfering a count from struct btree anymore. This required changing garbage collection to block prios at the start and unblock when it finishes, which is cleaner than what it was doing anyways (the old code had mostly the same effect, but was doing it in a convoluted way) And the btree iter btree_node_read_done() uses was converted to a real mempool. Signed-off-by: Kent Overstreet <koverstreet@google.com>
2013-03-23	bcache: A block layer cache	Kent Overstreet
	Does writethrough and writeback caching, handles unclean shutdown, and has a bunch of other nifty features motivated by real world usage. See the wiki at http://bcache.evilpiepirate.org for more. Signed-off-by: Kent Overstreet <koverstreet@google.com>