summaryrefslogtreecommitdiffstats
path: root/fs/btrfs
AgeCommit message (Collapse)Author
2011-04-07Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6Linus Torvalds
* 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6: Fix common misspellings
2011-04-05Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: don't warn in btrfs_add_orphan Btrfs: fix free space cache when there are pinned extents and clusters V2 Btrfs: Fix uninitialized root flags for subvolumes btrfs: clear __GFP_FS flag in the space cache inode Btrfs: fix memory leak in start_transaction() Btrfs: fix memory leak in btrfs_ioctl_start_sync() Btrfs: fix subvol_sem leak in btrfs_rename() Btrfs: Fix oops for defrag with compression turned on Btrfs: fix /proc/mounts info. Btrfs: fix compiler warning in file.c
2011-04-05Btrfs: don't warn in btrfs_add_orphanJosef Bacik
When I moved the orphan adding to btrfs_truncate I missed the fact that during orphan cleanup we just add the orphan items to the orphan list without going through btrfs_orphan_add, which results in lots of warnings on mount if you have any orphan items that need to be truncated. Just remove this warning since it's ok, this will allow all of the normal space accounting take place. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-04-05Btrfs: fix free space cache when there are pinned extents and clusters V2Josef Bacik
I noticed a huge problem with the free space cache that was presenting as an early ENOSPC. Turns out when writing the free space cache out I forgot to take into account pinned extents and more importantly clusters. This would result in us leaking free space everytime we unmounted the filesystem and remounted it. I fix this by making sure to check and see if the current block group has a cluster and writing out any entries that are in the cluster to the cache, as well as writing any pinned extents we currently have to the cache since those will be available for us to use the next time the fs mounts. This patch also adds a check to the end of load_free_space_cache to make sure we got the right amount of free space cache, and if not make sure to clear the cache and re-cache the old fashioned way. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-04-05Btrfs: Fix uninitialized root flags for subvolumesLi Zefan
root_item->flags and root_item->byte_limit are not initialized when a subvolume is created. This bug is not revealed until we added readonly snapshot support - now you mount a btrfs filesystem and you may find the subvolumes in it are readonly. To work around this problem, we steal a bit from root_item->inode_item->flags, and use it to indicate if those fields have been properly initialized. When we read a tree root from disk, we check if the bit is set, and if not we'll set the flag and initialize the two fields of the root item. Reported-by: Andreas Philipp <philipp.andreas@gmail.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Tested-by: Andreas Philipp <philipp.andreas@gmail.com> cc: stable@kernel.org Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-04-05btrfs: clear __GFP_FS flag in the space cache inodeMiao Xie
the object id of the space cache inode's key is allocated from the relative root, just like the regular file. So we can't identify space cache inode by checking the object id of the inode's key, and we have to clear __GFP_FS flag at the time we look up the space cache inode. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-04-05Btrfs: fix memory leak in start_transaction()Yoshinori Sano
Free btrfs_trans_handle when join_transaction() fails in start_transaction() Signed-off-by: Yoshinori Sano <yoshinori.sano@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-04-05Btrfs: fix memory leak in btrfs_ioctl_start_sync()Tsutomu Itoh
Call btrfs_end_transaction() if btrfs_commit_transaction_async() fails. Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-04-05Btrfs: fix subvol_sem leak in btrfs_rename()Johann Lombardi
btrfs_rename() does not release the subvol_sem if the transaction failed to start. Signed-off-by: Johann Lombardi <johann@whamcloud.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-04-05Btrfs: Fix oops for defrag with compression turned onLi Zefan
When we defrag a file, whose size can be fit into an inline extent, with compression enabled, the compress type is set to be fs_info->compress_type, which is 0 if the btrfs filesystem is mounted without compress option. This leads to oops. Reported-by: Daniel Blueman <daniel.blueman@gmail.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-04-05Btrfs: fix /proc/mounts info.Tsutomu Itoh
Some mount options are not displayed by /proc/mounts. This patch displays the option such as compress_type by /proc/mounts. Ex. [before] $ mount | grep sdc2 /dev/sdc2 on /test12 type btrfs (rw,space_cache,compress=lzo) $ cat /proc/mounts | grep sdc2 /dev/sdc2 /test12 btrfs rw,relatime,compress 0 0 [after] $ mount | grep sdc2 /dev/sdc2 on /test12 type btrfs (rw,space_cache,compress=lzo) $ cat /proc/mounts | grep sdc2 /dev/sdc2 /test12 btrfs rw,relatime,compress=lzo,space_cache 0 0 Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-04-05Btrfs: fix compiler warning in file.cTsutomu Itoh
While compiling Btrfs, I got following messages: CC [M] fs/btrfs/file.o fs/btrfs/file.c: In function '__btrfs_buffered_write': fs/btrfs/file.c:909: warning: 'ret' may be used uninitialized in this function CC [M] fs/btrfs/tree-defrag.o This patch fixes compiler warning. Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-31Fix common misspellingsLucas De Marchi
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-28Merge branch 'for-linus-unmerged' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable * 'for-linus-unmerged' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (45 commits) Btrfs: fix __btrfs_map_block on 32 bit machines btrfs: fix possible deadlock by clearing __GFP_FS flag btrfs: check link counter overflow in link(2) btrfs: don't mess with i_nlink of unlocked inode in rename() Btrfs: check return value of btrfs_alloc_path() Btrfs: fix OOPS of empty filesystem after balance Btrfs: fix memory leak of empty filesystem after balance Btrfs: fix return value of setflags ioctl Btrfs: fix uncheck memory allocations btrfs: make inode ref log recovery faster Btrfs: add btrfs_trim_fs() to handle FITRIM Btrfs: adjust btrfs_discard_extent() return errors and trimmed bytes Btrfs: make btrfs_map_block() return entire free extent for each device of RAID0/1/10/DUP Btrfs: make update_reserved_bytes() public btrfs: return EXDEV when linking from different subvolumes Btrfs: Per file/directory controls for COW and compression Btrfs: add datacow flag in inode flag btrfs: use GFP_NOFS instead of GFP_KERNEL Btrfs: check return value of read_tree_block() btrfs: properly access unaligned checksum buffer ... Fix up trivial conflicts in fs/btrfs/volumes.c due to plug removal in the block layer.
2011-03-28Btrfs: fix __btrfs_map_block on 32 bit machinesChris Mason
Recent changes for discard support didn't compile, this fixes them not to try and % 64 bit numbers. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28btrfs: fix possible deadlock by clearing __GFP_FS flagMiao Xie
Using the GFP_HIGHUSER_MOVABLE flag to allocate the metadata's page may cause deadlock. Task1 open() ... btrfs_search_slot() ... btrfs_cow_block() ... alloc_page() wait for reclaiming shrink_slab() ... shrink_icache_memory() ... btrfs_evict_inode() ... btrfs_search_slot() If the path is locked by task1, the deadlock happens. So the btree's page cache is different with the file's page cache, it can not allocate pages by GFP_HIGHUSER_MOVABLE flag, we must clear __GFP_FS flag in GFP_HIGHUSER_MOVABLE flag. Reported-by: Itaru Kitayama <kitayama@cl.bb4u.ne.jp> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28btrfs: check link counter overflow in link(2)Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28btrfs: don't mess with i_nlink of unlocked inode in rename()Al Viro
old_inode is not locked; it's not safe to play with its link count. Instead of bumping it and calling btrfs_unlink_inode(), add a variant of the latter that does not do btrfs_drop_nlink()/ btrfs_update_inode(), call it instead of btrfs_inc_nlink()/ btrfs_unlink_inode() and do btrfs_update_inode() ourselves. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28Btrfs: check return value of btrfs_alloc_path()Tsutomu Itoh
Adding the check on the return value of btrfs_alloc_path() to several places. And, some of callers are modified by this change. Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28Btrfs: fix OOPS of empty filesystem after balanceliubo
btrfs will remove unused block groups after balance. When a empty filesystem is balanced, the block group with tag "DATA" may be dropped, and after umount and mount again, it will not find "DATA" space_info and lead to OOPS. So we initial the necessary space_infos(DATA, SYSTEM, METADATA) to avoid OOPS. Reported-by: Daniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28Btrfs: fix memory leak of empty filesystem after balanceliubo
After Josef's patch(commit 3c14874acc71180553fb5aba528e3cf57c5b958b), btrfs will exclude super bytes when reading block groups(by marking a extent state UPTODATE). However, these bytes do not get freed while balance remove unused block groups, and we won't process those removed ones any more, when we do umount and unload the btrfs module, btrfs hits a memory leak. This patch add the missing free operation. Reproduce steps: $ mkfs.btrfs disk $ mount disk /mnt/btrfs -o loop $ btrfs filesystem balance /mnt/btrfs $ umount /mnt/btrfs $ rmmod btrfs Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28Btrfs: fix return value of setflags ioctlliubo
setflags ioctl should return error when any checks fail. Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28Btrfs: fix uncheck memory allocationsYoshinori Sano
To make Btrfs code more robust, several return value checks where memory allocation can fail are introduced. I use BUG_ON where I don't know how to handle the error properly, which increases the number of using the notorious BUG_ON, though. Signed-off-by: Yoshinori Sano <yoshinori.sano@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28btrfs: make inode ref log recovery fasterliubo
When we recover from crash via write-ahead log tree and process the inode refs, for each btrfs_inode_ref item, we will 1) check if we already have a perfect match in fs/file tree, if we have, then we're done. 2) search the corresponding back reference in fs/file tree, and check all the names in this back reference to see if they are also in the log to avoid conflict corners. 3) recover the logged inode refs to fs/file tree. In current btrfs, however, - for 2)'s check, once is enough, since the checked back reference will remain unchanged after processing all the inode refs belonged to the key. - it has no need to do another 1) between 2) and 3). I've made a small test to show how it improves, $dd if=/dev/zero of=foobar bs=4K count=1 $sync $make 100 hard links continuously, like ln foobar link_i $fsync foobar $echo b > /proc/sysrq-trigger after reboot $time mount DEV PATH without patch: real 0m0.285s user 0m0.001s sys 0m0.009s with patch: real 0m0.123s user 0m0.000s sys 0m0.010s Changelog v1->v2: - fix double free - pointed by David Sterba Changelog v2->v3: - adjust free order Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28Btrfs: add btrfs_trim_fs() to handle FITRIMLi Dongyang
We take an free extent out from allocator, trim it, then put it back, but before we trim the block group, we should make sure the block group is cached, so plus a little change to make cache_block_group() run without a transaction. Signed-off-by: Li Dongyang <lidongyang@novell.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28Btrfs: adjust btrfs_discard_extent() return errors and trimmed bytesLi Dongyang
Callers of btrfs_discard_extent() should check if we are mounted with -o discard, as we want to make fitrim to work even the fs is not mounted with -o discard. Also we should use REQ_DISCARD to map the free extent to get a full mapping, last we only return errors if 1. the error is not a EOPNOTSUPP 2. no device supports discard Signed-off-by: Li Dongyang <lidongyang@novell.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28Btrfs: make btrfs_map_block() return entire free extent for each device of ↵Li Dongyang
RAID0/1/10/DUP btrfs_map_block() will only return a single stripe length, but we want the full extent be mapped to each disk when we are trimming the extent, so we add length to btrfs_bio_stripe and fill it if we are mapping for REQ_DISCARD. Signed-off-by: Li Dongyang <lidongyang@novell.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28Btrfs: make update_reserved_bytes() publicLi Dongyang
Make the function public as we should update the reserved extents calculations after taking out an extent for trimming. Signed-off-by: Li Dongyang <lidongyang@novell.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28btrfs: return EXDEV when linking from different subvolumesMark Fasheh
btrfs_link returns EPERM if a cross-subvolume link is attempted. However, in this case I believe EXDEV to be the more appropriate value. >From the link(2) man page: EXDEV oldpath and newpath are not on the same mounted file system. (Linux permits a file system to be mounted at multiple points, but link() does not work across different mount points, even if the same file system is mounted on both.) This matters because an application may have different behaviors based on return codes. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28Btrfs: Per file/directory controls for COW and compressionLiu Bo
Data compression and data cow are controlled across the entire FS by mount options right now. ioctls are needed to set this on a per file or per directory basis. This has been proposed previously, but VFS developers wanted us to use generic ioctls rather than btrfs-specific ones. According to Chris's comment, there should be just one true compression method(probably LZO) stored in the super. However, before this, we would wait for that one method is stable enough to be adopted into the super. So I list it as a long term goal, and just store it in ram today. After applying this patch, we can use the generic "FS_IOC_SETFLAGS" ioctl to control file and directory's datacow and compression attribute. NOTE: - The compression type is selected by such rules: If we mount btrfs with compress options, ie, zlib/lzo, the type is it. Otherwise, we'll use the default compress type (zlib today). v1->v2: - rebase to the latest btrfs. v2->v3: - fix a problem, i.e. when a file is set NOCOW via mount option, then this NOCOW will be screwed by inheritance from parent directory. Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28btrfs: use GFP_NOFS instead of GFP_KERNELMiao Xie
In the filesystem context, we must allocate memory by GFP_NOFS, or we may start another filesystem operation and make kswap thread hang up. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28Btrfs: check return value of read_tree_block()Tsutomu Itoh
This patch is checking return value of read_tree_block(), and if it is NULL, error processing. Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28btrfs: properly access unaligned checksum bufferDavid Sterba
On Fri, Mar 18, 2011 at 11:56:53AM -0400, Chris Mason wrote: > Thanks for fielding this one. Does put_unaligned_le32 optimize away on > platforms with efficient access? It would be great if we didn't need > the #ifdef. (quicktest: assembly output is same for put_unaligned_le32 and direct assignment on my x86_64) I was originally following examples in Documentation/unaligned-memory-access.txt. From other code it seems to me that the define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is intended for larger portions of code. Macros/wrappers for {put,get}_unaligned* are chosen via arch/<arch>/include/asm/unaligned.h accordingly, therefore it's safe to use put_unaligned_le32 without the ifdef. dave Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28Btrfs: cleanup some BUG_ON()Tsutomu Itoh
This patch changes some BUG_ON() to the error return. (but, most callers still use BUG_ON()) Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28Btrfs: add initial tracepoint support for btrfsliubo
Tracepoints can provide insight into why btrfs hits bugs and be greatly helpful for debugging, e.g dd-7822 [000] 2121.641088: btrfs_inode_request: root = 5(FS_TREE), gen = 4, ino = 256, blocks = 8, disk_i_size = 0, last_trans = 8, logged_trans = 0 dd-7822 [000] 2121.641100: btrfs_inode_new: root = 5(FS_TREE), gen = 8, ino = 257, blocks = 0, disk_i_size = 0, last_trans = 0, logged_trans = 0 btrfs-transacti-7804 [001] 2146.935420: btrfs_cow_block: root = 2(EXTENT_TREE), refs = 2, orig_buf = 29368320 (orig_level = 0), cow_buf = 29388800 (cow_level = 0) btrfs-transacti-7804 [001] 2146.935473: btrfs_cow_block: root = 1(ROOT_TREE), refs = 2, orig_buf = 29364224 (orig_level = 0), cow_buf = 29392896 (cow_level = 0) btrfs-transacti-7804 [001] 2146.972221: btrfs_transaction_commit: root = 1(ROOT_TREE), gen = 8 flush-btrfs-2-7821 [001] 2155.824210: btrfs_chunk_alloc: root = 3(CHUNK_TREE), offset = 1103101952, size = 1073741824, num_stripes = 1, sub_stripes = 0, type = DATA flush-btrfs-2-7821 [001] 2155.824241: btrfs_cow_block: root = 2(EXTENT_TREE), refs = 2, orig_buf = 29388800 (orig_level = 0), cow_buf = 29396992 (cow_level = 0) flush-btrfs-2-7821 [001] 2155.824255: btrfs_cow_block: root = 4(DEV_TREE), refs = 2, orig_buf = 29372416 (orig_level = 0), cow_buf = 29401088 (cow_level = 0) flush-btrfs-2-7821 [000] 2155.824329: btrfs_cow_block: root = 3(CHUNK_TREE), refs = 2, orig_buf = 20971520 (orig_level = 0), cow_buf = 20975616 (cow_level = 0) btrfs-endio-wri-7800 [001] 2155.898019: btrfs_cow_block: root = 5(FS_TREE), refs = 2, orig_buf = 29384704 (orig_level = 0), cow_buf = 29405184 (cow_level = 0) btrfs-endio-wri-7800 [001] 2155.898043: btrfs_cow_block: root = 7(CSUM_TREE), refs = 2, orig_buf = 29376512 (orig_level = 0), cow_buf = 29409280 (cow_level = 0) Here is what I have added: 1) ordere_extent: btrfs_ordered_extent_add btrfs_ordered_extent_remove btrfs_ordered_extent_start btrfs_ordered_extent_put These provide critical information to understand how ordered_extents are updated. 2) extent_map: btrfs_get_extent extent_map is used in both read and write cases, and it is useful for tracking how btrfs specific IO is running. 3) writepage: __extent_writepage btrfs_writepage_end_io_hook Pages are cirtical resourses and produce a lot of corner cases during writeback, so it is valuable to know how page is written to disk. 4) inode: btrfs_inode_new btrfs_inode_request btrfs_inode_evict These can show where and when a inode is created, when a inode is evicted. 5) sync: btrfs_sync_file btrfs_sync_fs These show sync arguments. 6) transaction: btrfs_transaction_commit In transaction based filesystem, it will be useful to know the generation and who does commit. 7) back reference and cow: btrfs_delayed_tree_ref btrfs_delayed_data_ref btrfs_delayed_ref_head btrfs_cow_block Btrfs natively supports back references, these tracepoints are helpful on understanding btrfs's COW mechanism. 8) chunk: btrfs_chunk_alloc btrfs_chunk_free Chunk is a link between physical offset and logical offset, and stands for space infomation in btrfs, and these are helpful on tracing space things. 9) reserved_extent: btrfs_reserved_extent_alloc btrfs_reserved_extent_free These can show how btrfs uses its space. Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-28Btrfs: use RCU instead of a spinlock to protect the root nodeChris Mason
The pointer to the extent buffer for the root of each tree is protected by a spinlock so that we can safely read the pointer and take a reference on the extent buffer. But now that the extent buffers are freed via RCU, we can safely use rcu_read_lock instead. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-03-25Btrfs: mark the bio with an error if we have a failure in dioJosef Bacik
I noticed that dio_end_io calls the appropriate endio function with an error, but the endio functions don't actually do anything with that error, they assume that if there was an error then the bio will not be uptodate. So if we had checksum failures we would never pass back EIO. So if there is an error in our endio functions make sure to clear the uptodate flag on the bio. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
2011-03-25Btrfs: don't allocate dip->csums when doing writesJosef Bacik
When doing direct writes we store the checksums in the ordered sum stuff in the ordered extent for writing them when the write completes, so we don't even use the dip->csums array. So if we're writing, don't bother allocating dip->csums since we won't use it anyway. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
2011-03-25Btrfs: cleanup how we setup free space clustersJosef Bacik
This patch makes the free space cluster refilling code a little easier to understand, and fixes some things with the bitmap part of it. Currently we either want to refill a cluster with 1) All normal extent entries (those without bitmaps) 2) A bitmap entry with enough space The current code has this ugly jump around logic that will first try and fill up the cluster with extent entries and then if it can't do that it will try and find a bitmap to use. So instead split this out into two functions, one that tries to find only normal entries, and one that tries to find bitmaps. This also fixes a suboptimal thing we would do with bitmaps. If we used a bitmap we would just tell the cluster that we were pointing at a bitmap and it would do the tree search in the block group for that entry every time we tried to make an allocation. Instead of doing that now we just add it to the clusters group. I tested this with my ENOSPC tests and xfstests and it survived. Signed-off-by: Josef Bacik <josef@redhat.com>
2011-03-24Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits) Documentation/iostats.txt: bit-size reference etc. cfq-iosched: removing unnecessary think time checking cfq-iosched: Don't clear queue stats when preempt. blk-throttle: Reset group slice when limits are changed blk-cgroup: Only give unaccounted_time under debug cfq-iosched: Don't set active queue in preempt block: fix non-atomic access to genhd inflight structures block: attempt to merge with existing requests on plug flush block: NULL dereference on error path in __blkdev_get() cfq-iosched: Don't update group weights when on service tree fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away block: Require subsystems to explicitly allocate bio_set integrity mempool jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging fs: make fsync_buffers_list() plug mm: make generic_writepages() use plugging blk-cgroup: Add unaccounted time to timeslice_used. block: fixup plugging stubs for !CONFIG_BLOCK block: remove obsolete comments for blkdev_issue_zeroout. blktrace: Use rq->cmd_flags directly in blk_add_trace_rq. ... Fix up conflicts in fs/{aio.c,super.c}
2011-03-23userns: rename is_owner_or_cap to inode_owner_or_capableSerge E. Hallyn
And give it a kernel-doc comment. [akpm@linux-foundation.org: btrfs changed in linux-next] Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-22zlib: slim down zlib_deflate() workspace when possibleJim Keniston
Instead of always creating a huge (268K) deflate_workspace with the maximum compression parameters (windowBits=15, memLevel=8), allow the caller to obtain a smaller workspace by specifying smaller parameter values. For example, when capturing oops and panic reports to a medium with limited capacity, such as NVRAM, compression may be the only way to capture the whole report. In this case, a small workspace (24K works fine) is a win, whether you allocate the workspace when you need it (i.e., during an oops or panic) or at boot time. I've verified that this patch works with all accepted values of windowBits (positive and negative), memLevel, and compression level. Signed-off-by: Jim Keniston <jkenisto@us.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: David Miller <davem@davemloft.net> Cc: Chris Mason <chris.mason@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-21Btrfs: don't be as aggressive about using bitmapsJosef Bacik
We have been creating bitmaps for small extents unconditionally forever. This was great when testing to make sure the bitmap stuff was working, but is overkill normally. So instead of always adding small chunks of free space to bitmaps, only start doing it if we go past half of our extent threshold. This will keeps us from creating a bitmap for just one small free extent at the front of the block group, and will make the allocator a little faster as a result. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
2011-03-21Btrfs: deal with min_bytes appropriately when looking for a clusterJosef Bacik
We do all this fun stuff with min_bytes, but either don't use it in the case of just normal extents, or use it completely wrong in the case of bitmaps. So fix this for both cases 1) In the extent case, stop looking for space with window_free >= min_bytes instead of bytes + empty_size. 2) In the bitmap case, we were looking for streches of free space that was at least min_bytes in size, which was not right at all. So instead search for stretches of free space that are at least bytes in size (this will make a difference when we have > page size blocks) and then only search for min_bytes amount of free space. Thanks, Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Josef Bacik <josef@redhat.com>
2011-03-21Btrfs: check free space in block group before searching for a clusterJosef Bacik
The free space cluster stuff is heavy duty, so there is no sense in going through the entire song and dance if there isn't enough space in the block group to begin with. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
2011-03-18Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (47 commits) doc: CONFIG_UNEVICTABLE_LRU doesn't exist anymore Update cpuset info & webiste for cgroups dcdbas: force SMI to happen when expected arch/arm/Kconfig: remove one to many l's in the word. asm-generic/user.h: Fix spelling in comment drm: fix printk typo 'sracth' Remove one to many n's in a word Documentation/filesystems/romfs.txt: fixing link to genromfs drivers:scsi Change printk typo initate -> initiate serial, pch uart: Remove duplicate inclusion of linux/pci.h header fs/eventpoll.c: fix spelling mm: Fix out-of-date comments which refers non-existent functions drm: Fix printk typo 'failled' coh901318.c: Change initate to initiate. mbox-db5500.c Change initate to initiate. edac: correct i82975x error-info reported edac: correct i82975x mci initialisation edac: correct commented info fs: update comments to point correct document target: remove duplicate include of target/target_core_device.h from drivers/target/target_core_hba.c ... Trivial conflict in fs/eventpoll.c (spelling vs addition)
2011-03-17Btrfs: add checks to verify dir items are correctJosef Bacik
We need to make sure the dir items we get are valid dir items. So any time we try and read one check it with verify_dir_item, which will do various sanity checks to make sure it looks sane. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
2011-03-17Btrfs: check return value of btrfs_search_slot properlyJosef Bacik
Doing an audit of where we use btrfs_search_slot only showed one place where we don't check the return value of btrfs_search_slot properly. Just fix mark_extent_written to see if btrfs_search_slot failed and act accordingly. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
2011-03-17Btrfs: check items for correctness as we searchJosef Bacik
Currently if we have corrupted items things will blow up in spectacular ways. So as we read in blocks and they are leaves, check the entire leaf to make sure all of the items are correct and point to valid parts in the leaf for the item data the are responsible for. If the item is corrupt we will kick back EIO and not read any of the copies since they are likely to not be correct either. This will catch generic corruptions, it will be up to the individual callers of btrfs_search_slot to make sure their items are right. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>
2011-03-17Btrfs: return error if the range we want to map is bogusJosef Bacik
Currently if we have corrupt metadata map_extent_buffer will complain about it, but not return an error so the caller has no idea a problem was hit. Fix this. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>