summaryrefslogtreecommitdiffstats
path: root/fs
AgeCommit message (Collapse)Author
2011-02-20ocfs2: Use hrtimer to track ocfs2 fs lock statsSunil Mushran
Patch makes use of the hrtimer to track times in ocfs2 lock stats. The patch is a bit involved to ensure no additional impact on the memory footprint. The size of ocfs2_inode_cache remains 1280 bytes on 32-bit systems. A related change was to modify the unit of the max wait time from nanosec to microsec allowing us to track max time larger than 4 secs. This change necessitated the bumping of the output version in the debugfs file, locking_state, from 2 to 3. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
2011-02-20ocfs2: Check heartbeat mode for kernel stacks onlyMark Fasheh
Commit 2c442719e90a44a6982c033d69df4aae4b167cfa added some checks for proper heartbeat mode when the o2cb stack is running. Unfortunately, it didn't take into account that a userpsace stack could be running. Fix this by only doing the check if o2cb is in use. This patch allows userspace stacks to mount the fs again. Cc: stable@kernel.org Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
2011-02-20Ocfs2/refcounttree: Fix a bug for refcounttree to writeback clusters in a ↵Tristan Ye
right number. Current refcounttree codes actually didn't writeback the new pages out in write-back mode, due to a bug of always passing a ZERO number of clusters to 'ocfs2_cow_sync_writeback', the patch tries to pass a proper one in. Signed-off-by: Tristan Ye <tristan.ye@oracle.com> Cc: stable@kernel.org Signed-off-by: Joel Becker <jlbec@evilplan.org>
2011-02-20ocfs2: Fix estimate of necessary credits for mkdirJan Kara
In the rare case that INLINE_DATA, INDEX_DIR, QUOTA, XATTR features are disabled and both the allocation of the directory inode and the allocation of the first directory block need to relink allocation group, there need not be enough credits reserved in a transaction. Fix the estimate. CC: Mark Fasheh <mfasheh@suse.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Joel Becker <jlbec@evilplan.org>
2011-02-19ceph: keep reference to parent inode on ceph_dentryYehuda Sadeh
When creating a new dentry we now hold a reference to the parent inode in the ceph_dentry. This is required due to the new RCU changes from 949854d0, which set dentry->d_parent to NULL in d_kill before calling the ->release() callback. If/when that behavior is changed, we can revert this hack. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
2011-02-18Merge branch 'fixes-2.6.38' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq * 'fixes-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: make sure MAYDAY_INITIAL_TIMEOUT is at least 2 jiffies long workqueue, freezer: unify spelling of 'freeze' + 'able' to 'freezable' workqueue: wake up a worker when a rescuer is leaving a gcwq
2011-02-18debugfs: Fix filesystem reference counting on debugfs_remove() failureJan Kara
When __debugfs_remove() fails (because simple_rmdir() fails e.g. when a directory is not empty), we must not decrement use count of the filesystem as nothing was in fact deleted. This fixes use after free caused by debugfs in some cases. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-02-17eCryptfs: Revert "dont call lookup_one_len to avoid NULL nameidata"Tyler Hicks
This reverts commit 21edad32205e97dc7ccb81a85234c77e760364c8 and commit 93c3fe40c279f002906ad14584c30671097d4394, which fixed a regression by the former. Al Viro pointed out bypassed dcache lookups in ecryptfs_new_lower_dentry(), misuse of vfs_path_lookup() in ecryptfs_lookup_one_lower() and a dislike of passing nameidata to the lower filesystem. Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-02-17fs/partitions: Validate map_count in Mac partition tablesTimo Warns
Validate number of blocks in map and remove redundant variable. Signed-off-by: Timo Warns <warns@pre-sense.de> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-17fs/eventpoll.c: fix spellingDaniel Baluta
eventpoll.c has wonderful comments but some annoying typos sneaked in: * toepoll_ctl -> to epoll_ctl * rapresent -> represents * sructure -> structure * machanism -> mechanism * trasfering -> transferring Signed-off-by: Daniel Baluta <daniel.baluta@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-02-17fs: update comments to point correct documentNamhyung Kim
dcache-locking.txt is not exist any more, and the path was not correct anyway. Fix it. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-02-16Merge branch 'for-2.6.38' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
* 'for-2.6.38' of git://linux-nfs.org/~bfields/linux: nfsd: correctly handle return value from nfsd_map_name_to_*
2011-02-17cifs: fix handling of scopeid in cifs_convert_addressJeff Layton
The code finds, the '%' sign in an ipv6 address and copies that to a buffer allocated on the stack. It then ignores that buffer, and passes 'pct' to simple_strtoul(), which doesn't work right because we're comparing 'endp' against a completely different string. Fix it by passing the correct pointer. While we're at it, this is a good candidate for conversion to strict_strtoul as well. Cc: stable@kernel.org Cc: David Howells <dhowells@redhat.com> Reported-by: Björn JACKE <bj@sernet.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-02-16block: revert block_dev read-only checkChuck Ebbert
This reverts commit 75f1dc0d076d ("block: check bdev_read_only() from blkdev_get()"). That commit added stricter checking to make sure devices that were being used read-only were actually opened in that mode. It turns out that the change breaks a bunch of kernel code that opens block devices. Affected systems include dm, md, and the loop device. Because strict checking for read-only opens of block devices was not done before this, the code that opens the devices was opening them read-write even if they were being used read-only. Auditing all that code will take time, and new userspace packages for dm, mdadm, etc. will also be required. Signed-off-by: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-16nfsd: correctly handle return value from nfsd_map_name_to_*NeilBrown
These functions return an nfs status, not a host_err. So don't try to convert before returning. This is a regression introduced by 3c726023402a2f3b28f49b9d90ebf9e71151157d; I fixed up two of the callers, but missed these two. Cc: stable@kernel.org Reported-by: Herbert Poetzl <herbert@13thfloor.at> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-02-16Btrfs: set FMODE_EXCL in btrfs_device->modeIlya Dryomov
This fixes a bug introduced in d4d77629, where the device added online (and therefore initialized via btrfs_init_new_device()) would be left with the positive bdev->bd_holders after unmount. Since d4d77629 we no longer OR FMODE_EXCL explicitly on blkdev_put(), set it in btrfs_device->mode. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-02-16Btrfs: make btrfs_rm_device() fail gracefullyIlya Dryomov
If shrinking done as part of the online device removal fails add that device back to the allocation list and increment the rw_devices counter. This fixes two bugs: 1) we could have a perfectly good device out of alloc list for no good reason; 2) in the btrfs consisting of two devices, failure in btrfs_rm_device() could lead to a situation where it was impossible to remove any of the devices because of the "unable to remove the only writeable device" error. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-02-16Btrfs: Avoid accessing unmapped kernel addressLi Zefan
When decompressing a chunk of data, we'll copy the data out to a working buffer if the data is stored in more than one page, otherwise we'll use the mapped page directly to avoid memory copy. In the latter case, we'll end up accessing the kernel address after we've unmapped the page in a corner case. Reported-by: Juan Francisco Cantero Hurtado <iam@juanfra.info> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-02-16Btrfs: Fix BTRFS_IOC_SUBVOL_SETFLAGS ioctlLi Zefan
- Check user-specified flags correctly - Check the inode owership - Search root item in root tree but not fs tree Reported-by: Dan Rosenberg <drosenberg@vsecurity.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-02-16Btrfs: allow balance to explicitly allocate chunks as it relocatesChris Mason
Btrfs device shrinking and balancing ends up reallocating all the blocks in order to allow COW to move them to new destinations. It is somewhat awkward in terms of ENOSPC because most of the enospc code is built around the idea that some operation on a reference counted tree triggers allocations in the non-reference counted trees. This commit changes the balancing code to deal with enospc by trying to allocate a new chunk. If that allocation succeeds, we go ahead and retry whatever failed due to enospc. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-02-16Btrfs: put ENOSPC debugging under a mount optionChris Mason
ENOSPC in btrfs is getting to the point where the extra debugging isn't required. I've put it under mount -o enospc_debug just in case someone is having difficult problems. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-02-16vfs: fix BUG_ON() in fs/namei.c:1461Linus Torvalds
When Al moved the nameidata_dentry_drop_rcu_maybe() call into the do_follow_link function in commit 844a391799c2 ("nothing in do_follow_link() is going to see RCU"), he mistakenly left the BUG_ON(inode != path->dentry->d_inode); behind. Which would otherwise be ok, but that BUG_ON() really needs to be _after_ dropping RCU, since the dentry isn't necessarily stable otherwise. So complete the code movement in that commit, and move the BUG_ON() into do_follow_link() too. This means that we need to pass in 'inode' as an argument (just for this one use), but that's a small thing. And eventually we may be confident enough in our path lookup that we can just remove the BUG_ON() and the unnecessary inode argument. Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-16workqueue, freezer: unify spelling of 'freeze' + 'able' to 'freezable'Tejun Heo
There are two spellings in use for 'freeze' + 'able' - 'freezable' and 'freezeable'. The former is the more prominent one. The latter is mostly used by workqueue and in a few other odd places. Unify the spelling to 'freezable'. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Dmitry Torokhov <dtor@mail.ru> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alex Dubov <oakad@yahoo.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Steven Whitehouse <swhiteho@redhat.com>
2011-02-15Merge branch 'for-2.6.38' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
* 'for-2.6.38' of git://linux-nfs.org/~bfields/linux: nfsd: break lease on unlink due to rename nfsd4: acquire only one lease per file nfsd4: modify fi_delegations under recall_lock nfsd4: remove unused deleg dprintk's. nfsd4: split lease setting into separate function nfsd4: fix leak on allocation error nfsd4: add helper function for lease setup nfsd4: split up nfsd_break_deleg_cb NFSD: memory corruption due to writing beyond the stat array NFSD: use nfserr for status after decode_cb_op_status nfsd: don't leak dentry count on mnt_want_write failure
2011-02-15Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: get rid of nameidata_dentry_drop_rcu() calling nameidata_drop_rcu() drop out of RCU in return_reval split do_revalidate() into RCU and non-RCU cases in do_lookup() split RCU and non-RCU cases of need_revalidate nothing in do_follow_link() is going to see RCU
2011-02-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstableLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: check return value of alloc_extent_map() Btrfs - Fix memory leak in btrfs_init_new_device() btrfs: prevent heap corruption in btrfs_ioctl_space_info() Btrfs: Fix balance panic Btrfs: don't release pages when we can't clear the uptodate bits Btrfs: fix page->private races
2011-02-15s390: remove task_show_regsMartin Schwidefsky
task_show_regs used to be a debugging aid in the early bringup days of Linux on s390. /proc/<pid>/status is a world readable file, it is not a good idea to show the registers of a process. The only correct fix is to remove task_show_regs. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-15ext4: fix comment typo uninitizedPaul Bolle
Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Reviewed-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-02-15fuse/cuse: fix comment typo initilaizationPaul Bolle
Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Reviewed-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-02-15Merge branch 'master' into for-nextJiri Kosina
2011-02-15get rid of nameidata_dentry_drop_rcu() calling nameidata_drop_rcu()Al Viro
can't happen anymore and didn't work right anyway Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-02-15drop out of RCU in return_revalAl Viro
... thus killing the need to handle drop-from-RCU in d_revalidate() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-02-15split do_revalidate() into RCU and non-RCU casesAl Viro
fixing oopsen in lookup_one_len() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-02-15in do_lookup() split RCU and non-RCU cases of need_revalidateAl Viro
and use unlikely() instead of gotos, for fsck sake... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-02-15nothing in do_follow_link() is going to see RCUAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-02-14Btrfs: check return value of alloc_extent_map()Tsutomu Itoh
I add the check on the return value of alloc_extent_map() to several places. In addition, alloc_extent_map() returns only the address or NULL. Therefore, check by IS_ERR() is unnecessary. So, I remove IS_ERR() checking. Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-02-14Btrfs - Fix memory leak in btrfs_init_new_device()Ilya Dryomov
Memory allocated by calling kstrdup() should be freed. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-02-14btrfs: prevent heap corruption in btrfs_ioctl_space_info()Dan Rosenberg
Commit bf5fc093c5b625e4259203f1cee7ca73488a5620 refactored btrfs_ioctl_space_info() and introduced several security issues. space_args.space_slots is an unsigned 64-bit type controlled by a possibly unprivileged caller. The comparison as a signed int type allows providing values that are treated as negative and cause the subsequent allocation size calculation to wrap, or be truncated to 0. By providing a size that's truncated to 0, kmalloc() will return ZERO_SIZE_PTR. It's also possible to provide a value smaller than the slot count. The subsequent loop ignores the allocation size when copying data in, resulting in a heap overflow or write to ZERO_SIZE_PTR. The fix changes the slot count type and comparison typecast to u64, which prevents truncation or signedness errors, and also ensures that we don't copy more data than we've allocated in the subsequent loop. Note that zero-size allocations are no longer possible since there is already an explicit check for space_args.space_slots being 0 and truncation of this value is no longer an issue. Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com> Signed-off-by: Josef Bacik <josef@redhat.com> Reviewed-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-02-14Btrfs: Fix balance panicYan, Zheng
Mark the cloned backref_node as checked in clone_backref_node() Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-02-14Btrfs: don't release pages when we can't clear the uptodate bitsChris Mason
Btrfs tracks uptodate state in an rbtree as well as in the page bits. This is supposed to enable us to use block sizes other than the page size, but there are a few parts still missing before that completely works. But, our readpage routine trusts this additional range based tracking of uptodateness, much in the same way the buffer head up to date bits are trusted for the other filesystems. The problem is that sometimes we need to allocate memory in order to split records in the rbtree, even when we are just clearing bits. This can be difficult when our clearing function is called GFP_ATOMIC, which can happen in the releasepage path. So, what happens today looks like this: releasepage called with GFP_ATOMIC btrfs_releasepage calls clear_extent_bit clear_extent_bit fails to allocate ram, leaving the up to date bit set btrfs_releasepage returns success The end result is the page being gone, but btrfs thinking the range is up to date. Later on if someone tries to read that same page, the btrfs readpage code will return immediately thinking the page is already up to date. This commit fixes things to fail the releasepage when we can't clear the extent state bits. It covers both data pages and metadata tree blocks. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-02-14Btrfs: fix page->private racesChris Mason
There is a race where btrfs_releasepage can drop the page->private contents just as alloc_extent_buffer is setting up pages for metadata. Because of how the Btrfs page flags work, this results in us skipping the crc on the page during IO. This patch sovles the race by waiting until after the extent buffer is inserted into the radix tree before it sets page private. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-02-14nfsd: break lease on unlink due to renameJ. Bruce Fields
4795bb37effb7b8fe77e2d2034545d062d3788a8 "nfsd: break lease on unlink, link, and rename", only broke the lease on the file that was being renamed, and didn't handle the case where the target path refers to an already-existing file that will be unlinked by a rename--in that case the target file should have any leases broken as well. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-02-14nfsd4: acquire only one lease per fileJ. Bruce Fields
Instead of acquiring one lease each time another client opens a file, nfsd can acquire just one lease to represent all of them, and reference count it to determine when to release it. This fixes a regression introduced by c45821d263a8a5109d69a9e8942b8d65bcd5f31a "locks: eliminate fl_mylease callback": after that patch, only the struct file * is used to determine who owns a given lease. But since we recently converted the server to share a single struct file per open, if we acquire multiple leases on the same file from nfsd, it then becomes impossible on unlocking a lease to determine which of those leases (all of whom share the same struct file *) we meant to remove. Thanks to Takashi Iwai <tiwai@suse.de> for catching a bug in a previous version of this patch. Tested-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-02-14nfsd4: modify fi_delegations under recall_lockJ. Bruce Fields
Modify fi_delegations only under the recall_lock, allowing us to use that list on lease breaks. Also some trivial cleanup to simplify later changes. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-02-14nfsd4: remove unused deleg dprintk's.J. Bruce Fields
These aren't all that useful, and get in the way of the next steps. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-02-14nfsd4: split lease setting into separate functionJ. Bruce Fields
Splitting some code into a separate function which we'll be adding some more to. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-02-14nfsd4: fix leak on allocation errorJ. Bruce Fields
Also share some common exit code. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-02-14nfsd4: add helper function for lease setupJ. Bruce Fields
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-02-14nfsd4: split up nfsd_break_deleg_cbJ. Bruce Fields
We'll be adding some more code here soon. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-02-14NFSD: memory corruption due to writing beyond the stat arrayKonstantin Khorenko
If nfsd fails to find an exported via NFS file in the readahead cache, it should increment corresponding nfsdstats counter (ra_depth[10]), but due to a bug it may instead write to ra_depth[11], corrupting the following field. In a kernel with NFSDv4 compiled in the corruption takes the form of an increment of a counter of the number of NFSv4 operation 0's received; since there is no operation 0, this is harmless. In a kernel with NFSDv4 disabled it corrupts whatever happens to be in the memory beyond nfsdstats. Signed-off-by: Konstantin Khorenko <khorenko@openvz.org> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>