summaryrefslogtreecommitdiffstats
path: root/fs
AgeCommit message (Collapse)Author
2012-01-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2012-01-02nfsd4: fix spurious 4.1 post-reboot failuresJ. Bruce Fields
In the NFSv4.1 case, this could cause a spurious "NFSD: failed to write recovery record (err -17); please check that /var/lib/nfs/v4recovery exists and is writable. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Reported-by: Steve Dickson <SteveD@redhat.com>
2012-01-02Squashfs: fix mount time sanity check for corrupted superblockPhillip Lougher
A Squashfs filesystem containing nothing but an empty directory, although unusual and ultimately pointless, is still valid. The directory_table >= next_table sanity check rejects these filesystems as invalid because the directory_table is empty and equal to next_table. Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
2011-12-31[media] fs/compat_ioctl: it needs to see the DVBv3 compat stuffMauro Carvalho Chehab
Only the ioctl core should see the DVBv3 compat stuff, as its contents are not available anymore to the drivers. As fs/compat_ioctl also handles DVBv3 ioctl's, it needs those definitions: fs/compat_ioctl.c:1345: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_parameters’ fs/compat_ioctl.c:1345: error: array type has incomplete element type fs/compat_ioctl.c:1345: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_parameters’ fs/compat_ioctl.c:1345: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_parameters’ fs/compat_ioctl.c:1345: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_parameters’ fs/compat_ioctl.c:1345: error: array type has incomplete element type fs/compat_ioctl.c:1345: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_parameters’ fs/compat_ioctl.c:1345: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_parameters’ fs/compat_ioctl.c:1345: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_parameters’ fs/compat_ioctl.c:1345: error: array type has incomplete element type fs/compat_ioctl.c:1345: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_parameters’ fs/compat_ioctl.c:1345: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_parameters’ fs/compat_ioctl.c:1345: error: initializer element is not constant fs/compat_ioctl.c:1345: error: (near initialization for ‘ioctl_pointer[462]’) fs/compat_ioctl.c:1346: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_parameters’ fs/compat_ioctl.c:1346: error: array type has incomplete element type fs/compat_ioctl.c:1346: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_parameters’ fs/compat_ioctl.c:1346: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_parameters’ fs/compat_ioctl.c:1346: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_parameters’ fs/compat_ioctl.c:1346: error: array type has incomplete element type fs/compat_ioctl.c:1346: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_parameters’ fs/compat_ioctl.c:1346: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_parameters’ fs/compat_ioctl.c:1346: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_parameters’ fs/compat_ioctl.c:1346: error: array type has incomplete element type fs/compat_ioctl.c:1346: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_parameters’ fs/compat_ioctl.c:1346: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_parameters’ fs/compat_ioctl.c:1346: error: initializer element is not constant fs/compat_ioctl.c:1346: error: (near initialization for ‘ioctl_pointer[463]’) fs/compat_ioctl.c:1347: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_event’ fs/compat_ioctl.c:1347: error: array type has incomplete element type fs/compat_ioctl.c:1347: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_event’ fs/compat_ioctl.c:1347: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_event’ fs/compat_ioctl.c:1347: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_event’ fs/compat_ioctl.c:1347: error: array type has incomplete element type fs/compat_ioctl.c:1347: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_event’ fs/compat_ioctl.c:1347: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_event’ fs/compat_ioctl.c:1347: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_event’ fs/compat_ioctl.c:1347: error: array type has incomplete element type fs/compat_ioctl.c:1347: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_event’ fs/compat_ioctl.c:1347: error: invalid application of ‘sizeof’ to incomplete type ‘struct dvb_frontend_event’ fs/compat_ioctl.c:1347: error: initializer element is not constant fs/compat_ioctl.c:1347: error: (near initialization for ‘ioctl_pointer[464]’) Reported-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-12-30Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: disable use of dcache for readdir etc.
2011-12-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2011-12-30Squashfs: optimise squashfs_cache_get entry searchAjeet Yadav
squashfs_cache_get() iterates over all entries to search for block its looking for. Often get() / put() are called for same block. If we cache the current entry index, then we can optimise the subsequent *_get() calls. Signed-off-by: Ajeet Yadav <ajeet.yadav.77@gmail.com> Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
2011-12-30Squashfs: add missing block release on error conditionPhillip Lougher
squashfs_read_metadata forgets to release the cache block if an error has occurred. Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
2011-12-29Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfsLinus Torvalds
* 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: log all dirty inodes in xfs_fs_sync_fs xfs: log the inode in ->write_inode calls for kupdate
2011-12-29procfs: do not confuse jiffies with cputime64_tAndreas Schwab
Commit 2a95ea6c0d129b4 ("procfs: do not overflow get_{idle,iowait}_time for nohz") did not take into account that one some architectures jiffies and cputime use different units. This causes get_idle_time() to return numbers in the wrong units, making the idle time fields in /proc/stat wrong. Instead of converting the usec value returned by get_cpu_{idle,iowait}_time_us to units of jiffies, use the new function usecs_to_cputime64 to convert it to the correct unit of cputime64_t. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Artem S. Tashkinov" <t.artem@mailcity.com> Cc: Dave Jones <davej@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-29ceph: disable use of dcache for readdir etc.Sage Weil
Ceph attempts to use the dcache to satisfy negative lookups and readdir when the entire directory contents are in cache. Disable this behavior until lingering bugs in this code are shaken out; we'll re-enable these hooks once things are fully stable. Signed-off-by: Sage Weil <sage@newdream.net>
2011-12-28ext4: use proper little-endian bitopsAkinobu Mita
ext4_{set,clear}_bit() is defined as __test_and_{set,clear}_bit_le() for ext4. Only two ext4_{set,clear}_bit() calls check the return value. The rest of calls ignore the return value and they can be replaced with __{set,clear}_bit_le(). This changes ext4_{set,clear}_bit() from __test_and_{set,clear}_bit_le() to __{set,clear}_bit_le() and introduces ext4_test_and_{set,clear}_bit() for the two places where old bit needs to be returned. This ext4_{set,clear}_bit() change is considered safe, because if someone uses these macros without noticing the change, new ext4_{set,clear}_bit don't have return value and causes compiler errors where the return value is used. This also removes unused ext4_find_first_zero_bit(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-12-28ext4: remove no longer used functions in inode.cZheng Liu
The functions ext4_block_truncate_page() and ext4_block_zero_page_range() are no longer used, so remove them. Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-12-28ext4: avoid counting the number of free inodes twice in find_group_orlov()Theodore Ts'o
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-12-28ext4: add missing spaces to debugging printk'sZheng Liu
Fix ext4_debug format in ext4_ext_handle_uninitialized_extents() and ext4_end_io_dio(). Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-12-28jbd2: clear revoked flag on buffers before a new transaction startedYongqiang Yang
Currently, we clear revoked flag only when a block is reused. However, this can tigger a false journal error. Consider a situation when a block is used as a meta block and is deleted(revoked) in ordered mode, then the block is allocated as a data block to a file. At this moment, user changes the file's journal mode from ordered to journaled and truncates the file. The block will be considered re-revoked by journal because it has revoked flag still pending from the last transaction and an assertion triggers. We fix the problem by keeping the revoked status more uptodate - we clear revoked flag when switching revoke tables to reflect there is no revoked buffers in current transaction any more. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-12-28ext4: flush journal when switching from data=journal modeYongqiang Yang
It's necessary to flush the journal when switching away from data=journal mode. This is because there are no revoke records when data blocks are journalled, but revoke records are required in the other journal modes. However, it is not necessary to flush the journal when switching into data=journal mode, and flushing the journal is expensive. So let's avoid it in that case. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-12-28ext4: allocate delalloc blocks before changing journal modeYongqiang Yang
delalloc blocks should be allocated before changing journal mode, otherwise they can not be allocated and even more truncate on delalloc blocks could triggre BUG by flushing delalloc buffers. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-12-26vfs: fix handling of lock allocation failure in lease-break caseLinus Torvalds
Bruce Fields notes that commit 778fc546f749 ("locks: fix tracking of inprogress lease breaks") introduced a possible error pointer dereference on failure to allocate memory. locks_conflict() will dereference the passed-in new lease lock structure that may be an error pointer. This means an open (without O_NONBLOCK set) on a file with a lease applied (generally only done when Samba or nfsd (with v4) is running) could crash if a kmalloc() fails. So instead of playing games with IS_ERROR() all over the place, just check the allocation failure early. That makes the code more straightforward, and avoids this possible bad pointer dereference. Based-on-patch-by: J. Bruce Fields <bfields@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-25Merge branch 'pm-sleep' into pm-for-linusRafael J. Wysocki
* pm-sleep: (51 commits) PM: Drop generic_subsys_pm_ops PM / Sleep: Remove forward-only callbacks from AMBA bus type PM / Sleep: Remove forward-only callbacks from platform bus type PM: Run the driver callback directly if the subsystem one is not there PM / Sleep: Make pm_op() and pm_noirq_op() return callback pointers PM / Sleep: Merge internal functions in generic_ops.c PM / Sleep: Simplify generic system suspend callbacks PM / Hibernate: Remove deprecated hibernation snapshot ioctls PM / Sleep: Fix freezer failures due to racy usermodehelper_is_disabled() PM / Sleep: Recommend [un]lock_system_sleep() over using pm_mutex directly PM / Sleep: Replace mutex_[un]lock(&pm_mutex) with [un]lock_system_sleep() PM / Sleep: Make [un]lock_system_sleep() generic PM / Sleep: Use the freezer_count() functions in [un]lock_system_sleep() APIs PM / Freezer: Remove the "userspace only" constraint from freezer[_do_not]_count() PM / Hibernate: Replace unintuitive 'if' condition in kernel/power/user.c with 'else' Freezer / sunrpc / NFS: don't allow TASK_KILLABLE sleeps to block the freezer PM / Sleep: Unify diagnostic messages from device suspend/resume ACPI / PM: Do not save/restore NVS on Asus K54C/K54HR PM / Hibernate: Remove deprecated hibernation test modes PM / Hibernate: Thaw processes in SNAPSHOT_CREATE_IMAGE ioctl test path ... Conflicts: kernel/kmod.c
2011-12-23Merge tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linuxLinus Torvalds
for linus: writeback reason binary tracing format fix * tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux: writeback: show writeback reason with __print_symbolic
2011-12-23Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: call d_instantiate after all ops are setup Btrfs: fix worker lock misuse in find_worker
2011-12-23xfs: log all dirty inodes in xfs_fs_sync_fsChristoph Hellwig
Since Linux 2.6.36 the writeback code has introduces various measures for live lock prevention during sync(). Unfortunately some of these are actively harmful for the XFS model, where the inode gets marked dirty for metadata from the data I/O handler. The older_than_this checks that are now more strictly enforced since writeback: avoid livelocking WB_SYNC_ALL writeback by only calling into __writeback_inodes_sb and thus only sampling the current cut off time once. But on a slow enough devices the previous asynchronous sync pass might not have fully completed yet, and thus XFS might mark metadata dirty only after that sampling of the cut off time for the blocking pass already happened. I have not myself reproduced this myself on a real system, but by introducing artificial delay into the XFS I/O completion workqueues it can be reproduced easily. Fix this by iterating over all XFS inodes in ->sync_fs and log all that are dirty. This might log inode that only got redirtied after the previous pass, but given how cheap delayed logging of inodes is it isn't a major concern for performance. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Tested-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2011-12-23xfs: log the inode in ->write_inode calls for kupdateChristoph Hellwig
If the writeback code writes back an inode because it has expired we currently use the non-blockin ->write_inode path. This means any inode that is pinned is skipped. With delayed logging and a workload that has very little log traffic otherwise it is very likely that an inode that gets constantly written to is always pinned, and thus we keep refusing to write it. The VM writeback code at that point redirties it and doesn't try to write it again for another 30 seconds. This means under certain scenarious time based metadata writeback never happens. Fix this by calling into xfs_log_inode for kupdate in addition to data integrity syncs, and thus transfer the inode to the log ASAP. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Tested-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2011-12-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: net/bluetooth/l2cap_core.c Just two overlapping changes, one added an initialization of a local variable, and another change added a new local variable. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-23Merge branch 'integration' into for-linusChris Mason
2011-12-23Btrfs: call d_instantiate after all ops are setupAl Viro
This closes races where btrfs is calling d_instantiate too soon during inode creation. All of the callers of btrfs_add_nondir are updated to instantiate after the inode is fully setup in memory. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-12-23Btrfs: fix worker lock misuse in find_workerChris Mason
Dan Carpenter noticed that we were doing a double unlock on the worker lock, and sometimes picking a worker thread without the lock held. This fixes both errors. Signed-off-by: Chris Mason <chris.mason@oracle.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
2011-12-22Btrfs: always save ref_root in delayed refsArne Jansen
For consistent backref walking and (later) qgroup calculation the information to which root a delayed ref belongs is useful even for shared refs. Signed-off-by: Arne Jansen <sensille@gmx.net> Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2011-12-22Btrfs: mark delayed refs as for cowArne Jansen
Add a for_cow parameter to add_delayed_*_ref and pass the appropriate value from every call site. The for_cow parameter will later on be used to determine if a ref will change anything with respect to qgroups. Delayed refs coming from relocation are always counted as for_cow, as they don't change subvol quota. Also pass in the fs_info for later use. btrfs_find_all_roots() will use this as an optimization, as changes that are for_cow will not change anything with respect to which root points to a certain leaf. Thus, we don't need to add the current sequence number to those delayed refs. Signed-off-by: Arne Jansen <sensille@gmx.net> Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2011-12-22Btrfs: added helper btrfs_next_item()Jan Schmidt
btrfs_next_item() makes the btrfs path point to the next item, crossing leaf boundaries if needed. Signed-off-by: Arne Jansen <sensille@gmx.net> Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2011-12-22Btrfs: generic data structure to build unique listsArne Jansen
ulist is a generic data structures to hold a collection of unique u64 values. The only operations it supports is adding to the list and enumerating it. It is possible to store an auxiliary value along with the key. The implementation is preliminary and can probably be sped up significantly. It is used by btrfs_find_all_roots() quota to translate recursions into iterative loops. Signed-off-by: Arne Jansen <sensille@gmx.net> Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2011-12-21Merge branch 'master' into pm-sleepRafael J. Wysocki
* master: (848 commits) SELinux: Fix RCU deref check warning in sel_netport_insert() binary_sysctl(): fix memory leak mm/vmalloc.c: remove static declaration of va from __get_vm_area_node ipmi_watchdog: restore settings when BMC reset oom: fix integer overflow of points in oom_badness memcg: keep root group unchanged if creation fails nilfs2: potential integer overflow in nilfs_ioctl_clean_segments() nilfs2: unbreak compat ioctl cpusets: stall when updating mems_allowed for mempolicy or disjoint nodemask evm: prevent racing during tfm allocation evm: key must be set once during initialization mmc: vub300: fix type of firmware_rom_wait_states module parameter Revert "mmc: enable runtime PM by default" mmc: sdhci: remove "state" argument from sdhci_suspend_host x86, dumpstack: Fix code bytes breakage due to missing KERN_CONT IB/qib: Correct sense on freectxts increment and decrement RDMA/cma: Verify private data length cgroups: fix a css_set not found bug in cgroup_attach_proc oprofile: Fix uninitialized memory access when writing to writing to oprofilefs Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel" ... Conflicts: kernel/cgroup_freezer.c
2011-12-21ext4: remove unneeded file_remove_suid() from ext4_ioctl()Theodore Ts'o
In the code to support EXT4_IOC_MOVE_EXT, ext4_ioctl calls file_remove_suid() after the call to ext4_move_extents() if any extents has been moved. There are at least three things wrong with this. First, file_remove_suid() should be called with i_mutex down, which is not here. Second, it should be called before the donor file has been modified, to avoid a potential race condition. Third, and most importantly, it's pointless, because ext4_file_extents() already checks if the donor file has the setuid or setgid bit set, and will return an error in that case. So the first two objections don't really matter, since file_remove_suid() will never need to modify the inode in any case. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-12-21Btrfs: integrate integrity check module into btrfsStefan Behrens
This is the last part of the patch series. It modifies the btrfs code to use the integrity check module if configured to do so with the define BTRFS_FS_CHECK_INTEGRITY. If this define is not set, the only effective change is that code is added that handles the mount option to activate the integrity check. If the mount option is set and the define BTRFS_FS_CHECK_INTEGRITY is not set, that code complains in the log and the mount fails with EINVAL. Add the mount option to activate the usage of the integrity check code. Add invocation of btrfs integrity check code init and cleanup function on mount and umount, respectively. Add hook to call btrfs integrity check code version of submit_bh/submit_bio. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
2011-12-21Btrfs: Makefile changes to optionally include btrfs integrity checkStefan Behrens
If the btrfs integrity check is enabled, the files required to implement the checks are included in the build. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
2011-12-21Btrfs: add config option to enable btrfs integrity checkStefan Behrens
Added the BTRFS_FS_CHECK_INTEGRITY option to Kconfig. It depends on BTRFS_FS. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
2011-12-21Btrfs: add optional integrity check codeStefan Behrens
The two files added in this patch contain all the code that is required to implement the integrity checks. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
2011-12-20Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: Fix a regression in nfs_file_llseek() NFSv4: Do not accept delegated opens when a delegation recall is in effect NFSv4: Ensure correct locking when accessing the 'lock_states' list NFSv4.1: Ensure that we handle _all_ SEQUENCE status bits. NFSv4: Don't error if we handled it in nfs4_recovery_handle_error SUNRPC: Ensure we always bump the backlog queue in xprt_free_slot SUNRPC: Fix the execution time statistics in the face of RPC restarts
2011-12-20nilfs2: potential integer overflow in nilfs_ioctl_clean_segments()Haogang Chen
There is a potential integer overflow in nilfs_ioctl_clean_segments(). When a large argv[n].v_nmembs is passed from the userspace, the subsequent call to vmalloc() will allocate a buffer smaller than expected, which leads to out-of-bound access in nilfs_ioctl_move_blocks() and lfs_clean_segments(). The following check does not prevent the overflow because nsegs is also controlled by the userspace and could be very large. if (argv[n].v_nmembs > nsegs * nilfs->ns_blocks_per_segment) goto out_free; This patch clamps argv[n].v_nmembs to UINT_MAX / argv[n].v_size, and returns -EINVAL when overflow. Signed-off-by: Haogang Chen <haogangchen@gmail.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-20nilfs2: unbreak compat ioctlThomas Meyer
commit 828b1c50ae ("nilfs2: add compat ioctl") incidentally broke all other NILFS compat ioctls. Make them work again. Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: <stable@vger.kernel.org> [3.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-19xfs: mark the xfssyncd workqueue as non-reentrantChristoph Hellwig
On a system with lots of memory pressure that is stuck on synchronous inode reclaim the workqueue code will run one instance of the inode reclaim work item on every CPU. which is not what we want. Make sure to mark the xfssyncd workqueue as non-reentrant to make sure there only is one instace of each running globally. Also stop using special paramater for the workqueue; now that we guarantee each fs has only running one of each works at a time there is no need to artificially lower max_active and compensate for that by setting the WQ_CPU_INTENSIVE flag. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2011-12-19Merge branch 'sched/core' of ↵Martin Schwidefsky
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into cputime-tip Conflicts: drivers/cpufreq/cpufreq_conservative.c drivers/cpufreq/cpufreq_ondemand.c drivers/macintosh/rack-meter.c fs/proc/stat.c fs/proc/uptime.c kernel/sched/core.c
2011-12-18ext4: optimize ext4_find_delalloc_range() in nodelalloc modeRobin Dong
We found performance regression when using bigalloc with "nodelalloc" (1MB cluster size): 1. mke2fs -C 1048576 -O ^has_journal,bigalloc /dev/sda 2. mount -o nodelalloc /dev/sda /test/ 3. time dd if=/dev/zero of=/test/io bs=1048576 count=1024 The "dd" will cost about 2 seconds to finish, but if we mke2fs without "bigalloc", "dd" will only cost less than 1 second. The reason is: when using ext4 with "nodelalloc", it will call ext4_find_delalloc_cluster() nearly everytime it call ext4_ext_map_blocks(), and ext4_find_delalloc_range() will also scan all pages in cluster because no buffer is "delayed". A cluster has 256 pages (1MB cluster), so it will scan 256 * 256k pags when creating a 1G file. That severely hurts the performance. Therefore, we return immediately from ext4_find_delalloc_range() in nodelalloc mode, since by definition there can't be any delalloc pages. Signed-off-by: Robin Dong <sanbai@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-12-18ext4: remove unused local variableCurt Wohlgemuth
In get_implied_cluster_alloc(), rr_cluster_end was being defined and set, but was never used. Removed this. Signed-off-by: Curt Wohlgemuth <curtw@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-12-18ext4: fix error handling on inode bitmap corruptionJan Kara
When insert_inode_locked() fails in ext4_new_inode() it most likely means inode bitmap got corrupted and we allocated again inode which is already in use. Also doing unlock_new_inode() during error recovery is wrong since the inode does not have I_NEW set. Fix the problem by jumping to fail: (instead of fail_drop:) which declares filesystem error and does not call unlock_new_inode(). Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-12-18ext4: add missing space to ext4_msg output in ext4_fill_super()Zheng Liu
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-12-18ext4: do not reference pa_inode from group_paYongqiang Yang
pa_inode in group_pa is set NULL in ext4_mb_new_group_pa, so pa_inode should be not referenced. Reported-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-12-18btrfs: fix dirtied pages accounting on sub-page writesWu Fengguang
When doing 1KB sequential writes to the same page, balance_dirty_pages_ratelimited_nr() should be called once instead of 4 times, the latter makes the dirtier tasks be throttled much too heavy. Fix it with proper de-accounting on clear_page_dirty_for_io(). CC: Chris Mason <chris.mason@oracle.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-12-18writeback: Include all dirty inodes in background writebackJan Kara
Current livelock avoidance code makes background work to include only inodes that were dirtied before background writeback has started. However background writeback can be running for a long time and thus excluding newly dirtied inodes can eventually exclude significant portion of dirty inodes making background writeback inefficient. Since background writeback avoids livelocking the flusher thread by yielding to any other work, there is no real reason why background work should not include all dirty inodes so change the logic in wb_writeback(). Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>