summaryrefslogtreecommitdiffstats
path: root/fs
AgeCommit message (Collapse)Author
2012-08-19CIFS: Protect i_nlink from being negativeSteve French
that can cause warning messages. Pavel had initially suggested a smaller patch around drop_nlink, after a similar problem was discovered NFS. Protecting additional places where nlink is touched was suggested by Jeff Layton and is included in this. Reviewed-by: Pavel Shilovsky <pshilovsky@samba.org> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Steve French <smfrench@gmail.com>
2012-08-19ext4: remove duplicated declarations in inode.cZheng Liu
In patch cb20d5188366f04d96d2e07b1240cc92170ade40, ext4_set_bh_endio and ext4_end_io_buffer_write are declared at the beginning of inode.c, and again later on in the middle of the file. Remove the second set of duplicated function declarations. Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-08-18ext4: fix trivial typo in commentWang Sheng-Hui
Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-08-18ext4: no need to add inode to orphan list during hole punchAshish Sangwan
While performing punch hole for an inode, i_disksize is not changed. So, there is no need to add the inode to orphan list. Signed-off-by: Ashish Sangwan <ashish.sangwan2@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@gmail.com> Acked-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-08-18jbd2: don't write superblock when if its emptyEric Sandeen
This sequence: # truncate --size=1g fsfile # mkfs.ext4 -F fsfile # mount -o loop,ro fsfile /mnt # umount /mnt # dmesg | tail results in an IO error when unmounting the RO filesystem: [ 318.020828] Buffer I/O error on device loop1, logical block 196608 [ 318.027024] lost page write due to I/O error on loop1 [ 318.032088] JBD2: Error -5 detected when updating journal superblock for loop1-8. This was a regression introduced by commit 24bcc89c7e7c: "jbd2: split updating of journal superblock and marking journal empty". Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2012-08-18ext4: replace plain integer with NULL in super.cSachin Kamat
Fixes the following sparse warning: fs/ext4/super.c:1672:45: warning: Using plain integer as NULL pointer Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-08-18Merge branch 'vfs-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull vfs fixes from Miklos Szeredi. This mainly fixes some confusion about whether the open 'mode' variable passed around should contain the full file type (S_IFREG etc) information or just the permission mode. In particular, the lack of proper file type information had confused fuse. * 'vfs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: vfs: fix propagation of atomic_open create error on negative dentry fuse: check create mode in atomic open vfs: pass right create mode to may_o_create() vfs: atomic_open(): fix create mode usage vfs: canonicalize create mode in build_open_flags()
2012-08-17ext4: drop lock_super()/unlock_super()Theodore Ts'o
We don't need lock_super()/unlock_super() any more, since the places where it is used, we are protected by the s_umount r/w semaphore. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Marco Stornelli <marco.stornelli@gmail.com>
2012-08-17Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bug fixes from Ted Ts'o: "The following are all bug fixes and regressions. The most notable are the ones which cause problems for ext4 on RAID --- a performance problem when mounting very large filesystems, and a kernel OOPS when doing an rm -rf on large directory hierarchies on fast devices." * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix kernel BUG on large-scale rm -rf commands ext4: fix long mount times on very big file systems ext4: don't call ext4_error while block group is locked ext4: avoid kmemcheck complaint from reading uninitialized memory ext4: make sure the journal sb is written in ext4_clear_journal_err()
2012-08-17ext4: return an error if kset_create_and_add fails in ext4_init_fs()Theodore Ts'o
In the very unlikely case that kset_create_and_add() fails when the ext4.ko module is being loaded (or during kernel startup) set err so that it's clear that the module load failed. https://bugzilla.kernel.org/show_bug.cgi?id=27912 Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-08-17ext4: remove unused function argument 'order' in mb_find_extent()Robin Dong
All the routines call mb_find_extent are setting argument 'order' to 0 just like: mb_find_extent(e4b, 0, ex.fe_start, ex.fe_len, &ex); therefore the useless argument should be removed. Signed-off-by: Robin Dong <sanbai@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-08-17ext4: remove unused macro MB_DEFAULT_MAX_GROUPS_TO_SCANRobin Dong
Signed-off-by: Robin Dong <sanbai@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-08-17ext4: check return value of blkdev_issue_flush()Theodore Ts'o
blkdev_issue_flush() can fail; make sure the error gets properly propagated. This is a port of the equivalent ext3 patch from commit 44f4f729e7a1. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-08-17autofs4 - fix expire checkIan Kent
In some cases when an autofs indirect mount is contained in a file system that is marked as shared (such as when systemd does the equivalent of "mount --make-rshared /" early in the boot), mounts stop expiring. When this happens the first expiry check on a mountpoint dentry in autofs_expire_indirect() sees a mountpoint dentry with a higher than minimal reference count. Consequently the dentry is condidered busy and the actual expiry check is never done. This particular check was originally meant as an optimisation to detect a path walk in progress but with the addition of rcu-walk it can be ineffective anyway. Removing the test allows automounts to expire again since the actual expire check doesn't rely on the dentry reference count. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-08-17jbd2: check return value of blkdev_issue_flush()Theodore Ts'o
blkdev_issue_flush() can fail; make sure the error gets properly propagated. This is a port of the equivalent jbd patch from commit 349ecd6a3c0e. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-08-17ext4: make the zero-out chunk size tunableZheng Liu
Currently in ext4 the length of zero-out chunk is set to 7 file system blocks. But if an inode has uninitailized extents from using fallocate to preallocate space, and the workload issues many random writes, this can cause a fragmented extent tree that will unnecessarily grow the extent tree. So create a new sysfs tunable, extent_max_zeroout_kb, which controls the maximum size where blocks will be zeroed out instead of creating a new uninitialized extent. The default of this has been sent to 32kb. CC: Zach Brown <zab@zabbo.net> CC: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-08-17ext4: add max_dir_size_kb mount optionTheodore Ts'o
Very large directories can cause significant performance problems, or perhaps even invoke the OOM killer, if the process is running in a highly constrained memory environment (whether it is VM's with a small amount of memory or in a small memory cgroup). So it is useful, in cloud server/data center environments, to be able to set a filesystem-wide cap on the maximum size of a directory, to ensure that directories never get larger than a sane size. We do this via a new mount option, max_dir_size_kb. If there is an attempt to grow the directory larger than max_dir_size_kb, the system call will return ENOSPC instead. Google-Bug-Id: 6863013 Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-08-17ext4: don't load the block bitmap for block groups which have no spaceTheodore Ts'o
Add a short circuit check to ext4_mb_group_group() so that we don't bother to load the block bitmap for a block group which does not have any space available. (Or which does not have enough space until we are in desperation mode, i.e., when cr == 3.) Resolves-bug: https://bugzilla.kernel.org/show_bug.cgi?id=45741 Reported-by: mirek@me.com Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-08-17ext4: collapse a single extent tree block into the inode if possibleTheodore Ts'o
If an inode has more than 4 extents, but then later some of the extents are merged together, we can optimize the file system by moving the extents up into the inode, and discarding the extent tree block. This is important, because if there are a large number of inodes with an external extent tree blocks where the contents could fit in the inode, this can significantly increase the fsck time of the file system. Google-Bug-Id: 6801242 Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-08-17ext4: fix kernel BUG on large-scale rm -rf commandsTheodore Ts'o
Commit 968dee7722: "ext4: fix hole punch failure when depth is greater than 0" introduced a regression in v3.5.1/v3.6-rc1 which caused kernel crashes when users ran run "rm -rf" on large directory hierarchy on ext4 filesystems on RAID devices: BUG: unable to handle kernel NULL pointer dereference at 0000000000000028 Process rm (pid: 18229, threadinfo ffff8801276bc000, task ffff880123631710) Call Trace: [<ffffffff81236483>] ? __ext4_handle_dirty_metadata+0x83/0x110 [<ffffffff812353d3>] ext4_ext_truncate+0x193/0x1d0 [<ffffffff8120a8cf>] ? ext4_mark_inode_dirty+0x7f/0x1f0 [<ffffffff81207e05>] ext4_truncate+0xf5/0x100 [<ffffffff8120cd51>] ext4_evict_inode+0x461/0x490 [<ffffffff811a1312>] evict+0xa2/0x1a0 [<ffffffff811a1513>] iput+0x103/0x1f0 [<ffffffff81196d84>] do_unlinkat+0x154/0x1c0 [<ffffffff8118cc3a>] ? sys_newfstatat+0x2a/0x40 [<ffffffff81197b0b>] sys_unlinkat+0x1b/0x50 [<ffffffff816135e9>] system_call_fastpath+0x16/0x1b Code: 8b 4d 20 0f b7 41 02 48 8d 04 40 48 8d 04 81 49 89 45 18 0f b7 49 02 48 83 c1 01 49 89 4d 00 e9 ae f8 ff ff 0f 1f 00 49 8b 45 28 <48> 8b 40 28 49 89 45 20 e9 85 f8 ff ff 0f 1f 80 00 00 00 RIP [<ffffffff81233164>] ext4_ext_remove_space+0xa34/0xdf0 This could be reproduced as follows: The problem in commit 968dee7722 was that caused the variable 'i' to be left uninitialized if the truncate required more space than was available in the journal. This resulted in the function ext4_ext_truncate_extend_restart() returning -EAGAIN, which caused ext4_ext_remove_space() to restart the truncate operation after starting a new jbd2 handle. Reported-by: Maciej Żenczykowski <maze@google.com> Reported-by: Marti Raudsepp <marti@juffo.org> Tested-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2012-08-17ext4: fix long mount times on very big file systemsTheodore Ts'o
Commit 8aeb00ff85a: "ext4: fix overhead calculation used by ext4_statfs()" introduced a O(n**2) calculation which makes very large file systems take forever to mount. Fix this with an optimization for non-bigalloc file systems. (For bigalloc file systems the overhead needs to be set in the the superblock.) Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2012-08-17ext4: don't call ext4_error while block group is lockedTheodore Ts'o
While in ext4_validate_block_bitmap(), if an block allocation bitmap is found to be invalid, we call ext4_error() while the block group is still locked. This causes ext4_commit_super() to call a function which might sleep while in an atomic context. There's no need to keep the block group locked at this point, so hoist the ext4_error() call up to ext4_validate_block_bitmap() and release the block group spinlock before calling ext4_error(). The reported stack trace can be found at: http://article.gmane.org/gmane.comp.file-systems.ext4/33731 Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2012-08-16xfs: check for possible overflow in xfs_ioc_trimTomas Racek
If range.start or range.minlen is bigger than filesystem size, return invalid value error. This fixes possible overflow in BTOBB macro when passed value was nearly ULLONG_MAX. Signed-off-by: Tomas Racek <tracek@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-08-16xfs: unlock the AGI buffer when looping in xfs_diallocChristoph Hellwig
Also update some commens in the area to make the code easier to read. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-08-16NFS: return -ENOKEY when the upcall fails to map the nameBryan Schumaker
This allows the normal error-paths to handle the error, rather than making a special call to complete_request_key() just for this instance. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Tested-by: William Dauchy <wdauchy@gmail.com> Cc: stable@vger.kernel.org [>= 3.4] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16NFS: Clear key construction data if the idmap upcall failsBryan Schumaker
idmap_pipe_downcall already clears this field if the upcall succeeds, but if it fails (rpc.idmapd isn't running) the field will still be set on the next call triggering a BUG_ON(). This patch tries to handle all possible ways that the upcall could fail and clear the idmap key data for each one. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Tested-by: William Dauchy <wdauchy@gmail.com> Cc: stable@vger.kernel.org [>= 3.4] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16NFSv4: Don't use private xdr_stream fields in decode_getaclTrond Myklebust
Instead of using the private field xdr->p from struct xdr_stream, use the public xdr_stream_pos(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16NFSv4: Fix the acl cache size calculationTrond Myklebust
Currently, we do not take into account the size of the 16 byte struct nfs4_cached_acl header, when deciding whether or not we should cache the acl data. Consequently, we will end up allocating an 8k buffer in order to fit a maximum size 4k acl. This patch adjusts the calculation so that we limit the cache size to 4k for the acl header+data. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16NFSv4: Fix pointer arithmetic in decode_getaclTrond Myklebust
Resetting the cursor xdr->p to a previous value is not a safe practice: if the xdr_stream has crossed out of the initial iovec, then a bunch of other fields would need to be reset too. Fix this issue by using xdr_enter_page() so that the buffer gets page aligned at the bitmap _before_ we decode it. Also fix the confusion of the ACL length with the page buffer length by not adding the base offset to the ACL length... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
2012-08-16NFS: Alias the nfs module to nfs4bjschuma@gmail.com
This allows distros to remove the line from their modprobe configuration. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16NFS: Fix a regression when loading the NFS v4 modulebjschuma@gmail.com
Some systems have a modprobe.d/nfs.conf file that sets an nfs4 alias pointing to nfs.ko, rather than nfs4.ko. This can prevent the v4 module from loading on mount, since the kernel sees that something named "nfs4" has already been loaded. To work around this, I've renamed the modules to "nfsv2.ko" "nfsv3.ko" and "nfsv4.ko". I also had to move the nfs4_fs_type back to nfs.ko to ensure that `mount -t nfs4` still works. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16autofs4 - fix get_next_positive_subdir()Ian Kent
Following a report of a crash during an automount expire I found that the locking in fs/autofs4/expire.c:get_next_positive_subdir() was wrong. Not only is the locking wrong but the function is more complex than it needs to be. The function is meant to calculate (and dget) the next entry in the list of directories contained in the root of an autofs mount point (an autofs indirect mount to be precise). The main problem was that the d_lock of the owner of the list was not being taken when walking the list, which lead to list corruption under load. The only other lock that needs to be taken is against the next dentry candidate so it can be checked for usability. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-08-16Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: verify all ioctl retry iov elements fuse: add missing INIT flag descriptions fuse: add missing INIT flags fuse: update attributes on aio_read fuse: invalidate inode mapping if mtime changes fuse: add FUSE_AUTO_INVAL_DATA init flag
2012-08-16debugfs: make __create_file staticChris Wright
It's only used locally, no need to pollute global namespace. Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-16xfs: kill struct declarations in xfs_mount.hAlex Elder
I noticed that "struct xfs_mount_args" was still declared in "fs/xfs/xfs_mount.h". That struct doesn't even exist any more (and is obviously not referenced elsewhere in that header file). While in there, delete four other unneeded struct declarations in that file. Doing so highlights that "fs/xfs/xfs_trace.h" was relying indirectly on "xfs_mount.h" to be #included in order to declare "struct xfs_bmbt_irec", so add that declaration to resolve that issue. Signed-off-by: Alex Elder <elder@inktank.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-08-16xfs: fix uninitialised variable in xfs_rtbuf_get()Dave Chinner
Results in this assert failure in generic/090: XFS: Assertion failed: *nmap >= 1, file: fs/xfs/xfs_bmap.c, line: 4363 ..... Call Trace: [<ffffffff814680db>] xfs_bmapi_read+0x6b/0x370 [<ffffffff814b64b2>] xfs_rtbuf_get+0x42/0x130 [<ffffffff814b6f09>] xfs_rtget_summary+0x89/0x120 [<ffffffff814b7bfe>] xfs_rtallocate_extent_size+0xce/0x340 [<ffffffff814b89f0>] xfs_rtallocate_extent+0x240/0x290 [<ffffffff81462c1a>] xfs_bmap_rtalloc+0x1ba/0x340 [<ffffffff81463a65>] xfs_bmap_alloc+0x35/0x40 [<ffffffff8146f111>] xfs_bmapi_allocate+0xf1/0x350 [<ffffffff8146f9de>] xfs_bmapi_write+0x66e/0xa60 [<ffffffff8144538a>] xfs_iomap_write_direct+0x22a/0x3f0 [<ffffffff8143707b>] __xfs_get_blocks+0x38b/0x5d0 [<ffffffff814372d4>] xfs_get_blocks_direct+0x14/0x20 [<ffffffff811b0081>] do_blockdev_direct_IO+0xf71/0x1eb0 [<ffffffff811b1015>] __blockdev_direct_IO+0x55/0x60 [<ffffffff814355ca>] xfs_vm_direct_IO+0x11a/0x1e0 [<ffffffff8112d617>] generic_file_direct_write+0xd7/0x1b0 [<ffffffff8143e16c>] xfs_file_dio_aio_write+0x13c/0x320 [<ffffffff8143e6f2>] xfs_file_aio_write+0x1c2/0x1d0 [<ffffffff81174a07>] do_sync_write+0xa7/0xe0 [<ffffffff81175288>] vfs_write+0xa8/0x160 [<ffffffff81175702>] sys_pwrite64+0x92/0xb0 [<ffffffff81b68f69>] system_call_fastpath+0x16/0x1b Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-08-16vfs: fix propagation of atomic_open create error on negative dentrySage Weil
If ->atomic_open() returns -ENOENT, we take care to return the create error (e.g., EACCES), if any. Do the same when ->atomic_open() returns 1 and provides a negative dentry. This fixes a regression where an unprivileged open O_CREAT fails with ENOENT instead of EACCES, introduced with the new atomic_open code. It is tested by the open/08.t test in the pjd posix test suite, and was observed on top of fuse (backed by ceph-fuse). Signed-off-by: Sage Weil <sage@inktank.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2012-08-15udf: fix retun value on error path in udf_load_logicalvolNikola Pajkovsky
In case we detect a problem and bail out, we fail to set "ret" to a nonzero value, and udf_load_logicalvol will mistakenly report success. Signed-off-by: Nikola Pajkovsky <npajkovs@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>
2012-08-15jbd: don't write superblock when unmounting an ro filesystemJan Kara
This sequence: results in an IO error when unmounting the RO filesystem. The bug was introduced by: commit 9754e39c7bc51328f145e933bfb0df47cd67b6e9 Author: Jan Kara <jack@suse.cz> Date: Sat Apr 7 12:33:03 2012 +0200 jbd: Split updating of journal superblock and marking journal empty which lost some of the magic in journal_update_superblock() which used to test for a journal with no outstanding transactions. This is a port of a jbd2 fix by Eric Sandeen. CC: <stable@vger.kernel.org> # 3.4.x Signed-off-by: Jan Kara <jack@suse.cz>
2012-08-15fuse: check create mode in atomic openMiklos Szeredi
Verify that the VFS is passing us a complete create mode with the S_IFREG to atomic open. Reported-by: Steve <steveamigauk@yahoo.co.uk> Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Tested-by: Richard W.M. Jones <rjones@redhat.com>
2012-08-15vfs: pass right create mode to may_o_create()Miklos Szeredi
Pass the umask-ed create mode to may_o_create() instead of the original one. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Tested-by: Richard W.M. Jones <rjones@redhat.com>
2012-08-15vfs: atomic_open(): fix create mode usageMiklos Szeredi
Don't mask S_ISREG off the create mode before passing to ->atomic_open(). Other methods (->create, ->mknod) also get the complete file mode and filesystems expect it. Reported-by: Steve <steveamigauk@yahoo.co.uk> Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Tested-by: Richard W.M. Jones <rjones@redhat.com>
2012-08-15vfs: canonicalize create mode in build_open_flags()Miklos Szeredi
Userspace can pass weird create mode in open(2) that we canonicalize to "(mode & S_IALLUGO) | S_IFREG" in vfs_create(). The problem is that we use the uncanonicalized mode before calling vfs_create() with unforseen consequences. So do the canonicalization early in build_open_flags(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Tested-by: Richard W.M. Jones <rjones@redhat.com> CC: stable@vger.kernel.org
2012-08-14userns: Make seq_file's user namespace accessibleEric W. Biederman
struct file already has a user namespace associated with it in file->f_cred->user_ns, unfortunately because struct seq_file has no struct file backpointer associated with it, it is difficult to get at the user namespace in seq_file context. Therefore add a helper function seq_user_ns to return the associated user namespace and a user_ns field to struct seq_file to be used in implementing seq_user_ns. Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-08-15reiserfs: fix deadlocks with quotasJeff Mahoney
The BKL push-down for reiserfs made lock recursion a special case that needs to be handled explicitly. One of the cases that was unhandled is dropping the quota during inode eviction. Both reiserfs_evict_inode and reiserfs_write_dquot take the write lock, but when the journal lock is taken it only drops one the references. The locking rules are that the journal lock be acquired before the write lock so leaving the reference open leads to a ABBA deadlock. This patch pushes the unlock up before clear_inode and avoids the recursive locking. Another ABBA situation can occur when the write lock is dropped while reading the bitmap buffer while in the quota code. When the lock is reacquired, it will deadlock against dquot->dq_lock and dqopt->dqio_mutex in the dquot_acquire path. It's safe to retain the lock across the read and should be cached under write load. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jan Kara <jack@suse.cz>
2012-08-15quota: Move down dqptr_sem read after initializing default warn[] type at ↵Jeff Liu
__dquot_alloc_space(). sb->s_dqopt->dqptr_sem is used to serialize ops using pointers from inode to dquots. But for __dquot_alloc_space(), it could be safely moved down after the default warn[] array got initialized. Signed-off-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Jan Kara <jack@suse.cz>
2012-08-15UDF: During mount free lvid_bh before rescanning with different blocksizeAshish Sangwan
If s_lvid_bh is not freed and set to NULL before re-scanning partition with default block size, we might end up using wrong lvid in case s_lvid_bh is not updated in udf_load_logicalvolint during rescan. Signed-off-by: Ashish Sangwan <ashish.sangwan2@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2012-08-15udf: fix udf_setsize() for file data in ICBIan Abbott
If the new size is larger than the old size and the old file data was stored in the ICB (iinfo->i_alloc_type == ICBTAG_FLAG_AD_IN_ICB) and the new size still fits in the ICB, skip the call to udf_extend_file() as it does not handle this i_alloc_type value (it calls BUG()). Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Jan Kara <jack@suse.cz>
2012-08-13workqueue: use mod_delayed_work() instead of cancel + queueTejun Heo
Convert delayed_work users doing cancel_delayed_work() followed by queue_delayed_work() to mod_delayed_work(). Most conversions are straight-forward. Ones worth mentioning are, * drivers/edac/edac_mc.c: edac_mc_workq_setup() converted to always use mod_delayed_work() and cancel loop in edac_mc_reset_delay_period() is dropped. * drivers/platform/x86/thinkpad_acpi.c: No need to remember whether watchdog is active or not. @fan_watchdog_active and related code dropped. * drivers/power/charger-manager.c: Seemingly a lot of delayed_work_pending() abuse going on here. [delayed_]work_pending() are unsynchronized and racy when used like this. I converted one instance in fullbatt_handler(). Please conver the rest so that it invokes workqueue APIs for the intended target state rather than trying to game work item pending state transitions. e.g. if timer should be modified - call mod_delayed_work(), canceled - call cancel_delayed_work[_sync](). * drivers/thermal/thermal_sys.c: thermal_zone_device_set_polling() simplified. Note that round_jiffies() calls in this function are meaningless. round_jiffies() work on absolute jiffies not delta delay used by delayed_work. v2: Tomi pointed out that __cancel_delayed_work() users can't be safely converted to mod_delayed_work(). They could be calling it from irq context and if that happens while delayed_work_timer_fn() is running, it could deadlock. __cancel_delayed_work() users are dropped. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Anton Vorontsov <cbouatmailru@gmail.com> Acked-by: David Howells <dhowells@redhat.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Doug Thompson <dougthompson@xmission.com> Cc: David Airlie <airlied@linux.ie> Cc: Roland Dreier <roland@kernel.org> Cc: "John W. Linville" <linville@tuxdriver.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Len Brown <len.brown@intel.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Johannes Berg <johannes@sipsolutions.net>
2012-08-13dlm: cleanup send_to_sock routineYing Xue
Remove unnecessary code form send_to_sock routine. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David Teigland <teigland@redhat.com>