summaryrefslogtreecommitdiffstats
path: root/fs
AgeCommit message (Collapse)Author
2014-10-09fs: make cont_expand_zero interruptibleMikulas Patocka
This patch makes it possible to kill a process looping in cont_expand_zero. A process may spend a lot of time in this function, so it is desirable to be able to kill it. It happened to me that I wanted to copy a piece data from the disk to a file. By mistake, I used the "seek" parameter to dd instead of "skip". Due to the "seek" parameter, dd attempted to extend the file and became stuck doing so - the only possibility was to reset the machine or wait many hours until the filesystem runs out of space and cont_expand_zero fails. We need this patch to be able to terminate the process. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09fs: Fix theoretical division by 0 in super_cache_scan().Tetsuo Handa
total_objects could be 0 and is used as a denom. While total_objects is a "long", total_objects == 0 unlikely happens for 3.12 and later kernels because 32-bit architectures would not be able to hold (1 << 32) objects. However, total_objects == 0 may happen for kernels between 3.1 and 3.11 because total_objects in prune_super() was an "int" and (e.g.) x86_64 architecture might be able to hold (1 << 32) objects. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: stable <stable@kernel.org> # 3.1+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09dcache: Fix no spaces at the start of a line in dcache.cDaeseok Youn
Fixed coding style in dcache.c Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09[jffs2] kill wbuf_queued/wbuf_dwork_lockAl Viro
schedule_delayed_work() happening when the work is already pending is a cheap no-op. Don't bother with ->wbuf_queued logics - it's both broken (cancelling ->wbuf_dwork leaves it set, as spotted by Jeff Harris) and pointless. It's cheaper to let schedule_delayed_work() handle that case. Reported-by: Jeff Harris <jefftharris@gmail.com> Tested-by: Jeff Harris <jefftharris@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09handle suicide on late failure exits in execve() in search_binary_handler()Al Viro
... rather than doing that in the guts of ->load_binary(). [updated to fix the bug spotted by Shentino - for SIGSEGV we really need something stronger than send_sig_info(); again, better do that in one place] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09dcache.c: call ->d_prune() regardless of d_unhashed()Al Viro
the only in-tree instance checks d_unhashed() anyway, out-of-tree code can preserve the current behaviour by adding such check if they want it and we get an ability to use it in cases where we *want* to be notified of killing being inevitable before ->d_lock is dropped, whether it's unhashed or not. In particular, autofs would benefit from that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09d_prune_alias(): just lock the parent and call __dentry_kill()Al Viro
The only reason for games with ->d_prune() was __d_drop(), which was needed only to force dput() into killing the sucker off. Note that lock_parent() can be called under ->i_lock and won't drop it, so dentry is safe from somebody managing to kill it under us - it won't happen while we are holding ->i_lock. __dentry_kill() is called only with ->d_lockref.count being 0 (here and when picked from shrink list) or 1 (dput() and dropping the ancestors in shrink_dentry_list()), so it will never be called twice - the first thing it's doing is making ->d_lockref.count negative and once that happens, nothing will increment it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09proc: Update proc_flush_task_mnt to use d_invalidateEric W. Biederman
Now that d_invalidate always succeeds and flushes mount points use it in stead of a combination of shrink_dcache_parent and d_drop in proc_flush_task_mnt. This removes the danger of a mount point under /proc/<pid>/... becoming unreachable after the d_drop. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09vfs: Remove d_drop calls from d_revalidate implementationsEric W. Biederman
Now that d_invalidate always succeeds it is not longer necessary or desirable to hard code d_drop calls into filesystem specific d_revalidate implementations. Remove the unnecessary d_drop calls and rely on d_invalidate to drop the dentries. Using d_invalidate ensures that paths to mount points will not be dropped. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09vfs: Make d_invalidate return voidEric W. Biederman
Now that d_invalidate can no longer fail, stop returning a useless return code. For the few callers that checked the return code update remove the handling of d_invalidate failure. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09vfs: Merge check_submounts_and_drop and d_invalidateEric W. Biederman
Now that d_invalidate is the only caller of check_submounts_and_drop, expand check_submounts_and_drop inline in d_invalidate. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09vfs: Remove unnecessary calls of check_submounts_and_dropEric W. Biederman
Now that check_submounts_and_drop can not fail and is called from d_invalidate there is no longer a need to call check_submounts_and_drom from filesystem d_revalidate methods so remove it. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09vfs: Lazily remove mounts on unlinked files and directories.Eric W. Biederman
With the introduction of mount namespaces and bind mounts it became possible to access files and directories that on some paths are mount points but are not mount points on other paths. It is very confusing when rm -rf somedir returns -EBUSY simply because somedir is mounted somewhere else. With the addition of user namespaces allowing unprivileged mounts this condition has gone from annoying to allowing a DOS attack on other users in the system. The possibility for mischief is removed by updating the vfs to support rename, unlink and rmdir on a dentry that is a mountpoint and by lazily unmounting mountpoints on deleted dentries. In particular this change allows rename, unlink and rmdir system calls on a dentry without a mountpoint in the current mount namespace to succeed, and it allows rename, unlink, and rmdir performed on a distributed filesystem to update the vfs cache even if when there is a mount in some namespace on the original dentry. There are two common patterns of maintaining mounts: Mounts on trusted paths with the parent directory of the mount point and all ancestory directories up to / owned by root and modifiable only by root (i.e. /media/xxx, /dev, /dev/pts, /proc, /sys, /sys/fs/cgroup/{cpu, cpuacct, ...}, /usr, /usr/local). Mounts on unprivileged directories maintained by fusermount. In the case of mounts in trusted directories owned by root and modifiable only by root the current parent directory permissions are sufficient to ensure a mount point on a trusted path is not removed or renamed by anyone other than root, even if there is a context where the there are no mount points to prevent this. In the case of mounts in directories owned by less privileged users races with users modifying the path of a mount point are already a danger. fusermount already uses a combination of chdir, /proc/<pid>/fd/NNN, and UMOUNT_NOFOLLOW to prevent these races. The removable of global rename, unlink, and rmdir protection really adds nothing new to consider only a widening of the attack window, and fusermount is already safe against unprivileged users modifying the directory simultaneously. In principle for perfect userspace programs returning -EBUSY for unlink, rmdir, and rename of dentires that have mounts in the local namespace is actually unnecessary. Unfortunately not all userspace programs are perfect so retaining -EBUSY for unlink, rmdir and rename of dentries that have mounts in the current mount namespace plays an important role of maintaining consistency with historical behavior and making imperfect userspace applications hard to exploit. v2: Remove spurious old_dentry. v3: Optimized shrink_submounts_and_drop Removed unsued afs label v4: Simplified the changes to check_submounts_and_drop Do not rename check_submounts_and_drop shrink_submounts_and_drop Document what why we need atomicity in check_submounts_and_drop Rely on the parent inode mutex to make d_revalidate and d_invalidate an atomic unit. v5: Refcount the mountpoint to detach in case of simultaneous renames. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09vfs: Add a function to lazily unmount all mounts from any dentry.Eric W. Biederman
The new function detach_mounts comes in two pieces. The first piece is a static inline test of d_mounpoint that returns immediately without taking any locks if d_mounpoint is not set. In the common case when mountpoints are absent this allows the vfs to continue running with it's same cacheline foot print. The second piece of detach_mounts __detach_mounts actually does the work and it assumes that a mountpoint is present so it is slow and takes namespace_sem for write, and then locks the mount hash (aka mount_lock) after a struct mountpoint has been found. With those two locks held each entry on the list of mounts on a mountpoint is selected and lazily unmounted until all of the mount have been lazily unmounted. v7: Wrote a proper change description and removed the changelog documenting deleted wrong turns. Signed-off-by: Eric W. Biederman <ebiederman@twitter.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09vfs: factor out lookup_mountpoint from new_mountpointEric W. Biederman
I am shortly going to add a new user of struct mountpoint that needs to look up existing entries but does not want to create a struct mountpoint if one does not exist. Therefore to keep the code simple and easy to read split out lookup_mountpoint from new_mountpoint. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09vfs: Keep a list of mounts on a mount pointEric W. Biederman
To spot any possible problems call BUG if a mountpoint is put when it's list of mounts is not empty. AV: use hlist instead of list_head Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Eric W. Biederman <ebiederman@twitter.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09vfs: Don't allow overwriting mounts in the current mount namespaceEric W. Biederman
In preparation for allowing mountpoints to be renamed and unlinked in remote filesystems and in other mount namespaces test if on a dentry there is a mount in the local mount namespace before allowing it to be renamed or unlinked. The primary motivation here are old versions of fusermount unmount which is not safe if the a path can be renamed or unlinked while it is verifying the mount is safe to unmount. More recent versions are simpler and safer by simply using UMOUNT_NOFOLLOW when unmounting a mount in a directory owned by an arbitrary user. Miklos Szeredi <miklos@szeredi.hu> reports this is approach is good enough to remove concerns about new kernels mixed with old versions of fusermount. A secondary motivation for restrictions here is that it removing empty directories that have non-empty mount points on them appears to violate the rule that rmdir can not remove empty directories. As Linus Torvalds pointed out this is useful for programs (like git) that test if a directory is empty with rmdir. Therefore this patch arranges to enforce the existing mount point semantics for local mount namespace. v2: Rewrote the test to be a drop in replacement for d_mountpoint v3: Use bool instead of int as the return type of is_local_mountpoint Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09vfs: More precise tests in d_invalidateEric W. Biederman
The current comments in d_invalidate about what and why it is doing what it is doing are wildly off-base. Which is not surprising as the comments date back to last minute bug fix of the 2.2 kernel. The big fat lie of a comment said: If it's a directory, we can't drop it for fear of somebody re-populating it with children (even though dropping it would make it unreachable from that root, we still might repopulate it if it was a working directory or similar). [AV] What we really need to avoid is multiple dentry aliases of the same directory inode; on all filesystems that have ->d_revalidate() we either declare all positive dentries always valid (and thus never fed to d_invalidate()) or use d_materialise_unique() and/or d_splice_alias(), which take care of alias prevention. The current rules are: - To prevent mount point leaks dentries that are mount points or that have childrent that are mount points may not be be unhashed. - All dentries may be unhashed. - Directories may be rehashed with d_materialise_unique check_submounts_and_drop implements this already for well maintained remote filesystems so implement the current rules in d_invalidate by just calling check_submounts_and_drop. The one difference between d_invalidate and check_submounts_and_drop is that d_invalidate must respect it when a d_revalidate method has earlier called d_drop so preserve the d_unhashed check in d_invalidate. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09vfs: Document the effect of d_revalidate on d_find_aliasEric W. Biederman
d_drop or check_submounts_and_drop called from d_revalidate can result in renamed directories with child dentries being unhashed. These renamed and drop directory dentries can be rehashed after d_materialise_unique uses d_find_alias to find them. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09delayed mntputAl Viro
On final mntput() we want fs shutdown to happen before return to userland; however, the only case where we want it happen right there (i.e. where task_work_add won't do) is MNT_INTERNAL victim. Those have to be fully synchronous - failure halfway through module init might count on having vfsmount killed right there. Fortunately, final mntput on MNT_INTERNAL vfsmounts happens on shallow stack. So we handle those synchronously and do an analog of delayed fput logics for everything else. As the result, we are guaranteed that fs shutdown will always happen on shallow stack. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09autofs - remove obsolete d_invalidate() from expireIan Kent
Biederman's umount-on-rmdir series changes d_invalidate() to sumarily remove mounts under the passed in dentry regardless of whether they are busy or not. So calling this in fs/autofs4/expire.c:autofs4_tree_busy() is definitely the wrong thing to do becuase it will silently umount entries instead of just cleaning stale dentrys. But this call shouldn't be needed and testing shows that automounting continues to function without it. As Al Viro correctly surmises the original intent of the call was to perform what shrink_dcache_parent() does. If at some time in the future I see stale dentries accumulating following failed mounts I'll revisit the issue and possibly add a shrink_dcache_parent() call if needed. Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09Allow sharing external names after __d_move()Al Viro
* external dentry names get a small structure prepended to them (struct external_name). * it contains an atomic refcount, matching the number of struct dentry instances that have ->d_name.name pointing to that external name. The first thing free_dentry() does is decrementing refcount of external name, so the instances that are between the call of free_dentry() and RCU-delayed actual freeing do not contribute. * __d_move(x, y, false) makes the name of x equal to the name of y, external or not. If y has an external name, extra reference is grabbed and put into x->d_name.name. If x used to have an external name, the reference to the old name is dropped and, should it reach zero, freeing is scheduled via kfree_rcu(). * free_dentry() in dentry with external name decrements the refcount of that name and, should it reach zero, does RCU-delayed call that will free both the dentry and external name. Otherwise it does what it used to do, except that __d_free() doesn't even look at ->d_name.name; it simply frees the dentry. All non-RCU accesses to dentry external name are safe wrt freeing since they all should happen before free_dentry() is called. RCU accesses might run into a dentry seen by free_dentry() or into an old name that got already dropped by __d_move(); however, in both cases dentry must have been alive and refer to that name at some point after we'd done rcu_read_lock(), which means that any freeing must be still pending. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-08NFSv4.1/pnfs: replace broken pnfs_put_lseg_asyncTrond Myklebust
You cannot call pnfs_put_lseg_async() more than once per lseg, so it is really an inappropriate way to deal with a refcount issue. Instead, replace it with a function that decrements the refcount, and puts the final 'free' operation (which is incompatible with locks) on the workqueue. Cc: Weston Andros Adamson <dros@primarydata.com> Fixes: e6cf82d1830f: pnfs: add pnfs_put_lseg_async Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-10-08fs: Add a missing permission check to do_umountAndy Lutomirski
Accessing do_remount_sb should require global CAP_SYS_ADMIN, but only one of the two call sites was appropriately protected. Fixes CVE-2014-7975. Signed-off-by: Andy Lutomirski <luto@amacapital.net>
2014-10-08NFSv4: Remove dead prototype for nfs4_insert_deviceid_node()Tom Haynes
nfs4_insert_deviceid_node() was removed in 661373b13d0490ff410a2133d4a7a117f2dd037e Signed-off-by: Tom Haynes <loghyr@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-10-08Merge tag 'f2fs-for-3.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This patch-set introduces a couple of new features such as large sector size, FITRIM, and atomic/volatile writes. Several patches enhance power-off recovery and checkpoint routines. The fsck.f2fs starts to support fixing corrupted partitions with recovery hints provided by this patch-set. Summary: - retain some recovery information for fsck.f2fs - enhance checkpoint speed - enhance flush command management - bug fix for lseek - tune in-place-update policies - enhance roll-forward speed - revisit all the roll-forward and fsync rules - support larget sector size - support FITRIM - support atomic and volatile writes And several clean-ups and bug fixes are included" * tag 'f2fs-for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (42 commits) f2fs: support volatile operations for transient data f2fs: support atomic writes f2fs: remove unused return value f2fs: clean up f2fs_ioctl functions f2fs: potential shift wrapping buf in f2fs_trim_fs() f2fs: call f2fs_unlock_op after error was handled f2fs: check the use of macros on block counts and addresses f2fs: refactor flush_nat_entries to remove costly reorganizing ops f2fs: introduce FITRIM in f2fs_ioctl f2fs: introduce cp_control structure f2fs: use more free segments until SSR is activated f2fs: change the ipu_policy option to enable combinations f2fs: fix to search whole dirty segmap when get_victim f2fs: fix to clean previous mount option when remount_fs f2fs: skip punching hole in special condition f2fs: support large sector size f2fs: fix to truncate blocks past EOF in ->setattr f2fs: update i_size when __allocate_data_block f2fs: use MAX_BIO_BLOCKS(sbi) f2fs: remove redundant operation during roll-forward recovery ...
2014-10-08Merge branch 'for-3.18' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd updates from Bruce Fields: "Highlights: - support the NFSv4.2 SEEK operation (allowing clients to support SEEK_HOLE/SEEK_DATA), thanks to Anna. - end the grace period early in a number of cases, mitigating a long-standing annoyance, thanks to Jeff - improve SMP scalability, thanks to Trond" * 'for-3.18' of git://linux-nfs.org/~bfields/linux: (55 commits) nfsd: eliminate "to_delegation" define NFSD: Implement SEEK NFSD: Add generic v4.2 infrastructure svcrdma: advertise the correct max payload nfsd: introduce nfsd4_callback_ops nfsd: split nfsd4_callback initialization and use nfsd: introduce a generic nfsd4_cb nfsd: remove nfsd4_callback.cb_op nfsd: do not clear rpc_resp in nfsd4_cb_done_sequence nfsd: fix nfsd4_cb_recall_done error handling nfsd4: clarify how grace period ends nfsd4: stop grace_time update at end of grace period nfsd: skip subsequent UMH "create" operations after the first one for v4.0 clients nfsd: set and test NFSD4_CLIENT_STABLE bit to reduce nfsdcltrack upcalls nfsd: serialize nfsdcltrack upcalls for a particular client nfsd: pass extra info in env vars to upcalls to allow for early grace period end nfsd: add a v4_end_grace file to /proc/fs/nfsd lockd: add a /proc/fs/lockd/nlm_end_grace file nfsd: reject reclaim request when client has already sent RECLAIM_COMPLETE nfsd: remove redundant boot_time parm from grace_done client tracking op ...
2014-10-08Merge tag 'nfs-for-3.18-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Highlights include: Stable fixes: - fix an NFSv4.1 state renewal regression - fix open/lock state recovery error handling - fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails - fix statd when reconnection fails - don't wake tasks during connection abort - don't start reboot recovery if lease check fails - fix duplicate proc entries Features: - pNFS block driver fixes and clean ups from Christoph - More code cleanups from Anna - Improve mmap() writeback performance - Replace use of PF_TRANS with a more generic mechanism for avoiding deadlocks in nfs_release_page" * tag 'nfs-for-3.18-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (66 commits) NFSv4.1: Fix an NFSv4.1 state renewal regression NFSv4: fix open/lock state recovery error handling NFSv4: Fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails NFS: Fabricate fscache server index key correctly SUNRPC: Add missing support for RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT NFSv3: Fix missing includes of nfs3_fs.h NFS/SUNRPC: Remove other deadlock-avoidance mechanisms in nfs_release_page() NFS: avoid waiting at all in nfs_release_page when congested. NFS: avoid deadlocks with loop-back mounted NFS filesystems. MM: export page_wakeup functions SCHED: add some "wait..on_bit...timeout()" interfaces. NFS: don't use STABLE writes during writeback. NFSv4: use exponential retry on NFS4ERR_DELAY for async requests. rpc: Add -EPERM processing for xs_udp_send_request() rpc: return sent and err from xs_sendpages() lockd: Try to reconnect if statd has moved SUNRPC: Don't wake tasks during connection abort Fixing lease renewal nfs: fix duplicate proc entries pnfs/blocklayout: Fix a 64-bit division/remainder issue in bl_map_stripe ...
2014-10-08btrfs: Fix compile error when CONFIG_SECURITY is not set.Qu Wenruo
Fix the following compile error when CONFIG_SECURITY is not set: error: 'struct security_mnt_opts' has no member named 'num_mnt_opts' Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-10-08GFS2: use _RET_IP_ instead of (unsigned long)__builtin_return_address(0)Fabian Frederick
use macro definition Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-10-07Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull "trivial tree" updates from Jiri Kosina: "Usual pile from trivial tree everyone is so eagerly waiting for" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) Remove MN10300_PROC_MN2WS0038 mei: fix comments treewide: Fix typos in Kconfig kprobes: update jprobe_example.c for do_fork() change Documentation: change "&" to "and" in Documentation/applying-patches.txt Documentation: remove obsolete pcmcia-cs from Changes Documentation: update links in Changes Documentation: Docbook: Fix generated DocBook/kernel-api.xml score: Remove GENERIC_HAS_IOMAP gpio: fix 'CONFIG_GPIO_IRQCHIP' comments tty: doc: Fix grammar in serial/tty dma-debug: modify check_for_stack output treewide: fix errors in printk genirq: fix reference in devm_request_threaded_irq comment treewide: fix synchronize_rcu() in comments checkstack.pl: port to AArch64 doc: queue-sysfs: minor fixes init/do_mounts: better syntax description MIPS: fix comment spelling powerpc/simpleboot: fix comment ...
2014-10-07Btrfs: fix compiles when CONFIG_BTRFS_FS_RUN_SANITY_TESTS is offChris Mason
Commit fccb84c94 moved added some helpers to cleanup our sanity tests, but it looks like both Dave and I always compile with the tests enabled. This fixes things to work when they are turned off too. Signed-off-by: Chris Mason <clm@fb.com>
2014-10-07f2fs: support volatile operations for transient dataJaegeuk Kim
This patch adds support for volatile writes which keep data pages in memory until f2fs_evict_inode is called by iput. For instance, we can use this feature for the sqlite database as follows. While supporting atomic writes for main database file, we can keep its journal data temporarily in the page cache by the following sequence. 1. open -> ioctl(F2FS_IOC_START_VOLATILE_WRITE); 2. writes : keep all the data in the page cache. 3. flush to the database file with atomic writes a. ioctl(F2FS_IOC_START_ATOMIC_WRITE); b. writes c. ioctl(F2FS_IOC_COMMIT_ATOMIC_WRITE); 4. close -> drop the cached data Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-10-07locks: flock_make_lock should return a struct file_lock (or PTR_ERR)Jeff Layton
Eliminate the need for a return pointer. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-07locks: set fl_owner for leases to filp instead of current->filesJeff Layton
Like flock locks, leases are owned by the file description. Now that the i_have_this_lease check in __break_lease is gone, we don't actually use the fl_owner for leases for anything. So, it's now safe to set this more appropriately to the same value as the fl_file. While we're at it, fix up the comments over the fl_owner_t definition since they're rather out of date. Signed-off-by: Jeff Layton <jlayton@primarydata.com>
2014-10-07locks: give lm_break a return valueJeff Layton
Christoph suggests: "Add a return value to lm_break so that the lock manager can tell the core code "you can delete this lease right now". That gets rid of the games with the timeout which require all kinds of race avoidance code in the users." Do that here and have the nfsd lease break routine use it when it detects that there was a race between setting up the lease and it being broken. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-07locks: __break_lease cleanup in preparation of allowing direct removal of leasesJeff Layton
Eliminate an unneeded "flock" variable. We can use "fl" as a loop cursor everywhere. Add a any_leases_conflict helper function as well to consolidate a bit of code. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-07locks: remove i_have_this_lease check from __break_leaseJeff Layton
I think that the intent of this code was to ensure that a process won't deadlock if it has one fd open with a lease on it and then breaks that lease by opening another fd. In that case it'll treat the __break_lease call as if it were non-blocking. This seems wrong -- the process could (for instance) be multithreaded and managing different fds via different threads. I also don't see any mention of this limitation in the (somewhat sketchy) documentation. Remove the check and the non-blocking behavior when i_have_this_lease is true. Signed-off-by: Jeff Layton <jlayton@primarydata.com>
2014-10-07locks: move freeing of leases outside of i_lockJeff Layton
There was only one place where we still could free a file_lock while holding the i_lock -- lease_modify. Add a new list_head argument to the lm_change operation, pass in a private list when calling it, and fix those callers to dispose of the list once the lock has been dropped. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-07locks: move i_lock acquisition into generic_*_lease handlersJeff Layton
Now that we have a saner internal API for managing leases, we no longer need to mandate that the inode->i_lock be held over most of the lease code. Push it down into generic_add_lease and generic_delete_lease. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-07locks: define a lm_setup handler for leasesJeff Layton
...and move the fasync setup into it for fcntl lease calls. At the same time, change the semantics of how the file_lock double-pointer is handled. Up until now, on a successful lease return you got a pointer to the lock on the list. This is bad, since that pointer can no longer be relied on as valid once the inode->i_lock has been released. Change the code to instead just zero out the pointer if the lease we passed in ended up being used. Then the callers can just check to see if it's NULL after the call and free it if it isn't. The priv argument has the same semantics. The lm_setup function can zero the pointer out to signal to the caller that it should not be freed after the function returns. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-07locks: plumb a "priv" pointer into the setlease routinesJeff Layton
In later patches, we're going to add a new lock_manager_operation to finish setting up the lease while still holding the i_lock. To do this, we'll need to pass a little bit of info in the fcntl setlease case (primarily an fasync structure). Plumb the extra pointer into there in advance of that. We declare this pointer as a void ** to make it clear that this is private info, and that the caller isn't required to set this unless the lm_setup specifically requires it. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-07nfsd: don't keep a pointer to the lease in nfs4_fileJeff Layton
Now that we don't need to pass in an actual lease pointer to vfs_setlease on unlock, we can stop tracking a pointer to the lease in the nfs4_file. Switch all of the places that check the fi_lease to check fi_deleg_file instead. We always set that at the same time so it will have the same semantics. Cc: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-07locks: clean up vfs_setlease kerneldoc commentsJeff Layton
Some of the latter paragraphs seem ambiguous and just plain wrong. In particular the break_lease comment makes no sense. We call break_lease (and break_deleg) from all sorts of vfs-layer functions, so there is clearly such a method. Also get rid of some of the other comments about what's needed for a full implementation. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-07locks: generic_delete_lease doesn't need a file_lock at allJeff Layton
Ensure that it's OK to pass in a NULL file_lock double pointer on a F_UNLCK request and convert the vfs_setlease F_UNLCK callers to do just that. Finally, turn the BUG_ON in generic_setlease into a WARN_ON_ONCE with an error return. That's a problem we can handle without crashing the box if it occurs. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-07nfsd: fix potential lease memory leak in nfs4_setleaseJeff Layton
It's unlikely to ever occur, but if there were already a lease set on the file then we could end up getting back a different pointer on a successful setlease attempt than the one we allocated. If that happens, the one we allocated could leak. In practice, I don't think this will happen due to the fact that we only try to set up the lease once per nfs4_file, but this error handling is a bit more correct given the current lease API. Cc: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-07locks: close potential race in lease_get_mtimeJeff Layton
lease_get_mtime is called without the i_lock held, so there's no guarantee about the stability of the list. Between the time when we assign "flock" and then dereference it to check whether it's a lease and for write, the lease could be freed. Ensure that that doesn't occur by taking the i_lock before trying to check the lease. Cc: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-10-06f2fs: support atomic writesJaegeuk Kim
This patch introduces a very limited functionality for atomic write support. In order to support atomic write, this patch adds two ioctls: o F2FS_IOC_START_ATOMIC_WRITE o F2FS_IOC_COMMIT_ATOMIC_WRITE The database engine should be aware of the following sequence. 1. open -> ioctl(F2FS_IOC_START_ATOMIC_WRITE); 2. writes : all the written data will be treated as atomic pages. 3. commit -> ioctl(F2FS_IOC_COMMIT_ATOMIC_WRITE); : this flushes all the data blocks to the disk, which will be shown all or nothing by f2fs recovery procedure. 4. repeat to #2. The IO pattens should be: ,- START_ATOMIC_WRITE ,- COMMIT_ATOMIC_WRITE CP | D D D D D D | FSYNC | D D D D | FSYNC ... `- COMMIT_ATOMIC_WRITE Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-10-06ecryptfs: remove unneeded buggy code in ecryptfs_do_create()Alexey Khoroshilov
There is a bug in error handling of lock_parent() in ecryptfs_do_create(): lock_parent() acquries mutex even if dget_parent() fails, so mutex should be unlocked anyway. But dget_parent() does not fail, so the patch just removes unneeded buggy code. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
2014-10-06btrfs: Make btrfs handle security mount options internally to avoid losing ↵Qu Wenruo
security label. [BUG] Originally when mount btrfs with "-o subvol=" mount option, btrfs will lose all security lable. And if the btrfs fs is mounted somewhere else, due to the lost of security lable, SELinux will refuse to mount since the same super block is being mounted using different security lable. [REPRODUCER] With SELinux enabled: #mkfs -t btrfs /dev/sda5 #mount -o context=system_u:object_r:nfs_t:s0 /dev/sda5 /mnt/btrfs #btrfs subvolume create /mnt/btrfs/subvol #mount -o subvol=subvol,context=system_u:object_r:nfs_t:s0 /dev/sda5 /mnt/test kernel message: SELinux: mount invalid. Same superblock, different security settings for (dev sda5, type btrfs) [REASON] This happens because btrfs will call vfs_kern_mount() and then mount_subtree() to handle subvolume name lookup. First mount will cut off all the security lables and when it comes to the second vfs_kern_mount(), it has no security label now. [FIX] This patch will makes btrfs behavior much more like nfs, which has the type flag FS_BINARY_MOUNTDATA, making btrfs handles the security label internally. So security label will be set in the real mount time and won't lose label when use with "subvol=" mount option. Reported-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>