summaryrefslogtreecommitdiffstats
path: root/fs/locks.c
AgeCommit message (Collapse)Author
2007-10-30locks: fix possible infinite loop in posix deadlock detectionJ. Bruce Fields
It's currently possible to send posix_locks_deadlock() into an infinite loop (under the BKL). For now, fix this just by bailing out after a few iterations. We may want to fix this in a way that better clarifies the semantics of deadlock detection. But that will take more time, and this minimal fix is probably adequate for any realistic scenario, and is simple enough to be appropriate for applying to stable kernels now. Thanks to George Davis for reporting the problem. Cc: "George G. Davis" <gdavis@mvista.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17Slab API: remove useless ctor parameter and reorder parametersChristoph Lameter
Slab constructors currently have a flags parameter that is never used. And the order of the arguments is opposite to other slab functions. The object pointer is placed before the kmem_cache pointer. Convert ctor(void *object, struct kmem_cache *s, unsigned long flags) to ctor(struct kmem_cache *s, void *object) throughout the kernel [akpm@linux-foundation.org: coupla fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-09Rework /proc/locks via seq_files and seq_list helpersPavel Emelyanov
Currently /proc/locks is shown with a proc_read function, but its behavior is rather complex as it has to manually handle current offset and buffer length. On the other hand, files that show objects from lists can be easily reimplemented using the sequential files and the seq_list_XXX() helpers. This saves (as usually) 16 lines of code and more than 200 from the .text section. [akpm@linux-foundation.org: no externs in C] [akpm@linux-foundation.org: warning fixes] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-10-09fs/locks.c: use list_for_each_entry() instead of list_for_each()Matthias Kaehlcke
fs/locks.c: use list_for_each_entry() instead of list_for_each() in posix_locks_deadlock() and get_locks_status() Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-10-09Cleanup macros for distinguishing mandatory locksPavel Emelyanov
The combination of S_ISGID bit set and S_IXGRP bit unset is used to mark the inode as "mandatory lockable" and there's a macro for this check called MANDATORY_LOCK(inode). However, fs/locks.c and some filesystems still perform the explicit i_mode checking. Besides, Andrew pointed out, that this macro is buggy itself, as it dereferences the inode arg twice. Convert this macro into static inline function and switch its users to it, making the code shorter and more readable. The __mandatory_lock() helper is to be used in places where the IS_MANDLOCK() for superblock is already known to be true. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: David Howells <dhowells@redhat.com> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@sandia.gov> Cc: Latchesar Ionkov <lucho@ionkov.net> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-10-09locks: Fix potential OOPS in generic_setlease()Pavel Emelyanov
This code is run under lock_kernel(), which is dropped during sleeping operations, so the following race is possible: CPU1: CPU2: vfs_setlease(); vfs_setlease(); lock_kernel(); lock_kernel(); /* spin */ generic_setlease(): ... for (before = ...) /* here we found some lease after * which we will insert the new one */ fl = locks_alloc_lock(); /* go to sleep in this allocation and * drop the BKL */ generic_setlease(): ... for (before = ...) /* here we find the "before" pointing * at the one we found on CPU1 */ ->fl_change(my_before, arg); lease_modify(); locks_free_lock(); /* and we freed it */ ... unlock_kernel(); locks_insert_lock(before, fl); /* OOPS! We have just tried to add the lease * at the tail of already removed one */ The similar races are already handled in other code - all the allocations are performed before any checks/updates. Thanks to Kamalesh Babulal for testing and for a bug report on an earlier version. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
2007-10-09Use list_first_entry in locks_wake_up_blocksPavel Emelyanov
This routine deletes all the elements from the list with the "while (!list_empty())" loop, and we already have a list_first_entry() macro to help it look nicer :) Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
2007-10-09locks: fix flock_lock_file() commentJ. Bruce Fields
This comment wasn't updated when lease support was added, and it makes essentially the same mistake that the code made before a recent bugfix. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2007-10-09Memory shortage can result in inconsistent flocks statePavel Emelyanov
When the flock_lock_file() is called to change the flock from F_RDLCK to F_WRLCK or vice versa the existing flock can be removed without appropriate warning. Look: for_each_lock(inode, before) { struct file_lock *fl = *before; if (IS_POSIX(fl)) break; if (IS_LEASE(fl)) continue; if (filp != fl->fl_file) continue; if (request->fl_type == fl->fl_type) goto out; found = 1; locks_delete_lock(before); <<<<<< ! break; } if after this point the subsequent locks_alloc_lock() will fail the return code will be -ENOMEM, but the existing lock is already removed. This is a known feature that such "re-locking" is not atomic, but in the racy case the file should stay locked (although by some other process), but in this case the file will be unlocked. The proposal is to prepare the lock in advance keeping no chance to fail in the future code. Found during making the flocks pid-namespaces aware. (Note: Thanks to Reuben Farrelly for finding a bug in an earlier version of this patch.) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Reuben Farrelly <reuben-linuxkernel@reub.net>
2007-10-09locks: kill redundant local variableJ. Bruce Fields
There's no need for another variable local to this loop; we can use the variable (of the same name!) already declared at the top of the function, and not used till later (at which point it's initialized, so this is safe). Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2007-10-09locks: reverse order of posix_locks_conflict() argumentsJ. Bruce Fields
The first argument to posix_locks_conflict() is meant to be a lock request, and the second a lock from an inode's lock request. It doesn't really make a difference which order you call them in, since the only asymmetric test in posix_lock_conflict() is the check whether the second argument is a posix lock--and every caller already does that check for some reason. But may as well fix posix_test_lock() to call posix_locks_conflict() with the arguments in the same order as everywhere else. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2007-09-11Leases can be hidden by flocksPavel Emelyanov
The inode->i_flock list contains the leases, flocks and posix locks in the specified order. However, the flocks are added in the head of this list thus hiding the leases from F_GETLEASE command, from time_out_leases() and other code that expects the leases to come first. The following example will demonstrate this: #define _GNU_SOURCE #include <unistd.h> #include <fcntl.h> #include <stdio.h> #include <sys/file.h> static void show_lease(int fd) { int res; res = fcntl(fd, F_GETLEASE); switch (res) { case F_RDLCK: printf("Read lease\n"); break; case F_WRLCK: printf("Write lease\n"); break; case F_UNLCK: printf("No leases\n"); break; default: printf("Some shit\n"); break; } } int main(int argc, char **argv) { int fd, res; fd = open(argv[1], O_RDONLY); if (fd == -1) { perror("Can't open file"); return 1; } res = fcntl(fd, F_SETLEASE, F_WRLCK); if (res == -1) { perror("Can't set lease"); return 1; } show_lease(fd); if (flock(fd, LOCK_SH) == -1) { perror("Can't flock shared"); return 1; } show_lease(fd); return 0; } The first call to show_lease() will show the write lease set, but the second will show no leases. Fix the flock adding so that the leases always stay in the head of this list. Found during making the flocks pid-namespaces aware. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-31rename setlease to generic_setleaseChristoph Hellwig
Make it a little more clear that this is the default implementation for the setleast operation. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Steven Whitehouse <swhiteho@redhat.com> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-20mm: Remove slab destructors from kmem_cache_create().Paul Mundt
Slab destructors were no longer supported after Christoph's c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-07-18locks: fix vfs_test_lock() commentJ. Bruce Fields
Thanks to Doug Chapman for pointing out that the comment here is inconsistent with the function prototype. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2007-07-18locks: make posix_test_lock() interface more consistentJ. Bruce Fields
Since posix_test_lock(), like fcntl() and ->lock(), indicates absence or presence of a conflict lock by setting fl_type to, respectively, F_UNLCK or something other than F_UNLCK, the return value is no longer needed. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2007-07-18locks: export setlease to filesystemsJ. Bruce Fields
Export setlease so it can used by filesystems to implement their lease methods. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2007-07-18locks: provide a file lease method enabling cluster-coherent leasesJ. Bruce Fields
Currently leases are only kept locally, so there's no way for a distributed filesystem to enforce them against multiple clients. We're particularly interested in the case of nfsd exporting a cluster filesystem, in which case nfsd needs cluster-coherent leases in order to implement delegations correctly. Also add some documentation. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2007-07-18locks: rename lease functions to reflect locks.c conventionsJ. Bruce Fields
We've been using the convention that vfs_foo is the function that calls a filesystem-specific foo method if it exists, or falls back on a generic method if it doesn't; thus vfs_foo is what is called when some other part of the kernel (normally lockd or nfsd) wants to get a lock, whereas foo is what filesystems call to use the underlying local functionality as part of their lock implementation. So rename setlease to vfs_setlease (which will call a filesystem-specific setlease after a later patch) and __setlease to setlease. Also, vfs_setlease need only be GPL-exported as long as it's only needed by lockd and nfsd. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2007-07-18locks: share more common lease codeJ. Bruce Fields
Share more code between setlease (used by nfsd) and fcntl. Also some minor cleanup. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Acked-by: Christoph Hellwig <hch@infradead.org>
2007-07-18locks: clean up lease_alloc()J. Bruce Fields
Return the newly allocated structure as the return value instead of using a struct ** parameter. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2007-07-18locks: convert an -EINVAL return to a BUGJ. Bruce Fields
There's no point trying to return an error in these cases, which all represent bugs in the callers. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2007-07-18leases: minor break_lease() comment clarificationdavid m. richter
clarify that break_lease() checks for presence of any lock, not just leases. Signed-off-by: David M. Richter <richterd@citi.umich.edu> Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2007-05-17Remove SLAB_CTOR_CONSTRUCTORChristoph Lameter
SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: David Howells <dhowells@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Steven French <sfrench@us.ibm.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@ucw.cz> Cc: David Chinner <dgc@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-10locks: fix F_GETLK regression (failure to find conflicts)J. Bruce Fields
In 9d6a8c5c213e34c475e72b245a8eb709258e968c we changed posix_test_lock to modify its single file_lock argument instead of taking separate input and output arguments. This makes it no longer safe to set the output lock's fl_type to F_UNLCK before looking for a conflict, since that means searching for a conflict against a lock with type F_UNLCK. This fixes a regression which causes F_GETLK to incorrectly report no conflict on most filesystems (including any filesystem that doesn't do its own locking). Also fix posix_lock_to_flock() to copy the lock type. This isn't strictly necessary, since the caller already does this; but it seems less likely to cause confusion in the future. Thanks to Doug Chapman for the bug report. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Acked-by: Doug Chapman <doug.chapman@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-07Merge branch 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
* 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux: gfs2: nfs lock support for gfs2 lockd: add code to handle deferred lock requests lockd: always preallocate block in nlmsvc_lock() lockd: handle test_lock deferrals lockd: pass cookie in nlmsvc_testlock lockd: handle fl_grant callbacks lockd: save lock state on deferral locks: add fl_grant callback for asynchronous lock return nfsd4: Convert NFSv4 to new lock interface locks: add lock cancel command locks: allow {vfs,posix}_lock_file to return conflicting lock locks: factor out generic/filesystem switch from setlock code locks: factor out generic/filesystem switch from test_lock locks: give posix_test_lock same interface as ->lock locks: make ->lock release private data before returning in GETLK case locks: create posix-to-flock helper functions locks: trivial removal of unnecessary parentheses
2007-05-07slab allocators: Remove SLAB_DEBUG_INITIAL flagChristoph Lameter
I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by SLAB. I think its purpose was to have a callback after an object has been freed to verify that the state is the constructor state again? The callback is performed before each freeing of an object. I would think that it is much easier to check the object state manually before the free. That also places the check near the code object manipulation of the object. Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was compiled with SLAB debugging on. If there would be code in a constructor handling SLAB_DEBUG_INITIAL then it would have to be conditional on SLAB_DEBUG otherwise it would just be dead code. But there is no such code in the kernel. I think SLUB_DEBUG_INITIAL is too problematic to make real use of, difficult to understand and there are easier ways to accomplish the same effect (i.e. add debug code before kfree). There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be clear in fs inode caches. Remove the pointless checks (they would even be pointless without removeal of SLAB_DEBUG_INITIAL) from the fs constructors. This is the last slab flag that SLUB did not support. Remove the check for unimplemented flags from SLUB. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-06locks: add fl_grant callback for asynchronous lock returnMarc Eshel
Acquiring a lock on a cluster filesystem may require communication with remote hosts, and to avoid blocking lockd or nfsd threads during such communication, we allow the results to be returned asynchronously. When a ->lock() call needs to block, the file system will return -EINPROGRESS, and then later return the results with a call to the routine in the fl_grant field of the lock_manager_operations struct. This differs from the case when ->lock returns -EAGAIN to a blocking lock request; in that case, the filesystem calls fl_notify when the lock is granted, and the caller retries the original lock. So while fl_notify is merely a hint to the caller that it should retry, fl_grant actually communicates the final result of the lock operation (with the lock already acquired in the succesful case). Therefore fl_grant takes a lock, a status and, for the test lock case, a conflicting lock. We also allow fl_grant to return an error to the filesystem, to handle the case where the fl_grant requests arrives after the lock manager has already given up waiting for it. Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2007-05-06locks: add lock cancel commandMarc Eshel
Lock managers need to be able to cancel pending lock requests. In the case where the exported filesystem manages its own locks, it's not sufficient just to call posix_unblock_lock(); we need to let the filesystem know what's happening too. We do this by adding a new fcntl lock command: FL_CANCELLK. Some day this might also be made available to userspace applications that could benefit from an asynchronous locking api. Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2007-05-06locks: allow {vfs,posix}_lock_file to return conflicting lockMarc Eshel
The nfsv4 protocol's lock operation, in the case of a conflict, returns information about the conflicting lock. It's unclear how clients can use this, so for now we're not going so far as to add a filesystem method that can return a conflicting lock, but we may as well return something in the local case when it's easy to. Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2007-05-06locks: factor out generic/filesystem switch from setlock codeMarc Eshel
Factor out the code that switches between generic and filesystem-specific lock methods; eventually we want to call this from lock managers (lockd and nfsd) too; currently they only call the generic methods. This patch does that for all the setlk code. Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2007-05-06locks: factor out generic/filesystem switch from test_lockJ. Bruce Fields
Factor out the code that switches between generic and filesystem-specific lock methods; eventually we want to call this from lock managers (lockd and nfsd) too; currently they only call the generic methods. This patch does that for test_lock. Note that this hasn't been necessary until recently, because the few filesystems that define ->lock() (nfs, cifs...) aren't exportable via NFS. However GFS (and, in the future, other cluster filesystems) need to implement their own locking to get cluster-coherent locking, and also want to be able to export locking to NFS (lockd and NFSv4). So we accomplish this by factoring out code such as this and exporting it for the use of lockd and nfsd. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2007-05-06locks: give posix_test_lock same interface as ->lockMarc Eshel
posix_test_lock() and ->lock() do the same job but have gratuitously different interfaces. Modify posix_test_lock() so the two agree, simplifying some code in the process. Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2007-05-06locks: make ->lock release private data before returning in GETLK caseJ. Bruce Fields
The file_lock argument to ->lock is used to return the conflicting lock when found. There's no reason for the filesystem to return any private information with this conflicting lock, but nfsv4 is. Fix nfsv4 client, and modify locks.c to stop calling fl_release_private for it in this case. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Cc: "Trond Myklebust" <Trond.Myklebust@netapp.com>"
2007-04-16locks: create posix-to-flock helper functionsJ. Bruce Fields
Factor out a bit of messy code by creating posix-to-flock counterparts to the existing flock-to-posix helper functions. Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2007-04-16locks: trivial removal of unnecessary parenthesesJ. Bruce Fields
Remove some unnecessary parentheses. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
2006-12-08[PATCH] VFS: change struct file to use struct pathJosef "Jeff" Sipek
This patch changes struct file to use struct path instead of having independent pointers to struct dentry and struct vfsmount, and converts all users of f_{dentry,vfsmnt} in fs/ to use f_path.{dentry,mnt}. Additionally, it adds two #define's to make the transition easier for users of the f_dentry and f_vfsmnt. Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07[PATCH] slab: remove kmem_cache_tChristoph Lameter
Replace all uses of kmem_cache_t with struct kmem_cache. The patch was generated using the following script: #!/bin/sh # # Replace one string by another in all the kernel sources. # set -e for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do quilt add $file sed -e "1,\$s/$1/$2/g" $file >/tmp/$$ mv /tmp/$$ $file quilt refresh done The script was run like this sh replace kmem_cache_t "struct kmem_cache" Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07[PATCH] slab: remove SLAB_KERNELChristoph Lameter
SLAB_KERNEL is an alias of GFP_KERNEL. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02[PATCH] file: modify struct fown_struct to use a struct pidEric W. Biederman
File handles can be requested to send sigio and sigurg to processes. By tracking the destination processes using struct pid instead of pid_t we make the interface safe from all potential pid wrap around problems. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01[PATCH] VFS: Use SEEK_{SET, CUR, END} instead of hardcoded valuesJosef 'Jeff' Sipek
VFS: Use SEEK_{SET,CUR,END} instead of hardcoded values Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-14[PATCH] fcntl(F_SETSIG) fixTrond Myklebust
fcntl(F_SETSIG) no longer works on leases because lease_release_private_callback() gets called as the lease is copied in order to initialise it. The problem is that lease_alloc() performs an unnecessary initialisation, which sets the lease_manager_ops. Avoid the problem by allocating the target lease structure using locks_alloc_lock(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-07-05VFS: Add support for the FL_ACCESS flag to flock_lock_file()Trond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-07-05VFS: Allow caller to determine if BSD or posix locks were actually freedTrond Myklebust
Change posix_lock_file_conf(), and flock_lock_file() so that if called with an F_UNLCK argument, and the FL_EXISTS flag they will indicate whether or not any locks were actually freed by returning 0 or -ENOENT. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-23[PATCH] fs/locks.c: make posix_locks_deadlock() staticAdrian Bunk
We can now make posix_locks_deadlock() static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] vfs: add lock owner argument to flush operationMiklos Szeredi
Pass the POSIX lock owner ID to the flush operation. This is useful for filesystems which don't want to store any locking state in inode->i_flock but want to handle locking/unlocking POSIX locks internally. FUSE is one such filesystem but I think it possible that some network filesystems would need this also. Also add a flag to indicate that a POSIX locking request was generated by close(), so filesystems using the above feature won't send an extra locking request in this case. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] locks: clean up locks_remove_posix()Miklos Szeredi
locks_remove_posix() can use posix_lock_file() instead of doing the lock removal by hand. posix_lock_file() now does exacly the same. The comment about pids no longer applies, posix_lock_file() takes only the owner into account. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] locks: don't do unnecessary allocationsMiklos Szeredi
posix_lock_file() always allocates new locks in advance, even if it's easy to determine that no allocations will be needed. Optimize these cases: - FL_ACCESS flag is set - Unlocking the whole range Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23[PATCH] locks: don't unnecessarily fail posix lock operationsMiklos Szeredi
posix_lock_file() was too cautious, failing operations on OOM, even if they didn't actually require an allocation. This has the disadvantage, that a failing unlock on process exit could lead to a memory leak. There are two possibilites for this: - filesystem implements .lock() and calls back to posix_lock_file(). On cleanup of files_struct locks_remove_posix() is called which should remove all locks belonging to files_struct. However if filesystem calls posix_lock_file() which fails, then those locks will never be freed. - if a file is closed while a lock is blocked, then after acquiring fcntl_setlk() will undo the lock. But this unlock itself might fail on OOM, again possibly leaking the lock. The solution is to move the checking of the allocations until after it is sure that they will be needed. This will solve the above problem since unlock will always succeed unless it splits an existing region. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22[PATCH] remove steal_locks()Miklos Szeredi
This patch removes the steal_locks() function. steal_locks() doesn't work correctly with any filesystem that does it's own lock management, including NFS, CIFS, etc. In addition it has weird semantics on local filesystems in case tasks sharing file-descriptor tables are doing POSIX locking operations in parallel to execve(). The steal_locks() function has an effect on applications doing: clone(CLONE_FILES) /* in child */ lock execve lock POSIX locks acquired before execve (by "child", "parent" or any further task sharing files_struct) will after the execve be owned exclusively by "child". According to Chris Wright some LSB/LTP kind of suite triggers without the stealing behavior, but there's no known real-world application that would also fail. Apps using NPTL are not affected, since all other threads are killed before execve. Apps using LinuxThreads are only affected if they - have multiple threads during exec (LinuxThreads doesn't kill other threads, the app may do it with pthread_kill_other_threads_np()) - rely on POSIX locks being inherited across exec Both conditions are documented, but not their interaction. Apps using clone() natively are affected if they - use clone(CLONE_FILES) - rely on POSIX locks being inherited across exec The above scenarios are unlikely, but possible. If the patch is vetoed, there's a plan B, that involves mostly keeping the weird stealing semantics, but changing the way lock ownership is handled so that network and local filesystems work consistently. That would add more complexity though, so this solution seems to be preferred by most people. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Matthew Wilcox <willy@debian.org> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>