asmadeus/linux.git - The linux kernel

Age	Commit message	Author
2010-10-28	fsnotify: rename FS_IN_ISDIR to FS_ISDIR	Eric Paris
	The _IN_ in the naming is reserved for flags only used by inotify. Since I am about to use this flag for fanotify rename it to be generic like the rest. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-10-15	llseek: automatically add .llseek fop	Arnd Bergmann
	All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer.
	The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek.
	New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file.
	The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle.
	Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window.
	Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this.
	===== begin semantic patch =====
	// This adds an llseek= method to all file operations,
	// as a preparation for making no_llseek the default.
	//
	// The rules are
	// - use no_llseek explicitly if we do nonseekable_open
	// - use seq_lseek for sequential files
	// - use default_llseek if we know we access f_pos
	// - use noop_llseek if we know we don't access f_pos,
	//   but we still want to allow users to call lseek
	//
	@ open1 exists @
	identifier nested_open;
	@@
	nested_open(...)
	{
	<+...
	nonseekable_open(...)
	...+>
	}
	@ open exists@
	identifier open_f;
	identifier i, f;
	identifier open1.nested_open;
	@@
	int open_f(struct inode *i, struct file *f)
	{
	<+...
	(
	nonseekable_open(...)
	|
	nested_open(...)
	)
	...+>
	}
	@ read disable optional_qualifier exists @
	identifier read_f;
	identifier f, p, s, off;
	type ssize_t, size_t, loff_t;
	expression E;
	identifier func;
	@@
	ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
	{
	<+...
	(
	   *off = E
	|
	   *off += E
	|
	   func(..., off, ...)
	|
	   E = *off
	)
	...+>
	}
	@ read_no_fpos disable optional_qualifier exists @
	identifier read_f;
	identifier f, p, s, off;
	type ssize_t, size_t, loff_t;
	@@
	ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
	{
	... when != off
	}
	@ write @
	identifier write_f;
	identifier f, p, s, off;
	type ssize_t, size_t, loff_t;
	expression E;
	identifier func;
	@@
	ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
	{
	<+...
	(
	  *off = E
	|
	  *off += E
	|
	  func(..., off, ...)
	|
	  E = *off
	)
	...+>
	}
	@ write_no_fpos @
	identifier write_f;
	identifier f, p, s, off;
	type ssize_t, size_t, loff_t;
	@@
	ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
	{
	... when != off
	}
	@ fops0 @
	identifier fops;
	@@
	struct file_operations fops = {
	 ...
	};
	@ has_llseek depends on fops0 @
	identifier fops0.fops;
	identifier llseek_f;
	@@
	struct file_operations fops = {
	...
	 .llseek = llseek_f,
	...
	};
	@ has_read depends on fops0 @
	identifier fops0.fops;
	identifier read_f;
	@@
	struct file_operations fops = {
	...
	 .read = read_f,
	...
	};
	@ has_write depends on fops0 @
	identifier fops0.fops;
	identifier write_f;
	@@
	struct file_operations fops = {
	...
	 .write = write_f,
	...
	};
	@ has_open depends on fops0 @
	identifier fops0.fops;
	identifier open_f;
	@@
	struct file_operations fops = {
	...
	 .open = open_f,
	...
	};
	// use no_llseek if we call nonseekable_open
	////////////////////////////////////////////
	@ nonseekable1 depends on !has_llseek && has_open @
	identifier fops0.fops;
	identifier nso ~= "nonseekable_open";
	@@
	struct file_operations fops = {
	...
	 .open = nso,
	...
	+.llseek = no_llseek, /* nonseekable */
	};
	@ nonseekable2 depends on !has_llseek @
	identifier fops0.fops;
	identifier open.open_f;
	@@
	struct file_operations fops = {
	...
	 .open = open_f,
	...
	+.llseek = no_llseek, /* open uses nonseekable */
	};
	// use seq_lseek for sequential files
	/////////////////////////////////////
	@ seq depends on !has_llseek @
	identifier fops0.fops;
	identifier sr ~= "seq_read";
	@@
	struct file_operations fops = {
	...
	 .read = sr,
	...
	+.llseek = seq_lseek, /* we have seq_read */
	};
	// use default_llseek if there is a readdir
	///////////////////////////////////////////
	@ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
	identifier fops0.fops;
	identifier readdir_e;
	@@
	// any other fop is used that changes pos
	struct file_operations fops = {
	...
	 .readdir = readdir_e,
	...
	+.llseek = default_llseek, /* readdir is present */
	};
	// use default_llseek if at least one of read/write touches f_pos
	/////////////////////////////////////////////////////////////////
	@ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
	identifier fops0.fops;
	identifier read.read_f;
	@@
	// read fops use offset
	struct file_operations fops = {
	...
	 .read = read_f,
	...
	+.llseek = default_llseek, /* read accesses f_pos */
	};
	@ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
	identifier fops0.fops;
	identifier write.write_f;
	@@
	// write fops use offset
	struct file_operations fops = {
	...
	 .write = write_f,
	...
	+ .llseek = default_llseek, /* write accesses f_pos */
	};
	// Use noop_llseek if neither read nor write accesses f_pos
	///////////////////////////////////////////////////////////
	@ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
	identifier fops0.fops;
	identifier read_no_fpos.read_f;
	identifier write_no_fpos.write_f;
	@@
	// write fops use offset
	struct file_operations fops = {
	...
	 .write = write_f,
	 .read = read_f,
	...
	+.llseek = noop_llseek, /* read and write both use no f_pos */
	};
	@ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
	identifier fops0.fops;
	identifier write_no_fpos.write_f;
	@@
	struct file_operations fops = {
	...
	 .write = write_f,
	...
	+.llseek = noop_llseek, /* write uses no f_pos */
	};
	@ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
	identifier fops0.fops;
	identifier read_no_fpos.read_f;
	@@
	struct file_operations fops = {
	...
	 .read = read_f,
	...
	+.llseek = noop_llseek, /* read uses no f_pos */
	};
	@ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
	identifier fops0.fops;
	@@
	struct file_operations fops = {
	...
	+.llseek = noop_llseek, /* no read or write fn */
	};
	===== End semantic patch =====
	Signed-off-by: Arnd Bergmann <arnd@arndb.de>
	Cc: Julia Lawall <julia@diku.dk>
	Cc: Christoph Hellwig <hch@infradead.org>
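The rule order in the semantic patch above can be summarized as a small decision function. This is only a userspace sketch: `struct fops_facts`, `enum llseek_choice` and `pick_llseek()` are hypothetical names invented here, and the strict priority order is a simplification (in the patch itself, the nonseekable and seq rules each depend only on `!has_llseek`).

```c
#include <stdbool.h>

/* Facts the semantic patch derives about one struct file_operations.
 * Field names are illustrative; they are not identifiers from the patch. */
struct fops_facts {
	bool has_llseek;          /* .llseek is already set */
	bool open_is_nonseekable; /* .open calls nonseekable_open() */
	bool read_is_seq_read;    /* .read = seq_read */
	bool has_readdir;         /* .readdir is present */
	bool rw_touches_pos;      /* read or write dereferences the offset */
};

enum llseek_choice {
	KEEP_EXISTING,
	NO_LLSEEK,
	SEQ_LSEEK,
	DEFAULT_LLSEEK,
	NOOP_LLSEEK
};

/* Apply the rules in the order the commit message lists them. */
static enum llseek_choice pick_llseek(const struct fops_facts *f)
{
	if (f->has_llseek)
		return KEEP_EXISTING;
	if (f->open_is_nonseekable)
		return NO_LLSEEK;
	if (f->read_is_seq_read)
		return SEQ_LSEEK;
	if (f->has_readdir || f->rw_touches_pos)
		return DEFAULT_LLSEEK;
	/* Nothing looks at the offset: seeking stays a harmless no-op. */
	return NOOP_LLSEEK;
}
```

A driver whose open routine calls nonseekable_open() therefore ends up with no_llseek even if its read routine touches the offset, which matches the `!nonseekable1 && !nonseekable2` guards on the later rules.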
2010-08-12	Revert "fsnotify: store struct file not struct path"	Linus Torvalds
	This reverts commit 3bcf3860a4ff9bbc522820b4b765e65e4deceb3e (and the accompanying commit c1e5c954020e "vfs/fsnotify: fsnotify_close can delay the final work in fput" that was a horribly ugly hack to make it work at all). The 'struct file' approach not only causes that disgusting hack, it somehow breaks pulseaudio, probably due to some other subtlety with f_count handling. Fix up various conflicts due to later fsnotify work. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-07-28	fanotify: use both marks when possible	Eric Paris
	When given a vfsmount_mark, fanotify currently looks up the corresponding inode mark, if it exists. This patch drops that lookup and uses the mark provided. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: pass both the vfsmount mark and inode mark	Eric Paris
	should_send_event() and handle_event() will both need to look up the inode event if they get a vfsmount event. Let's just pass both at the same time since we have them both after walking the lists in lockstep. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: remove group->mask	Eric Paris
	group->mask is now useless. It was originally a shortcut for fsnotify to save on performance. These checks are now redundant, so we remove them. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: cleanup should_send_event	Eric Paris
	The change to use srcu and walk the object list rather than the global fsnotify_group list means that should_send_event is no longer needed for a number of groups and can be simplified for others. Do that. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	inotify: use the mark in handler functions	Eric Paris
	inotify now gets a mark in the should_send_event and handle_event functions. Rather than look up the mark themselves inotify should just use the mark it was handed. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: send fsnotify_mark to groups in event handling functions	Eric Paris
	With the change of fsnotify to use srcu walking the marks list instead of walking the global groups list we now know the mark in question. The code can send the mark to the group's handling functions and the groups won't have to find those marks themselves. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: store struct file not struct path	Eric Paris
	Al explains that calling dentry_open() with a mnt/dentry pair is only guaranteed to be safe if they are already used in an open struct file. To make sure this is the case, don't store and use a struct path in fsnotify; always use a struct file. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: fsnotify_add_notify_event should return an event	Eric Paris
	Rather than the horrific void ** argument and such just to pass the fanotify_merge event back to the caller of fsnotify_add_notify_event(), have those things return an event if it was different from the event suggested to be added. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: add pr_debug throughout	Eric Paris
	It can be hard to debug fsnotify since there are so few printks. Use pr_debug to allow for dynamic debugging. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	inotify: Fix mask checks	Jerome Marchand
	The mask checks in inotify_update_existing_watch() and inotify_new_watch() are useless because inotify_arg_to_mask() sets FS_IN_IGNORED and FS_EVENT_ON_CHILD bits anyway. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	inotify: force inotify and fsnotify use same bits	Eric Paris
	inotify uses bits called IN_* and fsnotify uses bits called FS_*. These need to line up. This patch adds build time checks to make sure no one can change these bits so that they no longer match. Signed-off-by: Eric Paris <eparis@redhat.com>
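A build-time check of this kind can be sketched in userspace with C11 `_Static_assert`. The `DEMO_FS_*` macros below are illustrative local copies made for this example (the real FS_* definitions are kernel-internal), and `demo_in_to_fs()` is a hypothetical helper; only the IN_* constants come from `<sys/inotify.h>`.

```c
#include <stdint.h>
#include <sys/inotify.h>

/* Illustrative local copies of a few fsnotify bits (assumed values,
 * defined here only so the example is self-contained). */
#define DEMO_FS_ACCESS 0x00000001u
#define DEMO_FS_MODIFY 0x00000002u
#define DEMO_FS_ATTRIB 0x00000004u
#define DEMO_FS_OPEN   0x00000020u

/* Compilation fails if anyone changes one side without the other. */
_Static_assert(IN_ACCESS == DEMO_FS_ACCESS, "IN_ACCESS and FS_ACCESS differ");
_Static_assert(IN_MODIFY == DEMO_FS_MODIFY, "IN_MODIFY and FS_MODIFY differ");
_Static_assert(IN_ATTRIB == DEMO_FS_ATTRIB, "IN_ATTRIB and FS_ATTRIB differ");
_Static_assert(IN_OPEN == DEMO_FS_OPEN, "IN_OPEN and FS_OPEN differ");

/* Because the bits line up, converting a mask is just a filter. */
static uint32_t demo_in_to_fs(uint32_t in_mask)
{
	return in_mask & (DEMO_FS_ACCESS | DEMO_FS_MODIFY |
			  DEMO_FS_ATTRIB | DEMO_FS_OPEN);
}
```

The point of the design is that no translation table is needed at run time; the assertions cost nothing once the build succeeds.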
2010-07-28	inotify: allow users to request not to receive events on unlinked children	Eric Paris
	An inotify watch on a directory will send events for children even if those children have been unlinked. This patch adds a new inotify flag IN_EXCL_UNLINK which allows a watch to specify that it doesn't care about unlinked children. This should fix performance problems seen by tasks which add a watch to /tmp and then are overrun with events when other processes are reading and writing to unlinked files they created in /tmp. https://bugzilla.kernel.org/show_bug.cgi?id=16296 Requested-by: Matthias Clasen <mclasen@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
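From userspace, the flag is simply ORed into the watch mask. The following is a minimal sketch, assuming a libc whose `<sys/inotify.h>` defines IN_EXCL_UNLINK; `excl_unlink_mask()` and `watch_dir()` are hypothetical helper names, and the choice of event bits is arbitrary.

```c
#include <stdint.h>
#include <sys/inotify.h>

/* Mask for a directory watch that stops reporting children once
 * they have been unlinked. The event selection is only an example. */
static uint32_t excl_unlink_mask(void)
{
	return IN_CREATE | IN_MODIFY | IN_DELETE | IN_EXCL_UNLINK;
}

/* Add the watch on an existing inotify descriptor.
 * Returns a watch descriptor, or -1 with errno set. */
static int watch_dir(int inotify_fd, const char *dir)
{
	return inotify_add_watch(inotify_fd, dir, excl_unlink_mask());
}
```

A caller would obtain `inotify_fd` from inotify_init1() and pass a directory such as /tmp, the case described in the commit message.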
2010-07-28	inotify: send IN_UNMOUNT events	Eric Paris
	Since the .31 or so notify rewrite inotify has not sent events about inodes which are unmounted. This patch restores those events. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	inotify: fix inotify oneshot support	Eric Paris
	During the large inotify rewrite to fsnotify I completely dropped support for IN_ONESHOT. Reimplement that support. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	inotify_user.c: make local symbol static	H Hartley Sweeten
	The symbol inotify_max_user_watches is not used outside this file and should be static. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: John McCutchan <john@johnmccutchan.com> Cc: Robert Love <rlove@rlove.org> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: introduce a notification merge argument	Eric Paris
	Each group can define its own notification (and secondary_q) merge function. Inotify does tail drop, fanotify does matching and drop which can actually allocate a completely new event. But for fanotify to properly deal with permissions events it needs to know the new event which was ultimately added to the notification queue. This patch just implements a void ** argument which is passed to the merge function. fanotify can use this field to pass the new event back to higher layers. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: allow marks to not pin inodes in core	Eric Paris
	inotify marks must pin inodes in core. dnotify doesn't technically need to since they are closed when the directory is closed. fanotify also needs to pin inodes in core as it works today. But the next step is to introduce the concept of 'ignored masks', which is actually a mask of events for an inode of no interest. I claim that these should be liberally sent to the kernel and should not pin the inode in core. If the inode is brought back in, the listener will get an event it may have thought excluded, but this is not a serious situation and one any listener should deal with. This patch lays the groundwork for non-pinning inode marks by using lazy inode pinning. We do not pin a mark until it has a non-zero mask entry. If a listener never sets a mask we never pin the inode. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: split generic and inode specific mark code	Eric Paris
	currently all marking is done by functions in inode-mark.c. Some of this is pretty generic and should be instead done in a generic function and we should only put the inode specific code in inode-mark.c Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: take inode->i_lock inside fsnotify_find_mark_entry()	Andreas Gruenbacher
	All callers to fsnotify_find_mark_entry() except one take and release inode->i_lock around the call. Take the lock inside fsnotify_find_mark_entry() instead. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	inotify: rename mark_entry to just mark	Eric Paris
	rename anything in inotify that deals with mark_entry to just be mark. It makes a lot more sense. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: rename fsnotify_find_mark_entry to fsnotify_find_mark	Eric Paris
	the _entry portion of fsnotify functions is useless. Drop it. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: rename fsnotify_mark_entry to just fsnotify_mark	Eric Paris
	The name is long and it serves no real purpose. So rename fsnotify_mark_entry to just fsnotify_mark. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: put inode specific fields in an fsnotify_mark in a union	Eric Paris
	The addition of marks on vfs mounts will be simplified if the inode specific parts of a mark and the vfsmnt specific parts of a mark are actually in a union so naming can be easy. This patch just implements the inode struct and the union. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: include vfsmount in should_send_event when appropriate	Eric Paris
	To ensure that a group does not duplicate events when it receives the same event through both the vfsmount and the inode should_send_event test, we should distinguish those two cases. We pass a vfsmount to this function so groups can make their own determinations. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: drop mask argument from fsnotify_alloc_group	Eric Paris
	Nothing uses the mask argument to fsnotify_alloc_group. This patch drops that argument. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: fsnotify_obtain_group should be fsnotify_alloc_group	Eric Paris
	fsnotify_obtain_group was intended to be able to find an already existing group. Nothing uses that functionality. This just renames it to fsnotify_alloc_group so it is clear what it is doing. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: remove group_num altogether	Eric Paris
	The original fsnotify interface has a group_num which was intended to be able to find a group after it was added. I no longer think this is a necessary thing to do and so we remove the group_num. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: per group notification queue merge types	Eric Paris
	inotify only wishes to merge a new event with the last event on the notification fifo. fanotify is willing to merge any events including by means of bitwise OR masks of multiple events together. This patch moves the inotify event merging logic out of the generic fsnotify notification.c and into the inotify code. This allows each use of fsnotify to provide their own merge functionality. Signed-off-by: Eric Paris <eparis@redhat.com>
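The difference between the two merge policies can be illustrated with a toy queue that takes a merge callback. Everything here (`demo_event`, `demo_queue`, `demo_merge_fn` and the helpers) is hypothetical userspace code written for this example, not the fsnotify data structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_QUEUE_MAX 16

struct demo_event { int object; uint32_t mask; };
struct demo_queue { struct demo_event ev[DEMO_QUEUE_MAX]; size_t len; };

/* A merge callback returns true if the new event was absorbed. */
typedef bool (*demo_merge_fn)(struct demo_queue *q,
			      const struct demo_event *n);

/* inotify-like policy: only compare against the last queued event. */
static bool demo_tail_merge(struct demo_queue *q, const struct demo_event *n)
{
	if (q->len == 0)
		return false;
	const struct demo_event *last = &q->ev[q->len - 1];
	return last->object == n->object && last->mask == n->mask;
}

/* fanotify-like policy: OR the mask into any event on the same object. */
static bool demo_or_merge(struct demo_queue *q, const struct demo_event *n)
{
	for (size_t i = 0; i < q->len; i++) {
		if (q->ev[i].object == n->object) {
			q->ev[i].mask |= n->mask;
			return true;
		}
	}
	return false;
}

/* Generic queueing code: the caller's group supplies the policy. */
static bool demo_add_event(struct demo_queue *q, const struct demo_event *n,
			   demo_merge_fn merge)
{
	if (merge && merge(q, n))
		return true;
	if (q->len == DEMO_QUEUE_MAX)
		return false; /* queue full; overflow handling omitted */
	q->ev[q->len++] = *n;
	return true;
}
```

Keeping the policy behind a function pointer is what lets the generic notification code stay unaware of which user it is serving.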
2010-07-28	fsnotify: include data in should_send calls	Eric Paris
	fanotify is going to need to look at file->private_data to know if an event should be sent or not. This passes the data (which might be a file, dentry, inode, or none) to the should_send function calls so fanotify can get that information when available Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	fsnotify: provide the data type to should_send_event	Eric Paris
	fanotify is only interested in event types which contain enough information to open the original file in the context of the fanotify listener. Since fanotify may not want to send events if that data isn't present we pass the data type to the should_send_event function call so fanotify can express its lack of interest. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	inotify: do not spam console without limit	Eric Paris
	inotify was supposed to have a dmesg printk rate limiter which would cause inotify to only emit one message per boot. The static bool was never set so it kept firing messages. This patch correctly limits warnings in multiple places. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	inotify: remove inotify in kernel interface	Eric Paris
	nothing uses inotify in the kernel, drop it! Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	inotify: do not reuse watch descriptors	Eric Paris
	Prior to 2.6.31 inotify would not reuse watch descriptors until all of them had been used at least once. After the rewrite inotify would reuse watch descriptors. The selinux utility 'restorecond' was found to have problems when watch descriptors were reused. This patch reverts to the pre inotify rewrite behavior to not reuse watch descriptors. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	inotify: use container_of instead of casting	Eric Paris
	inotify_free_mark casts directly from an fsnotify_mark_entry to an inotify_inode_mark_entry. This works, but should use container_of instead for future proofing. Signed-off-by: Eric Paris <eparis@redhat.com>
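Why container_of is the safer choice can be shown with a userspace sketch. The `demo_` names are hypothetical; the macro is a simplified offsetof-based version without the kernel macro's type checking. A plain cast only works while the embedded struct is the first member, whereas the offset-based conversion keeps working if a field is added in front of it.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's container_of(). */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_mark { int refcnt; };

struct demo_inotify_mark {
	int wd;                    /* a leading field would break a plain cast */
	struct demo_mark fsn_mark; /* embedded generic mark */
};

/* Recover the outer structure from a pointer to the embedded member. */
static struct demo_inotify_mark *demo_to_inotify_mark(struct demo_mark *m)
{
	return demo_container_of(m, struct demo_inotify_mark, fsn_mark);
}
```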
2010-07-28	fsnotify: allow addition of duplicate fsnotify marks	Eric Paris
	This patch allows a task to add a second fsnotify mark to an inode for the same group. This mark will be added to the end of the inode's list and this will never be found by the standard fsnotify_find_mark() function. This is useful if a user wants to add a new mark before removing the old one. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28	inotify: simplify the inotify idr handling	Eric Paris
	This patch moves all of the idr editing operations into their own idr functions. It makes it easier to prove locking correctness and to understand the code flow. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-05-21	Saner locking around deactivate_super()	Al Viro
	Make sure that s_umount is acquired before we drop the final active reference; we still have the fast path (atomic_dec_unless) and we have gotten rid of the window between the moment when s_active hits zero and s_umount is acquired. Which simplifies the living hell out of grab_super() and inotify pin_to_kill() stuff. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-21	get rid of S_BIAS	Al Viro
	use atomic_inc_not_zero(&sb->s_active) instead of playing games with checking ->s_count > S_BIAS Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
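The semantics of "increment unless the count is already zero" can be sketched in userspace with C11 atomics. `demo_inc_not_zero()` is a hypothetical stand-in written for this example; it is not the kernel's implementation of atomic_inc_not_zero().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Take a reference only if the object is still live (count != 0).
 * Returns true if the count was incremented. */
static bool demo_inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value
		 * and the loop re-checks it against zero. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}
```

The caller gets a clear answer in one step: either it now holds a reference, or the object was already on its way out and must not be touched.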
2010-05-14	Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify	Linus Torvalds
	* 'for-linus' of git://git.infradead.org/users/eparis/notify:
	  inotify: don't leak user struct on inotify release
	  inotify: race use after free/double free in inotify inode marks
	  inotify: clean up the inotify_add_watch out path
	  Inotify: undefined reference to `anon_inode_getfd'
	Manual merge to remove duplicate "select ANON_INODES" from Kconfig file
2010-05-14	inotify: don't leak user struct on inotify release	Pavel Emelyanov
	inotify_new_group() receives a get_uid-ed user_struct and saves the reference on group->inotify_data.user. The problem is that free_uid() is never called on it. The issue seems to have been introduced by 63c882a0 (inotify: reimplement inotify using fsnotify) after 2.6.30. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Eric Paris <eparis@parisplace.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Eric Paris <eparis@redhat.com>
2010-05-14	inotify: race use after free/double free in inotify inode marks	Eric Paris
	There is a race in the inotify add/rm watch code. A task can find and remove a mark which doesn't have all of its references. This can result in a use after free/double free situation.
	Task A                                   Task B
	------------                             -----------
	inotify_new_watch()
	 allocate a mark (refcnt == 1)
	 add it to the idr
	                                         inotify_rm_watch()
	                                          inotify_remove_from_idr()
	                                           fsnotify_put_mark()
	                                            refcnt hits 0, free
	 take reference because we are on idr
	 [at this point it is a use after free]
	 [time goes on]
	 refcnt may hit 0 again, double free
	The fix is to take the reference BEFORE the object can be found in the idr.
	Signed-off-by: Eric Paris <eparis@redhat.com> Cc: <stable@kernel.org>
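The ordering that the fix establishes can be sketched with a toy, single-threaded model. All names here (`demo_mark`, `demo_idr`, and the helpers) are hypothetical; the array stands in for the idr, and a `freed` flag replaces an actual free so the state can be inspected.

```c
#include <stddef.h>

struct demo_mark { int refcnt; int freed; };

/* Stand-in for the idr: a lookup table indexed by watch descriptor. */
static struct demo_mark *demo_idr[8];

static void demo_put(struct demo_mark *m)
{
	if (--m->refcnt == 0)
		m->freed = 1; /* the real code would free the mark here */
}

/* Fixed ordering: take the table's reference BEFORE the mark becomes
 * findable, so a concurrent remover can never drop the last reference
 * while the adder is still using the mark. */
static void demo_add_to_idr(struct demo_mark *m, int wd)
{
	m->refcnt++;      /* reference owned by the table */
	demo_idr[wd] = m; /* only now can others find it */
}

static void demo_remove_from_idr(int wd)
{
	struct demo_mark *m = demo_idr[wd];

	if (!m)
		return;
	demo_idr[wd] = NULL;
	demo_put(m); /* drops the table's reference only */
}
```

With this ordering, a removal that runs immediately after the add leaves the creator's own reference intact, so the count cannot reach zero under the creator's feet.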
2010-05-14	inotify: clean up the inotify_add_watch out path	Eric Paris
	inotify_add_watch explicitly frees the unused inode mark, but it can just use the generic code. Just do that. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-05-12	Inotify: undefined reference to `anon_inode_getfd'	Russell King
	Fix: fs/built-in.o: In function `sys_inotify_init1': summary.c:(.text+0x347a4): undefined reference to `anon_inode_getfd' found by kautobuild with arms bcmring_defconfig, which ends up with INOTIFY_USER enabled (through the 'default y') but leaves ANON_INODES unset. However, inotify_user.c uses anon_inode_getfd(). Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Eric Paris <eparis@redhat.com>
2010-04-30	Inotify: Fix build failure in inotify user support	Ralf Baechle
	CONFIG_INOTIFY_USER defined but CONFIG_ANON_INODES undefined will result in the following build failure:
	LD vmlinux
	fs/built-in.o: In function 'sys_inotify_init1':
	(.text.sys_inotify_init1+0x22c): undefined reference to 'anon_inode_getfd'
	fs/built-in.o: In function 'sys_inotify_init1':
	(.text.sys_inotify_init1+0x22c): relocation truncated to fit: R_MIPS_26 against 'anon_inode_getfd'
	make[2]: *** [vmlinux] Error 1
	make[1]: *** [sub-make] Error 2
	make: *** [all] Error 2
	Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-02-19	switch inotify_user to anon_inode	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-01-15	inotify: only warn once for inotify problems	Eric Paris
	inotify will WARN() if it finds that the idr and the fsnotify internals somehow got out of sync. It was only supposed to do this once but due to this stupid bug it would warn every single time a problem was detected. Signed-off-by: Eric Paris <eparis@redhat.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-15	inotify: do not reuse watch descriptors	Eric Paris
	Since commit 7e790dd5fc937bc8d2400c30a05e32a9e9eef276 ("inotify: fix error paths in inotify_update_watch") inotify changed the manner in which it gave watch descriptors back to userspace. Previous to this commit inotify acted like the following:
	  inotify_add_watch(X, Y, Z) = 1
	  inotify_rm_watch(X, 1);
	  inotify_add_watch(X, Y, Z) = 2
	but after this patch inotify would return watch descriptors like so:
	  inotify_add_watch(X, Y, Z) = 1
	  inotify_rm_watch(X, 1);
	  inotify_add_watch(X, Y, Z) = 1
	which I saw as equivalent to opening an fd where
	  open(file) = 1;
	  close(1);
	  open(file) = 1;
	seemed perfectly reasonable. The issue is that quite a bit of userspace apparently relies on the behavior in which watch descriptors will not be quickly reused. KDE relies on it, I know some selinux packages rely on it, and I have heard complaints from other random sources such as debian bug 558981.
	Although the man page implies what we do is ok, we broke userspace so this patch almost reverts us to the old behavior. It is still slightly racy and I have patches that would fix that, but they are rather large and this will fix it for all real world cases. The race is as follows:
	- task1 creates a watch and blocks in idr_new_watch() before it updates the hint.
	- task2 creates a watch and updates the hint.
	- task1 updates the hint with its older wd
	- task removes the watch created by task2
	- task adds a new watch and will reuse the wd originally given to task2
	The full fix requires moving some locking around the hint (last_wd), but this patch should solve the problem for the real world and be -stable safe.
	As a side effect this patch papers over a bug in the lib/idr code which is causing a large number of WARNs to pop on people's systems and many reports in kerneloops.org. I'm working on the root cause of that idr bug separately but this should make inotify immune to that issue.
	Signed-off-by: Eric Paris <eparis@redhat.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
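The old behavior that this patch restores can be modelled with a toy allocator that always searches above a "last handed out" hint. This is a userspace sketch with hypothetical names (`demo_wd_table`, `demo_add_watch`, `demo_rm_watch`); it does not use the kernel idr API, and it ignores both wraparound and the locking issue described in the message.

```c
#include <stdbool.h>

#define DEMO_MAX_WD 64

struct demo_wd_table {
	bool used[DEMO_MAX_WD + 1]; /* index 0 unused; descriptors start at 1 */
	int last_wd;                /* hint: most recently allocated descriptor */
};

/* Hand out the lowest free descriptor strictly above the hint, so a
 * just-removed descriptor is not reused immediately.
 * Returns -1 when the toy table is exhausted. */
static int demo_add_watch(struct demo_wd_table *t)
{
	for (int wd = t->last_wd + 1; wd <= DEMO_MAX_WD; wd++) {
		if (!t->used[wd]) {
			t->used[wd] = true;
			t->last_wd = wd;
			return wd;
		}
	}
	return -1;
}

/* Removing a watch frees the slot but leaves the hint alone. */
static void demo_rm_watch(struct demo_wd_table *t, int wd)
{
	if (wd >= 1 && wd <= DEMO_MAX_WD)
		t->used[wd] = false;
}
```

Adding, removing and adding again yields descriptors 1 and then 2, the sequence the commit message describes as the behavior userspace had come to rely on.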