path: root/fs
Age | Commit message | Author
2008-02-08libfs: make simple attributes interruptibleChristoph Hellwig
Use mutex_lock_interruptible in simple_attr_read/write. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: <stefano.brivio@polimi.it> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg KH <greg@kroah.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08libfs: allow error return from simple attributesChristoph Hellwig
Sometimes simple attributes might need to return an error, e.g. for acquiring a mutex interruptibly. In fact we have that situation in spufs already which is the original user of the simple attributes. This patch merges the temporarily forked attributes in spufs back into the main ones and allows them to return errors. [akpm@linux-foundation.org: build fix] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: <stefano.brivio@polimi.it> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg KH <greg@kroah.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
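As a rough userspace illustration of the idea (not the libfs code itself), a getter can report failure through its return value and hand the attribute value back through a pointer. Every name below (`demo_attr`, `demo_attr_read`, and so on) is hypothetical, and `-EINTR` stands in for an interrupted lock acquisition.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical model: the getter returns 0 or a negative errno and
 * passes the attribute value back through *val. */
typedef int (*demo_get_fn)(void *data, uint64_t *val);

struct demo_attr {
	demo_get_fn get;
	void *data;
};

/* Getter that succeeds and reports the stored value. */
static int demo_get_ok(void *data, uint64_t *val)
{
	*val = *(uint64_t *)data;
	return 0;
}

/* Getter that fails, e.g. because taking a lock was interrupted. */
static int demo_get_interrupted(void *data, uint64_t *val)
{
	(void)data;
	(void)val;
	return -EINTR;
}

/* Format the attribute into buf. Returns the number of characters
 * written, or the negative errno reported by the getter. */
static int demo_attr_read(const struct demo_attr *attr, char *buf, size_t len)
{
	uint64_t val;
	int ret = attr->get(attr->data, &val);

	if (ret)
		return ret;
	return snprintf(buf, len, "%llu\n", (unsigned long long)val);
}
```

The point of the sketch is only that the read path can now propagate the getter's error instead of being forced to print some value.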
2008-02-08write_inode_now(): avoid unnecessary synchronous writeMike Galbraith
We shouldn't use WB_SYNC_ALL if the caller is asking for asynchronous treatment. Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
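A minimal sketch of the decision described above, assuming a boolean `sync` argument from the caller. The enum and function names are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's writeback sync modes. */
enum demo_sync_mode {
	DEMO_WB_SYNC_NONE,	/* start the I/O, do not wait for it */
	DEMO_WB_SYNC_ALL,	/* wait for every page to be written */
};

/* Pick the sync mode from what the caller asked for, instead of
 * unconditionally using the synchronous mode. */
static enum demo_sync_mode demo_pick_sync_mode(bool sync)
{
	return sync ? DEMO_WB_SYNC_ALL : DEMO_WB_SYNC_NONE;
}
```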
2008-02-08Allow executables larger than 2GBAndi Kleen
This allows us to use executables >2GB. Based on a patch by Dave Anderson Signed-off-by: Andi Kleen <ak@suse.de> Cc: Dave Anderson <anderson@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08ufs: fix symlink creation on ufs2Evgeniy Dushistov
If we create a symlink on a UFS2 filesystem under Linux, it looks wrong under other OSes, because the max symlink length field was not initialized properly, and data blocks were not used to save short symlink names. [akpm@linux-foundation.org: add missing fs32_to_cpu()] Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Cc: Steven <stevenaaus@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08ext2: remove unused ext2_put_inode prototypeChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08aio: negative offset should return -EINVALRusty Russell
An AIO read or write should return -EINVAL if the offset is negative. This check matches the one in pread and pwrite. This was found by the libaio test suite. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
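The check can be pictured as a small validation step that runs before any I/O is queued. This is a sketch with a hypothetical helper name, not the fs/aio.c code.

```c
#include <errno.h>
#include <stdint.h>

/* Reject negative file offsets up front, mirroring the rule that
 * pread and pwrite apply. Returns 0 if the offset is acceptable. */
static int demo_check_aio_offset(int64_t offset)
{
	return offset < 0 ? -EINVAL : 0;
}
```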
2008-02-08aio: partial write should not return error codeRusty Russell
When an AIO write gets an error after writing some data (eg. ENOSPC), it should return the amount written already, not the error. Just like write() is supposed to. This was found by the libaio test suite. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-By: Zach Brown <zach.brown@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
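One way to express the rule, assuming the write path tracks both the bytes already written and the error that stopped it (the helper name and parameters are hypothetical):

```c
#include <errno.h>

/* If some data was written before the error hit, report the byte
 * count; only report the (negative) error when nothing was written. */
static long demo_write_result(long written, int err)
{
	if (written > 0)
		return written;
	return err;
}
```

A caller that sees a short count can retry the remainder and will then receive the error itself, which is the same contract write() follows.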
2008-02-08ext3: replace all adds to little endians variables with le*_add_cpuMarcin Slusarz
replace all: little_endian_variable = cpu_to_leX(leX_to_cpu(little_endian_variable) + expression_in_cpu_byteorder); with: leX_add_cpu(&little_endian_variable, expression_in_cpu_byteorder); sparse didn't generate any new warning with this patch Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08byteorder: move le32_add_cpu & friends from OCFS2 to coreMarcin Slusarz
This patchset moves le*_add_cpu and be*_add_cpu functions from OCFS2 to core header (1st), converts ext3 filesystem to this API (2nd) and replaces XFS's differently named functions with the new ones (3rd).
There are many places where these functions will be useful. Just look at: grep -r 'cpu_to_[ble12346]*([ble12346]*_to_cpu.*[-+]' linux-src/
The patch for ext3 is an example of how conversions will probably look.
This patch:
- move inline functions which add native byte order variable to little/big endian variable to core header
  * le16_add_cpu(__le16 *var, u16 val)
  * le32_add_cpu(__le32 *var, u32 val)
  * le64_add_cpu(__le64 *var, u64 val)
  * be32_add_cpu(__be32 *var, u32 val)
- add for completeness:
  * be16_add_cpu(__be16 *var, u16 val)
  * be64_add_cpu(__be64 *var, u64 val)
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Acked-by: Mark Fasheh <mark.fasheh@oracle.com> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
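To show what such a helper does, here is a portable userspace sketch of the 32-bit little-endian case. The `demo_le32` byte-array type and the `demo_` functions are assumptions made for illustration; the kernel versions operate on `__le32` and use its own byte-order macros.

```c
#include <stdint.h>

/* A 32-bit little-endian value kept as raw bytes, standing in for
 * the kernel's __le32 type. */
typedef struct {
	uint8_t b[4];
} demo_le32;

/* Convert a native value to its little-endian byte layout. */
static demo_le32 demo_cpu_to_le32(uint32_t v)
{
	demo_le32 r;

	r.b[0] = (uint8_t)(v & 0xff);
	r.b[1] = (uint8_t)((v >> 8) & 0xff);
	r.b[2] = (uint8_t)((v >> 16) & 0xff);
	r.b[3] = (uint8_t)((v >> 24) & 0xff);
	return r;
}

/* Convert a little-endian byte layout back to a native value. */
static uint32_t demo_le32_to_cpu(demo_le32 v)
{
	return (uint32_t)v.b[0] |
	       ((uint32_t)v.b[1] << 8) |
	       ((uint32_t)v.b[2] << 16) |
	       ((uint32_t)v.b[3] << 24);
}

/* Add a native-order value to a little-endian variable in place:
 * the open-coded "convert, add, convert back" pattern in one call. */
static void demo_le32_add_cpu(demo_le32 *var, uint32_t val)
{
	*var = demo_cpu_to_le32(demo_le32_to_cpu(*var) + val);
}
```

Adding 1 to a stored 0xff carries into the second byte, so the helper cannot be replaced by arithmetic on the raw bytes on a big-endian host.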
2008-02-08fs: remove fastcall, it is always emptyHarvey Harrison
[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08rewrite rdNick Piggin
This is a rewrite of the ramdisk block device driver.
The old one is really difficult because it effectively implements a block device which serves data out of its own buffer cache. It relies on the dirty bit being set, to pin its backing store in cache, however there are non trivial paths which can clear the dirty bit (eg. try_to_free_buffers()), which has recently led to data corruption. And in general it is completely wrong for a block device driver to do this.
The new one is more like a regular block device driver. It has no idea about vm/vfs stuff. Its backing store is similar to the buffer cache (a simple radix-tree of pages), but it doesn't know anything about page cache (the pages in the radix tree are not pagecache pages).
There is one slight downside -- direct block device access and filesystem metadata access goes through an extra copy and gets stored in RAM twice. However, this downside is only slight, because the real buffercache of the device is now reclaimable (because we're not playing crazy games with it), so under memory intensive situations, footprint should effectively be the same -- maybe even a slight advantage to the new driver because it can also reclaim buffer heads.
The fact that it now goes through all the regular vm/fs paths makes it much more useful for testing, too.
   text    data     bss     dec     hex filename
   2837     849     384    4070     fe6 drivers/block/rd.o
   3528     371      12    3911     f47 drivers/block/brd.o
Text is larger, but data and bss are smaller, making total size smaller.
A few other nice things about it:
- Similar structure and layout to the new loop device handling.
- Dynamic ramdisk creation.
- Runtime flexible buffer head size (because it is no longer part of the ramdisk code).
- Boot / load time flexible ramdisk size, which could easily be extended to a per-ramdisk runtime changeable size (eg. with an ioctl).
- Can use highmem for the backing store.
[akpm@linux-foundation.org: fix build] [byron.bbradley@gmail.com: make rd_size non-static] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Byron Bradley <byron.bbradley@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08aout: remove unnecessary inclusions of {asm, linux}/a.out.hDavid Howells
Remove now unnecessary inclusions of {asm,linux}/a.out.h. [akpm@linux-foundation.org: fix alpha build] Signed-off-by: David Howells <dhowells@redhat.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08aout: suppress A.OUT library support if !CONFIG_ARCH_SUPPORTS_AOUTDavid Howells
Suppress A.OUT library support if CONFIG_ARCH_SUPPORTS_AOUT is not set. Not all architectures support the A.OUT binfmt, so the ELF binfmt should not be permitted to go looking for A.OUT libraries to load in such a case. Not only that, but under such conditions A.OUT core dumps are not produced either. To make this work, this patch also does the following: (1) Makes the existence of the contents of linux/a.out.h contingent on CONFIG_ARCH_SUPPORTS_AOUT. (2) Renames dump_thread() to aout_dump_thread() as it's only called by A.OUT core dumping code. (3) Moves aout_dump_thread() into asm/a.out-core.h and makes it inline. This is then included only where needed. This means that this bit of arch code will be stored in the appropriate A.OUT binfmt module rather than the core kernel. (4) Drops A.OUT support for Blackfin (according to Mike Frysinger it's not needed) and FRV. This patch depends on the previous patch to move STACK_TOP[_MAX] out of asm/a.out.h and into asm/processor.h as they're required whether or not A.OUT format is available. [jdike@addtoit.com: uml: re-remove accidentally restored code] Signed-off-by: David Howells <dhowells@redhat.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08Pidns: make full use of xxx_vnr() callsPavel Emelyanov
Some time ago the xxx_vnr() calls (e.g. pid_vnr or find_task_by_vpid) were _all_ converted to operate on the current pid namespace. After this each call like xxx_nr_ns(foo, current->nsproxy->pid_ns) is nothing but a xxx_vnr(foo) one. Switch all the xxx_nr_ns() callers to use the xxx_vnr() calls where appropriate. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Reviewed-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08ITIMER_REAL: convert to use struct pidOleg Nesterov
signal_struct->tsk points to the ->group_leader and thus we have the nasty code in de_thread() which has to change it and restart ->real_timer if the leader is changed. Use "struct pid *leader_pid" instead. This also allows us to kill now unneeded send_group_sig_info(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Pavel Emelyanov <xemul@openvz.org> Acked-by: Roland McGrath <roland@redhat.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08proc: fix ->open'less usage due to ->proc_fops flipAlexey Dobriyan
Typical PDE creation code looks like:
    pde = create_proc_entry("foo", 0, NULL);
    if (pde)
        pde->proc_fops = &foo_proc_fops;
Notice that PDE is first created, only then ->proc_fops is set up to final value. This is a problem because right after creation a) PDE is fully visible in /proc , and b) ->proc_fops are proc_file_operations which do not have ->open callback. So, it's possible to ->read without ->open (see one class of oopses below).
The fix is new API called proc_create() which makes sure ->proc_fops are set up before gluing PDE to main tree. Typical new code looks like:
    pde = proc_create("foo", 0, NULL, &foo_proc_fops);
    if (!pde)
        return -ENOMEM;
Fix most networking users for a start. In the long run, create_proc_entry() for regular files will go.
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000024
printing eip: c1188c1b *pdpt = 000000002929e001 *pde = 0000000000000000
Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
last sysfs file: /sys/block/sda/sda1/dev
Modules linked in: foo af_packet ipv6 cpufreq_ondemand loop serio_raw psmouse k8temp hwmon sr_mod cdrom
Pid: 24679, comm: cat Not tainted (2.6.24-rc3-mm1 #2)
EIP: 0060:[<c1188c1b>] EFLAGS: 00210002 CPU: 0
EIP is at mutex_lock_nested+0x75/0x25d
EAX: 000006fe EBX: fffffffb ECX: 00001000 EDX: e9340570
ESI: 00000020 EDI: 00200246 EBP: e9340570 ESP: e8ea1ef8
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process cat (pid: 24679, ti=E8EA1000 task=E9340570 task.ti=E8EA1000)
Stack: 00000000 c106f7ce e8ee05b4 00000000 00000001 458003d0 f6fb6f20 fffffffb
       00000000 c106f7aa 00001000 c106f7ce 08ae9000 f6db53f0 00000020 00200246
       00000000 00000002 00000000 00200246 00200246 e8ee05a0 fffffffb e8ee0550
Call Trace:
 [<c106f7ce>] seq_read+0x24/0x28a
 [<c106f7aa>] seq_read+0x0/0x28a
 [<c106f7ce>] seq_read+0x24/0x28a
 [<c106f7aa>] seq_read+0x0/0x28a
 [<c10818b8>] proc_reg_read+0x60/0x73
 [<c1081858>] proc_reg_read+0x0/0x73
 [<c105a34f>] vfs_read+0x6c/0x8b
 [<c105a6f3>] sys_read+0x3c/0x63
 [<c10025f2>] sysenter_past_esp+0x5f/0xa5
 [<c10697a7>] destroy_inode+0x24/0x33
=======================
INFO: lockdep is turned off.
Code: 75 21 68 e1 1a 19 c1 68 87 00 00 00 68 b8 e8 1f c1 68 25 73 1f c1 e8 84 06 e9 ff e8 52 b8 e7 ff 83 c4 10 9c 5f fa e8 28 89 ea ff <f0> fe 4e 04 79 0a f3 90 80 7e 04 00 7e f8 eb f0 39 76 34 74 33
EIP: [<c1188c1b>] mutex_lock_nested+0x75/0x25d SS:ESP 0068:e8ea1ef8
[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
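The ordering that proc_create() enforces can be modelled in userspace as "initialize fully, then publish". The sketch below is single-threaded and uses hypothetical types (`demo_entry`, `demo_fops`, `demo_tree`); a real concurrent version would also need the locking that the kernel provides around the directory tree.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a file_operations table. */
struct demo_fops {
	int (*open)(void);
};

struct demo_entry {
	const char *name;
	const struct demo_fops *fops;
	struct demo_entry *next;
};

/* Head of the visible list: anything linked here can be looked up. */
static struct demo_entry *demo_tree;

/* Create an entry with its operations already in place, and only
 * then link it into the visible list. Returns NULL on allocation
 * failure. */
static struct demo_entry *demo_create(const char *name,
				      const struct demo_fops *fops)
{
	struct demo_entry *e = calloc(1, sizeof(*e));

	if (!e)
		return NULL;
	e->name = name;
	e->fops = fops;		/* final value set before publication */
	e->next = demo_tree;
	demo_tree = e;		/* publish last */
	return e;
}
```

With the older two-step pattern, a lookup between creation and the later assignment would find the entry with default operations, which is the window the commit closes.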
2008-02-08proc: fix the threaded /proc/selfEric W. Biederman
Long ago when the CLONE_THREAD support first went in, someone thought it would be wise to point /proc/self at /proc/<tgid> instead of /proc/<pid>. Given that /proc/<tgid> can return information about a very different task (if enough things have been unshared) than our current process, /proc/<tgid> seems blatantly wrong. So far I have yet to think up an example where the current behavior would be advantageous, and I can see several places where it is seriously non-intuitive. We may be stuck with the current broken behavior for backwards compatibility reasons but let's try fixing our ancient bug for the 2.6.25 time frame and see if anyone screams. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: "Guillaume Chazarain" <guichaz@yahoo.fr> Cc: "Pavel Emelyanov" <xemul@openvz.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08proc: proper pidns handling for /proc/selfEric W. Biederman
Currently if you access a /proc that is not mounted with your process's current pid namespace, /proc/self will point at a completely random task. This patch fixes /proc/self to point to the current process if it is available in the particular mount of /proc or to return -ENOENT if the current process is not visible. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
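The lookup rule can be sketched with a toy model in which a task carries one pid number per namespace level it is visible in. The structures and function names here are hypothetical and much simpler than the kernel's struct pid; the sketch only shows "resolve against the namespace of the mount, else -ENOENT".

```c
#include <errno.h>

#define DEMO_MAX_LEVEL 4

/* Toy namespace: identified by its nesting level and an id. */
struct demo_pidns {
	int level;
	int id;
};

/* Toy task: visible at levels 0..level, with one number per level. */
struct demo_task {
	int level;			/* deepest level the task lives in */
	int nr[DEMO_MAX_LEVEL];		/* nr[i]: pid as seen from level i */
	int ns_id[DEMO_MAX_LEVEL];	/* id of the namespace at level i */
};

/* Pid of the task as seen from ns, or 0 if it is not visible there. */
static int demo_pid_nr_ns(const struct demo_task *t,
			  const struct demo_pidns *ns)
{
	if (ns->level < 0 || ns->level >= DEMO_MAX_LEVEL)
		return 0;
	if (ns->level <= t->level && t->ns_id[ns->level] == ns->id)
		return t->nr[ns->level];
	return 0;
}

/* What /proc/self should resolve to for a given mount's namespace:
 * the caller's pid in that namespace, or -ENOENT if invisible. */
static int demo_proc_self(const struct demo_task *current_task,
			  const struct demo_pidns *mount_ns)
{
	int nr = demo_pid_nr_ns(current_task, mount_ns);

	return nr ? nr : -ENOENT;
}
```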
2008-02-08proc: seqfile convert proc_pid_status to properly handle pid namespacesEric W. Biederman
Currently we may look up the pid in the wrong pid namespace. So convert proc_pid_status to seq_file, which ensures the proper pid namespace is passed in. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: another build fix] [akpm@linux-foundation.org: s390 build fix] [akpm@linux-foundation.org: fix task_name() output] [akpm@linux-foundation.org: fix nommu build] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Andrew Morgan <morgan@kernel.org> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08seqfile convert proc_pid_statmEric W. Biederman
This conversion is just for code cleanliness, uniformity, and general safety. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08proc: rewrite do_task_stat to correctly handle pid namespaces.Eric W. Biederman
Currently (as pointed out by Oleg) do_task_stat has a race when calling task_pid_nr_ns with the task exiting. In addition do_task_stat is not currently displaying information in the context of the pid namespace that mounted the /proc filesystem. So "cut -d' ' -f 1 /proc/<pid>/stat" may not equal <pid>. This patch fixes the problem by converting to a single_open seq_file show method, getting the pid namespace from the filesystem superblock instead of current, and simply using the struct pid from the inode instead of attempting to get that same pid from the task. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08proc: implement proc_single_file_operationsEric W. Biederman
Currently many /proc/pid files use a crufty precursor to the current seq_file api, and they don't have direct access to the pid_namespace or the pid for which they are displaying data. So implement proc_single_file_operations to make the seq_file routines easy to use, and to give access to the full state of the pid we are displaying data for. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08proc: detect duplicate names on registrationZhang Rui
Print a warning if PDE is registered with a name which already exists in target directory. Bug report and a simple fix can be found here: http://bugzilla.kernel.org/show_bug.cgi?id=8798 [\n fixlet and no undescriptive variable usage --adobriyan] [akpm@linux-foundation.org: make printk comprehensible] Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08proc: remove useless check on symlink removalAlexey Dobriyan
proc symlinks always have valid ->data containing the destination of the symlink. No need to check it on removal -- proc_symlink() has already done it. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08proc: simplify function prototypesAlexey Dobriyan
Move code around so as to reduce the number of forward-declarations. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08proc: less LOCK operations during lookupAlexey Dobriyan
Pseudo-code for lookup effectively is:
    LOCK kernel
    LOCK proc_subdir_lock
        find PDE
    UNLOCK proc_subdir_lock
    get inode
    LOCK proc_subdir_lock
        goto unlock
    UNLOCK proc_subdir_lock
    UNLOCK kernel
We can get rid of LOCK/UNLOCK pair after getting inode simply by jumping to unlock_kernel() directly. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08proc: remove MODULE_LICENSEAlexey Dobriyan
proc is not modular, so MODULE_LICENSE just expands to empty space. proc without doubt remains GPLed. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08inotify: fix check for one-shot watches before destroying themUlisses Furquim
As the IN_ONESHOT bit is never set when an event is sent we must check it in the watch's mask and not in the event's mask. Signed-off-by: Ulisses Furquim <ulissesf@gmail.com> Reported-by: "Clem Taylor" <clem.taylor@gmail.com> Tested-by: "Clem Taylor" <clem.taylor@gmail.com> Cc: Amy Griffis <amy.griffis@hp.com> Cc: Robert Love <rlove@google.com> Cc: John McCutchan <ttb@tentacle.dhs.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
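The corrected test can be sketched as a predicate over the two masks. The helper name is hypothetical; the constants are redefined locally with the same values that `<sys/inotify.h>` gives `IN_ONESHOT` and `IN_MODIFY` so that the sketch builds on non-Linux hosts too.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_IN_MODIFY	0x00000002u	/* same value as IN_MODIFY */
#define DEMO_IN_ONESHOT	0x80000000u	/* same value as IN_ONESHOT */

/* Decide whether a watch should be destroyed after delivering an
 * event. The one-shot flag lives in the watch's own mask; the
 * event's mask never carries it, so it is deliberately ignored. */
static bool demo_should_remove_watch(uint32_t watch_mask, uint32_t event_mask)
{
	(void)event_mask;
	return (watch_mask & DEMO_IN_ONESHOT) != 0;
}
```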
2008-02-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dmLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: (44 commits)
  dm raid1: report fault status
  dm raid1: handle read failures
  dm raid1: fix EIO after log failure
  dm raid1: handle recovery failures
  dm raid1: handle write failures
  dm snapshot: combine consecutive exceptions in memory
  dm: stripe enhanced status return
  dm: stripe trigger event on failure
  dm log: auto load modules
  dm: move deferred bio flushing to workqueue
  dm crypt: use async crypto
  dm crypt: prepare async callback fn
  dm crypt: add completion for async
  dm crypt: add async request mempool
  dm crypt: extract scatterlist processing
  dm crypt: tidy io ref counting
  dm crypt: introduce crypt_write_io_loop
  dm crypt: abstract crypt_write_done
  dm crypt: store sector mapping in dm_crypt_io
  dm crypt: move queue functions
  ...
2008-02-07Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6Linus Torvalds
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6: (62 commits)
  [XFS] add __init/__exit mark to specific init/cleanup functions
  [XFS] Fix oops in xfs_file_readdir()
  [XFS] kill xfs_root
  [XFS] keep i_nlink updated and use proper accessors
  [XFS] stop updating inode->i_blocks
  [XFS] Make xfs_ail_check check less by default
  [XFS] Move AIL pushing into it's own thread
  [XFS] use generic_permission
  [XFS] stop re-checking permissions in xfs_swapext
  [XFS] clean up xfs_swapext
  [XFS] remove permission check from xfs_change_file_space
  [XFS] prevent panic during log recovery due to bogus op_hdr length
  [XFS] Cleanup various fid related bits:
  [XFS] Fix xfs_lowbit64
  [XFS] Remove CFORK macros and use code directly in IFORK and DFORK macros.
  [XFS] kill superflous buffer locking (2nd attempt)
  [XFS] Use kernel-supplied "roundup_pow_of_two" for simplicity
  [XFS] Remove the BPCSHIFT and NB* based macros from XFS.
  [XFS] Remove bogus assert
  [XFS] optimize XFS_IS_REALTIME_INODE w/o realtime config
  ...
2008-02-08dm ioctl: move compat codeMilan Broz
Move compat_ioctl handling into dm-ioctl.c. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-02-07SUNRPC xptrdma: simplify build configurationJames Lentini
Trond and Bruce,
This is a patch for 2.6.25. This is the same version that was sent out on December 12 for review (no comments to date).
To simplify the RPC/RDMA client and server build configuration, make SUNRPC_XPRT_RDMA a hidden config option that continues to depend on SUNRPC and INFINIBAND. The value of SUNRPC_XPRT_RDMA will be:
- N if either SUNRPC or INFINIBAND are N
- M if both SUNRPC and INFINIBAND are on (M or Y) and at least one is M
- Y if both SUNRPC and INFINIBAND are Y
In 2.6.25, all of the RPC/RDMA related files are grouped in net/sunrpc/xprtrdma and the net/sunrpc/xprtrdma/Makefile builds both the client and server RPC/RDMA support using this config option.
Signed-off-by: James Lentini <jlentini@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
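The three rules for the value of SUNRPC_XPRT_RDMA amount to taking the weaker of the two inputs under the ordering N < M < Y. A small sketch with a hypothetical enum and function name makes that explicit; it models only the rule as stated, not Kconfig itself.

```c
/* Tristate values ordered so that a plain comparison gives N < M < Y. */
enum demo_tristate {
	DEMO_N = 0,
	DEMO_M = 1,
	DEMO_Y = 2,
};

/* The derived option is the minimum of its two dependencies:
 * N if either is N, Y only if both are Y, otherwise M. */
static enum demo_tristate demo_xprt_rdma(enum demo_tristate sunrpc,
					 enum demo_tristate infiniband)
{
	return sunrpc < infiniband ? sunrpc : infiniband;
}
```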
2008-02-07NFS: Fix a potential file corruption issue when writingTrond Myklebust
If the inode is flagged as having an invalid mapping, then we can't rely on the PageUptodate() flag. Ensure that we don't use the "anti-fragmentation" write optimisation in nfs_updatepage(), since that will cause NFS to write out areas of the page that are no longer guaranteed to be up to date.
A potential corruption could occur in the following scenario:
client 1                           client 2
===============                    ===============
                                   fd=open("f",O_CREAT|O_WRONLY,0644);
                                   write(fd,"fubar\n",6); // cache last page
                                   close(fd);
fd=open("f",O_WRONLY|O_APPEND);
write(fd,"foo\n",4);
close(fd);
                                   fd=open("f",O_WRONLY|O_APPEND);
                                   write(fd,"bar\n",4);
                                   close(fd);
-----
The bug may lead to the file "f" reading 'fubar\n\0\0\0\nbar\n' because client 2 does not update the cached page after re-opening the file for write. Instead it keeps it marked as PageUptodate() until someone calls invalidate_inode_pages2() (typically by calling read()).
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-07Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6:
  BKL-removal: Implement a compat_ioctl handler for JFS
  BKL-removal: Use unlocked_ioctl for jfs
2008-02-07Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2Linus Torvalds
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
  ocfs2: Negotiate locking protocol versions.
2008-02-07BKL-removal: Implement a compat_ioctl handler for JFSAndi Kleen
The ioctls were already compatible except for the actual values so this was fairly easy to do. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
2008-02-07BKL-removal: Use unlocked_ioctl for jfsAndi Kleen
Convert jfs_ioctl over to not use the BKL. The only potential race I could see was with two ioctls in parallel changing the flags and losing the updates. Use the i_mutex to protect against this. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
2008-02-07sysfs: remove BUG_ON() from sysfs_remove_group()Greg Kroah-Hartman
It's possible that the caller of sysfs_remove_group messed up and passed in an attribute group that was not really registered to this kobject. But don't panic for such a foolish error, spit out a warning about what happened, and continue on our way safely. Cc: Roland Dreier <rdreier@cisco.com> Cc: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-07Block: Fix whole_disk attribute bugGreg Kroah-Hartman
The "whole_disk" attribute was not properly converted in the block device conversion earlier, and if the file is read, bad things can happen. This patch fixes this, making the attribute an empty one, preserving the original functionality. Many thanks to David Miller for finding this, and pointing me in the proper place within the block code to look. Acked-by: David S. Miller <davem@davemloft.net> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-07Merge git://git.infradead.org/mtd-2.6Linus Torvalds
* git://git.infradead.org/mtd-2.6: (120 commits)
  [MTD] Fix mtdoops.c compilation
  [MTD] [NOR] fix startup lock when using multiple nor flash chips
  [MTD] [DOC200x] eccbuf is statically defined and always evaluate to true
  [MTD] Fix maps/physmap.c compilation with CONFIG_PM
  [MTD] onenand: Add panic_write function to the onenand driver
  [MTD] mtdoops: Use the panic_write function when present
  [MTD] Add mtd panic_write function pointer
  [MTD] [NAND] Freescale enhanced Local Bus Controller FCM NAND support.
  [MTD] physmap.c: Add support for multiple resources
  [MTD] [NAND] Fix misparenthesization introduced by commit 78b65179...
  [MTD] [NAND] Fix Blackfin NFC ECC calculating bug with page size 512 bytes
  [MTD] [NAND] Remove wrong operation in PM function of the BF54x NFC driver
  [MTD] [NAND] Remove unused variable in plat_nand_remove
  [MTD] Unlocking all Intel flash that is locked on power up.
  [MTD] [NAND] at91_nand: Make mtdparts option can override board info
  [MTD] mtdoops: Various minor cleanups
  [MTD] mtdoops: Ensure sequential write to the buffer
  [MTD] mtdoops: Perform write operations in a workqueue
  [MTD] mtdoops: Add further error return code checking
  [MTD] [NOR] Test devtype, not definition in flash_probe(), drivers/mtd/devices/lart.c
  ...
2008-02-07Sanitize the type of struct user.u_ar0H. Peter Anvin
struct user.u_ar0 is defined to contain a pointer offset on all architectures in which it is defined (all architectures which define an a.out format except SPARC.) However, it has a pointer type in the headers, which is pointless -- <asm/user.h> is not exported to userspace, and it just makes the code messy. Redefine the field as "unsigned long" (which is the same size as a pointer on all Linux architectures) and change the setting code to use offsetof() instead of hand-coded arithmetic. Cc: Linux Arch Mailing List <linux-arch@vger.kernel.org> Cc: Bryan Wu <bryan.wu@analog.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Lennert Buytenhek <kernel@wantstofly.org> Cc: Håvard Skinnemoen <hskinnemoen@atmel.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Tony Luck <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
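As an illustration of the offsetof() approach, the sketch below stores the offset of a register block inside a made-up structure. `demo_user` and `demo_regs` are hypothetical layouts chosen for the example and do not match any architecture's real struct user.

```c
#include <stddef.h>

/* Hypothetical register block. */
struct demo_regs {
	unsigned long r[4];
};

/* Hypothetical core-dump header. u_ar0 holds the offset of regs
 * within this structure, so it is an integer rather than a pointer. */
struct demo_user {
	unsigned long u_tsize;
	struct demo_regs regs;
	unsigned long u_ar0;
};

/* Let the compiler compute the offset instead of subtracting
 * addresses by hand. */
static void demo_fill_user(struct demo_user *u)
{
	u->u_ar0 = offsetof(struct demo_user, regs);
}
```

Because offsetof() is evaluated from the type definition, the stored value stays correct if fields are later added ahead of `regs`.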
2008-02-07iget: remove iget() and the read_inode() super op as being obsoleteDavid Howells
Remove the old iget() call and the read_inode() superblock operation it uses as these are really obsolete, and the use of read_inode() does not produce proper error handling (no distinction between ENOMEM and EIO when marking an inode bad).
Furthermore, this removes the temptation to use iget() to find an inode by number in a filesystem from code outside that filesystem. iget_locked() should be used instead.
A new function is added in an earlier patch (iget_failed) that is to be called to mark an inode as bad, unlock it and release it should the get routine fail. Mark iget() and read_inode() as being obsolete and remove references to them from the documentation.
Typically a filesystem will be modified such that the read_inode function becomes an internal iget function, for example the following:
    void thingyfs_read_inode(struct inode *inode)
    {
        ...
    }
would be changed into something like:
    struct inode *thingyfs_iget(struct super_block *sb, unsigned long ino)
    {
        struct inode *inode;
        int ret;
        inode = iget_locked(sb, ino);
        if (!inode)
            return ERR_PTR(-ENOMEM);
        if (!(inode->i_state & I_NEW))
            return inode;
        ...
        unlock_new_inode(inode);
        return inode;
    error:
        iget_failed(inode);
        return ERR_PTR(ret);
    }
and then thingyfs_iget() would be called rather than iget(), for example:
    ret = -EINVAL;
    inode = iget(sb, ino);
    if (!inode || is_bad_inode(inode))
        goto error;
becomes:
    inode = thingyfs_iget(sb, ino);
    if (IS_ERR(inode)) {
        ret = PTR_ERR(inode);
        goto error;
    }
Note that is_bad_inode() does not need to be called. The error returned by thingyfs_iget() should render it unnecessary.
Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
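The calling convention relies on encoding a small negative errno in the pointer value itself. A userspace sketch of that idea follows; the `demo_` helpers mimic the spirit of ERR_PTR(), IS_ERR() and PTR_ERR() but are hypothetical reimplementations, and `demo_iget()` is an invented lookup that treats inode number 0 as invalid purely for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno as a pointer value. */
static void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Recover the errno from an error pointer. */
static long demo_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True if p lies in the top DEMO_MAX_ERRNO addresses, which are
 * assumed never to hold a real object. */
static int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_inode {
	unsigned long ino;
};

/* Return a usable inode or an error pointer, never NULL and never
 * a "bad" inode that the caller has to inspect separately. */
static struct demo_inode *demo_iget(unsigned long ino)
{
	struct demo_inode *inode;

	if (ino == 0)
		return demo_err_ptr(-EINVAL);
	inode = malloc(sizeof(*inode));
	if (!inode)
		return demo_err_ptr(-ENOMEM);
	inode->ino = ino;
	return inode;
}
```

A caller then needs a single test on the returned pointer and can pass the specific error upward, which is what distinguishes ENOMEM from EIO in the converted filesystems.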
2008-02-07iget: stop HPPFS from using iget() and read_inode()David Howells
Stop the HPPFS filesystem from using iget() and read_inode(). Provide an hppfs_iget(), and call that instead of iget(). hppfs_iget() then uses iget_locked() directly and returns a proper error code instead of an inode in the event of an error. hppfs_fill_sb_common() returns any error incurred when getting the root inode instead of EINVAL. Note that the contents of hppfs_kern.c need to be examined: (*) The HPPFS inode retains a pointer to the proc dentry it is shadowing, but whilst it does appear to retain a reference to it, it doesn't appear to destroy the reference if the inode goes away. (*) hppfs_iget() should perhaps subsume init_inode() and hppfs_read_inode(). (*) It would appear that all hppfs inodes are the same inode because iget() was being called with inode number 0, which forms the lookup key. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07iget: stop HOSTFS from using iget() and read_inode()David Howells
Stop the HOSTFS filesystem from using iget() and read_inode(). Provide hostfs_iget(), and call that instead of iget(). hostfs_iget() then uses iget_locked() directly and returns a proper error code instead of an inode in the event of an error. hostfs_fill_sb_common() returns any error incurred when getting the root inode instead of EINVAL. Note that the contents of hostfs_kern.c need to be examined: (*) hostfs_iget() should perhaps subsume init_inode() and hostfs_read_inode(). (*) It would appear that all hostfs inodes are the same inode because iget() was being called with inode number 0 - which forms the lookup key. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: David Howells <dhowells@redhat.com> Cc: Jeff Dike <jdike@addtoit.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07iget: stop OPENPROMFS from using iget() and read_inode()David Howells
Stop the OPENPROMFS filesystem from using iget() and read_inode(). Replace openpromfs_read_inode() with openpromfs_iget(), and call that instead of iget(). openpromfs_iget() then uses iget_locked() directly and returns a proper error code instead of an inode in the event of an error. openpromfs_fill_super() returns any error incurred when getting the root inode instead of ENOMEM (not that it currently incurs any other error). Signed-off-by: David Howells <dhowells@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07iget: stop UFS from using iget() and read_inode()David Howells
Stop the UFS filesystem from using iget() and read_inode(). Replace ufs_read_inode() with ufs_iget(), and call that instead of iget(). ufs_iget() then uses iget_locked() directly and returns a proper error code instead of an inode in the event of an error. ufs_fill_super() returns any error incurred when getting the root inode instead of EINVAL. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: David Howells <dhowells@redhat.com> Cc: Evgeniy Dushistov <dushistov@mail.ru> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07iget: stop the SYSV filesystem from using iget() and read_inode()David Howells
Stop the SYSV filesystem from using iget() and read_inode(). Replace sysv_read_inode() with sysv_iget(), and call that instead of iget(). sysv_iget() then uses iget_locked() directly and returns a proper error code instead of an inode in the event of an error. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07iget: stop ROMFS from using iget() and read_inode()David Howells
Stop the ROMFS filesystem from using iget() and read_inode(). Replace romfs_read_inode() with romfs_iget(), and call that instead of iget(). romfs_iget() then uses iget_locked() directly and returns a proper error code instead of an inode in the event of an error. romfs_fill_super() returns any error incurred when getting the root inode instead of EINVAL. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07iget: stop QNX4 from using iget() and read_inode()David Howells
Stop the QNX4 filesystem from using iget() and read_inode(). Replace qnx4_read_inode() with qnx4_iget(), and call that instead of iget(). qnx4_iget() then uses iget_locked() directly and returns a proper error code instead of an inode in the event of an error. qnx4_fill_super() returns any error incurred when getting the root inode instead of EINVAL. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: David Howells <dhowells@redhat.com> Cc: Anders Larsen <al@alarsen.net> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>