path: root/fs
Age | Commit message | Author
2013-04-16 | nfsd4: remove some useless code | fanchaoting
The "list_empty(&oo->oo_owner.so_stateids)" is always true, so remove it. Signed-off-by: fanchaoting <fanchaoting@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-16 | nfsd4: implement SEQ4_STATUS_RECALLABLE_STATE_REVOKED | J. Bruce Fields
A 4.1 server must notify a client that has had any state revoked using the SEQ4_STATUS_RECALLABLE_STATE_REVOKED flag. The client can figure out exactly which state is the problem using CHECK_STATEID and then free it using FREE_STATEID. The status flag will be unset once all such revoked stateids are freed. Our server's only recallable state is delegations. So we keep with each 4.1 client a list of delegations that have timed out and been recalled, but haven't yet been freed by FREE_STATEID. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-15 | Merge branches 'timers-urgent-for-linus', 'irq-urgent-for-linus' and 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds
Pull {timer,irq,core} fixes from Thomas Gleixner:
- timer: bug fix for a cpu hotplug race.
- irq: single bugfix for a wrong return value, which prevents the calling function from invoking the software fallback.
- core: bugfix which plugs two race conditions which can cause hotplug per cpu threads to end up on the wrong cpu.
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: hrtimer: Don't reinitialize a cpu_base lock on CPU_UP
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip: gic: fix irq_trigger return
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kthread: Prevent unpark race which puts threads on the wrong cpu
2013-04-14 | Merge 3.9-rc7 into driver-core-next | Greg Kroah-Hartman
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-14 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs | Linus Torvalds
Pull one more btrfs fix from Chris Mason: "This has a recent fix from Josef for our tree log replay code. It fixes problems where the inode counter for the number of bytes in the file wasn't getting updated properly during fsync replay. The commit did get rebased this morning, but it was only to clean up the subject line. The code hasn't changed." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: make sure nbytes are right after log replay
2013-04-14 | NFSv4.1: Set the RPC_CLNT_CREATE_INFINITE_SLOTS flag for NFSv4.1 transports | Trond Myklebust
This ensures that the RPC layer doesn't override the NFS session negotiation. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-13 | vfs: Revert spurious fix to spinning prevention in prune_icache_sb | Suleiman Souhlal
Revert commit 62a3ddef6181 ("vfs: fix spinning prevention in prune_icache_sb"). This commit doesn't look right: since we are looking at the tail of the list (sb->s_inode_lru.prev) if we want to skip an inode, we should put it back at the head of the list instead of the tail, otherwise we will keep spinning on it. Discovered when investigating why prune_icache_sb came top in perf reports of a swapping load. Signed-off-by: Suleiman Souhlal <suleiman@google.com> Signed-off-by: Hugh Dickins <hughd@google.com> Cc: stable@vger.kernel.org # v3.2+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
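The reasoning in this revert can be illustrated with a small userspace sketch. It is not the kernel's prune_icache_sb(): the `struct node`, the `busy` flag, and the `lru_*` helpers are hypothetical stand-ins. It only shows why an entry that is skipped while scanning from the tail has to be moved to the head, so that the next iteration looks at a different entry.

```c
#include <stddef.h>

/* Hypothetical stand-in for an LRU entry; not the kernel's struct inode. */
struct node {
    struct node *prev, *next;
    int busy;                    /* nonzero: cannot be reclaimed right now */
};

/* Circular list with a sentinel head, in the style of list_head. */
static void lru_init(struct node *head)
{
    head->prev = head->next = head;
}

static void lru_unlink(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

static void lru_add_head(struct node *head, struct node *n)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/*
 * Scan up to nr_to_scan entries starting from the tail (head->prev).
 * A busy entry is rotated to the head, so the following iteration
 * sees a different tail entry instead of spinning on the same one.
 * Returns the number of entries reclaimed (unlinked).
 */
static int lru_prune(struct node *head, int nr_to_scan)
{
    int freed = 0;

    while (nr_to_scan-- > 0 && head->prev != head) {
        struct node *n = head->prev;    /* oldest entry */

        lru_unlink(n);
        if (n->busy)
            lru_add_head(head, n);      /* skip: move away from the tail */
        else
            freed++;
    }
    return freed;
}
```

If the skipped entry were re-added at the tail instead, every iteration would pick the same busy entry and the scan budget would be spent without reclaiming anything.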
2013-04-13 | Btrfs: make sure nbytes are right after log replay | Josef Bacik
While trying to track down a tree log replay bug I noticed that fsck was always complaining about nbytes not being right for our fsynced file. That is because the new fsync stuff doesn't wait for ordered extents to complete, so the inode's nbytes are not necessarily updated properly when we log it. So to fix this we need to set nbytes to whatever it is on the inode that is on disk, so when we replay the extents we can just add the bytes that are being added as we replay the extent. This makes it work for the case that we have the wrong nbytes or the case that we logged everything and nbytes is actually correct. With this I'm no longer getting nbytes errors out of btrfsck. Cc: stable@vger.kernel.org Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-04-12 | Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6 | Linus Torvalds
Pull CIFS fix from Steve French: "Fixes a regression in cifs in which a password which begins with a comma is parsed incorrectly as a blank password" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: cifs: Allow passwords which begin with a delimitor
2013-04-12 | NFSv4: Fix handling of revoked delegations by setattr | Trond Myklebust
Currently, _nfs4_do_setattr() will use the delegation stateid if no writeable open file stateid is available. If the server revokes that delegation stateid, then the call to nfs4_handle_exception() will fail to handle the error due to the lack of a struct nfs4_state, and will just convert the error into an EIO. This patch just removes the requirement that we must have a struct nfs4_state in order to invalidate the delegation and retry. Reported-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-12 | treewide: Fix typo in printks | Masanari Iida
Correct spelling typos in printk and comments. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-04-12 | kthread: Prevent unpark race which puts threads on the wrong cpu | Thomas Gleixner
The smpboot threads rely on the park/unpark mechanism which binds per cpu threads on a particular core. Though the functionality is racy:

    CPU0                    CPU1                    CPU2
    unpark(T)
      wake_up_process(T)
      clear(SHOULD_PARK)    T runs
                            leave parkme() due to !SHOULD_PARK
      bind_to(CPU2)         BUG_ON(wrong CPU)

We cannot let the tasks move themselves to the target CPU as one of those tasks is actually the migration thread itself, which requires that it starts running on the target cpu right away. The solution to this problem is to prevent wakeups in park mode which are not from unpark(). That way we can guarantee that the association of the task to the target cpu is working correctly. Add a new task state (TASK_PARKED) which prevents other wakeups and use this state explicitly for the unpark wakeup.

Peter noticed: Also, since the task state is visible to userspace and all the parked tasks are still in the PID space, it's a good hint in ps and friends that these tasks aren't really there for the moment.

The migration thread has another related issue.

    CPU0                    CPU1
    Bring up CPU2
    create_thread(T)
    park(T)
      wait_for_completion()
                            parkme()
                            complete()
    sched_set_stop_task()
                            schedule(TASK_PARKED)

The sched_set_stop_task() call is issued while the task is on the runqueue of CPU1 and that confuses the hell out of the stop_task class on that cpu. So we need the same synchronization before sched_set_stop_task().

Reported-by: Dave Jones <davej@redhat.com> Reported-and-tested-by: Dave Hansen <dave@sr71.net> Reported-and-tested-by: Borislav Petkov <bp@alien8.de> Acked-by: Peter Ziljstra <peterz@infradead.org> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: dhillf@gmail.com Cc: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304091635430.21884@ionos Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-12 | ext4: clear buffer_uninit flag when submitting IO | Jan Kara
Currently no one clears the buffer_uninit flag. This results in writeback needlessly marking io_end as needing extent conversion and scanning the extent tree for extents to convert. So clear the buffer_uninit flag once the buffer is submitted for IO and the flag is transformed into the EXT4_IO_END_UNWRITTEN flag. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
2013-04-11 | ext4: use io_end for multiple bios | Jan Kara
Change writeback path to create just one io_end structure for the extent to which we submit IO and share it among bios writing that extent. This prevents needless splitting and joining of unwritten extents when they cannot be submitted as a single bio. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Dmitry Monakhov <dmonakhov@openvz.org> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
2013-04-11 | ext4: make ext4_bio_write_page() use BH_Async_Write flags | Jan Kara
So far ext4_bio_write_page() attached all the pages to ext4_io_end structure. This makes that structure pretty heavy (1 KB for pointers + 16 bytes per page attached to the bio). Also later we would like to share ext4_io_end structure among several bios in case IO to a single extent needs to be split among several bios and pointing to pages from ext4_io_end makes this complex. We remove page pointers from ext4_io_end and use pointers from bio itself instead. This isn't as easy when blocksize < pagesize because then we can have several bios in flight for a single page and we have to be careful when to call end_page_writeback(). However this is a known problem already solved by block_write_full_page() / end_buffer_async_write() so we mimic its behavior here. We mark buffers going to disk with BH_Async_Write flag and in ext4_bio_end_io() we check whether there are any buffers with BH_Async_Write flag left. If there are not, we can call end_page_writeback(). Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Dmitry Monakhov <dmonakhov@openvz.org> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
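The completion rule described above (end page writeback only when no buffer on the page is still marked for async write) can be sketched in userspace as follows. This is a simplified model, not ext4 code: `struct toy_page`, `toy_submit()` and `toy_end_io()` are hypothetical names, the four-buffers-per-page figure is an assumption, and the locking that the kernel needs around this check is left out.

```c
#include <stdbool.h>

#define BUFS_PER_PAGE 4   /* assumed: e.g. a 4 KiB page with 1 KiB blocks */

/* Hypothetical model of a page whose buffers may go out in several bios. */
struct toy_page {
    bool async_write[BUFS_PER_PAGE];  /* stand-in for BH_Async_Write */
    bool writeback;                   /* stand-in for the page writeback state */
};

/* Mark buffers [first, first + count) as going to disk in one bio. */
static void toy_submit(struct toy_page *p, int first, int count)
{
    p->writeback = true;
    for (int i = first; i < first + count; i++)
        p->async_write[i] = true;
}

/*
 * Completion of the bio covering buffers [first, first + count).
 * Returns true if this completion ended writeback for the whole page.
 */
static bool toy_end_io(struct toy_page *p, int first, int count)
{
    for (int i = first; i < first + count; i++)
        p->async_write[i] = false;

    for (int i = 0; i < BUFS_PER_PAGE; i++)
        if (p->async_write[i])
            return false;             /* another bio is still in flight */

    p->writeback = false;             /* stand-in for end_page_writeback() */
    return true;
}
```

With two bios covering one page, only the second completion ends writeback, which is the behavior the commit message says is borrowed from block_write_full_page() / end_buffer_async_write().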
2013-04-11 | ext4: Use kstrtoul() instead of parse_strtoul() | Lukas Czerner
In parse_strtoul() we're still using deprecated simple_strtoul(). Remove parse_strtoul() altogether and replace it with kstrtoul() Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
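For readers unfamiliar with the difference, the following userspace sketch approximates the strict style of parsing that kstrtoul() provides: the whole string must be a number (a single trailing newline is tolerated) and failures are reported through the return value. `strict_parse_ul()` is a hypothetical helper built on the C library's strtoul(); it is not the kernel implementation and may differ from it in corner cases.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse an unsigned long from s. Returns 0 on success, -EINVAL if the
 * string is not entirely a number, or -ERANGE on overflow.
 */
static int strict_parse_ul(const char *s, int base, unsigned long *res)
{
    char *end;
    unsigned long val;

    /* strtoul() would silently accept leading whitespace and '-'. */
    if (*s == '\0' || *s == '-' || isspace((unsigned char)*s))
        return -EINVAL;

    errno = 0;
    val = strtoul(s, &end, base);
    if (end == s)
        return -EINVAL;               /* no digits at all */
    if (errno == ERANGE)
        return -ERANGE;               /* value does not fit */

    if (*end == '\n')                 /* allow one trailing newline */
        end++;
    if (*end != '\0')
        return -EINVAL;               /* trailing garbage such as "12abc" */

    *res = val;
    return 0;
}
```

A simple_strtoul()-style parser would return 12 for "12abc" and give the caller no error to act on, which is the main reason strict parsing is preferred for sysfs-style input.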
2013-04-11 | ext4: defragmentation code cleanup | Dmitry Monakhov
- grab_cache_page_write_begin() may not wait on page's writeback since (1d1d1a767206). But it is still reasonable to wait on page's writeback here in order to be on the safe side.
- Fix typo: pass 'length' instead of 'end' to __block_write_begin()
https://bugzilla.kernel.org/show_bug.cgi?id=56241 TESTCASE: git://oss.sgi.com/xfs/cmds/xfstests.git MKFS_OPTIONS="-b1024" ; ./check ext4/304 Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Akira Fujita <a-fujita.rs.jp.nec.com>
2013-04-11 | ext4: do not convert to indirect with bigalloc enabled | Lukas Czerner
With bigalloc feature enabled we do not support indirect addressing at all so we have to prevent extent addressing to indirect addressing conversion in this case. The problem has been introduced with the commit "ext4: support simple conversion of extent-mapped inodes to use i_blocks" Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-04-11 | NFSv4 release the sequence id in the return on close case | Andy Adamson
Otherwise we deadlock if state recovery is initiated while we sleep. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-10 | ext4: move ext4_ind_migrate() into migrate.c | Lukas Czerner
Move ext4_ind_migrate() into migrate.c file since it makes much more sense and ext4_ext_migrate() is there as well. Also fix tiny style problem - add spaces around "=" in "i=0". Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-04-10 | cifs: Allow passwords which begin with a delimitor | Sachin Prabhu
Fixes a regression in cifs_parse_mount_options where a password which begins with a delimiter is parsed incorrectly as being a blank password. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Steve French <sfrench@us.ibm.com>
2013-04-10 | nfs: remove unnecessary check for NULL inode->i_flock from nfs_delegation_claim_locks | Jeff Layton
The second check was added in commit 65b62a29 but it will never be true. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-10 | Merge tag 'nfs-for-3.9-5' of git://git.linux-nfs.org/projects/trondmy/linux-nfs | Linus Torvalds
Pull another nfs fixlet from Trond Myklebust: "I suddenly noticed that a one-line issue that I _thought_ I had fixed with the nfs41_walk_client_list patch was apparently still there in the pull request I sent earlier today. I'm very sorry for not catching that in time. - Fix a brain fart in nfs41_walk_client_list" * tag 'nfs-for-3.9-5' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Doh! Typo in the fix to nfs41_walk_client_list
2013-04-10 | NFSv4: Doh! Typo in the fix to nfs41_walk_client_list | Trond Myklebust
Make sure that we set the status to 0 on success. Missed in testing because it never appears when doing multiple mounts to _different_ servers. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: <stable@vger.kernel.org> # 3.7.x: 7b1f1fd: NFSv4/4.1: Fix bugs in nfs4[01]_walk_client_list
2013-04-10 | Merge tag 'nfs-for-3.9-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs | Linus Torvalds
Pull NFS client bugfixes from Trond Myklebust: - fix for memory corruption issues in nfs4[01]_walk_client_list (stable) - fix for an Oopsable bug in rpc_clone_client (stable) - another state manager deadlock in the NFSv4 open code - memory leaks in nfs4_discover_server_trunking and rpc_new_client * tag 'nfs-for-3.9-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Fix another potential state manager deadlock SUNRPC: Fix a potential memory leak in rpc_new_client NFSv4/4.1: Fix bugs in nfs4[01]_walk_client_list NFSv4: Fix a memory leak in nfs4_discover_server_trunking SUNRPC: Remove extra xprt_put()
2013-04-10 | GFS2: Add origin indicator to glock demote tracing | Steven Whitehouse
This adds the origin indicator to the trace point for glock demotion, so that it is possible to see where demote requests have come from. Note that requests generated from the demote_rq sysfs interface will show as remote, since they are intended to replicate exactly the effect of a demote request from a remote node. It is still possible to tell these apart by looking at the process which initiated the demote request. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-04-10 | GFS2: Add origin indicator to glock callbacks | Steven Whitehouse
This patch adds a bool indicating whether the demote request was originated locally or remotely. This is then used by the iopen ->go_callback() to make 100% sure that it will only respond to remote callbacks. Since ->evict_inode() uses GL_NOCACHE when it attempts to get an exclusive lock on the iopen lock, this may result in extra scheduling of the workqueue in case that the exclusive promotion request failed. This patch prevents that from happening. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-04-09 | ext4: fix miscellaneous big endian warnings | Theodore Ts'o
None of these result in any bug, but they make sparse complain. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-04-09 | ext4: fix big-endian bug in metadata checksum calculations | Dmitry Monakhov
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2013-04-09 | ext4: fix big-endian bug in extent migration code | Dmitry Monakhov
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2013-04-09 | ext4: fix usless declarations | Dmitri Monakho
This patch should fix sparse complaints about shadowed declarations. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-04-09 | ext4: introduce reserved space | Lukas Czerner
Currently, in an ENOSPC condition, when writing into unwritten space or punching a hole, we might need to split the extent and grow the extent tree. However, since we cannot allocate any new metadata blocks, we have to zero out the unwritten part of the extent or the punched-out part of the extent, or in the worst case return ENOSPC even though the user does not actually allocate any space. Also, in the delalloc path we reserve metadata and data blocks for the time we are going to write out. However, metadata block reservation is very tricky, especially since we expect that logical connectivity implies physical connectivity; that might not be the case, and hence we might end up allocating more metadata blocks than previously reserved. So in future, metadata reservation checks should be removed, since we cannot ensure that we do not under-reserve. This is where reserved space comes into the picture. When mounting the file system we slice off a little bit of the file system space (2% or 4096 clusters, whichever is smaller), which can then be used for the cases mentioned above to prevent a costly zeroout or an unexpected ENOSPC. The number of reserved clusters can be set via sysfs; however, it can never be bigger than the number of free clusters in the file system. Note that this patch fixes the failure of xfstest 274 as expected. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
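The sizing rule stated in this message (2% of the file system or 4096 clusters, whichever is smaller, and never more than the free clusters) works out as in the sketch below. The function names are hypothetical and the code is plain userspace arithmetic, not the ext4 implementation; in particular, how the kernel reacts to an oversized sysfs value is not stated in the message, so the second helper only reports whether a value would be acceptable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Default reservation: 2% of all clusters, capped at 4096 clusters. */
static uint64_t default_reserved_clusters(uint64_t total_clusters)
{
    uint64_t two_percent = total_clusters / 50;   /* 1/50 == 2% */

    return two_percent < 4096 ? two_percent : 4096;
}

/* A requested reservation may not exceed the free clusters. */
static bool reserved_clusters_valid(uint64_t requested, uint64_t free_clusters)
{
    return requested <= free_clusters;
}
```

For example, a file system with 100000 clusters would reserve 2000 of them, while any file system with 204800 clusters or more hits the 4096-cluster cap.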
2013-04-09 | nfsd4: clean up validate_stateid | J. Bruce Fields
The logic here is better expressed with a switch statement. While we're here, CLOSED stateids (or stateids of an unknown type--which would indicate a server bug) should probably return nfserr_bad_stateid, though this behavior shouldn't affect any non-buggy client. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 | nfsd4: check backchannel attributes on create_session | J. Bruce Fields
Make sure the client gives us an adequate backchannel. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 | nfsd4: fix forechannel attribute negotiation | J. Bruce Fields
Negotiation of the 4.1 session forechannel attributes is a mess. Fix: - Move it all into check_forechannel_attrs instead of spreading it between that, alloc_session, and init_forechannel_attrs. - set a minimum "slotsize" so that our drc memory limits apply even for small maxresponsesize_cached. This also fixes some bugs when slotsize becomes <= 0. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 | nfsd4: cleanup check_forechannel_attrs | J. Bruce Fields
Pass this struct by reference, not by value, and return an error instead of a boolean to allow for future additions. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
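The interface change described here follows a common C pattern, sketched below with entirely hypothetical names and limits (`struct chan_attrs`, `check_chan_attrs()`, `MIN_RESP_SZ`, `MAX_REQS`, and the choice of error codes are all assumptions for illustration, not nfsd's). Passing a pointer lets the checker adjust the negotiated values in place, and an integer error code can distinguish failure reasons where a boolean cannot.

```c
#include <errno.h>

/* Hypothetical channel attributes; field names are illustrative only. */
struct chan_attrs {
    unsigned int max_reqs;       /* number of slots requested */
    unsigned int max_resp_sz;    /* largest response the peer accepts */
};

#define MIN_RESP_SZ 128u         /* assumed floor for a usable response */
#define MAX_REQS    160u         /* assumed ceiling on slots */

/*
 * Validate and clamp the attributes in place.
 * Returns 0 on success or a negative errno describing the problem.
 */
static int check_chan_attrs(struct chan_attrs *ca)
{
    if (ca->max_reqs == 0)
        return -EINVAL;          /* no slots requested at all */
    if (ca->max_resp_sz < MIN_RESP_SZ)
        return -ERANGE;          /* responses would not fit */

    if (ca->max_reqs > MAX_REQS)
        ca->max_reqs = MAX_REQS; /* visible to the caller via the pointer */
    return 0;
}
```

A new check can later be added with its own error code without changing any caller that already tests for a nonzero return.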
2013-04-09 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs | Linus Torvalds
Pull vfs fixes from Al Viro: "A nasty bug in fs/namespace.c caught by Andrey + a couple of less serious unpleasantness - ecryptfs misc device playing hopeless games with try_module_get() and palinfo procfs support being... not quite correctly done, to be polite." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: mnt: release locks on error path in do_loopback palinfo fixes procfs: add proc_remove_subtree() ecryptfs: close rmmod race
2013-04-09 | try a saner locking for pde_opener... | Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 | deal with races between remove_proc_entry() and proc_reg_release() | Al Viro
* serialize the call of ->release() on per-pdeo mutex * don't remove pdeo from per-pde list until we are through with it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 | procfs: preparations for remove_proc_entry() race fixes | Al Viro
* leave ->proc_fops alone; make ->pde_users negative instead * trim pde_opener * move relevant code in fs/proc/inode.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 | procfs: Clean up huge if-statement in __proc_file_read() | David Howells
Switch huge if-statement in __proc_file_read() around. This then puts the single line loop break immediately after the if-statement and allows us to de-indent the huge comment and make it take fewer lines. The code following the if-statement then follows naturally from the call to dp->read_proc(). Signed-off-by: David Howells <dhowells@redhat.com>
2013-04-09 | proc: Kill create_proc_entry() | David Howells
Kill create_proc_entry() in favour of create_proc_read_entry(), proc_create() and proc_create_data(). Signed-off-by: David Howells <dhowells@redhat.com>
2013-04-09 | constify a bunch of struct file_operations instances | Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 | procfs: new helper - PDE_DATA(inode) | Al Viro
The only part of proc_dir_entry the code outside of fs/proc really cares about is PDE(inode)->data. Provide a helper for that; static inline for now, eventually will be moved to fs/proc, along with the knowledge of struct proc_dir_entry layout. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 | procfs: kill ->write_proc() | Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 | new helper: single_open_size() | Al Viro
Same as single_open(), but preallocates the buffer of given size. Doesn't make any sense for sizes up to PAGE_SIZE and doesn't make sense if output of show() exceeds PAGE_SIZE only rarely - seq_read() will take care of growing the buffer and redoing show(). If you _know_ that it will be large, it might make more sense to look into saner iterator, rather than go with single-shot one. If that's impossible, single_open_size() might be for you. Again, don't use that without a good reason; occasionally that's really the best way to go, but very often there are better solutions. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 | procfs: don't allow to use proc_create, create_proc_entry, etc. for directories | Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 | reiserfs: use proc_remove_subtree() | Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 | procfs: switch /proc/self away from proc_dir_entry | Al Viro
Just have it pinned in dcache all along and let procfs ->kill_sb() drop it before kill_anon_super(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09 | mode_t, whack-a-mole at 11... | Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>