summaryrefslogtreecommitdiffstats
path: root/fs
AgeCommit message (Collapse)Author
2012-03-20NFS: Use cond_resched_lock() to reduce latencies in the commit scansTrond Myklebust
Ensure that we conditionally drop the inode->i_lock when it is safe to do so in the commit loops. We do so after locking the nfs_page, but before removing it from the commit list. We can then use list_safe_reset_next to recover the loop after the lock is retaken. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-20NFSv4: It is not safe to dereference lsp->ls_state in release_lockownerTrond Myklebust
It is quite possible for the release_lockowner RPC call to race with the close RPC call, in which case, we cannot dereference lsp->ls_state in order to find the nfs_server. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-20NFS: ncommit count is being double decrementedFred Isaman
The decrement is handled by each call to nfs_request_remove_commit_list, no need to do it again in nfs_scan_commit. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-20CIFS: Respect negotiated MaxMpxCountPavel Shilovsky
Some servers sets this value less than 50 that was hardcoded and we lost the connection if when we exceed this limit. Fix this by respecting this value - not sending more than the server allows. Cc: stable@kernel.org Reviewed-by: Jeff Layton <jlayton@samba.org> Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <stevef@smf-gateway.(none)>
2012-03-20udf: remove the second argument of k[un]map_atomic()Cong Wang
Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20ubifs: remove the second argument of k[un]map_atomic()Cong Wang
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20squashfs: remove the second argument of k[un]map_atomic()Cong Wang
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20reiserfs: remove the second argument of k[un]map_atomic()Cong Wang
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20ocfs2: remove the second argument of k[un]map_atomic()Cong Wang
Acked-by: Joel Becker <jlbec@evilplan.org> Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20ntfs: remove the second argument of k[un]map_atomic()Cong Wang
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20nilfs2: remove the second argument of k[un]map_atomic()Cong Wang
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20nfs: remove the second argument of k[un]map_atomic()Cong Wang
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20minix: remove the second argument of k[un]map_atomic()Cong Wang
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20logfs: remove the second argument of k[un]map_atomic()Cong Wang
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20jbd2: remove the second argument of k[un]map_atomic()Cong Wang
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20jbd: remove the second argument of k[un]map_atomic()Cong Wang
Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20gfs2: remove the second argument of k[un]map_atomic()Cong Wang
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20fuse: remove the second argument of k[un]map_atomic()Cong Wang
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20ext2: remove the second argument of k[un]map_atomic()Cong Wang
Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20exofs: remove the second argument of k[un]map_atomic()Cong Wang
Ack-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20afs: remove the second argument of k[un]map_atomic()Cong Wang
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20btrfs: remove the second argument of k[un]map_atomic()Cong Wang
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20fs: remove the second argument of k[un]map_atomic()Cong Wang
Acked-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20kcore: fix spelling in read_kcore() commentLaura Vasilescu
Signed-off-by: Laura Vasilescu <laura@rosedu.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-03-20GFS2: Change truncate page allocation to be GFP_NOFSBob Peterson
This patch changes the page allocation in gfs2_block_truncate_page and two others to GFP_NOFS to avoid deadlock in low-memory conditions. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-03-19ext4: change some printk() calls to use ext4_msg() insteadTheodore Ts'o
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-03-19ext4: avoid output message interleaving in ext4_error_<foo>()Joe Perches
Using KERN_CONT means that messages from multiple threads may be interleaved. Avoid this by using a single printk call in ext4_error_inode and ext4_error_file. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-03-19ext4: remove trailing newlines from ext4_msg() and ext4_error() messagesTheodore Ts'o
The functions ext4_msg() and ext4_error() already tack on a trailing newline, so remove the unnecessary extra newline. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-03-19ext4: add no_printk argument validation, fix falloutJoe Perches
Add argument validation to debug functions. Use ##__VA_ARGS__. Fix format and argument mismatches. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-03-19ext4: remove redundant "EXT4-fs: " from uses of ext4_msgJoe Perches
ext4_msg adds "EXT4-fs: " to the messsage output. Remove the redundant bits from uses. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-03-19ext4: give more helpful error message in ext4_ext_rm_leaf()Lukas Czerner
The error message produced by the ext4_ext_rm_leaf() when we are removing blocks which accidentally ends up inside the existing extent, is not very helpful, because we would like to also know which extent did we collide with. This commit changes the error message to get us also the information about the extent we are colliding with. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-03-19ext4: remove unused code from ext4_ext_map_blocks()Lukas Czerner
Since the commit 'Rewrite punch hole to use ext4_ext_remove_space()' reworked the punch hole implementation to use ext4_ext_remove_space() instead of ext4_ext_map_blocks(), we can remove the code which is no longer needed from the ext4_ext_map_blocks(). Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-03-19ext4: rewrite punch hole to use ext4_ext_remove_space()Lukas Czerner
This commit rewrites ext4 punch hole implementation to use ext4_ext_remove_space() instead of its home gown way of doing this via ext4_ext_map_blocks(). There are several reasons for changing this. Firstly it is quite non obvious that punching hole needs to ext4_ext_map_blocks() to punch a hole, especially given that this function should map blocks, not unmap it. It also required a lot of new code in ext4_ext_map_blocks(). Secondly the design of it is not very effective. The reason is that we are trying to punch out blocks in ext4_ext_punch_hole() in opposite direction than in ext4_ext_rm_leaf() which causes the ext4_ext_rm_leaf() to iterate through the whole tree from the end to the start to find the requested extent for every extent we are going to punch out. And finally the current implementation does not use the existing code, but bring a lot of new code, which is IMO unnecessary since there already is some infrastructure we can use. Specifically ext4_ext_remove_space(). This commit changes ext4_ext_remove_space() to accept 'end' parameter so we can not only truncate to the end of file, but also remove the space in the middle of the file (punch a hole). Moreover, because the last block to punch out, might be in the middle of the extent, we have to split the extent at 'end + 1' so ext4_ext_rm_leaf() can easily either remove the whole fist part of split extent, or change its size. ext4_ext_remove_space() is then used to actually remove the space (extents) from within the hole, instead of ext4_ext_map_blocks(). Note that this also fix the issue with punch hole, where we would forget to remove empty index blocks from the extent tree, resulting in double free block error and file system corruption. This is simply because we now use different code path, where this problem does not exist. This has been tested with fsx running for several days and xfstests, plus xfstest #251 with '-o discard' run on the loop image (which converts discard requestes into punch hole to the backing file). All of it on 1K and 4K file system block size. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-03-19Merge branch 'dcache-word-accesses'Linus Torvalds
* branch 'dcache-word-accesses': vfs: use 'unsigned long' accesses for dcache name comparison and hashing This does the name hashing and lookup using word-sized accesses when that is efficient, namely on x86 (although any little-endian machine with good unaligned accesses would do). It does very much depend on little-endian logic, but it's a very hot couple of functions under some real loads, and this patch improves the performance of __d_lookup_rcu() and link_path_walk() by up to about 30%. Giving a 10% improvement on some very pathname-heavy benchmarks. Because we do make unaligned accesses past the filename, the optimization is disabled when CONFIG_DEBUG_PAGEALLOC is active, and we effectively depend on the fact that on x86 we don't really ever have the last page of usable RAM followed immediately by any IO memory (due to ACPI tables, BIOS buffer areas etc). Some of the bit operations we do are a bit "subtle". It's commented, but you do need to really think about the code. Or just consider it black magic. Thanks to people on G+ for some of the optimized bit tricks.
2012-03-19vfs: get rid of batshit-insane pointless dentry hash calculationsLinus Torvalds
For some odd historical reason, the final mixing round for the dentry cache hash table lookup had an insane "xor with big constant" logic. In two places. The big constant that is being xor'ed is GOLDEN_RATIO_PRIME, which is a fairly random-looking number that is designed to be *multiplied* with so that the bits get spread out over a whole long-word. But xor'ing with it is insane. It doesn't really even change the hash - it really only shifts the hash around in the hash table. To make matters worse, the insane big constant is different on 32-bit and 64-bit builds, even though the name hash bits we use are always 32-bit (and the bits from the pointer we mix in effectively are too). It's all total voodoo programming, in other words. Now, some testing and analysis of the hash chains shows that the rest of the hash function seems to be fairly good. It does pick the right bits of the parent dentry pointer, for example, and while it's generally a bad idea to use an xor to mix down the upper bits (because if there is a repeating pattern, the xor can cause "destructive interference"), it seems to not have been a disaster. For example, replacing the hash with the normal "hash_long()" code (that uses the GOLDEN_RATIO_PRIME constant correctly, btw) actually just makes the hash worse. The hand-picked hash knew which bits of the pointer had the highest entropy, and hash_long() ends up mixing bits less optimally at least in some trivial tests. So the hash function overall seems fine, it just has that really odd "shift result around by a constant xor". So get rid of the silly xor, and replace the down-mixing of the bits with an add instead of an xor that tends to not have the same kind of destructive interference issues. Some stats on the resulting hash chains shows that they look statistically identical before and after, but the code is simpler and no longer makes you go "WTF?". Also, the incoming hash really is just "unsigned int", not a long, and there's no real point to worry about the high 26 bits of the dentry pointer for the 64-bit case, because they are all going to be identical anyway. So also change the hashing to be done in the more natural 'unsigned int' that is the real size of the actual hashed data anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-19exofs: Cap on the memcpy() sizeDan Carpenter
This data comes from the device, so probably it's fairly trustworthy but it makes the static checkers happy if we check it. [Boaz] the system_id_len is zero, if not present, or always OSD_SYSTEMID_LEN. So always copy OSD_SYSTEMID_LEN bytes. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2012-03-19exofs: (trivial) Fix typo in super.cMasanari Iida
Correct spelling "faild" to "failed" in fs/exofs/super.c Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2012-03-19exofs: fix endian conversion in exofs_sync_fs()Dan Carpenter
fscb->s_numfiles is an __le64 field so we need to use cpu_to_le64() to get a little endian 64 bit on big endian systems. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2012-03-19nfsd: merge cookie collision fixes from ext4 treeJ. Bruce Fields
These changes fix readdir loops on ext4 filesystems with dir_index turned on. I'm pulling them from Ted's tree as I'd like to give them some extra nfsd testing, and expect to be applying (potentially conflicting) patches to the same code before the next merge window. From the nfs-ext4-premerge branch of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-19Merge branch 'stable/cleancache.v13' into linux-nextKonrad Rzeszutek Wilk
* stable/cleancache.v13: mm: cleancache: Use __read_mostly as appropiate. mm: cleancache: report statistics via debugfs instead of sysfs. mm: zcache/tmem/cleancache: s/flush/invalidate/ mm: cleancache: s/flush/invalidate/
2012-03-19CIFS: Fix a spurious error in cifs_push_posix_locksPavel Shilovsky
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reported-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Steve French <stevef@smf-gateway.(none)>
2012-03-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2012-03-18nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)Bernd Schubert
Use 32-bit or 64-bit llseek() hashes for directory offsets depending on the NFS version. NFSv2 gets 32-bit hashes only. NOTE: This patch got rather complex as Christoph asked to set the filp->f_mode flag in the open call or immediatly after dentry_open() in nfsd_open() to avoid races. Personally I still do not see a reason for that and in my opinion FMODE_32BITHASH/FMODE_64BITHASH flags could be set nfsd_readdir(), as it follows directly after nfsd_open() without a chance of races. Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: J. Bruce Fields<bfields@redhat.com>
2012-03-18nfsd: rename 'int access' to 'int may_flags' in nfsd_open()Bernd Schubert
Just rename this variable, as the next patch will add a flag and 'access' as variable name would not be correct any more. Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: J. Bruce Fields<bfields@redhat.com>
2012-03-18ext4: return 32/64-bit dir name hash according to usage typeFan Yong
Traditionally ext2/3/4 has returned a 32-bit hash value from llseek() to appease NFSv2, which can only handle a 32-bit cookie for seekdir() and telldir(). However, this causes problems if there are 32-bit hash collisions, since the NFSv2 server can get stuck resending the same entries from the directory repeatedly. Allow ext4 to return a full 64-bit hash (both major and minor) for telldir to decrease the chance of hash collisions. This still needs integration on the NFS side. Patch-updated-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> (blame me if something is not correct) Signed-off-by: Fan Yong <yong.fan@whamcloud.com> Signed-off-by: Andreas Dilger <adilger@whamcloud.com> Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-03-18Don't limit non-nested epoll pathsJason Baron
Commit 28d82dc1c4ed ("epoll: limit paths") that I did to limit the number of possible wakeup paths in epoll is causing a few applications to longer work (dovecot for one). The original patch is really about limiting the amount of epoll nesting (since epoll fds can be attached to other fds). Thus, we probably can allow an unlimited number of paths of depth 1. My current patch limits it at 1000. And enforce the limits on paths that have a greater depth. This is captured in: https://bugzilla.redhat.com/show_bug.cgi?id=681578 Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-17Try using machine credentials for RENEW callsSachin Prabhu
Using user credentials for RENEW calls will fail when the user credentials have expired. To avoid this, try using the machine credentials when making RENEW calls. If no machine credentials have been set, fall back to using user credentials as before. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-17NFSv4.1: Fix a few issues in filelayout_commit_pagelistTrond Myklebust
- Fix a race in which NFS_I(inode)->commits_outstanding could potentially go to zero (triggering a call to nfs_commit_clear_lock()) before we're done sending out all the commit RPC calls. - If nfs_commitdata_alloc fails, there is no reason why we shouldn't try to send off all the commits-to-ds. - Simplify the error handling. - Change pnfs_commit_list() to always return either PNFS_ATTEMPTED or PNFS_NOT_ATTEMPTED. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Fred Isaman <iisaman@netapp.com>
2012-03-17NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit codeTrond Myklebust
Move more pnfs-isms out of the generic commit code. Bugfixes: - filelayout_scan_commit_lists doesn't need to get/put the lseg. In fact since it is run under the inode->i_lock, the lseg_put() can deadlock. - Ensure that we distinguish between what needs to be done for commit-to-data server and what needs to be done for commit-to-MDS using the new flag PG_COMMIT_TO_DS. Otherwise we may end up calling put_lseg() on a bucket for a struct nfs_page that got written through the MDS. - Fix a case where we were using list_del() on an nfs_page->wb_list instead of list_del_init(). - filelayout_initiate_commit needs to call filelayout_commit_release on error instead of the mds_ops->rpc_release(). Otherwise it won't clear the commit lock. Cleanups: - Let the files layout manage the commit lists for the pNFS case. Don't expose stuff like pnfs_choose_commit_list, and the fact that the commit buckets hold references to the layout segment in common code. - Cast out the put_lseg() calls for the struct nfs_read/write_data->lseg into the pNFS layer from whence they came. - Let the pNFS layer manage the NFS_INO_PNFS_COMMIT bit. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Fred Isaman <iisaman@netapp.com>
2012-03-16Merge branch 'akpm' (more patches from Andrew)Linus Torvalds
Merge some more email patches from Andrew Morton: "A couple of nilfs fixes" * emailed from Andrew Morton <akpm@linux-foundation.org>: nilfs2: fix NULL pointer dereference in nilfs_load_super_block() nilfs2: clamp ns_r_segments_percentage to [1, 99]