summaryrefslogtreecommitdiffstats
path: root/fs
AgeCommit message (Collapse)Author
2012-03-29Merge branch 'x86-x32-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x32 support for x86-64 from Ingo Molnar: "This tree introduces the X32 binary format and execution mode for x86: 32-bit data space binaries using 64-bit instructions and 64-bit kernel syscalls. This allows applications whose working set fits into a 32 bits address space to make use of 64-bit instructions while using a 32-bit address space with shorter pointers, more compressed data structures, etc." Fix up trivial context conflicts in arch/x86/{Kconfig,vdso/vma.c} * 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits) x32: Fix alignment fail in struct compat_siginfo x32: Fix stupid ia32/x32 inversion in the siginfo format x32: Add ptrace for x32 x32: Switch to a 64-bit clock_t x32: Provide separate is_ia32_task() and is_x32_task() predicates x86, mtrr: Use explicit sizing and padding for the 64-bit ioctls x86/x32: Fix the binutils auto-detect x32: Warn and disable rather than error if binutils too old x32: Only clear TIF_X32 flag once x32: Make sure TS_COMPAT is cleared for x32 tasks fs: Remove missed ->fds_bits from cessation use of fd_set structs internally fs: Fix close_on_exec pointer in alloc_fdtable x32: Drop non-__vdso weak symbols from the x32 VDSO x32: Fix coding style violations in the x32 VDSO code x32: Add x32 VDSO support x32: Allow x32 to be configured x32: If configured, add x32 system calls to system call tables x32: Handle process creation x32: Signal-related system calls x86: Add #ifdef CONFIG_COMPAT to <asm/sys_ia32.h> ...
2012-03-29Revert "ext4: don't release page refs in ext4_end_bio()"Linus Torvalds
This reverts commit b43d17f319f2c502b17139d1cf70731b2b62c644. Dave Jones reports that it causes lockups on his laptop, and his debug output showed a lot of processes hung waiting for page_writeback (or more commonly - processes hung waiting for a lock that was held during that writeback wait). The page_writeback hint made Ted suggest that Dave look at this commit, and Dave verified that reverting it makes his problems go away. Ted says: "That commit fixes a race which is seen when you write into fallocated (and hence uninitialized) disk blocks under *very* heavy memory pressure. Furthermore, although theoretically it could trigger under normal direct I/O writes, it only seems to trigger if you are issuing a huge number of AIO writes, such that a just-written page can get evicted from memory, and then read back into memory, before the workqueue has a chance to update the extent tree. This race has been around for a little over a year, and no one noticed until two months ago; it only happens under fairly exotic conditions, and in fact even after trying very hard to create a simple repro under lab conditions, we could only reproduce the problem and confirm the fix on production servers running MySQL on very fast PCIe-attached flash devices. Given that Dave was able to hit this problem pretty quickly, if we confirm that this commit is at fault, the only reasonable thing to do is to revert it IMO." Reported-and-tested-by: Dave Jones <davej@redhat.com> Acked-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-29Merge branch 'for-3.4' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd changes from Bruce Fields: Highlights: - Benny Halevy and Tigran Mkrtchyan implemented some more 4.1 features, moving us closer to a complete 4.1 implementation. - Bernd Schubert fixed a long-standing problem with readdir cookies on ext2/3/4. - Jeff Layton performed a long-overdue overhaul of the server reboot recovery code which will allow us to deprecate the current code (a rather unusual user of the vfs), and give us some needed flexibility for further improvements. - Like the client, we now support numeric uid's and gid's in the auth_sys case, allowing easier upgrades from NFSv2/v3 to v4.x. Plus miscellaneous bugfixes and cleanup. Thanks to everyone! There are also some delegation fixes waiting on vfs review that I suppose will have to wait for 3.5. With that done I think we'll finally turn off the "EXPERIMENTAL" dependency for v4 (though that's mostly symbolic as it's been on by default in distro's for a while). And the list of 4.1 todo's should be achievable for 3.5 as well: http://wiki.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues though we may still want a bit more experience with it before turning it on by default. * 'for-3.4' of git://linux-nfs.org/~bfields/linux: (55 commits) nfsd: only register cld pipe notifier when CONFIG_NFSD_V4 is enabled nfsd4: use auth_unix unconditionally on backchannel nfsd: fix NULL pointer dereference in cld_pipe_downcall nfsd4: memory corruption in numeric_name_to_id() sunrpc: skip portmap calls on sessions backchannel nfsd4: allow numeric idmapping nfsd: don't allow legacy client tracker init for anything but init_net nfsd: add notifier to handle mount/unmount of rpc_pipefs sb nfsd: add the infrastructure to handle the cld upcall nfsd: add a header describing upcall to nfsdcld nfsd: add a per-net-namespace struct for nfsd sunrpc: create nfsd dir in rpc_pipefs nfsd: add nfsd4_client_tracking_ops struct and a way to set it nfsd: convert nfs4_client->cl_cb_flags to a generic flags field NFSD: Fix nfs4_verifier memory alignment NFSD: Fix warnings when NFSD_DEBUG is not defined nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes) nfsd: rename 'int access' to 'int may_flags' in nfsd_open() ext4: return 32/64-bit dir name hash according to usage type fs: add new FMODE flags: FMODE_32bithash and FMODE_64bithash ...
2012-03-29Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds
Single fix for a commit from the first batch of patches through Andrew. * emailed from Andrew Morton <akpm@linux-foundation.org>: pagemap: remove remaining unneeded spin_lock()
2012-03-29pagemap: remove remaining unneeded spin_lock()Naoya Horiguchi
Commit 025c5b2451e4 ("thp: optimize away unnecessary page table locking") moves spin_lock() into pmd_trans_huge_lock() in order to avoid locking unless pmd is for thp. So this spin_lock() is a bug. Reported-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-29nfsd: only register cld pipe notifier when CONFIG_NFSD_V4 is enabledJeff Layton
Otherwise, we get a warning or error similar to this when building with CONFIG_NFSD_V4 disabled: ERROR: "nfsd4_cld_block" [fs/nfsd/nfsd.ko] undefined! Fix this by wrapping the calls to rpc_pipefs_notifier_register and ..._unregister in another function and providing no-op replacements when CONFIG_NFSD_V4 is disabled. Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-28Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osdLinus Torvalds
Pull trivial exofs changes from Boaz Harrosh: "Just nothingness really. The big exofs changes are reserved for the next merge window." * 'for-linus' of git://git.open-osd.org/linux-open-osd: exofs: Cap on the memcpy() size exofs: (trivial) Fix typo in super.c exofs: fix endian conversion in exofs_sync_fs()
2012-03-28Merge tag 'nfs-for-3.4-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfixes for Linux 3.4 from Trond Myklebust Highlights include: - Fix infinite loops in the mount code - Fix a userspace buffer overflow in __nfs4_get_acl_uncached - Fix a memory leak due to a double reference count in rpcb_getport_async() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> * tag 'nfs-for-3.4-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Minor cleanups for nfs4_handle_exception and nfs4_async_handle_error NFSv4.1: Fix layoutcommit error handling NFSv4: Fix two infinite loops in the mount code SUNRPC: Use the already looked-up xprt in rpcb_getport_async() NFS4.1: remove duplicate variable declaration in filelayout_clear_request_commit Fix length of buffer copied in __nfs4_get_acl_uncached
2012-03-28Merge tag 'squashfs-updates' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next Pull squashfs updates from Phillip Lougher: "Add an extra mount time sanity check, plus some code cleanups and bug fixes." * tag 'squashfs-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next: Squashfs: add mount time sanity check for block_size and block_log match Squashfs: fix f_pos check in get_dir_index_using_offset Squashfs: get rid of obsolete definitions in header file Squashfs: remove redundant length initialisation in squashfs_lookup Squashfs: remove redundant length initialisation in squashfs_readdir Squashfs: update comment removing reference to zlib only Squashfs: use define instead of constant
2012-03-28Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds
Merge third batch of patches from Andrew Morton: - Some MM stragglers - core SMP library cleanups (on_each_cpu_mask) - Some IPI optimisations - kexec - kdump - IPMI - the radix-tree iterator work - various other misc bits. "That'll do for -rc1. I still have ~10 patches for 3.4, will send those along when they've baked a little more." * emailed from Andrew Morton <akpm@linux-foundation.org>: (35 commits) backlight: fix typo in tosa_lcd.c crc32: add help text for the algorithm select option mm: move hugepage test examples to tools/testing/selftests/vm mm: move slabinfo.c to tools/vm mm: move page-types.c from Documentation to tools/vm selftests/Makefile: make `run_tests' depend on `all' selftests: launch individual selftests from the main Makefile radix-tree: use iterators in find_get_pages* functions radix-tree: rewrite gang lookup using iterator radix-tree: introduce bit-optimized iterator fs/proc/namespaces.c: prevent crash when ns_entries[] is empty nbd: rename the nbd_device variable from lo to nbd pidns: add reboot_pid_ns() to handle the reboot syscall sysctl: use bitmap library functions ipmi: use locks on watchdog timeout set on reboot ipmi: simplify locking ipmi: fix message handling during panics ipmi: use a tasklet for handling received messages ipmi: increase KCS timeouts ipmi: decrease the IPMI message transaction time in interrupt mode ...
2012-03-28fs/proc/namespaces.c: prevent crash when ns_entries[] is emptyAndrew Morton
If CONFIG_NET_NS, CONFIG_UTS_NS and CONFIG_IPC_NS are disabled, ns_entries[] becomes empty and things like ns_entries[ARRAY_SIZE(ns_entries) - 1] will explode. Reported-by: Richard Weinberger <richard@nod.at> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28fs: only send IPI to invalidate LRU BH when neededGilad Ben-Yossef
In several code paths, such as when unmounting a file system (but not only) we send an IPI to ask each cpu to invalidate its local LRU BHs. For multi-cores systems that have many cpus that may not have any LRU BH because they are idle or because they have not performed any file system accesses since last invalidation (e.g. CPU crunching on high perfomance computing nodes that write results to shared memory or only using filesystems that do not use the bh layer.) This can lead to loss of performance each time someone switches the KVM (the virtual keyboard and screen type, not the hypervisor) if it has a USB storage stuck in. This patch attempts to only send an IPI to cpus that have LRU BH. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28mm: thp: fix up pmd_trans_unstable() locationsAndrea Arcangeli
pmd_trans_unstable() should be called before pmd_offset_map() in the locations where the mmap_sem is held for reading. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Ulrich Obergfell <uobergfe@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mark Salter <msalter@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28procfs: fix /proc/statmKAMEZAWA Hiroyuki
bda7bad62bc4 ("procfs: speed up /proc/pid/stat, statm") broke /proc/statm - 'text' is printed twice by mistake. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reported-by: Ulrich Drepper <drepper@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28nfsd4: use auth_unix unconditionally on backchannelJ. Bruce Fields
This isn't actually correct, but it works with the Linux client, and agrees with the behavior we used to have before commit 80fc015bdfe. Later patches will implement the spec-mandated behavior (which is to use the security parameters explicitly given by the client in create_session or backchannel_ctl). Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-28Merge tag 'split-asm_system_h-for-linus-20120328' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system Pull "Disintegrate and delete asm/system.h" from David Howells: "Here are a bunch of patches to disintegrate asm/system.h into a set of separate bits to relieve the problem of circular inclusion dependencies. I've built all the working defconfigs from all the arches that I can and made sure that they don't break. The reason for these patches is that I recently encountered a circular dependency problem that came about when I produced some patches to optimise get_order() by rewriting it to use ilog2(). This uses bitops - and on the SH arch asm/bitops.h drags in asm-generic/get_order.h by a circuituous route involving asm/system.h. The main difficulty seems to be asm/system.h. It holds a number of low level bits with no/few dependencies that are commonly used (eg. memory barriers) and a number of bits with more dependencies that aren't used in many places (eg. switch_to()). These patches break asm/system.h up into the following core pieces: (1) asm/barrier.h Move memory barriers here. This already done for MIPS and Alpha. (2) asm/switch_to.h Move switch_to() and related stuff here. (3) asm/exec.h Move arch_align_stack() here. Other process execution related bits could perhaps go here from asm/processor.h. (4) asm/cmpxchg.h Move xchg() and cmpxchg() here as they're full word atomic ops and frequently used by atomic_xchg() and atomic_cmpxchg(). (5) asm/bug.h Move die() and related bits. (6) asm/auxvec.h Move AT_VECTOR_SIZE_ARCH here. Other arch headers are created as needed on a per-arch basis." Fixed up some conflicts from other header file cleanups and moving code around that has happened in the meantime, so David's testing is somewhat weakened by that. We'll find out anything that got broken and fix it.. * tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits) Delete all instances of asm/system.h Remove all #inclusions of asm/system.h Add #includes needed to permit the removal of asm/system.h Move all declarations of free_initmem() to linux/mm.h Disintegrate asm/system.h for OpenRISC Split arch_align_stack() out from asm-generic/system.h Split the switch_to() wrapper out of asm-generic/system.h Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h Create asm-generic/barrier.h Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h Disintegrate asm/system.h for Xtensa Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt] Disintegrate asm/system.h for Tile Disintegrate asm/system.h for Sparc Disintegrate asm/system.h for SH Disintegrate asm/system.h for Score Disintegrate asm/system.h for S390 Disintegrate asm/system.h for PowerPC Disintegrate asm/system.h for PA-RISC Disintegrate asm/system.h for MN10300 ...
2012-03-28Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfsLinus Torvalds
Pull XFS update (part 2) from Ben Myers: "Fixes for tracing of xfs_name strings, flag handling in open_by_handle, a log space hang with freeze/unfreeze, fstrim offset calculations, a section mismatch with xfs_qm_exit, an oops in xlog_recover_process_iunlinks, and a deadlock in xfs_rtfree_extent. There are also additional trace points for attributes, and the addition of a workqueue for allocation to work around kernel stack size limitations." * 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: add lots of attribute trace points xfs: Fix oops on IO error during xlog_recover_process_iunlinks() xfs: fix fstrim offset calculations xfs: Account log unmount transaction correctly xfs: don't cache inodes read through bulkstat xfs: trace xfs_name strings correctly xfs: introduce an allocation workqueue xfs: Fix open flag handling in open_by_handle code xfs: fix deadlock in xfs_rtfree_extent fs: xfs: fix section mismatch in linux-next
2012-03-28Remove all #inclusions of asm/system.hDavid Howells
Remove all #inclusions of asm/system.h preparatory to splitting and killing it. Performed with the following command: perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *` Signed-off-by: David Howells <dhowells@redhat.com>
2012-03-28Add #includes needed to permit the removal of asm/system.hDavid Howells
asm/system.h is a cause of circular dependency problems because it contains commonly used primitive stuff like barrier definitions and uncommonly used stuff like switch_to() that might require MMU definitions. asm/system.h has been disintegrated by this point on all arches into the following common segments: (1) asm/barrier.h Moved memory barrier definitions here. (2) asm/cmpxchg.h Moved xchg() and cmpxchg() here. #included in asm/atomic.h. (3) asm/bug.h Moved die() and similar here. (4) asm/exec.h Moved arch_align_stack() here. (5) asm/elf.h Moved AT_VECTOR_SIZE_ARCH here. (6) asm/switch_to.h Moved switch_to() here. Signed-off-by: David Howells <dhowells@redhat.com>
2012-03-28Merge tag 'writeback-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux Pull trivial writeback fixes from Wu Fengguang: "They've been tested in linux-next for 20 days actually." * tag 'writeback-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux: writeback: Remove outdated comment fs: Remove bogus wait in write_inode_now()
2012-03-28Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates for 3.4 from Ted Ts'o: "Ext4 commits for 3.3 merge window; mostly cleanups and bug fixes The changes to export dirty_writeback_interval are from Artem's s_dirt cleanup patch series. The same is true of the change to remove the s_dirt helper functions which never got used by anyone in-tree. I've run these changes by Al Viro, and am carrying them so that Artem can more easily fix up the rest of the file systems during the next merge window. (Originally we had hopped to remove the use of s_dirt from ext4 during this merge window, but his patches had some bugs, so I ultimately ended dropping them from the ext4 tree.)" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (66 commits) vfs: remove unused superblock helpers mm: export dirty_writeback_interval ext4: remove useless s_dirt assignment ext4: write superblock only once on unmount ext4: do not mark superblock as dirty unnecessarily ext4: correct ext4_punch_hole return codes ext4: remove restrictive checks for EOFBLOCKS_FL ext4: always set then trimmed blocks count into len ext4: fix trimmed block count accunting ext4: fix start and len arguments handling in ext4_trim_fs() ext4: update s_free_{inodes,blocks}_count during online resize ext4: change some printk() calls to use ext4_msg() instead ext4: avoid output message interleaving in ext4_error_<foo>() ext4: remove trailing newlines from ext4_msg() and ext4_error() messages ext4: add no_printk argument validation, fix fallout ext4: remove redundant "EXT4-fs: " from uses of ext4_msg ext4: give more helpful error message in ext4_ext_rm_leaf() ext4: remove unused code from ext4_ext_map_blocks() ext4: rewrite punch hole to use ext4_ext_remove_space() jbd2: cleanup journal tail after transaction commit ...
2012-03-28Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph updates for 3.4-rc1 from Sage Weil: "Alex has been busy. There are a range of rbd and libceph cleanups, especially surrounding device setup and teardown, and a few critical fixes in that code. There are more cleanups in the messenger code, virtual xattrs, a fix for CRC calculation/checks, and lots of other miscellaneous stuff. There's a patch from Amon Ott to make inos behave a bit better on 32-bit boxes, some decode check fixes from Xi Wang, and network throttling fix from Jim Schutt, and a couple RBD fixes from Josh Durgin. No new functionality, just a lot of cleanup and bug fixing." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (65 commits) rbd: move snap_rwsem to the device, rename to header_rwsem ceph: fix three bugs, two in ceph_vxattrcb_file_layout() libceph: isolate kmap() call in write_partial_msg_pages() libceph: rename "page_shift" variable to something sensible libceph: get rid of zero_page_address libceph: only call kernel_sendpage() via helper libceph: use kernel_sendpage() for sending zeroes libceph: fix inverted crc option logic libceph: some simple changes libceph: small refactor in write_partial_kvec() libceph: do crc calculations outside loop libceph: separate CRC calculation from byte swapping libceph: use "do" in CRC-related Boolean variables ceph: ensure Boolean options support both senses libceph: a few small changes libceph: make ceph_tcp_connect() return int libceph: encapsulate some messenger cleanup code libceph: make ceph_msgr_wq private libceph: encapsulate connection kvec operations libceph: move prepare_write_banner() ...
2012-03-28Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext3, UDF, and quota fixes from Jan Kara: "A couple of ext3 & UDF fixes and also one improvement in quota locking." * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: ext3: fix start and len arguments handling in ext3_trim_fs() udf: Fix deadlock in udf_release_file() udf: Fix file entry logicalBlocksRecorded udf: Fix handling of i_blocks quota: Make quota code not call tty layer with dqptr_sem held udf: Init/maintain file entry checkpoint field ext3: Update ctime in ext3_splice_branch() only when needed ext3: Don't call dquot_free_block() if we don't update anything udf: Remove unnecessary OOM messages
2012-03-28Merge tag 'for-linus-3.4-merge-window' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull 9p changes for the 3.4 merge window from Eric Van Hensbergen. * tag 'for-linus-3.4-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: statfs should not override server f_type net/9p: handle flushed Tclunk/Tremove net/9p: don't allow Tflush to be interrupted
2012-03-28vfs: fix d_ancestor() case in d_materialize_uniqueMichel Lespinasse
In d_materialise_unique() there are 3 subcases to the 'aliased dentry' case; in two subcases the inode i_lock is properly released but this does not occur in the -ELOOP subcase. This seems to have been introduced by commit 1836750115f2 ("fix loop checks in d_materialise_unique()"). Signed-off-by: Michel Lespinasse <walken@google.com> Cc: stable@vger.kernel.org # v3.0+ [ Added a comment, and moved the unlock to where we generate the -ELOOP, which seems to be more natural. You probably can't actually trigger this without a buggy network file server - d_materialize_unique() is for finding aliases on non-local filesystems, and the d_ancestor() case is for a hardlinked directory loop. But we should be robust in the case of such buggy servers anyway. ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28nfsd: fix NULL pointer dereference in cld_pipe_downcallJeff Layton
If we find that "cup" is NULL in this case, then we obviously don't want to dereference it. What we really want to print in this case is the xid that we copied off earlier. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-28nfsd4: memory corruption in numeric_name_to_id()Dan Carpenter
"id" is type is a uid_t (32 bits) but on 64 bit systems strict_strtoul() modifies 64 bits of data. We should use kstrtouint() instead. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-27NFSv4: Minor cleanups for nfs4_handle_exception and nfs4_async_handle_errorTrond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-27NFSv4.1: Fix layoutcommit error handlingTrond Myklebust
Firstly, task->tk_status will always return negative error values, so the current tests for 'NFS4ERR_DELEG_REVOKED' etc. are all being ignored. Secondly, clean up the code so that we only need to test task->tk_status once! Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
2012-03-27NFSv4: Fix two infinite loops in the mount codeTrond Myklebust
We can currently loop forever in nfs4_lookup_root() and in nfs41_proc_secinfo_no_name(), if the first iteration returns a NFS4ERR_DELAY or something else that causes exception.retry to get set. Reported-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
2012-03-27Merge branch 'for-linus-3.4-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML changes from Richard Weinberger: "Mostly bug fixes and cleanups" * 'for-linus-3.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: (35 commits) um: Update defconfig um: Switch to large mcmodel on x86_64 MTD: Relax dependencies um: Wire CONFIG_GENERIC_IO up um: Serve io_remap_pfn_range() Introduce CONFIG_GENERIC_IO um: allow SUBARCH=x86 um: most of the SUBARCH uses can be killed um: deadlock in line_write_interrupt() um: don't bother trying to rebuild CHECKFLAGS for USER_OBJS um: use the right ifdef around exports in user_syms.c um: a bunch of headers can be killed by using generic-y um: ptrace-generic.h doesn't need user.h um: kill HOST_TASK_PID um: remove pointless include of asm/fixmap.h from asm/pgtable.h um: asm-offsets.h might as well come from underlying arch... um: merge processor_{32,64}.h a bit... um: switch close_chan() to struct line um: race fix: initialize delayed_work *before* registering IRQ um: line->have_irq is never checked... ...
2012-03-27xfs: add lots of attribute trace pointsDave Chinner
Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-03-27xfs: Fix oops on IO error during xlog_recover_process_iunlinks()Jan Kara
When an IO error happens during inode deletion run from xlog_recover_process_iunlinks() filesystem gets shutdown. Thus any subsequent attempt to read buffers fails. Code in xlog_recover_process_iunlinks() does not count with the fact that read of a buffer which was read a while ago can really fail which results in the oops on agi = XFS_BUF_TO_AGI(agibp); Fix the problem by cleaning up the buffer handling in xlog_recover_process_iunlinks() as suggested by Dave Chinner. We release buffer lock but keep buffer reference to AG buffer. That is enough for buffer to stay pinned in memory and we don't have to call xfs_read_agi() all the time. CC: stable@kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-03-27xfs: fix fstrim offset calculationsDave Chinner
xfs_ioc_fstrim() doesn't treat the incoming offset and length correctly. It treats them as a filesystem block address, rather than a disk address. This is wrong because the range passed in is a linear representation, while the filesystem block address notation is a sparse representation. Hence we cannot convert the range direct to filesystem block units and then use that for calculating the range to trim. While this sounds dangerous, the problem is limited to calculating what AGs need to be trimmed. The code that calcuates the actual ranges to trim gets the right result (i.e. only ever discards free space), even though it uses the wrong ranges to limit what is trimmed. Hence this is not a bug that endangers user data. Fix this by treating the range as a disk address range and use the appropriate functions to convert the range into the desired formats for calculations. Further, fix the first free extent lookup (the longest) to actually find the largest free extent. Currently this lookup uses a <= lookup, which results in finding the extent to the left of the largest because we can never get an exact match on the largest extent. This is due to the fact that while we know it's size, we don't know it's location and so the exact match fails and we move one record to the left to get the next largest extent. Instead, use a >= search so that the lookup returns the largest extent regardless of the fact we don't get an exact match on it. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-03-26xfs: Account log unmount transaction correctlyDave Chinner
There have been a few reports of this warning appearing recently: XFS (dm-4): xlog_space_left: head behind tail tail_cycle = 129, tail_bytes = 20163072 GH cycle = 129, GH bytes = 20162880 The common cause appears to be lots of freeze and unfreeze cycles, and the output from the warnings indicates that we are leaking around 8 bytes of log space per freeze/unfreeze cycle. When we freeze the filesystem, we write an unmount record and that uses xlog_write directly - a special type of transaction, effectively. What it doesn't do, however, is correctly account for the log space it uses. The unmount record writes an 8 byte structure with a special magic number into the log, and the space this consumes is not accounted for in the log ticket tracking the operation. Hence we leak 8 bytes every unmount record that is written. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-03-26xfs: don't cache inodes read through bulkstatDave Chinner
When we read inodes via bulkstat, we generally only read them once and then throw them away - they never get used again. If we retain them in cache, then it simply causes the working set of inodes and other cached items to be reclaimed just so the inode cache can grow. Avoid this problem by marking inodes read by bulkstat not to be cached and check this flag in .drop_inode to determine whether the inode should be added to the VFS LRU or not. If the inode lookup hits an already cached inode, then don't set the flag. If the inode lookup hits an inode marked with no cache flag, remove the flag and allow it to be cached once the current reference goes away. Inodes marked as not cached will get cleaned up by the background inode reclaim or via memory pressure, so they will still generate some short term cache pressure. They will, however, be reclaimed much sooner and in preference to cache hot inodes. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-03-26xfs: trace xfs_name strings correctlyChristoph Hellwig
Strings store in an xfs_name structure are often not NUL terminated, print them using the correct printf specifiers that make use of the string length store in the xfs_name structure. Reported-by: Brian Candler <B.Candler@pobox.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
2012-03-26nfsd4: allow numeric idmappingJ. Bruce Fields
Mimic the client side by providing a module parameter that turns off idmapping in the auth_sys case, for backwards compatibility with NFSv2 and NFSv3. Unlike in the client case, we don't have any way to negotiate, since the client can return an error to us if it doesn't like the id that we return to it in (for example) a getattr call. However, it has always been possible for servers to return numeric id's, and as far as we're aware clients have always been able to handle them. Also, in the auth_sys case clients already need to have numeric id's the same between client and server. Therefore we believe it's safe to default this to on; but the module parameter is available to return to previous behavior if this proves to be a problem in some unexpected setup. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-26nfsd: don't allow legacy client tracker init for anything but init_netJeff Layton
This code isn't set up for containers, so don't allow it to be used for anything but init_net. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-26nfsd: add notifier to handle mount/unmount of rpc_pipefs sbJeff Layton
In the event that rpc_pipefs isn't mounted when nfsd starts, we must register a notifier to handle creating the dentry once it is mounted, and to remove the dentry on unmount. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-26nfsd: add the infrastructure to handle the cld upcallJeff Layton
...and add a mechanism for switching between the "legacy" tracker and the new one. The decision is made by looking to see whether the v4recoverydir exists. If it does, then the legacy client tracker is used. If it's not, then the kernel will create a "cld" pipe in rpc_pipefs. That pipe is used to talk to a daemon for handling the upcall. Most of the data structures for the new client tracker are handled on a per-namespace basis, so this upcall should be essentially ready for containerization. For now however, nfsd just starts it by calling the initialization and exit functions for init_net. I'm making the assumption that at some point in the future we'll be able to determine the net namespace from the nfs4_client. Until then, this patch hardcodes init_net in those places. I've sprinkled some "FIXME" comments around that code to attempt to make it clear where we'll need to fix that up later. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-26nfsd: add a per-net-namespace struct for nfsdJeff Layton
Eventually, we'll need this when nfsd gets containerized fully. For now, create a struct on a per-net-namespace basis that will just hold a pointer to the cld_net structure. That struct will hold all of the per-net data that we need for the cld tracker. Eventually we can add other pernet objects to struct nfsd_net. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-26nfsd: add nfsd4_client_tracking_ops struct and a way to set itJeff Layton
Abstract out the mechanism that we use to track clients into a set of client name tracking functions. This gives us a mechanism to plug in a new set of client tracking functions without disturbing the callers. It also gives us a way to decide on what tracking scheme to use at runtime. For now, this just looks like pointless abstraction, but later we'll add a new alternate scheme for tracking clients on stable storage. Note too that this patch anticipates the eventual containerization of this code by passing in struct net pointers in places. No attempt is made to containerize the legacy client tracker however. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-26nfsd: convert nfs4_client->cl_cb_flags to a generic flags fieldJeff Layton
We'll need a way to flag the nfs4_client as already being recorded on stable storage so that we don't continually upcall. Currently, that's recorded in the cl_firststate field of the client struct. Using an entire u32 to store a flag is rather wasteful though. The cl_cb_flags field is only using 2 bits right now, so repurpose that to a generic flags field. Rename NFSD4_CLIENT_KILL to NFSD4_CLIENT_CB_KILL to make it evident that it's part of the callback flags. Add a mask that we can use for existing checks that look to see whether any flags are set, so that the new flags don't interfere. Convert all references to cl_firstate to the NFSD4_CLIENT_STABLE flag, and add a new NFSD4_CLIENT_RECLAIM_COMPLETE flag. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-26Merge nfs containerization work from Trond's treeJ. Bruce Fields
The nfs containerization work is a prerequisite for Jeff Layton's reboot recovery rework.
2012-03-25uml/hostfs: Propagate dirent.d_type to filldir()Geert Uytterhoeven
Currently the (optional) d_type member in struct dirent is always DT_UNKNOWN on hostfs, which may confuse buggy software using readdir(). Make sure to propagate its value from the underlying filesystem if it's available there. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Richard Weinberger <richard@nod.at>
2012-03-24NFS4.1: remove duplicate variable declaration in filelayout_clear_request_commitFred Isaman
inode is declared twice for no good reason Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-24Fix length of buffer copied in __nfs4_get_acl_uncachedSachin Prabhu
_copy_from_pages() used to copy data from the temporary buffer to the user passed buffer is passed the wrong size parameter when copying data. res.acl_len contains both the bitmap and acl lenghts while acl_len contains the acl length after adjusting for the bitmap size. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-24Merge tag 'module-for-3.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux Pull cleanup of fs/ and lib/ users of module.h from Paul Gortmaker: "Fix up files in fs/ and lib/ dirs to only use module.h if they really need it. These are trivial in scope vs the work done previously. We now have things where any few remaining cleanups can be farmed out to arch or subsystem maintainers, and I have done so when possible. What is remaining here represents the bits that don't clearly lie within a single arch/subsystem boundary, like the fs dir and the lib dir. Some duplicate includes arising from overlapping fixes from independent subsystem maintainer submissions are also quashed." Fix up trivial conflicts due to clashes with other include file cleanups (including some due to the previous bug.h cleanup pull). * tag 'module-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: lib: reduce the use of module.h wherever possible fs: reduce the use of module.h wherever possible includecheck: delete any duplicate instances of module.h
2012-03-24Merge tag 'bug-for-3.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux Pull <linux/bug.h> cleanup from Paul Gortmaker: "The changes shown here are to unify linux's BUG support under the one <linux/bug.h> file. Due to historical reasons, we have some BUG code in bug.h and some in kernel.h -- i.e. the support for BUILD_BUG in linux/kernel.h predates the addition of linux/bug.h, but old code in kernel.h wasn't moved to bug.h at that time. As a band-aid, kernel.h was including <asm/bug.h> to pseudo link them. This has caused confusion[1] and general yuck/WTF[2] reactions. Here is an example that violates the principle of least surprise: CC lib/string.o lib/string.c: In function 'strlcat': lib/string.c:225:2: error: implicit declaration of function 'BUILD_BUG_ON' make[2]: *** [lib/string.o] Error 1 $ $ grep linux/bug.h lib/string.c #include <linux/bug.h> $ We've included <linux/bug.h> for the BUG infrastructure and yet we still get a compile fail! [We've not kernel.h for BUILD_BUG_ON.] Ugh - very confusing for someone who is new to kernel development. With the above in mind, the goals of this changeset are: 1) find and fix any include/*.h files that were relying on the implicit presence of BUG code. 2) find and fix any C files that were consuming kernel.h and hence relying on implicitly getting some/all BUG code. 3) Move the BUG related code living in kernel.h to <linux/bug.h> 4) remove the asm/bug.h from kernel.h to finally break the chain. During development, the order was more like 3-4, build-test, 1-2. But to ensure that git history for bisect doesn't get needless build failures introduced, the commits have been reorderd to fix the problem areas in advance. [1] https://lkml.org/lkml/2012/1/3/90 [2] https://lkml.org/lkml/2012/1/17/414" Fix up conflicts (new radeon file, reiserfs header cleanups) as per Paul and linux-next. * tag 'bug-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: kernel.h: doesn't explicitly use bug.h, so don't include it. bug: consolidate BUILD_BUG_ON with other bug code BUG: headers with BUG/BUG_ON etc. need linux/bug.h bug.h: add include of it to various implicit C users lib: fix implicit users of kernel.h for TAINT_WARN spinlock: macroize assert_spin_locked to avoid bug.h dependency x86: relocate get/set debugreg fcns to include/asm/debugreg.