summaryrefslogtreecommitdiffstats
path: root/Documentation/filesystems
AgeCommit message (Collapse)Author
2012-03-23Merge tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates for Linux 3.4 from Trond Myklebust: "New features include: - Add NFS client support for containers. This should enable most of the necessary functionality, including lockd support, and support for rpc.statd, NFSv4 idmapper and RPCSEC_GSS upcalls into the correct network namespace from which the mount system call was issued. - NFSv4 idmapper scalability improvements Base the idmapper cache on the keyring interface to allow concurrent access to idmapper entries. Start the process of migrating users from the single-threaded daemon-based approach to the multi-threaded request-key based approach. - NFSv4.1 implementation id. Allows the NFSv4.1 client and server to mutually identify each other for logging and debugging purposes. - Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of having to use the more counterintuitive 'vers=4,minorversion=1'. - SUNRPC tracepoints. Start the process of adding tracepoints in order to improve debugging of the RPC layer. - pNFS object layout support for autologin. Important bugfixes include: - Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to fail to wake up all tasks when applied to priority waitqueues. - Ensure that we handle read delegations correctly, when we try to truncate a file. - A number of fixes for NFSv4 state manager loops (mostly to do with delegation recovery)." * tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (224 commits) NFS: fix sb->s_id in nfs debug prints xprtrdma: Remove assumption that each segment is <= PAGE_SIZE xprtrdma: The transport should not bug-check when a dup reply is received pnfs-obj: autologin: Add support for protocol autologin NFS: Remove nfs4_setup_sequence from generic rename code NFS: Remove nfs4_setup_sequence from generic unlink code NFS: Remove nfs4_setup_sequence from generic read code NFS: Remove nfs4_setup_sequence from generic write code NFS: Fix more NFS debug related build warnings SUNRPC/LOCKD: Fix build warnings when CONFIG_SUNRPC_DEBUG is undefined nfs: non void functions must return a value SUNRPC: Kill compiler warning when RPC_DEBUG is unset SUNRPC/NFS: Add Kbuild dependencies for NFS_DEBUG/RPC_DEBUG NFS: Use cond_resched_lock() to reduce latencies in the commit scans NFSv4: It is not safe to dereference lsp->ls_state in release_lockowner NFS: ncommit count is being double decremented SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up() Try using machine credentials for RENEW calls NFSv4.1: Fix a few issues in filelayout_commit_pagelist NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit code ...
2012-03-22Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds
Merge first batch of patches from Andrew Morton: "A few misc things and all the MM queue" * emailed from Andrew Morton <akpm@linux-foundation.org>: (92 commits) memcg: avoid THP split in task migration thp: add HPAGE_PMD_* definitions for !CONFIG_TRANSPARENT_HUGEPAGE memcg: clean up existing move charge code mm/memcontrol.c: remove unnecessary 'break' in mem_cgroup_read() mm/memcontrol.c: remove redundant BUG_ON() in mem_cgroup_usage_unregister_event() mm/memcontrol.c: s/stealed/stolen/ memcg: fix performance of mem_cgroup_begin_update_page_stat() memcg: remove PCG_FILE_MAPPED memcg: use new logic for page stat accounting memcg: remove PCG_MOVE_LOCK flag from page_cgroup memcg: simplify move_account() check memcg: remove EXPORT_SYMBOL(mem_cgroup_update_page_stat) memcg: kill dead prev_priority stubs memcg: remove PCG_CACHE page_cgroup flag memcg: let css_get_next() rely upon rcu_read_lock() cgroup: revert ss_id_lock to spinlock idr: make idr_get_next() good for rcu_read_lock() memcg: remove unnecessary thp check in page stat accounting memcg: remove redundant returns memcg: enum lru_list lru ...
2012-03-21procfs: mark thread stack correctly in proc/<pid>/mapsSiddhesh Poyarekar
Stack for a new thread is mapped by userspace code and passed via sys_clone. This memory is currently seen as anonymous in /proc/<pid>/maps, which makes it difficult to ascertain which mappings are being used for thread stacks. This patch uses the individual task stack pointers to determine which vmas are actually thread stacks. For a multithreaded program like the following: #include <pthread.h> void *thread_main(void *foo) { while(1); } int main() { pthread_t t; pthread_create(&t, NULL, thread_main, NULL); pthread_join(t, NULL); } proc/PID/maps looks like the following: 00400000-00401000 r-xp 00000000 fd:0a 3671804 /home/siddhesh/a.out 00600000-00601000 rw-p 00000000 fd:0a 3671804 /home/siddhesh/a.out 019ef000-01a10000 rw-p 00000000 00:00 0 [heap] 7f8a44491000-7f8a44492000 ---p 00000000 00:00 0 7f8a44492000-7f8a44c92000 rw-p 00000000 00:00 0 7f8a44c92000-7f8a44e3d000 r-xp 00000000 fd:00 2097482 /lib64/libc-2.14.90.so 7f8a44e3d000-7f8a4503d000 ---p 001ab000 fd:00 2097482 /lib64/libc-2.14.90.so 7f8a4503d000-7f8a45041000 r--p 001ab000 fd:00 2097482 /lib64/libc-2.14.90.so 7f8a45041000-7f8a45043000 rw-p 001af000 fd:00 2097482 /lib64/libc-2.14.90.so 7f8a45043000-7f8a45048000 rw-p 00000000 00:00 0 7f8a45048000-7f8a4505f000 r-xp 00000000 fd:00 2099938 /lib64/libpthread-2.14.90.so 7f8a4505f000-7f8a4525e000 ---p 00017000 fd:00 2099938 /lib64/libpthread-2.14.90.so 7f8a4525e000-7f8a4525f000 r--p 00016000 fd:00 2099938 /lib64/libpthread-2.14.90.so 7f8a4525f000-7f8a45260000 rw-p 00017000 fd:00 2099938 /lib64/libpthread-2.14.90.so 7f8a45260000-7f8a45264000 rw-p 00000000 00:00 0 7f8a45264000-7f8a45286000 r-xp 00000000 fd:00 2097348 /lib64/ld-2.14.90.so 7f8a45457000-7f8a4545a000 rw-p 00000000 00:00 0 7f8a45484000-7f8a45485000 rw-p 00000000 00:00 0 7f8a45485000-7f8a45486000 r--p 00021000 fd:00 2097348 /lib64/ld-2.14.90.so 7f8a45486000-7f8a45487000 rw-p 00022000 fd:00 2097348 /lib64/ld-2.14.90.so 7f8a45487000-7f8a45488000 rw-p 00000000 00:00 0 7fff6273b000-7fff6275c000 rw-p 00000000 00:00 0 [stack] 7fff627ff000-7fff62800000 r-xp 00000000 00:00 0 [vdso] ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall] Here, one could guess that 7f8a44492000-7f8a44c92000 is a stack since the earlier vma that has no permissions (7f8a44e3d000-7f8a4503d000) but that is not always a reliable way to find out which vma is a thread stack. Also, /proc/PID/maps and /proc/PID/task/TID/maps has the same content. With this patch in place, /proc/PID/task/TID/maps are treated as 'maps as the task would see it' and hence, only the vma that that task uses as stack is marked as [stack]. All other 'stack' vmas are marked as anonymous memory. /proc/PID/maps acts as a thread group level view, where all thread stack vmas are marked as [stack:TID] where TID is the process ID of the task that uses that vma as stack, while the process stack is marked as [stack]. So /proc/PID/maps will look like this: 00400000-00401000 r-xp 00000000 fd:0a 3671804 /home/siddhesh/a.out 00600000-00601000 rw-p 00000000 fd:0a 3671804 /home/siddhesh/a.out 019ef000-01a10000 rw-p 00000000 00:00 0 [heap] 7f8a44491000-7f8a44492000 ---p 00000000 00:00 0 7f8a44492000-7f8a44c92000 rw-p 00000000 00:00 0 [stack:1442] 7f8a44c92000-7f8a44e3d000 r-xp 00000000 fd:00 2097482 /lib64/libc-2.14.90.so 7f8a44e3d000-7f8a4503d000 ---p 001ab000 fd:00 2097482 /lib64/libc-2.14.90.so 7f8a4503d000-7f8a45041000 r--p 001ab000 fd:00 2097482 /lib64/libc-2.14.90.so 7f8a45041000-7f8a45043000 rw-p 001af000 fd:00 2097482 /lib64/libc-2.14.90.so 7f8a45043000-7f8a45048000 rw-p 00000000 00:00 0 7f8a45048000-7f8a4505f000 r-xp 00000000 fd:00 2099938 /lib64/libpthread-2.14.90.so 7f8a4505f000-7f8a4525e000 ---p 00017000 fd:00 2099938 /lib64/libpthread-2.14.90.so 7f8a4525e000-7f8a4525f000 r--p 00016000 fd:00 2099938 /lib64/libpthread-2.14.90.so 7f8a4525f000-7f8a45260000 rw-p 00017000 fd:00 2099938 /lib64/libpthread-2.14.90.so 7f8a45260000-7f8a45264000 rw-p 00000000 00:00 0 7f8a45264000-7f8a45286000 r-xp 00000000 fd:00 2097348 /lib64/ld-2.14.90.so 7f8a45457000-7f8a4545a000 rw-p 00000000 00:00 0 7f8a45484000-7f8a45485000 rw-p 00000000 00:00 0 7f8a45485000-7f8a45486000 r--p 00021000 fd:00 2097348 /lib64/ld-2.14.90.so 7f8a45486000-7f8a45487000 rw-p 00022000 fd:00 2097348 /lib64/ld-2.14.90.so 7f8a45487000-7f8a45488000 rw-p 00000000 00:00 0 7fff6273b000-7fff6275c000 rw-p 00000000 00:00 0 [stack] 7fff627ff000-7fff62800000 r-xp 00000000 00:00 0 [vdso] ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall] Thus marking all vmas that are used as stacks by the threads in the thread group along with the process stack. The task level maps will however like this: 00400000-00401000 r-xp 00000000 fd:0a 3671804 /home/siddhesh/a.out 00600000-00601000 rw-p 00000000 fd:0a 3671804 /home/siddhesh/a.out 019ef000-01a10000 rw-p 00000000 00:00 0 [heap] 7f8a44491000-7f8a44492000 ---p 00000000 00:00 0 7f8a44492000-7f8a44c92000 rw-p 00000000 00:00 0 [stack] 7f8a44c92000-7f8a44e3d000 r-xp 00000000 fd:00 2097482 /lib64/libc-2.14.90.so 7f8a44e3d000-7f8a4503d000 ---p 001ab000 fd:00 2097482 /lib64/libc-2.14.90.so 7f8a4503d000-7f8a45041000 r--p 001ab000 fd:00 2097482 /lib64/libc-2.14.90.so 7f8a45041000-7f8a45043000 rw-p 001af000 fd:00 2097482 /lib64/libc-2.14.90.so 7f8a45043000-7f8a45048000 rw-p 00000000 00:00 0 7f8a45048000-7f8a4505f000 r-xp 00000000 fd:00 2099938 /lib64/libpthread-2.14.90.so 7f8a4505f000-7f8a4525e000 ---p 00017000 fd:00 2099938 /lib64/libpthread-2.14.90.so 7f8a4525e000-7f8a4525f000 r--p 00016000 fd:00 2099938 /lib64/libpthread-2.14.90.so 7f8a4525f000-7f8a45260000 rw-p 00017000 fd:00 2099938 /lib64/libpthread-2.14.90.so 7f8a45260000-7f8a45264000 rw-p 00000000 00:00 0 7f8a45264000-7f8a45286000 r-xp 00000000 fd:00 2097348 /lib64/ld-2.14.90.so 7f8a45457000-7f8a4545a000 rw-p 00000000 00:00 0 7f8a45484000-7f8a45485000 rw-p 00000000 00:00 0 7f8a45485000-7f8a45486000 r--p 00021000 fd:00 2097348 /lib64/ld-2.14.90.so 7f8a45486000-7f8a45487000 rw-p 00022000 fd:00 2097348 /lib64/ld-2.14.90.so 7f8a45487000-7f8a45488000 rw-p 00000000 00:00 0 7fff6273b000-7fff6275c000 rw-p 00000000 00:00 0 7fff627ff000-7fff62800000 r-xp 00000000 00:00 0 [vdso] ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall] where only the vma that is being used as a stack by *that* task is marked as [stack]. Analogous changes have been made to /proc/PID/smaps, /proc/PID/numa_maps, /proc/PID/task/TID/smaps and /proc/PID/task/TID/numa_maps. Relevant snippets from smaps and numa_maps: [siddhesh@localhost ~ ]$ pgrep a.out 1441 [siddhesh@localhost ~ ]$ cat /proc/1441/smaps | grep "\[stack" 7f8a44492000-7f8a44c92000 rw-p 00000000 00:00 0 [stack:1442] 7fff6273b000-7fff6275c000 rw-p 00000000 00:00 0 [stack] [siddhesh@localhost ~ ]$ cat /proc/1441/task/1442/smaps | grep "\[stack" 7f8a44492000-7f8a44c92000 rw-p 00000000 00:00 0 [stack] [siddhesh@localhost ~ ]$ cat /proc/1441/task/1441/smaps | grep "\[stack" 7fff6273b000-7fff6275c000 rw-p 00000000 00:00 0 [stack] [siddhesh@localhost ~ ]$ cat /proc/1441/numa_maps | grep "stack" 7f8a44492000 default stack:1442 anon=2 dirty=2 N0=2 7fff6273a000 default stack anon=3 dirty=3 N0=3 [siddhesh@localhost ~ ]$ cat /proc/1441/task/1442/numa_maps | grep "stack" 7f8a44492000 default stack anon=2 dirty=2 N0=2 [siddhesh@localhost ~ ]$ cat /proc/1441/task/1441/numa_maps | grep "stack" 7fff6273a000 default stack anon=3 dirty=3 N0=3 [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix build] Signed-off-by: Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jamie Lokier <jamie@shareable.org> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Matt Mackall <mpm@selenic.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-21Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile 1 from Al Viro: "This is _not_ all; in particular, Miklos' and Jan's stuff is not there yet." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (64 commits) ext4: initialization of ext4_li_mtx needs to be done earlier debugfs-related mode_t whack-a-mole hfsplus: add an ioctl to bless files hfsplus: change finder_info to u32 hfsplus: initialise userflags qnx4: new helper - try_extent() qnx4: get rid of qnx4_bread/qnx4_getblk take removal of PF_FORKNOEXEC to flush_old_exec() trim includes in inode.c um: uml_dup_mmap() relies on ->mmap_sem being held, but activate_mm() doesn't hold it um: embed ->stub_pages[] into mmu_context gadgetfs: list_for_each_safe() misuse ocfs2: fix leaks on failure exits in module_init ecryptfs: make register_filesystem() the last potential failure exit ntfs: forgets to unregister sysctls on register_filesystem() failure logfs: missing cleanup on register_filesystem() failure jfs: mising cleanup on register_filesystem() failure make configfs_pin_fs() return root dentry on success configfs: configfs_create_dir() has parent dentry in dentry->d_parent configfs: sanitize configfs_create() ...
2012-03-21pnfs-obj: autologin: Add support for protocol autologinSachin Bhamare
The pnfs-objects protocol mandates that we autologin into devices not present in the system, according to information specified in the get_device_info returned from the server. The Protocol specifies two login hints. 1. An IP address:port combination 2. A string URI which is constructed as a URL with a protocol prefix followed by :// and a string as address. For each protocol prefix the string-address format might be different. We only support the second option. The first option is just redundant to the second one. NOTE: The Kernel part of autologin does not parse the URI string. It just channels it to a user-mode script. So any new login protocols should only update the user-mode script which is a part of the nfs-utils package, but the Kernel need not change. We implement the autologin by using the call_usermodehelper() API. (Thanks to Steve Dickson <steved@redhat.com> for pointing it out) So there is no running daemon needed, and/or special setup. We Add the osd_login_prog Kernel module parameters which defaults to: /sbin/osd_login Kernel try's to upcall the program specified in osd_login_prog. If the file is not found or the execution fails Kernel will disable any farther upcalls, by zeroing out osd_login_prog, Until Admin re-enables it by setting the osd_login_prog parameter to a proper program. Also add text about the osd_login program command line API to: Documentation/filesystems/nfs/pnfs.txt and documentation of the new osd_login_prog module parameter to: Documentation/kernel-parameters.txt TODO: Add timeout option in the case osd_login program gets stuck Signed-off-by: Sachin Bhamare <sbhamare@panasas.com> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-20Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree from Jiri Kosina: "It's indeed trivial -- mostly documentation updates and a bunch of typo fixes from Masanari. There are also several linux/version.h include removals from Jesper." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (101 commits) kcore: fix spelling in read_kcore() comment constify struct pci_dev * in obvious cases Revert "char: Fix typo in viotape.c" init: fix wording error in mm_init comment usb: gadget: Kconfig: fix typo for 'different' Revert "power, max8998: Include linux/module.h just once in drivers/power/max8998_charger.c" writeback: fix fn name in writeback_inodes_sb_nr_if_idle() comment header writeback: fix typo in the writeback_control comment Documentation: Fix multiple typo in Documentation tpm_tis: fix tis_lock with respect to RCU Revert "media: Fix typo in mixer_drv.c and hdmi_drv.c" Doc: Update numastat.txt qla4xxx: Add missing spaces to error messages compiler.h: Fix typo security: struct security_operations kerneldoc fix Documentation: broken URL in libata.tmpl Documentation: broken URL in filesystems.tmpl mtd: simplify return logic in do_map_probe() mm: fix comment typo of truncate_inode_pages_range power: bq27x00: Fix typos in comment ...
2012-03-20debugfs-related mode_t whack-a-moleAl Viro
all of those should be umode_t... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20fs: initial qnx6fs additionKai Bankett
Adds support for qnx6fs readonly support to the linux kernel. * Mount option The option mmi_fs can be used to mount Harman Becker/Audi MMI 3G HDD qnx6fs filesystems. * Documentation A high level filesystem stucture description can be found in the Documentation/filesystems directory. (qnx6.txt) * Additional features - Active (stable) superblock selection - Superblock checksum check (enforced) - Supports mount of qnx6 filesystems with to host different endianess - Automatic endianess detection - Longfilename support (with non-enfocing crc check) - All blocksizes (512, 1024, 2048 and 4096 supported) Signed-off-by: Kai Bankett <chaosman@ontika.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20vfs: d_alloc_root() goneAl Viro
all callers converted to d_make_root() by now Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-07Documentation: Fix multiple typo in DocumentationMasanari Iida
Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-02-21Documentation: Fix Broken URL "freshmeat"Masanari Iida
Fix broken URL in documents "freshmeat" to "freecode" in Documentation/filesystems/ramfs-rootfs-initramfs.txt, Documentation/scsi/scsi-generic.txt Documentation/usb/mtouchusb.txt Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-02-06NFS: Update idmapper documentationBryan Schumaker
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-26debugfs: add mode, uid and gid optionsLudwig Nussel
Cautious admins may want to restrict access to debugfs. Currently a manual chown/chmod e.g. in an init script is needed to achieve that. Distributions that want to make the mount options configurable need to add extra config files. By allowing to set the root inode's uid, gid and mode via mount options no such hacks are needed anymore. Instead configuration becomes straight forward via fstab. Signed-off-by: Ludwig Nussel <ludwig.nussel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-14Merge branch 'for-3.3' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
* 'for-3.3' of git://linux-nfs.org/~bfields/linux: (31 commits) nfsd4: nfsd4_create_clid_dir return value is unused NFSD: Change name of extended attribute containing junction svcrpc: don't revert to SVC_POOL_DEFAULT on nfsd shutdown svcrpc: fix double-free on shutdown of nfsd after changing pool mode nfsd4: be forgiving in the absence of the recovery directory nfsd4: fix spurious 4.1 post-reboot failures NFSD: forget_delegations should use list_for_each_entry_safe NFSD: Only reinitilize the recall_lru list under the recall lock nfsd4: initialize special stateid's at compile time NFSd: use network-namespace-aware cache registering routines SUNRPC: create svc_xprt in proper network namespace svcrpc: update outdated BKL comment nfsd41: allow non-reclaim open-by-fh's in 4.1 svcrpc: avoid memory-corruption on pool shutdown svcrpc: destroy server sockets all at once svcrpc: make svc_delete_xprt static nfsd: Fix oops when parsing a 0 length export nfsd4: Use kmemdup rather than duplicating its implementation nfsd4: add a separate (lockowner, inode) lookup nfsd4: fix CONFIG_NFSD_FAULT_INJECTION compile error ...
2012-01-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-nextLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next: Squashfs: fix i_blocks calculation with extended regular files Squashfs: fix mount time sanity check for corrupted superblock Squashfs: optimise squashfs_cache_get entry search Squashfs: Update documentation to include xattrs Squashfs: add missing block release on error condition
2012-01-13Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: ensure prealloc_blob is in place when removing xattr rbd: initialize snap_rwsem in rbd_add() ceph: enable/disable dentry complete flags via mount option vfs: export symbol d_find_any_alias() ceph: always initialize the dentry in open_root_dentry() libceph: remove useless return value for osd_client __send_request() ceph: avoid iput() while holding spinlock in ceph_dir_fsync ceph: avoid useless dget/dput in encode_fh ceph: dereference pointer after checking for NULL crush: fix force for non-root TAKE ceph: remove unnecessary d_fsdata conditional checks ceph: Use kmemdup rather than duplicating its implementation Fix up conflicts in fs/ceph/super.c (d_alloc_root() failure handling vs always initialize the dentry in open_root_dentry)
2012-01-12c/r: procfs: add start_data, end_data, start_brk members to /proc/$pid/stat v4Cyrill Gorcunov
The mm->start_code/end_code, mm->start_data/end_data, mm->start_brk are involved into calculation of program text/data segment sizes (which might be seen in /proc/<pid>/statm) and into brk() call final address. For restore we need to know all these values. While mm->start_code/end_code already present in /proc/$pid/stat, the rest members are not, so this patch brings them in. The restore procedure of these members is addressed in another patch using prctl(). Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Vagin <avagin@openvz.org> Cc: Vasiliy Kulikov <segoon@openwall.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-12ceph: enable/disable dentry complete flags via mount optionSage Weil
Enable/disable use of the dentry dir 'complete' flag via a mount option. This lets the admin control whether ceph uses the dcache to satisfy negative lookups or readdir when it has the entire directory contents in its cache. This is purely a performance optimization; correctness is guaranteed whether it is enabled or not. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sage Weil <sage@newdream.net>
2012-01-10Merge branch 'akpm' (aka "Andrew's patch-bomb")Linus Torvalds
Andrew elucidates: - First installmeant of MM. We have a HUGE number of MM patches this time. It's crazy. - MAINTAINERS updates - backlight updates - leds - checkpatch updates - misc ELF stuff - rtc updates - reiserfs - procfs - some misc other bits * akpm: (124 commits) user namespace: make signal.c respect user namespaces workqueue: make alloc_workqueue() take printf fmt and args for name procfs: add hidepid= and gid= mount options procfs: parse mount options procfs: introduce the /proc/<pid>/map_files/ directory procfs: make proc_get_link to use dentry instead of inode signal: add block_sigmask() for adding sigmask to current->blocked sparc: make SA_NOMASK a synonym of SA_NODEFER reiserfs: don't lock root inode searching reiserfs: don't lock journal_init() reiserfs: delay reiserfs lock until journal initialization reiserfs: delete comments referring to the BKL drivers/rtc/interface.c: fix alarm rollover when day or month is out-of-range drivers/rtc/rtc-twl.c: add DT support for RTC inside twl4030/twl6030 drivers/rtc/: remove redundant spi driver bus initialization drivers/rtc/rtc-jz4740.c: make jz4740_rtc_driver static drivers/rtc/rtc-mc13xxx.c: make mc13xxx_rtc_idtable static rtc: convert drivers/rtc/* to use module_platform_driver() drivers/rtc/rtc-wm831x.c: convert to devm_kzalloc() drivers/rtc/rtc-wm831x.c: remove unused period IRQ handler ...
2012-01-10procfs: add hidepid= and gid= mount optionsVasiliy Kulikov
Add support for mount options to restrict access to /proc/PID/ directories. The default backward-compatible "relaxed" behaviour is left untouched. The first mount option is called "hidepid" and its value defines how much info about processes we want to be available for non-owners: hidepid=0 (default) means the old behavior - anybody may read all world-readable /proc/PID/* files. hidepid=1 means users may not access any /proc/<pid>/ directories, but their own. Sensitive files like cmdline, sched*, status are now protected against other users. As permission checking done in proc_pid_permission() and files' permissions are left untouched, programs expecting specific files' modes are not confused. hidepid=2 means hidepid=1 plus all /proc/PID/ will be invisible to other users. It doesn't mean that it hides whether a process exists (it can be learned by other means, e.g. by kill -0 $PID), but it hides process' euid and egid. It compicates intruder's task of gathering info about running processes, whether some daemon runs with elevated privileges, whether another user runs some sensitive program, whether other users run any program at all, etc. gid=XXX defines a group that will be able to gather all processes' info (as in hidepid=0 mode). This group should be used instead of putting nonroot user in sudoers file or something. However, untrusted users (like daemons, etc.) which are not supposed to monitor the tasks in the whole system should not be added to the group. hidepid=1 or higher is designed to restrict access to procfs files, which might reveal some sensitive private information like precise keystrokes timings: http://www.openwall.com/lists/oss-security/2011/11/05/3 hidepid=1/2 doesn't break monitoring userspace tools. ps, top, pgrep, and conky gracefully handle EPERM/ENOENT and behave as if the current user is the only user running processes. pstree shows the process subtree which contains "pstree" process. Note: the patch doesn't deal with setuid/setgid issues of keeping preopened descriptors of procfs files (like https://lkml.org/lkml/2011/2/7/368). We rely on that the leaked information like the scheduling counters of setuid apps doesn't threaten anybody's privacy - only the user started the setuid program may read the counters. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Greg KH <greg@kroah.com> Cc: Theodore Tso <tytso@MIT.EDU> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: James Morris <jmorris@namei.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-10Merge branch 'for_linus' into for_linus_mergedTheodore Ts'o
Conflicts: fs/ext4/ioctl.c
2012-01-08Merge branch 'for-linus2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits) reiserfs: Properly display mount options in /proc/mounts vfs: prevent remount read-only if pending removes vfs: count unlinked inodes vfs: protect remounting superblock read-only vfs: keep list of mounts for each superblock vfs: switch ->show_options() to struct dentry * vfs: switch ->show_path() to struct dentry * vfs: switch ->show_devname() to struct dentry * vfs: switch ->show_stats to struct dentry * switch security_path_chmod() to struct path * vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb vfs: trim includes a bit switch mnt_namespace ->root to struct mount vfs: take /proc/*/mounts and friends to fs/proc_namespace.c vfs: opencode mntget() mnt_set_mountpoint() vfs: spread struct mount - remaining argument of next_mnt() vfs: move fsnotify junk to struct mount vfs: move mnt_devname vfs: move mnt_list to struct mount vfs: switch pnode.h macros to struct mount * ...
2012-01-06vfs: switch ->show_options() to struct dentry *Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-06Merge branch 'driver-core-next' into Linux 3.2Greg Kroah-Hartman
This resolves the conflict in the arch/arm/mach-s3c64xx/s3c6400.c file, and it fixes the build error in the arch/x86/kernel/microcode_core.c file, that the merge did not catch. The microcode_core.c patch was provided by Stephen Rothwell <sfr@canb.auug.org.au> who was invaluable in the merge issues involved with the large sysdev removal process in the driver-core tree. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-04ext4: add new online resize interfaceYongqiang Yang
This patch adds new online resize interface, whose input argument is a 64-bit integer indicating how many blocks there are in the resized fs. In new resize impelmentation, all work like allocating group tables are done by kernel side, so the new resize interface can support flex_bg feature and prepares ground for suppoting resize with features like bigalloc and exclude bitmap. Besides these, user-space tools just passes in the new number of blocks. We delay initializing the bitmaps and inode tables of added groups if possible and add multi groups (a flex groups) each time, so new resize is very fast like mkfs. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-01-03sysfs: propagate umode_tAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03configfs: convert to umode_tAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03switch debugfs to umode_tAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03switch ->mknod() to umode_tAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03switch ->create() to umode_tAl Viro
vfs_create() ignores everything outside of 16bit subset of its mode argument; switching it to umode_t is obviously equivalent and it's the only caller of the method Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03switch vfs_mkdir() and ->mkdir() to umode_tAl Viro
vfs_mkdir() gets int, but immediately drops everything that might not fit into umode_t and that's the only caller of ->mkdir()... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-12-30Squashfs: Update documentation to include xattrsPhillip Lougher
One paragraph was not updated when xattrs were added to Squashfs. Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
2011-12-01Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: fix meta data raid-repair merge problem Btrfs: skip allocation attempt from empty cluster Btrfs: skip block groups without enough space for a cluster Btrfs: start search for new cluster at the beginning Btrfs: reset cluster's max_size when creating bitmap Btrfs: initialize new bitmaps' list Btrfs: fix oops when calling statfs on readonly device Btrfs: Don't error on resizing FS to same size Btrfs: fix deadlock on metadata reservation when evicting a inode Fix URL of btrfs-progs git repository in docs btrfs scrub: handle -ENOMEM from init_ipath()
2011-11-30Fix URL of btrfs-progs git repository in docsArnd Hannemann
The location of the btrfs-progs repository has been changed. This patch updates the documentation accordingly. Signed-off-by: Arnd Hannemann <arnd@arndnet.de>
2011-11-18debugfs: add tools to printk 32-bit registersAlessandro Rubini
Some debugfs file I deal with are mostly blocks of registers, i.e. lines of the form "<name> = 0x<value>". Some files are only registers, some include registers blocks among other material. This patch introduces data structures and functions to deal with both cases. I expect more users of this over time. Signed-off-by: Alessandro Rubini <rubini@gnudd.com> Acked-by: Giancarlo Asnaghi <giancarlo.asnaghi@st.com> Cc: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-07NFSD: Added fault injection documentationBryan Schumaker
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-11-04Documentation: HFS is orphanedMarcos Paulo de Souza
Removed the reference of Roman Zippel, last maintainer, of orphaned HFS filesystem. Signed-off-by: Marcos Paulo de Souza <marcos.mage@gmail.com> Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-04Documentation: fix inotify source file pathsMarcos Paulo de Souza
Fixes the path to find the source files of the inotify subsystem. Signed-off-by: Marcos Paulo de Souza <marcos.mage@gmail.com> Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-02Merge branch 'for-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue: vfs: add d_prune dentry operation vfs: protect i_nlink filesystems: add set_nlink() filesystems: add missing nlink wrappers logfs: remove unnecessary nlink setting ocfs2: remove unnecessary nlink setting jfs: remove unnecessary nlink setting hypfs: remove unnecessary nlink setting vfs: ignore error on forced remount readlinkat: ensure we return ENOENT for the empty pathname for normal lookups vfs: fix dentry leak in simple_fill_super()
2011-11-02Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (97 commits) jbd2: Unify log messages in jbd2 code jbd/jbd2: validate sb->s_first in journal_get_superblock() ext4: let ext4_ext_rm_leaf work with EXT_DEBUG defined ext4: fix a syntax error in ext4_ext_insert_extent when debugging enabled ext4: fix a typo in struct ext4_allocation_context ext4: Don't normalize an falloc request if it can fit in 1 extent. ext4: remove comments about extent mount option in ext4_new_inode() ext4: let ext4_discard_partial_buffers handle unaligned range correctly ext4: return ENOMEM if find_or_create_pages fails ext4: move vars to local scope in ext4_discard_partial_page_buffers_no_lock() ext4: Create helper function for EXT4_IO_END_UNWRITTEN and i_aiodio_unwritten ext4: optimize locking for end_io extent conversion ext4: remove unnecessary call to waitqueue_active() ext4: Use correct locking for ext4_end_io_nolock() ext4: fix race in xattr block allocation path ext4: trace punch_hole correctly in ext4_ext_map_blocks ext4: clean up AGGRESSIVE_TEST code ext4: move variables to their scope ext4: fix quota accounting during migration ext4: migrate cleanup ...
2011-11-02Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: udf: Cleanup metadata flags handling udf: Skip mirror metadata FE loading when metadata FE is ok ext3: Allow quota file use root reservation udf: Remove web reference from UDF MAINTAINERS entry quota: Drop path reference on error exit from quotactl udf: Neaten udf_debug uses udf: Neaten logging output, use vsprintf extension %pV udf: Convert printks to pr_<level> udf: Rename udf_warning to udf_warn udf: Rename udf_error to udf_err udf: Promote some debugging messages to udf_error ext3: Remove the obsolete broken EXT3_IOC32_WAIT_FOR_READONLY. udf: Add readpages support for udf. ext3/balloc.c: local functions should be static ext2: fix the outdated comment in ext2_nfs_get_inode() ext3: remove deprecated oldalloc fs/ext3/balloc.c: delete useless initialization fs/ext2/balloc.c: delete useless initialization ext3: fix message in ext3_remount for rw-remount case ext3: Remove i_mutex from ext3_sync_file() Fix up trivial (printf format cleanup) conflicts in fs/udf/udfdecl.h
2011-11-02vfs: add d_prune dentry operationSage Weil
This adds a d_prune dentry operation that is called by the VFS prior to pruning (i.e. unhashing and killing) a hashed dentry from the dcache. Wrap dentry_lru_del() and use the new _prune() helper in the cases where we are about to unhash and kill the dentry. This will be used by Ceph to maintain a flag indicating whether the complete contents of a directory are contained in the dcache, allowing it to satisfy lookups and readdir without addition server communication. Renumber a few DCACHE_* #defines to group DCACHE_OP_PRUNE with the other DCACHE_OP_ bits. Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-26Merge branch 'for-linus' of git://github.com/ericvh/linuxLinus Torvalds
* 'for-linus' of git://github.com/ericvh/linux: 9p: fix 9p.txt to advertise msize instead of maxdata net/9p: Convert net/9p protocol dumps to tracepoints fs/9p: change an int to unsigned int fs/9p: Cleanup option parsing in 9p 9p: move dereference after NULL check fs/9p: inode file operation is properly initialized init_special_inode fs/9p: Update zero-copy implementation in 9p
2011-10-25Merge branch 'driver-core-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core * 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (38 commits) mm: memory hotplug: Check if pages are correctly reserved on a per-section basis Revert "memory hotplug: Correct page reservation checking" Update email address for stable patch submission dynamic_debug: fix undefined reference to `__netdev_printk' dynamic_debug: use a single printk() to emit messages dynamic_debug: remove num_enabled accounting dynamic_debug: consolidate repetitive struct _ddebug descriptor definitions uio: Support physical addresses >32 bits on 32-bit systems sysfs: add unsigned long cast to prevent compile warning drivers: base: print rejected matches with DEBUG_DRIVER memory hotplug: Correct page reservation checking memory hotplug: Refuse to add unaligned memory regions remove the messy code file Documentation/zh_CN/SubmitChecklist ARM: mxc: convert device creation to use platform_device_register_full new helper to create platform devices with dma mask docs/driver-model: Update device class docs docs/driver-model: Document device.groups kobj_uevent: Ignore if some listeners cannot handle message dynamic_debug: make netif_dbg() call __netdev_printk() dynamic_debug: make netdev_dbg() call __netdev_printk() ...
2011-10-249p: fix 9p.txt to advertise msize instead of maxdataNicolae Mogoreanu
9p.txt advertises that maxdata mount option should be used to specify msize, in the code though we use msize option and completely ignore maxdata if passed Signed-off-by: Nicolae Mogoreanu <mogoreanu@gmail.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-10-08ext4: remove deprecated oldallocLukas Czerner
For a long time now orlov is the default block allocator in the ext4. It performs better than the old one and no one seems to claim otherwise so we can safely drop it and make oldalloc and orlov mount option deprecated. This is a part of the effort to reduce number of ext4 options hence the test matrix. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-10-08ext4: documentation: remove acl and user_xattr mount optionsTheodore Ts'o
Acl and user_xattr mount options are no longer needed since those features are enabled by default if configured in (seee commit ea6633369458992241599c9d9ebadffaeddec164). We can not easily deprecate mount options itself (since it is probably too early), but we can remove it from documentation first. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-09-27doc: fix broken referencesPaul Bolle
There are numerous broken references to Documentation files (in other Documentation files, in comments, etc.). These broken references are caused by typo's in the references, and by renames or removals of the Documentation files. Some broken references are simply odd. Fix these broken references, sometimes by dropping the irrelevant text they were part of. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-09-03ext4: improve handling of conflicting mount optionsTheodore Ts'o
If the user explicitly specifies conflicting mount options for delalloc or dioread_nolock and data=journal, fail the mount, instead of printing a warning and continuing (since many user's won't look at dmesg and notice the warning). Also, print a single warning that data=journal implies that delayed allocation is not on by default (since it's not supported), and furthermore that O_DIRECT is not supported. Improve the text in Documentation/filesystems/ext4.txt so this is clear there as well. Similarly, if the dioread_nolock mount option is specified when the file system block size != PAGE_SIZE, fail the mount instead of printing a warning message and ignoring the mount option. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-08-22docs/sysfs: Specify ABI documentation requirementsBart Van Assche
Although it is expected nowadays that every new sysfs attribute is documented under Documentation/ABI, this is not yet mentioned in the kernel documentation. This patch adds a note in the sysfs documentation about that requirement. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>