summaryrefslogtreecommitdiffstats
path: root/fs
AgeCommit message (Collapse)Author
2012-12-16Btrfs: fix freeze vs auto defragMiao Xie
If we freeze the fs, the auto defragment should not run. Fix it. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16Btrfs: restructure btrfs_run_defrag_inodes()Miao Xie
This patch restructure btrfs_run_defrag_inodes() and make the code of the auto defragment more readable. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16Btrfs: fix unprotected defragable inode insertionMiao Xie
We forget to get the defrag lock when we re-add the defragable inode, Fix it. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16Btrfs: use slabs for auto defrag allocationMiao Xie
The auto defrag allocation is in the fast path of the IO, so use slabs to improve the speed of the allocation. And besides that, it can do check for leaked objects when the module is removed. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16Btrfs: get write access for qgroup operationsMiao Xie
We need get write access for qgroup operations, or we will modify the R/O fs. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16Btrfs: get write access for scrubMiao Xie
We need get write access for scrub, or we will modify the R/O fs. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16Btrfs: get write access when removing a deviceMiao Xie
Steps to reproduce: # mkfs.btrfs -d single -m single <disk0> <disk1> # mount -o ro <disk0> <mnt0> # mount -o ro <disk0> <mnt1> # mount -o remount,rw <mnt0> # umount <mnt0> # btrfs device delete <disk1> <mnt1> We can remove a device from a R/O filesystem. The reason is that we just check the R/O flag of the super block object. It is not enough, because the kernel may set the R/O flag only for the mount point. We need invoke mnt_want_write_file() to do a full check. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16Btrfs: get write access when doing resize fsMiao Xie
Steps to reproduce: # mkfs.btrfs <partition> # mount -o ro <partition> <mnt0> # mount -o ro <partition> <mnt1> # mount -o remount,rw <mnt0> # umount <mnt0> # btrfs fi resize 10g <mnt1> We re-sized a R/O filesystem. The reason is that we just check the R/O flag of the super block object. It is not enough, because the kernel may set the R/O flag only for the mount point. We need invoke mnt_want_write_file() to do a full check. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16Btrfs: get write access when setting the default subvolumeMiao Xie
When wen want to set the default subvolume, we must get write access, or we will change the R/O file system. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16Btrfs: fix wrong return value of btrfs_wait_for_commit()Miao Xie
If the id of the existed transaction is more than the one we specified, it means the specified transaction was commited, so we should return 0, not EINVAL. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16Btrfs: don't start a new transaction when starting syncMiao Xie
If there is no running transaction in the fs, we needn't start a new one when we want to start sync. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16Btrfs: pass root object into btrfs_ioctl_{start, wait}_sync()Miao Xie
Since we have gotten the root in the caller, just pass it into btrfs_ioctl_{start, wait}_sync() directly. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16Btrfs: fix an while-loop of listxattrLiu Bo
If we found an invalid xattr dir item, we'd better try the next one instead. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16Btrfs: do not warn_on io_ctl->cur in io_ctl_map_pageWang Sheng-Hui
io_ctl_map_page is called by many functions in free-space-cache. In most scenarios, the ->cur is not null, e.g. io_ctl_add_entry. I think we'd better remove the warn_on here. Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16Btrfs: add support for device replace ioctlsStefan Behrens
This is the commit that allows to start the device replace procedure. An ioctl() interface is added that supports starting and canceling the device replace procedure, and to retrieve the status and progress. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-12-16Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 update from Ted Ts'o: "There are two major features for this merge window. The first is inline data, which allows small files or directories to be stored in the in-inode extended attribute area. (This requires that the file system use inodes which are at least 256 bytes or larger; 128 byte inodes do not have any room for in-inode xattrs.) The second new feature is SEEK_HOLE/SEEK_DATA support. This is enabled by the extent status tree patches, and this infrastructure will be used to further optimize ext4 in the future. Beyond that, we have the usual collection of code cleanups and bug fixes." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (63 commits) ext4: zero out inline data using memset() instead of empty_zero_page ext4: ensure Inode flags consistency are checked at build time ext4: Remove CONFIG_EXT4_FS_XATTR ext4: remove unused variable from ext4_ext_in_cache() ext4: remove redundant initialization in ext4_fill_super() ext4: remove redundant code in ext4_alloc_inode() ext4: use sync_inode_metadata() when syncing inode metadata ext4: enable ext4 inline support ext4: let fallocate handle inline data correctly ext4: let ext4_truncate handle inline data correctly ext4: evict inline data out if we need to strore xattr in inode ext4: let fiemap work with inline data ext4: let ext4_rename handle inline dir ext4: let empty_dir handle inline dir ext4: let ext4_delete_entry() handle inline data ext4: make ext4_delete_entry generic ext4: let ext4_find_entry handle inline data ext4: create a new function search_dir ext4: let ext4_readdir handle inline data ext4: let add_dir_entry handle inline data properly ...
2012-12-16Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "A quiet cycle for the security subsystem with just a few maintenance updates." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: Smack: create a sysfs mount point for smackfs Smack: use select not depends in Kconfig Yama: remove locking from delete path Yama: add RCU to drop read locking drivers/char/tpm: remove tasklet and cleanup KEYS: Use keyring_alloc() to create special keyrings KEYS: Reduce initial permissions on keys KEYS: Make the session and process keyrings per-thread seccomp: Make syscall skipping and nr changes more consistent key: Fix resource leak keys: Fix unreachable code KEYS: Add payload preparsing opportunity prior to key instantiate or update
2012-12-15NFS: Don't use SetPageError in the NFS writeback codeTrond Myklebust
The writeback code is already capable of passing errors back to user space by means of the open_context->error. In the case of ENOSPC, Neil Brown is reporting seeing 2 errors being returned. Neil writes: "e.g. if /mnt2/ if an nfs mounted filesystem that has no space then strace dd if=/dev/zero conv=fsync >> /mnt2/afile count=1 reported Input/output error and the relevant parts of the strace output are: write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512 fsync(1) = -1 EIO (Input/output error) close(1) = -1 ENOSPC (No space left on device)" Neil then shows that the duplication of error messages appears to be due to the use of the PageError() mechanism, which causes filemap_fdatawait_range to return the extra EIO. The regression was introduced by commit 7b281ee026552f10862b617a2a51acf49c829554 (NFS: fsync() must exit with an error if page writeback failed). Fix this by removing the call to SetPageError(), and just relying on open_context->error reporting the ENOSPC back to fsync(). Reported-by: Neil Brown <neilb@suse.de> Tested-by: Neil Brown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org [3.6+]
2012-12-15Merge tag 'for-v3.8' of git://git.infradead.org/users/cbou/linux-pstoreLinus Torvalds
Pull pstore update from Anton Vorontsov: "Here are just a few fixups for the pstore subsystem, nothing special this time" * tag 'for-v3.8' of git://git.infradead.org/users/cbou/linux-pstore: pstore/ftrace: Adjust for ftrace_ops->func prototype change pstore/ram: Fix bounds checks for mem_size, record_size, console_size and ftrace_size pstore/ram: Fix undefined usage of rounddown_pow_of_two(0) pstore/ram: Fixup section annotations
2012-12-15NFSv4.1: Deal effectively with interrupted RPC calls.Trond Myklebust
If an RPC call is interrupted, assume that the server hasn't processed the RPC call so that the next time we use the slot, we know that if we get a NFS4ERR_SEQ_MISORDERED or NFS4ERR_SEQ_FALSE_RETRY, we just have to bump the sequence number. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmwLinus Torvalds
Pull GFS2 updates from Steven Whitehouse: "The main feature this time is the new Orlov allocator and the patches leading up to it which allow us to allocate new inodes from their own allocation context, rather than borrowing that of their parent directory. It is this change which then allows us to choose a different location for subdirectories when required. This works exactly as per the ext3 implementation from the users point of view. In addition to that, we've got a speed up in gfs2_rbm_from_block() from Bob Peterson, three locking related improvements from Dave Teigland plus a selection of smaller bug fixes and clean ups." * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw: GFS2: Set gl_object during inode create GFS2: add error check while allocating new inodes GFS2: don't reference inode's glock during block allocation trace GFS2: remove redundant lvb pointer GFS2: only use lvb on glocks that need it GFS2: skip dlm_unlock calls in unmount GFS2: Fix one RG corner case GFS2: Eliminate redundant buffer_head manipulation in gfs2_unlink_inode GFS2: Use dirty_inode in gfs2_dir_add GFS2: Fix truncation of journaled data files GFS2: Add Orlov allocator GFS2: Use proper allocation context for new inodes GFS2: Add test for resource group congestion status GFS2: Rename glops go_xmote_th to go_sync GFS2: Speed up gfs2_rbm_from_block GFS2: Review bug traps in glops.c
2012-12-15NFSv4.1: Move the RPC timestamp out of the slot.Trond Myklebust
Shave a few bytes off the slot table size by moving the RPC timestamp into the sequence results. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-15NFSv4.1: Try to deal with NFS4ERR_SEQ_MISORDERED.Trond Myklebust
If the server returns NFS4ERR_SEQ_MISORDERED, it could be a sign that the slot was retired at some point. Retry the attempt after reinitialising the slot sequence number to 1. Also add a handler for NFS4ERR_SEQ_FALSE_RETRY. Just bump the slot sequence number and retry... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-14userns: Require CAP_SYS_ADMIN for most uses of setns.Eric W. Biederman
Andy Lutomirski <luto@amacapital.net> found a nasty little bug in the permissions of setns. With unprivileged user namespaces it became possible to create new namespaces without privilege. However the setns calls were relaxed to only require CAP_SYS_ADMIN in the user nameapce of the targed namespace. Which made the following nasty sequence possible. pid = clone(CLONE_NEWUSER | CLONE_NEWNS); if (pid == 0) { /* child */ system("mount --bind /home/me/passwd /etc/passwd"); } else if (pid != 0) { /* parent */ char path[PATH_MAX]; snprintf(path, sizeof(path), "/proc/%u/ns/mnt"); fd = open(path, O_RDONLY); setns(fd, 0); system("su -"); } Prevent this possibility by requiring CAP_SYS_ADMIN in the current user namespace when joing all but the user namespace. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2012-12-14NFS: nfs_lookup_revalidate should not trust an inode with i_nlink == 0Trond Myklebust
If the inode has no links, then we should force a new lookup. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-14NFS: Fix calls to drop_nlink()Trond Myklebust
It is almost always wrong for NFS to call drop_nlink() after removing a file. What we really want is to mark the inode's attributes for revalidation, and we want to ensure that the VFS drops it if we're reasonably sure that this is the final unlink(). Do the former using the usual cache validity flags, and the latter by testing if inode->i_nlink == 1, and clearing it in that case. This also fixes the following warning reported by Neil Brown and Jeff Layton (among others). [634155.004438] WARNING: at /home/abuild/rpmbuild/BUILD/kernel-desktop-3.5.0/lin [634155.004442] Hardware name: Latitude E6510 [634155.004577] crc_itu_t crc32c_intel snd_hwdep snd_pcm snd_timer snd soundcor [634155.004609] Pid: 13402, comm: bash Tainted: G W 3.5.0-36-desktop # [634155.004611] Call Trace: [634155.004630] [<ffffffff8100444a>] dump_trace+0xaa/0x2b0 [634155.004641] [<ffffffff815a23dc>] dump_stack+0x69/0x6f [634155.004653] [<ffffffff81041a0b>] warn_slowpath_common+0x7b/0xc0 [634155.004662] [<ffffffff811832e4>] drop_nlink+0x34/0x40 [634155.004687] [<ffffffffa05bb6c3>] nfs_dentry_iput+0x33/0x70 [nfs] [634155.004714] [<ffffffff8118049e>] dput+0x12e/0x230 [634155.004726] [<ffffffff8116b230>] __fput+0x170/0x230 [634155.004735] [<ffffffff81167c0f>] filp_close+0x5f/0x90 [634155.004743] [<ffffffff81167cd7>] sys_close+0x97/0x100 [634155.004754] [<ffffffff815c3b39>] system_call_fastpath+0x16/0x1b [634155.004767] [<00007f2a73a0d110>] 0x7f2a73a0d10f Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org [3.3+]
2012-12-14NFS: Ensure that we always drop inodes that have been marked as staleTrond Myklebust
There is no need to cache stale inodes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-14exofs: don't leak io_state and pages on read errorBoaz Harrosh
Same bug as fixed by Idan for write_exec was in read_exec. Fix the io_state leak and pages state on read error. Also while at it: The if (!pcol->read_4_write) at the error path is redundant because all goto err; are after the if (pcol->read_4_write) bale out. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2012-12-13Merge branch 'autofs' (patches from Ian Kent)Linus Torvalds
Merge emailed autofs cleanup/fix patches from Ian Kent * autofs: autofs4 - use simple_empty() for empty directory check autofs4 - dont clear DCACHE_NEED_AUTOMOUNT on rootless mount
2012-12-13autofs4 - use simple_empty() for empty directory checkIan Kent
For direct (and offset) mounts, if an automounted mount is manually umounted the trigger mount dentry can appear non-empty causing it to not trigger mounts. This can also happen if there is a file handle leak in a user space automounting application. This happens because, when a ioctl control file handle is opened on the mount, a cursor dentry is created which causes list_empty() to see the dentry as non-empty. Since there is a case where listing the directory of these dentrys is needed, the use of dcache_dir_*() functions for .open() and .release() is needed. Consequently simple_empty() must be used instead of list_empty() when checking for an empty directory. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-13autofs4 - dont clear DCACHE_NEED_AUTOMOUNT on rootless mountIan Kent
The DCACHE_NEED_AUTOMOUNT flag is cleared on mount and set on expire for autofs rootless multi-mount dentrys to prevent unnecessary calls to ->d_automount(). Since DCACHE_MANAGE_TRANSIT is always set on autofs dentrys ->d_managed() is always called so the check can be done in ->d_manage() without the need to change the flag. This still avoids unnecessary calls to ->d_automount(), adds negligible overhead and eliminates a seriously ugly check in the expire code. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-13Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds
Merge misc VM changes from Andrew Morton: "The rest of most-of-MM. The other MM bits await a slab merge. This patch includes the addition of a huge zero_page. Not a performance boost but it an save large amounts of physical memory in some situations. Also a bunch of Fujitsu engineers are working on memory hotplug. Which, as it turns out, was badly broken. About half of their patches are included here; the remainder are 3.8 material." However, this merge disables CONFIG_MOVABLE_NODE, which was totally broken. We don't add new features with "default y", nor do we add Kconfig questions that are incomprehensible to most people without any help text. Does the feature even make sense without compaction or memory hotplug? * akpm: (54 commits) mm/bootmem.c: remove unused wrapper function reserve_bootmem_generic() mm/memory.c: remove unused code from do_wp_page() asm-generic, mm: pgtable: consolidate zero page helpers mm/hugetlb.c: fix warning on freeing hwpoisoned hugepage hwpoison, hugetlbfs: fix RSS-counter warning hwpoison, hugetlbfs: fix "bad pmd" warning in unmapping hwpoisoned hugepage mm: protect against concurrent vma expansion memcg: do not check for mm in __mem_cgroup_count_vm_event tmpfs: support SEEK_DATA and SEEK_HOLE (reprise) mm: provide more accurate estimation of pages occupied by memmap fs/buffer.c: remove redundant initialization in alloc_page_buffers() fs/buffer.c: do not inline exported function writeback: fix a typo in comment mm: introduce new field "managed_pages" to struct zone mm, oom: remove statically defined arch functions of same name mm, oom: remove redundant sleep in pagefault oom handler mm, oom: cleanup pagefault oom handler memory_hotplug: allow online/offline memory to result movable node numa: add CONFIG_MOVABLE_NODE for movable-dedicated node mm, memcg: avoid unnecessary function call when memcg is disabled ...
2012-12-13Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial branch from Jiri Kosina: "Usual stuff -- comment/printk typo fixes, documentation updates, dead code elimination." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) HOWTO: fix double words typo x86 mtrr: fix comment typo in mtrr_bp_init propagate name change to comments in kernel source doc: Update the name of profiling based on sysfs treewide: Fix typos in various drivers treewide: Fix typos in various Kconfig wireless: mwifiex: Fix typo in wireless/mwifiex driver messages: i2o: Fix typo in messages/i2o scripts/kernel-doc: check that non-void fcts describe their return value Kernel-doc: Convention: Use a "Return" section to describe return values radeon: Fix typo and copy/paste error in comments doc: Remove unnecessary declarations from Documentation/accounting/getdelays.c various: Fix spelling of "asynchronous" in comments. Fix misspellings of "whether" in comments. eisa: Fix spelling of "asynchronous". various: Fix spelling of "registered" in comments. doc: fix quite a few typos within Documentation target: iscsi: fix comment typos in target/iscsi drivers treewide: fix typo of "suport" in various comments and Kconfig treewide: fix typo of "suppport" in various comments ...
2012-12-13nfs: Remove unused list nfs4_clientid_listYanchuan Nian
This list was designed to store struct nfs4_client in the client side. But nfs4_client was obsolete and has been removed from the source code. So remove the unused list. Signed-off-by: Yanchuan Nian <ycnian@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-13nfs: Remove duplicate function declaration in internal.hYanchuan Nian
Remove duplicate function declaration in internal.h Signed-off-by: Yanchuan Nian <ycnian@gmail.com> [Trond: Added nfs_pageio_init_read, which suffered from the same problem] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-13quota: Use the pre-processor to compile out quotactl_cmd_write when ↵Lee Jones
!CONFIG_BLOCK quotactl_cmd_write() is only ever invoked when BLOCK is configured. When !CONFIG_BLOCK, the build warning below is displayed. Let's fix that. fs/quota/quota.c:311:12: warning: ‘quotactl_cmd_write’ defined but not used [-Wunused-function] Cc: Jan Kara <jack@suse.cz> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Jan Kara <jack@suse.cz>
2012-12-13ext3: drop if around WARN_ONJulia Lawall
Just use WARN_ON rather than an if containing only WARN_ON(1). A simplified version of the semantic patch that makes this transformation is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression e; @@ - if (e) WARN_ON(1); + WARN_ON(e); // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Jan Kara <jack@suse.cz>
2012-12-13ext3: get rid of the duplicate code on ext3_fill_superZhao Hongjiang
Setting s_mount_opt to 0 is unnecessary because we use kzalloc() for sb allocation. s_resuid and s_resgid are set again few lines below based on values in on disk superblock. Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz>
2012-12-13udf: remove un-needed variable from inode_getblkNamjae Jeon
The variable last_block is not needed. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Signed-off-by: Jan Kara <jack@suse.cz>
2012-12-13udf: don't increment lenExtents while writing to a holeNamjae Jeon
Incrementing lenExtents even while writing to a hole is bad for performance as calls to udf_discard_prealloc and udf_truncate_tail_extent would not return from start if isize != lenExtents Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Signed-off-by: Jan Kara <jack@suse.cz>
2012-12-13udf: fix memory leak while allocating blocks during writeNamjae Jeon
Need to brelse the buffer_head stored in cur_epos and next_epos. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Signed-off-by: Jan Kara <jack@suse.cz>
2012-12-13libceph: Unlock unprocessed pages in start_read() error pathDavid Zafman
Function start_read() can get an error before processing all pages. It must not only release the remaining pages, but unlock them too. This fixes http://tracker.newdream.net/issues/3370 Signed-off-by: David Zafman <david.zafman@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
2012-12-13ceph: call handle_cap_grant() for cap import messageYan, Zheng
If client sends cap message that requests new max size during exporting caps, the exporting MDS will drop the message quietly. So the client may wait for the reply that updates the max size forever. call handle_cap_grant() for cap import message can avoid this issue. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-13ceph: Fix __ceph_do_pending_vmtruncateYan, Zheng
we should set i_truncate_pending to 0 after page cache is truncated to i_truncate_size Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-13ceph: Don't add dirty inode to dirty list if caps is in migrationYan, Zheng
Add dirty inode to cap_dirty_migrating list instead, this can avoid ceph_flush_dirty_caps() entering infinite loop. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-13ceph: Fix infinite loop in __wake_requestsYan, Zheng
__wake_requests() will enter infinite loop if we use it to wake requests in the session->s_waiting list. __wake_requests() deletes requests from the list and __do_request() adds requests back to the list. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-13ceph: Don't update i_max_size when handling non-auth capYan, Zheng
The cap from non-auth mds doesn't have a meaningful max_size value. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com>
2012-12-13bdi_register: add __printf verification, fix arg mismatchJoe Perches
__printf is useful to verify format and arguments. Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Alex Elder <elder@inktank.com>
2012-12-13libceph: remove 'osdtimeout' optionSage Weil
This would reset a connection with any OSD that had an outstanding request that was taking more than N seconds. The idea was that if the OSD was buggy, the client could compensate by resending the request. In reality, this only served to hide server bugs, and we haven't actually seen such a bug in quite a while. Moreover, the userspace client code never did this. More importantly, often the request is taking a long time because the OSD is trying to recover, or overloaded, and killing the connection and retrying would only make the situation worse by giving the OSD more work to do. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
2012-12-13ceph: fix dentry reference leak in ceph_encode_fh().Cyril Roelandt
dput() was not called in the error path. Signed-off-by: Cyril Roelandt <tipecaml@gmail.com> Reviewed-by: Alex Elder <elder@inktank.com>