summaryrefslogtreecommitdiffstats
path: root/fs
AgeCommit message (Collapse)Author
2012-10-01Btrfs: remove unused tmp_path from iterate_dir_itemAlexander Block
A leftover from older code and unused now. Reported-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com> Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01Btrfs: code cleanups for send/receiveAlexander Block
Doing some code cleanups as suggested by Arne. Changes do not change any logic. Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01Btrfs: add/fix comments/documentation for send/receiveAlexander Block
As the subject already said, add/fix comments. Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01Btrfs: update send_progress at correct placesAlexander Block
Updating send_progress in process_recorded_refs was not correct. It got updated too early in the cur_inode_new_gen case. Reported-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com> Reported-by: Arne Jansen <sensille@gmx.net> Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01Btrfs: make aux field of ulist 64 bitAlexander Block
Btrfs send/receive uses the aux field to store inode numbers. On 32 bit machines this may become a problem. Also fix all users of ulist_add and ulist_add_merged. Reported-by: Arne Jansen <sensille@gmx.net> Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01Btrfs: fix use of radix_tree for name_cache in send/receiveAlexander Block
We can't easily use the index of the radix tree for inums as the radix tree uses 32bit indexes on 32bit kernels. For 32bit kernels, we now use the lower 32bit of the inum as index and an additional list to store multiple entries per radix tree entry. Reported-by: Arne Jansen <sensille@gmx.net> Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01Btrfs: fix memory leak for name_cache in send/receiveAlexander Block
When everything is done, name_cache_free is called which however forgot to call kfree on the cache entries. Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01Btrfs: don't break in the final loop of find_extent_cloneAlexander Block
If we break, we may miss the clone from send_root which we prefer over all other clones. Commit is a result of Arne's review. Reported-by: Arne Jansen <sensille@gmx.net> Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01Btrfs: use normal return path for root == send_root caseAlexander Block
Don't have a seperate return path for the mentioned case. Now we do the same "take lowest inode/offset" logic for all found clones. Commit is a result of Arne's review. Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01Btrfs: use kmalloc instead of stack for backref_ctxAlexander Block
Make sure to never get in trouble due to the backref_ctx which was on the stack before. Commit is a result of Arne's review. Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01Btrfs: rename backref_ctx::found_in_send_root to found_itselfAlexander Block
The new name should be easier to understand/read. Commit is a result of Arne's review. Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01Btrfs: remove unused use_list from send/receive codeAlexander Block
use_list is a leftover and unused. Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01Btrfs: add correct parent to check_dirs when dir got movedAlexander Block
We only added the parent for the new position of a moved dir. We also need to add the old parent of the moved dir. Reported-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com> Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01Btrfs: remove unused code with #if 0Alexander Block
fs_path_remove is not used at the moment due to a previous patch. Remove it for now (with #if 0) to avoid compile warnings. Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01Btrfs: add missing check for dir != tmp_dir to is_first_refAlexander Block
We missed that check which resultet in all refs with the same name being reported as first_ref. Reported-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com> Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01Btrfs: fix cur_ino < parent_ino case for send/receiveAlexander Block
When the current inodes inum is smaller then the inum of the parent directory strange things were happending due to wrong path resolution and other bugs. Fix this with a new approach for the problem. Reported-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com> Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01Btrfs: add rdev to get_inode_info in send/receiveAlexander Block
We need rdev in the next commit. Signed-off-by: Alexander Block <ablock84@googlemail.com>
2012-10-01Merge tag 'driver-core-3.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core merge from Greg Kroah-Hartman: "Here is the big driver core update for 3.7-rc1. A number of firmware_class.c updates (as you saw a month or so ago), and some hyper-v updates and some printk fixes as well. All patches that are outside of the drivers/base area have been acked by the respective maintainers, and have all been in the linux-next tree for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" * tag 'driver-core-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (95 commits) memory: tegra{20,30}-mc: Fix reading incorrect register in mc_readl() device.h: Add missing inline to #ifndef CONFIG_PRINTK dev_vprintk_emit memory: emif: Add ifdef CONFIG_DEBUG_FS guard for emif_debugfs_[init|exit] Documentation: Fixes some translation error in Documentation/zh_CN/gpio.txt Documentation: Remove 3 byte redundant code at the head of the Documentation/zh_CN/arm/booting Documentation: Chinese translation of Documentation/video4linux/omap3isp.txt device and dynamic_debug: Use dev_vprintk_emit and dev_printk_emit dev: Add dev_vprintk_emit and dev_printk_emit netdev_printk/netif_printk: Remove a superfluous logging colon netdev_printk/dynamic_netdev_dbg: Directly call printk_emit dev_dbg/dynamic_debug: Update to use printk_emit, optimize stack driver-core: Shut up dev_dbg_reatelimited() without DEBUG tools/hv: Parse /etc/os-release tools/hv: Check for read/write errors tools/hv: Fix exit() error code tools/hv: Fix file handle leak Tools: hv: Implement the KVP verb - KVP_OP_GET_IP_INFO Tools: hv: Rename the function kvp_get_ip_address() Tools: hv: Implement the KVP verb - KVP_OP_SET_IP_INFO Tools: hv: Add an example script to configure an interface ...
2012-10-01Merge tag 'arm64-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64 Pull arm64 support from Catalin Marinas: "Linux support for the 64-bit ARM architecture (AArch64) Features currently supported: - 39-bit address space for user and kernel (each) - 4KB and 64KB page configurations - Compat (32-bit) user applications (ARMv7, EABI only) - Flattened Device Tree (mandated for all AArch64 platforms) - ARM generic timers" * tag 'arm64-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64: (35 commits) arm64: ptrace: remove obsolete ptrace request numbers from user headers arm64: Do not set the SMP/nAMP processor bit arm64: MAINTAINERS update arm64: Build infrastructure arm64: Miscellaneous header files arm64: Generic timers support arm64: Loadable modules arm64: Miscellaneous library functions arm64: Performance counters support arm64: Add support for /proc/sys/debug/exception-trace arm64: Debugging support arm64: Floating point and SIMD arm64: 32-bit (compat) applications support arm64: User access library functions arm64: Signal handling support arm64: VDSO support arm64: System calls handling arm64: ELF definitions arm64: SMP support arm64: DMA mapping API ...
2012-10-01[CIFS] Fix indentation of fs/cifs/Kconfig entriesSteve French
make menuconfig for cifs shows multiple entries toward the end of the list with the incorrect indentation (probably a bug in Kconfig parsing of items that are dependant on the module (cifs=m instead of just CONFIG_CIFS). This patch fixes the indentation of all but the last entry (CIFS_ACL) which I don't know how to fix. It also clarifies wording in two places Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
2012-10-01[CIFS] Fix SMB2 negotiation support to select only one dialect (based on vers=)Steve French
Based on whether the user (on mount command) chooses: vers=3.0 (for smb3.0 support) vers=2.1 (for smb2.1 support) or (with subsequent patch, which will allow SMB2 support) vers=2.0 (for original smb2.02 dialect support) send only one dialect at a time during negotiate (we had been sending a list). Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
2012-10-01Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull the trivial tree from Jiri Kosina: "Tiny usual fixes all over the place" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits) doc: fix old config name of kprobetrace fs/fs-writeback.c: cleanup riteback_sb_inodes kerneldoc btrfs: fix the commment for the action flags in delayed-ref.h btrfs: fix trivial typo for the comment of BTRFS_FREE_INO_OBJECTID vfs: fix kerneldoc for generic_fh_to_parent() treewide: fix comment/printk/variable typos ipr: fix small coding style issues doc: fix broken utf8 encoding nfs: comment fix platform/x86: fix asus_laptop.wled_type module parameter mfd: printk/comment fixes doc: getdelays.c: remember to close() socket on error in create_nl_socket() doc: aliasing-test: close fd on write error mmc: fix comment typos dma: fix comments spi: fix comment/printk typos in spi Coccinelle: fix typo in memdup_user.cocci tmiofb: missing NULL pointer checks tools: perf: Fix typo in tools/perf tools/testing: fix comment / output typos ...
2012-10-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmwLinus Torvalds
Pull GFS2 updates from Steven Whitehouse: "The major feature this time is the "rbm" conversion in the resource group code. The new struct gfs2_rbm specifies the location of an allocatable block in (resource group, bitmap, offset) form. There are a number of added helper functions, and later patches then rewrite some of the resource group code in terms of this new structure. Not only does this give us a nice code clean up, but it also removes some of the previous restrictions where extents could not cross bitmap boundaries, for example. In addition to that, there are a few bug fixes and clean ups, but the rbm work is by far the majority of this patch set in terms of number of changed lines." * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw: (27 commits) GFS2: Write out dirty inode metadata in delayed deletes GFS2: fix s_writers.counter imbalance in gfs2_ail_empty_gl GFS2: Fix infinite loop in rbm_find GFS2: Consolidate free block searching functions GFS2: Get rid of I_MUTEX_QUOTA usage GFS2: Stop block extents at the end of bitmaps GFS2: Fix unclaimed_blocks() wrapping bug and clean up GFS2: Improve block reservation tracing GFS2: Fall back to ignoring reservations, if there are no other blocks left GFS2: Fix ->show_options() for statfs slow GFS2: Use rbm for gfs2_setbit() GFS2: Use rbm for gfs2_testbit() GFS2: Eliminate unnecessary check for state > 3 in bitfit GFS2: Eliminate redundant calls to may_grant GFS2: Combine functions gfs2_glock_dq_wait and wait_on_demote GFS2: Combine functions gfs2_glock_wait and wait_on_holder GFS2: inline __gfs2_glock_schedule_for_reclaim GFS2: change function gfs2_direct_IO to use a normal gfs2_glock_dq GFS2: rbm code cleanup GFS2: Fix case where reservation finished at end of rgrp ...
2012-09-30ext4: fix mtime update in nodelalloc modeTheodore Ts'o
Commits 5e8830dc85d0 and 41c4d25f78c0 introduced a regression into v3.6-rc1 for ext4 in nodealloc mode, such that mtime updates would not take place for files modified via mmap if the page was already in the page cache. This would also affect ext3 file systems mounted using the ext4 file system driver. The problem was that ext4_page_mkwrite() had a shortcut which would avoid calling __block_page_mkwrite() under some circumstances, and the above two commit transferred the responsibility of calling file_update_time() to __block_page_mkwrite --- which woudln't get called in some circumstances. Since __block_page_mkwrite() only has three callers, block_page_mkwrite(), ext4_page_mkwrite, and nilfs_page_mkwrite(), the best way to solve this is to move the responsibility for calling file_update_time() to its caller. This problem was found via xfstests #215 with a file system mounted with -o nodelalloc. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp> Cc: stable@vger.kernel.org
2012-09-30ext4: fix ext_remove_space for punch_hole caseDmitry Monakhov
Inode is allowed to have empty leaf only if it this is blockless inode. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-30ext4: punch_hole should wait for DIO writersDmitry Monakhov
punch_hole is the place where we have to wait for all existing writers (writeback, aio, dio), but currently we simply flush pended end_io request which is not sufficient. Other issue is that punch_hole performed w/o i_mutex held which obviously result in dangerous data corruption due to write-after-free. This patch performs following changes: - Guard punch_hole with i_mutex - Recheck inode flags under i_mutex - Block all new dio readers in order to prevent information leak caused by read-after-free pattern. - punch_hole now wait for all writers in flight NOTE: XXX write-after-free race is still possible because new dirty pages may appear due to mmap(), and currently there is no easy way to stop writeback while punch_hole is in progress. [ Fixed error return from ext4_ext_punch_hole() to make sure that we release i_mutex before returning EPERM or ETXTBUSY -- Ted ] Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-30generic sys_execve()Al Viro
Selected by __ARCH_WANT_SYS_EXECVE in unistd.h. Requires * working current_pt_regs() * *NOT* doing a syscall-in-kernel kind of kernel_execve() implementation. Using generic kernel_execve() is fine. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-30generic kernel_execve()Al Viro
based mostly on arm and alpha versions. Architectures can define __ARCH_WANT_KERNEL_EXECVE and use it, provided that * they have working current_pt_regs(), even for kernel threads. * kernel_thread-spawned threads do have space for pt_regs in the normal location. Normally that's as simple as switching to generic kernel_thread() and making sure that kernel threads do *not* go through return from syscall path; call the payload from equivalent of ret_from_fork if we are in a kernel thread (or just have separate ret_from_kernel_thread and make copy_thread() use it instead of ret_from_fork in kernel thread case). * they have ret_from_kernel_execve(); it is called after successful do_execve() done by kernel_execve() and gets normal pt_regs location passed to it as argument. It's essentially a longjmp() analog - it should set sp, etc. to the situation expected at the return for syscall and go there. Eventually the need for that sucker will disappear, but that'll take some surgery on kernel_thread() payloads. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-29vfs: dcache: fix deadlock in tree traversalMiklos Szeredi
IBM reported a deadlock in select_parent(). This was found to be caused by taking rename_lock when already locked when restarting the tree traversal. There are two cases when the traversal needs to be restarted: 1) concurrent d_move(); this can only happen when not already locked, since taking rename_lock protects against concurrent d_move(). 2) racing with final d_put() on child just at the moment of ascending to parent; rename_lock doesn't protect against this rare race, so it can happen when already locked. Because of case 2, we need to be able to handle restarting the traversal when rename_lock is already held. This patch fixes all three callers of try_to_ascend(). IBM reported that the deadlock is gone with this patch. [ I rewrote the patch to be smaller and just do the "goto again" if the lock was already held, but credit goes to Miklos for the real work. - Linus ] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-29JFFS2: don't fail on bitflips in OOBBrian Norris
JFFS2 was designed without thought for OOB bitflips, it seems, but they can occur and will be reported to JFFS2 via mtd_read_oob()[1]. We don't want to fail on these transactions, since the data was corrected. [1] Few drivers report bitflips for OOB-only transactions. With such drivers, this patch should have no effect. Signed-off-by: Brian Norris <computersforpeace@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2012-09-29JFFS2: fix unmount regressionArtem Bityutskiy
This patch fixes regression introduced by "8bdc81c jffs2: get rid of jffs2_sync_super". We submit a delayed work in order to make sure the write-buffer is synchronized at some point. But we do not flush it when we unmount, which causes an oops when we unmount the file-system and then the delayed work is executed. This patch fixes the issue by adding a "cancel_delayed_work_sync()" infocation in the '->sync_fs()' handler. This will make sure the delayed work is canceled on sync, unmount and re-mount. And because VFS always callse 'sync_fs()' before unmounting or remounting, this fixes the issue. Reported-by: Ludovic Desroches <ludovic.desroches@atmel.com> Cc: stable@vger.kernel.org [3.5+] Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Tested-by: Ludovic Desroches <ludovic.desroches@atmel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2012-09-29ext4: serialize truncate with owerwrite DIO workersDmitry Monakhov
Jan Kara have spotted interesting issue: There are potential data corruption issue with direct IO overwrites racing with truncate: Like: dio write truncate_task ->ext4_ext_direct_IO ->overwrite == 1 ->down_read(&EXT4_I(inode)->i_data_sem); ->mutex_unlock(&inode->i_mutex); ->ext4_setattr() ->inode_dio_wait() ->truncate_setsize() ->ext4_truncate() ->down_write(&EXT4_I(inode)->i_data_sem); ->__blockdev_direct_IO ->ext4_get_block ->submit_io() ->up_read(&EXT4_I(inode)->i_data_sem); # truncate data blocks, allocate them to # other inode - bad stuff happens because # dio is still in flight. In order to serialize with truncate dio worker should grab extra i_dio_count reference before drop i_mutex. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-29ext4: endless truncate due to nonlocked dio readersDmitry Monakhov
If we have enough aggressive DIO readers, truncate and other dio waiters will wait forever inside inode_dio_wait(). It is reasonable to disable nonlock DIO read optimization during truncate. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-29ext4: serialize unlocked dio reads with truncateDmitry Monakhov
Current serialization will works only for DIO which holds i_mutex, but nonlocked DIO following race is possible: dio_nolock_read_task truncate_task ->ext4_setattr() ->inode_dio_wait() ->ext4_ext_direct_IO ->ext4_ind_direct_IO ->__blockdev_direct_IO ->ext4_get_block ->truncate_setsize() ->ext4_truncate() #alloc truncated blocks #to other inode ->submit_io() #INFORMATION LEAK In order to serialize with unlocked DIO reads we have to rearrange wait sequence 1) update i_size first 2) if i_size about to be reduced wait for outstanding DIO requests 3) and only after that truncate inode blocks Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-29ext4: serialize dio nonlocked reads with defrag workersDmitry Monakhov
Inode's block defrag and ext4_change_inode_journal_flag() may affect nonlocked DIO reads result, so proper synchronization required. - Add missed inode_dio_wait() calls where appropriate - Check inode state under extra i_dio_count reference. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-29ext4: completed_io locking cleanupDmitry Monakhov
Current unwritten extent conversion state-machine is very fuzzy. - For unknown reason it performs conversion under i_mutex. What for? My diagnosis: We already protect extent tree with i_data_sem, truncate and punch_hole should wait for DIO, so the only data we have to protect is end_io->flags modification, but only flush_completed_IO and end_io_work modified this flags and we can serialize them via i_completed_io_lock. Currently all these games with mutex_trylock result in the following deadlock truncate: kworker: ext4_setattr ext4_end_io_work mutex_lock(i_mutex) inode_dio_wait(inode) ->BLOCK DEADLOCK<- mutex_trylock() inode_dio_done() #TEST_CASE1_BEGIN MNT=/mnt_scrach unlink $MNT/file fallocate -l $((1024*1024*1024)) $MNT/file aio-stress -I 100000 -O -s 100m -n -t 1 -c 10 -o 2 -o 3 $MNT/file sleep 2 truncate -s 0 $MNT/file #TEST_CASE1_END Or use 286's xfstests https://github.com/dmonakhov/xfstests/blob/devel/286 This patch makes state machine simple and clean: (1) xxx_end_io schedule final extent conversion simply by calling ext4_add_complete_io(), which append it to ei->i_completed_io_list NOTE1: because of (2A) work should be queued only if ->i_completed_io_list was empty, otherwise the work is scheduled already. (2) ext4_flush_completed_IO is responsible for handling all pending end_io from ei->i_completed_io_list Flushing sequence consists of following stages: A) LOCKED: Atomically drain completed_io_list to local_list B) Perform extents conversion C) LOCKED: move converted io's to to_free list for final deletion This logic depends on context which we was called from. D) Final end_io context destruction NOTE1: i_mutex is no longer required because end_io->flags modification is protected by ei->ext4_complete_io_lock Full list of changes: - Move all completion end_io related routines to page-io.c in order to improve logic locality - Move open coded logic from various xx_end_xx routines to ext4_add_complete_io() - remove EXT4_IO_END_FSYNC - Improve SMP scalability by removing useless i_mutex which does not protect io->flags anymore. - Reduce lock contention on i_completed_io_lock by optimizing list walk. - Rename ext4_end_io_nolock to end4_end_io and make it static - Check flush completion status to ext4_ext_punch_hole(). Because it is not good idea to punch blocks from corrupted inode. Changes since V3 (in request to Jan's comments): Fall back to active flush_completed_IO() approach in order to prevent performance issues with nolocked DIO reads. Changes since V2: Fix use-after-free caused by race truncate vs end_io_work Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-28ext4: fix unwritten counter leakageDmitry Monakhov
ext4_set_io_unwritten_flag() will increment i_unwritten counter, so once we mark end_io with EXT4_END_IO_UNWRITTEN we have to revert it back on error path. - add missed error checks to prevent counter leakage - ext4_end_io_nolock() will clear EXT4_END_IO_UNWRITTEN flag to signal that conversion finished. - add BUG_ON to ext4_free_end_io() to prevent similar leakage in future. Visible effect of this bug is that unaligned aio_stress may deadlock Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-28ext4: give i_aiodio_unwritten a more appropriate nameDmitry Monakhov
AIO/DIO prefix is wrong because it account unwritten extents which also may be scheduled from buffered write endio Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-28ext4: ext4_inode_info dietDmitry Monakhov
Generic inode has unused i_private pointer which may be used as cur_aio_dio storage. TODO: If cur_aio_dio will be passed as an argument to get_block_t this allow to have concurent AIO_DIO requests. Reviewed-by: Zheng Liu <wenqing.lz@taobao.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-28cifs: obtain file access during backup intent lookup (resend)Shirish Pargaonkar
Rebased and resending the patch. Path based queries can fail for lack of access, especially during lookup during open. open itself would actually succeed becasue of back up intent bit but queries (either path or file handle based) do not have a means to specifiy backup intent bit. So query the file info during lookup using trans2 / findfirst / file_id_full_dir_info to obtain file info as well as file_id/inode value. Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Acked-by: Jeff Layton <jlayton@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>
2012-09-28NFSv4.1: nfs4_proc_layoutreturn must always drop the plh_block_lgets countTrond Myklebust
Currently it does not do so if the RPC call failed to start. Fix is to move the decrement of plh_block_lgets into nfs4_layoutreturn_release. Also remove a redundant test of task->tk_status in nfs4_layoutreturn_done: if lrp->res.lrs_present is set, then obviously the RPC call succeeded. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28NFSv4.1: _pnfs_return_layout() shouldn't invalidate the layout on failureTrond Myklebust
Failure of the layoutreturn allocation fails is not a good reason to mark the pnfs_layout_hdr as having failed a layoutget or i/o. Just exit cleanly. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28NFSv4.1: Remove the NFS_LAYOUT_RETURNED stateTrond Myklebust
It serves no purpose that the test for whether or not we have valid layout segments doesn't already serve. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28NFSv4.1: Clear NFS_LAYOUT_BULK_RECALL when the layout segments are freedTrond Myklebust
Once all the affected layout segments have been freed up, clear the NFS_LAYOUT_BULK_RECALL flag so that we can reuse the pnfs_layout_hdr Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28NFSv4.1: Get rid of the NFS_LAYOUT_DESTROYED stateTrond Myklebust
We already have a mechanism for blocking LAYOUTGET by means of the plh_block_lgets counter. The only "service" that NFS_LAYOUT_DESTROYED provides at this point is to block layoutget once the layout segment list is empty, which basically means that you have to wait until the pnfs_layout_hdr is destroyed before you can do pNFS on that file again. This patch enables the reuse of the pnfs_layout_hdr if the layout segment list is empty. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28NFSv4.1: Remove unused 'default allocation' for pnfs_alloc_layout_hdr()Trond Myklebust
...and ditto for pnfs_free_layout_hdr() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28NFSv4.1: Get rid of pNFS spin lock debugging asserts...Trond Myklebust
These are all in static declared functions that are called only once. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28NFSv4.1: Balance pnfs_layout_hdr refcount in pnfs_layout_(insert|remove)_lsegTrond Myklebust
Ensure that the reference count for pnfs_layout_hdr reverts to the original value after a call to pnfs_layout_remove_lseg(). Note that the caller is expected to hold a reference to the struct pnfs_layout_hdr. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28NFSv4.1: Clean up pnfs_put_lseg()Trond Myklebust
There is no longer a need to use pnfs_free_lseg_list(). Just call pnfs_free_lseg() directly. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28NFSv4.1: Clean up the removal of pnfs_layout_hdr from the server listTrond Myklebust
Move the code into pnfs_free_layout_hdr(), and add checks to get_layout_by_fh_locked to ensure that they don't reference a layout that is being freed. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>