path: root/fs
Age | Commit message | Author
2014-04-24 | cifs: fix actimeo=0 corner case when cifs_i->time == jiffies | Jeff Layton
actimeo=0 is supposed to be a special case that ensures that inode attributes are always refetched from the server instead of trusting the cache. The cifs code however uses time_in_range() to determine whether the attributes have timed out. In the case where cifs_i->time equals jiffies, this leads to the cifs code not refetching the inode attributes when it should. Fix this by explicitly testing for actimeo=0, and handling it as a special case. Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
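The boundary case described above can be reproduced outside the kernel. What follows is a minimal user-space sketch, assuming a simplified model of a wraparound-safe range test in the spirit of the kernel's time_in_range(); the names tick_in_range(), attrs_expired_old() and attrs_expired_fixed() are hypothetical and are not the cifs functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Wraparound-safe "b <= a <= c" on a free-running tick counter
 * (a simplified model, not the kernel macro itself). */
static bool tick_in_range(unsigned long a, unsigned long b, unsigned long c)
{
    return (long)(a - b) >= 0 && (long)(c - a) >= 0;
}

/* Old behaviour: rely on the range test alone. With actimeo == 0 and
 * now == cached_at the range [cached_at, cached_at] still contains now,
 * so the cache is wrongly trusted. */
static bool attrs_expired_old(unsigned long now, unsigned long cached_at,
                              unsigned long actimeo)
{
    return !tick_in_range(now, cached_at, cached_at + actimeo);
}

/* Fixed behaviour: actimeo == 0 is handled first and always means
 * "refetch from the server". */
static bool attrs_expired_fixed(unsigned long now, unsigned long cached_at,
                                unsigned long actimeo)
{
    if (actimeo == 0)
        return true;
    return attrs_expired_old(now, cached_at, actimeo);
}
```

With now equal to cached_at and actimeo set to 0, the old check reports the attributes as still valid, while the fixed check reports them as expired.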
2014-04-24 | Btrfs: correctly set profile flags on seqlock retry | Filipe Manana
If we had to retry on the profiles seqlock (due to a concurrent write), we would set bits on the input flags that corresponded both to the current profile and to previous values of the profile. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com>
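The retry problem can be illustrated without a real seqlock. The sketch below is an assumption-laden model: struct attempt stands in for one pass through the read section, and apply_profile_buggy() and apply_profile_fixed() are hypothetical names, not the Btrfs functions.

```c
#include <assert.h>
#include <stddef.h>

/* One simulated pass through a seqlock read section: the profile value the
 * reader observed, and whether a concurrent writer forces a retry. */
struct attempt {
    unsigned long long profile;
    int must_retry;
};

/* Buggy pattern: OR into the input flags inside the loop, so bits from a
 * discarded attempt survive the retry. */
static unsigned long long apply_profile_buggy(unsigned long long flags,
                                              const struct attempt *a, size_t n)
{
    assert(n > 0 && !a[n - 1].must_retry);
    for (size_t i = 0; i < n; i++) {
        flags |= a[i].profile;
        if (!a[i].must_retry)
            break;
    }
    return flags;
}

/* Fixed pattern: recompute the result from the untouched input on every
 * attempt, so only the last (consistent) read contributes. */
static unsigned long long apply_profile_fixed(unsigned long long flags,
                                              const struct attempt *a, size_t n)
{
    unsigned long long result = flags;

    assert(n > 0 && !a[n - 1].must_retry);
    for (size_t i = 0; i < n; i++) {
        result = flags | a[i].profile;
        if (!a[i].must_retry)
            break;
    }
    return result;
}
```

If the first attempt sees an old profile and is retried, the buggy variant returns the bits of both the old and the current profile, while the fixed variant returns only the current one.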
2014-04-24 | Btrfs: use correct key when repeating search for extent item | Filipe Manana
If skinny metadata is enabled and our first tree search fails to find a skinny extent item, we may repeat a tree search for a "fat" extent item (if the previous item in the leaf is not the "fat" extent we're looking for). However we were not setting the new key's objectid to the right value, as we previously used the same key variable to peek at the previous item in the leaf, which has a different objectid. So just set the right objectid to avoid modifying/deleting a wrong item if we repeat the tree search. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-04-24 | Btrfs: fix inode caching vs tree log | Miao Xie
Currently, with the inode cache enabled, we reuse an inode id immediately after unlinking the file, so we may hit something like the following:

  |->iput inode
  |->return inode id into inode cache
  |->create dir,fsync
  |->power off

An easy way to reproduce this problem is:

  mkfs.btrfs -f /dev/sdb
  mount /dev/sdb /mnt -o inode_cache,commit=100
  dd if=/dev/zero of=/mnt/data bs=1M count=10 oflag=sync
  inode_id=`ls -i /mnt/data | awk '{print $1}'`
  rm -f /mnt/data

  i=1
  while [ 1 ]
  do
          mkdir /mnt/dir_$i
          test1=`stat /mnt/dir_$i | grep Inode: | awk '{print $4}'`
          if [ $test1 -eq $inode_id ]
          then
                  dd if=/dev/zero of=/mnt/dir_$i/data bs=1M count=1 oflag=sync
                  echo b > /proc/sysrq-trigger
          fi
          sleep 1
          i=$(($i+1))
  done

  mount /dev/sdb /mnt
  umount /dev/sdb
  btrfs check /dev/sdb

We fix this problem by adding the unlinked inode's id to the pinned tree, so that it cannot be reused until the transaction is committed.

Cc: stable@vger.kernel.org
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-24 | Btrfs: fix possible memory leaks in open_ctree() | Wang Shilong
Fix possible memory leaks in the following error handling paths:

  read_tree_block()
  btrfs_recover_log_trees
  btrfs_commit_super()
  btrfs_find_orphan_roots()
  btrfs_cleanup_fs_roots()

Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-24 | Btrfs: avoid triggering bug_on() when we fail to start inode caching task | Wang Shilong
When running a stress test (including snapshots, balance, fstress), we trigger the following BUG_ON() because we fail to start the inode caching task.

[ 181.131945] kernel BUG at fs/btrfs/inode-map.c:179!
[ 181.137963] invalid opcode: 0000 [#1] SMP
[ 181.217096] CPU: 11 PID: 2532 Comm: btrfs Not tainted 3.14.0 #1
[ 181.240521] task: ffff88013b621b30 ti: ffff8800b6ada000 task.ti: ffff8800b6ada000
[ 181.367506] Call Trace:
[ 181.371107] [<ffffffffa036c1be>] btrfs_return_ino+0x9e/0x110 [btrfs]
[ 181.379191] [<ffffffffa038082b>] btrfs_evict_inode+0x46b/0x4c0 [btrfs]
[ 181.387464] [<ffffffff810b5a70>] ? autoremove_wake_function+0x40/0x40
[ 181.395642] [<ffffffff811dc5fe>] evict+0x9e/0x190
[ 181.401882] [<ffffffff811dcde3>] iput+0xf3/0x180
[ 181.408025] [<ffffffffa03812de>] btrfs_orphan_cleanup+0x1ee/0x430 [btrfs]
[ 181.416614] [<ffffffffa03a6abd>] btrfs_mksubvol.isra.29+0x3bd/0x450 [btrfs]
[ 181.425399] [<ffffffffa03a6cd6>] btrfs_ioctl_snap_create_transid+0x186/0x190 [btrfs]
[ 181.435059] [<ffffffffa03a6e3b>] btrfs_ioctl_snap_create_v2+0xeb/0x130 [btrfs]
[ 181.444148] [<ffffffffa03a9656>] btrfs_ioctl+0xf76/0x2b90 [btrfs]
[ 181.451971] [<ffffffff8117e565>] ? handle_mm_fault+0x475/0xe80
[ 181.459509] [<ffffffff8167ba0c>] ? __do_page_fault+0x1ec/0x520
[ 181.467046] [<ffffffff81185b35>] ? do_mmap_pgoff+0x2f5/0x3c0
[ 181.474393] [<ffffffff811d4da8>] do_vfs_ioctl+0x2d8/0x4b0
[ 181.481450] [<ffffffff811d5001>] SyS_ioctl+0x81/0xa0
[ 181.488021] [<ffffffff81680b69>] system_call_fastpath+0x16/0x1b

We should avoid triggering BUG_ON() here; instead, we output a warning message and clear the inode_cache option.

Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-04-24 | Btrfs: move btrfs_{set,clear}_and_info() to ctree.h | Wang Shilong
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-04-24 | btrfs: replace error code from btrfs_drop_extents | David Sterba
There is a case that clone does not handle (test case xfstests/btrfs/035). It used to hit a BUG_ON and now returns EINVAL. This error code is confusing to the ioctl caller, as it normally signifies erroneous arguments. Change it to EOPNOTSUPP, which allows a fallback to copy instead of clone. This does not affect the common reflink operation. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
2014-04-24 | btrfs: Change the hole range to a more accurate value. | Qu Wenruo
Commit 3ac0d7b96a268a98bd474cab8bce3a9f125aaccf fixed the btrfs expanding write problem, but the hole punched is sometimes too large for an iovec that has unmapped data ranges. This patch changes the hole range to a more accurate value using the counts checked by the write check routines. Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-04-24 | xfs: enable the finobt feature on v5 superblocks | Brian Foster
Add the finobt feature bit to the list of known features. As of this point, the kernel code knows how to mount and manage both finobt and non-finobt formatted filesystems. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-24 | xfs: report finobt status in fs geometry | Brian Foster
Define the XFS_FSOP_GEOM_FLAGS_FINOBT fs geometry flag and set the associated bit if the filesystem supports the free inode btree. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-24 | xfs: add finobt support to growfs | Brian Foster
Add finobt support to growfs. Initialize the agi root/level fields and the root finobt block. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-24 | xfs: update the finobt on inode free | Brian Foster
An inode free operation can have several effects on the finobt. If all inodes have been freed and the chunk deallocated, we remove the finobt record. If the inode chunk was previously full, we must insert a new record based on the existing inobt record. Otherwise, we modify the record in place. Create the xfs_difree_finobt() function to identify the potential scenarios and update the finobt appropriately. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
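The three outcomes listed above can be written as a small decision function. This is only a sketch under stated assumptions: finobt_on_free() and enum finobt_action are hypothetical names, the inputs are plain counters instead of btree records, and it does not reproduce xfs_difree_finobt() itself.

```c
#include <assert.h>
#include <stdbool.h>

enum finobt_action { FINOBT_INSERT, FINOBT_MODIFY, FINOBT_REMOVE };

/* freecount_before:  free inodes in the chunk before this free.
 * inodes_per_chunk:  chunk size (XFS uses 64-inode chunks).
 * chunk_deallocated: whether a chunk that becomes fully free is released. */
static enum finobt_action finobt_on_free(unsigned freecount_before,
                                         unsigned inodes_per_chunk,
                                         bool chunk_deallocated)
{
    assert(inodes_per_chunk > 1);
    assert(freecount_before < inodes_per_chunk);

    unsigned freecount_after = freecount_before + 1;

    if (freecount_after == inodes_per_chunk && chunk_deallocated)
        return FINOBT_REMOVE;   /* every inode freed and chunk released */
    if (freecount_before == 0)
        return FINOBT_INSERT;   /* chunk was full, so it had no record yet */
    return FINOBT_MODIFY;       /* record exists and stays: update in place */
}
```

For a 64-inode chunk, freeing the first inode of a full chunk yields an insert, freeing the last allocated inode yields a removal when the chunk is released, and everything in between is an in-place update.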
2014-04-24 | xfs: refactor xfs_difree() inobt bits into xfs_difree_inobt() helper | Brian Foster
Refactor xfs_difree() in preparation for the finobt. xfs_difree() performs the validity checks against the ag and reads the agi header. The work of physically updating the inode allocation btree is pushed down into the new xfs_difree_inobt() helper. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-24 | xfs: use and update the finobt on inode allocation | Brian Foster
Replace xfs_dialloc_ag() with an implementation that looks for a record in the finobt. The finobt only tracks records with at least one free inode. This eliminates the need for the intra-ag scan in the original algorithm. Once the inode is allocated, update the finobt appropriately (possibly removing the record) as well as the inobt. Move the original xfs_dialloc_ag() algorithm to xfs_dialloc_ag_inobt() and fall back as such if finobt support is not enabled. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
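The key property, that a record stays in the free-inode tree only while it has at least one free inode, can be sketched with a bitmask. The struct free_rec type and alloc_from_rec() below are hypothetical stand-ins for illustration and are not the XFS on-disk record or allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical in-memory stand-in for one free-inode record. */
struct free_rec {
    uint64_t free_mask;   /* bit i set => inode i of the chunk is free */
};

/* Take the lowest free inode from the record. Returns its index within the
 * chunk and sets *drop_record when no free inodes remain, in which case the
 * record should leave the free-inode tree. */
static int alloc_from_rec(struct free_rec *rec, bool *drop_record)
{
    /* The tree only holds records with at least one free inode. */
    assert(rec->free_mask != 0);

    int idx = 0;
    while (!(rec->free_mask & (UINT64_C(1) << idx)))
        idx++;

    rec->free_mask &= ~(UINT64_C(1) << idx);
    *drop_record = (rec->free_mask == 0);
    return idx;
}
```

Because every record in such a tree is guaranteed to have a free inode, a lookup never has to scan past fully allocated chunks, which is the scan the commit message says is eliminated.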
2014-04-24 | xfs: insert newly allocated inode chunks into the finobt | Brian Foster
A newly allocated inode chunk, by definition, has at least one free inode, so a record is always inserted into the finobt. Create the xfs_inobt_insert() helper from existing code to insert a record in an inobt based on the provided BTNUM. Update xfs_ialloc_ag_alloc() to invoke the helper for the existing XFS_BTNUM_INO tree and XFS_BTNUM_FINO tree, if enabled. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-24 | xfs: update inode allocation/free transaction reservations for finobt | Brian Foster
Create the xfs_calc_finobt_res() helper to calculate the finobt log reservation for inode allocation and free. Update XFS_IALLOC_SPACE_RES() to reserve blocks for the additional finobt insertion on inode allocation. Create XFS_IFREE_SPACE_RES() to reserve blocks for the potential finobt record insertion on inode free (i.e., if an inode chunk was previously fully allocated). Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-24 | xfs: support the XFS_BTNUM_FINOBT free inode btree type | Brian Foster
Define the AGI fields for the finobt root/level and add magic numbers. Update the btree code to add support for the new XFS_BTNUM_FINOBT inode btree. The finobt root block is reserved immediately following the inobt root block in the AG. Update XFS_PREALLOC_BLOCKS() to determine the starting AG data block based on whether finobt support is enabled. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-24 | xfs: reserve v5 superblock read-only compat. feature bit for finobt | Brian Foster
Reserve a v5 read-only compatibility feature bit for the finobt and create the xfs_sb_version_hasfinobt() helper to determine whether an fs has the feature enabled. The finobt does not change existing on-disk structures, but must remain consistent with the ialloc btree. Modifications from older kernels would violate that constraint. Therefore, we restrict older kernels to read-only mounts of finobt-enabled filesystems. Note that this does not yet enable the ability to rw mount a finobt fs (by setting the feature bit in the XFS_SB_FEAT_RO_COMPAT_ALL mask). Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
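The general idea of a read-only compatibility bit can be sketched as a mask check. The code below is an illustration only: DEMO_RO_COMPAT_FINOBT, its bit value, and check_ro_compat() are hypothetical and do not mirror the XFS superblock code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bit, used only for this illustration. */
#define DEMO_RO_COMPAT_FINOBT (UINT32_C(1) << 0)

enum mount_result { MOUNT_OK, MOUNT_RO_ONLY };

/* sb_ro_compat: read-only compat feature bits recorded in the superblock.
 * known_mask:   bits this kernel knows how to keep consistent.
 * want_rw:      whether a read-write mount was requested. */
static enum mount_result check_ro_compat(uint32_t sb_ro_compat,
                                         uint32_t known_mask, bool want_rw)
{
    uint32_t unknown = sb_ro_compat & ~known_mask;

    if (unknown && want_rw)
        return MOUNT_RO_ONLY;   /* writing could break the unknown feature */
    return MOUNT_OK;            /* reading is always safe for ro-compat bits */
}
```

A kernel whose known mask lacks the bit may still mount the filesystem read-only, and a kernel whose mask includes the bit may mount it read-write.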
2014-04-24 | xfs: refactor xfs_ialloc_btree.c to support multiple inobt numbers | Brian Foster
The introduction of the free inode btree (finobt) requires that xfs_ialloc_btree.c handle multiple trees. Refactor xfs_ialloc_btree.c so the caller specifies the btree type on cursor initialization to prepare for addition of the finobt. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-23 | locks: rename FL_FILE_PVT and IS_FILE_PVT to use "*_OFDLCK" instead | Jeff Layton
File-private locks have been re-christened as "open file description" locks. Finish the symbol name cleanup in the internal implementation. Signed-off-by: Jeff Layton <jlayton@redhat.com>
2014-04-23 | xfs: add filestream allocator tracepoints | Christoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-23 | xfs: remove xfs_filestream_associate | Christoph Hellwig
There is no good reason to create a filestream when a directory entry is created. Delay it until the first allocation happens to simplify the code and reduce the number of mru cache lookups we do. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-23 | xfs: don't create a slab cache for filestream items | Christoph Hellwig
We only have very few of these around, and allocation isn't that much of a hot path. Remove the slab cache to simplify the code, and to not waste any resources for the usual case of not having any inodes that use the filestream allocator. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-23 | xfs: rewrite the filestream allocator using the dentry cache | Christoph Hellwig
In Linux we will always be able to find a parent inode for files that are undergoing I/O. Use this to simplify the file stream allocator by only keeping track of parent inodes. Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-04-23 | xfs: remove XFS_IFILESTREAM | Christoph Hellwig
We never test the flag except in xfs_inode_is_filestream, but that function already tests the on-disk flag or filesystem wide flags, and is used to decide if we want to set XFS_IFILESTREAM in the first place. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-23 | xfs: embed mru_elem into parent structure | Christoph Hellwig
There is no need to do a separate allocation for each mru element; just embed the structure into the parent one in the user. Besides saving a memory allocation and the infrastructure required for it, this also simplifies the API. While we do major surgery on xfs_mru_cache.c, also de-typedef it and make struct mru_cache private to the implementation file. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
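Embedding an element in its owner is a common C idiom and can be shown in a few lines. The sketch below assumes hypothetical types (struct elem, struct stream_item) and a user-space version of the container_of pattern; it is not the xfs_mru_cache.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Generic cache element; the cache only ever handles this type. */
struct elem {
    struct elem *next;
    unsigned long key;
};

/* User structure with the element embedded instead of separately allocated. */
struct stream_item {
    int ag;             /* payload owned by the user */
    struct elem mru;    /* embedded element, no extra allocation needed */
};

/* Recover the enclosing structure from a pointer to its embedded member. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static struct stream_item *item_from_elem(struct elem *e)
{
    assert(e != NULL);
    return container_of(e, struct stream_item, mru);
}
```

The cache can hand back a struct elem pointer and the user converts it to the owning structure with pointer arithmetic, so one allocation covers both the payload and the cache bookkeeping.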
2014-04-23 | xfs: handle duplicate entries in xfs_mru_cache_insert | Christoph Hellwig
The radix tree code can detect and reject duplicate keys at insert time. Make xfs_mru_cache_insert handle this case so that future changes to the filestream allocator can take advantage of this. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-23 | xfs: split xfs_bmap_btalloc_nullfb | Christoph Hellwig
Split xfs_bmap_btalloc_nullfb into one function for filestream allocations and one for everything else that share a few helpers. This dramatically simplifies the control flow. Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-04-22 | fs/bio.c: remove nr_segs (unused function parameter) | Fabian Frederick
nr_segs is no longer used in bio_alloc_map_data since c8db444820a1e3 ("block: Don't save/copy bvec array anymore") Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-04-22 | fs/bio: remove bs parameter in biovec_create_pool | Fabian Frederick
bs is no longer used in biovec_create_pool since 9f060e2231ca96 ("block: Convert integrity to bvec_alloc_bs()") Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@fb.com>
2014-04-22 | fs/aio.c: Remove ctx parameter in kiocb_cancel | Fabian Frederick
ctx is no longer used in kiocb_cancel since 57282d8fd74407 ("aio: Kill ki_users") Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
2014-04-22 | locks: rename file-private locks to "open file description locks" | Jeff Layton
File-private locks have been merged into Linux for v3.15, and *now* people are commenting that the name and macro definitions for the new file-private locks suck.

...and I can't even disagree. The names and command macros do suck.

We're going to have to live with these for a long time, so it's important that we be happy with the names before we're stuck with them. The consensus on the lists so far is that they should be rechristened as "open file description locks".

The name isn't a big deal for the kernel, but the command macros are not visually distinct enough from the traditional POSIX lock macros. The glibc and documentation folks are recommending that we change them to look like F_OFD_{GETLK|SETLK|SETLKW}. That lessens the chance that a programmer will mistype one of the commands, and also makes it easier to spot this difference when reading code.

This patch makes the following changes that I think are necessary before v3.15 ships:

1) rename the command macros to their new names. These end up in the uapi headers and so are part of the external-facing API. It turns out that glibc doesn't actually use the fcntl.h uapi header, but it's hard to be sure that something else won't. Changing it now is safest.

2) make the /proc/locks output display these as type "OFDLCK"

Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Carlos O'Donell <carlos@redhat.com>
Cc: Stefan Metzmacher <metze@samba.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Frank Filz <ffilzlnx@mindspring.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
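From user space, the renamed commands are used through fcntl(2). The following sketch assumes a Linux system with kernel 3.15 or later; the helper names ofd_try_wrlock() and ofd_conflict_demo() are hypothetical, and the fallback value for F_OFD_SETLK is an assumption taken from the generic Linux uapi header for libcs that do not expose the macro.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE         /* exposes F_OFD_* in glibc's <fcntl.h> */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#ifndef F_OFD_SETLK
#define F_OFD_SETLK 37      /* assumption: generic Linux uapi value */
#endif

/* Try to take a whole-file write lock owned by the open file description
 * behind fd. Returns 0 on success, -1 with errno set on failure. */
static int ofd_try_wrlock(int fd)
{
    struct flock fl = {
        .l_type = F_WRLCK,
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = 0,         /* 0 means "to end of file" */
        .l_pid = 0,         /* must be 0 for the OFD commands */
    };

    assert(fd >= 0);
    return fcntl(fd, F_OFD_SETLK, &fl);
}

/* Open the same file twice in one process. Returns 1 if the second open
 * file description is refused the lock, 0 if it is granted, and -1 if the
 * demo could not run. */
static int ofd_conflict_demo(const char *path)
{
    int fd1 = open(path, O_RDWR | O_CREAT, 0600);
    int fd2 = open(path, O_RDWR);
    int result = -1;

    if (fd1 >= 0 && fd2 >= 0 && ofd_try_wrlock(fd1) == 0) {
        if (ofd_try_wrlock(fd2) == 0)
            result = 0;
        else if (errno == EAGAIN || errno == EACCES)
            result = 1;
    }
    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    return result;
}
```

Unlike traditional POSIX locks, which are owned by the process, these locks are owned by the open file description, so two separate open() calls in the same process are expected to conflict.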
2014-04-21 | ext4: remove obsoleted check | Dmitry Monakhov
BH cannot be NULL at this point: ext4_read_dirblock() always returns a non-NULL value, and we have already done all necessary checks. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-04-21 | ext4: add a new spinlock i_raw_lock to protect the ext4's raw inode | Theodore Ts'o
To avoid potential data races, use a spinlock which protects the raw (on-disk) inode. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
2014-04-21 | ext4: fix locking for O_APPEND writes | Theodore Ts'o
Al Viro pointed out that locking for O_APPEND writes was problematic, since the location of the write isn't known until after we take the i_mutex, which impacts the ext4_unaligned_aio() and s_bitmap_maxbytes check. For O_APPEND always assume that the write is unaligned so call ext4_unwritten_wait(). And to solve the second problem, take the i_mutex earlier before we start the s_bitmap_maxbytes check. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-04-21 | ext4: factor out common code in ext4_file_write() | Theodore Ts'o
This shouldn't change any logic flow; just delete duplicated code. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
2014-04-21 | ext4: move ext4_file_dio_write() into ext4_file_write() | Theodore Ts'o
This commit doesn't actually change anything; it just moves code around in preparation for some code simplification work. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
2014-04-21 | ext4: inline generic_file_aio_write() into ext4_file_write() | Theodore Ts'o
Copy generic_file_aio_write() into ext4_file_write(). This is part of a patch series which allows us to simplify ext4_file_write() and ext4_file_dio_write(), by calling __generic_file_aio_write() directly. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
2014-04-21 | fs: fix new kernel-doc warnings in fs/bio.c | Randy Dunlap
Fix new kernel-doc warnings in fs/bio.c:

  Warning(fs/bio.c:316): No description found for parameter 'bio'
  Warning(fs/bio.c:316): No description found for parameter 'parent'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-04-20 | ext4: rename uninitialized extents to unwritten | Lukas Czerner
Currently in ext4 there is quite a mess when it comes to naming unwritten extents. Sometimes we call it uninitialized and sometimes we refer to it as unwritten. The right name for the extent which has been allocated but does not contain any written data is _unwritten_. Other file systems are using this name consistently, even the buffer head state refers to it as unwritten. We need to fix this confusion in ext4. This commit changes every reference to an uninitialized extent (meaning allocated but unwritten) to unwritten extent. This includes comments, function names and variable names. It even covers abbreviation of the word uninitialized (such as uninit) and some misspellings. This commit does not change any of the code paths at all. This has been confirmed by comparing md5sums of the assembly code of each object file after all the function names were stripped from it. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-04-20 | ext4: get rid of EXT4_MAP_UNINIT flag | Lukas Czerner
Currently EXT4_MAP_UNINIT is used in the dioread_nolock case to mark the cases where we're writing into either an unallocated or an unwritten extent, because we need to make sure that any DIO write into that inode will wait for the extent conversion. However, EXT4_MAP_UNINIT is not only an entirely misleading name but also unnecessary, because we can check for EXT4_MAP_UNWRITTEN in the dioread_nolock case instead. This commit removes the EXT4_MAP_UNINIT flag. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-04-20 | Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 | Linus Torvalds
Pull ext4 fixes from Ted Ts'o:
 "These are regression and bug fixes for ext4.

  We had a number of new features in ext4 during this merge window (ZERO_RANGE and COLLAPSE_RANGE fallocate modes, renameat, etc.) so there were many more regression and bug fixes this time around. It didn't help that xfstests hadn't been fully updated to fully stress test COLLAPSE_RANGE until after -rc1"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (31 commits)
  ext4: disable COLLAPSE_RANGE for bigalloc
  ext4: fix COLLAPSE_RANGE failure with 1KB block size
  ext4: use EINVAL if not a regular file in ext4_collapse_range()
  ext4: enforce we are operating on a regular file in ext4_zero_range()
  ext4: fix extent merging in ext4_ext_shift_path_extents()
  ext4: discard preallocations after removing space
  ext4: no need to truncate pagecache twice in collapse range
  ext4: fix removing status extents in ext4_collapse_range()
  ext4: use filemap_write_and_wait_range() correctly in collapse range
  ext4: use truncate_pagecache() in collapse range
  ext4: remove temporary shim used to merge COLLAPSE_RANGE and ZERO_RANGE
  ext4: fix ext4_count_free_clusters() with EXT4FS_DEBUG and bigalloc enabled
  ext4: always check ext4_ext_find_extent result
  ext4: fix error handling in ext4_ext_shift_extents
  ext4: silence sparse check warning for function ext4_trim_extent
  ext4: COLLAPSE_RANGE only works on extent-based files
  ext4: fix byte order problems introduced by the COLLAPSE_RANGE patches
  ext4: use i_size_read in ext4_unaligned_aio()
  fs: disallow all fallocate operation on active swapfile
  fs: move falloc collapse range check into the filesystem methods
  ...
2014-04-19 | ext4: disable COLLAPSE_RANGE for bigalloc | Namjae Jeon
Disable COLLAPSE_RANGE for ext4 with the bigalloc feature until the root cause of the problem is found. It will be enabled again once the xfstests regressions (generic/075 and generic/091) are fixed. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-04-19 | ext4: fix COLLAPSE_RANGE failure with 1KB block size | Namjae Jeon
When formatting with a 1KB or 2KB block size (not aligned with PAGE_SIZE), xfstests generic/075 and 091 fail. The offset supplied to truncate_pagecache_range() is block-size aligned. In this function the start offset is re-aligned to PAGE_SIZE by rounding up to the next page boundary. Due to this rounding up, old data remains in the page cache when the block size is less than the page size and the start offset is not aligned with the page size. In the case of collapse range, we need to align the start offset to a page size boundary by rounding down instead of rounding up. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
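The difference between the two alignments is easy to check with numbers. The helpers below are a stand-alone sketch with hypothetical names (round_down_pow2(), round_up_pow2()); they are not the kernel's macros and assume a power-of-two alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Round x down to a multiple of align; align must be a power of two. */
static uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
    assert(align && (align & (align - 1)) == 0);
    return x & ~(align - 1);
}

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
    assert(align && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}
```

Assuming 4096-byte pages and a 1KB block size, a block-aligned start offset of 5120 rounds up to 8192, which leaves the page covering bytes 4096 to 8191 (and therefore the stale data from 5120 onward) in the page cache. Rounding down gives 4096, so that page is dropped as well.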
2014-04-19 | coredump: fix va_list corruption | Eric Dumazet
A va_list needs to be copied in case it needs to be used twice.

Thanks to Hugh for debugging this issue, leading to various panics.

Tested:

lpq84:~# echo "|/foobar12345 %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h" >/proc/sys/kernel/core_pattern

'produce_core' is simply : main() { *(int *)0 = 1;}

lpq84:~# ./produce_core
Segmentation fault (core dumped)
lpq84:~# dmesg | tail -1
[ 614.352947] Core dump to |/foobar12345 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 (null) pipe failed

Notice the last argument was replaced by a NULL (we were lucky enough to not crash, but do not try this on your production machine !)

After fix :

lpq83:~# echo "|/foobar12345 %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h" >/proc/sys/kernel/core_pattern
lpq83:~# ./produce_core
Segmentation fault
lpq83:~# dmesg | tail -1
[ 740.800441] Core dump to |/foobar12345 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 pipe failed

Fixes: 5fe9d8ca21cc ("coredump: cn_vprintf() has no reason to call vsnprintf() twice")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Diagnosed-by: Hugh Dickins <hughd@google.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
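The rule stated in the first sentence can be demonstrated in user space with the standard va_copy() macro. This is a minimal sketch, not the kernel's cn_vprintf(); format_alloc() is a hypothetical helper that formats once to measure and once to write.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a freshly allocated buffer. The argument list is consumed
 * twice (once to measure, once to write), so the second pass works on a
 * va_copy() instead of the already-used va_list. The caller frees the
 * returned buffer. */
static char *format_alloc(const char *fmt, ...)
{
    va_list ap, ap_copy;
    char *buf;
    int len;

    assert(fmt != NULL);

    va_start(ap, fmt);
    va_copy(ap_copy, ap);               /* copy before ap is consumed */

    len = vsnprintf(NULL, 0, fmt, ap);  /* pass 1: measure */
    va_end(ap);
    if (len < 0) {
        va_end(ap_copy);
        return NULL;
    }

    buf = malloc((size_t)len + 1);
    if (buf)
        vsnprintf(buf, (size_t)len + 1, fmt, ap_copy);  /* pass 2: write */
    va_end(ap_copy);
    return buf;
}
```

Reusing ap for the second vsnprintf() call instead of ap_copy would be undefined behaviour, which matches the symptom in the transcript above where the last argument came out as (null).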
2014-04-19 | fix races between __d_instantiate() and checks of dentry flags | Al Viro
in non-lazy walk we need to be careful about dentry switching from negative to positive - both ->d_flags and ->d_inode are updated, and in some places we might see only one store. The cases where dentry has been obtained by dcache lookup with ->i_mutex held on parent are safe - ->d_lock and ->i_mutex provide all the barriers we need. However, there are several places where we run into trouble:

* do_last() fetches ->d_inode, then checks ->d_flags and assumes that inode won't be NULL unless d_is_negative() is true. Race with e.g. creat() - we might have fetched the old value of ->d_inode (still NULL) and new value of ->d_flags (already not DCACHE_MISS_TYPE). Lin Ming has observed and reported the resulting oops.

* a bunch of places checks ->d_inode for being non-NULL, then checks ->d_flags for "is it a symlink". Race with symlink(2) in case if our CPU sees ->d_inode update first - we see non-NULL there, but ->d_flags still contains DCACHE_MISS_TYPE instead of DCACHE_SYMLINK_TYPE. Result: false negative on "should we follow link here?", with subsequent unpleasantness.

Cc: stable@vger.kernel.org # 3.13 and 3.14 need that one
Reported-and-tested-by: Lin Ming <minggr@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-18 | Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6 | Linus Torvalds
Pull cifs fixes from Steve French:
 "A set of 5 small cifs fixes"

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
  cif: fix dead code
  cifs: fix error handling cifs_user_readv
  fs: cifs: remove unused variable.
  Return correct error on query of xattr on file with empty xattrs
  cifs: Wait for writebacks to complete before attempting write.
2014-04-18 | Merge tag 'driver-core-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core | Linus Torvalds
Pull driver core fixes from Greg KH:
 "Here are some driver core fixes for 3.15-rc2. Also in here are some documentation updates, as well as an API removal that had to wait for after -rc1 due to the cleanups coming into you from multiple developer trees (this one and the PPC tree.)

  All have been in linux next successfully"

* tag 'driver-core-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  drivers/base/dd.c incorrect pr_debug() parameters
  Documentation: Update stable address in Chinese and Japanese translations
  topology: Fix compilation warning when not in SMP
  Chinese: add translation of io_ordering.txt
  stable_kernel_rules: spelling/word usage
  sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner()
  kernfs: protect lazy kernfs_iattrs allocation with mutex
  fs: Don't return 0 from get_anon_bdev
2014-04-18 | ext4: use EINVAL if not a regular file in ext4_collapse_range() | Theodore Ts'o
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>