Age | Commit message (Collapse) | Author |
|
Introduce new FALLOC_FL_ZERO_RANGE flag for fallocate. This has the same
functionality as xfs ioctl XFS_IOC_ZERO_RANGE.
It can be used to convert a range of file to zeros preferably without
issuing data IO. Blocks should be preallocated for the regions that span
holes in the file, and the entire range is preferable converted to
unwritten extents
This can be also used to preallocate blocks past EOF in the same way as
with fallocate. Flag FALLOC_FL_KEEP_SIZE which should cause the inode
size to remain the same.
Also add appropriate tracepoints.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Move block allocation out of the ext4_fallocate into separate function
called ext4_alloc_file_blocks(). This will allow us to use the same
allocation code for other allocation operations such as zero range which
is commit in the next patch.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Currently in ext4_fallocate we would update inode size, c_time and sync
the file with every partial allocation which is entirely unnecessary. It
is true that if the crash happens in the middle of truncate we might end
up with unchanged i size, or c_time which I do not think is really a
problem - it does not mean file system corruption in any way. Note that
xfs is doing things the same way e.g. update all of the mentioned after
the allocation is done.
This commit moves all the updates after the allocation is done. In
addition we also need to change m_time as not only inode has been change
bot also data regions might have changed (unwritten extents). However
m_time will be only updated when i_size changed.
Also we do not need to be paranoid about changing the c_time only if the
actual allocation have happened, we can change it even if we try to
allocate only to find out that there are already block allocated. It's
not really a big deal and it will save us some additional complexity.
Also use ext4_debug, instead of ext4_warning in #ifdef EXT4FS_DEBUG
section.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>-
--
v3: Do not remove the code to set EXT4_INODE_EOFBLOCKS flag
fs/ext4/extents.c | 96 ++++++++++++++++++++++++-------------------------------
1 file changed, 42 insertions(+), 54 deletions(-)
|
|
Commit 9cb00419fa, which enables hole punching for bigalloc file
systems, exposed a bug introduced by commit 6ae06ff51e in an earlier
release. When run on a bigalloc file system, xfstests generic/013, 068,
075, 083, 091, 100, 112, 127, 263, 269, and 270 fail with e2fsck errors
or cause kernel error messages indicating that previously freed blocks
are being freed again.
The latter commit optimizes the selection of the starting extent in
ext4_ext_rm_leaf() when hole punching by beginning with the extent
supplied in the path argument rather than with the last extent in the
leaf node (as is still done when truncating). However, the code in
rm_leaf that initially sets partial_cluster to track cluster sharing on
extent boundaries is only guaranteed to run if rm_leaf starts with the
last node in the leaf. Consequently, partial_cluster is not correctly
initialized when hole punching, and a cluster on the boundary of a
punched region that should be retained may instead be deallocated.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
|
|
Code deallocating the extent path referenced by an argument to
ext4_ext_handle_uninitialized_extents was made redundant with identical
code in its one caller, ext4_ext_map_blocks, by commit 3779473246.
Allocating and deallocating the path in the same function also makes
the code clearer.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
This is the only time it is required for ext4.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Previously, the no-op "mount -o mount /dev/xxx" operation when the
file system is already mounted read-write causes an implied,
unconditional syncfs(). This seems pretty stupid, and it's certainly
documented or guaraunteed to do this, nor is it particularly useful,
except in the case where the file system was mounted rw and is getting
remounted read-only.
However, it's possible that there might be some file systems that are
actually depending on this behavior. In most file systems, it's
probably fine to only call sync_filesystem() when transitioning from
read-write to read-only, and there are some file systems where this is
not needed at all (for example, for a pseudo-filesystem or something
like romfs).
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-fsdevel@vger.kernel.org
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Jan Kara <jack@suse.cz>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Anders Larsen <al@alarsen.net>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: Petr Vandrovec <petr@vandrovec.name>
Cc: xfs@oss.sgi.com
Cc: linux-btrfs@vger.kernel.org
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Cc: codalist@coda.cs.cmu.edu
Cc: linux-ext4@vger.kernel.org
Cc: linux-f2fs-devel@lists.sourceforge.net
Cc: fuse-devel@lists.sourceforge.net
Cc: cluster-devel@redhat.com
Cc: linux-mtd@lists.infradead.org
Cc: jfs-discussion@lists.sourceforge.net
Cc: linux-nfs@vger.kernel.org
Cc: linux-nilfs@vger.kernel.org
Cc: linux-ntfs-dev@lists.sourceforge.net
Cc: ocfs2-devel@oss.oracle.com
Cc: reiserfs-devel@vger.kernel.org
|
|
Fix up error messages printed when the transaction pointers in a
journal head are inconsistent. This improves the error messages which
are printed when running xfstests generic/068 in data=journal mode.
See the bug report at: https://bugzilla.kernel.org/show_bug.cgi?id=60786
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
When doing filesystem wide sync, there's no need to force transaction
commit (or synchronously write inode buffer) separately for each inode
because ext4_sync_fs() takes care of forcing commit at the end (VFS
takes care of flushing buffer cache, respectively). Most of the time
this slowness doesn't manifest because previous WB_SYNC_NONE writeback
doesn't leave much to write but when there are processes aggressively
creating new files and several filesystems to sync, the sync slowness
can be noticeable. In the following test script sync(1) takes around 6
minutes when there are two ext4 filesystems mounted on a standard SATA
drive. After this patch sync takes a couple of seconds so we have about
two orders of magnitude improvement.
function run_writers
{
for (( i = 0; i < 10; i++ )); do
mkdir $1/dir$i
for (( j = 0; j < 40000; j++ )); do
dd if=/dev/zero of=$1/dir$i/$j bs=4k count=4 &>/dev/null
done &
done
}
for dir in "$@"; do
run_writers $dir
done
sleep 40
time sync
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
This patch implements fallocate's FALLOC_FL_COLLAPSE_RANGE for Ext4.
The semantics of this flag are following:
1) It collapses the range lying between offset and length by removing any data
blocks which are present in this range and than updates all the logical
offsets of extents beyond "offset + len" to nullify the hole created by
removing blocks. In short, it does not leave a hole.
2) It should be used exclusively. No other fallocate flag in combination.
3) Offset and length supplied to fallocate should be fs block size aligned
in case of xfs and ext4.
4) Collaspe range does not work beyond i_size.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
Tested-by: Dongsu Park <dongsu.park@profitbricks.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Allow for merging uninitialized extents.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
While handling punch-hole fallocate, it's useless to truncate page cache
before removing the range from extent tree (or block map in indirect case)
because page cache can be re-populated (by read-ahead or read(2) or mmap-ed
read) immediately after truncating page cache, but before updating extent
tree (or block map). In that case the user will see stale data even after
fallocate is completed.
Until the problem of data corruption resulting from pages backed by
already freed blocks is fully resolved, the simple thing we can do now
is to add another truncation of pagecache after punch hole is done.
Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
|
|
Adjust the conversion specifications in a few optionally compiled debug
messages to match the return type of ext4_es_status(). Also, make a
couple of minor grammatical message edits while we're at it.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
When looking at a bug report with:
> kernel: EXT4-fs: 0 scanned, 0 found
I thought wow, 0 scanned, that's odd? But it's not odd; it's printing
a variable that is initialized to 0 and never touched again.
It's never been used since the original merge, so I don't really even
know what the original intent was, either.
If anyone knows how to hook it up, speak now via patch, otherwise just
yank it so it's not making a confusing situation more confusing in
kernel logs.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
The ext4_map_blocks() function returns the number of blocks which
satisfying the caller's request. This number of blocks requested by
the caller is specified by an unsigned integer, but the return value
of ext4_map_blocks() is a signed integer (to accomodate error codes
per the kernel's standard error signalling convention).
Historically, overflows could never happen since mballoc() will refuse
to allocate more than 2048 blocks at a time (which is something we
should fix), and if the blocks were already allocated, the fact that
there would be some number of intervening metadata blocks pretty much
guaranteed that there could never be a contiguous region of data
blocks that was greater than 2**31 blocks.
However, this is now possible if there is a file system which is a bit
bigger than 8TB, and is created using the new mke2fs hugeblock
feature, which can create a perfectly contiguous file. In that case,
if a userspace program attempted to call fallocate() on this already
fully allocated file, it's possible that ext4_map_blocks() could
return a number large enough that it would overflow a signed integer,
resulting in a ext4 thinking that the ext4_map_blocks() call had
failed with some strange error code.
Since ext4_map_blocks() is always free to return a smaller number of
blocks than what was requested by the caller, fix this by capping the
number of blocks that ext4_map_blocks() will ever try to map to 2**31
- 1. In practice this should never get hit, except by someone
deliberately trying to provke the above-described bug.
Thanks to the PaX team for asking whethre this could possibly happen
in some off-line discussions about using some static code checking
technology they are developing to find bugs in kernel code.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
The lowest levels of mballoc set all of the fields of struct
ext4_free_extent except for fe_logical, since they are just trying to
find the requested free set of blocks, and the logical block hasn't
been set yet. This makes some static code checkers sad. Set it to
various different debug values, which would be useful when
debugging mballoc if these values were to ever show up due to the
parts of mballoc triyng to use ac->ac_b_ex.fe_logical before it is
properly upper layers of mballoc failing to properly set, usually by
ext4_mb_use_best_found().
Addresses-Coverity-Id: #139697
Addresses-Coverity-Id: #139698
Addresses-Coverity-Id: #139699
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
The function ext4_expand_extra_isize_ea() doesn't need the size of all
of the extended attribute headers. So if we don't calculate it when
it is unneeded, it we can skip some undeeded memory references, and as
a bonus, we eliminate some kvetching by static code analysis tools.
Addresses-Coverity-Id: #741291
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Avoid false positives by static code analysis tools such as sparse and
coverity caused by the fact that we set the physical block, and then
the status in the extent_status structure. It is also more efficient
to set both of these values at once.
Addresses-Coverity-Id: #989077
Addresses-Coverity-Id: #989078
Addresses-Coverity-Id: #1080722
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
|
|
Commit 3779473246 breaks the return of error codes from
ext4_ext_handle_uninitialized_extents() in ext4_ext_map_blocks(). A
portion of the patch assigns that function's signed integer return
value to an unsigned int. Consequently, negatively valued error codes
are lost and can be treated as a bogus allocated block count.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
|
|
When !defined(CONFIG_EXT4_DEBUG), mb_debug() should be defined as a
no_printk() statement instead of an empty statement in order to suppress
the following compiler warning:
fs/ext4/mballoc.c: In function ‘ext4_mb_cleanup_pa’:
fs/ext4/mballoc.c:2659:47: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
mb_debug(1, "mballoc: %u PAs left\n", count);
Signed-off-by: Patrick Palka <patrick@parcs.ath.cx>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
"err" is zero here, there is no need to check again.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Tighten up the code to make the code easier to read and maintain.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
init_inodecache is only called by __init init_ext4_fs.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
If the i_crtime field is not present in the inode, don't leave the
field uninitialized.
Fixes: ef7f38359 ("ext4: Add nanosecond timestamps")
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Tested-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
|
|
The set_flexbg_block_bitmap() function assumed that the number of
blocks in a blockgroup was sb->blocksize * 8, which is normally true,
but not always! Use EXT4_BLOCKS_PER_GROUP(sb) instead, to fix block
bitmap corruption after:
mke2fs -t ext4 -g 3072 -i 4096 /dev/vdd 1G
mount -t ext4 /dev/vdd /vdd
resize2fs /dev/vdd 8G
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Jon Bernard <jbernard@tuxion.com>
Cc: stable@vger.kernel.org
|
|
If a file system has a large number of inodes per block group, all of
the metadata blocks in a flex_bg may be larger than what can fit in a
single block group. Unfortunately, ext4_alloc_group_tables() in
resize.c was never tested to see if it would handle this case
correctly, and there were a large number of bugs which caused the
following sequence to result in a BUG_ON:
kernel bug at fs/ext4/resize.c:409!
...
call trace:
[<ffffffff81256768>] ext4_flex_group_add+0x1448/0x1830
[<ffffffff81257de2>] ext4_resize_fs+0x7b2/0xe80
[<ffffffff8123ac50>] ext4_ioctl+0xbf0/0xf00
[<ffffffff811c111d>] do_vfs_ioctl+0x2dd/0x4b0
[<ffffffff811b9df2>] ? final_putname+0x22/0x50
[<ffffffff811c1371>] sys_ioctl+0x81/0xa0
[<ffffffff81676aa9>] system_call_fastpath+0x16/0x1b
code: c8 4c 89 df e8 41 96 f8 ff 44 89 e8 49 01 c4 44 29 6d d4 0
rip [<ffffffff81254fa1>] set_flexbg_block_bitmap+0x171/0x180
This can be reproduced with the following command sequence:
mke2fs -t ext4 -i 4096 /dev/vdd 1G
mount -t ext4 /dev/vdd /vdd
resize2fs /dev/vdd 8G
To fix this, we need to make sure the right thing happens when a block
group's inode table straddles two block groups, which means the
following bugs had to be fixed:
1) Not clearing the BLOCK_UNINIT flag in the second block group in
ext4_alloc_group_tables --- the was proximate cause of the BUG_ON.
2) Incorrectly determining how many block groups contained contiguous
free blocks in ext4_alloc_group_tables().
3) Incorrectly setting the start of the next block range to be marked
in use after a discontinuity in setup_new_flex_group_blocks().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
|
|
If an ext4 file system is created by some tool other than mke2fs
(perhaps by someone who has a pathalogical fear of the GPL) that
doesn't set one or the other of the EXT2_FLAGS_{UN}SIGNED_HASH flags,
and that file system is then mounted read-only, don't try to modify
the s_flags field. Otherwise, if dm_verity is in use, the superblock
will change, causing an dm_verity failure.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
|
|
In swap_inode_boot_loader() we forgot to release ->i_mutex and resume
unlocked dio for inode and inode_bl if there is an error starting the
journal handle. This commit fixes this issue.
Reported-by: Ahmed Tamrawi <ahmedtamrawi@gmail.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Dr. Tilmann Bubeck <t.bubeck@reinform.de>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org # v3.10+
|
|
Commit a115f749c1 (ext4: remove wait for unwritten extent conversion from
ext4_truncate) exposed a bug in ext4_ext_handle_uninitialized_extents().
It can be triggered by xfstest generic/299 when run on a test file
system created without a journal. This test continuously fallocates and
truncates files to which random dio/aio writes are simultaneously
performed by a separate process. The test completes successfully, but
if the test filesystem is mounted with the block_validity option, a
warning message stating that a logical block has been mapped to an
illegal physical block is posted in the kernel log.
The bug occurs when an extent is being converted to the written state
by ext4_end_io_dio() and ext4_ext_handle_uninitialized_extents()
discovers a mapping for an existing uninitialized extent. Although it
sets EXT4_MAP_MAPPED in map->m_flags, it fails to set map->m_pblk to
the discovered physical block number. Because map->m_pblk is not
otherwise initialized or set by this function or its callers, its
uninitialized value is returned to ext4_map_blocks(), where it is
stored as a bogus mapping in the extent status tree.
Since map->m_pblk can accidentally contain illegal values that are
larger than the physical size of the file system, calls to
check_block_validity() in ext4_map_blocks() that are enabled if the
block_validity mount option is used can fail, resulting in the logged
warning message.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org # 3.11+
|
|
It actually goes back to 2004 ([PATCH] Concurrent O_SYNC write support)
when sync_page_range() had been introduced; generic_file_write{,v}() correctly
synced
pos_after_write - written .. pos_after_write - 1
but generic_file_aio_write() synced
pos_before_write .. pos_before_write + written - 1
instead. Which is not the same thing with O_APPEND, obviously.
A couple of years later correct variant had been killed off when
everything switched to use of generic_file_aio_write().
All users of generic_file_aio_write() are affected, and the same bug
has been copied into other instances of ->aio_write().
The fix is trivial; the only subtle point is that generic_write_sync()
ought to be inlined to avoid calculations useless for the majority of
calls.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Pull core block IO changes from Jens Axboe:
"The major piece in here is the immutable bio_ve series from Kent, the
rest is fairly minor. It was supposed to go in last round, but
various issues pushed it to this release instead. The pull request
contains:
- Various smaller blk-mq fixes from different folks. Nothing major
here, just minor fixes and cleanups.
- Fix for a memory leak in the error path in the block ioctl code
from Christian Engelmayer.
- Header export fix from CaiZhiyong.
- Finally the immutable biovec changes from Kent Overstreet. This
enables some nice future work on making arbitrarily sized bios
possible, and splitting more efficient. Related fixes to immutable
bio_vecs:
- dm-cache immutable fixup from Mike Snitzer.
- btrfs immutable fixup from Muthu Kumar.
- bio-integrity fix from Nic Bellinger, which is also going to stable"
* 'for-3.14/core' of git://git.kernel.dk/linux-block: (44 commits)
xtensa: fixup simdisk driver to work with immutable bio_vecs
block/blk-mq-cpu.c: use hotcpu_notifier()
blk-mq: for_each_* macro correctness
block: Fix memory leak in rw_copy_check_uvector() handling
bio-integrity: Fix bio_integrity_verify segment start bug
block: remove unrelated header files and export symbol
blk-mq: uses page->list incorrectly
blk-mq: use __smp_call_function_single directly
btrfs: fix missing increment of bi_remaining
Revert "block: Warn and free bio if bi_end_io is not set"
block: Warn and free bio if bi_end_io is not set
blk-mq: fix initializing request's start time
block: blk-mq: don't export blk_mq_free_queue()
block: blk-mq: make blk_sync_queue support mq
block: blk-mq: support draining mq queue
dm cache: increment bi_remaining when bi_end_io is restored
block: fixup for generic bio chaining
block: Really silence spurious compiler warnings
block: Silence spurious compiler warnings
block: Kill bio_pair_split()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 update from Ted Ts'o:
"Bug fixes and cleanups for ext4. We also enable the punch hole
functionality for bigalloc file systems"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: delete "set but not used" variables
ext4: don't pass freed handle to ext4_walk_page_buffers
ext4: avoid clearing beyond i_blocks when truncating an inline data file
ext4: ext4_inode_is_fast_symlink should use EXT4_CLUSTER_SIZE
ext4: fix a typo in extents.c
ext4: use %pd printk specificer
ext4: standardize error handling in ext4_da_write_inline_data_begin()
ext4: retry allocation when inline->extent conversion failed
ext4: enable punch hole for bigalloc
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
"Assorted stuff; the biggest pile here is Christoph's ACL series. Plus
assorted cleanups and fixes all over the place...
There will be another pile later this week"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (43 commits)
__dentry_path() fixes
vfs: Remove second variable named error in __dentry_path
vfs: Is mounted should be testing mnt_ns for NULL or error.
Fix race when checking i_size on direct i/o read
hfsplus: remove can_set_xattr
nfsd: use get_acl and ->set_acl
fs: remove generic_acl
nfs: use generic posix ACL infrastructure for v3 Posix ACLs
gfs2: use generic posix ACL infrastructure
jfs: use generic posix ACL infrastructure
xfs: use generic posix ACL infrastructure
reiserfs: use generic posix ACL infrastructure
ocfs2: use generic posix ACL infrastructure
jffs2: use generic posix ACL infrastructure
hfsplus: use generic posix ACL infrastructure
f2fs: use generic posix ACL infrastructure
ext2/3/4: use generic posix ACL infrastructure
btrfs: use generic posix ACL infrastructure
fs: make posix_acl_create more useful
fs: make posix_acl_chmod more useful
...
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Rename the current posix_acl_created to __posix_acl_create and add
a fully featured helper to set up the ACLs on file creation that
uses get_acl().
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Rename the current posix_acl_chmod to __posix_acl_chmod and add
a fully featured ACL chmod helper that uses the ->set_acl inode
operation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Use rbtree_postorder_for_each_entry_safe() to destroy the rbtree instead
of opencoding an alternate postorder iteration that modifies the tree
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Michel Lespinasse <walken@google.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull xfs update from Ben Myers:
"This is primarily bug fixes, many of which you already have. New
stuff includes a series to decouple the in-memory and on-disk log
format, helpers in the area of inode clusters, and i_version handling.
We decided to try to use more topic branches this release, so there
are some merge commits in there on account of that. I'm afraid I
didn't do a good job of putting meaningful comments in the first
couple of merges. Sorry about that. I think I have the hang of it
now.
For 3.14-rc1 there are fixes in the areas of remote attributes,
discard, growfs, memory leaks in recovery, directory v2, quotas, the
MAINTAINERS file, allocation alignment, extent list locking, and in
xfs_bmapi_allocate. There are cleanups in xfs_setsize_buftarg,
removing unused macros, quotas, setattr, and freeing of inode
clusters. The in-memory and on-disk log format have been decoupled, a
common helper to calculate the number of blocks in an inode cluster
has been added, and handling of i_version has been pulled into the
filesystems that use it.
- cleanup in xfs_setsize_buftarg
- removal of remaining unused flags for vop toss/flush/flushinval
- fix for memory corruption in xfs_attrlist_by_handle
- fix for out-of-date comment in xfs_trans_dqlockedjoin
- fix for discard if range length is less than one block
- fix for overrun of agfl buffer using growfs on v4 superblock
filesystems
- pull i_version handling out into the filesystems that use it
- don't leak recovery items on error
- fix for memory leak in xfs_dir2_node_removename
- several cleanups for quotas
- fix bad assertion in xfs_qm_vop_create_dqattach
- cleanup for xfs_setattr_mode, and add xfs_setattr_time
- fix quota assert in xfs_setattr_nonsize
- fix an infinite loop when turning off group/project quota before
user quota
- fix for temporary buffer allocation failure in xfs_dir2_block_to_sf
with large directory block sizes
- fix Dave's email address in MAINTAINERS
- cleanup calculation of freed inode cluster blocks
- fix alignment of initial file allocations to match filesystem
geometry
- decouple in-memory and on-disk log format
- introduce a common helper to calculate the number of filesystem
blocks in an inode cluster
- fixes for extent list locking
- fix for off-by-one in xfs_attr3_rmt_verify
- fix for missing destroy_work_on_stack in xfs_bmapi_allocate"
* tag 'xfs-for-linus-v3.14-rc1' of git://oss.sgi.com/xfs/xfs: (51 commits)
xfs: Calling destroy_work_on_stack() to pair with INIT_WORK_ONSTACK()
xfs: fix off-by-one error in xfs_attr3_rmt_verify
xfs: assert that we hold the ilock for extent map access
xfs: use xfs_ilock_attr_map_shared in xfs_attr_list_int
xfs: use xfs_ilock_attr_map_shared in xfs_attr_get
xfs: use xfs_ilock_data_map_shared in xfs_qm_dqiterate
xfs: use xfs_ilock_data_map_shared in xfs_qm_dqtobp
xfs: take the ilock around xfs_bmapi_read in xfs_zero_remaining_bytes
xfs: reinstate the ilock in xfs_readdir
xfs: add xfs_ilock_attr_map_shared
xfs: rename xfs_ilock_map_shared
xfs: remove xfs_iunlock_map_shared
xfs: no need to lock the inode in xfs_find_handle
xfs: use xfs_icluster_size_fsb in xfs_imap
xfs: use xfs_icluster_size_fsb in xfs_ifree_cluster
xfs: use xfs_icluster_size_fsb in xfs_ialloc_inode_init
xfs: use xfs_icluster_size_fsb in xfs_bulkstat
xfs: introduce a common helper xfs_icluster_size_fsb
xfs: get rid of XFS_IALLOC_BLOCKS macros
xfs: get rid of XFS_INODE_CLUSTER_SIZE macros
...
|
|
Signed-off-by: Jon Ernst <jonernst07@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
|
|
This is harmless, since ext4_walk_page_buffers only passes the handle
onto the callback function, and in this call site the function in
question, bput_one(), doesn't actually use the handle. But there's no
point passing in an invalid handle, and it creates a Coverity warning,
so let's just clean it up.
Addresses-Coverity-Id: #1091168
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
A missing cast means that when we are truncating a file which is less
than 60 bytes, we don't clear the correct area of memory, and in fact
we can end up truncating the next inode in the inode table, or worse
yet, some other kernel data structure.
Addresses-Coverity-Id: #751987
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
|
|
Can be reproduced by xfstests 62 with bigalloc and 128bit size inode.
Signed-off-by: Yongqiang Yang <yangyongqiang01@baidu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
|
|
Signed-off-by: Yongqiang Yang <yangyongqiang01@baidu.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
|
|
Use the new %pd printk() specifier in Ext4 to replace passing of
dentry name or dentry name and name length * 2 with just passing the
dentry.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
cc: Andreas Dilger <adilger.kernel@dilger.ca>
cc: linux-ext4@vger.kernel.org
|
|
The function has a bit non-standard (for ext4) error recovery in that it
used a mix of 'out' labels and testing for 'handle' being NULL. There
isn't a good reason for that in the function so clean it up a bit.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Similarly as other ->write_begin functions in ext4, also
ext4_da_write_inline_data_begin() should retry allocation if the
conversion failed because of ENOSPC. This avoids returning ENOSPC
prematurely because of uncommitted block deletions.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
After applied this commit (d23142c6), ext4 has supported punch hole for
a file system with bigalloc feature. But we forgot to enable it. This
commit fixes it.
Cc: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Commit f5a44db5d2 introduced a regression on filesystems created with
the bigalloc feature (cluster size > blocksize). It causes xfstests
generic/006 and /013 to fail with an unexpected JBD2 failure and
transaction abort that leaves the test file system in a read only state.
Other xfstests run on bigalloc file systems are likely to fail as well.
The cause is the accidental use of a cluster mask where a cluster
offset was needed in ext4_ext_map_blocks().
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
|
|
Needed to bring blk-mq uptodate, since changes have been going in
since for-3.14/core was established.
Fixup merge issues related to the immutable biovec changes.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Conflicts:
block/blk-flush.c
fs/btrfs/check-integrity.c
fs/btrfs/extent_io.c
fs/btrfs/scrub.c
fs/logfs/dev_bdev.c
|