Age | Commit message (Collapse) | Author |
|
btrfs_submit_bio_hook currently calls btrfs_bio_wq_end_io in either case
of an if statement that determines one of the arguments.
This patch moves the function call outside of the if statement and uses it
to only determine the different argument. This allows us to catch an
error in one place in a more visually obvious way.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
|
|
btrfs_update_root BUG's when it can't alloc a path, yet it can recover
from a search error. This patch returns -ENOMEM instead.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
|
|
find_and_setup_root BUGs when it encounters an error from
btrfs_find_last_root, which can occur if a path can't be allocated.
This patch pushes it up to its callers where it is already handled.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
|
|
There is only one caller of clear_extent_bit that checks the return value
and it only checks if it's negative. Since there are no users of the
returned bits functionality of clear_extent_bit, stop returning it
and avoid complicating error handling.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
|
|
__find_space_info can return NULL but we don't check it before calling
dump_space_info().
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
|
|
return void
The only error condition in clean_tree_block is an accounting bug.
Returning without modifying dirty_metadata_bytes and as if the cleaning
as been performed may cause problems later so it should panic instead.
It should probably be a BUG_ON but we have btrfs_panic now.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
|
|
btrfs_insert_root is just a wrapper for btrfs_insert_item. Just return
the error directly.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
|
|
Correctness fix: The kfree calls in the add_delayed_* functions free
the node that's passed into it, but the node is a member of another
structure. It works because it's always the first member of the
containing structure, but it should really be using the containing
structure itself.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
|
|
The ordered data and relocation trees have BUG_ONs to protect against
bad tree operations.
This patch replaces them with a panic that will report the problem.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
|
|
The *_state functions can only return 0 or -EEXIST. This patch addresses
the cases where those functions returning -EEXIST represent a locking
failure. It handles them by panicking with an appropriate error message.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
|
|
As part of the effort to eliminate BUG_ON as an error handling
technique, we need to determine which errors are actual logic errors,
which are on-disk corruption, and which are normal runtime errors
e.g. -ENOMEM.
Annotating these error cases is helpful to understand and report them.
This patch adds a btrfs_panic() routine that will either panic
or BUG depending on the new -ofatal_errors={panic,bug} mount option.
Since there are still so many BUG_ONs, it defaults to BUG for now but I
expect that to change once the error handling effort has made
significant progress.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile 1 from Al Viro:
"This is _not_ all; in particular, Miklos' and Jan's stuff is not there
yet."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (64 commits)
ext4: initialization of ext4_li_mtx needs to be done earlier
debugfs-related mode_t whack-a-mole
hfsplus: add an ioctl to bless files
hfsplus: change finder_info to u32
hfsplus: initialise userflags
qnx4: new helper - try_extent()
qnx4: get rid of qnx4_bread/qnx4_getblk
take removal of PF_FORKNOEXEC to flush_old_exec()
trim includes in inode.c
um: uml_dup_mmap() relies on ->mmap_sem being held, but activate_mm() doesn't hold it
um: embed ->stub_pages[] into mmu_context
gadgetfs: list_for_each_safe() misuse
ocfs2: fix leaks on failure exits in module_init
ecryptfs: make register_filesystem() the last potential failure exit
ntfs: forgets to unregister sysctls on register_filesystem() failure
logfs: missing cleanup on register_filesystem() failure
jfs: mising cleanup on register_filesystem() failure
make configfs_pin_fs() return root dentry on success
configfs: configfs_create_dir() has parent dentry in dentry->d_parent
configfs: sanitize configfs_create()
...
|
|
Pull kmap_atomic cleanup from Cong Wang.
It's been in -next for a long time, and it gets rid of the (no longer
used) second argument to k[un]map_atomic().
Fix up a few trivial conflicts in various drivers, and do an "evil
merge" to catch some new uses that have come in since Cong's tree.
* 'kmap_atomic' of git://github.com/congwang/linux: (59 commits)
feature-removal-schedule.txt: schedule the deprecated form of kmap_atomic() for removal
highmem: kill all __kmap_atomic() [swarren@nvidia.com: highmem: Fix ARM build break due to __kmap_atomic rename]
drbd: remove the second argument of k[un]map_atomic()
zcache: remove the second argument of k[un]map_atomic()
gma500: remove the second argument of k[un]map_atomic()
dm: remove the second argument of k[un]map_atomic()
tomoyo: remove the second argument of k[un]map_atomic()
sunrpc: remove the second argument of k[un]map_atomic()
rds: remove the second argument of k[un]map_atomic()
net: remove the second argument of k[un]map_atomic()
mm: remove the second argument of k[un]map_atomic()
lib: remove the second argument of k[un]map_atomic()
power: remove the second argument of k[un]map_atomic()
kdb: remove the second argument of k[un]map_atomic()
udf: remove the second argument of k[un]map_atomic()
ubifs: remove the second argument of k[un]map_atomic()
squashfs: remove the second argument of k[un]map_atomic()
reiserfs: remove the second argument of k[un]map_atomic()
ocfs2: remove the second argument of k[un]map_atomic()
ntfs: remove the second argument of k[un]map_atomic()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree from Jiri Kosina:
"It's indeed trivial -- mostly documentation updates and a bunch of
typo fixes from Masanari.
There are also several linux/version.h include removals from Jesper."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (101 commits)
kcore: fix spelling in read_kcore() comment
constify struct pci_dev * in obvious cases
Revert "char: Fix typo in viotape.c"
init: fix wording error in mm_init comment
usb: gadget: Kconfig: fix typo for 'different'
Revert "power, max8998: Include linux/module.h just once in drivers/power/max8998_charger.c"
writeback: fix fn name in writeback_inodes_sb_nr_if_idle() comment header
writeback: fix typo in the writeback_control comment
Documentation: Fix multiple typo in Documentation
tpm_tis: fix tis_lock with respect to RCU
Revert "media: Fix typo in mixer_drv.c and hdmi_drv.c"
Doc: Update numastat.txt
qla4xxx: Add missing spaces to error messages
compiler.h: Fix typo
security: struct security_operations kerneldoc fix
Documentation: broken URL in libata.tmpl
Documentation: broken URL in filesystems.tmpl
mtd: simplify return logic in do_map_probe()
mm: fix comment typo of truncate_inode_pages_range
power: bq27x00: Fix typos in comment
...
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Cong Wang <amwang@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs updates from Chris Mason:
"I have two additional and btrfs fixes in my for-linus branch. One is
a casting error that leads to memory corruption on i386 during scrub,
and the other fixes a corner case in the backref walking code (also
triggered by scrub)."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix casting error in scrub reada code
btrfs: fix locking issues in find_parent_nodes()
|
|
The reada code from scrub was casting down a u64 to
an unsigned long so it could insert it into a radix tree.
What it really wanted to do was cast down the result of a shift, instead
of casting down the u64. The bug resulted in trying to insert our
reada struct into the wrong place, which caused soft lockups and other
problems.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
|
- We might unlock head->mutex while it was not locked
- We might leave the function without unlocking delayed_refs->lock
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Quoth Chris:
"This is later than I wanted because I got backed up running through
btrfs bugs from the Oracle QA teams. But they are all bug fixes that
we've queued and tested since rc1.
Nothing in particular stands out, this just reflects bug fixing and QA
done in parallel by all the btrfs developers. The most user visible
of these is:
Btrfs: clear the extent uptodate bits during parent transid failures
Because that helps deal with out of date drives (say an iscsi disk
that has gone away and come back). The old code wasn't always
properly retrying the other mirror for this type of failure."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (24 commits)
Btrfs: fix compiler warnings on 32 bit systems
Btrfs: increase the global block reserve estimates
Btrfs: clear the extent uptodate bits during parent transid failures
Btrfs: add extra sanity checks on the path names in btrfs_mksubvol
Btrfs: make sure we update latest_bdev
Btrfs: improve error handling for btrfs_insert_dir_item callers
Btrfs: be less strict on finding next node in clear_extent_bit
Btrfs: fix a bug on overcommit stuff
Btrfs: kick out redundant stuff in convert_extent_bit
Btrfs: skip states when they does not contain bits to clear
Btrfs: check return value of lookup_extent_mapping() correctly
Btrfs: fix deadlock on page lock when doing auto-defragment
Btrfs: fix return value check of extent_io_ops
btrfs: honor umask when creating subvol root
btrfs: silence warning in raid array setup
btrfs: fix structs where bitfields and spinlock/atomic share 8B word
btrfs: delalloc for page dirtied out-of-band in fixup worker
Btrfs: fix memory leak in load_free_space_cache()
btrfs: don't check DUP chunks twice
Btrfs: fix trim 0 bytes after a device delete
...
|
|
The enospc tracing code added some interesting uses of
u64 pointer casts.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
|
When doing IO with large amounts of data fragmentation, the global block
reserve calulations are too low. This increases them to avoid
ENOSPC crashes.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
|
If btrfs reads a block and finds a parent transid mismatch, it clears
the uptodate flags on the extent buffer, and the pages inside it. But
we only clear the uptodate bits in the state tree if the block straddles
more than one page.
This is from an old optimization from to reduce contention on the extent
state tree. But it is buggy because the code that retries a read from
a different copy of the block is going to find the uptodate state bits
set and skip the IO.
The end result of the bug is that we'll never actually read the good
copy (if there is one).
The fix here is to always clear the uptodate state bits, which is safe
because this code is only called when the parent transid fails.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
|
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
|
When we are setting up the mount, we close all the
devices that were not actually part of the metadata we found.
But, we don't make sure that one of those devices wasn't
fs_devices->latest_bdev, which means we can do a use after free
on the one we closed.
This updates latest_bdev as it goes.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
|
This allows us to gracefully continue if we aren't able to insert
directory items, both for normal files/dirs and snapshots.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
|
In clear_extent_bit, it is enough that next node is adjacent in tree level.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
|
|
When overcommitting, we should check the sum of pinned space and
bytes for delayed item.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
|
|
clear_state_bit will do merge_state for us, so kick out the redundant one.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
|
|
Clearing a range's bits is different with setting them, since we don't
need to touch them when states do not contain bits we want.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
|
|
This patch corrects error checking of lookup_extent_mapping().
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
|
|
When I ran xfstests circularly on a auto-defragment btrfs, the deadlock
happened.
Steps to reproduce:
[tty0]
# export MOUNT_OPTIONS="-o autodefrag"
# export TEST_DEV=<partition1>
# export TEST_DIR=<mountpoint1>
# export SCRATCH_DEV=<partition2>
# export SCRATCH_MNT=<mountpoint2>
# while [ 1 ]
> do
> ./check 091 127 263
> sleep 1
> done
[tty1]
# while [ 1 ]
> do
> echo 3 > /proc/sys/vm/drop_caches
> done
Several hours later, the test processes will hang on, and the deadlock will
happen on page lock.
The reason is that:
Auto defrag task Flush thread Test task
btrfs_writepages()
add ordered extent
(including page 1, 2)
set page 1 writeback
set page 2 writeback
endio_fn()
end page 2 writeback
release page 2
lock page 1
alloc and lock page 2
page 2 is not uptodate
btrfs_readpage()
start ordered extent()
btrfs_writepages()
try to lock page 1
so deadlock happens.
Fix this bug by unlocking the page which is in writeback, and re-locking it
after the writeback end.
Signed-off-by: Miao Xie <miax@cn.fujitsu.com>
|
|
This patch adds the check on the return value of extent_io_ops.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
|
|
Set the subvol root inode permissions based on the current umask.
|
|
Raid array setup code creates an extent buffer in an usual way. When the
PAGE_CACHE_SIZE is > super block size, the extent pages are not marked
up-to-date, which triggers a WARN_ON in the following
write_extent_buffer call. Add an explicit up-to-date call to silence the
warning.
Signed-off-by: David Sterba <dsterba@suse.cz>
|
|
On ia64, powerpc64 and sparc64 the bitfield is modified through a RMW cycle and current
gcc rewrites the adjacent 4B word, which in case of a spinlock or atomic has
disaterous effect.
https://lkml.org/lkml/2012/2/1/220
Signed-off-by: David Sterba <dsterba@suse.cz>
|
|
We encountered an issue that was easily observable on s/390 systems but
could really happen anywhere. The timing just seemed to hit reliably
on s/390 with limited memory.
The gist is that when an unexpected set_page_dirty() happened, we'd
run into the BUG() in btrfs_writepage_fixup_worker since it wasn't
properly set up for delalloc.
This patch does the following:
- Performs the missing delalloc in the fixup worker
- Allow the start hook to return -EBUSY which informs __extent_writepage
that it should mark the page skipped and not to redirty it. This is
required since the fixup worker can fail with -ENOSPC and the page
will have already been redirtied. That causes an Oops in
drop_outstanding_extents later. Retrying the fixup worker could
lead to an infinite loop. Deferring the page redirty also saves us
some cycles since the page would be stuck in a resubmit-redirty loop
until the fixup worker completes. It's not harmful, just wasteful.
- If the fixup worker fails, we mark the page and mapping as errored,
and end the writeback, similar to what we would do had the page
actually been submitted to writeback.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
|
|
load_free_space_cache() has forgotten to free path.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
|
|
Because scrub enumerates the dev extent tree to find the chunks to scrub,
it currently finds each DUP chunk twice and also scrubs it twice. This
patch makes sure that scrub_chunk only checks that part of the chunk the
dev extent has been found for. This only changes the behaviour for DUP
chunks.
Reported-and-tested-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Arne Jansen <sensille@gmx.net>
|
|
A user reported a bug of btrfs's trim, that is we will trim 0 bytes
after a device delete.
The reproducer:
$ mkfs.btrfs disk1
$ mkfs.btrfs disk2
$ mount disk1 /mnt
$ fstrim -v /mnt
$ btrfs device add disk2 /mnt
$ btrfs device del disk1 /mnt
$ fstrim -v /mnt
This is because after we delete the device, the block group may start from
a non-zero place, which will confuse trim to discard nothing.
Reported-by: Lutz Euler <lutz.euler@freenet.de>
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
|
|
failed for SEEK_DATA/SEEK_HOLE inquiry
Given that ENXIO only means "offset beyond EOF" for either SEEK_DATA or SEEK_HOLE inquiry
in a desired file range, so we should return the internal error unchanged if btrfs_get_extent_fiemap()
call failed, rather than ENXIO.
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
|
|
inode_ref_info() returns 1 when the element wasn't found and < 0 on error,
just like btrfs_search_slot(). In iref_to_path() it's an error when the
inode ref can't be found, thus we return ERR_PTR(ret) in that case. In order
to avoid ERR_PTR(1), we now set ret to -ENOENT in that case.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
|
|
Gracefully fail when trying to mount a BTRFS file system that has a
sectorsize smaller than PAGE_SIZE.
On PPC it is possible to build a FS while using a 4k PAGE_SIZE kernel
then boot into a 64K PAGE_SIZE kernel. Presently open_ctree fails in an
endless loop and hangs the machine in this situation.
My debugging has show this Sector size < Page size to be a non trivial
situation and a graceful exit from the situation would be nice for the
time being.
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
|
|
Once should be enough.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
Correct spelling "cace" to "cache" in
fs/btrfs/free-space-cache.c
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
btrfs_fallocate tries to allocate space only if ranges in the file don't
already exist. But the enospc checks it does are not allowed with
extents locked.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix reservations in btrfs_page_mkwrite
Btrfs: advance window_start if we're using a bitmap
btrfs: mask out gfp flags in releasepage
Btrfs: fix enospc error caused by wrong checks of the chunk
Btrfs: do not defrag a file partially
Btrfs: fix warning for 32-bit build of fs/btrfs/check-integrity.c
Btrfs: use cluster->window_start when allocating from a cluster bitmap
Btrfs: Check for NULL page in extent_range_uptodate
btrfs: Fix busyloops in transaction waiting code
Btrfs: make sure a bitmap has enough bytes
Btrfs: fix uninit warning in backref.c
|
|
Josef fixed btrfs_page_mkwrite to properly release reserved
extents if there was an error. But if we fail to get a reservation
and we fail to dirty the inode (for ENOSPC reasons), we'll end up
trying to release a reservation we never had.
This makes sure we only release if we were able to reserve.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
|
If we span a long area in a bitmap we could end up taking a lot of time
searching to the next free area if we're searching from the original
window_start, so advance window_start in order to make sure we don't do any
superficial searching. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
|
btree_releasepage is a callback and can be passed unknown gfp flags and then
they may end up in kmem_cache_alloc called from alloc_extent_state, slab
allocator will BUG_ON when there is HIGHMEM or DMA32 flag set.
This may happen when btrfs is mounted from a loop device, which masks out
__GFP_IO flag. The check in try_release_extent_state
3399 if ((mask & GFP_NOFS) == GFP_NOFS)
3400 mask = GFP_NOFS;
will not work and passes unfiltered flags further resulting in crash at
mm/slab.c:2963
[<000000000024ae4c>] cache_alloc_refill+0x3b4/0x5c8
[<000000000024c810>] kmem_cache_alloc+0x204/0x294
[<00000000001fd3c2>] mempool_alloc+0x52/0x170
[<000003c000ced0b0>] alloc_extent_state+0x40/0xd4 [btrfs]
[<000003c000cee5ae>] __clear_extent_bit+0x38a/0x4cc [btrfs]
[<000003c000cee78c>] try_release_extent_state+0x9c/0xd4 [btrfs]
[<000003c000cc4c66>] btree_releasepage+0x7e/0xd0 [btrfs]
[<0000000000210d84>] shrink_page_list+0x6a0/0x724
[<0000000000211394>] shrink_inactive_list+0x230/0x578
[<0000000000211bb8>] shrink_list+0x6c/0x120
[<0000000000211e4e>] shrink_zone+0x1e2/0x228
[<0000000000211f24>] shrink_zones+0x90/0x254
[<0000000000213410>] do_try_to_free_pages+0xac/0x420
[<0000000000213ae0>] try_to_free_pages+0x13c/0x1b0
[<0000000000204e6c>] __alloc_pages_nodemask+0x5b4/0x9a8
[<00000000001fb04a>] grab_cache_page_write_begin+0x7e/0xe8
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|