Age | Commit message (Collapse) | Author |
|
Re-arrange the budget dump and make sure we first dump all
the 'struct ubifs_budg_info' fields, and then the other information.
Additionally, print the 'uncommitted_idx' variable.
This change is required for to the following dumping function
enhancement where it will be possible to dump saved
'struct ubifs_budg_info' objects, not only the current one.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
The current 'dbg_dump_budg()' calling convention is that the
'c->space_lock' spinlock is held. However, none of the callers
actually use it from contects which have 'c->space_lock' locked,
so all callers have to explicitely lock and unlock the spinlock.
This is not very sensible convention. This patch changes it and
makes 'dbg_dump_budg()' lock the spinlock instead of imposing this
to the callers. This simplifies the code a little.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
This patch separates out all the budgeting-related information
from 'struct ubifs_info' to 'struct ubifs_budg_info'. This way the
code looks a bit cleaner. However, the main driver for this is
that we want to save budgeting information and print it later,
so a separate data structure for this is helpful.
This patch is a preparation for the further debugging output
improvements.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
There was an attempt to standartize various "__attribute__" and
other macros in order to have potentially portable and more
consistent code, see commit 82ddcb040570411fc2d421d96b3e69711c670328.
Note, that commit refers Rober Love's blog post, but the URL
is broken, the valid URL is:
http://blog.rlove.org/2005/10/with-little-help-from-your-compiler.html
Moreover, nowadays checkpatch.pl warns about using
__attribute__((packed)):
"WARNING: __packed is preferred over __attribute__((packed))"
It is not a big deal for UBIFS to use __packed, so let's do it.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
Fix several minor stylistic issues:
* lines longer than 80 characters
* space before closing parenthesis ')'
* spaces in the indentations
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
Turn the debufs files UBIFS maintains into non-seekable. Indeed, none
of them is supposed to be seek'ed.
Do this by making the '.lseek()' handler to be 'no_llseek()' and by
using 'nonseekable_open()' in the '.open()' operation.
This does mean an API break but this debugging API is only used by a couple
of test scripts which do not rely in the 'llseek()' operation.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
In a multi device setup, the chunk allocator currently always allocates
chunks on the devices in the same order. This leads to a very uneven
distribution, especially with RAID1 or RAID10 and an uneven number of
devices.
This patch always sorts the devices before allocating, and allocates the
stripes on the devices with the most available space, as long as there
is enough space available. In a low space situation, it first tries to
maximize striping.
The patch also simplifies the allocator and reduces the checks for
corner cases.
The simplification is done by several means. First, it defines the
properties of each RAID type upfront. These properties are used afterwards
instead of differentiating cases in several places.
Second, the old allocator defined a minimum stripe size for each block
group type, tried to find a large enough chunk, and if this fails just
allocates a smaller one. This is now done in one step. The largest possible
chunk (up to max_chunk_size) is searched and allocated.
Because we now have only one pass, the allocation of the map (struct
map_lookup) is moved down to the point where the number of stripes is
already known. This way we avoid reallocation of the map.
We still avoid allocating stripes that are not a multiple of STRIPE_SIZE.
|
|
currently alloc_start is disregarded if the requested
chunk size is bigger than (device size - alloc_start),
but smaller than the device size.
The only situation where I see this could have made sense
was when a chunk equal the size of the device has been
requested. This was possible as the allocator failed to
take alloc_start into account when calculating the request
chunk size. As this gets fixed by this patch, the workaround
is not necessary anymore.
|
|
this function won't be used here anymore, so move it super.c where it is
used for df-calculation
|
|
Now that there are no longer any exceptions to the normal inode
creation code path, we can move the parts of the locking code
which were duplicated in mkdir/mknod/create/symlink into the
inode create function.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
|
This moves the symlink specific parts of inode creation
into the function where we initialise the rest of the
dinode. As a result we have one less place where we need
to look up the inode's buffer.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
|
This moves the initialisation of the directory into the inode
creation functions to avoid having to duplicate the lookup
of the inode's buffer.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
|
Signed-off-by: David Sterba <dsterba@suse.cz>
|
|
As per printk_ratelimit comment, it should not be used.
Signed-off-by: David Sterba <dsterba@suse.cz>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: fix oops in revalidate when called with NULL nameidata
|
|
setting the readonly flag prevents writes in case an error is detected
Signed-off-by: Arne Jansen <sensille@gmx.net>
|
|
btrfs scrub - make fixups sync, don't reuse fixup bios
Fixups are already sync for csum failures, this patch makes them sync
for EIO case as well.
Fixups are now sharing pages with the parent sbio - instead of
allocating a separate page to do a fixup we grab the page from the sbio
buffer.
Fixup bios are no longer reused.
struct fixup is no longer needed, instead pass [sbio pointer, index].
Originally this was added to look at the possibility of sharing the code
between drive swap and scrub, but it actually fixes a serious bug in
scrub code where errors that could be corrected were ignored and
reported as uncorrectable.
btrfs scrub - restore bios properly after media errors
The current code reallocates a bio after a media error. This is a
temporary measure introduced in v3 after a serious problem related to
bio reuse was found in v2 of scrub patchset.
Basically we did not reset bv_offset and bv_len fields of the bio_vec
structure. They are changed in case I/O error happens, for example, at
offset 512 or 1024 into the page. Also bi_flags field wasn't properly
setup before reusing the bio.
Signed-off-by: Arne Jansen <sensille@gmx.net>
|
|
adds ioctls necessary to start and cancel scrubs, to get current
progress and to get info about devices to be scrubbed.
Note that the scrub is done per-device and that the ioctl only
returns after the scrub for this devices is finished or has been
canceled.
Signed-off-by: Arne Jansen <sensille@gmx.net>
|
|
This adds an initial implementation for scrub. It works quite
straightforward. The usermode issues an ioctl for each device in the
fs. For each device, it enumerates the allocated device chunks. For
each chunk, the contained extents are enumerated and the data checksums
fetched. The extents are read sequentially and the checksums verified.
If an error occurs (checksum or EIO), a good copy is searched for. If
one is found, the bad copy will be rewritten.
All enumerations happen from the commit roots. During a transaction
commit, the scrubs get paused and afterwards continue from the new
roots.
This commit is based on the series originally posted to linux-btrfs
with some improvements that resulted from comments from David Sterba,
Ilya Dryomov and Jan Schmidt.
Signed-off-by: Arne Jansen <sensille@gmx.net>
|
|
Currently, writebacks may end up recursing back into the filesystem due to
GFP_KERNEL direct reclaims in the pnfs subsystem.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: do not use i_wrbuffer_ref as refcount for Fb cap
ceph: fix list_add in ceph_put_snap_realm
ceph: print debug message before put mds session
|
|
Prevents an infinite loop as list was never emptied.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Free the slot and resend the RPC with new session <slot#,seq#>.
For nfs4_async_handle_error, return -EAGAIN and set the task->tk_status to 0
to restart the async rpc in the rpc_restart_call_prepare state which resets
the slot.
For nfs4_handle_exception, retrying a call that uses nfs4_call_sync will
reset the slot via nfs41_call_sync_prepare.
For open/close/lock/locku/delegreturn/layoutcommit/unlink/rename/write
cachethis is true, so these operations will not trigger an
NFS4ERR_RETRY_UNCACHED_REP.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
We increments i_wrbuffer_ref when taking the Fb cap. This breaks
the dirty page accounting and causes looping in
__ceph_do_pending_vmtruncate, and ceph client hangs.
This bug can be reproduced occasionally by running blogbench.
Add a new field i_wb_ref to inode and dedicate it to Fb reference
counting.
Signed-off-by: Henry C Chang <henry.cy.chang@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
Signed-off-by: Henry C Chang <henry.cy.chang@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
The mds session, s, could be freed during ceph_put_mds_session.
Move dout before ceph_put_mds_session.
Signed-off-by: Henry C Chang <henry.cy.chang@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
Implementing file descriptors for the network namespace
is simple and straight forward.
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
Create files under /proc/<pid>/ns/ to allow controlling the
namespaces of a process.
This addresses three specific problems that can make namespaces hard to
work with.
- Namespaces require a dedicated process to pin them in memory.
- It is not possible to use a namespace unless you are the child
of the original creator.
- Namespaces don't have names that userspace can use to talk about
them.
The namespace files under /proc/<pid>/ns/ can be opened and the
file descriptor can be used to talk about a specific namespace, and
to keep the specified namespace alive.
A namespace can be kept alive by either holding the file descriptor
open or bind mounting the file someplace else. aka:
mount --bind /proc/self/ns/net /some/filesystem/path
mount --bind /proc/self/fd/<N> /some/filesystem/path
This allows namespaces to be named with userspace policy.
It requires additional support to make use of these filedescriptors
and that will be comming in the following patches.
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
Fix what is clearly a simple copy-and-paste error in commenting the
sysfs_update_group() routine.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: fix race condition in AIL push trigger
xfs: make AIL target updates and compares 32bit safe.
xfs: always push the AIL to the target
xfs: exit AIL push work correctly when AIL is empty
xfs: ensure reclaim cursor is reset correctly at end of AG
|
|
Some cases (e.g. ecryptfs) can call ->dentry_revalidate with NULL
nameidata.
https://bugzilla.kernel.org/show_bug.cgi?id=34732
Tyler Hicks pointed out that this bug was introduced by commit
e7c0a16786 "fuse: make fuse_dentry_revalidate() RCU aware"
Reported-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
The VFS superblock structure now has a UUID field, so we can use that
in preference to the UUID field in the GFS2 superblock now.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
|
This replaces nilfs_mdt_mark_buffer_dirty and nilfs_btnode_mark_dirty
macros with mark_buffer_dirty and gets rid of nilfs_mark_buffer_dirty,
an own mark buffer dirty function.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
|
|
In the current nilfs, page cache for btree nodes and meta data files
do not set a valid back pointer to the host inode in mapping->host.
This will change it so that every address space in nilfs uses
mapping->host to hold its host inode.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
|
|
This replaces all references of NILFS_I_NILFS(inode)->ns_bdev with
inode->i_sb->s_bdev and unfolds remaining uses of NILFS_I_NILFS inline
function.
Before 2.6.37, referring to a nilfs object from inodes needed a
conditional judgement, and NILFS_I_NILFS was helpful to simplify it.
But now we can simply do it by going through a super block instance
like inode->i_sb->s_fs_info.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
|
|
This uses list_first_entry macro instead of list_entry if it's used to
get the first entry.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
|
|
Applies empty_aops for address space operations of gc-inodes.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
|
|
This adds resize ioctl which makes online resize possible.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
|
|
When shrinking the filesystem, segments to be truncated must be test
if they are busy or not, and unneeded sufile block should be deleted.
This adds routines for the truncation.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
|
|
After resizing the filesystem, the secondary super block must be moved
to a new location. This adds a helper function for this.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
|
|
This adds a new ioctl command which limits range of segment to be
allocated. This is intended to gather data whithin a range of the
partition before shrinking the filesystem, or to control new log
location for some purpose.
If a range is specified by the ioctl, segment allocator of nilfs tries
to allocate new segments from the range unless no free segments are
available there.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
|
|
The super root block is newly-allocated each time it is written back
to disk, so unused portion of the block should be cleared.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
|
|
The size of super root structure depends on inode size, so
NILFS_SR_BYTES macro should be a function of the inode size. This
fixes the issue.
Even though a different size value will be written for a possible
future filesystem with extended inode, but fortunately this does not
break disk format compatibility.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
|
|
Previously, nilfs was cloning pages for mmapped region to freeze their
data and ensure consistency of checksum during writeback cycles. A
private page allocator was used for this page cloning. But, we no
longer need to do that since clear_page_dirty_for_io function sets up
pte so that vm_ops->page_mkwrite function is called right before the
mmapped pages are modified and nilfs_page_mkwrite function can safely
wait for the pages to be written back to disk.
So, this stops making a copy of mmapped pages during writeback, and
eliminates the private page allocation and deallocation functions from
nilfs.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
|
|
Merge list_del() + list_add_tail() to list_move_tail().
Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
|
|
After having applied commit 9954e7af14868b8b ("nilfs2: add free
entries count only if clear bit operation succeeded"), a free routine
of nilfs came to fall into an infinite loop, outputting the same
message endlessly:
nilfs_palloc_freev: entry number 29497 already freed
nilfs_palloc_freev: entry number 29497 already freed
nilfs_palloc_freev: entry number 29497 already freed
nilfs_palloc_freev: entry number 29497 already freed
nilfs_palloc_freev: entry number 29497 already freed ...
That patch broke the routine so that a loop counter is never updated
in an abnormal state. This fixes the regression.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
|
|
This is the final part of the ops_inode.c/inode.c reordering. We
are left with a single file called inode.c which now contains
all the inode operations, as expected.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
|
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|