Age | Commit message (Collapse) | Author |
|
The callback program is allowed to depend on the session which the
callback is going over.
No change in behavior yet, while we still only do callbacks over a
single session for the lifetime of the client.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
We need to keep track of which connections are available for use with
the backchannel, which for the forechannel, and which for both.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
|
Following rfc 5661, section 18.36.4: "If the session is not successfully
created, then no changes are made to any client records on the server."
We shouldn't be confirming or incrementing the sequence id in this case.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Currently we don't deal well with a client that has multiple sessions
associated with it (even simultaneously, or serially over the lifetime
of the client).
In particular, we don't attempt to keep the backchannel running after
the original session diseappears.
We will fix that soon.
Once we do that, we need the slot sequence number to be per-session;
otherwise, for example, we cannot correctly handle a case like this:
- All session 1 connections are lost.
- The client creates session 2. We use it for the backchannel
(since it's the only working choice).
- The client gives us a new connection to use with session 1.
- The client destroys session 2.
At this point our only choice is to go back to using session 1. When we
do so we must use the sequence number that is next for session 1. We
therefore need to maintain multiple sequence number streams.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
|
Instead of copying the sessionid, use the new cl_cb_session pointer,
which indicates which session we're using for the backchannel.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
|
The backchannel should be associated with a session, it isn't really
global to the client.
We do, however, want a pointer global to the client which tracks which
session we're currently using for client-based callbacks.
This is a first step in that direction; for now, just reshuffling of
code with no significant change in behavior.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
|
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
With all the patches we have queued in the BKL removal tree, only a
few dozen modules are left that actually rely on the BKL, and even
there are lots of low-hanging fruit. We need to decide what to do
about them, this patch illustrates one of the options:
Every user of the BKL is marked as 'depends on BKL' in Kconfig,
and the CONFIG_BKL becomes a user-visible option. If it gets
disabled, no BKL using module can be built any more and the BKL
code itself is compiled out.
The one exception is file locking, which is practically always
enabled and does a 'select BKL' instead. This effectively forces
CONFIG_BKL to be enabled until we have solved the fs/lockd
mess and can apply the patch that removes the BKL from fs/locks.c.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
|
|
cifs_tcp_ses_lock is a rwlock with protects the cifs_tcp_ses_list,
server->smb_ses_list and the ses->tcon_list. It also protects a few
ref counters in server, ses and tcon. In most cases the critical section
doesn't seem to be large, in a few cases where it is slightly large, there
seem to be really no benefit from concurrent access. I briefly considered RCU
mechanism but it appears to me that there is no real need.
Replace it with a spinlock and get rid of the last rwlock in the cifs code.
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
|
|
The arguments were transposed, we want to assign the error code to
'ret', which is being returned.
Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
In 'ubifs_replay_journal()' we allocate 'sbuf' for scanning the log.
However, we already have 'c->sbuf' for these purposes, so do not
allocate yet another one. This reduces UBIFS memory consumption while
recovering.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
This is a bug-fix: when we unmount, and we are currently in R/O
mode because of an error - we do not sync write-buffers, which
means we also do not cancel write-buffer timers we may possibly
have armed. This patch fixes the issue.
The issue can easily be reproduced by enabling UBIFS failure debug
mode (echo 4 > /sys/module/ubifs/parameters/debug_tsts) and
unmounting as soon as a failure happen. At some point the system
oopses because we have an armed hrtimer but UBIFS is unmounted
already.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
This is a clean-up patch which:
1. Removes explicite 'hrtimer_cancel()' after 'ubifs_wbuf_sync()' in
'ubifs_remount_ro()', because the timers will be canceled by
'ubifs_wbuf_sync()', no need to cancel them for the second time.
2. Remove "if (c->jheads)" check from 'ubifs_put_super()', because
at journal heads must always be allocated there, since we checked
earlier that we were mounted R/W, and the olny situation when
journal heads are not allocated is when mounter or re-mounted R/O.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
|
|
We were taking dcache_lock inside of i_lock, which introduces a dependency
not found elsewhere in the kernel, complicationg the vfs locking
scalability work. Since we don't actually need it here anyway, remove
it.
We only need i_lock to test for the I_COMPLETE flag, so be careful to do
so without dcache_lock held.
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
Convert a sequence of kmalloc and memcpy to use kmemdup.
The semantic patch that performs this transformation is:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression a,flag,len;
expression arg,e1,e2;
statement S;
@@
a =
- \(kmalloc\|kzalloc\)(len,flag)
+ kmemdup(arg,len,flag)
<... when != a
if (a == NULL || ...) S
...>
- memcpy(a,arg,len+1);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
Include "super.h" outside of CONFIG_DEBUG_FS to eliminate a compiler warning:
fs/ceph/debugfs.c:266: warning: 'struct ceph_fs_client' declared inside parameter list
fs/ceph/debugfs.c:266: warning: its scope is only this definition or declaration, which is probably not what you want
fs/ceph/debugfs.c:271: warning: 'struct ceph_fs_client' declared inside parameter list
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
|
|
Switch from using the BKL explicitly to the new lock_flocks() interface.
Eventually this will turn into a spinlock.
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
When the lock_kernel() turns into lock_flocks() and a spinlock, we won't
be able to do allocations with the lock held. Preallocate space without
the lock, and retry if the lock state changes out from underneath us.
Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
This is simpler and faster.
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
The i_rdcache_gen value only implies we MAY have cached pages; actually
check the mapping to see if it's worth bothering with an invalidate.
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
Snaps in the root directory are now supported by the MDS, and harmless on
older versions.
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
This factors out protocol and low-level storage parts of ceph into a
separate libceph module living in net/ceph and include/linux/ceph. This
is mostly a matter of moving files around. However, a few key pieces
of the interface change as well:
- ceph_client becomes ceph_fs_client and ceph_client, where the latter
captures the mon and osd clients, and the fs_client gets the mds client
and file system specific pieces.
- Mount option parsing and debugfs setup is correspondingly broken into
two pieces.
- The mon client gets a generic handler callback for otherwise unknown
messages (mds map, in this case).
- The basic supported/required feature bits can be expanded (and are by
ceph_fs_client).
No functional change, aside from some subtle error handling cases that got
cleaned up in the refactoring process.
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
This will be used for rbd snapshots administration.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
|
|
Allow the messenger to send/receive data in a bio. This is added
so that we wouldn't need to copy the data into pages or some other buffer
when doing IO for an rbd block device.
We can now have trailing variable sized data for osd
ops. Also osd ops encoding is more modular.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
The osd requests creation are being decoupled from the
vino parameter, allowing clients using the osd to use
other arbitrary object names that are not necessarily
vino based. Also, calc_raw_layout now takes a snap id.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
Implement a pool lookup by name. This will be used by rbd.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
|
|
If the server sends us an NFS4ERR_STALE_CLIENTID while the state management
thread is busy reclaiming state, we do want to treat all state that wasn't
reclaimed before the STALE_CLIENTID as if a network partition occurred (see
the edge conditions described in RFC3530 and RFC5661).
What we do not want to do is to send an nfs4_reclaim_complete(), since we
haven't yet even started reclaiming state after the server rebooted.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
|
|
In the case of a server reboot, the state recovery thread starts by calling
nfs4_state_end_reclaim_reboot() in order to avoid edge conditions when
the server reboots while the client is in the middle of recovery.
However, if the client has already marked the nfs4_state as requiring
reboot recovery, then the above behaviour will cause the recovery thread to
treat the open as if it was part of such an edge condition: the open will
be recovered as if it was part of a lease expiration (and all the locks
will be lost).
Fix is to remove the call to nfs4_state_mark_reclaim_reboot from
nfs4_async_handle_error(), and nfs4_handle_exception(). Instead we leave it
to the recovery thread to do this for us.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
|
|
NFSv4 open recovery is currently broken: since we do not clear the
state->flags states before attempting recovery, we end up with the
'can_open_cached()' function triggering. This again leads to no OPEN call
being put on the wire.
Reported-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
|
|
In the case where we lock the page, and then find out that the page has
been thrown out of the page cache, we should just return VM_FAULT_NOPAGE.
This is what block_page_mkwrite() does in these situations.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
|
|
|
|
cancel_delayed_work_sync()
flush_scheduled_work() is going away.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
|
|
- Eliminate double declaration of variable blob_len
- Modify function build_ntlmssp_auth_blob to return error code
as well as length of the blob.
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Reviewed-by: Jeff Layton <jlayton@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
|
|
RAW_SETBIND and RAW_GETBIND 32bit versions are fscked in interesting ways.
1) fs/compat_ioctl.c has COMPATIBLE_IOCTL(RAW_SETBIND) followed by
HANDLE_IOCTL(RAW_SETBIND, raw_ioctl). The latter is ignored.
2) on amd64 (and itanic) the damn thing is broken - we have int + u64 + u64
and layouts on i386 and amd64 are _not_ the same. raw_ioctl() would
work there, but it's never called due to (1). As it is, i386 /sbin/raw
definitely doesn't work on amd64 boxen.
3) switching to raw_ioctl() as is would *not* work on e.g. sparc64 and ppc64,
which would be rather sad, seeing that normal userland there is 32bit.
The thing is, slapping __packed on the struct in question does not DTRT -
it eliminates *all* padding. The real solution is to use compat_u64.
4) of course, all that stuff has no business being outside of raw.c in the
first place - there should be ->compat_ioctl() for /dev/rawctl instead of
messing with compat_ioctl.c.
[akpm@linux-foundation.org: coding-style fixes]
[arnd@arndb.de: port to 2.6.36]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Conflicts:
block/blk-core.c
drivers/block/loop.c
mm/swapfile.c
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
/proc/diskstats would display a strange output as follows.
$ cat /proc/diskstats |grep sda
8 0 sda 90524 7579 102154 20464 0 0 0 0 0 14096 20089
8 1 sda1 19085 1352 21841 4209 0 0 0 0 4294967064 15689 4293424691
~~~~~~~~~~
8 2 sda2 71252 3624 74891 15950 0 0 0 0 232 23995 1562390
8 3 sda3 54 487 2188 92 0 0 0 0 0 88 92
8 4 sda4 4 0 8 0 0 0 0 0 0 0 0
8 5 sda5 81 2027 2130 138 0 0 0 0 0 87 137
Its reason is the wrong way of accounting hd_struct->in_flight. When a bio is
merged into a request belongs to different partition by ELEVATOR_FRONT_MERGE.
The detailed root cause is as follows.
Assuming that there are two partition, sda1 and sda2.
1. A request for sda2 is in request_queue. Hence sda1's hd_struct->in_flight
is 0 and sda2's one is 1.
| hd_struct->in_flight
---------------------------
sda1 | 0
sda2 | 1
---------------------------
2. A bio belongs to sda1 is issued and is merged into the request mentioned on
step1 by ELEVATOR_BACK_MERGE. The first sector of the request is changed
from sda2 region to sda1 region. However the two partition's
hd_struct->in_flight are not changed.
| hd_struct->in_flight
---------------------------
sda1 | 0
sda2 | 1
---------------------------
3. The request is finished and blk_account_io_done() is called. In this case,
sda2's hd_struct->in_flight, not a sda1's one, is decremented.
| hd_struct->in_flight
---------------------------
sda1 | -1
sda2 | 1
---------------------------
The patch fixes the problem by caching the partition lookup
inside the request structure, hence making sure that the increment
and decrement will always happen on the same partition struct. This
also speeds up IO with accounting enabled, since it cuts down on
the number of lookups we have to do.
When reloading partition tables, quiesce IO to ensure that no
request references to the partition struct exists. When it is safe
to free the partition table, the IO for that device is restarted
again.
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Get rid of init_MUTEX[_LOCKED]() and use sema_init() instead.
(Ported to current XFS code by <aelder@sgi.com>.)
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
This patch adds support for 32bit project quota identifiers.
On disk format is backward compatible with 16bit projid numbers. projid
on disk is now kept in two 16bit values - di_projid_lo (which holds the
same position as old 16bit projid value) and new di_projid_hi (takes
existing padding) and converts from/to 32bit value on the fly.
xfs_admin (for existing fs), mkfs.xfs (for new fs) needs to be used
to enable PROJID32BIT support.
Signed-off-by: Arkadiusz Miśkiewicz <arekm@maven.pl>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
Stop having two different names for many buffer functions and use
the more descriptive xfs_buf_* names directly.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
We're not actually passing around credentials inside XFS for a while
now, so remove all xfs_cred.h with it's cred_t typedef and all
instances of it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
This header only provides one extern that isn't actually declared
anywhere, and shadowed by a macro.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
It used to have a place when it contained an automatically generated
CVS version, but these days it's entirely superflous.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
This header has been completely unused for a couple of years.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
Use the correct prototype for xfs_trans_committed instead of casting it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
These days inode64 should only control which AGs we allocate new
inodes from, while we still try to support reading all existing
inodes. To make this actually work the check ontop of xfs_iget
needs to be relaxed to allow inodes in all allocation groups instead
of just those that we allow allocating inodes from. Note that we
can't simply remove the check - it prevents us from accessing
invalid data when fed invalid inode numbers from NFS or bulkstat.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
Update the per-cpu counters manually in xfs_trans_unreserve_and_mod_sb
and remove support for per-cpu counters from xfs_mod_incore_sb_batch
to simplify it. And added benefit is that we don't have to take
m_sb_lock for transactions that only modify per-cpu counters.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
|
|
Export xfs_icsb_modify_counters and always use it for modifying
the per-cpu counters. Remove support for per-cpu counters from
xfs_mod_incore_sb to simplify it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
|