Age | Commit message (Collapse) | Author |
|
Signed-off-by: Jan Mrazek <email@honzamrazek.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
We can be more aggressive about this, if we are clever and careful. This is subtle.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
there will be one more change of ->kill() calling conventions; this
isn't final.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
There are only 3 callers and quite a bit of that thing is executed
exactly in one of those. Just lift it there...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
no users left
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
What we want is to have non-counting references to children in
pagecache of parent directory, and avoid picking them after a child
has been freed. Fine, so let's just have ->d_prune() clear
parent's inode "has directory contents in page cache" flag.
That way we don't need ->d_fsdata for storing offsets, so we can
use it as a quick and dirty "is it referenced from page cache"
flag.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
"A couple of fixes - deadlock in CIFS and build breakage in cris serial
driver (resurfaced f_dentry in there)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
VFS: Convert file->f_dentry->d_inode to file_inode()
fix deadlock in cifs_ioctl_clone()
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
... and don't bother with dput(dentry) in the former and with
dget(dentry) preceding all its calls.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
... and trim the arguments list
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
dev is always zero, dir was only used to get its ->i_sb, which is
equal to ->d_sb of dentry...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Ensure that we deal correctly with the case where the server sends us a
newer instance of the same delegation. If the stateids match, but the
sequence numbers differ, then treat the new delegation as if it were
an atomic upgrade.
Signed-off-by: Trond Myklebust <Trond.Myklebust@primarydata.com>
|
|
Replace the current code with something that is a little closer to what
net/sunrpc/auth_unix.c uses.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
Optimise the layout return on close code by ensuring that
1) Add a check for whether we hold a layout before taking any spinlocks
2) Only take the spin lock once
3) Use nfs_state->state to speed up open file checks
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
Note, however, that we still serialise on the open stateid if the lock
stateid is unconfirmed. Hopefully that will not prove too much of a
burden for first time locks; it should leave the ability to parallelise
OPENs unchanged, since they no longer call the serialisation primitives.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
Ensure that we test the lock stateid remained unchanged while we were
updating the VFS tracking of the byte range lock. Have the process
replay the lock to the server if we detect that was not the case.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
This patch ensures that the server cannot reorder our LOCK/LOCKU
requests if they are sent in parallel on the wire.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
The original text in RFC3530 was terribly confusing since it conflated
lockowners and lock stateids. RFC3530bis clarifies that you must use
open_to_lock_owner when there is no lock state for that file+lockowner
combination.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
When we update the lock stateid, we really do need to ensure that this is
done under the state->state_lock, and that we are indeed only updating
confirmed locks with a newer version of the same stateid.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
Remove the serialisation of OPEN/OPEN_DOWNGRADE and CLOSE calls for the
case of NFSv4.1 and newer.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
When we relax the sequencing on the NFSv4.1 OPEN/CLOSE code, we will want
to use the value NULL to indicate that no sequencing is needed.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
If an OPEN RPC call races with a CLOSE or OPEN_DOWNGRADE so that it
updates the nfs_state structure before the CLOSE/OPEN_DOWNGRADE has
a chance to do so, then we know that the state->flags need to be
recalculated from scratch.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"We have a few fixes in my for-linus branch.
Qu Wenruo's batch fix a regression between some our merge window pull
and the inode_cache feature. The rest are smaller bugs"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
btrfs: Fix the bug that fs_info->pending_changes is never cleared.
btrfs: fix state->private cast on 32 bit machines
Btrfs: fix race deleting block group from space_info->ro_bgs list
Btrfs: fix incorrect freeing in scrub_stripe
btrfs: sync ioctl, handle errors after transaction start
|
|
If we are to remove the serialisation of OPEN/CLOSE, then we need to
ensure that the stateid sent as part of a CLOSE operation does not
change after we test the state in nfs4_close_prepare.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
The BKL is completely out of the picture in the lockd and sunrpc code
these days. Update the antiquated comments that refer to it.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Someone with a weird time_t happened to notice this, it shouldn't really
manifest till 2038. It may not be our ownly year-2038 problem.
Reported-by: Aaron Pace <Aaron.Pace@alcatel-lucent.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
In order to ensure that filenames are not released before the audit
subsystem is done with the strings there are a number of hacks built
into the fs and audit subsystems around getname() and putname(). To
say these hacks are "ugly" would be kind.
This patch removes the filename hackery in favor of a more
conventional reference count based approach. The diffstat below tells
most of the story; lots of audit/fs specific code is replaced with a
traditional reference count based approach that is easily understood,
even by those not familiar with the audit and/or fs subsystems.
CC: viro@zeniv.linux.org.uk
CC: linux-fsdevel@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Enable recording of filenames in getname_kernel() and remove the
kludgy workaround in __audit_inode() now that we have proper filename
logging for kernel users.
CC: viro@zeniv.linux.org.uk
CC: linux-fsdevel@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
a) make it accept ERR_PTR() as filename (and return its PTR_ERR() in that case)
b) make it putname() the sucker in the end otherwise
simplifies life for callers...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
There are several areas in the kernel that create temporary filename
objects using the following pattern:
int func(const char *name)
{
struct filename *file = { .name = name };
...
return 0;
}
... which for the most part works okay, but it causes havoc within the
audit subsystem as the filename object does not persist beyond the
lifetime of the function. This patch converts all of these temporary
filename objects into proper filename objects using getname_kernel()
and putname() which ensure that the filename object persists until the
audit subsystem is finished with it.
Also, a special thanks to Al Viro, Guenter Roeck, and Sabrina Dubroca
for helping resolve a difficult kernel panic on boot related to a
use-after-free problem in kern_path_create(); the thread can be seen
at the link below:
* https://lkml.org/lkml/2015/1/20/710
This patch includes code that was either based on, or directly written
by Al in the above thread.
CC: viro@zeniv.linux.org.uk
CC: linux@roeck-us.net
CC: sd@queasysnail.net
CC: linux-fsdevel@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
In preparation for expanded use in the kernel, make getname_kernel()
more useful by allowing it to handle any legal filename length.
Thanks to Guenter Roeck for his suggestion to substitute memcpy() for
strlcpy().
CC: linux@roeck-us.net
CC: viro@zeniv.linux.org.uk
CC: linux-fsdevel@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
... and don't bother with new struct filename when we already have one
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Running a heavy fs workload, I ran into a situation where we pass
down a page for writeback/swap that doesn't have an inode mapping:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
IP: [<ffffffff8119589f>] inode_to_bdi+0xf/0x50
PGD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: wl(O) tun cfg80211 btusb joydev hid_apple hid_generic usbhid hid bcm5974 usb_storage nouveau snd_hda_codec_hdmi snd_hda_codec_cirrus snd_hda_codec_generic x86_pkg_temp_thermal snd_hda_intel kvm_intel snd_hda_controller snd_hda_codec kvm snd_hwdep snd_pcm applesmc input_polldev snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_timer snd_seq_device snd xhci_pci xhci_hcd ttm thunderbolt soundcore apple_gmux apple_bl bluetooth binfmt_misc fuse nls_iso8859_1 nls_cp437 vfat fat [last unloaded: wl]
CPU: 4 PID: 50 Comm: kswapd0 Tainted: G U O 3.19.0-rc5+ #60
Hardware name: Apple Inc. MacBookPro11,3/Mac-2BD1B31983FE1663, BIOS MBP112.88Z.0138.B02.1310181745 10/18/2013
task: ffff880462e917f0 ti: ffff880462edc000 task.ti: ffff880462edc000
RIP: 0010:[<ffffffff8119589f>] [<ffffffff8119589f>] inode_to_bdi+0xf/0x50
RSP: 0000:ffff880462edf8e8 EFLAGS: 00010282
RAX: ffffffff81c4cd80 RBX: ffffea0001b3abc0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff880462edf8f8 R08: 00000000001e8500 R09: ffff880460f7cb68
R10: ffff880462edfa00 R11: 0000000000000101 R12: 0000000000000000
R13: ffffffff81c4cd98 R14: 0000000000000000 R15: ffff880460f7c9c0
FS: 0000000000000000(0000) GS:ffff88047f300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000028 CR3: 00000002b6341000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
ffffea0001b3abc0 ffffffff81c4cd80 ffff880462edf948 ffffffff811244aa
ffffffff811565b0 ffff880460f7c9c0 ffff880462edf948 ffffea0001b3abc0
0000000000000001 ffff880462edfb40 ffff880008b999c0 ffff880460f7c9c0
Call Trace:
[<ffffffff811244aa>] __test_set_page_writeback+0x3a/0x170
[<ffffffff811565b0>] ? SyS_madvise+0x790/0x790
[<ffffffff81156bb6>] __swap_writepage+0x216/0x280
[<ffffffff8133d592>] ? radix_tree_insert+0x32/0xe0
[<ffffffff81157741>] ? swap_info_get+0x61/0xf0
[<ffffffff81159bfc>] ? page_swapcount+0x4c/0x60
[<ffffffff81156c4d>] swap_writepage+0x2d/0x50
[<ffffffff81131658>] shmem_writepage+0x198/0x2c0
[<ffffffff8112cae4>] shrink_page_list+0x464/0xa00
[<ffffffff8112d666>] shrink_inactive_list+0x266/0x500
[<ffffffff8112e215>] shrink_lruvec+0x5d5/0x720
[<ffffffff8112e3bb>] shrink_zone+0x5b/0x190
[<ffffffff8112ee3f>] kswapd+0x48f/0x8d0
[<ffffffff8112e9b0>] ? try_to_free_pages+0x4c0/0x4c0
[<ffffffff81067be2>] kthread+0xd2/0xf0
[<ffffffff81060000>] ? workqueue_congested+0x30/0x80
[<ffffffff81067b10>] ? kthread_create_on_node+0x180/0x180
[<ffffffff816b556c>] ret_from_fork+0x7c/0xb0
[<ffffffff81067b10>] ? kthread_create_on_node+0x180/0x180
Code: 00 48 c7 c7 8d 8d a4 81 e8 3f 62 eb ff e9 fc fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 49 89 fc 53 <48> 8b 5f 28 48 89 df e8 15 f8 00 00 85 c0 75 11 48 8b 83 d8 00
RIP [<ffffffff8119589f>] inode_to_bdi+0xf/0x50
RSP <ffff880462edf8e8>
CR2: 0000000000000028
---[ end trace eb0e21aa7dad3ddf ]---
Handle this in inode_to_bdi() by punting it to noop_backing_dev_info,
if mapping->host is NULL.
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
btrfs_alloc_tree_block() returns an extent buffer on which a blocked lock has
been taken. Hence assign the appropriate value to path->locks[level].
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
There isn't any real use of following members of struct btrfs_root
so delete them.
struct kobject root_kobj;
struct completion kobj_unregister;
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
In function qgroup_excl_accounting(), we need to WARN when
qg->excl is less than what we want to free, same to child
and parents. But currently, for parent qgroup, the WARN_ON()
is located after freeing qg->excl. It will WARN out even we
free it normally.
This patch move this WARN_ON() before freeing qg->excl.
Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Reviewed-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
"run_most" is not used anymore.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
|