path: root/fs
Age | Commit message | Author
2014-02-24 | GFS2: Move log buffer accounting to transaction | Steven Whitehouse
Now we have a master transaction into which other transactions are merged, the accounting can be done using this master transaction. We no longer require the superblock fields which were being used for this function. In addition, this allows for a clean up in calc_reserved() making it rather easier to understand. Also, by reducing the number of variables used to track the buffers being added and removed from the journal, a number of error checks are now no longer required. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-02-24 | GFS2: Move log buffer lists into transaction | Steven Whitehouse
Over time, we hope to be able to improve the concurrency available in the log code. This is one small step towards that, by moving the buffer lists from the super block, and into the transaction structure, so that each transaction builds its own buffer lists. At transaction commit time, the buffer lists are merged into the currently accumulating transaction. That transaction then is passed into the before and after commit functions at journal flush time. Thus there should be no change in overall behaviour yet. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-02-21 | GFS2: Reduce struct gfs2_trans in size | Steven Whitehouse
A couple of "int" fields were being used as boolean values so we can make them bitfields of one bit, and put them in what might otherwise be a hole in the structure with 64 bit alignment. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
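The size effect described above can be sketched in user-space C. This is only an illustration: the struct and field names below are hypothetical and are not the real `struct gfs2_trans` layout, and the exact number of bytes saved depends on the ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: two flags stored as full ints. */
struct trans_int_flags {
	uint64_t ip;            /* 8-byte member, forces 8-byte alignment */
	uint32_t blocks;
	int      touched;       /* only ever 0 or 1 */
	int      attached;      /* only ever 0 or 1 */
};

/* Same fields, with the flags narrowed to one-bit bitfields. */
struct trans_bit_flags {
	uint64_t ip;
	uint32_t blocks;
	unsigned int touched:1;  /* both bits share one unsigned int ...   */
	unsigned int attached:1; /* ... placed where padding would be      */
};

/* Returns how many bytes the bitfield variant saves (possibly 0). */
static size_t bytes_saved(void)
{
	assert(sizeof(struct trans_bit_flags) <= sizeof(struct trans_int_flags));
	return sizeof(struct trans_int_flags) - sizeof(struct trans_bit_flags);
}
```

On a typical LP64 ABI the first struct is padded out to a multiple of 8 bytes, while the second fits its two flag bits into the 4 bytes that follow `blocks`, so `bytes_saved()` is expected to be non-zero there.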
2014-02-17 | GFS2: add missing newline | David Teigland
Log message is missing newline. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-02-10 | GFS2: Mark functions as static in gfs2/rgrp.c | Rashika Kheria
Mark functions as static in gfs2/rgrp.c because they are not used outside this file. This eliminates the following warning in gfs2/rgrp.c: fs/gfs2/rgrp.c:1092:5: warning: no previous prototype for ‘gfs2_rgrp_bh_get’ [-Wmissing-prototypes] fs/gfs2/rgrp.c:1157:5: warning: no previous prototype for ‘update_rgrp_lvb’ [-Wmissing-prototypes] Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-02-07 | GFS2: Add meta readahead field in directory entries | Steven Whitehouse
The intent of this new field in the directory entry is to allow a subsequent lookup to know how many blocks, which are contiguous with the inode, contain metadata which relates to the inode. This will then allow the issuing of a single read to read these blocks, rather than reading the inode first, and then issuing a second read for the metadata. This only works under some fairly strict conditions, since we do not have back pointers from inodes to directory entries we must ensure that the blocks referenced in this way will always belong to the inode. This rules out being able to use this system for indirect blocks, as these can change as a result of truncate/rewrite. So the idea here is to restrict this to xattr blocks only for the time being. For most inodes, that means only a single block. Also, when using ACLs and/or SELinux or other LSMs, these will be added at inode creation time so that they will be contiguous with the inode on disk and also will almost always be needed when we read the inode in for permissions checks. Once an xattr block for an inode is allocated, it will never change until the inode is deallocated. This patch adds the new field, a further patch will add the readahead in due course. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-02-06 | GFS2: Lock i_mutex and use a local gfs2_holder for fallocate | Bob Peterson
This patch causes GFS2 to lock the i_mutex during fallocate. It also switches from using a dinode's inode glock to using a local holder like the other GFS2 i_operations. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-02-06 | GFS2: journal data writepages update | Steven Whitehouse
GFS2 has carried what is more or less a copy of the write_cache_pages() for some time. It seems that this copy has slipped behind the core code over time. This patch brings it back uptodate, and in addition adds the tracepoint which would otherwise be missing. We could go further, and eliminate some or all of the code duplication here. The issue is that if we do that, then the function we need to split out from the existing write_cache_pages(), which will look a lot like gfs2_jdata_write_pagevec(), would land up putting quite a lot of extra variables on the stack. I know that has been a problem in the past in the writeback code path, which is why I've hesitated to do it here. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-02-04 | GFS2: Allocate block for xattr at inode alloc time, if required | Steven Whitehouse
This is another step towards improving the allocation of xattr blocks at inode allocation time. Here we take advantage of Christoph's recent work on ACLs to allocate a block for the xattrs early if we know that we will be adding ACLs to the inode later on. The advantage of that is that it is much more likely that we'll get a contiguous run of two blocks where the first is the inode and the second is the xattr block. We still have to fall back to the original system in case we don't get the requested two contiguous blocks, or in case the ACLs are too large to fit into the block. Future patches will move more of the ACL setting code further up the gfs2_inode_create() function. Also, I'd like to be able to do the same thing with the xattrs from LSMs in due course, too. That way we should be able to slowly reduce the number of independent transactions, at least in the most common cases. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-02-03 | GFS2: Plug on AIL flush | Steven Whitehouse
When we do a flush of the AIL list, we are writing out what is likely to be a lot of small I/Os, which are possibly in an order which is not ideal performance-wise. Since this is done by calling filemap_fdatawrite for each individual inode's address space there is no overall plugging going on. In addition to that, we do not always wait for AIL i/o when we flush it, so that it is possible for things to get left behind on the queue. By adding explicit plugging here, we reduce the chances of this being an issue. A quick test using the AIL flush tracepoint shows a small, but measurable improvement. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2014-02-02 | hpfs: optimize quad buffer loading | Mikulas Patocka
HPFS needs to load 4 consecutive 512-byte sectors when accessing the directory nodes or bitmaps. We can't switch to 2048-byte block size because files are allocated in the units of 512-byte sectors. Previously, the driver would allocate a 2048-byte area using kmalloc, copy the data from four buffers to this area and eventually copy them back if they were modified. In the current implementation of the buffer cache, buffers are allocated in the pagecache. That means that 4 consecutive 512-byte buffers are stored in consecutive areas in the kernel address space. So, we don't need to allocate extra memory and copy the content of the buffers there. This patch optimizes the code to avoid copying the buffers. It checks if the four buffers are stored in contiguous memory - if they are not, it falls back to allocating a 2048-byte area and copying data there. Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
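The contiguity check with a copying fallback can be sketched in user-space C as follows. This is a simplified model, not the hpfs driver code: `struct quad_buf`, `quad_map()` and `quad_unmap()` are hypothetical names, and plain memory stands in for buffer heads.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE 512
#define QUAD_SIZE   (4 * SECTOR_SIZE)

/* Hypothetical handle for four consecutive 512-byte sectors. */
struct quad_buf {
	unsigned char *sec[4];  /* the four underlying sector buffers */
	unsigned char *data;    /* 2048-byte view handed to the caller */
	int copied;             /* 1 if data is a separately allocated area */
};

/* Map four sector buffers as one 2048-byte area. Returns 0, or -1 on OOM. */
static int quad_map(struct quad_buf *q, unsigned char *s[4])
{
	int i;

	assert(q && s);
	for (i = 0; i < 4; i++)
		q->sec[i] = s[i];

	/* Fast path: the buffers already sit back to back in memory. */
	if (s[1] == s[0] + SECTOR_SIZE &&
	    s[2] == s[1] + SECTOR_SIZE &&
	    s[3] == s[2] + SECTOR_SIZE) {
		q->data = s[0];
		q->copied = 0;
		return 0;
	}

	/* Fallback: allocate a 2048-byte area and copy the sectors in. */
	q->data = malloc(QUAD_SIZE);
	if (!q->data)
		return -1;
	for (i = 0; i < 4; i++)
		memcpy(q->data + i * SECTOR_SIZE, s[i], SECTOR_SIZE);
	q->copied = 1;
	return 0;
}

/* Release the mapping; copy data back only if it was copied and modified. */
static void quad_unmap(struct quad_buf *q, int dirty)
{
	int i;

	if (q->copied) {
		if (dirty)
			for (i = 0; i < 4; i++)
				memcpy(q->sec[i], q->data + i * SECTOR_SIZE,
				       SECTOR_SIZE);
		free(q->data);
	}
	q->data = NULL;
}
```

In the contiguous case neither the allocation nor any of the copies happen, which is the saving the commit message describes.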
2014-02-02 | hpfs: remember free space | Mikulas Patocka
Previously, hpfs scanned all bitmaps each time the user asked for free space using statfs. This patch changes it so that hpfs scans the bitmaps only once, remembers the free space and on next invocation of statfs it returns the value instantly. New versions of wine are hammering on the statfs syscall very heavily, making some games unplayable when they're stored on hpfs, with load times in minutes. This should be backported to the stable kernels because it fixes a user-visible problem (excessive level load times in wine). Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
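The "scan once, then remember" idea can be sketched in user-space C. All names here (`struct fs_info`, `FREE_UNKNOWN`, `get_free()`, `note_alloc()`) are hypothetical, and keeping the cached count in step on allocation is an assumption of this sketch rather than something stated in the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define FREE_UNKNOWN ((unsigned int)-1)  /* sentinel: not counted yet */

struct fs_info {
	const unsigned char *bitmap;  /* one bit per sector, set = free */
	size_t nbits;
	unsigned int n_free;          /* cached count, or FREE_UNKNOWN */
	unsigned int scans;           /* number of full scans (demo only) */
};

/* The expensive operation: walk the whole bitmap. */
static unsigned int scan_bitmap(struct fs_info *fs)
{
	unsigned int n = 0;
	size_t i;

	fs->scans++;
	for (i = 0; i < fs->nbits; i++)
		if ((fs->bitmap[i / 8] >> (i % 8)) & 1)
			n++;
	return n;
}

/* statfs-style query: scan on first use, answer from the cache afterwards. */
static unsigned int get_free(struct fs_info *fs)
{
	if (fs->n_free == FREE_UNKNOWN)
		fs->n_free = scan_bitmap(fs);
	return fs->n_free;
}

/* Called after a sector was allocated, so the cached value stays valid. */
static void note_alloc(struct fs_info *fs)
{
	if (fs->n_free != FREE_UNKNOWN) {
		assert(fs->n_free > 0);
		fs->n_free--;
	}
}
```

Repeated `get_free()` calls then cost a single comparison instead of a full bitmap walk.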
2014-02-01 | afs: proc cells and rootcell are writeable | Pali Rohár
Both proc files are writeable and used for configuring cells. But the correct mode flag for writeable files is missing. Without this patch both proc files are read only. [ It turns out they aren't really read-only, since root can write to them even if the write bit isn't set due to CAP_DAC_OVERRIDE ] Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-01 | Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6 | Linus Torvalds
Pull cifs fixes from Steve French: "A set of cifs fixes (mostly for symlinks, and SMB2 xattrs) and cleanups"
* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Fix check for regular file in couldbe_mf_symlink()
  [CIFS] Fix SMB2 mounts so they don't try to set or get xattrs via cifs
  CIFS: Cleanup cifs open codepath
  CIFS: Remove extra indentation in cifs_sfu_type
  CIFS: Cleanup cifs_mknod
  CIFS: Cleanup CIFSSMBOpen
  cifs: Add support for follow_link on dfs shares under posix extensions
  cifs: move unix extension call to cifs_query_symlink()
  cifs: Re-order M-F Symlink code
  cifs: Add create MFSymlinks to protocol ops struct
  cifs: use protocol specific call for query_mf_symlink()
  cifs: Rename MF symlink function names
  cifs: Rename and cleanup open_query_close_cifs_symlink()
  cifs: Fix memory leak in cifs_hardlink()
2014-02-01 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs | Linus Torvalds
Pull vfs fixes from Al Viro: "Several obvious fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  Fix mountpoint reference leakage in linkat
  hfsplus: use xattr handlers for removexattr
  Typo in compat_sys_lseek() declaration
  fs/super.c: sync ro remount after blocking writers
  vfs: unexport the getname() symbol
2014-01-31 | Merge tag 'nfs-for-3.14-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs | Linus Torvalds
Pull NFS client bugfixes from Trond Myklebust:
"Highlights:
- Fix several races in nfs_revalidate_mapping
- NFSv4.1 slot leakage in the pNFS files driver
- Stable fix for a slot leak in nfs40_sequence_done
- Don't reject NFSv4 servers that support ACLs with only ALLOW aces"
* tag 'nfs-for-3.14-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  nfs: initialize the ACL support bits to zero.
  NFSv4.1: Cleanup
  NFSv4.1: Clean up nfs41_sequence_done
  NFSv4: Fix a slot leak in nfs40_sequence_done
  NFSv4.1 free slot before resending I/O to MDS
  nfs: add memory barriers around NFS_INO_INVALID_DATA and NFS_INO_INVALIDATING
  NFS: Fix races in nfs_revalidate_mapping
  sunrpc: turn warn_gssd() log message into a dprintk()
  NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping
  nfs: handle servers that support only ALLOW ACE type.
2014-01-31 | Fix mountpoint reference leakage in linkat | Oleg Drokin
Recent changes to retry on ESTALE in linkat (commit 442e31ca5a49e398351b2954b51f578353fdf210) introduced a mountpoint reference leak and a small memory leak in case a filesystem link operation returns ESTALE which is pretty normal for distributed filesystems like lustre, nfs and so on. Free old_path in such a case. [AV: there was another missing path_put() nearby - on the previous goto retry] Signed-off-by: Oleg Drokin: <green@linuxhacker.ru> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-01-31 | hfsplus: use xattr handlers for removexattr | Christoph Hellwig
hfsplus was already using the handlers for get and set operations, and with the removal of can_set_xattr we now allow operations that wouldn't otherwise be allowed. With this we can also centralize the special-casing of the osx. attrs that don't have prefixes on disk in the osx xattr handlers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-01-31 | fs/super.c: sync ro remount after blocking writers | Andrew Ruder
Move sync_filesystem() after sb_prepare_remount_readonly(). If writers sneak in anywhere from sync_filesystem() to sb_prepare_remount_readonly() it can cause inodes to be dirtied and writeback to occur well after sys_mount() has completed successfully. This was spotted by corrupted ubifs filesystems on reboot, but it appears that it can cause issues with any filesystem using writeback. Cc: Artem Bityutskiy <dedekind1@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> CC: Richard Weinberger <richard@nod.at> Co-authored-by: Richard Weinberger <richard@nod.at> Signed-off-by: Andrew Ruder <andrew.ruder@elecsyscorp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-01-31 | vfs: unexport the getname() symbol | Jeff Layton
Leaving getname() exported when putname() isn't is a bad idea. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-01-31 | Merge tag 'jfs-3.14' of git://github.com/kleikamp/linux-shaggy | Linus Torvalds
Pull jfs fix from David Kleikamp: "Minor bug fix for linux-3.14"
* tag 'jfs-3.14' of git://github.com/kleikamp/linux-shaggy:
  jfs: fix xattr value size overflow in __jfs_setxattr
2014-01-31 | ceph: fix missing dput in ceph_set_acl | Sage Weil
Add matching dput() for d_find_alias(). Move d_find_alias() down a bit at Julia's suggestion. [ Introduced by commit 72466d0b92e0: "ceph: fix posix ACL hooks" ] Reported-by: Fengguang Wu <fengguang.wu@intel.com> Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Ilya Dryomov <ilya.dryomov@inktank.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-31 | cifs: Fix check for regular file in couldbe_mf_symlink() | Sachin Prabhu
MF Symlinks are regular files containing content in a specified format. The function couldbe_mf_symlink() checks the mode for a set S_IFREG bit as a test to confirm that it is a regular file. This bit is also set for other filetypes and simply checking for this bit being set may return false positives. We ensure that we are actually checking for a regular file by using the S_ISREG macro to test instead. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reported-by: Neil Brown <neilb@suse.de> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve French <smfrench@gmail.com>
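The false positive is easy to reproduce in user space. The sketch below uses only the standard `<sys/stat.h>` macros; the two helper names are hypothetical, and the overlap of type bits shown here holds for the Linux encoding of `st_mode`, where `S_IFREG` is one bit that `S_IFLNK` and `S_IFSOCK` also have set.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose S_IFREG/S_IFLNK under strict -std= modes */
#endif
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Buggy test: looks at a single bit of the file-type field. */
static int is_regular_buggy(mode_t mode)
{
	return (mode & S_IFREG) != 0;
}

/* Correct test: S_ISREG() compares the whole type field (mode & S_IFMT). */
static int is_regular(mode_t mode)
{
	assert((mode & S_IFMT) != 0);  /* caller must pass a full st_mode */
	return S_ISREG(mode) ? 1 : 0;
}
```

With a symlink mode such as `S_IFLNK | 0777`, the bit test reports a regular file while `S_ISREG()` does not.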
2014-01-31 | nfs: initialize the ACL support bits to zero. | Malahal Naineni
Avoid returning incorrect acl mask attributes when the server doesn't support ACLs. Signed-off-by: Malahal Naineni <malahal@us.ibm.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-01-30 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs | Linus Torvalds
Pull btrfs updates from Chris Mason: "This is a pretty big pull, and most of these changes have been floating in btrfs-next for a long time. Filipe's properties work is a cool building block for inheriting attributes like compression down on a per inode basis. Jeff Mahoney kicked in code to export filesystem info into sysfs. Otherwise, lots of performance improvements, cleanups and bug fixes. Looks like there are still a few other small pending incrementals, but I wanted to get the bulk of this in first"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (149 commits)
  Btrfs: fix spin_unlock in check_ref_cleanup
  Btrfs: setup inode location during btrfs_init_inode_locked
  Btrfs: don't use ram_bytes for uncompressed inline items
  Btrfs: fix btrfs_search_slot_for_read backwards iteration
  Btrfs: do not export ulist functions
  Btrfs: rework ulist with list+rb_tree
  Btrfs: fix memory leaks on walking backrefs failure
  Btrfs: fix send file hole detection leading to data corruption
  Btrfs: add a reschedule point in btrfs_find_all_roots()
  Btrfs: make send's file extent item search more efficient
  Btrfs: fix to catch all errors when resolving indirect ref
  Btrfs: fix protection between walking backrefs and root deletion
  btrfs: fix warning while merging two adjacent extents
  Btrfs: fix infinite path build loops in incremental send
  btrfs: undo sysfs when open_ctree() fails
  Btrfs: fix snprintf usage by send's gen_unique_name
  btrfs: fix defrag 32-bit integer overflow
  btrfs: sysfs: list the NO_HOLES feature
  btrfs: sysfs: don't show reserved incompat feature
  btrfs: call permission checks earlier in ioctls and return EPERM
  ...
2014-01-30 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client | Linus Torvalds
Pull some further ceph acl cleanups from Sage Weil: "I do have a couple patches on top of what's in your tree, though, that clean up a couple duplicated lines in your fix and apply Christoph's cleanup"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: simplify ceph_{get,init}_acl
  ceph: remove duplicate declaration of ceph_setattr
2014-01-30 | ceph: simplify ceph_{get,init}_acl | Christoph Hellwig
- ->get_acl only gets called after we checked for a cached ACL, so no need to call get_cached_acl again.
- no need to check IS_POSIXACL in ->get_acl, without that it should never get set as all the callers that set it already have the check.
- you should be able to use the full posix_acl_create in CEPH
Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Sage Weil <sage@inktank.com>
2014-01-30 | Merge branch 'for-3.14/drivers' of git://git.kernel.dk/linux-block | Linus Torvalds
Pull block IO driver changes from Jens Axboe:
- bcache update from Kent Overstreet.
- two bcache fixes from Nicholas Swenson.
- cciss pci init error fix from Andrew.
- underflow fix in the parallel IDE pg_write code from Dan Carpenter. I'm sure the 1 (or 0) users of that are now happy.
- two PCI related fixes for sx8 from Jingoo Han.
- floppy init fix for first block read from Jiri Kosina.
- pktcdvd error return miss fix from Julia Lawall.
- removal of IRQF_SHARED from the SEGA Dreamcast CD-ROM code from Michael Opdenacker.
- comment typo fix for the loop driver from Olaf Hering.
- potential oops fix for null_blk from Raghavendra K T.
- two fixes from Sam Bradshaw (Micron) for the mtip32xx driver, fixing an OOM problem and a problem with handling security locked conditions
* 'for-3.14/drivers' of git://git.kernel.dk/linux-block: (47 commits)
  mg_disk: Spelling s/finised/finished/
  null_blk: Null pointer deference problem in alloc_page_buffers
  mtip32xx: Correctly handle security locked condition
  mtip32xx: Make SGL container per-command to eliminate high order dma allocation
  drivers/block/loop.c: fix comment typo in loop_config_discard
  drivers/block/cciss.c:cciss_init_one(): use proper errnos
  drivers/block/paride/pg.c: underflow bug in pg_write()
  drivers/block/sx8.c: remove unnecessary pci_set_drvdata()
  drivers/block/sx8.c: use module_pci_driver()
  floppy: bail out in open() if drive is not responding to block0 read
  bcache: Fix auxiliary search trees for key size > cacheline size
  bcache: Don't return -EINTR when insert finished
  bcache: Improve bucket_prio() calculation
  bcache: Add bch_bkey_equal_header()
  bcache: update bch_bkey_try_merge
  bcache: Move insert_fixup() to btree_keys_ops
  bcache: Convert sorting to btree_keys
  bcache: Convert debug code to btree_keys
  bcache: Convert btree_iter to struct btree_keys
  bcache: Refactor bset_tree sysfs stats
  ...
2014-01-30 | Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block | Linus Torvalds
Pull core block IO changes from Jens Axboe: "The major piece in here is the immutable bio_vec series from Kent, the rest is fairly minor. It was supposed to go in last round, but various issues pushed it to this release instead. The pull request contains:
- Various smaller blk-mq fixes from different folks. Nothing major here, just minor fixes and cleanups.
- Fix for a memory leak in the error path in the block ioctl code from Christian Engelmayer.
- Header export fix from CaiZhiyong.
- Finally the immutable biovec changes from Kent Overstreet. This enables some nice future work on making arbitrarily sized bios possible, and making splitting more efficient.
Related fixes to immutable bio_vecs:
- dm-cache immutable fixup from Mike Snitzer.
- btrfs immutable fixup from Muthu Kumar.
- bio-integrity fix from Nic Bellinger, which is also going to stable"
* 'for-3.14/core' of git://git.kernel.dk/linux-block: (44 commits)
  xtensa: fixup simdisk driver to work with immutable bio_vecs
  block/blk-mq-cpu.c: use hotcpu_notifier()
  blk-mq: for_each_* macro correctness
  block: Fix memory leak in rw_copy_check_uvector() handling
  bio-integrity: Fix bio_integrity_verify segment start bug
  block: remove unrelated header files and export symbol
  blk-mq: uses page->list incorrectly
  blk-mq: use __smp_call_function_single directly
  btrfs: fix missing increment of bi_remaining
  Revert "block: Warn and free bio if bi_end_io is not set"
  block: Warn and free bio if bi_end_io is not set
  blk-mq: fix initializing request's start time
  block: blk-mq: don't export blk_mq_free_queue()
  block: blk-mq: make blk_sync_queue support mq
  block: blk-mq: support draining mq queue
  dm cache: increment bi_remaining when bi_end_io is restored
  block: fixup for generic bio chaining
  block: Really silence spurious compiler warnings
  block: Silence spurious compiler warnings
  block: Kill bio_pair_split()
  ...
2014-01-30 | Merge branch 'for-3.14' of git://linux-nfs.org/~bfields/linux | Linus Torvalds
Pull nfsd updates from Bruce Fields:
- Handle some loose ends from the vfs read delegation support. (For example nfsd can stop breaking leases on its own in a few places where it can now depend on the vfs to.)
- Make life a little easier for NFSv4-only configurations (thanks to Kinglong Mee).
- Fix some gss-proxy problems (thanks Jeff Layton).
- miscellaneous bug fixes and cleanup
* 'for-3.14' of git://linux-nfs.org/~bfields/linux: (38 commits)
  nfsd: consider CLAIM_FH when handing out delegation
  nfsd4: fix delegation-unlink/rename race
  nfsd4: delay setting current_fh in open
  nfsd4: minor nfs4_setlease cleanup
  gss_krb5: use lcm from kernel lib
  nfsd4: decrease nfsd4_encode_fattr stack usage
  nfsd: fix encode_entryplus_baggage stack usage
  nfsd4: simplify xdr encoding of nfsv4 names
  nfsd4: encode_rdattr_error cleanup
  nfsd4: nfsd4_encode_fattr cleanup
  minor svcauth_gss.c cleanup
  nfsd4: better VERIFY comment
  nfsd4: break only delegations when appropriate
  NFSD: Fix a memory leak in nfsd4_create_session
  sunrpc: get rid of use_gssp_lock
  sunrpc: fix potential race between setting use_gss_proxy and the upcall rpc_clnt
  sunrpc: don't wait for write before allowing reads from use-gss-proxy file
  nfsd: get rid of unused function definition
  Define op_iattr for nfsd4_open instead using macro
  NFSD: fix compile warning without CONFIG_NFSD_V3
  ...
2014-01-30 | nfs: fix xattr inode op pointers when disabled | Christoph Hellwig
Chris Mason reported a NULL pointer dereference in generic_getxattr() that was due to sb->s_xattr being NULL. The reason is that the nfs #ifdef's for ACL support were misplaced, and the nfs3 inode operations had the xattr operation pointers set up, even though xattrs were not actually supported. As a result, the xattr code was being called without the infrastructure having been set up. Move the #ifdef's appropriately. Reported-and-tested-by: Chris Mason <clm@fb.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-30 | ceph: remove duplicate declaration of ceph_setattr | Peter Rosin
Signed-off-by: Peter Rosin <peda@lysator.liu.se> Signed-off-by: Sage Weil <sage@inktank.com>
2014-01-29 | Merge branch 'akpm' (patches from Andrew Morton) | Linus Torvalds
Merge random fixes from Andrew Morton: "Random fixes. I have one batch remaining for -rc1, mainly zram changes which await a merge of Jens's trees"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  MAINTAINERS: ADI Linux development mailing lists: change to the new server
  Documentation: fix multiple typo occurences s/KenelVersion/KernelVersion/
  dma-debug: fix overlap detection
  memblock: add limit checking to memblock_virt_alloc
  mm/readahead.c: fix do_readahead() for no readpage(s)
  mm/slub.c: do not VM_BUG_ON_PAGE() for temporary on-stack pages
  slab: fix wrong retval on kmem_cache_create_memcg error path
  s390/compat: change parameter types from unsigned long to compat_ulong_t
  fs/compat: fix lookup_dcookie() parameter handling
  fs/compat: fix parameter handling for compat readv/writev syscalls
  mm/mempolicy.c: convert to pr_foo()
  mm: numa: initialise numa balancing after jump label initialisation
  mm/page-writeback.c: do not count anon pages as dirtyable memory
  mm/page-writeback.c: fix dirty_balance_reserve subtraction from dirtyable memory
  mm: document improved handling of swappiness==0
  lib/genalloc.c: add check gen_pool_dma_alloc() if dma pointer is not NULL
2014-01-29 | fs/compat: fix lookup_dcookie() parameter handling | Heiko Carstens
Commit d5dc77bfeeab ("consolidate compat lookup_dcookie()") converted all architectures to the new compat_sys_lookup_dcookie() syscall. The "len" parameter of the new compat syscall must have the type compat_size_t in order to enforce zero extension for architectures where the ABI requires that the caller of a function performed zero and/or sign extension to 64 bit of all parameters. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <stable@vger.kernel.org> [v3.10+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-29 | fs/compat: fix parameter handling for compat readv/writev syscalls | Heiko Carstens
We got a report that the pwritev syscall does not work correctly in compat mode on s390. It turned out that with commit 72ec35163f9f ("switch compat readv/writev variants to COMPAT_SYSCALL_DEFINE") we lost the zero extension of a couple of syscall parameters because some parameter types haven't been converted from unsigned long to compat_ulong_t. This is needed for architectures where the ABI requires that the caller of a function performed zero and/or sign extension to 64 bit of all parameters. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <stable@vger.kernel.org> [v3.10+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
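Why the parameter type matters can be modelled in user-space C. This is a simplified sketch, not the kernel's syscall wrapper code: a `uint64_t` stands in for a 64-bit register whose upper half was left unspecified by a 32-bit caller, `compat_ulong_t` is redefined locally as a 32-bit type, and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t compat_ulong_t;  /* "unsigned long" as seen by a 32-bit task */

/*
 * Model of a handler whose parameters use the native 64-bit type:
 * whatever sits in the upper 32 bits of each register is used as-is.
 */
static uint64_t pos_from_native(uint64_t reg_low, uint64_t reg_high)
{
	return (reg_high << 32) | reg_low;
}

/*
 * Model of a handler whose parameters use the 32-bit compat type:
 * converting to compat_ulong_t drops the stale upper bits, and the
 * widening cast below zero-extends the clean 32-bit value.
 */
static uint64_t pos_from_compat(compat_ulong_t low, compat_ulong_t high)
{
	return ((uint64_t)high << 32) | low;
}
```

If a register holds `0xdeadbeef00001000` when only the low half, `0x1000`, was meant, the native-typed variant yields a wildly wrong file position, while the compat-typed variant yields `0x1000`.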
2014-01-29 | Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs | Linus Torvalds
Pull fanotify use-after-free fixes from Jan Kara: "Three fixes for the fanotify use after free problems guys were reporting. I have ended up with different lifetime rules for struct fanotify_event_info depending on whether it is for permission event or normal event which isn't ideal. My plan is to split these into two different structures (as permission events need larger struct anyway) which will make the rules trivial again. But that can wait for later I guess (but I can add the patch to the pile if you want), now I wanted to make -rc1 boot for these guys"
[ "These guys" being Jiri Kosina and Dave Jones that reported the slab corruption issues due to incorrect object lifetimes ]
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fanotify: Fix use after free for permission events
  fsnotify: Do not return merged event from fsnotify_add_notify_event()
  fanotify: Fix use after free in mask checking
2014-01-29 | ceph: fix posix ACL hooks | Sage Weil
The merge of commit 7221fe4c2ed7 ("ceph: add acl for cephfs") raced with upstream changes in the generic POSIX ACL code (eg commit 2aeccbe957d0 "fs: add generic xattr_acl handlers" and others). Some of the fallout was fixed in commit 4db658ea0ca ("ceph: Fix up after semantic merge conflict"), but it was incomplete: the set_acl inode_operation wasn't getting set, and the prototype needed to be adjusted a bit (it doesn't take a dentry anymore). Signed-off-by: Sage Weil <sage@inktank.com> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-29 | NFSv4.1: Cleanup | Trond Myklebust
It is now completely safe to call nfs41_sequence_free_slot with a NULL slot. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-01-29 | NFSv4.1: Clean up nfs41_sequence_done | Trond Myklebust
Move the test for res->sr_slot == NULL out of the nfs41_sequence_free_slot helper and into the main function for efficiency. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-01-29 | NFSv4: Fix a slot leak in nfs40_sequence_done | Trond Myklebust
The check for whether or not we sent an RPC call in nfs40_sequence_done is insufficient to decide whether or not we are holding a session slot, and thus should not be used to decide when to free that slot. This patch replaces the RPC_WAS_SENT() test with the correct test for whether or not slot == NULL. Cc: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org # 3.12+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-01-29 | NFSv4.1 free slot before resending I/O to MDS | Andy Adamson
Fix a dynamic session slot leak where a slot is preallocated and I/O is resent through the MDS. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-01-29 | Btrfs: fix spin_unlock in check_ref_cleanup | Chris Mason
Our goto out should have gone a little farther. Signed-off-by: Chris Mason <clm@fb.com>
2014-01-29 | Btrfs: setup inode location during btrfs_init_inode_locked | Chris Mason
We have a race during inode init because the BTRFS_I(inode)->location is setup after the inode hash table lock is dropped. btrfs_find_actor uses the location field, so our search might not find an existing inode in the hash table if we race with the inode init code. This commit changes things to setup the location field sooner. Also the find actor now uses only the location objectid to match inodes. For inode hashing, we just need a unique and stable test, it doesn't have to reflect the inode numbers we show to userland. Signed-off-by: Chris Mason <clm@fb.com> CC: stable@vger.kernel.org
2014-01-29 | Btrfs: don't use ram_bytes for uncompressed inline items | Chris Mason
If we truncate an uncompressed inline item, ram_bytes isn't updated to reflect the new size. The fix uses the size directly from the item header when reading uncompressed inlines, and also fixes truncate to update the size as it goes. Reported-by: Jens Axboe <axboe@fb.com> Signed-off-by: Chris Mason <clm@fb.com> CC: stable@vger.kernel.org
2014-01-29 | Btrfs: fix btrfs_search_slot_for_read backwards iteration | Filipe David Borba Manana
If the current path's leaf slot is 0, we do search for the previous leaf (via btrfs_prev_leaf) and set the new path's leaf slot to a value corresponding to the number of items - 1 of the former leaf. Fix this by using the slot set by btrfs_prev_leaf, decrementing it by 1 if it's equal to the leaf's number of items. Use of btrfs_search_slot_for_read() for backward iteration is used in particular by the send feature, which could miss items when the input leaf has less items than its previous leaf. This could be reproduced by running btrfs/007 from xfstests in a loop. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com>
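The slot arithmetic described above can be sketched in isolation. This is a reduced model with hypothetical function names, not the btrfs code: leaves are represented only by their item counts, and `slot` stands for the value that the previous-leaf lookup left in the path.

```c
#include <assert.h>

/*
 * Behaviour before the fix, as described: the slot was derived from
 * the item count of the leaf we came from, not of the leaf we are in.
 */
static int slot_before_fix(int former_nritems)
{
	return former_nritems - 1;
}

/*
 * Behaviour after the fix: keep the slot reported for the previous
 * leaf, stepping back by one only when it points one past the last item.
 */
static int slot_after_fix(int slot, int prev_nritems)
{
	assert(prev_nritems > 0 && slot >= 0 && slot <= prev_nritems);
	if (slot == prev_nritems)
		slot--;
	return slot;
}
```

For example, when the former leaf holds 3 items and the previous leaf holds 10, the old calculation lands on slot 2 of the previous leaf and a backward walk never visits slots 3 to 9, whereas the fixed calculation starts at slot 9.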
2014-01-29Btrfs: do not export ulist functionsWang Shilong
Nothing outside Btrfs uses ulist, so don't export its functions. Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-01-29Btrfs: rework ulist with list+rb_treeWang Shilong
We are really suffering from the current ulist implementation. Some developers have tried to improve it, and here are some of my ideas: 1. Use list+rb_tree instead of array+rb_tree. 2. Add cur_list to the iterator rather than to the ulist structure. 3. Add a seqnum to every node when it is added; this is used as a self-check when iterating over nodes. I noticed Zach Brown's earlier comments: the long-term plan is to get rid of the ulist implementation; for now, however, we at least need to remove the array from ulist. Cc: Liu Bo <bo.li.liu@oracle.com> Cc: Josef Bacik <jbacik@fb.com> Cc: Zach Brown <zab@redhat.com> Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
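The three ideas in this commit can be sketched in userspace C. This is an illustration under stated assumptions, not the kernel's ulist: all `toy_*` names are hypothetical, a plain unbalanced binary search tree stands in for the rb_tree, and a singly linked list stands in for the kernel list. Each node sits on the list (for iteration in insertion order) and in the tree (for duplicate lookup), the iteration position lives in the iterator, and every node carries a sequence number that the iterator checks.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_unode {
	uint64_t val;
	uint64_t seqnum;          /* assigned at insertion time */
	struct toy_unode *next;   /* list link: insertion order */
	struct toy_unode *left;   /* tree links: stand-in for an rb_node */
	struct toy_unode *right;
};

struct toy_ulist {
	struct toy_unode *head;
	struct toy_unode *tail;
	struct toy_unode *root;
	uint64_t nnodes;
};

/* The cursor lives here, not in the list, so iterators are independent. */
struct toy_uiter {
	struct toy_unode *last;   /* last node returned, NULL before start */
};

void toy_ulist_init(struct toy_ulist *ul)
{
	ul->head = ul->tail = ul->root = NULL;
	ul->nnodes = 0;
}

/* Returns 1 if added, 0 if already present, -1 on allocation failure. */
int toy_ulist_add(struct toy_ulist *ul, uint64_t val)
{
	struct toy_unode **link = &ul->root;
	struct toy_unode *n;

	/* Tree walk: find the duplicate or the empty link to fill. */
	while (*link) {
		if (val < (*link)->val)
			link = &(*link)->left;
		else if (val > (*link)->val)
			link = &(*link)->right;
		else
			return 0;
	}

	n = calloc(1, sizeof(*n));
	if (!n)
		return -1;
	n->val = val;
	n->seqnum = ul->nnodes++;
	*link = n;

	/* Append to the list so iteration sees insertion order. */
	if (ul->tail)
		ul->tail->next = n;
	else
		ul->head = n;
	ul->tail = n;
	return 1;
}

/* Returns the next node, or NULL when nothing new is on the list. */
struct toy_unode *toy_ulist_next(struct toy_ulist *ul, struct toy_uiter *it)
{
	struct toy_unode *n = it->last ? it->last->next : ul->head;

	if (!n)
		return NULL;
	/* Self-check: sequence numbers must be consecutive along the list. */
	assert(!it->last || n->seqnum == it->last->seqnum + 1);
	it->last = n;
	return n;
}

void toy_ulist_free(struct toy_ulist *ul)
{
	struct toy_unode *n = ul->head;

	while (n) {
		struct toy_unode *next = n->next;

		free(n);
		n = next;
	}
	toy_ulist_init(ul);
}
```

Because the iterator remembers the last node it returned rather than an array index, nodes appended while an iteration is in progress are picked up by later calls to `toy_ulist_next()`, and no array has to be resized.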
2014-01-29Btrfs: fix memory leaks on walking backrefs failureWang Shilong
When walking backrefs, we may iterate over every inode's extents and add/merge them into a ulist, and the caller will free the memory through the ulist. However, if we fail to allocate memory for an inode's extent element, or ulist_add() fails to allocate memory, the already allocated memory is never added to the ulist, so the caller cannot free it and it leaks. Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
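The leak pattern described here is a general one: an element is allocated, but handing it to the container fails, so the container's owner never sees it. A minimal userspace sketch follows, with hypothetical `toy_*` names, a `fail_add` flag standing in for an allocation failure inside the add routine, and a global counter used only to make outstanding allocations observable. It is not the backref code itself.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_elem {
	int value;
	struct toy_elem *next;
};

struct toy_set {
	struct toy_elem *head;
	int fail_add;   /* test hook: simulate a failing add */
};

int toy_live_elems;     /* outstanding allocations, for illustration only */

struct toy_elem *toy_elem_alloc(int value)
{
	struct toy_elem *e = malloc(sizeof(*e));

	if (e) {
		e->value = value;
		e->next = NULL;
		toy_live_elems++;
	}
	return e;
}

void toy_elem_free(struct toy_elem *e)
{
	free(e);
	toy_live_elems--;
}

/* Returns 0 and takes ownership of e, or -1 and leaves ownership alone. */
int toy_set_add(struct toy_set *s, struct toy_elem *e)
{
	if (s->fail_add)
		return -1;
	e->next = s->head;
	s->head = e;
	return 0;
}

/* The caller's cleanup: it can only free what reached the set. */
void toy_set_free(struct toy_set *s)
{
	while (s->head) {
		struct toy_elem *e = s->head;

		s->head = e->next;
		toy_elem_free(e);
	}
}

int toy_collect(struct toy_set *s, int value)
{
	struct toy_elem *e = toy_elem_alloc(value);

	if (!e)
		return -1;
	if (toy_set_add(s, e) < 0) {
		/* Ownership never moved to the set, so free it here. */
		toy_elem_free(e);
		return -1;
	}
	return 0;
}
```

Without the `toy_elem_free()` call on the failure path, the element would be reachable from neither the set nor the caller, which is the kind of leak the commit addresses.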
2014-01-29Btrfs: fix send file hole detection leading to data corruptionFilipe David Borba Manana
There was a case where file hole detection was incorrect and it would cause an incremental send to overwrite a section of a file with zeroes. This happened in the case where, between the last leaf we processed which contained a file extent item for our current inode and the leaf we're currently at (which also has a file extent item for our current inode), there are only leafs containing exclusively file extent items for our current inode, and none of them was updated since the previous send operation. The file hole detection code would incorrectly consider the file range covered by these leafs as a hole. A test case for xfstests follows soon. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-01-29Btrfs: add a reschedule point in btrfs_find_all_roots()Wang Shilong
I can easily trigger the following warnings when enabling quota in my virtual machine (running openSUSE). The steps are to first create a subvolume full of fragmented extents and then create many snapshots (500 in my test case).
[ 2362.808459] BUG: soft lockup - CPU#0 stuck for 22s! [btrfs-qgroup-re:1970]
[ 2362.809023] task: e4af8450 ti: e371c000 task.ti: e371c000
[ 2362.809026] EIP: 0060:[<fa38f4ae>] EFLAGS: 00000246 CPU: 0
[ 2362.809049] EIP is at __merge_refs+0x5e/0x100 [btrfs]
[ 2362.809051] EAX: 00000000 EBX: cfadbcf0 ECX: 00000000 EDX: cfadbcb0
[ 2362.809052] ESI: dd8d3370 EDI: e371dde0 EBP: e371dd6c ESP: e371dd5c
[ 2362.809054] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 2362.809055] CR0: 80050033 CR2: ac454d50 CR3: 009a9000 CR4: 001407d0
[ 2362.809099] Stack:
[ 2362.809100] 00000001 e371dde0 dfcc6890 f29f8000 e371de28 fa39016d 00000011 00000001
[ 2362.809105] 99bfc000 00000000 93928000 00000000 00000001 00000050 e371dda8 00000001
[ 2362.809109] f3a31000 f3413000 00000001 e371ddb8 000040a8 00000202 00000000 00000023
[ 2362.809113] Call Trace:
[ 2362.809136] [<fa39016d>] find_parent_nodes+0x34d/0x1280 [btrfs]
[ 2362.809156] [<fa391172>] btrfs_find_all_roots+0xb2/0x110 [btrfs]
[ 2362.809174] [<fa3934a8>] btrfs_qgroup_rescan_worker+0x358/0x7a0 [btrfs]
[ 2362.809180] [<c024d0ce>] ? lock_timer_base.isra.39+0x1e/0x40
[ 2362.809199] [<fa3648df>] worker_loop+0xff/0x470 [btrfs]
[ 2362.809204] [<c027a88a>] ? __wake_up_locked+0x1a/0x20
[ 2362.809221] [<fa3647e0>] ? btrfs_queue_worker+0x2b0/0x2b0 [btrfs]
[ 2362.809225] [<c025ebbc>] kthread+0x9c/0xb0
[ 2362.809229] [<c06b487b>] ret_from_kernel_thread+0x1b/0x30
[ 2362.809233] [<c025eb20>] ? kthread_create_on_node+0x110/0x110
By adding a reschedule point at the end of btrfs_find_all_roots(), I no longer hit these warnings.
Cc: Josef Bacik <jbacik@fb.com> Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>