summaryrefslogtreecommitdiffstats
path: root/fs/f2fs
AgeCommit message (Collapse)Author
2014-03-13fs: push sync_filesystem() down to the file system's remount_fs()Theodore Ts'o
Previously, the no-op "mount -o mount /dev/xxx" operation when the file system is already mounted read-write causes an implied, unconditional syncfs(). This seems pretty stupid, and it's certainly documented or guaraunteed to do this, nor is it particularly useful, except in the case where the file system was mounted rw and is getting remounted read-only. However, it's possible that there might be some file systems that are actually depending on this behavior. In most file systems, it's probably fine to only call sync_filesystem() when transitioning from read-write to read-only, and there are some file systems where this is not needed at all (for example, for a pseudo-filesystem or something like romfs). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: linux-fsdevel@vger.kernel.org Cc: Christoph Hellwig <hch@infradead.org> Cc: Artem Bityutskiy <dedekind1@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Evgeniy Dushistov <dushistov@mail.ru> Cc: Jan Kara <jack@suse.cz> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Anders Larsen <al@alarsen.net> Cc: Phillip Lougher <phillip@squashfs.org.uk> Cc: Kees Cook <keescook@chromium.org> Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Cc: Petr Vandrovec <petr@vandrovec.name> Cc: xfs@oss.sgi.com Cc: linux-btrfs@vger.kernel.org Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Cc: codalist@coda.cs.cmu.edu Cc: linux-ext4@vger.kernel.org Cc: linux-f2fs-devel@lists.sourceforge.net Cc: fuse-devel@lists.sourceforge.net Cc: cluster-devel@redhat.com Cc: linux-mtd@lists.infradead.org Cc: jfs-discussion@lists.sourceforge.net Cc: linux-nfs@vger.kernel.org Cc: linux-nilfs@vger.kernel.org Cc: linux-ntfs-dev@lists.sourceforge.net Cc: ocfs2-devel@oss.oracle.com Cc: reiserfs-devel@vger.kernel.org
2014-03-12f2fs: check upper bound of ino value in f2fs_nfs_get_inodeChao Yu
Upper bound checking of ino should be added to f2fs_nfs_get_inode, so unneeded process before do_read_inode in f2fs_iget could be avoided when ino is invalid. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-12f2fs: introduce f2fs_has_inline_xattr for better readabilityChao Yu
This patch introduces a help function f2fs_has_inline_xattr for better readability. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-11f2fs: recover inline xattr data in roll-forward processChao Yu
Previously we do not recover inline xattr data of inode after power-cut, so inline xattr data may be lost. We should recover the data during the roll-forward process. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-10f2fs: optimize restore_node_summary slightlyGu Zheng
Previously, we ra_sum_pages to pre-read contiguous pages as more as possible, and if we fail to alloc more pages, an ENOMEM error will be reported upstream, even though we have alloced some pages yet. In fact, we can use the available pages to do the job partly, and continue the rest in the following circle. Only reporting ENOMEM upstream if we really can not alloc any available page. And another fix is ignoring dealing with the following pages if an EIO occurs when reading page from page_list. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> [Jaegeuk Kim: modify the flow for better neat code] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-10f2fs: format segment_info's show for better legibilityGu Zheng
The original segment_info's show is a bit out-of-format: [root@guz Demoes]# cat /proc/fs/f2fs/loop0/segment_info 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...... 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 [root@guz Demoes]# so we fix it here for better legibility. [root@guz Demoes]# cat /proc/fs/f2fs/loop0/segment_info 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...... 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 [root@guz Demoes]# Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-10f2fs: remove the unused ctor argument of f2fs_kmem_cache_create()Gu Zheng
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-10f2fs: update start nid only once each circleGu Zheng
Integrated a couple of minor changes for better readability suggested by Chao Yu. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-05f2fs: fix wrong kernel coding styleJaegeuk Kim
This patch includes a simple fix to adjust coding style. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-03-03f2fs: fix to write node pages with WRITE_SYNCJaegeuk Kim
This patch fixes performance regression of dbench reported by Alex <hbx7d@yandex.com>. This issue was revealed by Phoronix tests results: http://www.phoronix.com/scan.php?page=article&item=linux_314_ssdfs&num=2 It turns out that we need to assign WRITE_SYNC to the node writes, if fsync is triggered. The performance numbers are like below, which is measured by Alex. 1. 355MB/s ext4 2. 225MB/s f2fs : WRITE for node writes 3. 525MB/s f2fs : WRITE_SYNC for node writes Reported-And-Tested-by: Alex <hbx7d@yandex.com>. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-28f2fs: fix dirty page accounting when redirtyChao Yu
We should de-account dirty counters for page when redirty in ->writepage(). Wu Fengguang described in 'commit 971767caf632190f77a40b4011c19948232eed75': "writeback: fix dirtied pages accounting on redirty De-account the accumulative dirty counters on page redirty. Page redirties (very common in ext4) will introduce mismatch between counters (a) and (b) a) NR_DIRTIED, BDI_DIRTIED, tsk->nr_dirtied b) NR_WRITTEN, BDI_WRITTEN This will introduce systematic errors in balanced_rate and result in dirty page position errors (ie. the dirty pages are no longer balanced around the global/bdi setpoints)." Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-27f2fs: use existing macro to clean up some codesChao Yu
This patch use existing macro F2FS_INODE/NEXT_FREE_BLKADDR to clean up some codes. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-27f2fs: readahead contiguous SSA blocks for f2fs_gcChao Yu
If there are multi segments in one section, we will read those SSA blocks which have contiguous address one by one in f2fs_gc. It may lost performance, let's read ahead SSA blocks by merge multi read request. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-27f2fs: add an sysfs entry to control the directory levelJaegeuk Kim
This patch adds an sysfs entry to control dir_level used by the large directory. The description of this entry is: dir_level This parameter controls the directory level to support large directory. If a directory has a number of files, it can reduce the file lookup latency by increasing this dir_level value. Otherwise, it needs to decrease this value to reduce the space overhead. The default value is 0. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-27f2fs: introduce large directory supportJaegeuk Kim
This patch introduces an i_dir_level field to support large directory. Previously, f2fs maintains multi-level hash tables to find a dentry quickly from a bunch of chiild dentries in a directory, and the hash tables consist of the following tree structure as below. In Documentation/filesystems/f2fs.txt, ---------------------- A : bucket B : block N : MAX_DIR_HASH_DEPTH ---------------------- level #0 | A(2B) | level #1 | A(2B) - A(2B) | level #2 | A(2B) - A(2B) - A(2B) - A(2B) . | . . . . level #N/2 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B) . | . . . . level #N | A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B) But, if we can guess that a directory will handle a number of child files, we don't need to traverse the tree from level #0 to #N all the time. Since the lower level tables contain relatively small number of dentries, the miss ratio of the target dentry is likely to be high. In order to avoid that, we can configure the hash tables sparsely from level #0 like this. level #0 | A(2B) - A(2B) - A(2B) - A(2B) level #1 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B) . | . . . . level #N/2 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B) . | . . . . level #N | A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B) With this structure, we can skip the ineffective tree searches in lower level hash tables. This patch adds just a facility for this by introducing i_dir_level in f2fs_inode. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-27f2fs: remove costly bit operations for f2fs_find_entryJaegeuk Kim
It turns out that a bit operation like find_next_bit is not always fast enough for f2fs_find_entry. Instead, it is pretty much simple and fast to traverse each dentries. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-24f2fs: implement a lock-free stat_showJaegeuk Kim
The stat_show is just to show the current status of f2fs. So, we can remove all the there-in locks. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-24f2fs: introduce a radix_tree for the free_nid listJaegeuk Kim
This patch introduces a radix tree for the list of free_nids, which enhances the performance on free nid management. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-24f2fs: introduce help macro on_build_free_nids()Gu Zheng
Introduce help macro on_build_free_nids() which just uses build_lock to judge whether the building free nid is going, so that we can remove the on_build_free_nids field from f2fs_sb_info. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> [Jaegeuk Kim: remove an unnecessary white line removal] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-24f2fs: fix to mark the checkpointed nat entry correctlyJaegeuk Kim
The nat cache entry maintains a status whether it is checkpointed or not. So, if a new cache entry is loaded from the last checkpoint, nat_entry->checkpointed should be true. If the cache entry is modified as being dirty, nat_entry->checkpoint should be false. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-24f2fs: fix to do build_stat prior to the recovery procedureJaegeuk Kim
At the end of the recovery procedure, write_checkpoint is called and updates the cp count which is managed by f2fs stat. But, previously build_stat() is called after the recovery procedure, which results in: BUG: unable to handle kernel NULL pointer dereference at 000000000000012c IP: [<ffffffffa03b1030>] write_checkpoint+0x720/0xbc0 [f2fs] Call Trace: [<ffffffff810a6b44>] ? mark_held_locks+0x74/0x140 [<ffffffff8109a3e0>] ? __init_waitqueue_head+0x60/0x60 [<ffffffffa03bf036>] recover_fsync_data+0x656/0xf20 [f2fs] [<ffffffff812ee3eb>] ? security_d_instantiate+0x1b/0x30 [<ffffffffa03aeb4d>] f2fs_fill_super+0x94d/0xa00 [f2fs] [<ffffffff811a9825>] mount_bdev+0x1a5/0x1f0 [<ffffffff8114915e>] ? __get_free_pages+0xe/0x40 [<ffffffffa03ae200>] ? f2fs_remount+0x130/0x130 [f2fs] [<ffffffffa03aa575>] f2fs_mount+0x15/0x20 [f2fs] [<ffffffff811aa713>] mount_fs+0x43/0x1b0 [<ffffffff811c7124>] vfs_kern_mount+0x74/0x160 [<ffffffff811c5cb1>] ? __get_fs_type+0x51/0x60 [<ffffffff811c9727>] do_mount+0x237/0xb50 [<ffffffff811c936a>] ? copy_mount_options+0x3a/0x170 So, this patche changes the order of recovery_fsync_data() and f2fs_build_stats(). Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-24f2fs: fix not to write data pages on the page reclaiming pathJaegeuk Kim
Even if f2fs_write_data_page is called by the page reclaiming path, we should not write the page to provide enough free segments for the worst case scenario. Otherwise, f2fs can face with no free segment while gc is conducted, resulting in: ------------[ cut here ]------------ kernel BUG at /home/zeus/f2fs_test/src/fs/f2fs/segment.c:565! RIP: 0010:[<ffffffffa02c3b11>] [<ffffffffa02c3b11>] new_curseg+0x331/0x340 [f2fs] Call Trace: allocate_segment_by_default+0x204/0x280 [f2fs] allocate_data_block+0x108/0x210 [f2fs] write_data_page+0x8a/0xc0 [f2fs] do_write_data_page+0xe1/0x2a0 [f2fs] move_data_page+0x8a/0xf0 [f2fs] f2fs_gc+0x446/0x970 [f2fs] f2fs_balance_fs+0xb6/0xd0 [f2fs] f2fs_write_begin+0x50/0x350 [f2fs] ? unlock_page+0x27/0x30 ? unlock_page+0x27/0x30 generic_file_buffered_write+0x10a/0x280 ? file_update_time+0xa3/0xf0 __generic_file_aio_write+0x1c8/0x3d0 ? generic_file_aio_write+0x52/0xb0 ? generic_file_aio_write+0x52/0xb0 generic_file_aio_write+0x65/0xb0 do_sync_write+0x5a/0x90 vfs_write+0xc5/0x1f0 SyS_write+0x55/0xa0 system_call_fastpath+0x16/0x1b Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17f2fs: fix the calculation of max_nidsJaegeuk Kim
Total nids that f2fs can use should not include 0, nid for node inode, and nid for meta inode. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17f2fs: show counts of checkpoint in statusChangman Lee
This patch shows the counts of checkpoint in f2fs' status. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17f2fs: introduce ra_meta_pages to readahead CP/NAT/SIT pagesChao Yu
This patch help us to cleanup the readahead code by merging ra_{sit,nat}_pages function into ra_meta_pages. Additionally the new function is used to readahead cp block in recover_orphan_inodes. Change log from v1: o fix a deadloop bug pointed by Jaegeuk Kim. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17f2fs: use inode mutex to keep atomicity of f2fs_fallocChao Yu
Previously without protection of inode mutex, f2fs_falloc and other data correlated operations will interfere with each other. So let's use inode mutex to keep atomicity of f2fs_falloc. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17f2fs: clean up redundant function callJaegeuk Kim
This patch integrates inode_[inc|dec]_dirty_dents with inc_page_count to remove redundant calls. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17f2fs: fix f2fs_write_meta_page at no checkpoint statusJaegeuk Kim
If f2fs entered errorneous checkpoint status, it should skip writing meta pages instead of redirtying the pages out. Otherwise, it cannot unmount the partition even though f2fs is under read-only status. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17f2fs: fix to truncate dentry pages in the error caseJaegeuk Kim
When a new directory is allocated, if an error is occurred, we should truncate preallocated dentry pages too. This bug was reported by Andrey Tsyvarev after a while as follows. mkdir()-> f2fs_add_link()-> init_inode_metadata()-> f2fs_init_acl()-> f2fs_get_acl()-> f2fs_getxattr()-> read_all_xattrs() fails. Also there was a BUG_ON triggered after the fault in mkdir()-> f2fs_add_link()-> init_inode_metadata()-> remove_inode_page() -> f2fs_bug_on(inode->i_blocks != 0 && inode->i_blocks != 1); But, previous patch wasn't perfect to resolve that bug, so the following bug report was also submitted. kernel BUG at fs/f2fs/inode.c:274! Call Trace: [<ffffffff811fde03>] evict+0xa3/0x1a0 [<ffffffff811fe615>] iput+0xf5/0x180 [<ffffffffa01c7f63>] f2fs_mkdir+0xf3/0x150 [f2fs] [<ffffffff811f2a77>] vfs_mkdir+0xb7/0x160 [<ffffffff811f36bf>] SyS_mkdir+0x5f/0xc0 [<ffffffff81680769>] system_call_fastpath+0x16/0x1b Finally, this patch resolves all the issues like below. If an error is occurred after make_empty_dir(), 1. truncate_inode_pages() The make_bad_inode() prior to iput() will change i_mode to S_IFREG, which means that f2fs will not decrement fi->dirty_dents during f2fs_evict_inode. But, by calling it here, we can do that. 2. truncate_blocks() Preallocated dentry pages are trucated here to sync i_blocks. 3. remove_dirty_dir_inode() Remove this directory inode from the list. Reported-and-Tested-by: Andrey Tsyvarev <tsyvarev@ispras.ru> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17f2fs: fix a build warningJaegeuk Kim
This patch modifies flow a little bit to avoid the following build warnings. src/fs/f2fs/recovery.c: In function ‘check_index_in_prev_nodes’: src/fs/f2fs/recovery.c:288:51: warning: ‘sum.<U5390>.<U52f8>.ofs_in_node’ may be used uninitialized in this function [-Wmaybe-uninitialized] src/fs/f2fs/recovery.c:260:23: warning: ‘sum.nid’ may be used uninitialized in this function [-Wmaybe-uninitialized] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17f2fs: clean up with a macroJaegeuk Kim
This patch adds GET_BLKOFF_FROM_SEG0 to clean up some codes. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17f2fs: fix the potential mismatch between dir's i_size and i_blocksJaegeuk Kim
This is the erroneous scenario. i_size on-disk i_size i_blocks __f2fs_add_link() 4096 4096 2 get_new_data_page 8192 4096 3 -ENOSPC = init_inode_metadata checkpoint - 4096 3 POR and reboot __f2fs_add_link() 4096 4096 3 page = get_new_data_page (page->index = 1 by NEW_ADDR) add a dentry to the page successfully f2fs_rmdir() f2fs_empty_dir() 4096 4096 3 f2fs_unlink() goes, since there is no valid dentry due to i_size = 4096. But, still there is one dentry in page->index = 1. So this patch moves the code to write dir->i_size into on-disk i_size in order to sync dir's i_size, on-disk i_size, and its i_blocks. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17f2fs: remove the ugly pointer conversionJaegeuk Kim
This patch modifies the use of bi_private to remove pointer chasing for sbi. Previously, we had a bi_private structure, but it needs memory allocation. So this patch uses bi_private by the sbi pointer and adds a completion pointer into the sbi. This can achieve no memory allocation and nice use of the bi_private. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17f2fs: fix to recover xattr node blockJaegeuk Kim
If a new xattr node page was allocated and its inode is fsynced, we should recover the xattr node page during the roll-forward process after power-cut. But, previously, f2fs didn't handle that case, resulting in kernel panic as follows reported by Tom Li. BUG: unable to handle kernel paging request at ffffc9001c861a98 IP: [<ffffffffa0295236>] check_index_in_prev_nodes+0x86/0x2d0 [f2fs] Call Trace: [<ffffffff815ece9b>] ? printk+0x48/0x4a [<ffffffffa029626a>] recover_fsync_data+0xdca/0xf50 [f2fs] [<ffffffffa02873ae>] f2fs_fill_super+0x92e/0x970 [f2fs] [<ffffffff8112c9f8>] mount_bdev+0x1b8/0x200 [<ffffffffa0286a80>] ? f2fs_remount+0x130/0x130 [f2fs] [<ffffffffa0285e40>] f2fs_mount+0x10/0x20 [f2fs] [<ffffffff8112d4de>] mount_fs+0x3e/0x1b0 [<ffffffff810ef4eb>] ? __alloc_percpu+0xb/0x10 [<ffffffff8114761f>] vfs_kern_mount+0x6f/0x120 [<ffffffff811497b9>] do_mount+0x259/0xa90 [<ffffffff810ead1d>] ? memdup_user+0x3d/0x80 [<ffffffff810eadb3>] ? strndup_user+0x53/0x70 [<ffffffff8114a2c9>] SyS_mount+0x89/0xd0 [<ffffffff815feae2>] system_call_fastpath+0x16/0x1b This patch adds a recovery function of xattr node pages. Reported-by: Tom Li <biergaizi@members.fsf.org> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17f2fs: handle dirty segments inside refresh_sit_entryJaegeuk Kim
This patch cleans up the refresh_sit_entry to handle locate_dirty_segments. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-02-17f2fs: update_inode_page should be done all the timeJaegeuk Kim
In order to make fs consistency, update_inode_page should not be failed all the time. Otherwise, it is possible to lose some metadata in the inode like a link count. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-30Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull core block IO changes from Jens Axboe: "The major piece in here is the immutable bio_ve series from Kent, the rest is fairly minor. It was supposed to go in last round, but various issues pushed it to this release instead. The pull request contains: - Various smaller blk-mq fixes from different folks. Nothing major here, just minor fixes and cleanups. - Fix for a memory leak in the error path in the block ioctl code from Christian Engelmayer. - Header export fix from CaiZhiyong. - Finally the immutable biovec changes from Kent Overstreet. This enables some nice future work on making arbitrarily sized bios possible, and splitting more efficient. Related fixes to immutable bio_vecs: - dm-cache immutable fixup from Mike Snitzer. - btrfs immutable fixup from Muthu Kumar. - bio-integrity fix from Nic Bellinger, which is also going to stable" * 'for-3.14/core' of git://git.kernel.dk/linux-block: (44 commits) xtensa: fixup simdisk driver to work with immutable bio_vecs block/blk-mq-cpu.c: use hotcpu_notifier() blk-mq: for_each_* macro correctness block: Fix memory leak in rw_copy_check_uvector() handling bio-integrity: Fix bio_integrity_verify segment start bug block: remove unrelated header files and export symbol blk-mq: uses page->list incorrectly blk-mq: use __smp_call_function_single directly btrfs: fix missing increment of bi_remaining Revert "block: Warn and free bio if bi_end_io is not set" block: Warn and free bio if bi_end_io is not set blk-mq: fix initializing request's start time block: blk-mq: don't export blk_mq_free_queue() block: blk-mq: make blk_sync_queue support mq block: blk-mq: support draining mq queue dm cache: increment bi_remaining when bi_end_io is restored block: fixup for generic bio chaining block: Really silence spurious compiler warnings block: Silence spurious compiler warnings block: Kill bio_pair_split() ...
2014-01-28Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "Assorted stuff; the biggest pile here is Christoph's ACL series. Plus assorted cleanups and fixes all over the place... There will be another pile later this week" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (43 commits) __dentry_path() fixes vfs: Remove second variable named error in __dentry_path vfs: Is mounted should be testing mnt_ns for NULL or error. Fix race when checking i_size on direct i/o read hfsplus: remove can_set_xattr nfsd: use get_acl and ->set_acl fs: remove generic_acl nfs: use generic posix ACL infrastructure for v3 Posix ACLs gfs2: use generic posix ACL infrastructure jfs: use generic posix ACL infrastructure xfs: use generic posix ACL infrastructure reiserfs: use generic posix ACL infrastructure ocfs2: use generic posix ACL infrastructure jffs2: use generic posix ACL infrastructure hfsplus: use generic posix ACL infrastructure f2fs: use generic posix ACL infrastructure ext2/3/4: use generic posix ACL infrastructure btrfs: use generic posix ACL infrastructure fs: make posix_acl_create more useful fs: make posix_acl_chmod more useful ...
2014-01-25f2fs: use generic posix ACL infrastructureChristoph Hellwig
f2fs has some weird mode bit handling, so still using the old chmod code for now. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-01-25fs: make posix_acl_create more usefulChristoph Hellwig
Rename the current posix_acl_created to __posix_acl_create and add a fully featured helper to set up the ACLs on file creation that uses get_acl(). Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-01-25fs: make posix_acl_chmod more usefulChristoph Hellwig
Rename the current posix_acl_chmod to __posix_acl_chmod and add a fully featured ACL chmod helper that uses the ->set_acl inode operation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-01-23f2fs: drop obsolete node page when it is truncatedJaegeuk Kim
If a node page is trucated, we'd better drop the page in the node_inode's page cache for better memory footprint. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-22f2fs: introduce NODE_MAPPING for code consistencyJaegeuk Kim
This patch adds NODE_MAPPING which is similar as META_MAPPING introduced by Gu Zheng. Cc: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-22f2fs: remove the orphan block page arrayGu Zheng
As the orphan_blocks may be max to 504, so it is not security and rigorous to store such a large array in the kernel stack as Dan Carpenter said. In fact, grab_meta_page has locked the page in the page cache, and we can use find_get_page() to fetch the page safely in the downstream, so we can remove the page array directly. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-22f2fs: add help function META_MAPPINGGu Zheng
Introduce help function META_MAPPING() to get the cache meta blocks' address space. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-22f2fs: move a branch for code redabilityJaegeuk Kim
This patch moves a function in f2fs_delete_entry for code readability. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-22f2fs: call mark_inode_dirty to flush dirty pagesJaegeuk Kim
If a dentry page is updated, we should call mark_inode_dirty to add the inode into the dirty list, so that its dentry pages are flushed to the disk. Otherwise, the inode can be evicted without flush. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-20f2fs: clean checkpatch warningsChris Fries
Fixed a variety of trivial checkpatch warnings. The only delta should be some minor formatting on log strings that were split / too long. Signed-off-by: Chris Fries <cfries@motorola.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-16f2fs: missing REQ_META and REQ_PRIO when sync_meta_pages(META_FLUSH)Changman Lee
Doing sync_meta_pages with META_FLUSH when checkpoint, we overide rw using WRITE_FLUSH_FUA. At this time, we also should set REQ_META|REQ_PRIO. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2014-01-16f2fs: avoid f2fs_balance_fs call during pageoutJaegeuk Kim
This patch should resolve the following bug. ========================================================= [ INFO: possible irq lock inversion dependency detected ] 3.13.0-rc5.f2fs+ #6 Not tainted --------------------------------------------------------- kswapd0/41 just changed the state of lock: (&sbi->gc_mutex){+.+.-.}, at: [<ffffffffa030503e>] f2fs_balance_fs+0xae/0xd0 [f2fs] but this lock took another, RECLAIM_FS-READ-unsafe lock in the past: (&sbi->cp_rwsem){++++.?} and interrupts could create inverse lock ordering between them. other info that might help us debug this: Chain exists of: &sbi->gc_mutex --> &sbi->cp_mutex --> &sbi->cp_rwsem Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sbi->cp_rwsem); local_irq_disable(); lock(&sbi->gc_mutex); lock(&sbi->cp_mutex); <Interrupt> lock(&sbi->gc_mutex); *** DEADLOCK *** This bug is due to the f2fs_balance_fs call in f2fs_write_data_page. If f2fs_write_data_page is triggered by wbc->for_reclaim via kswapd, it should not call f2fs_balance_fs which tries to get a mutex grabbed by original syscall flow. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>