summaryrefslogtreecommitdiffstats
path: root/fs
AgeCommit message (Collapse)Author
2008-07-28[XFS] make inode reclaim wait for log I/O to completeLachlan McIlroy
During a forced shutdown a xfs inode can be destroyed before log I/O involving that inode is complete. We need to wait for the inode to be unpinned before tearing it down. Version 2 cleans up the code a bit by relying on xfs_iflush() to do the unpinning and forced shutdown check. SGI-PV: 981240 SGI-Modid: xfs-linux-melb:xfs-kern:31326a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com>
2008-07-28[XFS] Switches xfs_vn_listxattr to set it's put_listent callback directlyChristoph Hellwig
and not go through xfs_attr_list. SGI-PV: 983395 SGI-Modid: xfs-linux-melb:xfs-kern:31324a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] Factor out code for whether inode has attributes or not.Christoph Hellwig
SGI-PV: 983394 SGI-Modid: xfs-linux-melb:xfs-kern:31323a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] Pack some shortform dir2 structures for the ARM old ABIEric Sandeen
architecture. This should fix the longstanding issues with xfs and old ABI arm boxes, which lead to various asserts and xfs shutdowns, and for which an (incorrect) patch has been floating around for years. I've verified this patch by comparing the on-disk structure layouts using pahole from the dwarves package, as well as running through a bit of xfsqa under qemu-arm, modified so that the check/repair phase after each test actually executes check/repair from the x86 host, on the filesystem populated by the arm emulator. Thus far it all looks good. There are 2 other structures with extra padding at the end, but they don't seem to cause trouble. I suppose they could be packed as well: xfs_dir2_data_unused_t and xfs_dir2_sf_t. Note that userspace needs a similar treatment, and any filesystems which were running with the previous rogue "fix" will now see corruption (either in the kernel, or during xfs_repair) with this fix properly in place; it may be worth teaching xfs_repair to identify and fix that specific issue. SGI-PV: 982930 SGI-Modid: xfs-linux-melb:xfs-kern:31280a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] Use the generic xattr methods.Lachlan McIlroy
Use the generic set, get and removexattr methods and supply the s_xattr array with fine-grained handlers. All XFS/Linux highlevel attr handling is rewritten from scratch and placed into fs/xfs/linux-2.6/xfs_xattr.c so that it's separated from the generic low-level code. SGI-PV: 982343 SGI-Modid: xfs-linux-melb:xfs-kern:31234a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] Invalidate dentry in unlink/rmdir if in case-insensitive modeBarry Naujok
The vfs_unlink/d_delete functionality in the Linux VFS make the dentry negative if it is the only inode being referenced. Case-insensitive mode doesn't work with negative dentries, so if using CI-mode, invalidate the dentry on unlink/rmdir. SGI-PV: 983102 SGI-Modid: xfs-linux-melb:xfs-kern:31308a Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-07-28[XFS] Zero uninitialised xfs_da_args structure in xfs_dir2.cBarry Naujok
Fixes a problem in the xfs_dir2_remove and xfs_dir2_replace paths which intenally call directory format specific lookup funtions that assume args->cmpresult is zeroed. SGI-PV: 982606 SGI-Modid: xfs-linux-melb:xfs-kern:31268a Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-07-28[XFS] Remove d_add call for an ENOENT lookup return codeBarry Naujok
SGI-PV: 981521 SGI-Modid: xfs-linux-melb:xfs-kern:31214a Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com>
2008-07-28[XFS] kmem_free and kmem_realloc to use const void *Barry Naujok
SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31212a Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-07-28[XFS] XFS: ASCII case-insensitive supportBarry Naujok
Implement ASCII case-insensitive support. It's primary purpose is for supporting existing filesystems that already use this case-insensitive mode migrated from IRIX. But, if you only need ASCII-only case-insensitive support (ie. English only) and will never use another language, then this mode is perfectly adequate. ASCII-CI is implemented by generating hashes based on lower-case letters and doing lower-case compares. It implements a new xfs_nameops vector for doing the hashes and comparisons for all filename operations. To create a filesystem with this CI mode, use: # mkfs.xfs -n version=ci <device> SGI-PV: 981516 SGI-Modid: xfs-linux-melb:xfs-kern:31209a Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-07-28[XFS] Return case-insensitive match for dentry cacheBarry Naujok
This implements the code to store the actual filename found during a lookup in the dentry cache and to avoid multiple entries in the dcache pointing to the same inode. To avoid polluting the dcache, we implement a new directory inode operations for lookup. xfs_vn_ci_lookup() stores the correct case name in the dcache. The "actual name" is only allocated and returned for a case- insensitive match and not an actual match. Another unusual interaction with the dcache is not storing negative dentries like other filesystems doing a d_add(dentry, NULL) when an ENOENT is returned. During the VFS lookup, if a dentry returned has no inode, dput is called and ENOENT is returned. By not doing a d_add, this actually removes it completely from the dcache to be reused. create/rename have to be modified to support unhashed dentries being passed in. SGI-PV: 981521 SGI-Modid: xfs-linux-melb:xfs-kern:31208a Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-07-28dcache: Add case-insensitive support d_ci_add() routineBarry Naujok
This add a dcache entry to the dcache for lookup, but changing the name that is associated with the entry rather than the one passed in to the lookup routine. First, it sees if the case-exact match already exists in the dcache and uses it if one exists. Otherwise, it allocates a new node with the new name and splices it into the dcache. Original code from ntfs_lookup in fs/ntfs/namei.c by Anton Altaparmakov. Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Acked-by: Christoph Hellwig <hch@infradead.org>
2008-07-28[XFS] Add op_flags field and helpers to xfs_da_argsBarry Naujok
The end of the xfs_da_args structure has 4 unsigned char fields for true/false information on directory and attr operations using the xfs_da_args structure. The following converts these 4 into a op_flags field that uses the first 4 bits for these fields and allows expansion for future operation information (eg. case-insensitive lookup request). SGI-PV: 981520 SGI-Modid: xfs-linux-melb:xfs-kern:31206a Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-07-28[XFS] Name operation vector for hash and compareBarry Naujok
Adds two pieces of functionality for the basis of case-insensitive support in XFS: 1. A comparison result enumerated type: xfs_dacmp. It represents an exact match, case-insensitive match or no match at all. This patch only implements different and exact results. 2. xfs_nameops vector for specifying how to perform the hash generation of filenames and comparision methods. In this patch the hash vector points to the existing xfs_da_hashname function and the comparison method does a length compare, and if the same, does a memcmp and return the xfs_dacmp result. All filename functions that use the hash (create, lookup remove, rename, etc) now use the xfs_nameops.hashname function and all directory lookup functions also use the xfs_nameops.compname function. The lookup functions also handle case-insensitive results even though the default comparison function cannot return that. And important aspect of the lookup functions is that an exact match always has precedence over a case-insensitive. So while a case-insensitive match is found, we have to keep looking just in case there is an exact match. In the meantime, the info for the first case-insensitive match is retained if no exact match is found. SGI-PV: 981519 SGI-Modid: xfs-linux-melb:xfs-kern:31205a Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-07-28[XFS]Eric Sandeen
de-duplicate calls to xfs_attr_trace_enter Every call to xfs_attr_trace_enter() shares the exact same 16 args in the middle... just send in the context pointer and let the next level down split it into the ktrace. Compile tested only. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:31200a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] add missing call to xfs_filestream_unmount on xfs_mountfs failureChristoph Hellwig
SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31199a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] rename error2 goto label in xfs_fs_fill_superChristoph Hellwig
SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31198a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] kill calls to xfs_binval in the mount error pathChristoph Hellwig
xfs_binval aka xfs_flush_buftarg is the first thing done in xfs_free_buftarg, so there is no need to have duplicated calls just before xfs_free_buftarg in the mount failure path. SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31197a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] kill xfs_mount_initChristoph Hellwig
xfs_mount_init is inlined into xfs_fs_fill_super and allocation switched to kzalloc. Plug a leak of the mount structure for most early mount failures. Move xfs_icsb_init_counters to as late as possible in the mount path and make sure to undo it so that no stale hotplug cpu notifiers are left around on mount failures. SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31196a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] allow xfs_args_allocate to failChristoph Hellwig
Switch xfs_args_allocate to kzalloc and handle failures. SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31195a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] add xfs_setup_devices helperChristoph Hellwig
Split setting the block and sector size out of xfs_fs_fill_super into a small helper to make xfs_fs_fill_super more readable. SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31194a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] sort out opening and closing of the block devicesChristoph Hellwig
Currently closing the rt/log block device is done in the wrong spot, and far too early. So revampt it: - xfs_blkdev_put moved out of xfs_free_buftarg into the caller so that it is done after tearing down the buftarg completely. - call to xfs_unmountfs_close moved from xfs_mountfs into caller so that it's done after tearing down the filesystem completely. - xfs_unmountfs_close is renamed to xfs_close_devices and made static in xfs_super.c - opening of the block devices is split into a helper xfs_open_devices that is symetric in use to xfs_close_devices - xfs_unmountfs can now lose struct cred - error handling around device opening sanitized in xfs_fs_fill_super SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31193a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] don't call xfs_freesb from xfs_mountfs failure caseChristoph Hellwig
Freeing of the superblock is already handled in the caller, and that is more symmetric with the mount path, too. SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31192a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] merge xfs_mount into xfs_fs_fill_superChristoph Hellwig
xfs_mount is already pretty linux-specific so merge it into xfs_fs_fill_super to allow for a more structured mount code in the next patches. xfs_start_flags and xfs_finish_flags also move to xfs_super.c. SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31189a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] merge xfs_unmount into xfs_fs_put_super / xfs_fs_fill_superChristoph Hellwig
xfs_unmount is small and already pretty Linux specific, so merge it into the callers. The real unmount path is simplified a little by doing a WARN_ON on the xfs_unmount_flush retval directly instead of propagating the error back to the caller, and the mout failure case in simplified significantly by removing the forced shutdown case and all the dmapi events that shouldn't be sent because the dmapi mount event hasn't been sent by that time either. SGI-PV: 981951 SGI-Modid: xfs-linux-melb:xfs-kern:31188a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] kill xfs_igrow_start and xfs_igrow_finishChristoph Hellwig
xfs_igrow_start just expands to xfs_zero_eof with two asserts that are useless in the context of the only caller and some rather confusing comments. xfs_igrow_finish is just a few lines of code decorated again with useless asserts and confusing comments. Just kill those two and merge them into xfs_setattr. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31186a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] merge xfs_mntupdate into xfs_fs_remountChristoph Hellwig
xfs_mntupdate already is completely Linux specific due to the VFS flags passed in, so it might aswell be merged into xfs_fs_remount. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31185a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] kill xfs_uuid_unmountChristoph Hellwig
Quite useless wrapper that doesn't help making the code more readable. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31184a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] Update valid fields in xfs_mount_log_sb()David Chinner
Recent changes to update the version number during mount (attr2 stuff) failed to change the assert that checked for calid flags being changed on mount. Clearly this path hasn't been exercised by the test code.... SGI-PV: 981950 SGI-Modid: xfs-linux-melb:xfs-kern:31183a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] Kill attr_capable checks as already done in xattr_permission.Christoph Hellwig
No need for addition permission checks in the xattr handler, fs/xattr.c:xattr_permission() already does them, and in fact slightly more strict then what was in the attr_capable handlers. SGI-PV: 981809 SGI-Modid: xfs-linux-melb:xfs-kern:31164a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] Convert l_flushsema to a sv_tMatthew Wilcox
The l_flushsema doesn't exactly have completion semantics, nor mutex semantics. It's used as a list of tasks which are waiting to be notified that a flush has completed. It was also being used in a way that was potentially racy, depending on the semaphore implementation. By using a sv_t instead of a semaphore we avoid the need for a separate counter, since we know we just need to wake everything on the queue. Original waitqueue implementation from Matthew Wilcox. Cleanup and conversion to sv_t by Christoph Hellwig. SGI-PV: 981507 SGI-Modid: xfs-linux-melb:xfs-kern:31059a Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] Ensure that 2 GiB xfs logs work properly.Michael Nishimoto
We found this while experimenting with 2GiB xfs logs. The previous code never assumed that xfs logs would ever get so large. SGI-PV: 981502 SGI-Modid: xfs-linux-melb:xfs-kern:31058a Signed-off-by: Michael Nishimoto <miken@agami.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] Remove unused wbc parameter from xfs_start_page_writeback()Denys Vlasenko
SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31057a Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] Remove unused Falgs parameter from xfs_qm_dqpurge()Denys Vlasenko
SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31056a Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] Remove unused arg from kmem_free()Denys Vlasenko
kmem_free() function takes (ptr, size) arguments but doesn't actually use second one. This patch removes size argument from all callsites. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31050a Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] Fix up noattr2 so that it will properly update the versionnum andTim Shimmin
features2 fields. Previously, mounting with noattr2 failed to achieve anything because although it cleared the attr2 mount flag, it would set it again as soon as it processed the superblock fields. The fix now has an explicit noattr2 flag and uses it later to fix up the versionnum and features2 fields. SGI-PV: 980021 SGI-Modid: xfs-linux-melb:xfs-kern:31003a Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28[XFS] Split xfs_dir2_leafn_lookup_int into its two pieces of functionalityBarry Naujok
SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30834a Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-18ext4: Fix small file fragmentationAneesh Kumar K.V
For small file block allocations, mballoc uses per cpu prealloc space. Use goal block when searching for the right prealloc space. Also make sure ext4_da_writepages tries to write all the pages for small files in single attempt Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-08-19ext4: Initialize writeback_index to 0 when allocating a new inodeAneesh Kumar K.V
The write_cache_pages() function uses the mapping->writeback_index as the starting index to write out when range_cyclic is set. Properly initialize writeback_index so that we start the writeout at index 0. This was found when debugging the small file fragmentation on ext4. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-08-19ext4: make sure ext4_has_free_blocks returns 0 for ENOSPCAneesh Kumar K.V
Fix ext4_has_free_blocks() to return 0 when we don't have enough space. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-08-19ext4: journal credit fix for the delayed allocation's writepages() functionMingming Cao
Previous delalloc writepages implementation started a new transaction outside of a loop which called get_block() to do the block allocation. Since we didn't know exactly how many blocks would need to be allocated, the estimated journal credits required was very conservative and caused many issues. With the reworked delayed allocation, a new transaction is created for each get_block(), thus we don't need to guess how many credits for the multiple chunk of allocation. We start every transaction with enough credits for inserting a single exent. When estimate the credits for indirect blocks to allocate a chunk of blocks, we need to know the number of data blocks to allocate. We use the total number of reserved delalloc datablocks; if that is too big, for non-extent files, we need to limit the number of blocks to EXT4_MAX_TRANS_BLOCKS. Code cleanup from Aneesh. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Reviewed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-08-19ext4: Rework the ext4_da_writepages() functionAneesh Kumar K.V
With the below changes we reserve credit needed to insert only one extent resulting from a call to single get_block. This makes sure we don't take too much journal credits during writeout. We also don't limit the pages to write. That means we loop through the dirty pages building largest possible contiguous block request. Then we issue a single get_block request. We may get less block that we requested. If so we would end up not mapping some of the buffer_heads. That means those buffer_heads are still marked delay. Later in the writepage callback via __mpage_writepage we redirty those pages. We should also not limit/throttle wbc->nr_to_write in the filesystem writepages callback. That cause wrong behaviour in generic_sync_sb_inodes caused by wbc->nr_to_write being <= 0 Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-08-19ext4: journal credits reservation fixes for DIO, fallocateMingming Cao
DIO and fallocate credit calculation is different than writepage, as they do start a new journal right for each call to ext4_get_blocks_wrap(). This patch uses the helper function in DIO and fallocate case, passing a flag indicating that the modified data are contigous thus could account less indirect/index blocks. This patch also fixed the journal credit reservation for direct I/O (DIO). Previously the estimated credits for DIO only was calculated for non-extent files, which was not enough if the file is extent-based. Also fixed was fallocate double-counting credits for modifying the the superblock. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-08-19ext4: journal credits reservation fixes for extent file writepageMingming Cao
This patch modified the writepage/write_begin credit calculation for extent files, to use the credits caculation helper function. The current calculation of how many index/leaf blocks should be accounted is too conservetive, it always considered the worse case, where the tree level is 5, and in the case of multiple chunk allocations, it always assumed no blocks were dirtied in common across the allocations. This path uses the accurate depth of the inode with some extras to calculate the index blocks, and also less conservative in the case of multiple allocation accounting. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-08-19ext4: journal credits calulation cleanup and fix for non-extent writepageMingming Cao
When considering how many journal credits are needed for modifying a chunk of data, we need to account for the super block, inode block, quota blocks and xattr block, indirect/index blocks, also, group bitmap and group descriptor blocks for new allocation (including data and indirect/index blocks). There are many places in ext4 do the calculation on their own and often missed one or two meta blocks, and often they assume single block allocation, and did not considering the multile chunk of allocation case. This patch is trying to cleanup current journal credit code, provides some common helper funtion to calculate the journal credits, to be used for writepage, writepages, DIO, fallocate, migration, defrag, and for both nonextent and extent files. This patch modified the writepage/write_begin credit caculation for nonextent files, to use the new helper function. It also fixed the problem that writepage on nonextent files did not consider the case blocksize <pagesize, thus could possibelly need multiple block allocation in a single transaction. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-08-19ext4: Fix bug where we return ENOSPC even though we have plenty of inodesEric Sandeen
The find_group_flex() function starts with best_flex as the parent_fbg_group, which happens to have 0 inodes free. Some of the flex groups searched have free blocks and free inodes, but the flex_freeb_ratio is < 10, so they're skipped. Then when a group is compared to the current "best" flex group, it does not have more free blocks than "best", so it is skipped as well. This continues until no flex group with free inodes is found which has a proper ratio or which has more free blocks than the "best" group, and we're left with a "best" group that has 0 inodes free, and we return -ENOSPC. We fix this by changing the logic so that if the current "best" flex group has no inodes free, and the current one does have room, it is promoted to the next "best." Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-08-19ext4: don't try to resize if there are no reserved gdt blocks leftJosef Bacik
When trying to resize an ext4 fs and you run out of reserved gdt blocks, you get an error that doesn't actually tell you what went wrong, it just says that the gdb it picked is not correct, which is the case since you don't have any reserved gdt blocks left. This patch adds a check to make sure you have reserved gdt blocks to use, and if not prints out a more relevant error. Signed-off-by: Josef Bacik <jbacik@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: Andreas Dilger <adilger@sun.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-08-16ext4: Use ext4_discard_reservations instead of mballoc-specific callTheodore Ts'o
In ext4_ext_truncate(), we should use the more generic ext4_discard_reservations() call so we do the right thing when the filesystem is mounted with the nomballoc option. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Mingming Cao <cmm@us.ibm.com>
2008-08-19ext4: Fix ext4_dx_readdir hash collision handlingTheodore Ts'o
This fixes a bug where readdir() would return a directory entry twice if there was a hash collision in an hash tree indexed directory. Signed-off-by: Eugene Dashevsky <eugene@ibrix.com> Signed-off-by: Mike Snitzer <msnitzer@ibrix.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-08-19ext4: Fix delalloc release block reservation for truncateMingming Cao
Ext4 will release the reserved blocks for delayed allocations when inode is truncated/unlinked. If there is no reserved block at all, we shouldn't need to do so. But current code still tries to release the reserved blocks regardless whether the counters's value is 0. Continue to do that causes the later calculation to go wrong and a kernel BUG_ON() caught that. This doesn't happen for extent-based files, as the calculation for 0 reserved blocks was right for extent based file. This patch fixed the kernel BUG() due to above reason. It adds checks for 0 to avoid unnecessary release and fix calculation for non-extent files. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>