summaryrefslogtreecommitdiffstats
path: root/fs/btrfs/ordered-data.h
AgeCommit message (Collapse)Author
2008-10-03Btrfs: O_DIRECT writes via buffered writes + invaldiateChris Mason
This reworks the btrfs O_DIRECT write code a bit. It had always fallen back to buffered IO and done an invalidate, but needed to be updated for the data=ordered code. The invalidate wasn't actually removing pages because they were still inside an ordered extent. This also combines the O_DIRECT/O_SYNC paths where possible, and kicks off IO in the main btrfs_file_write loop to keep the pipe down the the disk full as we process long writes. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Fix nodatacow for the new data=ordered modeYan Zheng
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Fix the defragmention code and the block relocation code for data=orderedChris Mason
Before setting an extent to delalloc, the code needs to wait for pending ordered extents. Also, the relocation code needs to wait for ordered IO before scanning the block group again. This is because the extents are not removed until the IO for the new extents is finished Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Fix 32 bit compiles by using an unsigned long byte count in the ↵Chris Mason
ordered extent The ordered extents have to fit in memory, so an unsigned long is sufficient. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Take the csum mutex while reading checksumsChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Fix some data=ordered related data corruptionsChris Mason
Stress testing was showing data checksum errors, most of which were caused by a lookup bug in the extent_map tree. The tree was caching the last pointer returned, and searches would check the last pointer first. But, search callers also expect the search to return the very first matching extent in the range, which wasn't always true with the last pointer usage. For now, the code to cache the last return value is just removed. It is easy to fix, but I think lookups are rare enough that it isn't required anymore. This commit also replaces do_sync_mapping_range with a local copy of the related functions. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Handle data checksumming on bios that span multiple ordered extentsChris Mason
Data checksumming is done right before the bio is sent down the IO stack, which means a single bio might span more than one ordered extent. In this case, the checksumming data is split between two ordered extents. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Cleanup and comment ordered-data.cChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Don't pin pages in ram until the entire ordered extent is on disk.Chris Mason
Checksum items are not inserted until the entire ordered extent is on disk, but individual pages might be clean and available for reclaim long before the whole extent is on disk. In order to allow those pages to be freed, we need to be able to search the list of ordered extents to find the checksum that is going to be inserted in the tree. This way if the page needs to be read back in before the checksums are in the btree, we'll be able to verify the checksum on the page. This commit adds the ability to search the pending ordered extents for a given offset in the file, and changes btrfs_releasepage to allow ordered pages to be freed. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Update on disk i_size only after pending ordered extents are doneChris Mason
This changes the ordered data code to update i_size after the extent is on disk. An on disk i_size is maintained in the in-memory btrfs inode structures, and this is updated as extents finish. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: New data=ordered implementationChris Mason
The old data=ordered code would force commit to wait until all the data extents from the transaction were fully on disk. This introduced large latencies into the commit and stalled new writers in the transaction for a long time. The new code changes the way data allocations and extents work: * When delayed allocation is filled, data extents are reserved, and the extent bit EXTENT_ORDERED is set on the entire range of the extent. A struct btrfs_ordered_extent is allocated an inserted into a per-inode rbtree to track the pending extents. * As each page is written EXTENT_ORDERED is cleared on the bytes corresponding to that page. * When all of the bytes corresponding to a single struct btrfs_ordered_extent are written, The previously reserved extent is inserted into the FS btree and into the extent allocation trees. The checksums for the file data are also updated. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Fix btrfs_del_ordered_inode to allow forcing the drop during unlinksChris Mason
This allows us to delete an unlinked inode with dirty pages from the list instead of forcing commit to write these out before deleting the inode. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25btrfs delete ordered inode handling fixMingming
Use btrfs_release_file instead of a put_inode call Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Throttle file_write when data=ordered is flushing the inodeChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Fix data=ordered vs wait_on_inode deadlock on older kernelsChris Mason
Using ilookup5 during data=ordered writeback could deadlock on I_LOCK. This saves a pointer to the inode instead. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Rework btrfs_drop_inode to avoid schedulingChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Add data=ordered supportChris Mason
This forces file data extents down the disk along with the metadata that references them. The current implementation is fairly simple, and just writes out all of the dirty pages in an inode before the commit. Signed-off-by: Chris Mason <chris.mason@oracle.com>