From 5693486bad2bc2ac585a2c24f7e2f3964b478df9 Mon Sep 17 00:00:00 2001 From: Joel Becker Date: Thu, 1 Jul 2010 15:13:31 -0700 Subject: ocfs2: Zero the tail cluster when extending past i_size. ocfs2's allocation unit is the cluster. This can be larger than a block or even a memory page. This means that a file may have many blocks in its last extent that are beyond the block containing i_size. There also may be more unwritten extents after that. When ocfs2 grows a file, it zeros the entire cluster in order to ensure future i_size growth will see cleared blocks. Unfortunately, block_write_full_page() drops the pages past i_size. This means that ocfs2 is actually leaking garbage data into the tail end of that last cluster. This is a bug. We adjust ocfs2_write_begin_nolock() and ocfs2_extend_file() to detect when a write or truncate is past i_size. They will use ocfs2_zero_extend() to ensure the data is properly zeroed. Older versions of ocfs2_zero_extend() simply zeroed every block between i_size and the zeroing position. This presumes three things: 1) There is allocation for all of these blocks. 2) The extents are not unwritten. 3) The extents are not refcounted. (1) and (2) hold true for non-sparse filesystems, which used to be the only users of ocfs2_zero_extend(). (3) is another bug. Since we're now using ocfs2_zero_extend() for sparse filesystems as well, we teach ocfs2_zero_extend() to check every extent between i_size and the zeroing position. If the extent is unwritten, it is ignored. If it is refcounted, it is CoWed. Then it is zeroed. Signed-off-by: Joel Becker Cc: stable@kernel.org --- fs/ocfs2/refcounttree.c | 6 ++++++ 1 file changed, 6 insertions(+) (limited to 'fs/ocfs2/refcounttree.c') diff --git a/fs/ocfs2/refcounttree.c b/fs/ocfs2/refcounttree.c index 4793f36f651..32949df1069 100644 --- a/fs/ocfs2/refcounttree.c +++ b/fs/ocfs2/refcounttree.c @@ -4166,6 +4166,12 @@ static int __ocfs2_reflink(struct dentry *old_dentry, struct inode *inode = old_dentry->d_inode; struct buffer_head *new_bh = NULL; + if (OCFS2_I(inode)->ip_flags & OCFS2_INODE_SYSTEM_FILE) { + ret = -EINVAL; + mlog_errno(ret); + goto out; + } + ret = filemap_fdatawrite(inode->i_mapping); if (ret) { mlog_errno(ret); -- cgit v1.2.3-70-g09d2 From f5e27b6ddfbafdd9c9c2f06bbf28af12581409bc Mon Sep 17 00:00:00 2001 From: Tao Ma Date: Wed, 14 Jul 2010 11:19:32 +0800 Subject: ocfs2: Don't duplicate pages past i_size during CoW. During CoW, the pages after i_size don't contain valid data, so there's no need to read and duplicate them. Signed-off-by: Tao Ma Signed-off-by: Joel Becker --- fs/ocfs2/refcounttree.c | 6 ++++++ 1 file changed, 6 insertions(+) (limited to 'fs/ocfs2/refcounttree.c') diff --git a/fs/ocfs2/refcounttree.c b/fs/ocfs2/refcounttree.c index 32949df1069..3ac5aa733e9 100644 --- a/fs/ocfs2/refcounttree.c +++ b/fs/ocfs2/refcounttree.c @@ -2931,6 +2931,12 @@ static int ocfs2_duplicate_clusters_by_page(handle_t *handle, offset = ((loff_t)cpos) << OCFS2_SB(sb)->s_clustersize_bits; end = offset + (new_len << OCFS2_SB(sb)->s_clustersize_bits); + /* + * We only duplicate pages until we reach the page contains i_size - 1. + * So trim 'end' to i_size. + */ + if (end > i_size_read(context->inode)) + end = i_size_read(context->inode); while (offset < end) { page_index = offset >> PAGE_CACHE_SHIFT; -- cgit v1.2.3-70-g09d2