path: root/fs
Age | Commit message | Author
2011-03-08 | UBI: incorporate LEB offset information | Artem Bityutskiy
Incorporate the LEB offset information into UBIFS. We'll use this information in one of the next patches to figure out what are the max. write size offsets relative to the PEB. So this patch is just a preparation. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-03-08 | UBIFS: incorporate maximum write size | Artem Bityutskiy
Incorporate maximum write size into the UBIFS description data structure. This patch just introduces new 'c->max_write_size' and 'c->max_write_shift' fields as a preparation for the following patches. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-03-08 | block: biovec_slab vs. CONFIG_BLK_DEV_INTEGRITY | Martin K. Petersen
The block integrity subsystem no longer uses the bio_vec slabs so this code can safely be compiled in. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-08 | unfuck proc_sysctl ->d_compare() | Al Viro
a) struct inode is not going to be freed under ->d_compare(); however, the thing PROC_I(inode)->sysctl points to just might. Fortunately, it's enough to make freeing that sucker delayed, provided that we don't step on its ->unregistering, clear the pointer to it in PROC_I(inode) before dropping the reference and check if it's NULL in ->d_compare(). b) I'm not sure that we *can* walk into NULL inode here (we recheck dentry->seq between verifying that it's still hashed / fetching dentry->d_inode and passing it to ->d_compare() and there's no negative hashed dentries in /proc/sys/*), but if we can walk into that, we really should not have ->d_compare() return 0 on it! Said that, I really suspect that this check can be simply killed. Nick? Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-08 | nilfs2: record used amount of each checkpoint in checkpoint list | Ryusuke Konishi
This records the number of used blocks per checkpoint in each checkpoint entry of cpfile. Even though userland tools can get the block count via nilfs_get_cpinfo ioctl, it was not updated by the nilfs2 kernel code. This fixes the issue and makes it available for userland tools to calculate used amount per checkpoint. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Jiro SEKIBA <jir@unicus.jp>
2011-03-08 | nilfs2: optimize rec_len functions | Ryusuke Konishi
This is a similar change to those in ext2/ext3 codebase (commit 40a063f6691ce937 and a4ae3094869f18e2, respectively). The addition of 64k block capability in the rec_len_from_disk and rec_len_to_disk functions added a bit of math overhead which slows down file create workloads needlessly when the architecture cannot even support 64k blocks. This will cut the corner. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
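The idea in the message above can be sketched in plain C. This is a minimal userspace illustration, not the nilfs2 code: SK_PAGE_SIZE, SK_MAX_REC_LEN and both function names are hypothetical stand-ins, and endianness conversion is left out. The 64k special case is compiled in only when the page size could actually hold a 64k block.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch; not the kernel's definitions. */
#ifndef SK_PAGE_SIZE
#define SK_PAGE_SIZE 4096
#endif
#define SK_MAX_REC_LEN ((1u << 16) - 1)

/* Decode an on-disk 16-bit record length (endianness ignored here). */
static unsigned sk_rec_len_from_disk(uint16_t dlen)
{
	unsigned len = dlen;
#if SK_PAGE_SIZE >= 65536
	/* Only large-page builds pay for the 64k special case. */
	if (len == SK_MAX_REC_LEN)
		return 1u << 16;
#endif
	return len;
}

/* Encode a record length into its 16-bit on-disk form. */
static uint16_t sk_rec_len_to_disk(unsigned len)
{
#if SK_PAGE_SIZE >= 65536
	if (len == (1u << 16))
		return (uint16_t)SK_MAX_REC_LEN;
#endif
	assert(len < (1u << 16));
	return (uint16_t)len;
}
```

With the default 4096-byte SK_PAGE_SIZE both functions reduce to a plain conversion, which is the saving the message describes.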
2011-03-08 | nilfs2: append blocksize info to warnings during loading super blocks | Ryusuke Konishi
At present, the same warning message can be output twice when nilfs detected a problem on super blocks:
    NILFS warning: broken superblock. using spare superblock.
    NILFS warning: broken superblock. using spare superblock.
    ...
This is because these super blocks are reloaded with the block size written in a super block if it differs from the first block size, but this repetition looks somewhat confusing. So, we hint at what is going on by appending block size information to those messages. Reported-by: Wakko Warner <wakko@animx.eu.org> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-08 | nilfs2: add compat ioctl | Ryusuke Konishi
The current FS_IOC_GETFLAGS/SETFLAGS/GETVERSION will fail if application is 32 bit and kernel is 64 bit. This issue is avoidable by adding compat_ioctl method. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-08 | nilfs2: implement FS_IOC_GETFLAGS/SETFLAGS/GETVERSION | Ryusuke Konishi
Add support for the standard attributes set via chattr and read via lsattr. These attributes are already in the flags value in the nilfs2 inode, but currently we don't have any ioctl commands that expose them to the userland. Collaterally, this adds the FS_IOC_GETVERSION ioctl for getting i_generation, which allows users to list the file's generation number with "lsattr -v". Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-08 | nilfs2: tighten restrictions on inode flags | Ryusuke Konishi
Nilfs has few restrictions on which flags may be set on which inodes, as the ext2/3/4 filesystems used to. Specifically, DIRSYNC may only be set on directories, and IMMUTABLE and APPEND may not be set on links. Tighten that to disallow TOPDIR being set on non-directories and to allow only NODUMP and NOATIME to be set on inodes that are neither regular files nor directories. This introduces a flags masking function like those of extN and uses it during inode creation. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-08 | nilfs2: mark S_NOATIME on inodes only if NOATIME attribute is set | Ryusuke Konishi
At present, nilfs marks S_NOATIME flag on all inodes. This restricts nilfs_set_inode_flags function so that it marks S_NOATIME only if a given inode has an FS_NOATIME_FL flag. Although nilfs does not support atime yet, touch_atime() still safely returns on IS_NOATIME check since MS_NOATIME is always set on sb. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-08 | nilfs2: use common file attribute macros | Ryusuke Konishi
Replaces uses of own inode flags (i.e. NILFS_SECRM_FL, NILFS_UNRM_FL, NILFS_COMPR_FL, and so forth) with common inode flags, and removes the own flag declarations. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-08 | nilfs2: add free entries count only if clear bit operation succeeded | Ryusuke Konishi
Three functions of the current persistent object allocator, nilfs_palloc_commit_free_entry, nilfs_palloc_abort_alloc_entry, and nilfs_palloc_freev functions unconditionally add a counter after doing clear bit operation on a bitmap block. If the clear bit operation overlapped, the counter will not add up. This fixes the issue by making the counter operations conditional. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
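The conditional update described above can be illustrated with a small userspace sketch. The struct, the helper names, and the 32-bit bitmap are assumptions made for the example. The real allocator works on bitmap blocks and group descriptors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical group state: one bitmap word plus a free-entry counter. */
struct sk_group {
	uint32_t bitmap;  /* bit set = entry allocated */
	unsigned nfrees;  /* number of free entries */
};

/* Clear bit nr and report whether it was set beforehand. */
static bool sk_test_and_clear_bit(unsigned nr, uint32_t *map)
{
	uint32_t mask;
	bool was_set;

	assert(nr < 32);
	mask = UINT32_C(1) << nr;
	was_set = (*map & mask) != 0;
	*map &= ~mask;
	return was_set;
}

/* Free an entry; bump the counter only if the bit really was set. */
static bool sk_free_entry(struct sk_group *g, unsigned nr)
{
	if (!sk_test_and_clear_bit(nr, &g->bitmap))
		return false;  /* already free: leave the counter alone */
	g->nfrees++;
	return true;
}
```

Freeing the same entry twice then leaves the counter unchanged the second time, so the count stays consistent with the bitmap.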
2011-03-08 | nilfs2: decrement inodes count only if raw inode was successfully deleted | Ryusuke Konishi
This fixes the issue that the inodes count does not add up after removal of a raw inode fails. Hence, this prevents a possible underflow of the inodes count. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-08 | Merge branch 'master' of git://git.infradead.org/users/eparis/selinux into next | James Morris
2011-03-08 | Merge branch 'master'; commit 'v2.6.38-rc7' into next | James Morris
2011-03-07 | nfsd41: modify the members value of nfsd4_op_flags | Mi Jinlong
In nfsd4_op_flags, (ALLOWED_WITHOUT_FH | ALLOWED_ON_ABSENT_FS) equals ALLOWED_AS_FIRST_OP, which is probably not what we want. OP_PUTROOTFH, with op_flags = ALLOWED_WITHOUT_FH | ALLOWED_ON_ABSENT_FS, can't appear as the first operation without SEQUENCE ops. This patch fixes the wrong values of ALLOWED_WITHOUT_FH etc., which were introduced by f9bb94c4. Cc: stable@kernel.org Reviewed-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
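The flag collision described above is easy to reproduce in isolation. In the sketch below the "old" values 1, 2 and 3 are an assumption. They are consistent with the statement that the OR of the first two flags equals the third, but they are not taken from the patch. The SK_* names and the helper are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed pre-fix values: the third is not a separate bit. */
enum sk_old_flags {
	SK_OLD_WITHOUT_FH   = 1,
	SK_OLD_ON_ABSENT_FS = 2,
	SK_OLD_AS_FIRST_OP  = 3,
};

/* One distinct bit per flag, so no combination aliases another flag. */
enum sk_new_flags {
	SK_NEW_WITHOUT_FH   = 1 << 0,
	SK_NEW_ON_ABSENT_FS = 1 << 1,
	SK_NEW_AS_FIRST_OP  = 1 << 2,
};

/* True if every bit of `flag` is present in `op_flags`. */
static bool sk_has_flag(unsigned op_flags, unsigned flag)
{
	return (op_flags & flag) == flag;
}

/* Sanity check: the new values must be pairwise disjoint bits. */
static void sk_check_new_flags(void)
{
	assert((SK_NEW_WITHOUT_FH & SK_NEW_ON_ABSENT_FS) == 0);
	assert(((SK_NEW_WITHOUT_FH | SK_NEW_ON_ABSENT_FS) & SK_NEW_AS_FIRST_OP) == 0);
}
```

With the assumed old values, an operation flagged WITHOUT_FH | ON_ABSENT_FS also passes the AS_FIRST_OP test. With one bit per flag it no longer does.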
2011-03-07 | nfsd: add proc file listing kernel's gss_krb5 enctypes | Kevin Coffman
Add a new proc file which lists the encryption types supported by the kernel's gss_krb5 code. Newer MIT Kerberos libraries support the assertion of acceptor subkeys. This enctype information allows user-land (svcgssd) to request that the Kerberos libraries limit the encryption types that it uses when generating the subkeys. Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-03-07 | NFSD, VFS: Remove dead code in nfsd_rename() | Jesper Juhl
Currently we have the following code in fs/nfsd/vfs.c::nfsd_rename():
    ...
    host_err = nfsd_break_lease(odentry->d_inode);
    if (host_err)
        goto out_drop_write;
    if (ndentry->d_inode) {
        host_err = nfsd_break_lease(ndentry->d_inode);
        if (host_err)
            goto out_drop_write;
    }
    if (host_err)
        goto out_drop_write;
    ...
'host_err' is guaranteed to be 0 by the time we test 'ndentry->d_inode'. If 'host_err' becomes != 0 inside the 'if' statement, then we goto 'out_drop_write'. So, after the 'if' statement there is no way that 'host_err' can be anything but 0, so the test afterwards is just dead code. This patch removes the dead code. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-03-07 | nfsd: kill unused macro definition | Shan Wei
These macros have not been used for several years. So, remove them. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-03-07 | locks: use assign_type() | Namhyung Kim
Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-03-07 | nfsd4: fix bad pointer on failure to find delegation | J. Bruce Fields
In case of a nonempty list, the return on error here is obviously bogus; it ends up being a pointer to the list head instead of to any valid delegation on the list. In particular, if nfsd4_delegreturn() hits this case, and you're quite unlucky, then renew_client may oops, and it may take an embarrassingly long time to figure out why. Facepalm.
    BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
    IP: [<ffffffff81292965>] nfsd4_delegreturn+0x125/0x200
    ...
Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
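The pitfall described above is generic to circular lists with an embedded head: when the loop finds nothing, the cursor ends up derived from the head, which is not a real entry. A minimal userspace sketch of the safe pattern follows. The list type, the sk_deleg structure and the lookup key are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical circular singly linked list with an embedded head. */
struct sk_list {
	struct sk_list *next;
};

struct sk_deleg {
	int id;
	struct sk_list node;
};

/* Recover the containing sk_deleg from its embedded list node. */
#define sk_entry(ptr) \
	((struct sk_deleg *)((char *)(ptr) - offsetof(struct sk_deleg, node)))

/*
 * Return the delegation with the given id, or NULL if there is none.
 * Returning the loop cursor after an unsuccessful walk would instead
 * hand back a pointer computed from the list head.
 */
static struct sk_deleg *sk_find_deleg(struct sk_list *head, int id)
{
	struct sk_list *pos;

	assert(head != NULL);
	for (pos = head->next; pos != head; pos = pos->next) {
		struct sk_deleg *dp = sk_entry(pos);

		if (dp->id == id)
			return dp;
	}
	return NULL;
}
```

Returning from inside the loop on a match, and NULL after it, means the caller never sees a pointer that only looks like an entry.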
2011-03-07 | ext3: Always set dx_node's fake_dirent explicitly. | Eric Sandeen
(crossport of 1f7bebb9e911d870fa8f997ddff838e82b5715ea by Andreas Schlick <schlick@lavabit.com>) When ext3_dx_add_entry() has to split an index node, it has to ensure that name_len of dx_node's fake_dirent is also zero, because otherwise e2fsck won't recognise it as an intermediate htree node and consider the htree to be corrupted. CC: stable@kernel.org Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>
2011-03-07 | Btrfs: deal with short returns from copy_from_user | Chris Mason
When copy_from_user is only able to copy some of the bytes we requested, we may end up creating a partially up to date page. To avoid garbage in the page, we need to treat a partial copy as a zero length copy. This makes the rest of the file_write code drop the page and retry the whole copy instead of marking the partially up to date page as dirty. Signed-off-by: Chris Mason <chris.mason@oracle.com> cc: stable@kernel.org
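The rule described above, where a partial copy counts as no copy at all, can be written as a tiny helper. This is a sketch under assumptions. The function name is hypothetical. The page_uptodate parameter is an added refinement that the message does not mention, assuming a short copy is harmless if the page already held valid data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Decide how many copied bytes the write path may keep.
 * A short copy into a page that was not already up to date would leave
 * stale bytes next to new ones, so it is reported as zero bytes and the
 * caller is expected to drop the page and retry the whole copy.
 */
static size_t sk_usable_copy(size_t copied, size_t requested,
			     bool page_uptodate)
{
	assert(copied <= requested);
	if (!page_uptodate && copied < requested)
		return 0;
	return copied;
}
```

Reporting zero rather than the partial length lets the existing retry path handle the case without a separate code path for half-filled pages.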
2011-03-07 | Btrfs: fix regressions in copy_from_user handling | Chris Mason
Commit 914ee295af418e936ec20a08c1663eaabe4cd07a fixed deadlocks in btrfs_file_write where we would catch page faults on pages we had locked. But, there were a few problems:
1) The x86-32 iov_iter_copy_from_user_atomic code always fails to copy data when the amount to copy is more than 4K and the offset to start copying from is not page aligned. The result was btrfs_file_write looping forever retrying the iov_iter_copy_from_user_atomic. We deal with this by changing btrfs_file_write to drop down to single page copies when iov_iter_copy_from_user_atomic starts returning failure.
2) The btrfs_file_write code was leaking delalloc reservations when iov_iter_copy_from_user_atomic returned zero. The looping above would result in the entire filesystem running out of delalloc reservations and constantly trying to flush things to disk.
3) btrfs_file_write will lock down page cache pages, make sure any writeback is finished, do the copy_from_user and then release them. Before the loop runs we check the first and last pages in the write to see if they are only being partially modified. If the start or end of the write isn't aligned, we make sure the corresponding pages are up to date so that we don't introduce garbage into the file. With the copy_from_user changes, we're allowing the VM to reclaim the pages after a partial update from copy_from_user, but we're not making sure the page cache page is up to date when we loop around to resume the write. We deal with this by pushing the up to date checks down into the page prep code. This fits better with how the rest of file_write works.
Signed-off-by: Chris Mason <chris.mason@oracle.com> Reported-by: Mitch Harder <mitch.harder@sabayonlinux.org> cc: stable@kernel.org
2011-03-07 | xfs: kill support/debug.[ch] | Dave Chinner
The remaining functionality in debug.[ch] is effectively just assert handling, conditional debug definitions and hex dumping. The hex dumping and assert function can be moved into the new printk module, while the rest can be moved into top-level header files. This allows fs/xfs/support/debug.[ch] to be completely removed from the codebase. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2011-03-07 | xfs: Convert remaining cmn_err() callers to new API | Dave Chinner
Once converted, kill the remainder of the cmn_err() interface. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2011-03-07 | xfs: convert the quota debug prints to new API | Dave Chinner
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2011-03-07 | xfs: rename xfs_cmn_err_fsblock_zero() | Dave Chinner
The "cmn_err" part of the function name is no longer relevant. Rename the function to xfs_alert_fsblock_zero() to match the new logging API. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2011-03-07 | xfs: convert xfs_fs_cmn_err to new error logging API | Dave Chinner
Continue to clean up the error logging code by converting all the callers of xfs_fs_cmn_err() to the new API. Once done, remove the unused old API function. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2011-03-07 | xfs: kill xfs_fs_mount_cmn_err() macro | Dave Chinner
The xfs_fs_mount_cmn_err() hides a simple check as to whether the mount path should output an error or not. Remove the macro and open code the check. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2011-03-07 | xfs: kill xfs_fs_repair_cmn_err() macro | Dave Chinner
In certain cases of inode corruption, the xfs_fs_repair_cmn_err() macro is used to output an extra message in the corruption report. That extra message is "unmount and run xfs_repair", which really applies to any corruption report. In each case where this macro is called (except one), a following call to xfs_corruption_error() is made to optionally dump more information about the error. Hence, move the output of "run xfs_repair" to xfs_corruption_error() so that it is output on all corruption reports. Also, convert the callers of the repair macro that don't call xfs_corruption_error() to call it, and hence provide consistent error reporting for all cases where xfs_fs_repair_cmn_err() used to be called. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2011-03-07 | xfs: convert xfs_cmn_err to xfs_alert_tag | Dave Chinner
Continue the conversion of the old cmn_err interface by converting all the conditional panic tag errors to xfs_alert_tag() and then removing xfs_cmn_err(). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2011-03-07 | xfs: Convert xlog_warn to new logging interface | Dave Chinner
Convert the xfs log operations to use the new error logging interfaces. This removes the xlog_{warn,panic} wrappers and makes almost all errors emit the device they belong to instead of just referring to "XFS". Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2011-03-07 | xfs: Convert linux-2.6/ files to new logging interface | Dave Chinner
Convert the files in fs/xfs/linux-2.6/ to use the new xfs_<level> logging format that replaces the old Irix inherited cmn_err() interfaces. While there, also convert naked printk calls to use the relevant xfs logging function to standardise output format. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2011-03-05 | omfs: make readdir stop when filldir says so | Al Viro
filldir returning an error does *not* mean "skip this entry, try the next one"... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Bob Copeland <me@bobcopeland.com>
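The contract described above can be shown with a callback-driven loop: a non-zero return from the fill callback ends the walk, and the position is reported so a later call can resume there. The callback type, the sample callback and the array-of-names directory are hypothetical simplifications.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fill callback: returns 0 on success, non-zero to stop. */
typedef int (*sk_filldir_t)(void *ctx, const char *name);

/*
 * Emit directory entries until the callback refuses one.
 * Returns the index of the first entry that was NOT emitted, which is
 * where a later call should resume. Refused entries are never skipped.
 */
static size_t sk_readdir(const char *const *names, size_t n,
			 sk_filldir_t fill, void *ctx)
{
	size_t i;

	assert(fill != NULL);
	for (i = 0; i < n; i++) {
		if (fill(ctx, names[i]) != 0)
			break;  /* stop here; do not try the next entry */
	}
	return i;
}

/* Sample callback: accepts entries while the caller's buffer has room. */
static int sk_fill_limited(void *ctx, const char *name)
{
	int *room = ctx;

	(void)name;
	if (*room == 0)
		return -1;
	(*room)--;
	return 0;
}
```

If the loop continued past a refused entry, that entry would be silently lost from the listing, which is the behaviour the fix removes.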
2011-03-05 | omfs: merge unlink() and rmdir(), close leak in rename() | Al Viro
In case of directory-overwriting rename(), omfs forgot to mark the victim doomed, so omfs_evict_inode() didn't free it. We could fix that by calling omfs_rmdir() for directory victims instead of doing omfs_unlink(), but it's easier to merge omfs_unlink() and omfs_rmdir() instead. Note that we have no hardlinks here. It also makes the checks in omfs_rename() go away - they fold into what omfs_remove() does when it runs into a directory. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Bob Copeland <me@bobcopeland.com>
2011-03-05 | omfs: stop playing silly buggers with omfs_unlink() in ->rename() | Al Viro
Since omfs directories are hashes of inodes and name is part of inode, we have to remove inode from old directory before we can put it into new one / under new name. So instead of
    bump i_nlink
    call omfs_unlink, which does
        omfs_delete_entry()
        decrement i_nlink and mark parent dirty in case of success
    decrement i_nlink if omfs_unlink failed and hadn't done it itself
let's just call omfs_delete_entry() and dirty the parent ourselves... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Bob Copeland <me@bobcopeland.com>
2011-03-05 | omfs: rename() needs to mark old_inode dirty after ctime update | Al Viro
we *do* mark it dirty before, but it doesn't guarantee that we don't get preempted just before assignment to ->i_ctime, with inode getting written out before we get CPU back... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Bob Copeland <me@bobcopeland.com>
2011-03-05 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
    ceph: no .snap inside of snapped namespace
    libceph: fix msgr standby handling
    libceph: fix msgr keepalive flag
    libceph: fix msgr backoff
    libceph: retry after authorization failure
    libceph: fix handling of short returns from get_user_pages
    ceph: do not clear I_COMPLETE from d_release
    ceph: do not set I_COMPLETE
    Revert "ceph: keep reference to parent inode on ceph_dentry"
2011-03-05 | ext4: Use single thread to perform DIO unwritten conversion | Mingming Cao
While running ext4 tests on a multi-core machine, we found that there are per-CPU ext4-dio-unwritten threads processing the conversion of unwritten extents to written for IOs completed from the async direct IO path. One thread per filesystem is enough; we don't need per-CPU threads to work on the conversion. Signed-off-by: Mingming Cao <cmm@us.ibm.com>
2011-03-05 | fs/locks.c: Remove stale FIXME left over from BKL conversion | Matt Fleming
The comment is no longer true as (now that the BKL conversion is finished) a spinlock _is_ now used to protect file_lock_list, blocked_list and inode->i_flock. Signed-off-by: Matt Fleming <matt.fleming@linux.intel.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2011-03-04 | nfs4: Ensure that ACL pages sent over NFS were not allocated from the slab (v3) | Neil Horman
The "bad_page()" page allocator sanity check was reported recently (call chain as follows):
    bad_page+0x69/0x91
    free_hot_cold_page+0x81/0x144
    skb_release_data+0x5f/0x98
    __kfree_skb+0x11/0x1a
    tcp_ack+0x6a3/0x1868
    tcp_rcv_established+0x7a6/0x8b9
    tcp_v4_do_rcv+0x2a/0x2fa
    tcp_v4_rcv+0x9a2/0x9f6
    do_timer+0x2df/0x52c
    ip_local_deliver+0x19d/0x263
    ip_rcv+0x539/0x57c
    netif_receive_skb+0x470/0x49f
    :virtio_net:virtnet_poll+0x46b/0x5c5
    net_rx_action+0xac/0x1b3
    __do_softirq+0x89/0x133
    call_softirq+0x1c/0x28
    do_softirq+0x2c/0x7d
    do_IRQ+0xec/0xf5
    default_idle+0x0/0x50
    ret_from_intr+0x0/0xa
    default_idle+0x29/0x50
    cpu_idle+0x95/0xb8
    start_kernel+0x220/0x225
    _sinittext+0x22f/0x236
It occurs because an skb with a fraglist was freed from the tcp retransmit queue when it was acked, but a page on that fraglist had PG_Slab set (indicating it was allocated from the Slab allocator), which means the free path above can't safely free it via put_page. We tracked this back to an nfsv4 setacl operation, in which the nfs code attempted to convert the passed in buffer to an array of pages in __nfs4_proc_set_acl, which gets used by the skb->frags list in xs_sendpages. __nfs4_proc_set_acl just converts each page in the buffer to a page struct via virt_to_page, but the vfs allocates the buffer via kmalloc, meaning the PG_slab bit is set. We can't create a buffer with kmalloc and free it later in the tcp ack path with put_page, so we need to either:
1) ensure that when we create the list of pages, no page struct has PG_Slab set, or
2) not use a page list to send this data.
Given that these buffers can be multiple pages and arbitrarily sized, I think (1) is the right way to go. I've written the below patch to allocate a page from the buddy allocator directly and copy the data over to it. This ensures that we have a put_page free-able page for every entry that winds up on an skb frag list, so it can be safely freed when the frame is acked.
We do a put page on each entry after the rpc_call_sync call so as to drop our own reference count to the page, leaving only the ref count taken by tcp_sendpages. This way the data will be properly freed when the ack comes in. Successfully tested by myself to solve the above oops. Note, as this is the result of a setacl operation that exceeded a page of data, I think this amounts to a local DOS triggerable by an unprivileged user, so I'm CCing security on this as well. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Trond Myklebust <Trond.Myklebust@netapp.com> CC: security@kernel.org CC: Jeff Layton <jlayton@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-04 | ceph: no .snap inside of snapped namespace | Sage Weil
Otherwise you can do things like
    # mkdir .snap/foo
    # cd .snap/foo/.snap
    # ls
    <badness>
Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-04 | minimal fix for do_filp_open() race | Al Viro
failure exits on the no-O_CREAT side of do_filp_open() merge with those of O_CREAT one; unfortunately, if do_path_lookup() returns -ESTALE, we'll get to out_filp:, notice that we are about to return -ESTALE without having tried to create the sucker with LOOKUP_REVAL and jump right into the O_CREAT side of code. And proceed to try and create a file. Usually that'll fail with -ESTALE again, but we can race and get that attempt of pathname resolution to succeed. open() without O_CREAT really shouldn't end up creating files, races or not. The real fix is to rearchitect the whole do_filp_open(), but for now splitting the failure exits will do. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-03 | Merge branch 'i_nlink' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 | Linus Torvalds
* 'i_nlink' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
    hfs: fix rename() over non-empty directory
    udf: fix i_nlink limit
    fix reiserfs mkdir() breakage
    exofs: i_nlink races in rename()
    nilfs2: i_nlink races in rename()
    minix: i_nlink races in rename()
    ufs: i_nlink races in rename()
    sysv: i_nlink races in rename()
2011-03-04 | ext3: Fix an overflow in ext3_trim_fs. | Tao Ma
In a bs=4096 volume, if we call FITRIM with the following parameter as fstrim_range(start = 102400, len = 134144000, minlen = 10240), with the following code:
    if (len >= EXT3_BLOCKS_PER_GROUP(sb))
        len -= (EXT3_BLOCKS_PER_GROUP(sb) - first_block);
    else
        last_block = first_block + len;
So if len < EXT3_BLOCKS_PER_GROUP while first_block + len > EXT3_BLOCKS_PER_GROUP, last_block will be set to an overflow value which exceeds EXT3_BLOCKS_PER_GROUP. This patch fixes it and adjusts len and last_block accordingly. Cc: Lukas Czerner <lczerner@redhat.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: Jan Kara <jack@suse.cz>
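The overflow described above can be worked through with the numbers from the message. With 4096-byte blocks, start = 102400 bytes is block 25 and len = 134144000 bytes is 32750 blocks. This sketch also assumes 32768 blocks per group, so 25 + 32750 = 32775 runs past the end of the group even though len is smaller than the group size. The following is a sketch of one way to clamp the range, not the code of the patch. The struct and function names are hypothetical, and the end of the span is exclusive here.

```c
#include <assert.h>

/* Hypothetical result: the span to trim in this group, and what is left. */
struct sk_span {
	unsigned long end_block;  /* exclusive end within the group */
	unsigned long remaining;  /* blocks left for the following groups */
};

/* Clamp [first_block, first_block + len) to a single block group. */
static struct sk_span sk_group_span(unsigned long first_block,
				    unsigned long len,
				    unsigned long blocks_per_group)
{
	struct sk_span s;
	unsigned long room, take;

	assert(first_block < blocks_per_group);
	room = blocks_per_group - first_block;  /* blocks left in this group */
	take = len < room ? len : room;         /* never run past the group */
	s.end_block = first_block + take;
	s.remaining = len - take;
	return s;
}
```

For the values above this gives an end block of 32768 with 7 blocks carried over to the next group, rather than an end block of 32775.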
2011-03-03 | LSM: Pass -o remount options to the LSM | Eric Paris
The VFS mount code passes the mount options to the LSM. The LSM will remove options it understands from the data and the VFS will then pass the remaining options onto the underlying filesystem. This is how options like the SELinux context= work. The problem comes in that -o remount never calls into LSM code. So if you include an LSM specific option it will get passed to the filesystem and will cause the remount to fail. An example of where this is a problem is the 'seclabel' option. The SELinux LSM hook will print this word in /proc/mounts if the filesystem is being labeled using xattrs. If you pass this word on mount it will be silently stripped and ignored. But if you pass this word on remount the LSM never gets called and it will be passed to the FS. The FS doesn't know what seclabel means and thus should fail the mount. For example an ext3 fs mounted over loop:
    # mount -o loop /tmp/fs /mnt/tmp
    # cat /proc/mounts | grep /mnt/tmp
    /dev/loop0 /mnt/tmp ext3 rw,seclabel,relatime,errors=continue,barrier=0,data=ordered 0 0
    # mount -o remount /mnt/tmp
    mount: /mnt/tmp not mounted already, or bad option
    # dmesg
    EXT3-fs (loop0): error: unrecognized mount option "seclabel" or missing value
This patch passes the remount mount options to a new LSM hook. Signed-off-by: Eric Paris <eparis@redhat.com> Reviewed-by: James Morris <jmorris@namei.org>
2011-03-03 | Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs | Linus Torvalds
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
    xfs: zero proper structure size for geometry calls
2011-03-03 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
    nilfs2: fix regression that i-flag is not set on changeless checkpoints