asmadeus/linux.git - The linux kernel

Age	Commit message (Collapse)	Author
2007-10-10	[GFS2] Fix quota do_list operation hang	Abhijith Das
	This is the filesystem part of the patches to fix this bz. There are additional userland patches (gfs2_quota, libgfs2) for the complete solution. This patch adds a new field qu_ll_next to the gfs2_quota structure. This field allows us to create linked lists of quotas in the ondisk quota inode. Instead of scanning through the entire sparse quota file for valid quotas, we can now simply walk through the user and group quota linked lists to perform the do_list operation. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] fixed a NULL pointer assignment BUG	Denis Cheng
	Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] Force unstuff of hidden quota inode	Abhijith Das
	This patch forcibly unstuffs (if stuffed) the hidden quota inode at the first availble opportunity. In any practical scenario the quota inode won't be stuffed, so this is ok to do. Unstuffing the quota inode allows us to ignore the case of a stuffed quota inode in gfs2_adjust_quota(). Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] better code for translating characters	Denis Cheng
	the original code could work, but I think this code could work better. Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] unneeded typecast	Denis Cheng
	sb->s_fs_info is a void pointer, thus the type cast is not needed. Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] use list_for_each_entry instead	Denis Cheng
	Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] Ensure journal file cache is flushed after recovery	Bob Peterson
	This is for bugzilla bug #248176: GFS2: invalid metadata block Patches 1 thru 3 were accepted upstream, but there were problems with 4 and 5. Those issues have been resolved and now the recovery tests are passing without errors. This code has gone through 41 * 3 successful gfs2 recovery tests before it hit an unrelated (openais) problem. I'm continuing to test it. This is a complete rewrite of patch 5 for bug #248176, written by Steve Whitehouse. This is referred to in the bugzilla record as "new 6" and "a different solution". The problem was that the journal inodes, although protected by a glock, were not synched with the other nodes because they don't use the inode glock synch operations (i.e. no "glops" were defined). Therefore, journal recovery on a journal-recovering node were causing the blocks to get out of sync with the node that was actually trying to use that journal as it comes back up from a reboot. There are two possible solutions: (1) To make the journals use the normal inode glock sync operations, or (2) To make the journal operations take effect immediately (i.e. no caching). Although option 1 works, it turns out to be a lot more code. Steve opted for option 2, which is much simpler and therefore less prone to regression errors. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> --
2007-10-10	[GFS2] invalid metadata block - REVISED	Bob Peterson
	This is for bugzilla bug #248176: GFS2: invalid metadata block Patches 1 thru 3 were accepted upstream, but there were problems with 4 and 5. Those issues have been resolved and now the recovery tests are passing without errors. This code has gone through 41 * 3 successful gfs2 recovery tests before it hit an unrelated (openais) problem. This is a complete rewrite of patch 4 for bug #248176. Part of the problem was that inodes were being recycled before their buffers were flushed to the journal logs. Another problem was that the clone bitmaps were being searched for deleted inodes to recycle, but only the "real" bitmaps should be searched for that purpose. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] Reduce number of gfs2_scand processes to one	Steven Whitehouse
	We only need a single gfs2_scand process rather than the one per filesystem which we had previously. As a result the parameter determining the frequency of gfs2_scand runs becomes a module parameter rather than a mount parameter as it was before. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] use the declaration of gfs2_dops in the header file instead	Denis Cheng
	Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] mark struct *_operations const	Denis Cheng
	these struct *_operations are all method tables, thus should be const. Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] Detach buf data during in-place writeback	Bob Peterson
	This is patch 5 of 5 for bug #248176 Metadata corruption was occurring because page references weren't being removed in all cases. I previously added a function called detach_bufdata, but I discovered there already WAS a function out there to do the job. It's called gfs2_meta_cache_flush. So I added a call to that to remove the page references. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] use an temp variable to reduce a spin_unlock	Denis Cheng
	this is more clear. Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] Prevent infinite loop in try_rgrp_unlink()	Bob Peterson
	This is patch three of five for bug #248176. The try_rgrp_unlink code in rgrp.c had an infinite loop. This was caused because the bitmap function rgblk_search can return a block less than the "goal" block, in which case it was looping. The fix is to make it always march forward as needed. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] Revert part of earlier log.c changes	Bob Peterson
	This is patch 2 of 5 for bug #248176. The list_move code previously concocted in log.c for bug #238162 (see https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=238162#c23) never runs as bh can now never be NULL at this point. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] Move some code inside the log lock	Bob Peterson
	This is the first of five patches for bug #248176: There were still some critical variables being manipulated outside the log_lock spinlock. That usually resulted in a hang. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] Fix an oops in glock dumping	Steven Whitehouse
	This fixes an oops which was occurring during glock dumping due to the seq file code not taking a reference to the glock. Also this fixes a memory leak which occurred in certain cases, in turn preventing the filesystem from unmounting. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] GFS2 not checking pointer on create when running under nfsd	Steve French
	When looking at an unrelated problem, I noticed that nfsd does not set nameidata pointer on create (ie nd is NULL). This should cause an oops in some cases in which when NFSd is mounted over GFS2. Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] Clean up duplicate includes in fs/gfs2/	Jesper Juhl
	This patch cleans up duplicate includes in fs/gfs2/ Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] Fix calculation of demote state	Josef Whiter
	If a glock is in the exclusive state and a request for demote to deferred has been received, then further requests for demote to shared are being ignored. This patch fixes that by ensuring that we demote to unlocked in that case. Signed-off-by: Josef Whiter <jwhiter@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10	[GFS2] Fix two races relating to glock callbacks	Steven Whitehouse
	One of the races relates to referencing a variable while not holding its protecting spinlock. The patch simply moves the test inside the spin lock. The other races occurs when a demote to unlocked request occurs during the time a demote to shared request is already running. This of course only happens in the case that the lock was in the exclusive mode to start with. The patch adds a check to see if another demote request has occurred in the mean time and if it has, then it performs a second demote. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14	[GFS2] Revert remounting w/o acl option leaves acls enabled	Steven Whitehouse
	This reverts commit 569a7b6c2e8965ff4908003b925757703a3d649c. The code was correct originally. The default setting for ACLs after a remount should be to be the same as before the remount. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14	[GFS2] Fix setting of inherit jdata attr	Steven Whitehouse
	Due to a mix up between the jdata attribute and inherit jdata attribute it has not been possible to set the inherit jdata attribute on directories. This is now fixed and the ioctl will report the inherit jdata attribute for directories rather than the jdata attribute as it did previously. This stems from our need to have the one bit in the ioctl attr flags mean two different things according to whether the underlying inode is a directory or not. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14	[GFS2] Fix incorrect error path in prepare_write()	Steven Whitehouse
	The error path in prepare_write() was incorrect in the (very rare) event that the transaction fails to start. The following prevents a NULL pointer dereference, Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14	[GFS2] Fix incorrect return code in rgrp.c	Steven Whitehouse
	The following patch fixes a bug where 0 was being used as a return code to indicate "nothing to do" when in fact 0 was a valid block location which might be returned by the function. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14	[GFS2] soft lockup in rgblk_search	Bob Peterson
	This patch seems to fix the problem described in bugzilla bug 246114. It was written by Steve Whitehouse with some tweaking by me. The code was looping in the relatively new section of code designed to search for and reuse unlinked inodes. In cases where it was finding an appropriate inode to reuse, it was looping around and finding the same block over and over because a "<=" check should have been a "<" when comparing the goal block to the last unlinked block found. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-08-14	[GFS2] soft lockup detected in databuf_lo_before_commit	Bob Peterson
	This is part 2 of the patch for bug #245832, part 1 of which is already in the git tree. The problem was that sdp->sd_log_num_databuf was not always being protected by the gfs2_log_lock spinlock, but the sd_log_le_databuf (which it is supposed to reflect) was protected. That meant there was a timing window during which gfs2_log_flush called databuf_lo_before_commit and the count didn't match what was really on the linked list in that window. So when it ran out of items on the linked list, it decremented total_dbuf from 0 to -1 and thus never left the "while(total_dbuf)" loop. The solution is to protect the variable sdp->sd_log_num_databuf so that the value will always match the contents of the linked list, and therefore the number will never go negative, and therefore, the loop will be exited properly. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-31	rename setlease to generic_setlease	Christoph Hellwig
	Make it a little more clear that this is the default implementation for the setleast operation. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Steven Whitehouse <swhiteho@redhat.com> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-20	mm: Remove slab destructors from kmem_cache_create().	Paul Mundt
	Slab destructors were no longer supported after Christoph's c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-07-19	mm: fault feedback #2	Nick Piggin
	This patch completes Linus's wish that the fault return codes be made into bit flags, which I agree makes everything nicer. This requires requires all handle_mm_fault callers to be modified (possibly the modifications should go further and do things like fault accounting in handle_mm_fault -- however that would be for another patch). [akpm@linux-foundation.org: fix alpha build] [akpm@linux-foundation.org: fix s390 build] [akpm@linux-foundation.org: fix sparc build] [akpm@linux-foundation.org: fix sparc64 build] [akpm@linux-foundation.org: fix ia64 build] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Bryan Wu <bryan.wu@analog.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Matthew Wilcox <willy@debian.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Chris Zankel <chris@zankel.net> Acked-by: Kyle McMartin <kyle@mcmartin.ca> Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Still apparently needs some ARM and PPC loving - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19	mm: fault feedback #1	Nick Piggin
	Change ->fault prototype. We now return an int, which contains VM_FAULT_xxx code in the low byte, and FAULT_RET_xxx code in the next byte. FAULT_RET_ code tells the VM whether a page was found, whether it has been locked, and potentially other things. This is not quite the way he wanted it yet, but that's changed in the next patch (which requires changes to arch code). This means we no longer set VM_CAN_INVALIDATE in the vma in order to say that a page is locked which requires filemap_nopage to go away (because we can no longer remain backward compatible without that flag), but we were going to do that anyway. struct fault_data is renamed to struct vm_fault as Linus asked. address is now a void __user * that we should firmly encourage drivers not to use without really good reason. The page is now returned via a page pointer in the vm_fault struct. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19	mm: merge populate and nopage into fault (fixes nonlinear)	Nick Piggin
	Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes the virtual address -> file offset differently from linear mappings. ->populate is a layering violation because the filesystem/pagecache code should need to know anything about the virtual memory mapping. The hitch here is that the ->nopage handler didn't pass down enough information (ie. pgoff). But it is more logical to pass pgoff rather than have the ->nopage function calculate it itself anyway (because that's a similar layering violation). Having the populate handler install the pte itself is likewise a nasty thing to be doing. This patch introduces a new fault handler that replaces ->nopage and ->populate and (later) ->nopfn. Most of the old mechanism is still in place so there is a lot of duplication and nice cleanups that can be removed if everyone switches over. The rationale for doing this in the first place is that nonlinear mappings are subject to the pagefault vs invalidate/truncate race too, and it seemed stupid to duplicate the synchronisation logic rather than just consolidate the two. After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in pagecache. Seems like a fringe functionality anyway. NOPAGE_REFAULT is removed. This should be implemented with ->fault, and no users have hit mainline yet. [akpm@linux-foundation.org: cleanup] [randy.dunlap@oracle.com: doc. fixes for readahead] [akpm@linux-foundation.org: build fix] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19	mm: fix fault vs invalidate race for linear mappings	Nick Piggin
	Fix the race between invalidate_inode_pages and do_no_page. Andrea Arcangeli identified a subtle race between invalidation of pages from pagecache with userspace mappings, and do_no_page. The issue is that invalidation has to shoot down all mappings to the page, before it can be discarded from the pagecache. Between shooting down ptes to a particular page, and actually dropping the struct page from the pagecache, do_no_page from any process might fault on that page and establish a new mapping to the page just before it gets discarded from the pagecache. The most common case where such invalidation is used is in file truncation. This case was catered for by doing a sort of open-coded seqlock between the file's i_size, and its truncate_count. Truncation will decrease i_size, then increment truncate_count before unmapping userspace pages; do_no_page will read truncate_count, then find the page if it is within i_size, and then check truncate_count under the page table lock and back out and retry if it had subsequently been changed (ptl will serialise against unmapping, and ensure a potentially updated truncate_count is actually visible). Complexity and documentation issues aside, the locking protocol fails in the case where we would like to invalidate pagecache inside i_size. do_no_page can come in anytime and filemap_nopage is not aware of the invalidation in progress (as it is when it is outside i_size). The end result is that dangling (->mapping == NULL) pages that appear to be from a particular file may be mapped into userspace with nonsense data. Valid mappings to the same place will see a different page. Andrea implemented two working fixes, one using a real seqlock, another using a page->flags bit. He also proposed using the page lock in do_no_page, but that was initially considered too heavyweight. However, it is not a global or per-file lock, and the page cacheline is modified in do_no_page to increment _count and _mapcount anyway, so a further modification should not be a large performance hit. Scalability is not an issue. This patch implements this latter approach. ->nopage implementations return with the page locked if it is possible for their underlying file to be invalidated (in that case, they must set a special vm_flags bit to indicate so). do_no_page only unlocks the page after setting up the mapping completely. invalidation is excluded because it holds the page lock during invalidation of each page (and ensures that the page is not mapped while holding the lock). This also allows significant simplifications in do_no_page, because we have the page locked in the right place in the pagecache from the start. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-18	gfs2: stop giving out non-cluster-coherent leases	Marc Eshel
	Since gfs2 can't prevent conflicting opens or leases on other nodes, we probably shouldn't allow it to give out leases at all. Put the newly defined lease operation into use in gfs2 by turning off lease, unless we're using the "nolock' locking module (in which case all locking is local anyway). Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Steven Whitehouse <swhiteho@redhat.com>
2007-07-17	Introduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid check	Satyam Sharma
	Introduce is_owner_or_cap() macro in fs.h, and convert over relevant users to it. This is done because we want to avoid bugs in the future where we check for only effective fsuid of the current task against a file's owning uid, without simultaneously checking for CAP_FOWNER as well, thus violating its semantics. [ XFS uses special macros and structures, and in general looked ... untouchable, so we leave it alone -- but it has been looked over. ] The (current->fsuid != inode->i_uid) check in generic_permission() and exec_permission_lite() is left alone, because those operations are covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH. Similarly operations falling under the purview of CAP_CHOWN and CAP_LEASE are also left alone. Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in> Cc: Al Viro <viro@ftp.linux.org.uk> Acked-by: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17	knfsd: exportfs: add exportfs.h header	Christoph Hellwig
	currently the export_operation structure and helpers related to it are in fs.h. fs.h is already far too large and there are very few places needing the export bits, so split them off into a separate header. [akpm@linux-foundation.org: fix cifs build] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Neil Brown <neilb@suse.de> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16	Remove capability.h from mm.h	Alexey Dobriyan
	I forgot to remove capability.h from mm.h while removing sched.h! This patch remedies that, because the only inline function which was using CAP_something was made out of line. Cross-compile tested without regressions on: all powerpc defconfigs all mips defconfigs all m68k defconfigs all arm defconfigs all ia64 defconfigs alpha alpha-allnoconfig alpha-defconfig alpha-up arm i386 i386-allnoconfig i386-defconfig i386-up ia64 ia64-allnoconfig ia64-defconfig ia64-up m68k mips parisc parisc-allnoconfig parisc-defconfig parisc-up powerpc powerpc-up s390 s390-allnoconfig s390-defconfig s390-up sparc sparc-allnoconfig sparc-defconfig sparc-up sparc64 sparc64-allnoconfig sparc64-defconfig sparc64-up um-x86_64 x86_64 x86_64-allnoconfig x86_64-defconfig x86_64-up as well as my two usual configs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-10	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (57 commits) [GFS2] Accept old format NFS filehandles [GFS2] Small fixes to logging code [DLM] dump more lock values [GFS2] Remove i_mode passing from NFS File Handle [GFS2] Obtaining no_formal_ino from directory entry [GFS2] git-gfs2-nmw-build-fix [GFS2] System won't suspend with GFS2 file system mounted [GFS2] remounting w/o acl option leaves acls enabled [GFS2] inode size inconsistency [DLM] Telnet to port 21064 can stop all lockspaces [GFS2] Fix gfs2_block_truncate_page err return [GFS2] Addendum to the journaled file/unmount patch [GFS2] Simplify multiple glock aquisition [GFS2] assertion failure after writing to journaled file, umount [GFS2] Use zero_user_page() in stuffed_readpage() [GFS2] Remove bogus '\0' in rgrp.c [GFS2] Journaled file write/unstuff bug [DLM] don't require FS flag on all nodes [GFS2] Fix deallocation issues [GFS2] return conflicts for GETLK ...
2007-07-10	[GFS2] Accept old format NFS filehandles	Steven Whitehouse
	On Tue, 2007-07-10 at 10:06 +0100, Christoph Hellwig wrote: > > -#define GFS2_LARGE_FH_SIZE 10 > > - > > -struct gfs2_fh_obj { > > - struct gfs2_inum_host this; > > - u32 imode; > > -}; > > +#define GFS2_LARGE_FH_SIZE 8 > > Because gfs2_decode_fh only accepts file handles with GFS2_LARGE_FH_SIZE > or GFS2_LARGE_FH_SIZE you don't accept filehandles sent out by and older > gfs version anymore. Stale filehandles because of a new kernel version > are a big no-no, so please add back code to handle the old filehandles > on the decode side. > This should fix that problem I think since its only relating to end of the fh we can just ignore that field in order to accept the older format. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Wendy Cheng <wcheng@redhat.com>
2007-07-10	sendfile: remove .sendfile from filesystems that use generic_file_sendfile()	Jens Axboe
	They can use generic_file_splice_read() instead. Since sys_sendfile() now prefers that, there should be no change in behaviour. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-09	[GFS2] Small fixes to logging code	Steven Whitehouse
	This reverts part of an earlier patch which tried to reclaim gfs2_bufdata structures too early and resulted in a "use after free" case (this bit from me). Also a change to not write out log headers unless we really need to (in the case of flushing nothing we don't need a header) from Bob. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2007-07-09	[GFS2] Remove i_mode passing from NFS File Handle	Wendy Cheng
	GFS2 has been passing i_mode within NFS File Handle. Other than the wrong assumption that there is always room for this extra 16 bit value, the current gfs2_get_dentry doesn't really need the i_mode to work correctly. Note that GFS2 NFS code does go thru the same lookup code path as direct file access route (where the mode is obtained from name lookup) but gfs2_get_dentry() is coded for different purpose. It is not used during lookup time. It is part of the file access procedure call. When the call is invoked, if on-disk inode is not in-memory, it has to be read-in. This makes i_mode passing a useless overhead. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09	[GFS2] Obtaining no_formal_ino from directory entry	Wendy Cheng
	GFS2 lookup code doesn't ask for inode shared glock. This implies during in-memory inode creation for existing file, GFS2 will not disk-read in the inode contents. This leaves no_formal_ino un-initialized during lookup time. The un-initialized no_formal_ino is subsequently encoded into file handle. Clients will get ESTALE error whenever it tries to access these files. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09	[GFS2] System won't suspend with GFS2 file system mounted	Abhijith Das
	The kernel threads in gfs2, namely gfs2_scand, gfs2_logd, gfs2_quotad, gfs2_glockd, gfs2_recoverd weren't doing anything when the suspend mechanism was trying to freeze them. I put in calls to refrigerator() in the loops for all the daemons and suspend works as expected. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09	[GFS2] remounting w/o acl option leaves acls enabled	Bob Peterson
	This patch is for bugzilla bug #245663. This crosswrites a fix from gfs1 (bz #210369) so that the mount options are reset properly upon remount. This was tested on system trin-10. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09	[GFS2] inode size inconsistency	Wendy Cheng
	This should have been part of the NFS patch #1 but somehow I missed it when packaging the patches. It is not a critical issue as the others (I hope). RHEL 5.1 31.el5 kernel runs fine without this change. Our truncate code is chopped into two parts, one for vfs inode changes (in vmtruncate()) and one of gfs inode (in gfs2_truncatei()). These two operatons are, unfortunately, not atomic. So it could happens that vmtruncate() succeeds (inode->i_size is changed) but gfs2_truncatei fails (say kernel temporarily out of memory). This would leave gfs inode i_di.di_size out of sync with vfs inode i_size. It will later confuse gfs2_commit_write() if a write is issued. Last time I checked, it will cause file corruption. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09	[GFS2] Fix gfs2_block_truncate_page err return	S. Wendy Cheng
	Code segment inside gfs2_block_truncate_page() doesn't set the return code correctly. This causes NFSD erroneously returns EIO back to client with setattr procedure call (truncate error). Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09	[GFS2] Addendum to the journaled file/unmount patch	Robert Peterson
	This patch is an addendum to the previous journaled file/unmount patch. It fixes a problem discovered during testing. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09	[GFS2] Simplify multiple glock aquisition	Steven Whitehouse
	There is a bug in the code which acquires multiple glocks where if the initial out-of-order attempt fails part way though we can land up trying to acquire the wrong number of glocks. This is part of the fix for red hat bz #239737. The other part of the bz doesn't apply to upstream kernels since it was fixed by: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d3717bdf8f08a0e1039158c8bab2c24d20f492b6 Since the out-of-order code doesn't appear to add anything to the performance of GFS2, this patch just removed it rather than trying to fix it. It should be much easier to see whats going on here now. In addition, we don't allocate any memory unless we are using a lot of glocks (which is a relatively uncommon case). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09	[GFS2] assertion failure after writing to journaled file, umount	Robert Peterson
	This patch passes all my nasty tests that were causing the code to fail under one circumstance or another. Here is a complete summary of all changes from today's git tree, in order of appearance: 1. There are now separate variables for metadata buffer accounting. 2. Variable sd_log_num_hdrs is no longer needed, since the header accounting is taken care of by the reserve/refund sequence. 3. Fixed a tiny grammatical problem in a comment. 4. Added a new function "calc_reserved" to calculate the reserved log space. This isn't entirely necessary, but it has two benefits: First, it simplifies the gfs2_log_refund function greatly. Second, it allows for easier debugging because I could sprinkle the code with calls to this function to make sure the accounting is proper (by adding asserts and printks) at strategic point of the code. 5. In log_pull_tail there apparently was a kludge to fix up the accounting based on a "pull" parameter. The buffer accounting is now done properly, so the kludge was removed. 6. File sync operations were making a call to gfs2_log_flush that writes another journal header. Since that header was unplanned for (reserved) by the reserve/refund sequence, the free space had to be decremented so that when log_pull_tail gets called, the free space is be adjusted properly. (Did I hear you call that a kludge? well, maybe, but a lot more justifiable than the one I removed). 7. In the gfs2_log_shutdown code, it optionally syncs the log by specifying the PULL parameter to log_write_header. I'm not sure this is necessary anymore. It just seems to me there could be cases where shutdown is called while there are outstanding log buffers. 8. In the (data)buf_lo_before_commit functions, I changed some offset values from being calculated on the fly to being constants. That simplified some code and we might as well let the compiler do the calculation once rather than redoing those cycles at run time. 9. This version has my rewritten databuf_lo_add function. This version is much more like its predecessor, buf_lo_add, which makes it easier to understand. Again, this might not be necessary, but it seems as if this one works as well as the previous one, maybe even better, so I decided to leave it in. 10. In databuf_lo_before_commit, a previous data corruption problem was caused by going off the end of the buffer. The proper solution is to have the proper limit in place, rather than stopping earlier. (Thus my previous attempt to fix it is wrong). If you don't wrap the buffer, you're stopping too early and that causes more log buffer accounting problems. 11. In lops.h there are two new (previously mentioned) constants for figuring out the data offset for the journal buffers. 12. There are also two new functions, buf_limit and databuf_limit to calculate how many entries will fit in the buffer. 13. In function gfs2_meta_wipe, it needs to distinguish between pinned metadata buffers and journaled data buffers for proper journal buffer accounting. It can't use the JDATA gfs2_inode flag because it's sometimes passed the "real" inode and sometimes the "metadata inode" and the inode flags will be random bits in a metadata gfs2_inode. It needs to base its decision on which was passed in. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>