asmadeus/linux.git - The linux kernel

Age	Commit message (Collapse)	Author
2006-03-28	[PATCH] copy_process: cleanup bad_fork_cleanup_signal	Oleg Nesterov
	__exit_signal() does important cleanups atomically under ->siglock. It is also called from copy_process's error path. This is not good, for example we can't move __unhash_process() under ->siglock for that reason. We should not mix these 2 paths, just look at ugly 'if (p->sighand)' under 'bad_fork_cleanup_sighand:' label. For copy_process() case it is sufficient to just backout copy_signal(), nothing more. Again, nobody can see this task yet. For CLONE_THREAD case we just decrement signal->count, otherwise nobody can see this ->signal and we can free it lockless. This patch assumes it is safe to do exit_thread_group_keys() without tasklist_lock. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] copy_process: cleanup bad_fork_cleanup_sighand	Oleg Nesterov
	The only caller of exit_sighand(tsk) is copy_process's error path. We can call __exit_sighand() directly and kill exit_sighand(). This 'tsk' was not yet registered in pid_hash[] or init_task.tasks, it has no external references, nobody can see it, and IF (clone_flags & CLONE_SIGHAND) At least 'current' has a reference to ->sighand, this means atomic_dec_and_test(sighand->count) can't be true. ELSE Nobody can see this ->sighand, this means we can free it without any locking. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] introduce lock_task_sighand() helper	Oleg Nesterov
	Add lock_task_sighand() helper and converts group_send_sig_info() to use it. Hopefully we will have more users soon. This patch also removes '!sighand->count' and '!p->usage' checks, I think they both are bogus, racy and unneeded (but probably it makes sense to restore them as BUG_ON()s). ->sighand is cleared and it's ->count is decremented in release_task() with sighand->siglock held, so it is a bug to have '!p->usage \|\| !->count' after we already locked and verified it is the same. On the other hand, an already dead task without ->sighand can have a non-zero ->usage due to ptrace, for example. If we read the stale value of ->sighand we must see the change after spin_lock(), because that change was done while holding that same old ->sighand.siglock. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] convert sighand_cache to use SLAB_DESTROY_BY_RCU	Oleg Nesterov
	This patch borrows a clever Hugh's 'struct anon_vma' trick. Without tasklist_lock held we can't trust task->sighand until we locked it and re-checked that it is still the same. But this means we don't need to defer 'kmem_cache_free(sighand)'. We can return the memory to slab immediately, all we need is to be sure that sighand->siglock can't dissapear inside rcu protected section. To do so we need to initialize ->siglock inside ctor function, SLAB_DESTROY_BY_RCU does the rest. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] pidhash: don't use zero pids	Oleg Nesterov
	daemonize() calls set_special_pids(1,1), while init and kernel threads spawned from init/main.c:init() run with 0,0 special pids. This patch changes INIT_SIGNALS() so that that they run with ->pgrp == ->session == 1 also. This patch relies on fact that swapper's pid == 1. Now we have no hashed zero pids in pid_hash[]. User-space visibible change is that now /sbin/init runs with (1,1) special pids and becomes a session leader. Quoting Eric W. Biederman: > > daemonize consuming pids (1,1) then consumes pgrp 1. So that when > /sbin/init calls setsid() it thinks /sbin/init is a process group > leader and setsid() fails. So /sbin/init wants pgrp 1 session 1 > but doesn't get it. I am pretty certain daemonize did not exist so > /sbin/init got pgrp 1 session 1 in 2.4. > > That is the bug that is being fixed. > > This patch takes things one step farther and essentially calls > setsid() for pid == 1 before init is execed. That is new behavior > but it cleans up the kernel as we now do not need to support the > case of a process without a process group or a session. > > The only process that could have possibly cared was /sbin/init > and it already calls setsid() because it doesn't want that. > > If this was going to break anything noticeable the change in behavior > from 2.4 to 2.6 would have already done that. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] pidhash: don't count idle threads	Oleg Nesterov
	fork_idle() does unhash_process() just after copy_process(). Contrary, boot_cpu's idle thread explicitely registers itself for each pid_type with nr = 0. copy_process() already checks p->pid != 0 before process_counts++, I think we can just skip attach_pid() calls and job control inits for idle threads and kill unhash_process(). We don't need to cleanup ->proc_dentry in fork_idle() because with this patch idle threads are never hashed in kernel/pid.c:pid_hash[]. We don't need to hash pid == 0 in pidmap_init(). free_pidmap() is never called with pid == 0 arg, so it will never be reused. So it is still possible to use pid == 0 in any PIDTYPE_xxx namespace from kernel/pid.c's POV. However with this patch we don't hash pid == 0 for PIDTYPE_PID case. We still have have PIDTYPE_PGID/PIDTYPE_SID entries with pid == 0: /sbin/init and kernel threads which don't call daemonize(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] kill SET_LINKS/REMOVE_LINKS	Oleg Nesterov
	Both SET_LINKS() and SET_LINKS/REMOVE_LINKS() have exactly one caller, and these callers already check thread_group_leader(). This patch kills theese macros, they mix two different things: setting process's parent and registering it in init_task.tasks list. Callers are updated to do these actions by hand. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] remove add_parent()'s parent argument	Oleg Nesterov
	add_parent(p, parent) is always called with parent == p->parent, and it makes no sense to do it differently. This patch removes this argument. No changes in affected .o files. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] pidhash: kill switch_exec_pids	Eric W. Biederman
	switch_exec_pids is only called from de_thread by way of exec, and it is only called when we are exec'ing from a non thread group leader. Currently switch_exec_pids gives the leader the pid of the thread and unhashes and rehashes all of the process groups. The leader is already in the EXIT_DEAD state so no one cares about it's pids. The only concern for the leader is that __unhash_process called from release_task will function correctly. If we don't touch the leader at all we know that __unhash_process will work fine so there is no need to touch the leader. For the task becomming the thread group leader, we just need to give it the pid of the old thread group leader, add it to the task list, and attach it to the session and the process group of the thread group. Currently de_thread is also adding the task to the task list which is just silly. Currently the only leader of __detach_pid besides detach_pid is switch_exec_pids because of the ugly extra work that was being performed. So this patch removes switch_exec_pids because it is doing too much, it is creating an unnecessary special case in pid.c, duing work duplicated in de_thread, and generally obscuring what it is going on. The necessary work is added to de_thread, and it seems to be a little clearer there what is going on. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] Remove dead kill_sl prototype from sched.h	Eric W. Biederman
	The kill_sl function doesn't exist in the kernel so a prototype is completely unnecessary. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	Merge master.kernel.org:/home/rmk/linux-2.6-arm	Linus Torvalds
	* master.kernel.org:/home/rmk/linux-2.6-arm: [ARM] 3388/1: ixp23xx: add core ixp23xx support [ARM] 3417/1: add support for logicpd pxa270 card engine [ARM] 3387/1: ixp23xx: add defconfig [ARM] 3377/2: add support for intel xsc3 core [ARM] Move ice-dcc code into misc.c [ARM] Fix decompressor serial IO to give CRLF not LFCR [ARM] proc-v6: mark page table walks outer-cacheable, shared. Enable NX. [ARM] nommu: trivial patch for arch/arm/lib/Makefile [ARM] 3416/1: Update LART site URL [ARM] 3415/1: Akita: Add missing EXPORT_SYMBOL [ARM] 3414/1: ep93xx: reset ethernet controller before uncompressing
2006-03-28	Merge master.kernel.org:/home/rmk/linux-2.6-serial	Linus Torvalds
	* master.kernel.org:/home/rmk/linux-2.6-serial: [SERIAL] Provide Cirrus EP93xx AMBA PL010 serial support. [SERIAL] amba-pl010: allow platforms to specify modem control method [SERIAL] Remove obsoleted au1x00_uart driver [SERIAL] Small time UART configuration fix for AU1100 processor
2006-03-28	[ARM] 3388/1: ixp23xx: add core ixp23xx support	Lennert Buytenhek
	Patch from Lennert Buytenhek This patch adds support for the Intel ixp23xx series of CPUs. The ixp23xx is an XSC3 based CPU with 512K of L2 cache, a 64bit 66MHz PCI interface, two DDR RAM interfaces, QDR RAM interfaces, two gigabit MACs, two 10/100 MACs, expansion bus, four microengines, a Media and Switch Fabric unit almost identical to the one on the ixp2400, two xscale (8250ish) UARTs and a bunch of other stuff. This patch adds the core ixp23xx support code, and support for the ADI Engineering Roadrunner, Intel IXDP2351, and IP Fabrics Double Espresso platforms. Signed-off-by: Deepak Saxena <dsaxena@plexity.net> Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-03-28	[ARM] 3417/1: add support for logicpd pxa270 card engine	Lennert Buytenhek
	Patch from Lennert Buytenhek Add support for the LogicPD PXA270 Card Engine. Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-03-28	[ARM] 3377/2: add support for intel xsc3 core	Lennert Buytenhek
	Patch from Lennert Buytenhek This patch adds support for the new XScale v3 core. This is an ARMv5 ISA core with the following additions: - L2 cache - I/O coherency support (on select chipsets) - Low-Locality Reference cache attributes (replaces mini-cache) - Supersections (v6 compatible) - 36-bit addressing (v6 compatible) - Single instruction cache line clean/invalidate - LRU cache replacement (vs round-robin) I attempted to merge the XSC3 support into proc-xscale.S, but XSC3 cores have separate errata and have to handle things like L2, so it is simpler to keep it separate. L2 cache support is currently a build option because the L2 enable bit must be set before we enable the MMU and there is no easy way to capture command line parameters at this point. There are still optimizations that can be done such as using LLR for copypage (in theory using the exisiting mini-cache code) but those can be addressed down the road. Signed-off-by: Deepak Saxena <dsaxena@plexity.net> Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-03-28	Merge branch 'cfq-merge' of git://brick.kernel.dk/data/git/linux-2.6-block	Linus Torvalds
	* 'cfq-merge' of git://brick.kernel.dk/data/git/linux-2.6-block: [BLOCK] cfq-iosched: seek and async performance fixes [PATCH] ll_rw_blk: fix 80-col offender in put_io_context() [PATCH] cfq-iosched: small cfq_choose_req() optimization [PATCH] [BLOCK] cfq-iosched: change cfq io context linking from list to tree
2006-03-28	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6	Linus Torvalds
	* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC64]: Implement futex_atomic_cmpxchg_inatomic().
2006-03-28	[PATCH] Typo fixes	Alexey Dobriyan
	Fix a lot of typos. Eyeballed by jmc@ in OpenBSD. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] Replace 0xff.. with correct DMA_xBIT_MASK	Matthias Gehre
	Replace all occurences of 0xff.. in calls to function pci_set_dma_mask() and pci_set_consistant_dma_mask() with the corresponding DMA_xBIT_MASK from linux/dma-mapping.h. Signed-off-by: Matthias Gehre <M.Gehre@gmx.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] arch/i386/kernel/microcode.c: remove the obsolete microcode_ioctl	Adrian Bunk
	Nowadays, even Debian stable ships a microcode_ctl utility recent enough to no longer use this ioctl. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Tigran Aivazian <tigran_aivazian@symantec.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] Make most file operations structs in fs/ const	Arjan van de Ven
	This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] mark f_ops const in the inode	Arjan van de Ven
	Mark the f_ops members of inodes as const, as well as fix the ripple-through this causes by places that copy this f_ops and then "do stuff" with it. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] for_each_possible_cpu: fixes for generic part	KAMEZAWA Hiroyuki
	replaces for_each_cpu with for_each_possible_cpu(). Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] for_each_possible_cpu: defines for_each_possible_cpu	KAMEZAWA Hiroyuki
	for_each_cpu() is a for-loop over cpu_possible_map. for_each_online_cpu is for-loop cpu over cpu_online_map. .....for_each_cpu() is not sufficiently explicit and can lead to mistakes. This patch adds for_each_possible_cpu() in preparation for the removal of for_each_cpu(). Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] Optimize select/poll by putting small data sets on the stack	Andi Kleen
	Optimize select and poll by a using stack space for small fd sets This brings back an old optimization from Linux 2.0. Using the stack is faster than kmalloc. On a Intel P4 system it speeds up a select of a single pty fd by about 13% (~4000 cycles -> ~3500) It also saves memory because a daemon hanging in select or poll will usually save one or two less pages. This can add up - e.g. if you have 10 daemons blocking in poll/select you save 40KB of memory. I did a patch for this long ago, but it was never applied. This version is a reimplementation of the old patch that tries to be less intrusive. I only did the minimal changes needed for the stack allocation. The cut off point before external memory is allocated is currently at 832bytes. The system calls always allocate this much memory on the stack. These 832 bytes are divided into 256 bytes frontend data (for the select bitmaps of the pollfds) and the rest of the space for the wait queues used by the low level drivers. There are some extreme cases where this won't work out for select and it falls back to allocating memory too early - especially with very sparse large select bitmaps - but the majority of processes who only have a small number of file descriptors should be ok. [TBD: 832/256 might not be the best split for select or poll] I suspect more optimizations might be possible, but they would be more complicated. One way would be to cache the select/poll context over multiple system calls because typically the input values should be similar. Problem is when to flush the file descriptors out though. Signed-off-by: Andi Kleen <ak@suse.de> Cc: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] Small fixes backported to old IDE SiS driver	Alan Cox
	Some quick backport bits from the libata PATA work to fix things found in the sis driver. The piix driver needs some fixes too but those are way to large and need someone working on old IDE with time to do them. This patch fixes the case where random bits get loaded into SIS timing registers according to the description of the correct behaviour from Vojtech Pavlik. It also adds the SiS5517 ATA16 chipset which is not currently supported by the driver. Thanks to Conrad Harriss for loaning me the machine with the 5517 chipset. Signed-off-by: Alan Cox <alan@redhat.com> Acked-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] remove relayfs_fs.h	Andrew Morton
	This is obsolete. Cc: Tom Zanussi <zanussi@us.ibm.com> Cc: Jens Axboe <axboe@suse.de> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] fs/fat/: proper prototypes for two functions	Adrian Bunk
	Add proper prototypes for fat_cache_init() and fat_cache_destroy() in msdos_fs.h. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] alpha: make poll flags the same as other architectures	Andrew Morton
	Renumber the recently-added POLLREMOVE and POLLRDHUP to line up with the other architectures. Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] Add oprofile_add_ext_sample	Brian Rogan
	On ppc64 we look at a profiling register to work out the sample address and if it was in userspace or kernel. The backtrace interface oprofile_add_sample does not allow this. Create oprofile_add_ext_sample and make oprofile_add_sample use it too. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: Philippe Elie <phil.el@wanadoo.fr> Cc: John Levon <levon@movementarian.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] synclink_gt add gpio feature	Paul Fulghum
	Add driver support for general purpose I/O feature of the Synclink GT adapters. Signed-off-by: Paul Fulghum <paulkf@micrgate.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] Decrapify asm-generic/local.h	Kyle McMartin
	Now that Christoph Lameter's atomic_long_t support is merged in mainline, might as well convert asm-generic/local.h to use it, so the same code can be used for both sizes of 32 and 64-bit unsigned longs. akpm sayeth: Q: Is there any particular reason why these routines weren't simply implemented with local_save/restore_flags, if they are only meant to guarantee atomicity to the local cpu? I'm sure on most platforms this would be more efficient than using an atomic... A: The whole _point_ of local_t is to avoid local_irq_disable(). It's designed to exploit the fact that many CPUs can do incs and decs in a way which is atomic wrt local interrupts, but not atomic wrt SMP. But this patch makes sense, because asm-generic/local.h is just a fallback implementation for architectures which either cannot perform these local-irq-atomic operations, or its maintainers haven't yet got around to implementing them. We need more work done on local_t in the 2.6.17 timeframe - they're defined as unsigned long, but some architectures implement them as signed long. Signed-off-by: Kyle McMartin <kyle@parisc-linux.org> Cc: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] RTC: Fix up some RTC whitespace and style	Matt Mackall
	Fix up some RTC whitespace and style Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] RTC: Remove RTC UIP synchronization on MIPS MC146818	Matt Mackall
	Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[PATCH] RTC: Remove RTC UIP synchronization on x86	Matt Mackall
	Reading the CMOS clock on x86 and some other arches currently takes up to one second because it synchronizes with the CMOS second tick-over. This delay shows up at boot time as well a resume time. This is the currently the most substantial boot time delay for machines that are working towards instant-on capability. Also, a quick back of the envelope calculation (.5sec * 2M users * 1 boot a day * 10 years) suggests it has cost Linux users in the neighborhood of a million man-hours. An earlier thread on this topic is here: http://groups.google.com/group/linux.kernel/browse_frm/thread/8a24255215ff6151/2aa97e66a977653d?hl=en&lr=&ie=UTF-8&rnum=1&prev=/groups%3Fhl%3Den%26lr%3D%26ie%3DUTF-8%26selm%3D1To2R-2S7-11%40gated-at.bofh.it#2aa97e66a977653d ..from which the consensus seems to be that it's no longer desirable. In my view, there are basically four cases to consider: 1) networked, need precise walltime: use NTP 2) networked, don't need precise walltime: use NTP anyway 3) not networked, don't need sub-second precision walltime: don't care 4) not networked, need sub-second precision walltime: get a network or a radio time source because RTC isn't good enough anyway So this patch series simply removes the synchronization in favor of a simple seqlock-like approach using the seconds value. Note that for purposes of timer accuracy on wakeup, this patch will cause us to fire timers up to one second late. But as the current timer resume code will already sync once (or more!), it's no worse for short timers. Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Andi Kleen <ak@muc.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28	[BLOCK] cfq-iosched: seek and async performance fixes	Jens Axboe
	Detect whether a given process is seeky and if so disable (mostly) the idle window if it is. We still allow just a little idle time, just enough to allow that process to submit a new request. That is needed to maintain fairness across priority groups. In some cases, we could setup several async queues. This is not optimal from a performance POV, since we want all async io in one queue to perform good sorting on it. It also impacted sync queues, as async io got too much slice time. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-03-28	[ARM] Fix decompressor serial IO to give CRLF not LFCR	Russell King
	As per the corresponding change to the serial drivers, arrange for ARM decompressors to give CRLF. Move the common putstr code into misc.c such that machines only need to supply "putc" and "flush" functions. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-03-28	[SPARC64]: Implement futex_atomic_cmpxchg_inatomic().	David S. Miller
	Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-28	[PATCH] [BLOCK] cfq-iosched: change cfq io context linking from list to tree	Jens Axboe
	On setups with many disks, we spend a considerable amount of time looking up the process-disk mapping on each queue of io. Testing with a NULL based block driver, this costs 40-50% reduction in throughput for 1000 disks. Signed-off-by: Jens Axboe <axboe@suse.de>
2006-03-27	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds
	* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [NET]: drop duplicate assignment in request_sock [IPSEC]: Fix tunnel error handling in ipcomp6
2006-03-27	Merge branch 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block	Linus Torvalds
	* 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block: [PATCH] Don't make debugfs depend on DEBUG_KERNEL [PATCH] Fix blktrace compile with sysfs not defined [PATCH] unused label in drivers/block/cciss. [BLOCK] increase size of disk stat counters [PATCH] blk_execute_rq_nowait-speedup [PATCH] ide-cd: quiet down GPCMD_READ_CDVD_CAPACITY failure [BLOCK] ll_rw_blk: kmalloc -> kzalloc conversion [PATCH] kzalloc() conversion in drivers/block [PATCH] update max_sectors documentation
2006-03-27	[PATCH] md: Convert reconfig_sem to reconfig_mutex	NeilBrown
	... being careful that mutex_trylock is inverted wrt down_trylock Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27	[PATCH] md: Support suspending of IO to regions of an md array	NeilBrown
	This allows user-space to access data safely. This is needed for raid5 reshape as user-space needs to take a backup of the first few stripes before allowing reshape to commence. It will also be useful in cluster-aware raid1 configurations so that all cluster members can leave a section of the array untouched while a resync/recovery happens. A 'start' and 'end' of the suspended range are written to 2 sysfs attributes. Note that only one range can be suspended at a time. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27	[PATCH] md: Split reshape handler in check_reshape and start_reshape	NeilBrown
	check_reshape checks validity and does things that can be done instantly - like adding devices to raid1. start_reshape initiates a restriping process to convert the whole array. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27	[PATCH] md: Only checkpoint expansion progress occasionally	NeilBrown
	Instead of checkpointing at each stripe, only checkpoint when a new write would overwrite uncheckpointed data. Block any write to the uncheckpointed area. Arbitrarily checkpoint at least every 3Meg. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27	[PATCH] md: Checkpoint and allow restart of raid5 reshape	NeilBrown
	We allow the superblock to record an 'old' and a 'new' geometry, and a position where any conversion is up to. The geometry allows for changing chunksize, layout and level as well as number of devices. When using verion-0.90 superblock, we convert the version to 0.91 while the conversion is happening so that an old kernel will refuse the assemble the array. For version-1, we use a feature bit for the same effect. When starting an array we check for an incomplete reshape and restart the reshape process if needed. If the reshape stopped at an awkward time (like when updating the first stripe) we refuse to assemble the array, and let user-space worry about it. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27	[PATCH] md: Final stages of raid5 expand code	NeilBrown
	This patch adds raid5_reshape and end_reshape which will start and finish the reshape processes. raid5_reshape is only enabled in CONFIG_MD_RAID5_RESHAPE is set, to discourage accidental use. Read the 'help' for the CONFIG_MD_RAID5_RESHAPE entry. and Make sure that you have backups, just in case. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27	[PATCH] md: Core of raid5 resize process	NeilBrown
	This patch provides the core of the resize/expand process. sync_request notices if a 'reshape' is happening and acts accordingly. It allocated new stripe_heads for the next chunk-wide-stripe in the target geometry, marking them STRIPE_EXPANDING. Then it finds which stripe heads in the old geometry can provide data needed by these and marks them STRIPE_EXPAND_SOURCE. This causes stripe_handle to read all blocks on those stripes. Once all blocks on a STRIPE_EXPAND_SOURCE stripe_head are read, any that are needed are copied into the corresponding STRIPE_EXPANDING stripe_head. Once a STRIPE_EXPANDING stripe_head is full, it is marks STRIPE_EXPAND_READY and then is written out and released. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27	[PATCH] md: Infrastructure to allow normal IO to continue while array is ↵	NeilBrown
	expanding We need to allow that different stripes are of different effective sizes, and use the appropriate size. Also, when a stripe is being expanded, we must block any IO attempts until the stripe is stable again. Key elements in this change are: - each stripe_head gets a 'disk' field which is part of the key, thus there can sometimes be two stripe heads of the same area of the array, but covering different numbers of devices. One of these will be marked STRIPE_EXPANDING and so won't accept new requests. - conf->expand_progress tracks how the expansion is progressing and is used to determine whether the target part of the array has been expanded yet or not. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27	[PATCH] md: Allow stripes to be expanded in preparation for expanding an array	NeilBrown
	Before a RAID-5 can be expanded, we need to be able to expand the stripe-cache data structure. This requires allocating new stripes in a new kmem_cache. If this succeeds, we copy cache pages over and release the old stripes and kmem_cache. We then allocate new pages. If that fails, we leave the stripe cache at it's new size. It isn't worth the effort to shrink it back again. Unfortuanately this means we need two kmem_cache names as we, for a short period of time, we have two kmem_caches. So they are raid5/%s and raid5/%s-alt Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>