asmadeus/linux.git - The linux kernel

Age	Commit message (Collapse)	Author
2009-04-16	sysfs: don't use global workqueue in sysfs_schedule_callback()	Alex Chiang
	A sysfs attribute using sysfs_schedule_callback() to commit suicide may end up calling device_unregister(), which will eventually call a driver's ->remove function. Drivers may call flush_scheduled_work() in their shutdown routines, in which case lockdep will complain with something like the following: ============================================= [ INFO: possible recursive locking detected ] 2.6.29-rc8-kk #1 --------------------------------------------- events/4/56 is trying to acquire lock: (events){--..}, at: [<ffffffff80257fc0>] flush_workqueue+0x0/0xa0 but task is already holding lock: (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230 other info that might help us debug this: 3 locks held by events/4/56: #0: (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230 #1: (&ss->work){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230 #2: (pci_remove_rescan_mutex){--..}, at: [<ffffffff803c10d1>] remove_callback+0x21/0x40 stack backtrace: Pid: 56, comm: events/4 Not tainted 2.6.29-rc8-kk #1 Call Trace: [<ffffffff8026dfcd>] validate_chain+0xb7d/0x1260 [<ffffffff8026eade>] __lock_acquire+0x42e/0xa40 [<ffffffff8026f148>] lock_acquire+0x58/0x80 [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0 [<ffffffff8025800d>] flush_workqueue+0x4d/0xa0 [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0 [<ffffffff80258070>] flush_scheduled_work+0x10/0x20 [<ffffffffa0144065>] e1000_remove+0x55/0xfe [e1000e] [<ffffffff8033ee30>] ? sysfs_schedule_callback_work+0x0/0x50 [<ffffffff803bfeb2>] pci_device_remove+0x32/0x70 [<ffffffff80441da9>] __device_release_driver+0x59/0x90 [<ffffffff80441edb>] device_release_driver+0x2b/0x40 [<ffffffff804419d6>] bus_remove_device+0xa6/0x120 [<ffffffff8043e46b>] device_del+0x12b/0x190 [<ffffffff8043e4f6>] device_unregister+0x26/0x70 [<ffffffff803ba969>] pci_stop_dev+0x49/0x60 [<ffffffff803baab0>] pci_remove_bus_device+0x40/0xc0 [<ffffffff803c10d9>] remove_callback+0x29/0x40 [<ffffffff8033ee4f>] sysfs_schedule_callback_work+0x1f/0x50 [<ffffffff8025769a>] run_workqueue+0x15a/0x230 [<ffffffff80257648>] ? run_workqueue+0x108/0x230 [<ffffffff8025846f>] worker_thread+0x9f/0x100 [<ffffffff8025bce0>] ? autoremove_wake_function+0x0/0x40 [<ffffffff802583d0>] ? worker_thread+0x0/0x100 [<ffffffff8025b89d>] kthread+0x4d/0x80 [<ffffffff8020d4ba>] child_rip+0xa/0x20 [<ffffffff8020cebc>] ? restore_args+0x0/0x30 [<ffffffff8025b850>] ? kthread+0x0/0x80 [<ffffffff8020d4b0>] ? child_rip+0x0/0x20 Although we know that the device_unregister path will never acquire a lock that a driver might try to acquire in its ->remove, in general we should never attempt to flush a workqueue from within the same workqueue, and lockdep rightly complains. So as long as sysfs attributes cannot commit suicide directly and we are stuck with this callback mechanism, put the sysfs callbacks on their own workqueue instead of the global one. This has the side benefit that if a suicidal sysfs attribute kicks off a long chain of ->remove callbacks, we no longer induce a long delay on the global queue. This also fixes a missing module_put in the error path introduced by sysfs-only-allow-one-scheduled-removal-callback-per-kobj.patch. We never destroy the workqueue, but I'm not sure that's a problem. Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-04-16	Merge branch 'upstream-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: ata: Report 16/32bit PIO as best we can libata: use ATA_ID_CFA_* pata_legacy: fix no device fail path pata_hpt37x: fix HPT370 DMA timeouts libata: handle SEMB signature better
2009-04-16	mm: pass correct mm when growing stack	Hugh Dickins
	Tetsuo Handa reports seeing the WARN_ON(current->mm == NULL) in security_vm_enough_memory(), when do_execve() is touching the target mm's stack, to set up its args and environment. Yes, a UMH_NO_WAIT or UMH_WAIT_PROC call_usermodehelper() spawns an mm-less kernel thread to do the exec. And in any case, that vm_enough_memory check when growing stack ought to be done on the target mm, not on the execer's mm (though apart from the warning, it only makes a slight tweak to OVERCOMMIT_NEVER behaviour). Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-16	Revert "kobject: don't block for each kobject_uevent".	Hugh Dickins
	This reverts commit f520360d93cdc37de5d972dac4bf3bdef6a7f6a7. Tetsuo Handa, running a kernel with CONFIG_DEBUG_PAGEALLOC=y and CONFIG_UEVENT_HELPER_PATH=/sbin/hotplug, has been hitting RCU detected CPU stalls: it's been spinning in the loop where do_execve() counts up the args (but why wasn't fixup_exception working? dunno). The recent change, switching kobject_uevent_env() from UMH_WAIT_EXEC to UMH_NO_WAIT, is broken: the exec uses args on the local stack here, and an env which is kfreed as soon as call_usermodehelper() returns. It very much needs to wait for the exec to be done. An alternative would be to keep the UMH_NO_WAIT, and complicate the code to allocate and free these resources correctly? but no, as GregKH pointed out when making the commit, CONFIG_UEVENT_HELPER_PATH="" is a much better optimization - though some distros are still saying /sbin/hotplug in their .config, yet with no such binary in their initrd or their root. Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Will Newton <will.newton@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-16	ata: Report 16/32bit PIO as best we can	Alan Cox
	The legacy old IDE ioctl API for this is a bit primitive so we try and map stuff sensibly onto it. - Set PIO over DMA devices to report 32bit - Add ability to change the PIO32 settings if the controller permits it - Add that functionality into the sff drivers - Add that functionality into the VLB legacy driver - Turn on the 32bit PIO on the ninja32 and add support there Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-04-16	libata: use ATA_ID_CFA_*	Sergei Shtylyov
	Use ATA_ID_CFA_* constants for CFA specific identify data words 162 and 163. Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-04-16	pata_legacy: fix no device fail path	Tejun Heo
	When pata_legacy can't detect any device, it unregisters the platform_device and fails detection. However, it forgets to detach ata host triggering weird failures as the host later gets freed by devres while still attached. Fix it. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-04-16	pata_hpt37x: fix HPT370 DMA timeouts	Sergei Shtylyov
	The libata driver has copied the code from the IDE driver which caused a post 2.4.18 regression on many HPT370[A] chips -- DMA stopped to work completely, only causing timeouts. Now remove hpt370_bmdma_start() for good... Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-04-16	libata: handle SEMB signature better	Tejun Heo
	WDC WD1600JS-62MHB5 successfully hits the window between ATA/ATAPI-7 and Serial ATA II standards and reports 3c/c3 signature which now is assigned to SEMB. Make ata_dev_classify() report ATA_DEV_SEMB on the sig and let ata_dev_read_id() work around it by trying IDENTIFY once. This fixes bko#11579. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: David Haun <drhaun88@gmail.com> Reported-by: Lars Wirzenius <liw@liw.fi> Reported-by: Juan Manuel <jmcarranza@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-04-16	Add block_write_full_page_endio for passing endio handler	Chris Mason
	block_write_full_page doesn't allow the caller to control what happens when the IO is over. This adds a new call named block_write_full_page_endio so the buffer head end_io handler can be provided by the caller. This will be used by the ext3 data=guarded mode to do i_size updates in a workqueue based end_io handler. end_buffer_async_write is also exported so it can be called to do the dirty work of managing page writeback for the higher level end_io handler. Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Theodore Tso <tytso@mit.edu> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-16	Export filemap_write_and_wait_range	Chris Mason
	This wasn't exported before and is useful (used by the experimental ext3 data=guarded code) Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Theodore Tso <tytso@mit.edu> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-16	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (64 commits) phylib: Fix delay argument of schedule_delayed_work NET/ixgbe: Fix powering off during shutdown NET/e1000e: Fix powering off during shutdown NET/e1000: Fix powering off during shutdown packet: avoid warnings when high-order page allocation fails gianfar: stop send queue before resetting gianfar myr10ge: again fix lro_gen_skb() alignment declance: convert to net_device_ops bfin_mac: convert to net_device_ops au1000: convert to net_device_ops atarilance: convert to net_device_ops a2065: convert to net_device_ops ixgbe: update real_num_tx_queues on changing num_rx_queues ixgbe: fix tx queue index Revert "rose: zero length frame filtering in af_rose.c" sfc: Use correct macro to set event bitfield sfc: Match calls to netif_napi_add() and netif_napi_del() bonding: Remove debug printk e1000/e1000: fix compile warning ehea: Fix incomplete conversion to net_device_ops ...
2009-04-16	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc: remove some pointless conditionals before kfree() sbus: changed ioctls to unlocked sparc: asm/atomic.h on 32bit should include asm/system.h for xchg sparc64: Fix smp_callin() locking.
2009-04-16	phylib: Fix delay argument of schedule_delayed_work	Atsushi Nemoto
	The commit a390d1f3 ("phylib: convert state_queue work to delayed_work") missed converting 'expires' value to 'delay' value. Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Acked-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-16	NET/ixgbe: Fix powering off during shutdown	Rafael J. Wysocki
	Prevent ixgbe from putting the adapter into D3 during shutdown except when we're going to power off the system, since doing that may generally cause problems with kexec to happen (such problems were observed for igb and forcedeth). For this purpose seperate ixgbe_shutdown() from ixgbe_suspend() and use the appropriate PCI PM callbacks in both of them. Signed-off-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-16	NET/e1000e: Fix powering off during shutdown	Rafael J. Wysocki
	Prevent e1000e from putting the adapter into D3 during shutdown except when we're going to power off the system, since doing that may generally cause problems with kexec to happen (such problems were observed for igb and forcedeth). For this purpose seperate e1000e_shutdown() from e1000e_suspend() and use the appropriate PCI PM callbacks in both of them. Signed-off-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-16	NET/e1000: Fix powering off during shutdown	Rafael J. Wysocki
	Prevent e1000 from putting the adapter into D3 during shutdown except when we're going to power off the system, since doing that may generally cause problems with kexec to happen (such problems were observed for igb and forcedeth). For this purpose seperate e1000_shutdown() from e1000_suspend() and use the appropriate PCI PM callbacks in both of them. Signed-off-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-15	RCU: Don't try and predeclare inline funcs as it upsets some versions of gcc	David Howells
	Don't try and predeclare inline funcs like this: static inline void wait_migrated_callbacks(void) ... static void _rcu_barrier(enum rcu_barrier type) { ... wait_migrated_callbacks(); } ... static inline void wait_migrated_callbacks(void) { wait_event(rcu_migrate_wq, !atomic_read(&rcu_migrate_type_count)); } as it upsets some versions of gcc under some circumstances: kernel/rcupdate.c: In function `_rcu_barrier': kernel/rcupdate.c:125: sorry, unimplemented: inlining failed in call to 'wait_migrated_callbacks': function body not available kernel/rcupdate.c:152: sorry, unimplemented: called from here This can be dealt with by simply putting the static variables (rcu_migrate_*) at the top, and moving the implementation of the function up so that it replaces its forward declaration. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-15	The default CONFIG_BUG=n version of BUG() should have an empty do...while	David Howells
	The default CONFIG_BUG=n version of BUG() should incorporate an empty a do...while statement to avoid compilation weirdness. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-15	MN10300: Stop gcc from generating uninitialised variable warnings after BUG()	David Howells
	Stop gcc from generating uninitialised variable warnings after BUG(). The problem is that MN10300's implementation of BUG() invokes system call 15 which doesn't return - but there's no way to tell the compiler that and also emit the bug table element with the correct file and line data. So instead, we make the do...while wrapper in _debug_bug_trap() an endless loop from which there's no escape. Also, while we're at it, (1) get rid of _debug_bug_trap() and just implement directly as BUG(), and (2) make the implementation of BUG() contingent on CONFIG_BUG=y. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-15	MN10300: Wire up missing system calls	David Howells
	Wire up missing system calls preadv() and pwritev(). Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-15	MN10300: Discard duplicate PFN_xxx() macros	David Howells
	Discard duplicate PFN_xxx() macros from arch code as they're now in the general headers. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-15	Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6	Linus Torvalds
	* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: [S390] boot cputime accounting [S390] add read_persistent_clock [S390] cpu hotplug and accounting values [S390] fix idle time accounting [S390] smp: fix cpu_possible_map initialization [S390] dasd: fix idaw boundary checking for track based ccw [S390] dasd: Use the new async framework for autoonlining. [S390] qdio: remove dead timeout handler [S390] appldata: Use new mod_virt_timer_periodic() function. [S390] extend virtual timer interface by mod_virt_timer_periodic [S390] stp synchronization retry timer [S390] call nmi_enter/nmi_exit on machine checks [S390] wire up preadv/pwritev system calls [S390] s390: move machine flags to lowcore
2009-04-15	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: ALSA: hda - Fix the cmd cache keys for amp verbs ALSA: add missing definitions(letters) to HD-Audio.txt ALSA: hda - Add quirk mask for Fujitsu Amilo laptops with ALC883 [ALSA] intel8x0: add one retry to the ac97_clock measurement routine [ALSA] intel8x0: fix wrong conditions in ac97_clock measure routine ALSA: hda - Avoid call of snd_jack_report at release ALSA: add private_data to struct snd_jack ALSA: snd-usb-caiaq: rename files to remove redundant information in file pathes ALSA: snd-usb-caiaq: clean up header includes ALSA: sound/pci: use memdup_user() ALSA: sound/usb: use memdup_user() ALSA: sound/isa: use memdup_user() ALSA: sound/core: use memdup_user() [ALSA] intel8x0: do not use zero value from PICB register [ALSA] intel8x0: an attempt to make ac97_clock measurement more reliable [ALSA] pcm-midlevel: Add more strict buffer position checks based on jiffies [ALSA] hda_intel: fix unexpected ring buffer positions ASoC: Disable S3C64xx support in Kconfig ASoC: magician: remove un-necessary #include of pxa-regs.h and hardware.h
2009-04-15	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes: GFS2: Use DEFINE_SPINLOCK GFS2: cleanup file_operations mess GFS2: Move umount flush rwsem GFS2: Fix symlink creation race GFS2: Make quotad's waiting interruptible
2009-04-15	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block	Linus Torvalds
	* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (28 commits) cfq-iosched: add close cooperator code cfq-iosched: log responsible 'cfqq' in idle timer arm cfq-iosched: tweak kick logic a bit more cfq-iosched: no need to save interrupts in cfq_kick_queue() brd: fix cacheflushing brd: support barriers swap: Remove code handling bio_alloc failure with __GFP_WAIT gfs2: Remove code handling bio_alloc failure with __GFP_WAIT ext4: Remove code handling bio_alloc failure with __GFP_WAIT dio: Remove code handling bio_alloc failure with __GFP_WAIT block: Remove code handling bio_alloc failure with __GFP_WAIT bio: add documentation to bio_alloc() splice: add helpers for locking pipe inode splice: remove generic_file_splice_write_nolock() ocfs2: fix i_mutex locking in ocfs2_splice_to_file() splice: fix i_mutex locking in generic_splice_write() splice: remove i_mutex locking in splice_from_pipe() splice: split up __splice_from_pipe() block: fix SG_IO to return a proper error value cfq-iosched: don't delay queue kick for a merged request ...
2009-04-15	Merge branch 'topic/hda' into for-linus	Takashi Iwai
	* topic/hda: ALSA: hda - Fix the cmd cache keys for amp verbs ALSA: add missing definitions(letters) to HD-Audio.txt
2009-04-15	ALSA: hda - Fix the cmd cache keys for amp verbs	Takashi Iwai
	Fix the key value generation for get/set amp verbs. The upper bits of the parameter have to be combined with the verb value to be unique for each direction/index of amp access. This fixes the resume problem on some hardwares like Macbook after the channel mode is changed. Tested-by: Johannes Berg <johannes@sipsolutions.net> Cc: <stable@kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2009-04-15	Merge branch 'merge' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: powerpc: pseries/dtl.c should include asm/firmware.h powerpc: Fix data-corrupting bug in __futex_atomic_op powerpc/pseries: Set error_state to pci_channel_io_normal in eeh_report_reset() powerpc: Allow 256kB pages with SHMEM powerpc: Document new FSL I2C bindings and cleanup powerpc/mm: Fix compile warning powerpc/85xx: TQM8548: update defconfig powerpc/85xx: TQM8548: use proper phy-handles for enet2 and enet3 powerpc/85xx: TQM85xx: correct address of LM75 I2C device nodes powerpc: Add support for early tlbilx opcode powerpc: Fix tlbilx opcode
2009-04-15	acpi-cpufreq: fix 'smp_call_function_many()' confusion	Linus Torvalds
	It turns out that 'smp_call_function_many()' doesn't work at all like 'smp_call_function_single()', and my change to Andrew's patch to use it rather than a loop over all CPU's acpi-cpufreq doesn't work. My bad. 'smp_call_function_many()' has two "features" (aka "documented bugs"): (a) it needs to be called with preemption disabled, because it uses smp_processor_id() without guarding the CPU lookup with 'get_cpu()' and 'put_cpu()' like the 'single' variant does. (b) even if the current CPU is part of the CPU mask, it won't do the call on that CPU. Still, we're better off trying to use 'smp_call_function_many()' than looping over CPU's, since it at least in theory allows us to use a broadcast IPI and do it all in parallel. So let's just work around the silly semantic bugs in that function. Reported-and-tested-by: Ali Gholami Rudi <ali@rudi.ir> Cc: Ingo Molnar <mingo@elte.hu> Cc: Andrew Morton <akpm@linux-foundation.org>, Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Dave Jones <davej@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-15	packet: avoid warnings when high-order page allocation fails	Eric Dumazet
	Latest tcpdump/libpcap triggers annoying messages because of high order page allocation failures (when lowmem exhausted or fragmented) These allocation errors are correctly handled so could be silent. [22660.208901] tcpdump: page allocation failure. order:5, mode:0xc0d0 [22660.208921] Pid: 13866, comm: tcpdump Not tainted 2.6.30-rc2 #170 [22660.208936] Call Trace: [22660.208950] [<c04e2b46>] ? printk+0x18/0x1a [22660.208965] [<c02760f7>] __alloc_pages_internal+0x357/0x460 [22660.208980] [<c0276251>] __get_free_pages+0x21/0x40 [22660.208995] [<c04cc835>] packet_set_ring+0x105/0x3d0 [22660.209009] [<c04ccd1d>] packet_setsockopt+0x21d/0x4d0 [22660.209025] [<c0270400>] ? filemap_fault+0x0/0x450 [22660.209040] [<c0449e34>] sys_setsockopt+0x54/0xa0 [22660.209053] [<c044b97f>] sys_socketcall+0xef/0x270 [22660.209067] [<c0202e34>] sysenter_do_call+0x12/0x26 Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-15	cfq-iosched: add close cooperator code	Jens Axboe
	If we have processes that are working in close proximity to each other on disk, we don't want to idle wait. Instead allow the close process to issue a request, getting better aggregate bandwidth. The anticipatory scheduler has similar checks, noop and deadline do not need it since they don't care about process <-> io mappings. The code for CFQ is a little more involved though, since we split request queues into per-process contexts. This fixes a performance problem with eg dump(8), since it uses several processes in some silly attempt to speed IO up. Even if dump(8) isn't really a valid case (it should be fixed by using CLONE_IO), there are other cases where we see close processes and where idling ends up hurting performance. Credit goes to Jeff Moyer <jmoyer@redhat.com> for writing the initial implementation. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15	cfq-iosched: log responsible 'cfqq' in idle timer arm	Jens Axboe
	Makes it easier to read the traces. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15	cfq-iosched: tweak kick logic a bit more	Jens Axboe
	We only kick the dispatch for an idling queue, if we think it's a (somewhat) fully merged request. Also allow a kick if we have other busy queues in the system, since we don't want to risk waiting for a potential merge in that case. It's better to get some work done and proceed. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15	cfq-iosched: no need to save interrupts in cfq_kick_queue()	Jens Axboe
	It's called from the workqueue handlers from process context, so we always have irqs enabled when entered. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15	brd: fix cacheflushing	Nick Piggin
	brd is missing a flush_dcache_page. On 2nd thoughts, perhaps it is the pagecache's responsibility to flush user virtual aliases (the driver of course should flush kernel virtual mappings)... but anyway, there already exists cache flushing for one direction of transfer, so we should add the other. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15	brd: support barriers	Nick Piggin
	brd is always ordered (not that it matters, as it is defined not to survive when the system goes down). So tell the block layer it is ordered, which might be of help with testing filesystems. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15	swap: Remove code handling bio_alloc failure with __GFP_WAIT	Nikanth Karthikesan
	Remove code handling bio_alloc failure with __GFP_WAIT. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15	gfs2: Remove code handling bio_alloc failure with __GFP_WAIT	Nikanth Karthikesan
	Remove code handling bio_alloc failure with __GFP_WAIT. GFP_NOFS implies __GFP_WAIT. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15	ext4: Remove code handling bio_alloc failure with __GFP_WAIT	Nikanth Karthikesan
	Remove code handling bio_alloc failure with __GFP_WAIT. GFP_NOIO implies __GFP_WAIT. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15	dio: Remove code handling bio_alloc failure with __GFP_WAIT	Nikanth Karthikesan
	Remove code handling bio_alloc failure with __GFP_WAIT. GFP_KERNEL implies __GFP_WAIT. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15	block: Remove code handling bio_alloc failure with __GFP_WAIT	Nikanth Karthikesan
	Remove code handling bio_alloc failure with __GFP_WAIT. GFP_KERNEL implies __GFP_WAIT. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15	bio: add documentation to bio_alloc()	Jens Axboe
	Explain that with __GFP_WAIT set it will not fail, and that the caller must never allocate more than 1 bio at the time. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15	splice: add helpers for locking pipe inode	Miklos Szeredi
	There are lots of sequences like this, especially in splice code: if (pipe->inode) mutex_lock(&pipe->inode->i_mutex); /* do something */ if (pipe->inode) mutex_unlock(&pipe->inode->i_mutex); so introduce helpers which do the conditional locking and unlocking. Also replace the inode_double_lock() call with a pipe_double_lock() helper to avoid spreading the use of this functionality beyond the pipe code. This patch is just a cleanup, and should cause no behavioral changes. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15	splice: remove generic_file_splice_write_nolock()	Miklos Szeredi
	Remove the now unused generic_file_splice_write_nolock() function. It's conceptually broken anyway, because splice may need to wait for pipe events so holding locks across the whole operation is wrong. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15	ocfs2: fix i_mutex locking in ocfs2_splice_to_file()	Miklos Szeredi
	Rearrange locking of i_mutex on destination and call to ocfs2_rw_lock() so locks are only held while buffers are copied with the pipe_to_file() actor, and not while waiting for more data on the pipe. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15	splice: fix i_mutex locking in generic_splice_write()	Miklos Szeredi
	Rearrange locking of i_mutex on destination so it's only held while buffers are copied with the pipe_to_file() actor, and not while waiting for more data on the pipe. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15	splice: remove i_mutex locking in splice_from_pipe()	Miklos Szeredi
	splice_from_pipe() is only called from two places: - generic_splice_sendpage() - splice_write_null() Neither of these require i_mutex to be taken on the destination inode. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15	splice: split up __splice_from_pipe()	Miklos Szeredi
	Split up __splice_from_pipe() into four helper functions: splice_from_pipe_begin() splice_from_pipe_next() splice_from_pipe_feed() splice_from_pipe_end() splice_from_pipe_next() will wait (if necessary) for more buffers to be added to the pipe. splice_from_pipe_feed() will feed the buffers to the supplied actor and return when there's no more data available (or if all of the requested data has been copied). This is necessary so that implementations can do locking around the non-waiting splice_from_pipe_feed(). This patch should not cause any change in behavior. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-15	block: fix SG_IO to return a proper error value	FUJITA Tomonori
	blk_rq_unmap_user() returns -EFAULT if a program passes an invalid address to kernel. SG_IO path needs to pass the returned value to user space instead of ignoring it. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>