asmadeus/linux.git - The linux kernel

Age	Commit message	Author
2010-12-24	init: don't call flush_scheduled_work() from do_initcalls()	Tejun Heo
	The call to flush_scheduled_work() in do_initcalls() is there to make sure all works queued to system_wq by initcalls finish before the init sections are dropped. However, the call doesn't make much sense at this point - there already are multiple different workqueues and different subsystems are free to create and use their own. Ordering requirements are and should be expressed explicitly. Drop the call to prepare for the deprecation and removal of flush_scheduled_work(). Andrew suggested adding a sanity check where the workqueue code checks whether any pending or running work has the work function in the init text section. However, checking this for running works requires the worker to keep track of the current function being executed, and checking only the pending works will miss most cases. As a violation will almost always be caught by the usual page fault mechanism, I don't think it would be worthwhile to make the workqueue code track extra state just for this. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org>
2010-12-24	s390: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. * tape_3590: Create and use tape_3590_wq instead of the system_wq. * tape_block: Directly flush requeue_task on cleanup instead of using flush_scheduled_work(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux390@de.ibm.com Cc: linux-s390@vger.kernel.org
2010-12-24	rtc: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. On removal, directly cancel the work, and flush the uie_task in rtc-dev.c::clear_uie(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: rtc-linux@googlegroups.com
2010-12-24	mmc: update workqueue usages	Tejun Heo
	Workqueue creation API has been updated and flush_scheduled_work() is deprecated and scheduled to be removed. * core/core.c: Use alloc_ordered_workqueue() instead of create_singlethread_workqueue(). This removes an unnecessary rescuer. * host/omap.c: Create, use and flush mmc_omap_wq instead of the system_wq. * Flush host->mmc_carddetect_work directly on removal instead of using flush_scheduled_work(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Chris Ball <cjb@laptop.org> Cc: linux-mmc@vger.kernel.org
2010-12-24	mfd: update workqueue usages	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. * In menelaus, flush menelaus->work directly on probe failure. Also, make sure the work isn't running on removal. * In tps65010, cancel_delayed_work() + flush_scheduled_work() -> cancel_delayed_work_sync(). While at it, remove unnecessary (void) casts on return value, and use schedule_delayed_work() and to_delayed_work() instead of using delayed_work's internal work field. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Samuel Ortiz <sameo@linux.intel.com>
2010-12-24	dvb: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. * Flush the used works directly. * Replace the deprecated cancel_rearming_delayed_work() + flush_scheduled_work() -> cancel_delayed_work_sync(). * Make sure mantis->uart_work isn't running on exit. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: linux-media@vger.kernel.org
2010-12-24	leds-wm8350: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. Directly flush led->work on removal instead. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Richard Purdie <rpurdie@rpsys.net>
2010-12-24	mISDN: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. Directly flush ch->workq when freeing channel and cancel it on release. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: netdev@vger.kernel.org
2010-12-24	macintosh/ams: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. Directly flush ams_info.worker on detach instead. Signed-off-by: Tejun Heo <tj@kernel.org>
2010-12-24	vmwgfx: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. Directly flush info->deferred_work on removal instead. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Thomas Hellstrom <thellstrom@vmware.com>
2010-12-24	tpm: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. Directly flush chip->work instead. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Debora Velarde <debora@linux.vnet.ibm.com> Cc: Rajiv Andrade <srajiv@linux.vnet.ibm.com>
2010-12-24	sonypi: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. Directly flush sonypi_device.input_work on removal instead. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Mattia Dongili <malattia@linux.it>
2010-12-24	hvsi: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. Directly cancel hp->writer and flush hp->handshaker instead. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org>
2010-12-24	xen: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. Directly flush info->work instead. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-12-24	gdrom: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. Directly flush work on removal instead. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk>
2010-12-24	floppy: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. Directly flush floppy_work instead. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk>
2010-12-24	sh: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. Directly flush psw->work on removal instead. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: linux-sh@vger.kernel.org
2010-12-24	arm/sharpsl: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. Directly flush toggle_charger and sharpsl_bat works on suspend instead. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Russell King <linux@arm.linux.org.uk>
2010-12-24	ncpfs: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. Directly flush the used works on stop instead. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Petr Vandrovec <petr@vandrovec.name>
2010-12-24	drm/ttm: use cancel_delayed_work_sync() in ttm_bo	Tejun Heo
	Make ttm_bo::ttm_bo_device_release call cancel_delayed_work_sync() instead of calling cancel_delayed_work() followed by flush_scheduled_work(). This is to prepare for the deprecation and removal of flush_scheduled_work(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Thomas Hellstrom <thellstrom@vmware.com> Cc: Dave Airlie <airlied@redhat.com>
2010-12-24	pcmcia/ipwireless: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. Directly flush the used works instead. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Jiri Kosina <jkosina@suse.cz> Acked-by: David Sterba <dsterba@suse.cz>
2010-12-24	ocfs2: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. * cancel_delayed_work() + flush_scheduled_work() -> cancel_delayed_work_sync(). * flush qs->qs_work directly on exit instead. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Joel Becker <joel.becker@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com>
2010-12-24	net/dsa: don't use flush_scheduled_work()	Tejun Heo
	flush_scheduled_work() is deprecated and scheduled to be removed. Directly flush dst->link_poll_work on remove instead. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Lennert Buytenhek <buytenh@wantstofly.org>
2010-12-24	isdn/capi: make kcapi use a separate workqueue	Tejun Heo
	flush_scheduled_work() is deprecated and will be removed. Because kcapi uses fire-and-forget type works, it's impossible to flush each work explicitly. Create and use a dedicated workqueue instead. Please note that with recent workqueue changes, each workqueue doesn't reserve a lot of resources and using it as a flush domain is fine. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Jan Kiszka <jan.kiszka@web.de>
2010-12-24	isdn/capi: unregister capictr notifier after init failure	Tejun Heo
	capidrv_init() could leave capictr notifier dangling after init failure. Fix it. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Jan Kiszka <jan.kiszka@web.de>
2010-12-20	workqueue: allow chained queueing during destruction	Tejun Heo
	Currently, destroy_workqueue() makes the workqueue deny all new queueing by setting WQ_DYING and flushes the workqueue once before proceeding with destruction; however, there are cases where work items queue more related work items. Currently, such users need to explicitly flush the workqueue multiple times depending on the possible depth of such chained queueing. This patch updates the queueing path such that a work item can queue further work items on the same workqueue even when WQ_DYING is set. The flush on destruction is automatically retried until the workqueue is empty. This guarantees that the workqueue is empty on destruction while allowing chained queueing. The flush retry logic whines if it takes too many retries to drain the workqueue. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
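The drain-on-destroy behaviour described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel implementation: `toy_wq`, `toy_queue_work()`, `toy_flush()`, `toy_destroy()` and `chained_work()` are hypothetical names invented for the illustration, the queue is a fixed-size single-threaded FIFO, and the "whine on too many retries" warning is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAP 64 /* hypothetical fixed capacity, enough for the demo */

struct toy_wq;
typedef void (*toy_work_fn)(struct toy_wq *wq, int depth);

struct toy_item {
    toy_work_fn fn;
    int depth;
};

struct toy_wq {
    struct toy_item items[QUEUE_CAP];
    size_t head, tail; /* simple FIFO without wraparound */
    bool dying;        /* models WQ_DYING */
    bool in_worker;    /* true while a work function of this queue runs */
    int executed;      /* number of work items that have run */
};

/* A dying queue refuses new work unless a running work item queues it. */
static bool toy_queue_work(struct toy_wq *wq, toy_work_fn fn, int depth)
{
    assert(wq != NULL);
    if (wq->dying && !wq->in_worker)
        return false;
    if (wq->tail == QUEUE_CAP)
        return false;
    wq->items[wq->tail++] = (struct toy_item){ fn, depth };
    return true;
}

/* One flush pass: run only the items that were pending when it started. */
static void toy_flush(struct toy_wq *wq)
{
    size_t end = wq->tail;

    while (wq->head < end) {
        struct toy_item it = wq->items[wq->head++];

        wq->in_worker = true;
        it.fn(wq, it.depth);
        wq->in_worker = false;
        wq->executed++;
    }
}

/* Mark the queue dying, then retry the flush until nothing is left. */
static int toy_destroy(struct toy_wq *wq)
{
    int passes = 0;

    wq->dying = true;
    do {
        toy_flush(wq);
        passes++;
    } while (wq->head != wq->tail);
    return passes; /* number of flush passes that were needed */
}

/* A work item that queues one follow-up item until depth reaches zero. */
static void chained_work(struct toy_wq *wq, int depth)
{
    if (depth > 0)
        toy_queue_work(wq, chained_work, depth - 1);
}
```

In this model, queueing one `chained_work` item with depth 3 and then calling `toy_destroy()` takes four flush passes; before the change described above, a caller would have had to issue those flushes by hand.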
2010-12-15	workqueue: deprecate cancel_rearming_delayed_work[queue]()	Tejun Heo
	There's no in-kernel user left for these two obsolete functions. Mark them deprecated and schedule for removal during 2.6.39 cycle. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David S. Miller <davem@davemloft.net>
2010-12-15	workqueue: convert cancel_rearming_delayed_work[queue]() users to cancel_delayed_work_sync()	Tejun Heo
	cancel_rearming_delayed_work[queue]() has been superseded by cancel_delayed_work_sync() quite some time ago. Convert all the in-kernel users. The conversions are completely equivalent and trivial. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: "David S. Miller" <davem@davemloft.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: netdev@vger.kernel.org Cc: Anton Vorontsov <cbou@mail.ru> Cc: David Woodhouse <dwmw2@infradead.org> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: Alex Elder <aelder@sgi.com> Cc: xfs-masters@oss.sgi.com Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: netfilter-devel@vger.kernel.org Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: linux-nfs@vger.kernel.org
2010-12-14	workqueue: It is likely that WORKER_NOT_RUNNING is true	Steven Rostedt
	Running the annotate branch profiler on three boxes, including my main box that runs firefox, evolution, xchat, and is part of the distcc farm, showed this with the likelys in the workqueue code:
	    correct  incorrect   %  Function             File         Line
	    -------  ---------   -  --------             ----         ----
	         96     996253  99  wq_worker_sleeping   workqueue.c   703
	         96     996247  99  wq_worker_waking_up  workqueue.c   677
	The likely()s in this case were assuming that WORKER_NOT_RUNNING will most likely be false. But this is not the case. The reason is (and shown by adding trace_printks and testing it) that most of the time WORKER_PREP is set. In worker_thread() we have:
	    worker_clr_flags(worker, WORKER_PREP);
	    [ do work stuff ]
	    worker_set_flags(worker, WORKER_PREP, false);
	(that 'false' means not to wake up an idle worker)
	The wq_worker_sleeping() is called from schedule when a worker thread is putting itself to sleep. Which happens most of the time outside of that [ do work stuff ]. The wq_worker_waking_up is called by the wakeup worker code, which is also called outside that [ do work stuff ]. Thus, the likely and unlikely used by those two functions are actually backwards. Remove the annotation and let gcc figure it out. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Tejun Heo <tj@kernel.org>
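As a rough illustration of what the annotation removal means, the sketch below compares an annotated branch with a plain one. It assumes GCC or Clang, since `__builtin_expect` is a compiler builtin. The flag values and the two `nr_running_*` helpers are hypothetical stand-ins, not the code in workqueue.c. The hint only affects how the compiler lays out the branch; the result is the same either way, which is why dropping a wrong hint is safe.

```c
#include <assert.h>

/* The usual kernel-style wrappers around the GCC/Clang builtin. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical flag values, chosen only for this illustration. */
#define WORKER_PREP        (1u << 3)
#define WORKER_NOT_RUNNING WORKER_PREP

/* Annotated version: tells the compiler NOT_RUNNING is usually clear. */
static int nr_running_annotated(unsigned int flags, int nr_running)
{
    assert(nr_running >= 0);
    if (likely(!(flags & WORKER_NOT_RUNNING)))
        nr_running++;
    return nr_running;
}

/* Plain version: same logic, branch layout left to the compiler. */
static int nr_running_plain(unsigned int flags, int nr_running)
{
    assert(nr_running >= 0);
    if (!(flags & WORKER_NOT_RUNNING))
        nr_running++;
    return nr_running;
}
```

With WORKER_PREP set most of the time, as the profile above reports, the annotated branch would be predicted wrongly in almost every call, while the plain version carries no misleading hint.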
2010-12-14	MAINTAINERS: Add workqueue entry	Tejun Heo
	Signed-off-by: Tejun Heo <tj@kernel.org>
2010-11-26	workqueue: check the allocation of system_unbound_wq	Hitoshi Mitake
	I found a trivial bug in the initialization of workqueue. The current init_workqueues() doesn't check the result of allocating system_unbound_wq; this should be checked like the other queues. Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2010-11-25	Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging	Linus Torvalds
	* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging: hwmon: (lis3lv02d_i2c) Fix compile warnings hwmon: (i5k_amb) Fix compile warning
2010-11-25	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen	Linus Torvalds
	* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: remove duplicated #include xen: x86/32: perform initial startup on initial_page_table
2010-11-25	Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: arch/tile: fix memchr() not to dereference memory for zero length arch/tile: make glibc's sysconf(_SC_NPROCESSORS_CONF) work correctly arch/tile: fix rwlock so would-be write lockers don't block new readers
2010-11-25	Merge branch 'drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile	Linus Torvalds
	* 'drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: pci root complex: support for tile architecture drivers/net/tile/: on-chip network drivers for the tile architecture MAINTAINERS: add drivers/char/hvc_tile.c as maintained by tile
2010-11-25	Merge master.kernel.org:/home/rmk/linux-2.6-arm	Linus Torvalds
	* master.kernel.org:/home/rmk/linux-2.6-arm: ARM: 6482/2: Fix find_next_zero_bit and related assembly ARM: 6490/1: MM: bugfix: initialize spinlock for init_mm.context ARM: avoid annoying <4>'s in printk output SCSI: arm fas216: fix missing ';' ARM: avoid marking decompressor .stack section as having contents ARM: 6489/1: thumb2: fix incorrect optimisation in usracc ARM: 6488/1: nomadik: prevent sched_clock() wraparound ARM: 6484/1: fix compile warning in mm/init.c ARM: 6473/1: Small update to ux500 specific L2 cache code ARM: improve compiler's ability to optimize page tables mx25: fix spi device registration typo ARM i.MX27 eukrea: Fix compilation ARM i.MX spi: fix compilation for i.MX21 ARM i.MX pcm037 eet: compile fixes ARM i.MX: sdma is merged, so remove #ifdef SDMA_IS_MERGED ARM mx3fb: check for DMA engine type mach-pcm037_eet: Fix section mismatch for eet_init_devices()
2010-11-25	Merge branch 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6	Linus Torvalds
	* 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6: sisfb: delete osdef.h sisfb: move the CONFIG warning to sis_main.c sisfb: replace SiS_SetMemory with memset_io sisfb: remove InPort/OutPort wrappers sisfb: use CONFIG_FB_SIS_301/315 instead of SIS301/315H sisfb: delete redundant #define SIS_LINUX_KERNEL sisfb: delete dead SIS_XORG_XF86 code sisfb: delete fallback code for pci_map_rom() sisfb: delete obsolete PCI ROM bug workaround fbdev: Update documentation index file. lxfb: Program panel v/h sync output polarity correctly fbcmap: integer overflow bug fbcmap: cleanup white space in fb_alloc_cmap() MAINTAINERS: Add fbdev patchwork entry, tidy up file patterns. fbdev: da8xx: punt duplicated FBIO_WAITFORVSYNC define fbdev: sh_mobile_lcdcfb: fix bug in reconfig()
2010-11-25	Merge branch 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6	Linus Torvalds
	* 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: sh: clkfwk: Build fix for non-legacy CPG changes. sh: Use GCC __builtin_prefetch() to implement prefetch(). sh: fix vsyscall compilation due to .eh_frame issue sh: avoid to flush all cache in sys_cacheflush sh: clkfwk: Disable init clk op for non-legacy clocks. sh: clkfwk: Kill off now unused algo_id in set_rate op. sh: clkfwk: Kill off unused clk_set_rate_ex().
2010-11-25	Merge branch 'for-linus' of git://neil.brown.name/md	Linus Torvalds
	* 'for-linus' of git://neil.brown.name/md: md: Call blk_queue_flush() to establish flush/fua support md/raid1: really fix recovery looping when single good device fails. md: fix return value of rdev_size_change()
2010-11-25	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: virtio: fix format of sysfs driver/vendor files Char: virtio_console, fix memory leak virtio: return correct capacity to users module: Update prototype for ref_module (formerly use_module)
2010-11-25	arch/x86/include/asm/fixmap.h: mark __set_fixmap_offset as __always_inline	Andrew Morton
	When compiling arch/x86/kernel/early_printk_mrst.c with i386 allmodconfig, gcc-4.1.0 generates an out-of-line copy of __set_fixmap_offset() which contains a reference to __this_fixmap_does_not_exist which the compiler cannot elide. Marking __set_fixmap_offset() as __always_inline prevents this. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Feng Tang <feng.tang@intel.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-25	scripts: fix gfp-translate for recent changes to gfp.h	Mel Gorman
	The recent changes to gfp.h to satisfy sparse broke scripts/gfp-translate. This patch fixes it up to work with old and new versions of gfp.h . [akpm@linux-foundation.org: use `grep -q', per WANG Cong] Signed-off-by: Mel Gorman <mel@csn.ul.ie> Cc: Namhyung Kim <namhyung@gmail.com> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-25	reiserfs: fix inode mutex - reiserfs lock misordering	Frederic Weisbecker
	reiserfs_unpack() locks the inode mutex with reiserfs_mutex_lock_safe() to protect against reiserfs lock dependency. However, this protection requires the reiserfs lock to be held. This is the case if reiserfs_unpack() is called by reiserfs_ioctl() but not from reiserfs_quota_on() when it tries to unpack tails of quota files. Fix the ordering of the two locks in reiserfs_unpack() to fix this issue. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reported-by: Markus Gapp <markus.gapp@gmx.net> Reported-by: Jan Kara <jack@suse.cz> Cc: Jeff Mahoney <jeffm@suse.com> Cc: <stable@kernel.org> [2.6.36.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-25	backlight: grab ops_lock before testing bd->ops	Uwe Kleine-König
	According to the comment describing ops_lock in the definition of struct backlight_device and when comparing with other functions in backlight.c the mutex must be held when checking ops to be non-NULL. Fixes a problem added by c835ee7f4154992e6 ("backlight: Add suspend/resume support to the backlight core") in Jan 2009. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Richard Purdie <rpurdie@linux.intel.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-25	drivers/misc/isl29020.c: remove incorrect kfree in isl29020_remove()	Axel Lin
	struct als_data *data is not used in this driver at all. Also add a missing ">" character for MODULE_AUTHOR. Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-25	pagemap: set pagemap walk limit to PMD boundary	Naoya Horiguchi
	Currently one pagemap_read() call walks in PAGEMAP_WALK_SIZE bytes (== 512 pages.) But there is a corner case where walk_pmd_range() accidentally runs over a VMA associated with a hugetlbfs file. For example, when a process has mappings to VMAs as shown below:
	    # cat /proc/<pid>/maps
	    ...
	    3a58f6d000-3a58f72000 rw-p 00000000 00:00 0
	    7fbd51853000-7fbd51855000 rw-p 00000000 00:00 0
	    7fbd5186c000-7fbd5186e000 rw-p 00000000 00:00 0
	    7fbd51a00000-7fbd51c00000 rw-s 00000000 00:12 8614   /hugepages/test
	then pagemap_read() goes into walk_pmd_range() path and walks in the range 0x7fbd51853000-0x7fbd51a53000, but the hugetlbfs VMA should be handled by walk_hugetlb_range(). Otherwise PMD for the hugepage is considered bad and cleared, which causes undesirable results. This patch fixes it by limiting each pagemap walk range to a single PMD. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
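The address arithmetic behind this fix can be checked with a short user-space sketch. It assumes x86-64 with 4 KiB pages and 2 MiB PMDs, so that PAGEMAP_WALK_SIZE equals one PMD (512 pages); `walk_end_old()` and `walk_end_pmd()` are hypothetical helper names, not functions from fs/proc/task_mmu.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 geometry: 4 KiB pages, 2 MiB PMDs. */
#define PAGE_SHIFT        12
#define PMD_SHIFT         21
#define PMD_SIZE          (UINT64_C(1) << PMD_SHIFT)
#define PMD_MASK          (~(PMD_SIZE - 1))
#define PAGEMAP_WALK_SIZE PMD_SIZE /* 512 pages of 4 KiB each */

/* Old behaviour: walk a fixed number of bytes from wherever we start. */
static uint64_t walk_end_old(uint64_t start, uint64_t limit)
{
    assert(start <= limit);
    uint64_t end = start + PAGEMAP_WALK_SIZE;

    return end < limit ? end : limit;
}

/* Fixed behaviour: stop at the next PMD boundary after start. */
static uint64_t walk_end_pmd(uint64_t start, uint64_t limit)
{
    assert(start <= limit);
    uint64_t end = (start + PAGEMAP_WALK_SIZE) & PMD_MASK;

    return end < limit ? end : limit;
}
```

For the start address 0x7fbd51853000 from the example, the old calculation ends at 0x7fbd51a53000, inside the hugetlbfs mapping, while the PMD-aligned calculation ends at 0x7fbd51a00000, exactly where that mapping begins.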
2010-11-25	mm: remove call to find_vma in pagewalk for non-hugetlbfs	David Sterba
	Commit d33b9f45 ("mm: hugetlb: fix hugepage memory leak in walk_page_range()") introduces a check if a vma is a hugetlbfs one and later in 5dc37642 ("mm hugetlb: add hugepage support to pagemap") it is moved under #ifdef CONFIG_HUGETLB_PAGE but a needless find_vma call is left behind and its result is not used anywhere else in the function. The side-effect of caching vma for @addr inside walk->mm is neither utilized in walk_page_range() nor in called functions. Signed-off-by: David Sterba <dsterba@suse.cz> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Acked-by: Andi Kleen <ak@linux.intel.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Matt Mackall <mpm@selenic.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-25	mm/page_alloc.c: fix build_all_zonelist() where percpu_alloc() is wrongly called under stop_machine_run()	KAMEZAWA Hiroyuki
	During memory hotplug, build_all_zonelists() may be called under stop_machine_run(). This function calls setup_zone_pageset(), which is a bug because it will do page allocation under stop_machine_run(). Here is a report from Alok Kataria.
	    BUG: sleeping function called from invalid context at kernel/mutex.c:94
	    in_atomic(): 0, irqs_disabled(): 1, pid: 4, name: migration/0
	    Pid: 4, comm: migration/0 Not tainted 2.6.35.6-45.fc14.x86_64 #1
	    Call Trace:
	     [<ffffffff8103d12b>] __might_sleep+0xeb/0xf0
	     [<ffffffff81468245>] mutex_lock+0x24/0x50
	     [<ffffffff8110eaa6>] pcpu_alloc+0x6d/0x7ee
	     [<ffffffff81048888>] ? load_balance+0xbe/0x60e
	     [<ffffffff8103a1b3>] ? rt_se_boosted+0x21/0x2f
	     [<ffffffff8103e1cf>] ? dequeue_rt_stack+0x18b/0x1ed
	     [<ffffffff8110f237>] __alloc_percpu+0x10/0x12
	     [<ffffffff81465e22>] setup_zone_pageset+0x38/0xbe
	     [<ffffffff810d6d81>] ? build_zonelists_node.clone.58+0x79/0x8c
	     [<ffffffff81452539>] __build_all_zonelists+0x419/0x46c
	     [<ffffffff8108ef01>] ? cpu_stopper_thread+0xb2/0x198
	     [<ffffffff8108f075>] stop_machine_cpu_stop+0x8e/0xc5
	     [<ffffffff8108efe7>] ? stop_machine_cpu_stop+0x0/0xc5
	     [<ffffffff8108ef57>] cpu_stopper_thread+0x108/0x198
	     [<ffffffff81467a37>] ? schedule+0x5b2/0x5cc
	     [<ffffffff8108ee4f>] ? cpu_stopper_thread+0x0/0x198
	     [<ffffffff81065f29>] kthread+0x7f/0x87
	     [<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10
	     [<ffffffff81065eaa>] ? kthread+0x0/0x87
	     [<ffffffff8100aae0>] ? kernel_thread_helper+0x0/0x10
	    Built 5 zonelists in Node order, mobility grouping on.  Total pages: 289456
	    Policy zone: Normal
	This patch tries to fix the issue by moving setup_zone_pageset() out from stop_machine_run(). It's obviously not necessary to be called under stop_machine_run().
[akpm@linux-foundation.org: remove unneeded local] Reported-by: Alok Kataria <akataria@vmware.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Petr Vandrovec <petr@vmware.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-25	cgroups: make swap accounting default behavior configurable	Michal Hocko
	Swap accounting can be configured by CONFIG_CGROUP_MEM_RES_CTLR_SWAP configuration option and then it is turned on by default. There is a boot option (noswapaccount) which can disable this feature. This makes it hard for distributors to enable the configuration option as this feature leads to a bigger memory consumption and this is a no-go for a general-purpose distribution kernel. On the other hand swap accounting may be very useful for some workloads. This patch adds a new configuration option which controls the default behavior (CGROUP_MEM_RES_CTLR_SWAP_ENABLED). If the option is selected then the feature is turned on by default. It also adds a new boot parameter swapaccount[=1|0] which enhances the original noswapaccount parameter semantic by means of enable/disable logic (defaults to 1 if no value is provided to be still consistent with noswapaccount). The default behavior is unchanged (if CONFIG_CGROUP_MEM_RES_CTLR_SWAP is enabled then CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED is enabled as well) Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
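The enable/disable semantics described above can be modelled in user space as follows. This is a sketch under assumptions, not the kernel's boot-parameter handler: `swap_account_enabled()` and `token_is()` are hypothetical helpers, the command line is treated as a space-separated string, and a later token is assumed to override an earlier one. The `config_default` argument stands in for CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the token [tok, tok+len) is exactly the literal lit. */
static bool token_is(const char *tok, size_t len, const char *lit)
{
    return strlen(lit) == len && strncmp(tok, lit, len) == 0;
}

/*
 * Decide whether swap accounting is on for a given command line.
 * config_default models the compile-time default; boot options
 * found on the command line override it.
 */
static bool swap_account_enabled(const char *cmdline, bool config_default)
{
    assert(cmdline != NULL);
    bool enabled = config_default;
    const char *p = cmdline;

    while (*p) {
        while (*p == ' ')          /* skip separators */
            p++;
        const char *tok = p;
        while (*p && *p != ' ')    /* find the end of this token */
            p++;
        size_t len = (size_t)(p - tok);

        if (token_is(tok, len, "noswapaccount") ||
            token_is(tok, len, "swapaccount=0"))
            enabled = false;
        else if (token_is(tok, len, "swapaccount") ||
                 token_is(tok, len, "swapaccount=1"))
            enabled = true;        /* a bare "swapaccount" means =1 */
    }
    return enabled;
}
```

A distribution could then build with the default off and let users opt in with `swapaccount` on the command line, while `noswapaccount` keeps working as before.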
2010-11-25	memcg: avoid deadlock between move charge and try_charge()	Daisuke Nishimura
	__mem_cgroup_try_charge() can be called under down_write(&mmap_sem)(e.g. mlock does it). This means it can cause deadlock if it races with move charge:
	Ex.1)
	                move charge             |        try charge
	  --------------------------------------+------------------------------
	    mem_cgroup_can_attach()             |  down_write(&mmap_sem)
	      mc.moving_task = current          |    ..
	      mem_cgroup_precharge_mc()         |  __mem_cgroup_try_charge()
	        mem_cgroup_count_precharge()    |    prepare_to_wait()
	          down_read(&mmap_sem)          |    if (mc.moving_task)
	          -> cannot acquire the lock    |    -> true
	                                        |      schedule()
	Ex.2)
	                move charge             |        try charge
	  --------------------------------------+------------------------------
	    mem_cgroup_can_attach()             |
	      mc.moving_task = current          |
	      mem_cgroup_precharge_mc()         |
	        mem_cgroup_count_precharge()    |
	          down_read(&mmap_sem)          |
	          ..                            |
	          up_read(&mmap_sem)            |
	                                        |  down_write(&mmap_sem)
	    mem_cgroup_move_task()              |    ..
	      mem_cgroup_move_charge()          |  __mem_cgroup_try_charge()
	        down_read(&mmap_sem)            |    prepare_to_wait()
	        -> cannot acquire the lock      |    if (mc.moving_task)
	                                        |    -> true
	                                        |      schedule()
	To avoid this deadlock, we do all the move charge works (both can_attach() and attach()) under one mmap_sem section. And after this patch, we set/clear mc.moving_task outside mc.lock, because we use the lock only to check mc.from/to. Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>