Age | Commit message (Collapse) | Author |
|
The call to flush_scheduled_work() in do_initcalls() is there to make
sure all works queued to system_wq by initcalls finish before the init
sections are dropped.
However, the call doesn't make much sense at this point - there
already are multiple different workqueues and different subsystems are
free to create and use their own. Ordering requirements are and
should be expressed explicitly.
Drop the call to prepare for the deprecation and removal of
flush_scheduled_work().
Andrew suggested adding sanity check where the workqueue code checks
whether any pending or running work has the work function in the init
text section. However, checking this for running works requires the
worker to keep track of the current function being executed, and
checking only the pending works will miss most cases. As a violation
will almost always be caught by the usual page fault mechanism, I
don't think it would be worthwhile to make the workqueue code track
extra state just for this.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
* tape_3590: Create and use tape_3590_wq instead of the system_wq.
* tape_block: Directly flush requeue_task on cleanup instead of using
flush_scheduled_work().
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: linux-s390@vger.kernel.org
|
|
flush_scheduled_work() is deprecated and scheduled to be removed. On
removal, directly cancel the work, and flush the uie_task in
rtc-dev.c::clear_uie().
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: rtc-linux@googlegroups.com
|
|
Workqueue creation API has been updated and flush_scheduled_work() is
deprecated and scheduled to be removed.
* core/core.c: Use alloc_ordered_workqueue() instead of
create_singlethread_workqueue(). This removes an unnecessary
rescuer.
* host/omap.c: Create, use and flush mmc_omap_wq instead of the
system_wq.
* Flush host->mmc_carddetect_work directly on removal instead of using
flush_scheduled_work().
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Chris Ball <cjb@laptop.org>
Cc: linux-mmc@vger.kernel.org
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
* In menelaus, flush menelaus->work directly on probe failure. Also,
make sure the work isn't running on removal.
* In tps65010, cancel_delayed_work() + flush_scheduled_work() ->
cancel_delayed_work_sync(). While at it, remove unnecessary (void)
casts on return value, and use schedule_delayed_work() and
to_delayed_work() instead of using delayed_work's internal work
field.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Samuel Ortiz <sameo@linux.intel.com>
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
* Flush the used works directly.
* Replace the deprecated cancel_rearming_delayed_work() +
flush_scheduled_work() -> cancel_delayed_work_sync().
* Make sure mantis->uart_work isn't running on exit.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: linux-media@vger.kernel.org
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
Directly flush led->work on removal instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Richard Purdie <rpurdie@rpsys.net>
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
Directly flush ch->workq when freeing channel and cancel it on
release.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Karsten Keil <isdn@linux-pingi.de>
Cc: netdev@vger.kernel.org
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
Directly flush ams_info.worker on detach instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
Directly flush info->deferred_work on removal instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
Directly flush chip->work instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Debora Velarde <debora@linux.vnet.ibm.com>
Cc: Rajiv Andrade <srajiv@linux.vnet.ibm.com>
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
Directly flush sonypi_device.input_work on removal instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Mattia Dongili <malattia@linux.it>
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
Directly cancel hp->writer and flush hp->handshaker instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
Directly flush info->work instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
Directly flush work on removal instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
Directly flush floppy_work instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
Directly flush psw->work on removal instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: linux-sh@vger.kernel.org
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
Directly flush toggle_charger and sharpsl_bat works on suspend
instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
Directly flush the used works on stop instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Petr Vandrovec <petr@vandrovec.name>
|
|
Make ttm_bo::ttm_bo_device_release call cancel_delayed_work_sync()
instead of calling cancel_delayed_work() followed by
flush_scheduled_work().
This is to prepare for the deprecation and removal of
flush_scheduled_work().
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc:: Thomas Hellstrom <thellstrom@vmware.com>
Cc:: Dave Airlie <airlied@redhat.com>
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
Directly flush the used works instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: David Sterba <dsterba@suse.cz>
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
* cancel_delayed_work() + flush_schedule_work() ->
cancel_delayed_work_sync().
* flush qs->qs_work directly on exit instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Joel Becker <joel.becker@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
|
|
flush_scheduled_work() is deprecated and scheduled to be removed.
Directly flush dst->link_poll_work on remove instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Lennert Buytenhek <buytenh@wantstofly.org>
|
|
flush_scheduled_work() is deprecated and will be removed. Because
kcapi uses fire-and-forget type works, it's impossible to flush each
work explicitly. Create and use a dedicated workqueue instead.
Please note that with recent workqueue changes, each workqueue doesn't
reserve a lot of resources and using it as a flush domain is fine.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jan Kiszka <jan.kiszka@web.de>
|
|
capidrv_init() could leave capictr notifier dangling after init
failure. Fix it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jan Kiszka <jan.kiszka@web.de>
|
|
Currently, destroy_workqueue() makes the workqueue deny all new
queueing by setting WQ_DYING and flushes the workqueue once before
proceeding with destruction; however, there are cases where work items
queue more related work items. Currently, such users need to
explicitly flush the workqueue multiple times depending on the
possible depth of such chained queueing.
This patch updates the queueing path such that a work item can queue
further work items on the same workqueue even when WQ_DYING is set.
The flush on destruction is automatically retried until the workqueue
is empty. This guarantees that the workqueue is empty on destruction
while allowing chained queueing.
The flush retry logic whines if it takes too many retries to drain the
workqueue.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
There's no in-kernel user left for these two obsolete functions. Mark
them deprecated and schedule for removal during 2.6.39 cycle.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
|
|
cancel_delayed_work_sync()
cancel_rearming_delayed_work[queue]() has been superceded by
cancel_delayed_work_sync() quite some time ago. Convert all the
in-kernel users. The conversions are completely equivalent and
trivial.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: "David S. Miller" <davem@davemloft.net>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: netdev@vger.kernel.org
Cc: Anton Vorontsov <cbou@mail.ru>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Alex Elder <aelder@sgi.com>
Cc: xfs-masters@oss.sgi.com
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: netfilter-devel@vger.kernel.org
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: linux-nfs@vger.kernel.org
|
|
Running the annotate branch profiler on three boxes, including my
main box that runs firefox, evolution, xchat, and is part of the distcc farm,
showed this with the likelys in the workqueue code:
correct incorrect % Function File Line
------- --------- - -------- ---- ----
96 996253 99 wq_worker_sleeping workqueue.c 703
96 996247 99 wq_worker_waking_up workqueue.c 677
The likely()s in this case were assuming that WORKER_NOT_RUNNING will
most likely be false. But this is not the case. The reason is
(and shown by adding trace_printks and testing it) that most of the time
WORKER_PREP is set.
In worker_thread() we have:
worker_clr_flags(worker, WORKER_PREP);
[ do work stuff ]
worker_set_flags(worker, WORKER_PREP, false);
(that 'false' means not to wake up an idle worker)
The wq_worker_sleeping() is called from schedule when a worker thread
is putting itself to sleep. Which happens most of the time outside
of that [ do work stuff ].
The wq_worker_waking_up is called by the wakeup worker code, which
is also callod outside that [ do work stuff ].
Thus, the likely and unlikely used by those two functions are actually
backwards.
Remove the annotation and let gcc figure it out.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
I found a trivial bug on initialization of workqueue.
Current init_workqueues doesn't check the result of
allocation of system_unbound_wq, this should be checked
like other queues.
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging:
hwmon: (lis3lv02d_i2c) Fix compile warnings
hwmon: (i5k_amb) Fix compile warning
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: remove duplicated #include
xen: x86/32: perform initial startup on initial_page_table
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
arch/tile: fix memchr() not to dereference memory for zero length
arch/tile: make glibc's sysconf(_SC_NPROCESSORS_CONF) work correctly
arch/tile: fix rwlock so would-be write lockers don't block new readers
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
* 'drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
pci root complex: support for tile architecture
drivers/net/tile/: on-chip network drivers for the tile architecture
MAINTAINERS: add drivers/char/hvc_tile.c as maintained by tile
|
|
* master.kernel.org:/home/rmk/linux-2.6-arm:
ARM: 6482/2: Fix find_next_zero_bit and related assembly
ARM: 6490/1: MM: bugfix: initialize spinlock for init_mm.context
ARM: avoid annoying <4>'s in printk output
SCSI: arm fas216: fix missing ';'
ARM: avoid marking decompressor .stack section as having contents
ARM: 6489/1: thumb2: fix incorrect optimisation in usracc
ARM: 6488/1: nomadik: prevent sched_clock() wraparound
ARM: 6484/1: fix compile warning in mm/init.c
ARM: 6473/1: Small update to ux500 specific L2 cache code
ARM: improve compiler's ability to optimize page tables
mx25: fix spi device registration typo
ARM i.MX27 eukrea: Fix compilation
ARM i.MX spi: fix compilation for i.MX21
ARM i.MX pcm037 eet: compile fixes
ARM i.MX: sdma is merged, so remove #ifdef SDMA_IS_MERGED
ARM mx3fb: check for DMA engine type
mach-pcm037_eet: Fix section mismatch for eet_init_devices()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6
* 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6:
sisfb: delete osdef.h
sisfb: move the CONFIG warning to sis_main.c
sisfb: replace SiS_SetMemory with memset_io
sisfb: remove InPort/OutPort wrappers
sisfb: use CONFIG_FB_SIS_301/315 instead of SIS301/315H
sisfb: delete redudant #define SIS_LINUX_KERNEL
sisfb: delete dead SIS_XORG_XF86 code
sisfb: delete fallback code for pci_map_rom()
sisfb: delete obsolete PCI ROM bug workaround
fbdev: Update documentation index file.
lxfb: Program panel v/h sync output polarity correctly
fbcmap: integer overflow bug
fbcmap: cleanup white space in fb_alloc_cmap()
MAINTAINERS: Add fbdev patchwork entry, tidy up file patterns.
fbdev: da8xx: punt duplicated FBIO_WAITFORVSYNC define
fbdev: sh_mobile_lcdcfb: fix bug in reconfig()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
* 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: clkfwk: Build fix for non-legacy CPG changes.
sh: Use GCC __builtin_prefetch() to implement prefetch().
sh: fix vsyscall compilation due to .eh_frame issue
sh: avoid to flush all cache in sys_cacheflush
sh: clkfwk: Disable init clk op for non-legacy clocks.
sh: clkfwk: Kill off now unused algo_id in set_rate op.
sh: clkfwk: Kill off unused clk_set_rate_ex().
|
|
* 'for-linus' of git://neil.brown.name/md:
md: Call blk_queue_flush() to establish flush/fua support
md/raid1: really fix recovery looping when single good device fails.
md: fix return value of rdev_size_change()
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
virtio: fix format of sysfs driver/vendor files
Char: virtio_console, fix memory leak
virtio: return correct capacity to users
module: Update prototype for ref_module (formerly use_module)
|
|
When compiling arch/x86/kernel/early_printk_mrst.c with i386
allmodconfig, gcc-4.1.0 generates an out-of-line copy of
__set_fixmap_offset() which contains a reference to
__this_fixmap_does_not_exist which the compiler cannot elide.
Marking __set_fixmap_offset() as __always_inline prevents this.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Feng Tang <feng.tang@intel.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The recent changes to gfp.h to satisfy sparse broke scripts/gfp-translate.
This patch fixes it up to work with old and new versions of gfp.h .
[akpm@linux-foundation.org: use `grep -q', per WANG Cong]
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: Namhyung Kim <namhyung@gmail.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
reiserfs_unpack() locks the inode mutex with reiserfs_mutex_lock_safe()
to protect against reiserfs lock dependency. However this protection
requires to have the reiserfs lock to be locked.
This is the case if reiserfs_unpack() is called by reiserfs_ioctl but
not from reiserfs_quota_on() when it tries to unpack tails of quota
files.
Fix the ordering of the two locks in reiserfs_unpack() to fix this
issue.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reported-by: Markus Gapp <markus.gapp@gmx.net>
Reported-by: Jan Kara <jack@suse.cz>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: <stable@kernel.org> [2.6.36.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
According to the comment describing ops_lock in the definition of struct
backlight_device and when comparing with other functions in backlight.c
the mutex must be hold when checking ops to be non-NULL.
Fixes a problem added by c835ee7f4154992e6 ("backlight: Add suspend/resume
support to the backlight core") in Jan 2009.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Richard Purdie <rpurdie@linux.intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
struct als_data *data is not used in this driver at all.
Also add a missing ">" character for MODULE_AUTHOR.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently one pagemap_read() call walks in PAGEMAP_WALK_SIZE bytes (== 512
pages.) But there is a corner case where walk_pmd_range() accidentally
runs over a VMA associated with a hugetlbfs file.
For example, when a process has mappings to VMAs as shown below:
# cat /proc/<pid>/maps
...
3a58f6d000-3a58f72000 rw-p 00000000 00:00 0
7fbd51853000-7fbd51855000 rw-p 00000000 00:00 0
7fbd5186c000-7fbd5186e000 rw-p 00000000 00:00 0
7fbd51a00000-7fbd51c00000 rw-s 00000000 00:12 8614 /hugepages/test
then pagemap_read() goes into walk_pmd_range() path and walks in the range
0x7fbd51853000-0x7fbd51a53000, but the hugetlbfs VMA should be handled by
walk_hugetlb_range(). Otherwise PMD for the hugepage is considered bad
and cleared, which causes undesirable results.
This patch fixes it by separating pagemap walk range into one PMD.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit d33b9f45 ("mm: hugetlb: fix hugepage memory leak in
walk_page_range()") introduces a check if a vma is a hugetlbfs one and
later in 5dc37642 ("mm hugetlb: add hugepage support to pagemap") it is
moved under #ifdef CONFIG_HUGETLB_PAGE but a needless find_vma call is
left behind and its result is not used anywhere else in the function.
The side-effect of caching vma for @addr inside walk->mm is neither
utilized in walk_page_range() nor in called functions.
Signed-off-by: David Sterba <dsterba@suse.cz>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Matt Mackall <mpm@selenic.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
called under stop_machine_run()
During memory hotplug, build_allzonelists() may be called under
stop_machine_run(). In this function, setup_zone_pageset() is called.
But it's bug because it will do page allocation under stop_machine_run().
Here is a report from Alok Kataria.
BUG: sleeping function called from invalid context at kernel/mutex.c:94
in_atomic(): 0, irqs_disabled(): 1, pid: 4, name: migration/0
Pid: 4, comm: migration/0 Not tainted 2.6.35.6-45.fc14.x86_64 #1
Call Trace:
[<ffffffff8103d12b>] __might_sleep+0xeb/0xf0
[<ffffffff81468245>] mutex_lock+0x24/0x50
[<ffffffff8110eaa6>] pcpu_alloc+0x6d/0x7ee
[<ffffffff81048888>] ? load_balance+0xbe/0x60e
[<ffffffff8103a1b3>] ? rt_se_boosted+0x21/0x2f
[<ffffffff8103e1cf>] ? dequeue_rt_stack+0x18b/0x1ed
[<ffffffff8110f237>] __alloc_percpu+0x10/0x12
[<ffffffff81465e22>] setup_zone_pageset+0x38/0xbe
[<ffffffff810d6d81>] ? build_zonelists_node.clone.58+0x79/0x8c
[<ffffffff81452539>] __build_all_zonelists+0x419/0x46c
[<ffffffff8108ef01>] ? cpu_stopper_thread+0xb2/0x198
[<ffffffff8108f075>] stop_machine_cpu_stop+0x8e/0xc5
[<ffffffff8108efe7>] ? stop_machine_cpu_stop+0x0/0xc5
[<ffffffff8108ef57>] cpu_stopper_thread+0x108/0x198
[<ffffffff81467a37>] ? schedule+0x5b2/0x5cc
[<ffffffff8108ee4f>] ? cpu_stopper_thread+0x0/0x198
[<ffffffff81065f29>] kthread+0x7f/0x87
[<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10
[<ffffffff81065eaa>] ? kthread+0x0/0x87
[<ffffffff8100aae0>] ? kernel_thread_helper+0x0/0x10
Built 5 zonelists in Node order, mobility grouping on. Total pages: 289456
Policy zone: Normal
This patch tries to fix the issue by moving setup_zone_pageset() out from
stop_machine_run(). It's obviously not necessary to be called under
stop_machine_run().
[akpm@linux-foundation.org: remove unneeded local]
Reported-by: Alok Kataria <akataria@vmware.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Petr Vandrovec <petr@vmware.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Swap accounting can be configured by CONFIG_CGROUP_MEM_RES_CTLR_SWAP
configuration option and then it is turned on by default. There is a boot
option (noswapaccount) which can disable this feature.
This makes it hard for distributors to enable the configuration option as
this feature leads to a bigger memory consumption and this is a no-go for
general purpose distribution kernel. On the other hand swap accounting
may be very usuful for some workloads.
This patch adds a new configuration option which controls the default
behavior (CGROUP_MEM_RES_CTLR_SWAP_ENABLED). If the option is selected
then the feature is turned on by default.
It also adds a new boot parameter swapaccount[=1|0] which enhances the
original noswapaccount parameter semantic by means of enable/disable logic
(defaults to 1 if no value is provided to be still consistent with
noswapaccount).
The default behavior is unchanged (if CONFIG_CGROUP_MEM_RES_CTLR_SWAP is
enabled then CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED is enabled as well)
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
__mem_cgroup_try_charge() can be called under down_write(&mmap_sem)(e.g.
mlock does it). This means it can cause deadlock if it races with move charge:
Ex.1)
move charge | try charge
--------------------------------------+------------------------------
mem_cgroup_can_attach() | down_write(&mmap_sem)
mc.moving_task = current | ..
mem_cgroup_precharge_mc() | __mem_cgroup_try_charge()
mem_cgroup_count_precharge() | prepare_to_wait()
down_read(&mmap_sem) | if (mc.moving_task)
-> cannot aquire the lock | -> true
| schedule()
Ex.2)
move charge | try charge
--------------------------------------+------------------------------
mem_cgroup_can_attach() |
mc.moving_task = current |
mem_cgroup_precharge_mc() |
mem_cgroup_count_precharge() |
down_read(&mmap_sem) |
.. |
up_read(&mmap_sem) |
| down_write(&mmap_sem)
mem_cgroup_move_task() | ..
mem_cgroup_move_charge() | __mem_cgroup_try_charge()
down_read(&mmap_sem) | prepare_to_wait()
-> cannot aquire the lock | if (mc.moving_task)
| -> true
| schedule()
To avoid this deadlock, we do all the move charge works (both can_attach() and
attach()) under one mmap_sem section.
And after this patch, we set/clear mc.moving_task outside mc.lock, because we
use the lock only to check mc.from/to.
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|