summaryrefslogtreecommitdiffstats
path: root/include/linux/freezer.h
AgeCommit message (Collapse)Author
2013-05-12freezer: add new freezable helpers using freezer_do_not_count()Colin Cross
Freezing tasks will wake up almost every userspace task from where it is blocking and force it to run until it hits a call to try_to_sleep(), generally on the exit path from the syscall it is blocking in. On resume each task will run again, usually restarting the syscall and running until it hits the same blocking call as it was originally blocked in. To allow tasks to avoid running on every suspend/resume cycle, this patch adds additional freezable wrappers around blocking calls that call freezer_do_not_count(). Combined with the previous patch, these tasks will not run during suspend or resume unless they wake up for another reason, in which case they will run until they hit the try_to_freeze() in freezer_count(), and then continue processing the wakeup after tasks are thawed. Additional patches will convert the most common locations that userspace blocks in to use freezable helpers. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-12freezer: convert freezable helpers to static inline where possibleColin Cross
Some of the freezable helpers have to be macros because their condition argument needs to get evaluated every time through the wait loop. Convert the others to static inline to make future changes easier. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-12freezer: convert freezable helpers to freezer_do_not_count()Colin Cross
Freezing tasks will wake up almost every userspace task from where it is blocking and force it to run until it hits a call to try_to_sleep(), generally on the exit path from the syscall it is blocking in. On resume each task will run again, usually restarting the syscall and running until it hits the same blocking call as it was originally blocked in. Convert the existing wait_event_freezable* wrappers to use freezer_do_not_count(). Combined with a previous patch, these tasks will not run during suspend or resume unless they wake up for another reason, in which case they will run until they hit the try_to_freeze() in freezer_count(), and then continue processing the wakeup after tasks are thawed. This results in a small change in behavior, previously a race between freezing and a normal wakeup would be won by the wakeup, now the task will freeze and then handle the wakeup after thawing. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-12lockdep: check that no locks held at freeze timeMandeep Singh Baines
We shouldn't try_to_freeze if locks are held. Holding a lock can cause a deadlock if the lock is later acquired in the suspend or hibernate path (e.g. by dpm). Holding a lock can also cause a deadlock in the case of cgroup_freezer if a lock is held inside a frozen cgroup that is later acquired by a process outside that group. History: This patch was originally applied as 6aa9707099c and reverted in dbf520a9d7d4 because NFS was freezing with locks held. It was deemed better to keep the bad freeze point in NFS to allow laptops to suspend consistently. The previous patch in this series converts NFS to call _unsafe versions of the freezable helpers so that lockdep doesn't complain about them until a more correct fix can be applied. [akpm@linux-foundation.org: export debug_check_no_locks_held] Signed-off-by: Mandeep Singh Baines <msb@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-12freezer: add unsafe versions of freezable helpers for CIFSColin Cross
CIFS calls wait_event_freezekillable_unsafe with a VFS lock held, which is unsafe and will cause lockdep warnings when 6aa9707 "lockdep: check that no locks held at freeze time" is reapplied (it was reverted in dbf520a). CIFS shouldn't be doing this, but it has long-running syscalls that must hold a lock but also shouldn't block suspend. Until CIFS freeze handling is rewritten to use a signal to exit out of the critical section, add a new wait_event_freezekillable_unsafe helper that will not run the lockdep test when 6aa9707 is reapplied, and call it from CIFS. In practice the likley result of holding the lock while freezing is that a second task blocked on the lock will never freeze, aborting suspend, but it is possible to manufacture a case using the cgroup freezer, the lock, and the suspend freezer to create a deadlock. Silencing the lockdep warning here will allow problems to be found in other drivers that may have a more serious deadlock risk, and prevent new problems from being added. Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-12freezer: add unsafe versions of freezable helpers for NFSColin Cross
NFS calls the freezable helpers with locks held, which is unsafe and will cause lockdep warnings when 6aa9707 "lockdep: check that no locks held at freeze time" is reapplied (it was reverted in dbf520a). NFS shouldn't be doing this, but it has long-running syscalls that must hold a lock but also shouldn't block suspend. Until NFS freeze handling is rewritten to use a signal to exit out of the critical section, add new *_unsafe versions of the helpers that will not run the lockdep test when 6aa9707 is reapplied, and call them from NFS. In practice the likley result of holding the lock while freezing is that a second task blocked on the lock will never freeze, aborting suspend, but it is possible to manufacture a case using the cgroup freezer, the lock, and the suspend freezer to create a deadlock. Silencing the lockdep warning here will allow problems to be found in other drivers that may have a more serious deadlock risk, and prevent new problems from being added. Signed-off-by: Colin Cross <ccross@android.com> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-03-31Revert "lockdep: check that no locks held at freeze time"Paul Walmsley
This reverts commit 6aa9707099c4b25700940eb3d016f16c4434360d. Commit 6aa9707099c4 ("lockdep: check that no locks held at freeze time") causes problems with NFS root filesystems. The failures were noticed on OMAP2 and 3 boards during kernel init: [ BUG: swapper/0/1 still has locks held! ] 3.9.0-rc3-00344-ga937536 #1 Not tainted ------------------------------------- 1 lock held by swapper/0/1: #0: (&type->s_umount_key#13/1){+.+.+.}, at: [<c011e84c>] sget+0x248/0x574 stack backtrace: rpc_wait_bit_killable __wait_on_bit out_of_line_wait_on_bit __rpc_execute rpc_run_task rpc_call_sync nfs_proc_get_root nfs_get_root nfs_fs_mount_common nfs_try_mount nfs_fs_mount mount_fs vfs_kern_mount do_mount sys_mount do_mount_root mount_root prepare_namespace kernel_init_freeable kernel_init Although the rootfs mounts, the system is unstable. Here's a transcript from a PM test: http://www.pwsan.com/omap/testlogs/test_v3.9-rc3/20130317194234/pm/37xxevm/37xxevm_log.txt Here's what the test log should look like: http://www.pwsan.com/omap/testlogs/test_v3.8/20130218214403/pm/37xxevm/37xxevm_log.txt Mailing list discussion is here: http://lkml.org/lkml/2013/3/4/221 Deal with this for v3.9 by reverting the problem commit, until folks can figure out the right long-term course of action. Signed-off-by: Paul Walmsley <paul@pwsan.com> Cc: Mandeep Singh Baines <msb@chromium.org> Cc: Jeff Layton <jlayton@redhat.com> Cc: Shawn Guo <shawn.guo@linaro.org> Cc: <maciej.rutecki@gmail.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ben Chan <benchan@chromium.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27lockdep: check that no locks held at freeze timeMandeep Singh Baines
We shouldn't try_to_freeze if locks are held. Holding a lock can cause a deadlock if the lock is later acquired in the suspend or hibernate path (e.g. by dpm). Holding a lock can also cause a deadlock in the case of cgroup_freezer if a lock is held inside a frozen cgroup that is later acquired by a process outside that group. [akpm@linux-foundation.org: export debug_check_no_locks_held] Signed-off-by: Mandeep Singh Baines <msb@chromium.org> Cc: Ben Chan <benchan@chromium.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-09suspend: enable freeze timeout configuration through sysLi Fei
At present, the value of timeout for freezing is 20s, which is meaningless in case that one thread is frozen with mutex locked and another thread is trying to lock the mutex, as this time of freezing will fail unavoidably. And if there is no new wakeup event registered, the system will waste at most 20s for such meaningless trying of freezing. With this patch, the value of timeout can be configured to smaller value, so such meaningless trying of freezing will be aborted in earlier time, and later freezing can be also triggered in earlier time. And more power will be saved. In normal case on mobile phone, it costs real little time to freeze processes. On some platform, it only costs about 20ms to freeze user space processes and 10ms to freeze kernel freezable threads. Signed-off-by: Liu Chuansheng <chuansheng.liu@intel.com> Signed-off-by: Li Fei <fei.li@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-12-12Merge branch 'for-3.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup changes from Tejun Heo: "A lot of activities on cgroup side. The big changes are focused on making cgroup hierarchy handling saner. - cgroup_rmdir() had peculiar semantics - it allowed cgroup destruction to be vetoed by individual controllers and tried to drain refcnt synchronously. The vetoing never worked properly and caused good deal of contortions in cgroup. memcg was the last reamining user. Michal Hocko removed the usage and cgroup_rmdir() path has been simplified significantly. This was done in a separate branch so that the memcg people can base further memcg changes on top. - The above allowed cleaning up cgroup lifecycle management and implementation of generic cgroup iterators which are used to improve hierarchy support. - cgroup_freezer updated to allow migration in and out of a frozen cgroup and handle hierarchy. If a cgroup is frozen, all descendant cgroups are frozen. - netcls_cgroup and netprio_cgroup updated to handle hierarchy properly. - Various fixes and cleanups. - Two merge commits. One to pull in memcg and rmdir cleanups (needed to build iterators). The other pulled in cgroup/for-3.7-fixes for device_cgroup fixes so that further device_cgroup patches can be stacked on top." Fixed up a trivial conflict in mm/memcontrol.c as per Tejun (due to commit bea8c150a7 ("memcg: fix hotplugged memory zone oops") in master touching code close to commit 2ef37d3fe4 ("memcg: Simplify mem_cgroup_force_empty_list error handling") in for-3.8) * 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (65 commits) cgroup: update Documentation/cgroups/00-INDEX cgroup_rm_file: don't delete the uncreated files cgroup: remove subsystem files when remounting cgroup cgroup: use cgroup_addrm_files() in cgroup_clear_directory() cgroup: warn about broken hierarchies only after css_online cgroup: list_del_init() on removed events cgroup: fix lockdep warning for event_control cgroup: move list add after list head initilization netprio_cgroup: allow nesting and inherit config on cgroup creation netprio_cgroup: implement netprio[_set]_prio() helpers netprio_cgroup: use cgroup->id instead of cgroup_netprio_state->prioidx netprio_cgroup: reimplement priomap expansion netprio_cgroup: shorten variable names in extend_netdev_table() netprio_cgroup: simplify write_priomap() netcls_cgroup: move config inheritance to ->css_online() and remove .broken_hierarchy marking cgroup: remove obsolete guarantee from cgroup_task_migrate. cgroup: add cgroup->id cgroup, cpuset: remove cgroup_subsys->post_clone() cgroup: s/CGRP_CLONE_CHILDREN/CGRP_CPUSET_CLONE_CHILDREN/ cgroup: rename ->create/post_create/pre_destroy/destroy() to ->css_alloc/online/offline/free() ...
2012-11-23PM / Freezer: Fixup compile error of try_to_freeze_nowarn()Li Haifeng
If FREEZER is not defined, the error as following will be throw when compiled. arch/arm/kernel/signal.c:645: error: implicit declaration of function 'try_to_freeze_nowarn' Signed-off-by: Haifeng Li <omycle@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-10-26freezer: change ptrace_stop/do_signal_stop to use freezable_schedule()Oleg Nesterov
try_to_freeze_tasks() and cgroup_freezer rely on scheduler locks to ensure that a task doing STOPPED/TRACED -> RUNNING transition can't escape freezing. This mostly works, but ptrace_stop() does not necessarily call schedule(), it can change task->state back to RUNNING and check freezing() without any lock/barrier in between. We could add the necessary barrier, but this patch changes ptrace_stop() and do_signal_stop() to use freezable_schedule(). This fixes the race, freezer_count() and freezer_should_skip() carefully avoid the race. And this simplifies the code, try_to_freeze_tasks/update_if_frozen no longer need to use task_is_stopped_or_traced() checks with the non trivial assumptions. We can rely on the mechanism which was specially designed to mark the sleeping task as "frozen enough". v2: As Tejun pointed out, we can also change get_signal_to_deliver() and move try_to_freeze() up before 'relock' label. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2012-10-16freezer: add missing mb's to freezer_count() and freezer_should_skip()Tejun Heo
A task is considered frozen enough between freezer_do_not_count() and freezer_count() and freezers use freezer_should_skip() to test this condition. This supposedly works because freezer_count() always calls try_to_freezer() after clearing %PF_FREEZER_SKIP. However, there currently is nothing which guarantees that freezer_count() sees %true freezing() after clearing %PF_FREEZER_SKIP when freezing is in progress, and vice-versa. A task can escape the freezing condition in effect by freezer_count() seeing !freezing() and freezer_should_skip() seeing %PF_FREEZER_SKIP. This patch adds smp_mb()'s to freezer_count() and freezer_should_skip() such that either %true freezing() is visible to freezer_count() or !PF_FREEZER_SKIP is visible to freezer_should_skip(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: stable@vger.kernel.org
2012-01-29PM / Hibernate: Fix s2disk regression related to freezing workqueuesRafael J. Wysocki
Commit 2aede851ddf08666f68ffc17be446420e9d2a056 PM / Hibernate: Freeze kernel threads after preallocating memory introduced a mechanism by which kernel threads were frozen after the preallocation of hibernate image memory to avoid problems with frozen kernel threads not responding to memory freeing requests. However, it overlooked the s2disk code path in which the SNAPSHOT_CREATE_IMAGE ioctl was run directly after SNAPSHOT_FREE, which caused freeze_workqueues_begin() to BUG(), because it saw that worqueues had been already frozen. Although in principle this issue might be addressed by removing the relevant BUG_ON() from freeze_workqueues_begin(), that would reintroduce the very problem that commit 2aede851ddf08666f68ffc17be4 attempted to avoid into that particular code path. For this reason, to fix the issue at hand, introduce thaw_kernel_threads() and make the SNAPSHOT_FREE ioctl execute it. Special thanks to Srivatsa S. Bhat for detailed analysis of the problem. Reported-and-tested-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: stable@kernel.org
2011-12-27PM / Freezer: fix return value of freezable_schedule_timeout_killable()Jeff Layton
...it should return the return code from schedule_timeout_killable(), not the one from freezer_count(). All of the current callers ignore the return code so the bug is harmless but it's worth fixing. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-12-08PM / Freezer: Remove the "userspace only" constraint from ↵Srivatsa S. Bhat
freezer[_do_not]_count() At present, the functions freezer_count() and freezer_do_not_count() impose the restriction that they are effective only for userspace processes. However, now, these functions have found more utility than originally intended by the commit which introduced it: ba96a0c8 (freezer: fix vfork problem). And moreover, even the vfork issue actually does not need the above restriction in these functions. So, modify these functions to make them work even for kernel threads, so that they can be used at other places in the kernel, where the userspace restriction doesn't apply. Suggested-by: Oleg Nesterov <oleg@redhat.com> Suggested-by: Tejun Heo <tj@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-12-06Freezer / sunrpc / NFS: don't allow TASK_KILLABLE sleeps to block the freezerJeff Layton
Allow the freezer to skip wait_on_bit_killable sleeps in the sunrpc layer. This should allow suspend and hibernate events to proceed, even when there are RPC's pending on the wire. Also, wrap the TASK_KILLABLE sleeps in NFS layer in freezer_do_not_count and freezer_count calls. This allows the freezer to skip tasks that are sleeping while looping on EJUKEBOX or NFS4ERR_DELAY sorts of errors. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-11-25Freezer: fix more fallout from the thaw_process renameStephen Rothwell
Commit 944e192db53c "freezer: rename thaw_process() to __thaw_task() and simplify the implementation" did not create a !CONFIG_FREEZER version of __thaw_task(). Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-11-23freezer: fix wait_event_freezable/__thaw_task racesOleg Nesterov
wait_event_freezable() and friends stop the waiting if try_to_freeze() fails. This is not right, we can race with __thaw_task() and in this case - wait_event_freezable() returns the wrong ERESTARTSYS - wait_event_freezable_timeout() can return the positive value while condition == F Change the code to always check __retval/condition before return. Note: with or without this patch the timeout logic looks strange, probably we should recalc timeout if try_to_freeze() returns T. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Tejun Heo <tj@kernel.org>
2011-11-23freezer: kill unused set_freezable_with_signal()Tejun Heo
There's no in-kernel user of set_freezable_with_signal() left. Mixing TIF_SIGPENDING with kernel threads can lead to nasty corner cases as kernel threads never travel signal delivery path on their own. e.g. the current implementation is buggy in the cancelation path of __thaw_task(). It calls recalc_sigpending_and_wake() in an attempt to clear TIF_SIGPENDING but the function never clears it regardless of sigpending state. This means that signallable freezable kthreads may continue executing with !freezing() && stuck TIF_SIGPENDING, which can be troublesome. This patch removes set_freezable_with_signal() along with PF_FREEZER_NOSIG and recalc_sigpending*() calls in freezer. User tasks get TIF_SIGPENDING, kernel tasks get woken up and the spurious sigpending is dealt with in the usual signal delivery path. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Oleg Nesterov <oleg@redhat.com>
2011-11-21freezer: remove unused @sig_only from freeze_task()Tejun Heo
After "freezer: make freezing() test freeze conditions in effect instead of TIF_FREEZE", freezing() returns authoritative answer on whether the current task should freeze or not and freeze_task() doesn't need or use @sig_only. Remove it. While at it, rewrite function comment for freeze_task() and rename @sig_only to @user_only in try_to_freeze_tasks(). This patch doesn't cause any functional change. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Oleg Nesterov <oleg@redhat.com>
2011-11-21freezer: fix set_freezable[_with_signal]() raceTejun Heo
A kthread doing set_freezable*() may race with on-going PM freeze and the freezer might think all tasks are frozen while the new freezable kthread is merrily proceeding to execute code paths which aren't supposed to be executing during PM freeze. Reimplement set_freezable[_with_signal]() using __set_freezable() such that freezable PF flags are modified under freezer_lock and try_to_freeze() is called afterwards. This eliminates race condition against freezing. Note: Separated out from larger patch to resolve fix order dependency Oleg pointed out. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com>
2011-11-21freezer: remove should_send_signal() and update frozen()Tejun Heo
should_send_signal() is only used in freezer.c. Exporting them only increases chance of abuse. Open code the two users and remove it. Update frozen() to return bool. Signed-off-by: Tejun Heo <tj@kernel.org>
2011-11-21freezer: make freezing() test freeze conditions in effect instead of TIF_FREEZETejun Heo
Using TIF_FREEZE for freezing worked when there was only single freezing condition (the PM one); however, now there is also the cgroup_freezer and single bit flag is getting clumsy. thaw_processes() is already testing whether cgroup freezing in in effect to avoid thawing tasks which were frozen by both PM and cgroup freezers. This is racy (nothing prevents race against cgroup freezing) and fragile. A much simpler way is to test actual freeze conditions from freezing() - ie. directly test whether PM or cgroup freezing is in effect. This patch adds variables to indicate whether and what type of freezing conditions are in effect and reimplements freezing() such that it directly tests whether any of the two freezing conditions is active and the task should freeze. On fast path, freezing() is still very cheap - it only tests system_freezing_cnt. This makes the clumsy dancing aroung TIF_FREEZE unnecessary and freeze/thaw operations more usual - updating state variables for the new state and nudging target tasks so that they notice the new state and comply. As long as the nudging happens after state update, it's race-free. * This allows use of freezing() in freeze_task(). Replace the open coded tests with freezing(). * p != current test is added to warning printing conditions in try_to_freeze_tasks() failure path. This is necessary as freezing() is now true for the task which initiated freezing too. -v2: Oleg pointed out that re-freezing FROZEN cgroup could increment system_freezing_cnt. Fixed. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Paul Menage <paul@paulmenage.org> (for the cgroup portions)
2011-11-21cgroup_freezer: prepare for removal of TIF_FREEZETejun Heo
TIF_FREEZE will be removed soon and freezing() will directly test whether any freezing condition is in effect. Make the following changes in preparation. * Rename cgroup_freezing_or_frozen() to cgroup_freezing() and make it return bool. * Make cgroup_freezing() access task_freezer() under rcu read lock instead of task_lock(). This makes the state dereferencing racy against task moving to another cgroup; however, it was already racy without this change as ->state dereference wasn't synchronized. This will be later dealt with using attach hooks. * freezer->state is now set before trying to push tasks into the target state. -v2: Oleg pointed out that freeze_change_state() was setting freeze->state incorrectly to CGROUP_FROZEN instead of CGROUP_FREEZING. Fixed. -v3: Matt pointed out that setting CGROUP_FROZEN used to always invoke try_to_freeze_cgroup() regardless of the current state. Patch updated such that the actual freeze/thaw operations are always performed on invocation. This shouldn't make any difference unless something is broken. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Paul Menage <paul@paulmenage.org> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Oleg Nesterov <oleg@redhat.com>
2011-11-21freezer: clean up freeze_processes() failure pathTejun Heo
freeze_processes() failure path is rather messy. Freezing is canceled for workqueues and tasks which aren't frozen yet but frozen tasks are left alone and should be thawed by the caller and of course some callers (xen and kexec) didn't do it. This patch updates __thaw_task() to handle cancelation correctly and makes freeze_processes() and freeze_kernel_threads() call thaw_processes() on failure instead so that the system is fully thawed on failure. Unnecessary [suspend_]thaw_processes() calls are removed from kernel/power/hibernate.c, suspend.c and user.c. While at it, restructure error checking if clause in suspend_prepare() to be less weird. -v2: Srivatsa spotted missing removal of suspend_thaw_processes() in suspend_prepare() and error in commit message. Updated. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
2011-11-21freezer: rename thaw_process() to __thaw_task() and simplify the implementationTejun Heo
thaw_process() now has only internal users - system and cgroup freezers. Remove the unnecessary return value, rename, unexport and collapse __thaw_process() into it. This will help further updates to the freezer code. -v3: oom_kill grew a use of thaw_process() while this patch was pending. Convert it to use __thaw_task() for now. In the longer term, this should be handled by allowing tasks to die if killed even if it's frozen. -v2: minor style update as suggested by Matt. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Paul Menage <menage@google.com> Cc: Matt Helsley <matthltc@us.ibm.com>
2011-11-21freezer: implement and use kthread_freezable_should_stop()Tejun Heo
Writeback and thinkpad_acpi have been using thaw_process() to prevent deadlock between the freezer and kthread_stop(); unfortunately, this is inherently racy - nothing prevents freezing from happening between thaw_process() and kthread_stop(). This patch implements kthread_freezable_should_stop() which enters refrigerator if necessary but is guaranteed to return if kthread_stop() is invoked. Both thaw_process() users are converted to use the new function. Note that this deadlock condition exists for many of freezable kthreads. They need to be converted to use the new should_stop or freezable workqueue. Tested with synthetic test case. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Henrique de Moraes Holschuh <ibm-acpi@hmh.eng.br> Cc: Jens Axboe <axboe@kernel.dk> Cc: Oleg Nesterov <oleg@redhat.com>
2011-11-21freezer: unexport refrigerator() and update try_to_freeze() slightlyTejun Heo
There is no reason to export two functions for entering the refrigerator. Calling refrigerator() instead of try_to_freeze() doesn't save anything noticeable or removes any race condition. * Rename refrigerator() to __refrigerator() and make it return bool indicating whether it scheduled out for freezing. * Update try_to_freeze() to return bool and relay the return value of __refrigerator() if freezing(). * Convert all refrigerator() users to try_to_freeze(). * Update documentation accordingly. * While at it, add might_sleep() to try_to_freeze(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: Chris Mason <chris.mason@oracle.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp> Cc: Christoph Hellwig <hch@infradead.org>
2011-11-04PM / Freezer: Reimplement wait_event_freezekillable using ↵Oleg Nesterov
freezer_do_not_count/freezer_count Commit 27920651fe "PM / Freezer: Make fake_signal_wake_up() wake TASK_KILLABLE tasks too" updated fake_signal_wake_up() used by freezer to wake up KILLABLE tasks. Sending unsolicited wakeups to tasks in killable sleep is dangerous as there are code paths which depend on tasks not waking up spuriously from KILLABLE sleep. For example. sys_read() or page can sleep in TASK_KILLABLE assuming that wait/down/whatever _killable can only fail if we can not return to the usermode. TASK_TRACED is another obvious example. The offending commit was to resolve freezer hang during system PM operations caused by KILLABLE sleeps in network filesystems. wait_event_freezekillable(), which depends on the spurious KILLABLE wakeup, was added by f06ac72e92 "cifs, freezer: add wait_event_freezekillable and have cifs use it" to be used to implement killable & freezable sleeps in network filesystems. To prepare for reverting of 27920651fe, this patch reimplements wait_event_freezekillable() using freezer_do_not_count/freezer_count() so that it doesn't depend on the spurious KILLABLE wakeup. This isn't very nice but should do for now. [tj: Refreshed patch to apply to linus/master and updated commit description on Rafael's request.] Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-10-28Merge branch '3.2-without-smb2' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
* '3.2-without-smb2' of git://git.samba.org/sfrench/cifs-2.6: (52 commits) Fix build break when freezer not configured Add definition for share encryption CIFS: Make cifs_push_locks send as many locks at once as possible CIFS: Send as many mandatory unlock ranges at once as possible CIFS: Implement caching mechanism for posix brlocks CIFS: Implement caching mechanism for mandatory brlocks CIFS: Fix DFS handling in cifs_get_file_info CIFS: Fix error handling in cifs_readv_complete [CIFS] Fixup trivial checkpatch warning [CIFS] Show nostrictsync and noperm mount options in /proc/mounts cifs, freezer: add wait_event_freezekillable and have cifs use it cifs: allow cifs_max_pending to be readable under /sys/module/cifs/parameters cifs: tune bdi.ra_pages in accordance with the rsize cifs: allow for larger rsize= options and change defaults cifs: convert cifs_readpages to use async reads cifs: add cifs_async_readv cifs: fix protocol definition for READ_RSP cifs: add a callback function to receive the rest of the frame cifs: break out 3rd receive phase into separate function cifs: find mid earlier in receive codepath ...
2011-10-27Fix build break when freezer not configuredSteve French
fs/cifs/transport.c: In function 'wait_for_response': fs/cifs/transport.c:328: error: implicit declaration of function 'wait_event_freezekillable' Caused by commit f06ac72e9291 ("cifs, freezer: add wait_event_freezekillable and have cifs use it"). In this config, CONFIG_FREEZER is not set. Reviewed-by: Shirish Pargaonkar <shirishp@us.ibm.com> CC: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
2011-10-19[CIFS] Fixup trivial checkpatch warningSteve French
Signed-off-by: Steve French <smfrench@gmail.com>
2011-10-19cifs, freezer: add wait_event_freezekillable and have cifs use itJeff Layton
CIFS currently uses wait_event_killable to put tasks to sleep while they await replies from the server. That function though does not allow the freezer to run. In many cases, the network interface may be going down anyway, in which case the reply will never come. The client then ends up blocking the computer from suspending. Fix this by adding a new wait_event_freezable variant -- wait_event_freezekillable. The idea is to combine the behavior of wait_event_killable and wait_event_freezable -- put the task to sleep and only allow it to be awoken by fatal signals, but also allow the freezer to do its job. Signed-off-by: Jeff Layton <jlayton@redhat.com>
2011-10-16PM / Hibernate: Freeze kernel threads after preallocating memoryRafael J. Wysocki
There is a problem with the current ordering of hibernate code which leads to deadlocks in some filesystems' memory shrinkers. Namely, some filesystems use freezable kernel threads that are inactive when the hibernate memory preallocation is carried out. Those same filesystems use memory shrinkers that may be triggered by the hibernate memory preallocation. If those memory shrinkers wait for the frozen kernel threads, the hibernate process deadlocks (this happens with XFS, for one example). Apparently, it is not technically viable to redesign the filesystems in question to avoid the situation described above, so the only possible solution of this issue is to defer the freezing of kernel threads until the hibernate memory preallocation is done, which is implemented by this change. Unfortunately, this requires the memory preallocation to be done before the "prepare" stage of device freeze, so after this change the only way drivers can allocate additional memory for their freeze routines in a clean way is to use PM notifiers. Reported-by: Christoph <cr2005@u-club.de> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-02-16workqueue, freezer: unify spelling of 'freeze' + 'able' to 'freezable'Tejun Heo
There are two spellings in use for 'freeze' + 'able' - 'freezable' and 'freezeable'. The former is the more prominent one. The latter is mostly used by workqueue and in a few other odd places. Unify the spelling to 'freezable'. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Dmitry Torokhov <dtor@mail.ru> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Alex Dubov <oakad@yahoo.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Steven Whitehouse <swhiteho@redhat.com>
2010-03-26Freezer: Fix buggy resume test for tasks frozen with cgroup freezerMatt Helsley
When the cgroup freezer is used to freeze tasks we do not want to thaw those tasks during resume. Currently we test the cgroup freezer state of the resuming tasks to see if the cgroup is FROZEN. If so then we don't thaw the task. However, the FREEZING state also indicates that the task should remain frozen. This also avoids a problem pointed out by Oren Ladaan: the freezer state transition from FREEZING to FROZEN is updated lazily when userspace reads or writes the freezer.state file in the cgroup filesystem. This means that resume will thaw tasks in cgroups which should be in the FROZEN state if there is no read/write of the freezer.state file to trigger this transition before suspend. NOTE: Another "simple" solution would be to always update the cgroup freezer state during resume. However it's a bad choice for several reasons: Updating the cgroup freezer state is somewhat expensive because it requires walking all the tasks in the cgroup and checking if they are each frozen. Worse, this could easily make resume run in N^2 time where N is the number of tasks in the cgroup. Finally, updating the freezer state from this code path requires trickier locking because of the way locks must be ordered. Instead of updating the freezer state we rely on the fact that lazy updates only manage the transition from FREEZING to FROZEN. We know that a cgroup with the FREEZING state may actually be FROZEN so test for that state too. This makes sense in the resume path even for partially-frozen cgroups -- those that really are FREEZING but not FROZEN. Reported-by: Oren Ladaan <orenl@cs.columbia.edu> Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Cc: stable@kernel.org Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2008-10-30freezer_cg: use thaw_process() in unfreeze_cgroup()Li Zefan
Don't duplicate the implementation of thaw_process(). [akpm@linux-foundation.org: make __thaw_process() static] Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Acked-by: Matt Helsley <matthltc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20container freezer: implement freezer cgroup subsystemMatt Helsley
This patch implements a new freezer subsystem in the control groups framework. It provides a way to stop and resume execution of all tasks in a cgroup by writing in the cgroup filesystem. The freezer subsystem in the container filesystem defines a file named freezer.state. Writing "FROZEN" to the state file will freeze all tasks in the cgroup. Subsequently writing "RUNNING" will unfreeze the tasks in the cgroup. Reading will return the current state. * Examples of usage : # mkdir /containers/freezer # mount -t cgroup -ofreezer freezer /containers # mkdir /containers/0 # echo $some_pid > /containers/0/tasks to get status of the freezer subsystem : # cat /containers/0/freezer.state RUNNING to freeze all tasks in the container : # echo FROZEN > /containers/0/freezer.state # cat /containers/0/freezer.state FREEZING # cat /containers/0/freezer.state FROZEN to unfreeze all tasks in the container : # echo RUNNING > /containers/0/freezer.state # cat /containers/0/freezer.state RUNNING This is the basic mechanism which should do the right thing for user space task in a simple scenario. It's important to note that freezing can be incomplete. In that case we return EBUSY. This means that some tasks in the cgroup are busy doing something that prevents us from completely freezing the cgroup at this time. After EBUSY, the cgroup will remain partially frozen -- reflected by freezer.state reporting "FREEZING" when read. The state will remain "FREEZING" until one of these things happens: 1) Userspace cancels the freezing operation by writing "RUNNING" to the freezer.state file 2) Userspace retries the freezing operation by writing "FROZEN" to the freezer.state file (writing "FREEZING" is not legal and returns EIO) 3) The tasks that blocked the cgroup from entering the "FROZEN" state disappear from the cgroup's set of tasks. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: export thaw_process] Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Acked-by: Serge E. Hallyn <serue@us.ibm.com> Tested-by: Matt Helsley <matthltc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20container freezer: make refrigerator always availableMatt Helsley
Now that the TIF_FREEZE flag is available in all architectures, extract the refrigerator() and freeze_task() from kernel/power/process.c and make it available to all. The refrigerator() can now be used in a control group subsystem implementing a control group freezer. Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Acked-by: Serge E. Hallyn <serue@us.ibm.com> Tested-by: Matt Helsley <matthltc@us.ibm.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-16Freezer: Introduce PF_FREEZER_NOSIGRafael J. Wysocki
The freezer currently attempts to distinguish kernel threads from user space tasks by checking if their mm pointer is unset and it does not send fake signals to kernel threads. However, there are kernel threads, mostly related to networking, that behave like user space tasks and may want to be sent a fake signal to be frozen. Introduce the new process flag PF_FREEZER_NOSIG that will be set by default for all kernel threads and make the freezer only send fake signals to the tasks having PF_FREEZER_NOSIG unset. Provide the set_freezable_with_signal() function to be called by the kernel threads that want to be sent a fake signal for freezing. This patch should not change the freezer's observable behavior. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Len Brown <len.brown@intel.com>
2007-10-18freezer: introduce freezer-friendly waiting macrosRafael J. Wysocki
Introduce freezer-friendly wrappers around wait_event_interruptible() and wait_event_interruptible_timeout(), originally defined in <linux/wait.h>, to be used in freezable kernel threads. Make some of the freezable kernel threads use them. This is necessary for the freezer to stop sending signals to kernel threads, which is implemented in the next patch. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@nigel.suspend2.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-29Introduce CONFIG_SUSPEND for suspend-to-Ram and standbyRafael J. Wysocki
Introduce CONFIG_SUSPEND representing the ability to enter system sleep states, such as the ACPI S3 state, and allow the user to choose SUSPEND and HIBERNATION independently of each other. Make HOTPLUG_CPU be selected automatically if SUSPEND or HIBERNATION has been chosen and the kernel is intended for SMP systems. Also, introduce CONFIG_PM_SLEEP which is automatically selected if CONFIG_SUSPEND or CONFIG_HIBERNATION is set and use it to select the code needed for both suspend and hibernation. The top-level power management headers and the ACPI code related to suspend and hibernation are modified to use the new definitions (the changes in drivers/acpi/sleep/main.c are, mostly, moving code to reduce the number of ifdefs). There are many other files in which CONFIG_PM can be replaced with CONFIG_PM_SLEEP or even with CONFIG_SUSPEND, but they can be updated in the future. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19Freezer: avoid freezing kernel threads prematurelyRafael J. Wysocki
Kernel threads should not have TIF_FREEZE set when user space processes are being frozen, since otherwise some of them might be frozen prematurely. To prevent this from happening we can (1) make exit_mm() unset TIF_FREEZE unconditionally just after clearing tsk->mm and (2) make try_to_freeze_tasks() check if p->mm is different from zero and PF_BORROWED_MM is unset in p->flags when user space processes are to be frozen. Namely, when user space processes are being frozen, we only should set TIF_FREEZE for tasks that have p->mm different from NULL and don't have PF_BORROWED_MM set in p->flags. For this reason task_lock() must be used to prevent try_to_freeze_tasks() from racing with use_mm()/unuse_mm(), in which p->mm and p->flags.PF_BORROWED_MM are changed under task_lock(p). Also, we need to prevent the following scenario from happening: * daemonize() is called by a task spawned from a user space code path * freezer checks if the task has p->mm set and the result is positive * task enters exit_mm() and clears its TIF_FREEZE * freezer sets TIF_FREEZE for the task * task calls try_to_freeze() and goes to the refrigerator, which is wrong at that point This requires us to acquire task_lock(p) before p->flags.PF_BORROWED_MM and p->mm are examined and release it after TIF_FREEZE is set for p (or it turns out that TIF_FREEZE should not be set). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@nigel.suspend2.net> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17Freezer: make kernel threads nonfreezable by defaultRafael J. Wysocki
Currently, the freezer treats all tasks as freezable, except for the kernel threads that explicitly set the PF_NOFREEZE flag for themselves. This approach is problematic, since it requires every kernel thread to either set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't care for the freezing of tasks at all. It seems better to only require the kernel threads that want to or need to be frozen to use some freezer-related code and to remove any freezer-related code from the other (nonfreezable) kernel threads, which is done in this patch. The patch causes all kernel threads to be nonfreezable by default (ie. to have PF_NOFREEZE set by default) and introduces the set_freezable() function that should be called by the freezable kernel threads in order to unset PF_NOFREEZE. It also makes all of the currently freezable kernel threads call set_freezable(), so it shouldn't cause any (intentional) change of behaviour to appear. Additionally, it updates documentation to describe the freezing of tasks more accurately. [akpm@linux-foundation.org: build fixes] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net> Cc: Pavel Machek <pavel@ucw.cz> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-23freezer: move frozen_process() to kernel/power/process.cGautham R Shenoy
Other than refrigerator, no one else calls frozen_process(). So move it from include/linux/freezer.h to kernel/power/process.c. Also, since a task can be marked as frozen by itself, we don't need to pass the (struct task_struct *p) parameter to frozen_process(). Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-23freezer: fix PF_NOFREEZE vs freezeable raceGautham R Shenoy
This patch fixes the race pointed out by Oleg Nesterov. * Freezer marks a thread as freezeable. * The thread now marks itself PF_NOFREEZE, but it will be frozen on on calling try_to_freeze(). Thus the task is frozen, even though it doesn't want to. * Subsequent thaw_processes() will also fail to thaw the task since it is marked PF_NOFREEZE. Avoid this problem by checking the task's PF_NOFREEZE status in frozen_processes() before marking the task as frozen. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-23freezer: fix vfork problemRafael J. Wysocki
Currently try_to_freeze_tasks() has to wait until all of the vforked processes exit and for this reason every user can make it fail. To fix this problem we can introduce the additional process flag PF_FREEZER_SKIP to be used by tasks that do not want to be counted as freezable by the freezer and want to have TIF_FREEZE set nevertheless. Then, this flag can be set by tasks using sys_vfork() before they call wait_for_completion(&vfork) and cleared after they have woken up. After clearing it, the tasks should call try_to_freeze() as soon as possible. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-23freezer: close potential race between refrigerator and thaw_tasksRafael J. Wysocki
If the freezing of tasks fails and a task is preempted in refrigerator() before calling frozen_process(), then thaw_tasks() may run before this task is frozen. In that case the task will freeze and no one will thaw it. To fix this race we can call freezing(current) in refrigerator() along with frozen_process(current) under the task_lock() which also should be taken in the error path of try_to_freeze_tasks() as well as in thaw_process(). Moreover, if thaw_process() additionally clears TIF_FREEZE for tasks that are not frozen, we can be sure that all tasks are thawed and there are no pending "freeze" requests after thaw_tasks() has run. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2006-12-13[PATCH] PM: Fix SMP races in the freezerRafael J. Wysocki
Currently, to tell a task that it should go to the refrigerator, we set the PF_FREEZE flag for it and send a fake signal to it. Unfortunately there are two SMP-related problems with this approach. First, a task running on another CPU may be updating its flags while the freezer attempts to set PF_FREEZE for it and this may leave the task's flags in an inconsistent state. Second, there is a potential race between freeze_process() and refrigerator() in which freeze_process() running on one CPU is reading a task's PF_FREEZE flag while refrigerator() running on another CPU has just set PF_FROZEN for the same task and attempts to reset PF_FREEZE for it. If the refrigerator wins the race, freeze_process() will state that PF_FREEZE hasn't been set for the task and will set it unnecessarily, so the task will go to the refrigerator once again after it's been thawed. To solve first of these problems we need to stop using PF_FREEZE to tell tasks that they should go to the refrigerator. Instead, we can introduce a special TIF_*** flag and use it for this purpose, since it is allowed to change the other tasks' TIF_*** flags and there are special calls for it. To avoid the freeze_process()-refrigerator() race we can make freeze_process() to always check the task's PF_FROZEN flag after it's read its "freeze" flag. We should also make sure that refrigerator() will always reset the task's "freeze" flag after it's set PF_FROZEN for it. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Russell King <rmk@arm.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: Andi Kleen <ak@muc.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>