path: root/kernel
Age | Commit message | Author
2008-03-07 | sched: rt-group: fixup schedulability constraints calculation | Peter Zijlstra
It was only possible to configure the rt-group scheduling parameters beyond the default value in a very small range. That's because div64_64() has a different calling convention than do_div() :/ Fix a few untidies while we are here; sysctl_sched_rt_period may overflow due to that multiplication, so cast to u64 first. Also, that RUNTIME_INF juggling makes little sense, although it's an effective NOP. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
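A minimal userspace sketch of the two pitfalls this message names, under stated assumptions: NS_PER_US is a hypothetical 32-bit stand-in for the kernel's microsecond-to-nanosecond factor, and model_do_div() and model_div64_64() are plain-C models of the two calling conventions (do_div() writes the quotient back into its first argument and returns the remainder; div64_64() returns the quotient). It is not the code from the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the us -> ns factor; 32-bit on purpose to show the wrap. */
#define NS_PER_US 1000u

/* Buggy: the product is computed in 32 bits and wraps before it is widened. */
uint64_t period_ns_wrapping(uint32_t period_us)
{
    return (uint64_t)(period_us * NS_PER_US);
}

/* Fixed: widen to 64 bits first, then multiply. */
uint64_t period_ns(uint32_t period_us)
{
    return (uint64_t)period_us * NS_PER_US;
}

/* Model of do_div(): the quotient is stored through n, the remainder is returned. */
uint32_t model_do_div(uint64_t *n, uint32_t base)
{
    uint32_t rem = (uint32_t)(*n % base);

    *n /= base;
    return rem;
}

/* Model of div64_64(): the quotient is the return value, arguments are untouched. */
uint64_t model_div64_64(uint64_t dividend, uint64_t divisor)
{
    return dividend / divisor;
}
```

Treating the return value of one helper as if it came from the other silently swaps quotient and remainder, which is one way a ratio calculation can end up accepting only a very narrow range of inputs.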
2008-03-07 | sched: fix the wrong time slice value for SCHED_FIFO tasks | Miao Xie
Function sys_sched_rr_get_interval returns wrong time slice value for SCHED_FIFO tasks. The time slice for SCHED_FIFO tasks should be 0. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
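The rule stated above can be sketched in a few lines. rt_slice_ms() is a hypothetical helper, not a kernel function, and it only models the two real-time policies; how the slice of other policies is computed is outside this sketch.

```c
#include <sched.h>

/*
 * Hypothetical helper: the interval to report for a real-time policy.
 * Only meaningful for SCHED_FIFO and SCHED_RR.
 */
unsigned int rt_slice_ms(int policy, unsigned int rr_slice_ms)
{
    /* Round-robin tasks rotate after one slice; FIFO tasks have no slice, so report 0. */
    return policy == SCHED_RR ? rr_slice_ms : 0;
}
```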
2008-03-07 | sched: export task_nice | Pavel Roskin
The API is trivial, and so is the implementation. Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-07 | sched: balance RT task resched only on runqueue | Steven Rostedt
Sripathi Kodi reported a crash in the -rt kernel: https://bugzilla.redhat.com/show_bug.cgi?id=435674 This is due to a place that can reschedule a task without holding the task's runqueue lock. This was caused by the RT balancing code that pulls RT tasks to the current run queue and will reschedule the current task. There's a slight chance that the pulling of the RT tasks will release the current runqueue's lock and retake it (in the double_lock_balance). During the time that the runqueue lock is released, the current task can migrate to another runqueue. In the prio_changed_rt code, after the pull, if the current task is of lesser priority than one of the RT tasks pulled, resched_task is called on the current task. If the current task had migrated in that small window, resched_task will be called without holding the runqueue lock for the runqueue that the task is on. This race condition also exists in the mainline kernel and this patch adds a check to make sure the task hasn't migrated before calling resched_task. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Tested-by: Sripathi Kodi <sripathik@in.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-07 | sched: retain vruntime | Peter Zijlstra
Kei Tokunaga reported an interactivity problem when moving tasks between control groups. Tasks would retain their old vruntime when moved between groups; this can cause funny lags. Re-set the vruntime on group move to fit within the new tree. Reported-by: Kei Tokunaga <tokunaga.keiich@jp.fujitsu.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-05 | cpusets: fix obsolete comment | David Rientjes
mm migration is no longer done in cpuset_update_task_memory_state() so it can no longer take current->mm->mmap_sem, so fix the obsolete comment. [ This changed in commit 04c19fa6f16047abff2288ddbc1f0798ede5a849 ("cpuset: migrate all tasks in cpuset at once") when the mm migration was moved from cpuset_update_task_memory_state() to update_nodemask() ] Signed-off-by: David Rientjes <rientjes@google.com> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04 | module: allow ndiswrapper to use GPL-only symbols | Pavel Roskin
A change after 2.6.24 broke ndiswrapper by accidentally removing its access to GPL-only symbols. Revert that change and add comments about the reasons why ndiswrapper and driverloader are treated in a special way. Signed-off-by: Pavel Roskin <proski@gnu.org> Acked-by: Greg KH <gregkh@suse.de> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jon Masters <jonathan@jonmasters.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04 | kprobes: fix a null pointer bug in register_kretprobe() | Masami Hiramatsu
Fix a bug in register_kretprobe() which does not check rp->kp.symbol_name == NULL before calling kprobe_lookup_name. For maintainability, this introduces the kprobe_addr helper function, which resolves the addr field. It is used by register_kprobe and register_kretprobe. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04 | markers: don't risk NULL deref in marker | Jesper Juhl
get_marker() may return NULL, so test for it. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04 | Kprobes: indicate kretprobe support in Kconfig | Ananth N Mavinakayanahalli
Add CONFIG_HAVE_KRETPROBES to the arch/<arch>/Kconfig file for relevant architectures with kprobes support. This facilitates easy handling of in-kernel modules (like samples/kprobes/kretprobe_example.c) that depend on kretprobes being present in the kernel. Thanks to Sam Ravnborg for helping make the patch more lean. Per Mathieu's suggestion, added CONFIG_KRETPROBES and fixed up dependencies. Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-04 | Memory Resource Controller use strstrip while parsing arguments | Balbir Singh
The memory controller has a requirement that while writing values, we need to use echo -n. This patch fixes the problem and makes the UI more consistent. Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
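Why stripping removes the need for echo -n can be shown with a userspace sketch. strip() below is a stand-in that mimics what the kernel's strstrip() does (trim trailing whitespace in place, return a pointer past leading whitespace); parse_limit() is a hypothetical parser, not the memory controller's actual write handler.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for strstrip(): trims in place, returns the first non-blank char. */
char *strip(char *s)
{
    size_t len = strlen(s);

    while (len > 0 && isspace((unsigned char)s[len - 1]))
        s[--len] = '\0';
    while (*s && isspace((unsigned char)*s))
        s++;
    return s;
}

/* Hypothetical parser: returns 0 on success, -1 if anything but a number remains. */
int parse_limit(char *buf, unsigned long long *out)
{
    char *end;
    char *s = strip(buf);

    if (*s == '\0')
        return -1;
    *out = strtoull(s, &end, 10);
    return *end == '\0' ? 0 : -1;
}
```

With the strip step in place, the trailing newline that a plain echo appends no longer counts as trailing garbage, so "4096\n" and "4096" parse identically.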
2008-03-04 | cgroup: fix default notify_on_release setting | Li Zefan
The documentation says the default value of notify_on_release of a child cgroup is inherited from its parent, which is reasonable, but the implementation just sets the flag disabled. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
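The documented behaviour amounts to copying one flag at creation time. The struct and function below are a hypothetical, heavily reduced model and do not reflect the real cgroup data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced model of a cgroup. */
struct group {
    struct group *parent;
    bool notify_on_release;
};

/* Initialise a child so that it starts with its parent's setting. */
void group_init_child(struct group *child, struct group *parent)
{
    child->parent = parent;
    /* Assumption of this sketch: a group without a parent starts disabled. */
    child->notify_on_release = parent ? parent->notify_on_release : false;
}
```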
2008-03-04 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel:
    sched: revert load_balance_monitor() changes
2008-03-04 | sched: revert load_balance_monitor() changes | Peter Zijlstra
The following commits cause a number of regressions:

    commit 58e2d4ca581167c2a079f4ee02be2f0bc52e8729
    Author: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
    Date: Fri Jan 25 21:08:00 2008 +0100

        sched: group scheduling, change how cpu load is calculated

    commit 6b2d7700266b9402e12824e11e0099ae6a4a6a79
    Author: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
    Date: Fri Jan 25 21:08:00 2008 +0100

        sched: group scheduler, fix fairness of cpu bandwidth allocation for task groups

Namely:
 - very frequent wakeups on SMP, reported by PowerTop users.
 - cacheline thrashing on (large) SMP
 - some latencies larger than 500ms

While there is a mergeable patch to fix the latter, the former issues are not fixable in a manner suitable for .25 (we're at -rc3 now). Hence we revert them and try again in v2.6.26.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Tested-by: Alexey Zaytsev <alexey.zaytsev@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-04 | freezer vs stopped or traced | Roland McGrath
This changes the "freezer" code used by suspend/hibernate in its treatment of tasks in TASK_STOPPED (job control stop) and TASK_TRACED (ptrace) states. As I understand it, the intent of the "freezer" is to hold all tasks from doing anything significant. For this purpose, TASK_STOPPED and TASK_TRACED are "frozen enough". It's possible the tasks might resume from ptrace calls (if the tracer were unfrozen) or from signals (including ones that could come via timer interrupts, etc). But this doesn't matter as long as they quickly block again while "freezing" is in effect. Some minor adjustments to the signal.c code make sure that try_to_freeze() very shortly follows all wakeups from both kinds of stop. This lets the freezer code safely leave stopped tasks unmolested. Changing this fixes the longstanding bug of seeing after resuming from suspend/hibernate your shell report "[1] Stopped" and the like for all your jobs stopped by ^Z et al, as if you had freshly fg'd and ^Z'd them. It also removes from the freezer the arcane special case treatment for ptrace'd tasks, which relied on intimate knowledge of ptrace internals. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-03 | exit_notify: fix kill_orphaned_pgrp() usage with mt exit | Oleg Nesterov
1. exit_notify() always calls kill_orphaned_pgrp(). This is wrong, we should do this only when the whole process exits.
2. exit_notify() uses "current" as "ignored_task", obviously wrong. Use ->group_leader instead.

Test case:

    void hup(int sig)
    {
        printf("HUP received\n");
    }

    void *tfunc(void *arg)
    {
        sleep(2);
        printf("sub-thread exited\n");
        return NULL;
    }

    int main(int argc, char *argv[])
    {
        if (!fork()) {
            signal(SIGHUP, hup);
            kill(getpid(), SIGSTOP);
            exit(0);
        }

        pthread_t thr;
        pthread_create(&thr, NULL, tfunc, NULL);

        sleep(1);
        printf("main thread exited\n");
        syscall(__NR_exit, 0);

        return 0;
    }

output:

    main thread exited
    HUP received
    Hangup

With this patch the output is:

    main thread exited
    sub-thread exited
    HUP received

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-03 | will_become_orphaned_pgrp: partially fix insufficient ->exit_state check | Oleg Nesterov
p->exit_state != 0 doesn't mean this process is dead, it may have sub-threads. Change the code to use "p->exit_state && thread_group_empty(p)" instead. Without this patch, ^Z doesn't deliver SIGTSTP to the foreground process if the main thread has exited. However, the new check is not perfect either. There is a window when exit_notify() drops tasklist and before release_task(). Suppose that the last (non-leader) thread exits. This means that entire group exits, but thread_group_empty() is not true yet. As Eric pointed out, is_global_init() is wrong as well, but I did not dare to do other changes. Just for the record, has_stopped_jobs() is absolutely wrong too. But we can't fix it now, we should first fix SIGNAL_STOP_STOPPED issues. Even with this patch ^Z doesn't play well with the dead main thread. The task is stopped correctly but do_wait(WSTOPPED) won't see it. This is another unrelated issue, will be (hopefully) fixed separately. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
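The difference between the old and the new check can be modelled in userspace. struct task_model and both functions are hypothetical; in particular, live_subthreads == 0 merely stands in for thread_group_empty(p), and the window described above (where the last thread has exited but the group is not yet empty) is not modelled.

```c
#include <stdbool.h>

/* Hypothetical model: just the two facts the check looks at. */
struct task_model {
    int exit_state;       /* non-zero once this thread has exited */
    int live_subthreads;  /* other threads still alive in the group */
};

/* Old check: treats the process as dead as soon as the leader has exited. */
bool old_check_dead(const struct task_model *p)
{
    return p->exit_state != 0;
}

/* New check: additionally requires that no other thread is left in the group. */
bool new_check_dead(const struct task_model *p)
{
    return p->exit_state != 0 && p->live_subthreads == 0;
}
```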
2008-03-03 | introduce kill_orphaned_pgrp() helper | Oleg Nesterov
Factor out the common code in reparent_thread() and exit_notify(). No functional changes. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-01 | [PATCH] drop EOE records from printk | Steve Grubb
Hi, While we are looking at the printk issue, I see that it's printk'ing the EOE (end of event) records, which is really not something that we need in syslog. It's really intended for the realtime audit event stream handled by the audit daemon. So, let's avoid printk'ing that record type. Signed-off-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-01 | [RFC] AUDIT: do not panic when printk loses messages | Eric Paris
On the latest kernels, if one were to load about 15 rules, set the failure state to panic, and then run "service auditd stop", the kernel will panic. This is because auditd stops, then the script deletes all of the rules. These deletions are sent as audit messages out of the printk kernel interface, which is already known to be lossy. These will overrun the default kernel rate limiting (10 really fast messages) and will call audit_panic(). The same effect can happen if a slew of avc's come through while auditd is stopped. This can be fixed a number of ways, but this patch fixes the problem by just not panicking if auditd is not running. We know printk is lossy and if the user chooses to set the failure mode to panic and tries to use printk we can't make any promises no matter how hard we try, so why try? At least in this way we continue to get lost message accounting and will eventually know that things went bad. The other change is to add a new call to audit_log_lost() if auditd disappears. We already pulled the skb off the queue and couldn't send it, so that message is lost. At least this way we will account for the last message and panic if the machine is configured to panic. This code path should only be run if auditd dies for unforeseen reasons. If auditd closes correctly audit_pid will get set to 0 and we won't walk this code path. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-01 | [PATCH] Audit: Fix the format type for size_t variables | Paul Moore
Fix the following compiler warning by using "%zu" as defined in C99.

    CC kernel/auditsc.o
    kernel/auditsc.c: In function 'audit_log_single_execve_arg':
    kernel/auditsc.c:1074: warning: format '%ld' expects type 'long int', but argument 4 has type 'size_t'

Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
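For reference, a minimal userspace illustration of the conversion in question. format_len() is a hypothetical helper, not the audit code; it only shows that "%zu" matches size_t regardless of whether size_t is as wide as long on the target.

```c
#include <stddef.h>
#include <stdio.h>

/* Format a size_t portably; "%zu" is the C99 conversion for size_t. */
int format_len(char *buf, size_t bufsize, size_t len)
{
    return snprintf(buf, bufsize, "len=%zu", len);
}
```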
2008-02-29 | rcupreempt: remove never-migrates assumption from rcu_process_callbacks() | Paul E. McKenney
This patch fixes a potentially invalid access to a per-CPU variable in rcu_process_callbacks(). This per-CPU access needs to be done in such a way as to guarantee that the code using it cannot move to some other CPU before all uses of the value accessed have completed. Even though this code is currently only invoked from softirq context, which currently cannot migrate to some other CPU, life would be better if this code did not silently make such an assumption. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-29 | rcupreempt: fix hibernate/resume in presence of PREEMPT_RCU and hotplug | Paul E. McKenney
This fixes an oops encountered when doing hibernate/resume in the presence of PREEMPT_RCU. The problem was that the code failed to disable preemption when accessing a per-CPU variable. This is OK when called from code that already has preemption disabled, but such is not the case from the suspend/resume code path. Reported-by: Dave Young <hidave.darkstar@gmail.com> Tested-by: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-29 | softlockup: fix task state setting | Dmitry Adamushko
kthread_stop() can be called when a 'watchdog' thread is executing after kthread_should_stop() but before set_task_state(TASK_INTERRUPTIBLE). Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-29 | rcu: add support for dynamic ticks and preempt rcu | Steven Rostedt
The PREEMPT-RCU can get stuck if a CPU goes idle and NO_HZ is set. The idle CPU will not progress the RCU through its grace period and a synchronize_rcu may get stuck. Without this patch I have a box that will not boot when PREEMPT_RCU and NO_HZ are set. That same box boots fine with this patch. This patch comes from the -rt kernel where it has been tested for several months. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-26 | Merge branch 'v2.6.25-rc3-lockdep' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/linux-2.6-lockdep | Linus Torvalds
* 'v2.6.25-rc3-lockdep' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/linux-2.6-lockdep:
    Subject: lockdep: include all lock classes in all_lock_classes
    lockdep: increase MAX_LOCK_DEPTH
2008-02-26 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched | Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
    latencytop: change /proc task_struct access method
    latencytop: fix memory leak on latency proc file
    latencytop: fix kernel panic while reading latency proc file
    sched: add declaration of sched_tail to sched.h
    sched: fix signedness warnings in sched.c
    sched: clean up __pick_last_entity() a bit
    sched: remove duplicate code from sched_fair.c
    sched: make early bootup sched_clock() use safer
2008-02-26 | printk: fix possible printk overrun | Tejun Heo
printk recursion detection prepends message to printk_buf and offsets printk_buf when actual message is printed but it forgets to trim buffer length accordingly. This can result in overrun in extreme cases. Fix it. [ mingo@elte.hu: bug was introduced by me via: commit 32a76006683f7b28ae3cc491da37716e002f198e Author: Ingo Molnar <mingo@elte.hu> Date: Fri Jan 25 21:07:58 2008 +0100 printk: make printk more robust by not allowing recursion ] Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-25 | Subject: lockdep: include all lock classes in all_lock_classes | Dale Farnsworth
Add each lock class to the all_lock_classes list when it is first registered. Previously, lock classes were added to all_lock_classes when the lock class was first used. Since one of the uses of the list is to find unused locks, this didn't work well. Signed-off-by: Dale Farnsworth <dale@farnsworth.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-25 | sched: fix signedness warnings in sched.c | Harvey Harrison
Unsigned long values are always assigned to switch_count, make it unsigned long.

    kernel/sched.c:3897:15: warning: incorrect type in assignment (different signedness)
    kernel/sched.c:3897:15: expected long *switch_count
    kernel/sched.c:3897:15: got unsigned long *<noident>
    kernel/sched.c:3921:16: warning: incorrect type in assignment (different signedness)
    kernel/sched.c:3921:16: expected long *switch_count
    kernel/sched.c:3921:16: got unsigned long *<noident>

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-25 | sched: clean up __pick_last_entity() a bit | Ingo Molnar
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-25 | sched: remove duplicate code from sched_fair.c | Balbir Singh
pick_task_entity() duplicates existing code. This functionality can be easily obtained using rb_last(). Avoid code duplication by using rb_last(). Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-25 | sched: make early bootup sched_clock() use safer | Ingo Molnar
do not call sched_clock() too early. Not only might rq->idle not be set up - but pure per-cpu data might not be accessible either. this solves an ia64 early bootup hang with CONFIG_PRINTK_TIME=y. Tested-by: Tony Luck <tony.luck@gmail.com> Acked-by: Tony Luck <tony.luck@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-23 | Add memory barrier semantics to wake_up() & co | Linus Torvalds
Oleg Nesterov and others have pointed out that on some architectures, the traditional sequence of

    set_current_state(TASK_INTERRUPTIBLE);
    if (CONDITION)
        return;
    schedule();

is racy wrt another CPU doing

    CONDITION = 1;
    wake_up_process(p);

because while set_current_state() has a memory barrier separating setting of the TASK_INTERRUPTIBLE state from reading of the CONDITION variable, there is no such memory barrier on the wakeup side.

Now, wake_up_process() does actually take a spinlock before it reads and sets the task state on the waking side, and on x86 (and many other architectures) that spinlock is in fact equivalent to a memory barrier, but that is not generally guaranteed. The write that sets CONDITION could move into the critical region protected by the runqueue spinlock.

However, adding an smp_wmb() before the spinlock should now order the writing of CONDITION wrt the lock itself, which in turn is ordered wrt the accesses within the spinlock (which includes the reading of the old state).

This should thus close the race (which probably has never been seen in practice, but since smp_wmb() is a no-op on x86, it's not like this will make anything worse either on the most common architecture where the spinlock already gave the required protection).

Acked-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
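The ordering requirement on both sides can be sketched with C11 atomics in userspace. This is an analogy only, under loud assumptions: task_state, condition, prepare_to_sleep() and wake() are hypothetical names, and the default sequentially consistent atomics used here are stronger than the smp_wmb() plus spinlock combination the message describes. The point is just that each side must make its store visible before it performs its load.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum { RUNNING, SLEEPING };

static atomic_int task_state = RUNNING;
static atomic_bool condition = false;

/*
 * Waiter side: publish "about to sleep", then re-check the condition.
 * Returns true if the caller should really go to sleep.
 */
bool prepare_to_sleep(void)
{
    atomic_store(&task_state, SLEEPING);  /* ordered before the load below */
    if (atomic_load(&condition)) {
        atomic_store(&task_state, RUNNING);
        return false;
    }
    return true;
}

/*
 * Waker side: set the condition, then inspect the task state.
 * Returns true if a sleeping task was found and marked runnable.
 */
bool wake(void)
{
    atomic_store(&condition, true);       /* ordered before the exchange below */
    return atomic_exchange(&task_state, RUNNING) == SLEEPING;
}
```

If the waker's store to condition were allowed to drift past its read of the state, the waiter could see the condition unset while the waker sees the task still running, and the wakeup would be lost.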
2008-02-23 | cgroup: remove dead code in cgroup_get_rootdir() | Li Zefan
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23 | cgroup: remove duplicate code in find_css_set() | Li Zefan
The list head res->tasks gets initialized twice in find_css_set(). Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23 | cgroup: fix subsys bitops | Li Zefan
Cgroup uses unsigned long for subsys bitops, not unsigned long long. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23 | cgroup: fix memory leak in cgroup_get_sb() | Li Zefan
opts.release_agent is not kfree()ed in all necessary places. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23 | cgroup: fix comments | Li Zefan
fix:
 - comments about need_forkexit_callback
 - comments about release agent
 - typo and comment style, etc.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23 | kprobes: refuse kprobe insertion on add/sub_preempt_counter() | Srinivasa Ds
Kprobes makes use of preempt_disable() and preempt_enable_noresched(), and these functions in turn call add/sub_preempt_count(). So we need to prevent users from inserting probes into these functions. This patch disallows probing of add/sub_preempt_count(). Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23 | futex: runtime enable pi and robust functionality | Thomas Gleixner
Not all architectures implement futex_atomic_cmpxchg_inatomic(). The default implementation returns -ENOSYS, which is currently not handled inside of the futex guts. Futex PI calls and robust list exits with a held futex result in an endless loop in the futex code on architectures which have no support. Fixing up every place where futex_atomic_cmpxchg_inatomic() is called would add a fair amount of extra if/else constructs to the already complex code. It is also not possible to disable the robust feature before user space tries to register robust lists. Compile time disabling is not a good idea either, as there are already architectures with runtime detection of futex_atomic_cmpxchg_inatomic support. Detect the functionality at runtime instead by calling cmpxchg_futex_value_locked() with a NULL pointer from the futex initialization code. This is guaranteed to fail, but the call of futex_atomic_cmpxchg_inatomic() happens with pagefaults disabled. On architectures, which use the asm-generic implementation or have a runtime CPU feature detection, a -ENOSYS return value disables the PI/robust features. On architectures with a working implementation the call returns -EFAULT and the PI/robust features are enabled. The relevant syscalls return -ENOSYS and the robust list exit code is blocked, when the detection fails. Fixes http://lkml.org/lkml/2008/2/11/149 Originally reported by: Lennart Buytenhek Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Lennert Buytenhek <buytenh@wantstofly.org> Cc: Riku Voipio <riku.voipio@movial.fi> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
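The detection scheme described above can be modelled in userspace as follows. Everything here is hypothetical scaffolding: the probe functions stand in for an architecture's futex_atomic_cmpxchg_inatomic() with and without real support, and pi_request() stands in for a syscall entry point that consults the detected flag.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical probe type: attempts a compare-and-exchange on a user address. */
typedef int (*cmpxchg_probe_fn)(int *uaddr);

/* Stand-in for an architecture without the operation. */
int probe_unsupported(int *uaddr)
{
    (void)uaddr;
    return -ENOSYS;
}

/* Stand-in for a working implementation: a NULL address faults. */
int probe_supported(int *uaddr)
{
    return uaddr ? 0 : -EFAULT;
}

/* Probe once with NULL: -EFAULT means the operation exists, -ENOSYS means it does not. */
bool detect_cmpxchg(cmpxchg_probe_fn probe)
{
    return probe(NULL) == -EFAULT;
}

/* Hypothetical syscall gate: refuse PI/robust requests when detection failed. */
int pi_request(bool cmpxchg_enabled)
{
    return cmpxchg_enabled ? 0 : -ENOSYS;
}
```

Probing with an address that is guaranteed to fault lets one call distinguish "operation missing" from "operation present" without touching any real user memory.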
2008-02-23 | futex: fix init order | Thomas Gleixner
When the futex init code fails to initialize the futex pseudo file system it returns early without initializing the hash queues. Should the boot succeed, then a futex syscall which tries to enqueue a waiter on the hashqueue will crash due to the uninitialized plist heads. Initialize the hash queues before the filesystem. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Lennert Buytenhek <buytenh@wantstofly.org> Cc: Riku Voipio <riku.voipio@movial.fi> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23 | markers: fix sparse warnings in markers.c | Harvey Harrison
char can be unsigned

    kernel/marker.c:64:20: error: dubious one-bit signed bitfield
    kernel/marker.c:65:14: error: dubious one-bit signed bitfield

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
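Why sparse flags a one-bit signed bitfield: on two's-complement machines such a field can only represent 0 and -1, so a test like "field == 1" never succeeds. A sketch of the usual remedy, with hypothetical struct and field names (not the layout in kernel/marker.c):

```c
/* Flags declared unsigned, so each one-bit field holds exactly 0 or 1. */
struct marker_flags {
    unsigned int state : 1;
    unsigned int ptype : 1;
};

/* Comparing against 1 is only reliable because the field is unsigned. */
int marker_is_enabled(const struct marker_flags *f)
{
    return f->state == 1;
}
```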
2008-02-23 | PM: Introduce PM_EVENT_HIBERNATE callback state | Rafael J. Wysocki
During the last step of hibernation in the "platform" mode (with the help of ACPI) we use the suspend code, including the devices' ->suspend() methods, to prepare the system for entering the ACPI S4 system sleep state. But at least for some devices the operations performed by the ->suspend() callback in that case must be different from its operations during regular suspend. For this reason, introduce the new PM event type PM_EVENT_HIBERNATE and pass it to the device drivers' ->suspend() methods during the last phase of hibernation, so that they can distinguish this case and handle it as appropriate. Modify the drivers that handle PM_EVENT_SUSPEND in a special way and need to handle PM_EVENT_HIBERNATE in the same way. These changes are necessary to fix a hibernation regression related to the i915 driver (ref. http://lkml.org/lkml/2008/2/22/488). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Tested-by: Jeff Chua <jeff.chua.linux@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-21 | Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 | Linus Torvalds
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (26 commits)
    PM: Make suspend_device() static
    PCI ACPI: Fix comment describing acpi_pci_choose_state
    Hibernation: Handle DEBUG_PAGEALLOC on x86
    ACPI: fix build warning
    ACPI: TSC breaks atkbd suspend
    ACPI: remove is_processor_present prototype
    acer-wmi: Add DMI match for mail LED on Acer TravelMate 4200 series
    ACPI: sparse fix, replace macro with static function
    ACPI: thinkpad-acpi: add tablet-mode reporting
    ACPI: thinkpad-acpi: minor hotkey_radio_sw fixes
    ACPI: thinkpad-acpi: improve thinkpad-acpi input device documentation
    ACPI: thinkpad-acpi: issue input events for tablet swivel events
    ACPI: thinkpad-acpi: make the video output feature optional
    ACPI: thinkpad-acpi: synchronize input device switches
    ACPI: thinkpad-acpi: always track input device open/close
    ACPI: thinkpad-acpi: trivial fix to documentation
    ACPI: thinkpad-acpi: trivial fix to module_desc typo
    intel_menlo: extract return values using PTR_ERR
    ACPI video: check for error from thermal_cooling_device_register
    ACPI thermal: extract return values using PTR_ERR
    ...
2008-02-21 | modules: do not try to add sysfs attributes if !CONFIG_SYSFS | Kay Sievers
Thanks to Alexey for the testing and the fix of the fix. Cc: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-21 | Hibernation: Handle DEBUG_PAGEALLOC on x86 | Rafael J. Wysocki
Make hibernation work with CONFIG_DEBUG_PAGEALLOC set on x86, by checking if the pages to be copied are marked as present in the kernel mapping and temporarily marking them as present if that's not the case. No functional modifications are introduced if CONFIG_DEBUG_PAGEALLOC is unset. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-19 | genirq: do not leave interupts enabled on free_irq | Thomas Gleixner
The default_disable() function was changed in commit:

    76d2160147f43f982dfe881404cfde9fd0a9da21 genirq: do not mask interrupts by default

It removed the mask function in favour of the default delayed interrupt disabling. Unfortunately this also broke the shutdown in free_irq() when the last handler is removed from the interrupt for those architectures which rely on the default implementations. Now we can end up with an enabled interrupt line after the last handler was removed, which can result in spurious interrupts. Fix this by adding a default_shutdown function, which is only installed when the irqchip implementation provides neither a shutdown nor a disable function. [@stable: affected versions: .21 - .24 ] Pointed-out-by: Michael Hennerich <Michael.Hennerich@analog.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: stable@kernel.org Tested-by: Michael Hennerich <Michael.Hennerich@analog.com>
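The fallback rule in the last sentence can be modelled in userspace. struct line_model, chip_disable() and set_defaults() are hypothetical and much simpler than the real irq_chip handling; the sketch only shows the selection logic: a chip-provided callback is used for shutdown when one exists, and the masking default is installed only when the chip provides neither shutdown nor disable.

```c
#include <stddef.h>

/* Hypothetical, reduced model of one interrupt line and its chip callbacks. */
struct line_model {
    int masked;                              /* 1 once the line is masked */
    void (*disable)(struct line_model *l);   /* NULL if the chip has none */
    void (*shutdown)(struct line_model *l);  /* NULL if the chip has none */
};

/* Delayed disable: deliberately leaves the line unmasked. */
void default_disable(struct line_model *l)
{
    (void)l;
}

/* Shutdown must really mask the line, or it may keep firing with no handler. */
void default_shutdown(struct line_model *l)
{
    l->masked = 1;
}

/* Example of a chip-specific disable callback. */
void chip_disable(struct line_model *l)
{
    l->masked = 1;
}

/* Fill in missing callbacks; shutdown is resolved before disable gets its default. */
void set_defaults(struct line_model *l)
{
    if (!l->shutdown)
        l->shutdown = l->disable ? l->disable : default_shutdown;
    if (!l->disable)
        l->disable = default_disable;
}
```

The order of the two checks matters in this model: if disable were defaulted first, shutdown would inherit the no-op delayed disable and the line would stay enabled, which is the regression the message describes.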
2008-02-19 | genirq: spurious.c: use time_* macros | S.Caglar Onur
The functions time_before, time_before_eq, time_after, and time_after_eq are more robust for comparing jiffies against other values. So the following patch uses the time_after() macro, defined in linux/jiffies.h, which deals with wrapping correctly. Signed-off-by: S.Caglar Onur <caglar@pardus.org.tr> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-18 | Audit: use == not = in if statements | Eric Paris
Clearly this was supposed to be an == not an = in the if statement. This patch also causes us to stop processing execve args once we have failed rather than continuing to loop on failure over and over and over. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>