path: root/kernel
Age | Commit message | Author
2011-03-09tracing: Add an 'overwrite' trace_option.David Sharp
Add an "overwrite" trace_option for ftrace to control whether the buffer should be overwritten on overflow or not. The default remains to overwrite old events when the buffer is full. This patch adds the option to instead discard newest events when the buffer is full. This is useful to get a snapshot of traces just after enabling traces. Dropping the current event is also a simpler code path. Signed-off-by: David Sharp <dhsharp@google.com> LKML-Reference: <1291844807-15481-1-git-send-email-dhsharp@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
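The two overflow policies described above can be sketched with a toy fixed-size ring. This is only an illustration under stated assumptions: `toy_ring`, `toy_ring_write()` and `RB_CAPACITY` are hypothetical names, not the ftrace ring buffer API, and the real buffer is per-CPU and lockless.

```c
#include <stdbool.h>
#include <stddef.h>

#define RB_CAPACITY 4 /* hypothetical size, tiny on purpose */

struct toy_ring {
    int    slots[RB_CAPACITY];
    size_t head;      /* index of the oldest entry */
    size_t count;     /* number of valid entries */
    bool   overwrite; /* true: drop oldest on overflow; false: drop newest */
};

/* Returns true if the event was stored, false if it was discarded. */
static bool toy_ring_write(struct toy_ring *rb, int event)
{
    if (rb->count == RB_CAPACITY) {
        if (!rb->overwrite)
            return false; /* keep the old events, discard this one */
        rb->head = (rb->head + 1) % RB_CAPACITY; /* drop the oldest */
        rb->count--;
    }
    rb->slots[(rb->head + rb->count) % RB_CAPACITY] = event;
    rb->count++;
    return true;
}
```

With `overwrite` cleared, a full ring returns early without touching any state, which mirrors the message's remark that dropping the current event is the simpler code path.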
2011-03-08genirq: Add comments to Kconfig switchesThomas Gleixner
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Sam Ravnborg <sam@ravnborg.org>
2011-03-08Merge commit 'v2.6.38-rc8' into perf/coreIngo Molnar
Merge reason: Merge latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-08debugobjects: Add hint for better object identificationStanislaw Gruszka
In complex subsystems like mac80211 structures can contain several timers and work structs, so identifying a specific instance from the call trace and object type output of debugobjects can be hard. Allow the subsystems which support debugobjects to provide a hint function. This function returns a pointer to a kernel address (preferably the object's callback function) which is printed along with the debugobjects type. Add hint methods for timer_list, work_struct and hrtimer. [ tglx: Massaged changelog, made it compile ] Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> LKML-Reference: <20110307085809.GA9334@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-08unfuck proc_sysctl ->d_compare()Al Viro
a) struct inode is not going to be freed under ->d_compare(); however, the thing PROC_I(inode)->sysctl points to just might. Fortunately, it's enough to make freeing that sucker delayed, provided that we don't step on its ->unregistering, clear the pointer to it in PROC_I(inode) before dropping the reference and check if it's NULL in ->d_compare(). b) I'm not sure that we *can* walk into NULL inode here (we recheck dentry->seq between verifying that it's still hashed / fetching dentry->d_inode and passing it to ->d_compare() and there's no negative hashed dentries in /proc/sys/*), but if we can walk into that, we really should not have ->d_compare() return 0 on it! Said that, I really suspect that this check can be simply killed. Nick? Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-08Merge branch 'master' of git://git.infradead.org/users/eparis/selinux into nextJames Morris
2011-03-05BKL: That's all, folksArnd Bergmann
This removes the implementation of the big kernel lock, at last. A lot of people have worked on this in the past, so the credit for this patch should be with everyone who participated in the hunt. The names on the Cc list are the people that were the most active in this, according to the recorded git history, in alphabetical order. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Alan Cox <alan@linux.intel.com> Cc: Alessio Igor Bogani <abogani@texware.it> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Hendry <andrew.hendry@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Hans Verkuil <hverkuil@xs4all.nl> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Jan Blunck <jblunck@infradead.org> Cc: John Kacur <jkacur@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Oliver Neukum <oliver@neukum.org> Cc: Paul Menage <menage@google.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-04cpuset: add a missing unlock in cpuset_write_resmask()Li Zefan
Don't forget to release cgroup_mutex if alloc_trial_cpuset() fails. [akpm@linux-foundation.org: avoid multiple return points] Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Miao Xie <miaox@cn.fujitsu.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
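The shape of this fix (a single exit label that always drops the lock) can be sketched in userspace C. Everything here is a hypothetical stand-in: `toy_lock()`/`toy_unlock()` replace cgroup_mutex, and `toy_write_mask()` is not the real cpuset_write_resmask().

```c
#include <errno.h>
#include <stdlib.h>

static int toy_lock_held; /* stand-in for a real mutex */

static void toy_lock(void)   { toy_lock_held = 1; }
static void toy_unlock(void) { toy_lock_held = 0; }

/* Hypothetical write handler that needs a scratch allocation under the lock. */
static int toy_write_mask(size_t nwords, unsigned long *out)
{
    unsigned long *trial;
    int retval = 0;

    toy_lock();

    trial = nwords ? calloc(nwords, sizeof(*trial)) : NULL;
    if (!trial) {
        retval = -ENOMEM;
        goto out; /* do not return with the lock still held */
    }

    trial[0] = 0x5;
    *out = trial[0];
    free(trial);
out:
    toy_unlock(); /* single exit point releases the lock on every path */
    return retval;
}
```

Routing the failure through `out:` rather than adding a second `return` matches the bracketed note about avoiding multiple return points.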
2011-03-04Mark ptrace_{traceme,attach,detach} staticLinus Torvalds
They are only used inside kernel/ptrace.c, and have been for a long time. We don't want to go back to the bad-old-days when architectures did things on their own, so make them static and private. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-04rcu: add comment saying why DEBUG_OBJECTS_RCU_HEAD depends on PREEMPT.Paul E. McKenney
The build will break if you change the Kconfig to allow DEBUG_OBJECTS_RCU_HEAD and !PREEMPT, so document the reasoning near where the breakage would occur. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-03-04rcupdate: remove dead codeAmerigo Wang
DEBUG_OBJECTS_RCU_HEAD depends on PREEMPT, so #ifndef CONFIG_PREEMPT is totally useless in kernel/rcupdate.c. Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-03-04rcutorture: Get rid of duplicate sched.h includeJesper Juhl
linux/sched.h is included twice in kernel/rcutorture.c - once is enough. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-03-04rcu: call __rcu_read_unlock() in exit_rcu for tiny RCULai Jiangshan
Using __rcu_read_unlock() in place of rcu_read_unlock() leaves any debug state as it really should be, namely with the lock still held. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-03-04perf: Fix cgroup vs jump_label problemPeter Zijlstra
Li Zefan reported that the jump label code sleeps and we're calling it under a spinlock, *fail* ;-) Reported-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-04perf cgroup: Clean up perf_cgroup_create()Li Zefan
- Use kzalloc() to replace kmalloc() + memset(). - Remove redundant initialization, since alloc_percpu() returns zero-filled percpu memory. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4D6F347E.2010806@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
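A userspace analogue of the first cleanup, assuming `calloc()` as the stand-in for kzalloc(): one zeroing allocation replaces an allocate-then-memset pair. `struct toy_cgroup` and `toy_cgroup_alloc()` are hypothetical names used only for this sketch.

```c
#include <stdlib.h>

struct toy_cgroup {
    int   id;
    void *info;
};

/* One zero-filled allocation instead of malloc() followed by memset(). */
static struct toy_cgroup *toy_cgroup_alloc(void)
{
    return calloc(1, sizeof(struct toy_cgroup));
}
```

Because the memory already comes back zeroed, any later explicit zeroing of its fields is redundant, which is the point of the second bullet.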
2011-03-04perf cgroup: Fix unmatched call to perf_detach_cgroup()Li Zefan
In the failure path, we call perf_detach_cgroup(), but we didn't call perf_get_cgroup() prior to it. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4D6F346E.9070606@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-04perf cgroup: Fix leak of file reference countLi Zefan
In perf_cgroup_connect(), fput_light() is missing in a failure path. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4D6F3461.6060406@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-04perf: Fix the missing event initialization when pmu is found in idrLin Ming
Currently, the event is not initialized if the pmu is found in the idr. This has not caused a bug so far only because no pmu is associated with an idr id yet. Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1298812411.2699.9.camel@localhost> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-04sched: Resched proper CPU on yield_to()Venkatesh Pallipadi
yield_to_task_fair() has code to resched the CPU of yielding task when the intention is to resched the CPU of the task that is being yielded to. Change here fixes the problem and also makes the resched conditional on rq != p_rq. Signed-off-by: Venkatesh Pallipadi <venki@google.com> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1299025701-22168-1-git-send-email-venki@google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-04sched: Allow users with sufficient RLIMIT_NICE to change from SCHED_IDLE policyDarren Hart
The current scheduler implementation returns -EPERM when trying to change from SCHED_IDLE to SCHED_OTHER or SCHED_BATCH. Since SCHED_IDLE is considered to be a nice 20 on steroids, changing to another policy should be allowed provided the RLIMIT_NICE is accounted for.

This patch allows the following test-case to pass with RLIMIT_NICE=40, but still fail with RLIMIT_NICE=10 when the calling process is run from a typical shell (nice 0, or 20 in rlimit terms).

    #define _GNU_SOURCE	/* for SCHED_IDLE */
    #include <sched.h>
    #include <stdio.h>

    int main()
    {
    	int ret;
    	struct sched_param sp;

    	sp.sched_priority = 0;

    	/* switch to SCHED_IDLE */
    	ret = sched_setscheduler(0, SCHED_IDLE, &sp);
    	printf("setscheduler IDLE: %d\n", ret);
    	if (ret)
    		return ret;

    	/* switch back to SCHED_OTHER */
    	ret = sched_setscheduler(0, SCHED_OTHER, &sp);
    	printf("setscheduler OTHER: %d\n", ret);

    	return ret;
    }

    $ ulimit -e 40
    $ ./test
    setscheduler IDLE: 0
    setscheduler OTHER: 0

    $ ulimit -e 10
    $ ulimit -e
    10
    $ ./test
    setscheduler IDLE: 0
    setscheduler OTHER: -1

Signed-off-by: Darren Hart <dvhart@linux.intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Richard Purdie <richard.purdie@linuxfoundation.org> LKML-Reference: <4D657BEE.4040608@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
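The "nice 0, or 20 in rlimit terms" arithmetic can be made explicit with a small sketch. It assumes the usual mapping of a nice value onto the RLIMIT_NICE scale as 20 minus nice; the helper names are hypothetical, and the sketch ignores privileged callers, which the kernel treats separately.

```c
#include <stdbool.h>

/* Map a nice value (-20..19) onto the RLIMIT_NICE scale (40..1). */
static int nice_to_rlimit(int nice)
{
    return 20 - nice;
}

/*
 * Hypothetical check: an unprivileged task may leave SCHED_IDLE only if
 * its current nice level is permitted by its RLIMIT_NICE soft limit.
 */
static bool may_leave_sched_idle(int task_nice, int rlimit_nice)
{
    return nice_to_rlimit(task_nice) <= rlimit_nice;
}
```

For a shell at nice 0 the required limit is 20, so a limit of 40 passes and a limit of 10 fails, matching the two runs in the message.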
2011-03-04sched: Allow SCHED_BATCH to preempt SCHED_IDLE tasksDarren Hart
Perform the test for SCHED_IDLE before testing for SCHED_BATCH (and ensure idle tasks don't preempt idle tasks) so the non-interactive, but still important, SCHED_BATCH tasks will run in favor of the very low priority SCHED_IDLE tasks. Signed-off-by: Darren Hart <dvhart@linux.intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Mike Galbraith <efault@gmx.de> Cc: Richard Purdie <richard.purdie@linuxfoundation.org> LKML-Reference: <1298408674-3130-2-git-send-email-dvhart@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
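The ordering of the two tests can be sketched as a small decision function. This is an illustration of the logic described above, not the scheduler's wakeup-preemption code; the enums and `toy_check_preempt()` are hypothetical names.

```c
enum toy_policy  { TOY_NORMAL, TOY_BATCH, TOY_IDLE };
enum toy_verdict { TOY_PREEMPT, TOY_NO_PREEMPT, TOY_CHECK_FURTHER };

/* Decide whether a newly woken task should preempt the running one. */
static enum toy_verdict toy_check_preempt(enum toy_policy curr,
                                          enum toy_policy woken)
{
    /* Idle test first: any non-idle task preempts an idle one. */
    if (curr == TOY_IDLE && woken != TOY_IDLE)
        return TOY_PREEMPT;

    /* Batch and idle wakeups do not preempt via this path. */
    if (woken != TOY_NORMAL)
        return TOY_NO_PREEMPT;

    /* Normal tasks go on to the regular fairness comparison. */
    return TOY_CHECK_FURTHER;
}
```

If the two tests were swapped, a woken batch task would hit the second test first and never get the chance to displace a running idle task.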
2011-03-04Merge branch 'sched/urgent' into sched/coreIngo Molnar
Merge reason: Add fixes before applying dependent patches. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-04sched: Fix sched rt group scheduling when hierarchy is enabledBalbir Singh
The current sched rt code is broken when it comes to hierarchical scheduling. This patch fixes two problems:
1. It adds redundant enqueuing (harmless) when it finds a queue has tasks enqueued, but it has no run time and it is not throttled.
2. The most important change is in sched_rt_rq_enqueue/dequeue. The code just picks the rt_rq belonging to the current cpu on which the period timer runs; the patch fixes it, so that the correct rt_se is enqueued/dequeued.
Tested with a simple hierarchy /c/d, c and d assigned similar runtimes of 50,000 and a while 1 loop runs within "d". Both c and d get throttled. Without the patch, the task just stops running and never runs again (depends on where the sched_rt b/w timer runs). With the patch, the task is throttled and runs as expected. [ bharata, suggestions on how to pick the rt_se belonging to the rt_rq and correct cpu ] Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: stable@kernel.org LKML-Reference: <20110303113435.GA2868@balbir.in.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-04Merge branch 'perf/urgent' into perf/coreIngo Molnar
Merge reason: Pick up updates before queueing up dependent patches. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-03Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6David S. Miller
Conflicts: drivers/net/bnx2x/bnx2x.h
2011-03-03netlink: kill loginuid/sessionid/sid members from struct netlink_skb_parmsPatrick McHardy
Netlink message processing in the kernel is synchronous these days, the session information can be collected when needed. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-03blktrace: Remove blk_fill_rwbs_rq.Tao Ma
If we enable trace events to trace block actions, we use blk_fill_rwbs_rq to analyze the corresponding actions in the request's cmd_flags, but we only take the lowest 2 bits from it, so most of the other flags (e.g. REQ_SYNC) are missing. For example, with a sync write we get:
write_test-2409 [001] 160.013869: block_rq_insert: 3,64 W 0 () 258135 + 8 [write_test]
Since now we have integrated the flags of both bio and request, it is safe to pass rq->cmd_flags directly to blk_fill_rwbs and blk_fill_rwbs_rq isn't needed any more. With this patch, after a sync write we get:
write_test-2417 [000] 226.603878: block_rq_insert: 3,64 WS 0 () 258135 + 8 [write_test]
Signed-off-by: Tao Ma <boyu.mt@taobao.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
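Why masking the flags loses the "S" can be shown with a toy version of the flag-to-string step. The bit positions and the names `TOY_REQ_WRITE`, `TOY_REQ_SYNC` and `toy_fill_rwbs()` are assumptions made for this sketch; they are not the kernel's actual flag values or function.

```c
/* Hypothetical flag layout: the sync bit sits above the low two bits. */
#define TOY_REQ_WRITE (1u << 0)
#define TOY_REQ_SYNC  (1u << 4)

/* Build a short action string such as "W" or "WS" from request flags. */
static void toy_fill_rwbs(char *rwbs, unsigned int flags)
{
    int i = 0;

    rwbs[i++] = (flags & TOY_REQ_WRITE) ? 'W' : 'R';
    if (flags & TOY_REQ_SYNC)
        rwbs[i++] = 'S';
    rwbs[i] = '\0';
}
```

Passing `flags & 3` yields "W" for a sync write because the sync bit is masked away before the string is built, while passing the full flags yields "WS".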
2011-03-03Merge branch '/tip/perf/filter' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git into perf/coreFrederic Weisbecker
2011-03-02hrtimer: Update base[CLOCK_BOOTTIME].offset correctlyThomas Gleixner
We calculate the current time of each clock base by adding an offset to clock_monotonic. The offset for the clock bases is set in retrigger_next_event() which is called when we switch a cpu to highres mode or when the clock was set. Add the missing update for clock boottime. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <johnstul@us.ibm.com>
2011-03-02genirq: Fixup fasteoi handler for oneshot modeThomas Gleixner
The fasteoi handler must mask the interrupt line in oneshot mode otherwise we end up with an irq storm. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-26genirq: Provide forced interrupt threadingThomas Gleixner
Add a commandline parameter "threadirqs" which forces all interrupts except those marked IRQF_NO_THREAD to run threaded. That's mostly a debug option to allow retrieving better debug data from crashing interrupt handlers. If "threadirqs" is not enabled on the kernel command line, then there is no impact in the interrupt hotpath.

Architecture code needs to select CONFIG_IRQ_FORCED_THREADING after marking the interrupts which can't be threaded IRQF_NO_THREAD. All interrupts which have IRQF_TIMER set are implicitly marked IRQF_NO_THREAD. Also all PER_CPU interrupts are excluded.

Forced threading hard interrupts also forces all soft interrupt handling into thread context.

When enabled it might slow down things a bit, but for debugging problems in interrupt code it's a reasonable penalty as it does not immediately crash and burn the machine when an interrupt handler is buggy.

Some test results on a Core2Duo machine:

Cache cold run of:
 # time git grep irq_desc

        non-threaded    threaded
 real   1m18.741s       1m19.061s
 user   0m1.874s        0m1.757s
 sys    0m5.843s        0m5.427s

 # iperf -c server
non-threaded
[  3]  0.0-10.0 sec  1.09 GBytes   933 Mbits/sec
[  3]  0.0-10.0 sec  1.09 GBytes   934 Mbits/sec
[  3]  0.0-10.0 sec  1.09 GBytes   933 Mbits/sec
threaded
[  3]  0.0-10.0 sec  1.09 GBytes   939 Mbits/sec
[  3]  0.0-10.0 sec  1.09 GBytes   934 Mbits/sec
[  3]  0.0-10.0 sec  1.09 GBytes   937 Mbits/sec

Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20110223234956.772668648@linutronix.de>
2011-02-26clockevents: Prevent oneshot mode when broadcast device is periodicThomas Gleixner
When the per cpu timer is marked CLOCK_EVT_FEAT_C3STOP, then we can only switch into oneshot mode when the backup broadcast device supports oneshot mode as well. Otherwise we would try to switch the broadcast device into an unsupported mode unconditionally. This went unnoticed so far as the currently available broadcast devices support oneshot mode. Seth unearthed this problem while debugging and working around an hpet related BIOS wreckage. Add the necessary check to tick_is_oneshot_available(). Reported-and-tested-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <alpine.LFD.2.00.1102252231200.2701@localhost6.localdomain6> Cc: stable@kernel.org # .21 ->
2011-02-26sched: Clean up the IRQ_TIME_ACCOUNTING codeVenkatesh Pallipadi
Fix this warning: lkml.org/lkml/2011/1/30/124

  kernel/sched.c:3719: warning: 'irqtime_account_idle_ticks' defined but not used
  kernel/sched.c:3720: warning: 'irqtime_account_process_tick' defined but not used

In a cleaner way than:

  7e9498705e81: sched: Add #ifdef around irq time accounting functions

This patch will not have any functional impact. Signed-off-by: Venkatesh Pallipadi <venki@google.com> Cc: heiko.carstens@de.ibm.com Cc: a.p.zijlstra@chello.nl LKML-Reference: <1298675596-10992-1-git-send-email-venki@google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-25sched: Switch wait_task_inactive to schedule_hrtimeout()Thomas Gleixner
When we force thread hard and soft interrupts the startup of ksoftirqd would hang in kthread_bind() when wait_task_inactive() calls schedule_timeout_uninterruptible() because there is no softirq yet which will wake us up. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20110223234956.677109139@linutronix.de>
2011-02-25genirq: Allow shared oneshot interruptsThomas Gleixner
Support ONESHOT on shared interrupts, if all drivers agree on it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20110223234956.483640430@linutronix.de>
2011-02-25genirq: Prepare the handling of shared oneshot interruptsThomas Gleixner
For level type interrupts we need to track how many threads are in flight to avoid useless interrupt storms when not all thread handlers have finished yet. Keep track of the woken threads and only unmask when there are no more threads in flight. Yes, I'm lazy and using a bitfield. But not only because I'm lazy, the main reason is that it's way simpler than using a refcount. A refcount based solution would need to keep track of various things like crashing the irq thread, spurious interrupts coming in, disables/enables, free_irq() and some more. The bitfield keeps the tracking simple and makes things just work. It's also nicely confined to the thread code paths and does not require additional checks all over the place. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20110223234956.388095876@linutronix.de>
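A minimal sketch of the bitfield idea, assuming one bit per handler thread. The structure and function names are hypothetical and no locking is shown; this is not the genirq implementation.

```c
#include <stdbool.h>

struct toy_irq_desc {
    unsigned long threads_in_flight; /* one bit per woken handler thread */
    bool          masked;            /* is the interrupt line masked? */
};

/* Hard interrupt context: mask the line and mark the thread as woken. */
static void toy_wake_thread(struct toy_irq_desc *desc, unsigned int bit)
{
    desc->masked = true;
    desc->threads_in_flight |= 1UL << bit;
}

/* Thread context: clear our bit, unmask once no thread is in flight. */
static void toy_thread_done(struct toy_irq_desc *desc, unsigned int bit)
{
    desc->threads_in_flight &= ~(1UL << bit);
    if (!desc->threads_in_flight)
        desc->masked = false;
}
```

Setting a bit that is already set changes nothing, so a repeated or spurious wakeup of the same thread cannot unbalance the tracking the way a stray increment would with a counter.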
2011-02-25genirq: Make warning in handle_percpu_event usefulThomas Gleixner
The WARN_ON_ONCE in handle_percpu_event() which emits a warning when an action handler returns with interrupts enabled is not really useful. It does not reveal the interrupt number and handler function which caused it. Make it WARN_ONCE() and add the information. Reported-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-25sched: Add #ifdef around irq time accounting functionsHeiko Carstens
Get rid of this:

  kernel/sched.c:3731:13: warning: 'irqtime_account_idle_ticks' defined but not used
  kernel/sched.c:3732:13: warning: 'irqtime_account_process_tick' defined but not used

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Venkatesh Pallipadi <venki@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20110225133228.GD7469@osiris.boeblingen.de.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-23perf: Simplify task_clock_event_read()Peter Zijlstra
There is no point in us having different code paths for nmi and !nmi here, so remove the !nmi one. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-23perf_events: Fix rcu and locking issues with cgroup supportStephane Eranian
This patch ensures that we do not end up calling perf_cgroup_from_task() when there is no cgroup event. This avoids potential RCU and locking issues. The change in perf_cgroup_set_timestamp() ensures we check against ctx->nr_cgroups. It also avoids calling perf_clock() twice in a row. It also ensures we do need to grab ctx->lock before calling the function. We drop update_cgrp_time() from task_clock_event_read() because it is not needed. This also avoids having to deal with perf_cgroup_from_task(). Thanks to Peter Zijlstra for his help on this. Signed-off-by: Stephane Eranian <eranian@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4d5e76b8.815bdf0a.7ac3.774f@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-23sched, autogroup: Stop claiming ownership of the root task groupMike Galbraith
Disown it, and only display autogroup association if one exists. Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Yong Zhang <yong.zhang0@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1298383320.8036.5.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-23sched, autogroup: Stop going ahead if autogroup is disabledYong Zhang
When autogroup is disabled from the beginning, we walk this call chain:

  sched_autogroup_create_attach()
    autogroup_move_group()          <== 1
      sched_move_task()             <== 2
        task_move_group_fair()
          set_task_rq()
            task_group()
              autogroup_task_group()

We go the whole path without doing anything useful. So stop going further if autogroup is disabled. But there will be a race window between 1 and 2, in which sysctl_sched_autogroup_enabled is enabled. This issue will be addressed by a following patch. Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1298185696-4403-4-git-send-email-yong.zhang0@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-23sched, autogroup, sysctl: Use proc_dointvec_minmax() insteadYong Zhang
sched_autogroup_enabled has min/max values, so proc_dointvec_minmax() should be used for this case. Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1298185696-4403-2-git-send-email-yong.zhang0@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
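What a min/max-aware handler adds over a plain integer handler can be sketched as a range-checked store. `toy_store_minmax()` is a hypothetical helper, and rejecting out-of-range values with -EINVAL is an assumption of this sketch rather than a statement about the sysctl code.

```c
#include <errno.h>

/* Store new_val only if it lies within [min, max]; otherwise reject it. */
static int toy_store_minmax(int *val, int new_val, int min, int max)
{
    if (new_val < min || new_val > max)
        return -EINVAL; /* leave the old value untouched */
    *val = new_val;
    return 0;
}
```

For an on/off knob the bounds would be 0 and 1, so a write of 2 is refused instead of being stored.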
2011-02-23sched: Fix the group_imb logicPeter Zijlstra
On a 2*6*2 machine something like: taskset -c 3-11 bash -c 'for ((i=0;i<9;i++)) do while :; do :; done & done' _should_ result in 9 busy CPUs, each running 1 task. However it didn't quite work reliably, most of the time one cpu of the second socket (6-11) would be idle and one cpu of the first socket (0-5) would have two tasks on it. The group_imb logic is supposed to deal with this and detect when a particular group is imbalanced (like in our case, 0-2 are idle but 3-5 will have 4 tasks on it). The detection phase needed a bit of a tweak as it was too weak and required more than 2 avg weight tasks difference between idle and busy cpus in the group which won't trigger for our test-case. So cure that to be one or more avg task weight difference between cpus. Once the detection phase worked, it was then defeated by the f_b_g() tests trying to avoid ping-pongs. In particular, this_load >= max_load triggered because the pulling cpu (the (first) idle cpu on the second socket, say 6) would find this_load to be 5 and max_load to be 4 (there'd be 5 tasks running on our socket and only 4 on the other socket). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Nikhil Rao <ncrao@google.com> Cc: Venkatesh Pallipadi <venki@google.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
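The change to the detection threshold can be sketched as two predicates over per-cpu loads. This is one reading of the message ("more than 2 avg weight tasks" before, "one or more avg task weight" after); the function names and the load value of 1024 per task are assumptions, and the real load balancer considers more state than this.

```c
#include <stdbool.h>
#include <stddef.h>

/* Spread between the busiest and the least busy cpu in the group. */
static unsigned long toy_load_spread(const unsigned long *load, size_t n)
{
    unsigned long max = 0, min = (unsigned long)-1;

    if (n == 0)
        return 0;
    for (size_t i = 0; i < n; i++) {
        if (load[i] > max)
            max = load[i];
        if (load[i] < min)
            min = load[i];
    }
    return max - min;
}

/* Old rule as described: more than two average task weights apart. */
static bool toy_group_imb_old(const unsigned long *load, size_t n,
                              unsigned long avg_task)
{
    return toy_load_spread(load, n) > 2 * avg_task;
}

/* New rule as described: one or more average task weights apart. */
static bool toy_group_imb_new(const unsigned long *load, size_t n,
                              unsigned long avg_task)
{
    return toy_load_spread(load, n) >= avg_task;
}
```

With cpus 0-2 idle and four equal tasks spread over cpus 3-5, the spread is two task weights, so the old rule stays silent while the new one flags the group.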
2011-02-23sched: Clean up some f_b_g() commentsPeter Zijlstra
The existing comment tends to grow state (as it already has), split it up and place it near the actual tests. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Nikhil Rao <ncrao@google.com> Cc: Venkatesh Pallipadi <venki@google.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-23sched: Clean up remnants of sd_idlePeter Zijlstra
With the wholesale removal of the sd_idle SMT logic we can clean up some more. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Nikhil Rao <ncrao@google.com> Cc: Venkatesh Pallipadi <venki@google.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-23Merge commit 'v2.6.38-rc6' into sched/coreIngo Molnar
Merge reason: Pick up the latest fixes before queueing up new changes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-22genirq: Streamline kernel/irq/KconfigJan Beulich
"def_bool n" without prompt is pointless, these should be just "bool". [ tglx: Adapted to latest changes ] Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4D5D3309020000780003264A@vpn.id2.novell.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-22rtmutex: tester: Remove the remaining BKL leftoversThomas Gleixner
We just leave the numbers assigned as commemoration and in case that someone was crazy enough to reimplement the test stuff out of tree. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-22Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tipLinus Torvalds
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  genirq: Disable the SHIRQ_DEBUG call in request_threaded_irq for now
  genirq: Prevent access beyond allocated_irqs bitmap