summaryrefslogtreecommitdiffstats
path: root/kernel
AgeCommit message (Collapse)Author
2012-05-10namespaces, pid_ns: fix leakage on fork() failureMike Galbraith
Fork() failure post namespace creation for a child cloned with CLONE_NEWPID leaks pid_namespace/mnt_cache due to proc being mounted during creation, but not unmounted during cleanup. Call pid_ns_release_proc() during cleanup. Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Louis Rilling <louis.rilling@kerlabs.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-10compat: Fix RT signal mask corruption via sigprocmaskJan Kiszka
compat_sys_sigprocmask reads a smaller signal mask from userspace than sigprogmask accepts for setting. So the high word of blocked.sig[0] will be cleared, releasing any potentially blocked RT signal. This was discovered via userspace code that relies on get/setcontext. glibc's i386 versions of those functions use sigprogmask instead of rt_sigprogmask to save/restore signal mask and caused RT signal unblocking this way. As suggested by Linus, this replaces the sys_sigprocmask based compat version with one that open-codes the required logic, including the merge of the existing blocked set with the new one provided on SIG_SETMASK. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-29Merge tag 'pm-for-3.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael J. Wysocki: "Fix for an issue causing hibernation to hang on systems with highmem (that practically means i386) due to broken memory management (bug introduced in 3.2, so -stable material) and PM documentation update making the freezer documentation follow the code again after some recent updates." * tag 'pm-for-3.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / Freezer / Docs: Update documentation about freezing of tasks PM / Hibernate: fix the number of pages used for hibernate/thaw buffering
2012-04-27Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU fix from Ingo Molnar. * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rcu: Permit call_rcu() from CPU_DYING notifiers
2012-04-27Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar. * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix OOPS when build_sched_domains() percpu allocation fails sched: Fix more load-balancing fallout
2012-04-27Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar. * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix perf_event_for_each() to use sibling perf symbols: Read plt symbols from proper symtab_type binary tracing: Fix stacktrace of latency tracers (irqsoff and friends) perf tools: Add 'G' and 'H' modifiers to event parsing tracing: Fix regression with tracing_on perf tools: Drop CROSS_COMPILE from flex and bison calls perf report: Fix crash showing warning related to kernel maps tracing: Fix build breakage without CONFIG_PERF_EVENTS (again)
2012-04-27Merge branch 'for-v3.4-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux Pull build fixes for less mainstream architectures from Paul Gortmaker: "These are fixes for frv(1), blackfin(2), powerpc(1) and xtensa(4). Fortunately the touches are nearly all specific to files just used by the arch in question. The two touches to shared/common files [kernel/irq/debug.h and drivers/pci/Makefile] are trivial to assess as no risk to anyone. Half of them relate to xtensa directly. It was only when I fixed the last xtensa issue that I realized that the arch has been broken for a significant time, and isn't a specific v3.4 regression. So if you wanted, we could leave xtensa lying bleeding in the street for a couple more weeks and queue those for 3.5. But given they are no risk to anyone outside of xtensa, I figured to just leave them in. If you are OK with taking the xtensa fixes, then please pull to get: - one last implicit include uncovered by system.h that is in a file specific to just one powerpc defconfig. (I'd sync'd with BenH). - fix an oversight in the PCI makefile where shared code wasn't being compiled for ARCH=frv - fix a missing include for GPIO in blackfin framebuffer. - audit and tag endif in blackfin ezkit board file, in order to find and fix the misplaced endif masking a block of code. - fix irq/debug.h choice of temporary macro names to be more internal so they don't conflict with names used by xtensa. - fix a reference to an undeclared local var in xtensa's signal.c - fix an implicit bug.h usage in xtensa's asm/io.h uncovered by my removing bug.h from kernel.h - fix xtensa to properly indicate it is using asm-generic/hardirq.h in order to resolve the link error - undefined ack_bad_irq The xtensa still fails final link as my latest binutils does something evil when ld forward-relocates unlikely() blocks, but in theory people who have older/valid toolchains could now use the thing." * 'for-v3.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: xtensa: fix build fail on undefined ack_bad_irq blackfin: fix ifdef fustercluck in mach-bf538/boards/ezkit.c blackfin: fix compile error in bfin-lq035q1-fb.c pci: frv architecture needs generic setup-bus infrastructure irq: hide debug macros so they don't collide with others. xtensa: fix build error in xtensa/include/asm/io.h xtensa: fix build failure in xtensa/kernel/signal.c powerpc: fix system.h fallout in sysdev/scom.c [chroma_defconfig]
2012-04-26perf: Fix perf_event_for_each() to use siblingMichael Ellerman
In perf_event_for_each() we call a function on an event, and then iterate over the siblings of the event. However we don't call the function on the siblings, we call it repeatedly on the original event - it seems "obvious" that we should be calling it with sibling as the argument. It looks like this broke in commit 75f937f24bd9 ("Fix ctx->mutex vs counter->mutex inversion"). The only effect of the bug is that the PERF_IOC_FLAG_GROUP parameter to the ioctls doesn't work. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1334109253-31329-1-git-send-email-michael@ellerman.id.au Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-26sched: Fix OOPS when build_sched_domains() percpu allocation failshe, bo
Under extreme memory used up situations, percpu allocation might fail. We hit it when system goes to suspend-to-ram, causing a kworker panic: EIP: [<c124411a>] build_sched_domains+0x23a/0xad0 Kernel panic - not syncing: Fatal exception Pid: 3026, comm: kworker/u:3 3.0.8-137473-gf42fbef #1 Call Trace: [<c18cc4f2>] panic+0x66/0x16c [...] [<c1244c37>] partition_sched_domains+0x287/0x4b0 [<c12a77be>] cpuset_update_active_cpus+0x1fe/0x210 [<c123712d>] cpuset_cpu_inactive+0x1d/0x30 [...] With this fix applied build_sched_domains() will return -ENOMEM and the suspend attempt fails. Signed-off-by: he, bo <bo.he@intel.com> Reviewed-by: Zhang, Yanmin <yanmin.zhang@intel.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/1335355161.5892.17.camel@hebo [ So, we fail to deallocate a CPU because we cannot allocate RAM :-/ I don't like that kind of sad behavior but nevertheless it should not crash under high memory load. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-26sched: Fix more load-balancing falloutPeter Zijlstra
Commits 367456c756a6 ("sched: Ditch per cgroup task lists for load-balancing") and 5d6523ebd ("sched: Fix load-balance wreckage") left some more wreckage. By setting loop_max unconditionally to ->nr_running load-balancing could take a lot of time on very long runqueues (hackbench!). So keep the sysctl as max limit of the amount of tasks we'll iterate. Furthermore, the min load filter for migration completely fails with cgroups since inequality in per-cpu state can easily lead to such small loads :/ Furthermore the change to add new tasks to the tail of the queue instead of the head seems to have some effect.. not quite sure I understand why. Combined these fixes solve the huge hackbench regression reported by Tim when hackbench is ran in a cgroup. Reported-by: Tim Chen <tim.c.chen@linux.intel.com> Acked-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1335365763.28150.267.camel@twins [ got rid of the CONFIG_PREEMPT tuning and made small readability edits ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-25Merge branch 'tip/perf/urgent-2' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent
2012-04-25Merge branch 'rcu/urgent' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/urgent
2012-04-24PM / Hibernate: fix the number of pages used for hibernate/thaw bufferingBojan Smojver
Hibernation regression fix, since 3.2. Calculate the number of required free pages based on non-high memory pages only, because that is where the buffers will come from. Commit 081a9d043c983f161b78fdc4671324d1342b86bc introduced a new buffer page allocation logic during hibernation, in order to improve the performance. The amount of pages allocated was calculated based on total amount of pages available, although only non-high memory pages are usable for this purpose. This caused hibernation code to attempt to over allocate pages on platforms that have high memory, which led to hangs. Signed-off-by: Bojan Smojver <bojan@rexursive.com> Signed-off-by: Rafael J. Wysocki <rjw@suse.de>
2012-04-23irq: hide debug macros so they don't collide with others.Paul Gortmaker
The file kernel/irq/debug.h temporarily defines P, PS, PD and then undefines them. However these names aren't really "internal" enough, and collide with other more legit users such as the ones in the xtensa arch, causing: In file included from kernel/irq/internals.h:58:0, from kernel/irq/irqdesc.c:18: kernel/irq/debug.h:8:0: warning: "PS" redefined [enabled by default] arch/xtensa/include/asm/regs.h:59:0: note: this is the location of the previous definition Add a handful of underscores to do a better job of hiding these temporary macros. Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-04-19tracing: Fix stacktrace of latency tracers (irqsoff and friends)Steven Rostedt
While debugging a latency with someone on IRC (mirage335) on #linux-rt (OFTC), we discovered that the stacktrace output of the latency tracers (preemptirqsoff) was empty. This bug was caused by the creation of the dynamic length stack trace again (like commit 12b5da3 "tracing: Fix ent_size in trace output" was). This bug is caused by the latency tracers requiring the next event to determine the time between the current event and the next. But by grabbing the next event, the iter->ent_size is set to the next event instead of the current one. As the stacktrace event is the last event, this makes the ent_size zero and causes nothing to be printed for the stack trace. The dynamic stacktrace uses the ent_size to determine how much of the stack can be printed. The ent_size of zero means no stack. The simple fix is to save the iter->ent_size before finding the next event. Note, mirage335 asked to remain anonymous from LKML and git, so I will not add the Reported-by and Tested-by tags, even though he did report the issue and tested the fix. Cc: stable@vger.kernel.org # 3.1+ Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-04-19tick: Fix the spurious broadcast timer ticks after resumeSuresh Siddha
During resume, tick_resume_broadcast() programs the broadcast timer in oneshot mode unconditionally. On the platforms where broadcast timer is not really required, this will generate spurious broadcast timer ticks upon resume. For example, on the always running apic timer platforms with HPET, I see spurious hpet tick once every ~5minutes (which is the 32-bit hpet counter wraparound time). Similar to boot time, during resume make the oneshot mode setting of the broadcast clock event device conditional on the state of active broadcast users. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: svenjoac@gmx.de Cc: torvalds@linux-foundation.org Cc: rjw@sisk.pl Link: http://lkml.kernel.org/r/1334802459.28674.209.camel@sbsiddha-desk.sc.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-04-19tick: Ensure that the broadcast device is initializedThomas Gleixner
Santosh found another trap when we avoid to initialize the broadcast device in the switch_to_oneshot code. The broadcast device might be still in SHUTDOWN state when we actually need to use it. That obviously breaks, as set_next_event() is called on a shutdown device. This did not break on x86, but Suresh analyzed it: From the review, most likely on Sven's system we are force enabling the hpet using the pci quirk's method very late. And in this case, hpet_clockevent (which will be global_clock_event) handler can be null, specifically as this platform might not be using deeper c-states and using the reliable APIC timer. Prior to commit 'fa4da365bc7772c', that handler will be set to 'tick_handle_oneshot_broadcast' when we switch the broadcast timer to oneshot mode, even though we don't use it. Post commit 'fa4da365bc7772c', we stopped switching the broadcast mode to oneshot as this is not really needed and his platform's global_clock_event's handler will remain null. While on my SNB laptop, same is set to 'clockevents_handle_noop' because hpet gets enabled very early. (noop handler on my platform set when the early enabled hpet timer gets replaced by the lapic timer). But the commit 'fa4da365bc7772c' tracked the broadcast timer mode in the SW as oneshot, even though it didn't touch the HW timer. During resume however, tick_resume_broadcast() saw the SW broadcast mode as oneshot and actually programmed the broadcast device also into oneshot mode. So this triggered the null pointer de-reference after the hpet wraps around and depending on what the hpet counter is set to. On the normal platforms where hpet gets enabled early we should be seeing a spurious interrupt (in my SNB laptop I see one spurious interrupt after around 5 minutes ;) which is 32-bit hpet counter wraparound time), but that's a separate issue. Enforce the mode setting when trying to set an event. Reported-and-tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: torvalds@linux-foundation.org Cc: svenjoac@gmx.de Cc: rjw@sisk.pl Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1204181723350.2542@ionos
2012-04-18tick: Fix oneshot broadcast setup reallyThomas Gleixner
Sven Joachim reported, that suspend/resume on rc3 trips over a NULL pointer dereference. Linus spotted the clockevent handler being NULL. commit fa4da365b(clockevents: tTack broadcast device mode change in tick_broadcast_switch_to_oneshot()) tried to fix a problem with the broadcast device setup, which was introduced in commit 77b0d60c5( clockevents: Leave the broadcast device in shutdown mode when not needed). The initial commit avoided to set up the broadcast device when no broadcast request bits were set, but that left the broadcast device disfunctional. In consequence deep idle states which need the broadcast device were not woken up. commit fa4da365b tried to fix that by initializing the state of the broadcast facility, but that missed the fact, that nothing initializes the event handler and some other state of the underlying clock event device. The fix is to revert both commits and make only the mode setting of the clock event device conditional on the state of active broadcast users. That initializes everything except the low level device mode, but this happens when the broadcast functionality is invoked by deep idle. Reported-and-tested-by: Sven Joachim <svenjoac@gmx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1204181205540.2542@ionos
2012-04-17rcu: Permit call_rcu() from CPU_DYING notifiersPaul E. McKenney
As of: 29494be71afe ("rcu,cleanup: simplify the code when cpu is dying") RCU adopts callbacks from the dying CPU in its CPU_DYING notifier, which means that any callbacks posted by later CPU_DYING notifiers are ignored until the CPU comes back online. A WARN_ON_ONCE() was added to __call_rcu() by: e56014000816 ("rcu: Simplify offline processing") to check for this condition. Although this condition did not trigger (at least as far as I know) during -next testing, it did recently trigger in mainline: https://lkml.org/lkml/2012/4/2/34 What is needed longer term is for RCU's CPU_DEAD notifier to adopt any callbacks that were posted by CPU_DYING notifiers, however, the Linux kernel has been running with this sort of thing happening for quite some time. So the only thing that qualifies as a regression is the WARN_ON_ONCE(), which this commit removes. Making RCU's CPU_DEAD notifier adopt callbacks posted by CPU_DYING notifiers is a topic for the 3.5 release of the Linux kernel. Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-04-16tracing: Fix regression with tracing_onSteven Rostedt
The change to make tracing_on affect only the ftrace ring buffer, caused a bug where it wont affect any ring buffer. The problem was that the buffer of the trace_array was passed to the write function and not the trace array itself. The trace_array can change the buffer when running a latency tracer. If this happens, then the buffer being disabled may not be the buffer currently used by ftrace. This will cause the tracing_on file to become useless. The simple fix is to pass the trace_array to the write function instead of the buffer. Then the actual buffer may be changed. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-04-13Merge branch 'systemh-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux Pull system.h fixups for less common arch's from Paul Gortmaker: "Here is what is hopefully the last of the system.h related fixups. The fixes for Alpha and ia64 are code relocations consistent with what was done for the more mainstream architectures. Note that the diffstat lines removed vs lines added are not the same since I've fixed some of the whitespace issues in the relocated code blocks. However they are functionally the same. Compile tested locally, plus these two have been in linux-next for a while. There is also a trivial one line system.h related fix for the Tilera arch from Chris Metcalf to fix an implict include.." * 'systemh-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: irq_work: fix compile failure on tile from missing include ia64: populate the cmpxchg header with appropriate code alpha: fix build failures from system.h dismemberment
2012-04-13tracing: Fix build breakage without CONFIG_PERF_EVENTS (again)Mark Brown
Today's -next fails to link for me: kernel/built-in.o:(.data+0x178e50): undefined reference to `perf_ftrace_event_register' It looks like multiple fixes have been merged for the issue fixed by commit fa73dc9 (tracing: Fix build breakage without CONFIG_PERF_EVENTS) though I can't identify the other changes that have gone in at the minute, it's possible that the changes which caused the breakage fixed by the previous commit got dropped but the fix made it in. Link: http://lkml.kernel.org/r/1334307179-21255-1-git-send-email-broonie@opensource.wolfsonmicro.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-04-13irq_work: fix compile failure on tile from missing includeChris Metcalf
Building with IRQ_WORK configured results in kernel/irq_work.c: In function ‘irq_work_run’: kernel/irq_work.c:110: error: implicit declaration of function ‘irqs_disabled’ The appropriate header just needs to be included. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-04-12Merge tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6Linus Torvalds
Pull a fix for the recent irqdomain bug fixes from Grant Likely: "I flubbed one patch in the last pull request which broke a format string on 64 bit platforms. Here's the fix." * tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6: irq_domain: fix type mismatch in debugfs output format
2012-04-12irq_domain: fix type mismatch in debugfs output formatGrant Likely
sizeof(void*) returns an unsigned long, but it was being used as a width parameter to a "%-*s" format string which requires an int. On 64 bit platforms this causes a type mismatch: linux/kernel/irq/irqdomain.c:575: warning: field width should have type 'int', but argument 6 has type 'long unsigned int' This change casts the size to an int so printf gets the right data type. Reported-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Cc: David Daney <david.daney@cavium.com>
2012-04-12Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "The itimer removal one is not strictly a fix, but I really wanted to avoid a rebase of the urgent ones." * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "clocksource: Load the ACPI PM clocksource asynchronously" clockevents: tTack broadcast device mode change in tick_broadcast_switch_to_oneshot() itimer: Use printk_once instead of WARN_ONCE nohz: Fix stale jiffies update in tick_nohz_restart() tick: Document TICK_ONESHOT config option proc: stats: Use arch_idle_time for idle and iowait times if available itimer: Schedule silent NULL pointer fixup in setitimer() for removal
2012-04-12Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds
Merge fixes from Andrew Morton. * emailed from Andrew Morton <akpm@linux-foundation.org>: (14 patches) panic: fix stack dump print on direct call to panic() drivers/rtc/rtc-pl031.c: enable clock on all ST variants Revert "mm: vmscan: fix misused nr_reclaimed in shrink_mem_cgroup_zone()" hugetlb: fix race condition in hugetlb_fault() drivers/rtc/rtc-twl.c: use static register while reading time drivers/rtc/rtc-s3c.c: add placeholder for driver private data drivers/rtc/rtc-s3c.c: fix compilation error MAINTAINERS: add PCDP console maintainer memcg: do not open code accesses to res_counter members drivers/rtc/rtc-efi.c: fix section mismatch warning drivers/rtc/rtc-r9701.c: reset registers if invalid values are detected drivers/char/random.c: fix boot id uniqueness race memcg: fix broken boolen expression memcg: fix up documentation on global LRU
2012-04-12panic: fix stack dump print on direct call to panic()Jason Wessel
Commit 6e6f0a1f0fa6 ("panic: don't print redundant backtraces on oops") causes a regression where no stack trace will be printed at all for the case where kernel code calls panic() directly while not processing an oops, and of course there are 100's of instances of this type of call. The original commit executed the check (!oops_in_progress), but this will always be false because just before the dump_stack() there is a call to bust_spinlocks(1), which does the following: void __attribute__((weak)) bust_spinlocks(int yes) { if (yes) { ++oops_in_progress; The proper way to resolve the problem that original commit tried to solve is to avoid printing a stack dump from panic() when the either of the following conditions is true: 1) TAINT_DIE has been set (this is done by oops_end()) This indicates and oops has already been printed. 2) oops_in_progress > 1 This guards against the rare case where panic() is invoked a second time, or in between oops_begin() and oops_end() Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: <stable@vger.kernel.org> [3.3+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-12Merge tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6Linus Torvalds
Pull irqdomain bug fixes from Grant Likely: "This branch fixes a bug in irq_create_mapping() where an error return from irq_alloc_desc_from() gets ignored. It also removes irq_virq_count to fix a bug on powerpc where the irqdomain code does not find irqs allocated above the CONFIG_NR_IRQS boundary. The remaining patches get rid of an completely pointless export and fix some minor bugs in the irqdomain debug output." * tag 'irqdomain-for-linus' of git://git.secretlab.ca/git/linux-2.6: irq_domain: Move irq_virq_count into NOMAP revmap irqdomain: Fix debugfs formatting irq_domain: correct the debugfs file name irq: Kill pointless irqd_to_hw export irq/irq_domain: Quit ignoring error returns from irq_alloc_desc_from().
2012-04-12irq_domain: Move irq_virq_count into NOMAP revmapGrant Likely
This patch replaces the old global setting of irq_virq_count that is only used by the NOMAP mapping and instead uses a revmap_data property so that the maximum NOMAP allocation can be set per NOMAP irq_domain. There is exactly one user of irq_virq_count in-tree right now: PS3. Also, irq_virq_count is only useful for the NOMAP mapping. So, instead of having a single global irq_virq_count values, this change drops it entirely and added a max_irq argument to irq_domain_add_nomap(). That makes it a property of an individual nomap irq domain instead of a global system settting. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Milton Miller <miltonm@bga.com>
2012-04-11cred: copy_process() should clear child->replacement_session_keyringOleg Nesterov
keyctl_session_to_parent(task) sets ->replacement_session_keyring, it should be processed and cleared by key_replace_session_keyring(). However, this task can fork before it notices TIF_NOTIFY_RESUME and the new child gets the bogus ->replacement_session_keyring copied by dup_task_struct(). This is obviously wrong and, if nothing else, this leads to put_cred(already_freed_cred). change copy_creds() to clear this member. If copy_process() fails before this point the wrong ->replacement_session_keyring doesn't matter, exit_creds() won't be called. Cc: <stable@vger.kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-11irqdomain: Fix debugfs formattingGrant Likely
This patch fixes the irq_domain_mapping debugfs output to pad pointer values with leading zeros so that pointer values are displayed correctly. Otherwise you get output similar to "0x 5e0000000000000". Also, when the irq_domain is set to 'null' Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Cc: David Daney <david.daney@cavium.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
2012-04-10irq_domain: correct the debugfs file nameMika Westerberg
The actual name of the irq_domain mapping debugfs file is "irq_domain_mapping" not "virq_mapping". Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2012-04-10irq/irq_domain: Quit ignoring error returns from irq_alloc_desc_from().David Daney
In commit 4bbdd45a (irq_domain/powerpc: eliminate irq_map; use irq_alloc_desc() instead) code was added that ignores error returns from irq_alloc_desc_from() by (silently) casting the return value to unsigned. The negitive value error return now suddenly looks like a valid irq number. Commits cc79ca69 (irq_domain: Move irq_domain code from powerpc to kernel/irq) and 1bc04f2c (irq_domain: Add support for base irq and hwirq in legacy mappings) move this code to its current location in irqdomain.c The result of all of this is a null pointer dereference OOPS if one of the error cases is hit. The fix: Don't cast away the negativeness of the return value and then check for errors. Signed-off-by: David Daney <david.daney@cavium.com> Acked-by: Rob Herring <rob.herring@calxeda.com> [grant.likely: dropped addition of new 'irq' variable] Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2012-04-10clockevents: tTack broadcast device mode change in ↵Suresh Siddha
tick_broadcast_switch_to_oneshot() In the commit 77b0d60c5adf39c74039e2142a1d3cd1e4d53799, "clockevents: Leave the broadcast device in shutdown mode when not needed", we were bailing out too quickly in tick_broadcast_switch_to_oneshot(), with out tracking the broadcast device mode change to 'TICKDEV_MODE_ONESHOT'. This breaks the platforms which need broadcast device oneshot services during deep idle states. tick_broadcast_oneshot_control() thinks that it is in periodic mode and fails to take proper decisions based on the CLOCK_EVT_NOTIFY_BROADCAST_[ENTER, EXIT] notifications during deep idle entry/exit. Fix this by tracking the broadcast device mode as 'TICKDEV_MODE_ONESHOT', before leaving the broadcast HW device in shutdown mode if there are no active requests for the moment. Reported-and-tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: johnstul@us.ibm.com Link: http://lkml.kernel.org/r/1334011304.12400.81.camel@sbsiddha-desk.sc.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-04-10itimer: Use printk_once instead of WARN_ONCEThomas Gleixner
David pointed out, that WARN_ONCE() to report usage of an deprecated misfeature make folks unhappy. Use printk_once() instead. Andrew told me to stop grumbling and to remove the silly typecast while touching the file. Reported-by: David Rientjes <rientjes@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-04-07Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security layer fixlet from James Morris. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: sysctl: fix write access to dmesg_restrict/kptr_restrict
2012-04-06nohz: Fix stale jiffies update in tick_nohz_restart()Neal Cardwell
Fix tick_nohz_restart() to not use a stale ktime_t "now" value when calling tick_do_update_jiffies64(now). If we reach this point in the loop it means that we crossed a tick boundary since we grabbed the "now" timestamp, so at this point "now" refers to a time in the old jiffy, so using the old value for "now" is incorrect, and is likely to give us a stale jiffies value. In particular, the first time through the loop the tick_do_update_jiffies64(now) call is always a no-op, since the caller, tick_nohz_restart_sched_tick(), will have already called tick_do_update_jiffies64(now) with that "now" value. Note that tick_nohz_stop_sched_tick() already uses the correct approach: when we notice we cross a jiffy boundary, grab a new timestamp with ktime_get(), and *then* update jiffies. Signed-off-by: Neal Cardwell <ncardwell@google.com> Cc: Ben Segall <bsegall@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1332875377-23014-1-git-send-email-ncardwell@google.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-04-05Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds
Merge batch of fixes from Andrew Morton: "The simple_open() cleanup was held back while I wanted for laggards to merge things. I still need to send a few checkpoint/restore patches. I've been wobbly about merging them because I'm wobbly about the overall prospects for success of the project. But after speaking with Pavel at the LSF conference, it sounds like they're further toward completion than I feared - apparently davem is at the "has stopped complaining" stage regarding the net changes. So I need to go back and re-review those patchs and their (lengthy) discussion." * emailed from Andrew Morton <akpm@linux-foundation.org>: (16 patches) memcg swap: use mem_cgroup_uncharge_swap fix backlight: add driver for DA9052/53 PMIC v1 C6X: use set_current_blocked() and block_sigmask() MAINTAINERS: add entry for sparse checker MAINTAINERS: fix REMOTEPROC F: typo alpha: use set_current_blocked() and block_sigmask() simple_open: automatically convert to simple_open() scripts/coccinelle/api/simple_open.cocci: semantic patch for simple_open() libfs: add simple_open() hugetlbfs: remove unregister_filesystem() when initializing module drivers/rtc/rtc-88pm860x.c: fix rtc irq enable callback fs/xattr.c:setxattr(): improve handling of allocation failures fs/xattr.c:listxattr(): fall back to vmalloc() if kmalloc() failed fs/xattr.c: suppress page allocation failure warnings from sys_listxattr() sysrq: use SEND_SIG_FORCED instead of force_sig() proc: fix mount -t proc -o AAA
2012-04-05simple_open: automatically convert to simple_open()Stephen Boyd
Many users of debugfs copy the implementation of default_open() when they want to support a custom read/write function op. This leads to a proliferation of the default_open() implementation across the entire tree. Now that the common implementation has been consolidated into libfs we can replace all the users of this function with simple_open(). This replacement was done with the following semantic patch: <smpl> @ open @ identifier open_f != simple_open; identifier i, f; @@ -int open_f(struct inode *i, struct file *f) -{ ( -if (i->i_private) -f->private_data = i->i_private; | -f->private_data = i->i_private; ) -return 0; -} @ has_open depends on open @ identifier fops; identifier open.open_f; @@ struct file_operations fops = { ... -.open = open_f, +.open = simple_open, ... }; </smpl> [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05sysctl: fix write access to dmesg_restrict/kptr_restrictKees Cook
Commit bfdc0b4 adds code to restrict access to dmesg_restrict, however, it incorrectly alters kptr_restrict rather than dmesg_restrict. The original patch from Richard Weinberger (https://lkml.org/lkml/2011/3/14/362) alters dmesg_restrict as expected, and so the patch seems to have been misapplied. This adds the CAP_SYS_ADMIN check to both dmesg_restrict and kptr_restrict, since both are sensitive. Reported-by: Phillip Lougher <plougher@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Richard Weinberger <richard@nod.at> Cc: stable@vger.kernel.org Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-04-04Merge tag 'for_linus-3.4-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb Pull KGDB/KDB regression fixes from Jason Wessel: - Fix a Smatch warning that appeared in the 3.4 merge window - Fix kgdb test suite with SMP for all archs without HW single stepping - Fix kgdb sw breakpoints with CONFIG_DEBUG_RODATA=y limitations on x86 - Fix oops on kgdb test suite with CONFIG_DEBUG_RODATA - Fix kgdb test suite with SMP for all archs with HW single stepping * tag 'for_linus-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb: x86,kgdb: Fix DEBUG_RODATA limitation using text_poke() kgdb,debug_core: pass the breakpoint struct instead of address and memory kgdbts: (2 of 2) fix single step awareness to work correctly with SMP kgdbts: (1 of 2) fix single step awareness to work correctly with SMP kgdbts: Fix kernel oops with CONFIG_DEBUG_RODATA kdb: Fix smatch warning on dbg_io_ops->is_console
2012-04-04Merge tag 'pm-for-3.4-part-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: - Patch series that hopefully fixes races between the freezer and request_firmware() and request_firmware_nowait() for good, with two cleanups from Stephen Boyd on top. - Runtime PM fix from Alan Stern preventing tasks from getting stuck indefinitely in the runtime PM wait queue. - Device PM QoS update from MyungJoo Ham introducing a new variant of pm_qos_update_request() allowing the callers to specify a timeout. * tag 'pm-for-3.4-part-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / QoS: add pm_qos_update_request_timeout() API firmware_class: Move request_firmware_nowait() to workqueues firmware_class: Reorganize fw_create_instance() PM / Sleep: Mitigate race between the freezer and request_firmware() PM / Sleep: Move disabling of usermode helpers to the freezer PM / Hibernate: Disable usermode helpers right before freezing tasks firmware_class: Do not warn that system is not ready from async loads firmware_class: Split _request_firmware() into three functions, v2 firmware_class: Rework usermodehelper check PM / Runtime: don't forget to wake up waitqueue on failure
2012-04-02Merge branch 'paul' (Fixups from Paul Gortmaker)Linus Torvalds
This merges some of the fixes from Paul Gortmaker for the header file cleanup fallout. Some of the patches are going through arch maintainer trees, and David Howells suggested another be done differently, but this at least fixes a few cases. * emailed from Paul Gortmaker <paul.gortmaker@windriver.com>: asm-generic: add linux/types.h to cmpxchg.h firewire: restore the device.h include in linux/firewire.h frv: fix warnings in mb93090-mb00/pci-dma.c about implicit EXPORT_SYMBOL parisc: fix missing cmpxchg file error from system.h split blackfin: fix cmpxchg build fails from system.h fallout avr32: fix build failures from mis-naming of atmel_nand.h ARM: mach-msm: fix compile fail from system.h fallout irq_work: fix compile failure on MIPS from system.h split
2012-04-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
Pull crypto fixes from Herbert Xu: - Fix for CPU hotplug hang in padata. - Avoid using cpu_active inappropriately in pcrypt and padata. - Fix for user-space algorithm lookup hang with IV generators. - Fix for netlink dump of algorithms where stuff went missing due to incorrect calculation of message size. * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: user - Fix size of netlink dump message crypto: user - Fix lookup of algorithms with IV generator crypto: pcrypt - Use the online cpumask as the default padata: Fix cpu hotplug padata: Use the online cpumask as the default padata: Add a reference to the api documentation
2012-04-02Merge tag 'for-linus' of git://github.com/rustyrussell/linuxLinus Torvalds
Pull cpumask cleanups from Rusty Russell: "(Somehow forgot to send this out; it's been sitting in linux-next, and if you don't want it, it can sit there another cycle)" I'm a sucker for things that actually delete lines of code. Fix up trivial conflict in arch/arm/kernel/kprobes.c, where Rusty fixed a user of &cpu_online_map to be cpu_online_mask, but that code got deleted by commit b21d55e98ac2 ("ARM: 7332/1: extract out code patch function from kprobes"). * tag 'for-linus' of git://github.com/rustyrussell/linux: cpumask: remove old cpu_*_map. documentation: remove references to cpu_*_map. drivers/cpufreq/db8500-cpufreq: remove references to cpu_*_map. remove references to cpu_*_map in arch/
2012-04-02irq_work: fix compile failure on MIPS from system.h splitPaul Gortmaker
Builds of the MIPS platform ip32_defconfig fails as of commit 0195c00244dc ("Merge tag 'split-asm_system_h ...") because MIPS xchg() macro uses BUILD_BUG_ON and it was moved in commit b81947c646bf ("Disintegrate asm/system.h for MIPS"). The root cause is that the system.h split wasn't tested on a baseline with commit 6c03438edeb5 ("kernel.h: doesn't explicitly use bug.h, so don't include it.") Since this file uses BUG code in several other places besides the xchg call, simply make the inclusion explicit. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-31Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar. * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix incorrect usage of for_each_cpu_mask() in select_fallback_rq() sched: Fix __schedule_bug() output when called from an interrupt sched/arch: Introduce the finish_arch_post_lock_switch() scheduler callback
2012-03-31Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates and fixes from Ingo Molnar: "It's mostly fixes, but there's also two late items: - preliminary GTK GUI support for perf report - PMU raw event format descriptors in sysfs, to be parsed by tooling The raw event format in sysfs is a new ABI. For example for the 'CPU' PMU we have: aldebaran:~> ll /sys/bus/event_source/devices/cpu/format/* -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/any -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/cmask -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/edge -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/event -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/inv -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/offcore_rsp -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/pc -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/umask those lists of fields contain a specific format: aldebaran:~> cat /sys/bus/event_source/devices/cpu/format/offcore_rsp config1:0-63 So, those who wish to specify raw events can now use the following event format: -e cpu/cmask=1,event=2,umask=3 Most people will not want to specify any events (let alone raw events), they'll just use whatever default event the tools use. But for more obscure PMU events that have no cross-architecture generic events the above syntax is more usable and a bit more structured than specifying hex numbers." * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) perf tools: Remove auto-generated bison/flex files perf annotate: Fix off by one symbol hist size allocation and hit accounting perf tools: Add missing ref-cycles event back to event parser perf annotate: addr2line wants addresses in same format as objdump perf probe: Finder fails to resolve function name to address tracing: Fix ent_size in trace output perf symbols: Handle NULL dso in dso__name_len perf symbols: Do not include libgen.h perf tools: Fix bug in raw sample parsing perf tools: Fix display of first level of callchains perf tools: Switch module.h into export.h perf: Move mmap page data_head offset assertion out of header perf: Fix mmap_page capabilities and docs perf diff: Fix to work with new hists design perf tools: Fix modifier to be applied on correct events perf tools: Fix various casting issues for 32 bits perf tools: Simplify event_read_id exit path tracing: Fix ftrace stack trace entries tracing: Move the tracing_on/off() declarations into CONFIG_TRACING perf report: Add a simple GTK2-based 'perf report' browser ...
2012-03-31tick: Document TICK_ONESHOT config optionThomas Gleixner
This option has been selected from arch code as it was assumed that it's necessary to support oneshot mode clockevent devices. But it's just a core internal helper to compile tick-oneshot.c if NOHZ or HIG_RES_TIMERS are selected. Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>