summaryrefslogtreecommitdiffstats
path: root/arch/powerpc
AgeCommit message (Collapse)Author
2013-02-19Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler changes from Ingo Molnar: "Main changes: - scheduler side full-dynticks (user-space execution is undisturbed and receives no timer IRQs) preparation changes that convert the cputime accounting code to be full-dynticks ready, from Frederic Weisbecker. - Initial sched.h split-up changes, by Clark Williams - select_idle_sibling() performance improvement by Mike Galbraith: " 1 tbench pair (worst case) in a 10 core + SMT package: pre 15.22 MB/sec 1 procs post 252.01 MB/sec 1 procs " - sched_rr_get_interval() ABI fix/change. We think this detail is not used by apps (so it's not an ABI in practice), but lets keep it under observation. - misc RT scheduling cleanups, optimizations" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) sched/rt: Add <linux/sched/rt.h> header to <linux/init_task.h> cputime: Remove irqsave from seqlock readers sched, powerpc: Fix sched.h split-up build failure cputime: Restore CPU_ACCOUNTING config defaults for PPC64 sched/rt: Move rt specific bits into new header file sched/rt: Add a tuning knob to allow changing SCHED_RR timeslice sched: Move sched.h sysctl bits into separate header sched: Fix signedness bug in yield_to() sched: Fix select_idle_sibling() bouncing cow syndrome sched/rt: Further simplify pick_rt_task() sched/rt: Do not account zero delta_exec in update_curr_rt() cputime: Safely read cputime of full dynticks CPUs kvm: Prepare to add generic guest entry/exit callbacks cputime: Use accessors to read task cputime stats cputime: Allow dynamic switch between tick/virtual based cputime accounting cputime: Generic on-demand virtual cputime accounting cputime: Move default nsecs_to_cputime() to jiffies based cputime file cputime: Librarize per nsecs resolution cputime definitions cputime: Avoid multiplication overflow on utime scaling context_tracking: Export context state for generic vtime ... Fix up conflict in kernel/context_tracking.c due to comment additions.
2013-02-19Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf changes from Ingo Molnar: "There are lots of improvements, the biggest changes are: Main kernel side changes: - Improve uprobes performance by adding 'pre-filtering' support, by Oleg Nesterov. - Make some POWER7 events available in sysfs, equivalent to what was done on x86, from Sukadev Bhattiprolu. - tracing updates by Steve Rostedt - mostly misc fixes and smaller improvements. - Use perf/event tracing to report PCI Express advanced errors, by Tony Luck. - Enable northbridge performance counters on AMD family 15h, by Jacob Shin. - This tracing commit: tracing: Remove the extra 4 bytes of padding in events changes the ABI. All involved parties (PowerTop in particular) seem to agree that it's safe to do now with the introduction of libtraceevent, but the devil is in the details ... Main tooling side changes: - Add 'event group view', from Namyung Kim: To use it, 'perf record' should group events when recording. And then perf report parses the saved group relation from file header and prints them together if --group option is provided. You can use the 'perf evlist' command to see event group information: $ perf record -e '{ref-cycles,cycles}' noploop 1 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.385 MB perf.data (~16807 samples) ] $ perf evlist --group {ref-cycles,cycles} With this example, default perf report will show you each event separately. You can use --group option to enable event group view: $ perf report --group ... # group: {ref-cycles,cycles} # ======== # Samples: 7K of event 'anon group { ref-cycles, cycles }' # Event count (approx.): 6876107743 # # Overhead Command Shared Object Symbol # ................ ....... ................. .......................... 99.84% 99.76% noploop noploop [.] main 0.07% 0.00% noploop ld-2.15.so [.] strcmp 0.03% 0.00% noploop [kernel.kallsyms] [k] timerqueue_del 0.03% 0.03% noploop [kernel.kallsyms] [k] sched_clock_cpu 0.02% 0.00% noploop [kernel.kallsyms] [k] account_user_time 0.01% 0.00% noploop [kernel.kallsyms] [k] __alloc_pages_nodemask 0.00% 0.00% noploop [kernel.kallsyms] [k] native_write_msr_safe 0.00% 0.11% noploop [kernel.kallsyms] [k] _raw_spin_lock 0.00% 0.06% noploop [kernel.kallsyms] [k] find_get_page 0.00% 0.02% noploop [kernel.kallsyms] [k] rcu_check_callbacks 0.00% 0.02% noploop [kernel.kallsyms] [k] __current_kernel_time As you can see the Overhead column now contains both of ref-cycles and cycles and header line shows group information also - 'anon group { ref-cycles, cycles }'. The output is sorted by period of group leader first. - Initial GTK+ annotate browser, from Namhyung Kim. - Add option for runtime switching perf data file in perf report, just press 's' and a menu with the valid files found in the current directory will be presented, from Feng Tang. - Add support to display whole group data for raw columns, from Jiri Olsa. - Add per processor socket count aggregation in perf stat, from Stephane Eranian. - Add interval printing in 'perf stat', from Stephane Eranian. - 'perf test' improvements - Add support for wildcards in tracepoint system name, from Jiri Olsa. - Add anonymous huge page recognition, from Joshua Zhu. - perf build-id cache now can show DSOs present in a perf.data file that are not in the cache, to integrate with build-id servers being put in place by organizations such as Fedora. - perf top now shares more of the evsel config/creation routines with 'record', paving the way for further integration like 'top' snapshots, etc. - perf top now supports DWARF callchains. - Fix mmap limitations on 32-bit, fix from David Miller. - 'perf bench numa mem' NUMA performance measurement suite - ... and lots of fixes, performance improvements, cleanups and other improvements I failed to list - see the shortlog and git log for details." * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (270 commits) perf/x86/amd: Enable northbridge performance counters on AMD family 15h perf/hwbp: Fix cleanup in case of kzalloc failure perf tools: Fix build with bison 2.3 and older. perf tools: Limit unwind support to x86 archs perf annotate: Make it to be able to skip unannotatable symbols perf gtk/annotate: Fail early if it can't annotate perf gtk/annotate: Show source lines with gray color perf gtk/annotate: Support multiple event annotation perf ui/gtk: Implement basic GTK2 annotation browser perf annotate: Fix warning message on a missing vmlinux perf buildid-cache: Add --update option uprobes/perf: Avoid uprobe_apply() whenever possible uprobes/perf: Teach trace_uprobe/perf code to use UPROBE_HANDLER_REMOVE uprobes/perf: Teach trace_uprobe/perf code to pre-filter uprobes/perf: Teach trace_uprobe/perf code to track the active perf_event's uprobes: Introduce uprobe_apply() perf: Introduce hw_perf_event->tp_target and ->tp_list uprobes/perf: Always increment trace_uprobe->nhit uprobes/tracing: Kill uprobe_trace_consumer, embed uprobe_consumer into trace_uprobe uprobes/tracing: Introduce is_trace_uprobe_enabled() ...
2013-02-19Merge branch 'irq-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq core changes from Ingo Molnar: "The biggest changes are the IRQ-work and printk changes from Frederic Weisbecker, which prepare the code for 'full dynticks' (the ability to stop or slow down the periodic tick arbitrarily, not just in idle time as today): - Don't stop tick with irq works pending. This fix is generally useful and concerns archs that can't raise self IPIs. - Flush irq works before CPU offlining. - Introduce "lazy" irq works that can wait for the next tick to be executed, unless it's stopped. - Implement klogd wake up using irq work. This removes the ad-hoc printk_tick()/printk_needs_cpu() hooks and make it working even in dynticks mode. - Cleanups and fixes." * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Export enable/disable_percpu_irq() arch Kconfig: Remove references to IRQ_PER_CPU irq_work: Remove return value from the irq_work_queue() function genirq: Avoid deadlock in spurious handling printk: Wake up klogd using irq_work irq_work: Make self-IPIs optable irq_work: Warn if there's still work on cpu_down irq_work: Flush work on CPU_DYING irq_work: Don't stop the tick with pending works nohz: Add API to check tick state irq_work: Remove CONFIG_HAVE_IRQ_WORK irq_work: Fix racy check on work pending flag irq_work: Fix racy IRQ_WORK_BUSY flag setting
2013-02-11sched, powerpc: Fix sched.h split-up build failureIngo Molnar
Fix PowerPC/Cell build fallout from: 8bd75c77b7c6 sched/rt: Move rt specific bits into new header file Reported-by: Michael Ellerman <michael@ellerman.id.au> Cc: Clark Williams <williams@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20130207094707.7b9f825f@riff.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-06perf/powerpc: Fix build errorSukadev Bhattiprolu
Fix compile errors like those below: CC arch/powerpc/perf/power7-pmu.o /home/git/linux/arch/powerpc/perf/power7-pmu.c:397:2: error: initialization from incompatible pointer type [-Werror] Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20130205231938.GA24125@us.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-04arch Kconfig: Remove references to IRQ_PER_CPUJames Hogan
The IRQ_PER_CPU Kconfig symbol was removed in the following commit: Commit 6a58fb3bad099076f36f0f30f44507bc3275cdb6 ("genirq: Remove CONFIG_IRQ_PER_CPU") merged in v2.6.39-rc1. But IRQ_PER_CPU wasn't removed from any of the architecture Kconfig files where it was defined or selected. It's completely unused so remove the remaining references. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: <uclinux-dist-devel@blackfin.uclinux.org> Cc: <linux-mips@linux-mips.org> Cc: <linuxppc-dev@lists.ozlabs.org> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Tony Luck <tony.luck@intel.com> Acked-by: Richard Kuo <rkuo@codeaurora.org> Link: http://lkml.kernel.org/r/1359972583-17134-1-git-send-email-james.hogan@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-02-04powerpc/mm: Fix hash computation functionAneesh Kumar K.V
The ASM version of hash computation function was truncating the upper bit. Make the ASM version similar to hpt_hash function. Remove masking vsid bits. Without this patch, we observed hang during bootup due to not satisfying page fault request correctly. The fault handler used wrong hash values to update the HPTE. Hence we kept looping with page fault. hash_page(ea=000001003e260008, access=203, trap=300 ip=3fff91787134 dsisr 42000000 The computed value of hash 000000000f22f390 update: avpnv=4003e46054003e00, hash=000000000722f390, f=80000006, psize: 2 ... BenH: The over-masking has been there for ever but only hurts with the new 64T support introduced in 3.7 Reported-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Tested-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: <stable@vger.kernel.org> [v3.7]
2013-01-31perf/POWER7: Make some POWER7 events available in sysfsSukadev Bhattiprolu
Make some POWER7-specific perf events available in sysfs. $ /bin/ls -1 /sys/bus/event_source/devices/cpu/events/ branch-instructions branch-misses cache-misses cache-references cpu-cycles instructions PM_BRU_FIN PM_BRU_MPRED PM_CMPLU_STALL PM_CYC PM_GCT_NOSLOT_CYC PM_INST_CMPL PM_LD_MISS_L1 PM_LD_REF_L1 stalled-cycles-backend stalled-cycles-frontend where the 'PM_*' events are POWER specific and the others are the generic events. This will enable users to specify these events with their symbolic names rather than with their raw code. perf stat -e 'cpu/PM_CYC' ... Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anton Blanchard <anton@au1.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20130123062528.GE13720@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-31perf/POWER7: Make generic event translations available in sysfsSukadev Bhattiprolu
Make the generic perf events in POWER7 available via sysfs. $ ls /sys/bus/event_source/devices/cpu/events branch-instructions branch-misses cache-misses cache-references cpu-cycles instructions stalled-cycles-backend stalled-cycles-frontend $ cat /sys/bus/event_source/devices/cpu/events/cache-misses event=0x400f0 This patch is based on commits that implement this functionality on x86. Eg: commit a47473939db20e3961b200eb00acf5fcf084d755 Author: Jiri Olsa <jolsa@redhat.com> Date: Wed Oct 10 14:53:11 2012 +0200 perf/x86: Make hardware event translations available in sysfs Changelog:[v2] [Jiri Osla] Drop EVENT_ID() macro since it is only used once. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anton Blanchard <anton@au1.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20130123062454.GD13720@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-31perf/Power7: Use macros to identify perf eventsSukadev Bhattiprolu
Define and use macros to identify perf events codes This would make it easier and more readable when these event codes need to be used in more than one place. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anton Blanchard <anton@au1.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: linuxppc-dev@ozlabs.org Link: http://lkml.kernel.org/r/20130123062353.GB13720@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-01-29powerpc: Max next_tb to prevent from replaying timer interruptTiejun Chen
With lazy interrupt, we always call __check_irq_replaysome with decrementers_next_tb to check if we need to replay timer interrupt. So in hotplug case we also need to set decrementers_next_tb as MAX to make sure __check_irq_replay don't replay timer interrupt when return as we expect, otherwise we'll trap here infinitely. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-29powerpc: kernel/kgdb.c: Fix memory leakageCong Ding
the variable backup_current_thread_info isn't freed before existing the function. Signed-off-by: Cong Ding <dinggnu@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-29powerpc/book3e: Disable interrupt after preempt_schedule_irqTiejun Chen
In preempt case current arch_local_irq_restore() from preempt_schedule_irq() may enable hard interrupt but we really should disable interrupts when we return from the interrupt, and so that we don't get interrupted after loading SRR0/1. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-29powerpc/oprofile: Fix error in oprofile power7_marked_instr_event() functionCarl E. Love
The calculation for the left shift of the mask OPROFILE_PM_PMCSEL_MSK has an error. The calculation is should be to shift left by (max_cntrs - cntr) times the width of the pmsel field width. However, the #define OPROFILE_MAX_PMC_NUM was used instead of OPROFILE_PMSEL_FIELD_WIDTH. This patch fixes the calculation. Signed-off-by: Carl Love <cel@us.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-29powerpc/pasemi: Fix crash on rebootSteven Rostedt
commit f96972f2dc "kernel/sys.c: call disable_nonboot_cpus() in kernel_restart()" added a call to disable_nonboot_cpus() on kernel_restart(), which tries to shutdown all the CPUs except the first one. The issue with the PA Semi, is that it does not support CPU hotplug. When the call is made to __cpu_down(), it calls the notifiers CPU_DOWN_PREPARE, and then tries to take the CPU down. One of the notifiers to the CPU hotplug code, is the cpufreq. The DOWN_PREPARE will call __cpufreq_remove_dev() which calls cpufreq_driver->exit. The PA Semi exit handler unmaps regions of I/O that is used by an interrupt that goes off constantly (system_reset_common, but it goes off during normal system operations too). I'm not sure exactly what this interrupt does. Running a simple function trace, you can see it goes off quite a bit: # tracer: function # # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | <idle>-0 [001] 1558.859363: .pasemi_system_reset_exception <-.system_reset_exception <idle>-0 [000] 1558.860112: .pasemi_system_reset_exception <-.system_reset_exception <idle>-0 [000] 1558.861109: .pasemi_system_reset_exception <-.system_reset_exception <idle>-0 [001] 1558.861361: .pasemi_system_reset_exception <-.system_reset_exception <idle>-0 [000] 1558.861437: .pasemi_system_reset_exception <-.system_reset_exception When the region is unmapped, the system crashes with: Disabling non-boot CPUs ... Error taking CPU1 down: -38 Unable to handle kernel paging request for data at address 0xd0000800903a0100 Faulting instruction address: 0xc000000000055fcc Oops: Kernel access of bad area, sig: 11 [#1] PREEMPT SMP NR_CPUS=64 NUMA PA Semi PWRficient Modules linked in: shpchp NIP: c000000000055fcc LR: c000000000055fb4 CTR: c0000000000df1fc REGS: c0000000012175d0 TRAP: 0300 Not tainted (3.8.0-rc4-test-dirty) MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI> CR: 24000088 XER: 00000000 SOFTE: 0 DAR: d0000800903a0100, DSISR: 42000000 TASK = c0000000010e9008[0] 'swapper/0' THREAD: c000000001214000 CPU: 0 GPR00: d0000800903a0000 c000000001217850 c0000000012167e0 0000000000000000 GPR04: 0000000000000000 0000000000000724 0000000000000724 0000000000000000 GPR08: 0000000000000000 0000000000000000 0000000000000001 0000000000a70000 GPR12: 0000000024000080 c00000000fff0000 ffffffffffffffff 000000003ffffae0 GPR16: ffffffffffffffff 0000000000a21198 0000000000000060 0000000000000000 GPR20: 00000000008fdd35 0000000000a21258 000000003ffffaf0 0000000000000417 GPR24: 0000000000a226d0 c000000000000000 0000000000000000 0000000000000000 GPR28: c00000000138b358 0000000000000000 c000000001144818 d0000800903a0100 NIP [c000000000055fcc] .set_astate+0x5c/0xa4 LR [c000000000055fb4] .set_astate+0x44/0xa4 Call Trace: [c000000001217850] [c000000000055fb4] .set_astate+0x44/0xa4 (unreliable) [c0000000012178f0] [c00000000005647c] .restore_astate+0x2c/0x34 [c000000001217980] [c000000000054668] .pasemi_system_reset_exception+0x6c/0x88 [c000000001217a00] [c000000000019ef0] .system_reset_exception+0x48/0x84 [c000000001217a80] [c000000000001e40] system_reset_common+0x140/0x180 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-29powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning for ppc32Li Zhong
This patch fixes MAX_STACK_TRACE_ENTRIES too low warning for ppc32, which is similar to commit 12660b17. Reported-by: Christian Kujau <lists@nerdbynature.de> Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Tested-by: Christian Kujau <lists@nerdbynature.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-27kvm: Prepare to add generic guest entry/exit callbacksFrederic Weisbecker
Do some ground preparatory work before adding guest_enter() and guest_exit() context tracking callbacks. Those will be later used to read the guest cputime safely when we run in full dynticks mode. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Gleb Natapov <gleb@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de>
2013-01-27cputime: Generic on-demand virtual cputime accountingFrederic Weisbecker
If we want to stop the tick further idle, we need to be able to account the cputime without using the tick. Virtual based cputime accounting solves that problem by hooking into kernel/user boundaries. However implementing CONFIG_VIRT_CPU_ACCOUNTING require low level hooks and involves more overhead. But we already have a generic context tracking subsystem that is required for RCU needs by archs which plan to shut down the tick outside idle. This patch implements a generic virtual based cputime accounting that relies on these generic kernel/user hooks. There are some upsides of doing this: - This requires no arch code to implement CONFIG_VIRT_CPU_ACCOUNTING if context tracking is already built (already necessary for RCU in full tickless mode). - We can rely on the generic context tracking subsystem to dynamically (de)activate the hooks, so that we can switch anytime between virtual and tick based accounting. This way we don't have the overhead of the virtual accounting when the tick is running periodically. And one downside: - There is probably more overhead than a native virtual based cputime accounting. But this relies on hooks that are already set anyway. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de>
2013-01-24Merge branch 'core/irq_work' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into irq/core irq_work fixes and cleanups, in preparation for full dyntics support. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-01-18KVM: PPC: Emulate dcbfAlexander Graf
Guests can trigger MMIO exits using dcbf. Since we don't emulate cache incoherent MMIO, just do nothing and move on. Reported-by: Ben Collins <ben.c@servergy.com> Signed-off-by: Alexander Graf <agraf@suse.de> Tested-by: Ben Collins <ben.c@servergy.com> CC: stable@vger.kernel.org
2013-01-10Merge git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM bugfixes from Marcelo Tosatti. * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: use dynamic percpu allocations for shared msrs area KVM: PPC: Book3S HV: Fix compilation without CONFIG_PPC_POWERNV powerpc: Corrected include header path in kvm_para.h Add rcu user eqs exception hooks for async page fault
2013-01-06KVM: PPC: Book3S HV: Fix compilation without CONFIG_PPC_POWERNVAndreas Schwab
Fixes this build breakage: arch/powerpc/kvm/book3s_hv_ras.c: In function ‘kvmppc_realmode_mc_power7’: arch/powerpc/kvm/book3s_hv_ras.c:126:23: error: ‘struct paca_struct’ has no member named ‘opal_mc_evt’ Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-01-03Merge tag 'driver-core-3.8-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core __dev* removal patches - take 3 - from Greg Kroah-Hartman: "Here are the remaining __dev* removal patches against the 3.8-rc2 tree. All of these patches were previously sent to the subsystem maintainers, most of them were picked up and pushed to you, but there were a number that fell through the cracks, and new drivers were added during the merge window, so this series cleans up the rest of the instances of these markings. Third time's the charm... Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" Fixed up trivial conflict with the pinctrl pull in pinctrl-sirf.c. * tag 'driver-core-3.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (54 commits) misc: remove __dev* attributes. include: remove __dev* attributes. Documentation: remove __dev* attributes. Drivers: misc: remove __dev* attributes. Drivers: block: remove __dev* attributes. Drivers: bcma: remove __dev* attributes. Drivers: char: remove __dev* attributes. Drivers: clocksource: remove __dev* attributes. Drivers: ssb: remove __dev* attributes. Drivers: dma: remove __dev* attributes. Drivers: gpu: remove __dev* attributes. Drivers: infinband: remove __dev* attributes. Drivers: memory: remove __dev* attributes. Drivers: mmc: remove __dev* attributes. Drivers: iommu: remove __dev* attributes. Drivers: power: remove __dev* attributes. Drivers: message: remove __dev* attributes. Drivers: macintosh: remove __dev* attributes. Drivers: mfd: remove __dev* attributes. pstore: remove __dev* attributes. ...
2013-01-03POWERPC: drivers: remove __dev* attributes.Greg Kroah-Hartman
CONFIG_HOTPLUG is going away as an option. As a result, the __dev* markings need to be removed. This change removes the use of __devinit, __devexit_p, __devinitdata, __devinitconst, and __devexit from these drivers. Based on patches originally written by Bill Pemberton, but redone by me in order to handle some of the coding style issues better, by hand. Cc: Bill Pemberton <wfp5p@virginia.edu> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-03powerpc: Add missing NULL terminator to avoid boot panic on PPC40xGabor Juhos
The missing NULL terminator can cause a panic on PPC405 boards during boot: Linux/PowerPC load: console=ttyS0,115200 root=/dev/mtdblock1 rootfstype=squashfs,jffs2 noinitrd init=/etc/preinit Finalizing device tree... flat tree at 0x6a5160 bootconsole [udbg0] enabled Page fault in user mode with in_atomic() = 1 mm = (null) NIP = c0275f50 MSR = fffffffe Oops: Weird page fault, sig: 11 [#1] PowerPC 40x Platform Modules linked in: NIP: c0275f50 LR: c0275f60 CTR: c0280000 REGS: c0275eb0 TRAP: 636f7265 Not tainted (3.7.1) MSR: fffffffe <VEC,VSX,EE,PR,FP,ME,SE,BE,IR,DR,PMM,RI> CR: c06a6190 XER: 00000001 TASK = c02662a8[0] 'swapper' THREAD: c0274000 GPR00: c0275ec0 c000c658 c027c4bf 00000000 c0275ee0 c000a0ec c020a1a8 c020a1f0 GPR08: c020f631 c020f404 c025f078 c025f080 c0275f10 Call Trace: ---[ end trace 31fd0ba7d8756001 ]--- Kernel panic - not syncing: Attempted to kill the idle task! The panic happens since commit 9597abe00c1bab2aedce6b49866bf6d1e81c9eed (sections: fix section conflicts in arch/powerpc), however the root cause of this is that the NULL terminator were not added in commit a4f740cf33f7f6c164bbde3c0cdbcc77b0c4997c (of/flattree: Add of_flat_dt_match() helper function). Cc: Grant Likely <grant.likely@secretlab.ca> Cc: <stable@vger.kernel.org> Signed-off-by: Gabor Juhos <juhosg@openwrt.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-03powerpc/vdso: Remove redundant locking in update_vsyscall_tz()Shan Hai
The locking in update_vsyscall_tz() is not only unnecessary because the vdso code copies the data unproteced in __kernel_gettimeofday() but also introduces a hard to reproduce race condition between update_vsyscall() and update_vsyscall_tz(), which causes user space process to loop forever in vdso code. The following patch removes the locking from update_vsyscall_tz(). Locking is not only unnecessary because the vdso code copies the data unprotected in __kernel_gettimeofday() but also erroneous because updating the tb_update_count is not atomic and introduces a hard to reproduce race condition between update_vsyscall() and update_vsyscall_tz(), which further causes user space process to loop forever in vdso code. The below scenario describes the race condition, x==0 Boot CPU other CPU proc_P: x==0 timer interrupt update_vsyscall x==1 x++;sync settimeofday update_vsyscall_tz x==2 x++;sync x==3 sync;x++ sync;x++ proc_P: x==3 (loops until x becomes even) Because the ++ operator would be implemented as three instructions and not atomic on powerpc. A similar change was made for x86 in commit 6c260d58634 ("x86: vdso: Remove bogus locking in update_vsyscall_tz") Signed-off-by: Shan Hai <shan.hai@windriver.com> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-12-20Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS update from Al Viro: "fscache fixes, ESTALE patchset, vmtruncate removal series, assorted misc stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (79 commits) vfs: make lremovexattr retry once on ESTALE error vfs: make removexattr retry once on ESTALE vfs: make llistxattr retry once on ESTALE error vfs: make listxattr retry once on ESTALE error vfs: make lgetxattr retry once on ESTALE vfs: make getxattr retry once on an ESTALE error vfs: allow lsetxattr() to retry once on ESTALE errors vfs: allow setxattr to retry once on ESTALE errors vfs: allow utimensat() calls to retry once on an ESTALE error vfs: fix user_statfs to retry once on ESTALE errors vfs: make fchownat retry once on ESTALE errors vfs: make fchmodat retry once on ESTALE errors vfs: have chroot retry once on ESTALE error vfs: have chdir retry lookup and call once on ESTALE error vfs: have faccessat retry once on an ESTALE error vfs: have do_sys_truncate retry once on an ESTALE error vfs: fix renameat to retry on ESTALE errors vfs: make do_unlinkat retry once on ESTALE errors vfs: make do_rmdir retry once on ESTALE errors vfs: add a flags argument to user_path_parent ...
2012-12-20Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull signal handling cleanups from Al Viro: "sigaltstack infrastructure + conversion for x86, alpha and um, COMPAT_SYSCALL_DEFINE infrastructure. Note that there are several conflicts between "unify SS_ONSTACK/SS_DISABLE definitions" and UAPI patches in mainline; resolution is trivial - just remove definitions of SS_ONSTACK and SS_DISABLED from arch/*/uapi/asm/signal.h; they are all identical and include/uapi/linux/signal.h contains the unified variant." Fixed up conflicts as per Al. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: alpha: switch to generic sigaltstack new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to those generic compat_sys_sigaltstack() introduce generic sys_sigaltstack(), switch x86 and um to it new helper: compat_user_stack_pointer() new helper: restore_altstack() unify SS_ONSTACK/SS_DISABLE definitions new helper: current_user_stack_pointer() missing user_stack_pointer() instances Bury the conditionals from kernel_thread/kernel_execve series COMPAT_SYSCALL_DEFINE: infrastructure
2012-12-20vfs: turn is_dir argument to kern_path_create into a lookup_flags argJeff Layton
Where we can pass in LOOKUP_DIRECTORY or LOOKUP_REVAL. Any other flags passed in here are currently ignored. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-20Merge tag 'iommu-updates-v3.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "A few new features this merge-window. The most important one is probably, that dma-debug now warns if a dma-handle is not checked with dma_mapping_error by the device driver. This requires minor changes to some architectures which make use of dma-debug. Most of these changes have the respective Acks by the Arch-Maintainers. Besides that there are updates to the AMD IOMMU driver for refactor the IOMMU-Groups support and to make sure it does not trigger a hardware erratum. The OMAP changes (for which I pulled in a branch from Tony Lindgren's tree) have a conflict in linux-next with the arm-soc tree. The conflict is in the file arch/arm/mach-omap2/clock44xx_data.c which is deleted in the arm-soc tree. It is safe to delete the file too so solve the conflict. Similar changes are done in the arm-soc tree in the common clock framework migration. A missing hunk from the patch in the IOMMU tree will be submitted as a seperate patch when the merge-window is closed." * tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (29 commits) ARM: dma-mapping: support debug_dma_mapping_error ARM: OMAP4: hwmod data: ipu and dsp to use parent clocks instead of leaf clocks iommu/omap: Adapt to runtime pm iommu/omap: Migrate to hwmod framework iommu/omap: Keep mmu enabled when requested iommu/omap: Remove redundant clock handling on ISR iommu/amd: Remove obsolete comment iommu/amd: Don't use 512GB pages iommu/tegra: smmu: Move bus_set_iommu after probe for multi arch iommu/tegra: gart: Move bus_set_iommu after probe for multi arch iommu/tegra: smmu: Remove unnecessary PTC/TLB flush all tile: dma_debug: add debug_dma_mapping_error support sh: dma_debug: add debug_dma_mapping_error support powerpc: dma_debug: add debug_dma_mapping_error support mips: dma_debug: add debug_dma_mapping_error support microblaze: dma-mapping: support debug_dma_mapping_error ia64: dma_debug: add debug_dma_mapping_error support c6x: dma_debug: add debug_dma_mapping_error support ARM64: dma_debug: add debug_dma_mapping_error support intel-iommu: Prevent devices with RMRRs from being placed into SI Domain ...
2012-12-20powerpc: Corrected include header path in kvm_para.hBharat Bhushan
The include/uapi/asm/kvm_para.h includes <include/uapi/asm/epapr_hcalls.h> but the correct reference should be <include/asm/epapr_hcalls.h> as this is the place where make install_header installs the header files for userspace. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> CC: stable@vger.kernel.org Signed-off-by: Alexander Graf <agraf@suse.de>
2012-12-19unify SS_ONSTACK/SS_DISABLE definitionsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19Bury the conditionals from kernel_thread/kernel_execve seriesAl Viro
All architectures have CONFIG_GENERIC_KERNEL_THREAD CONFIG_GENERIC_KERNEL_EXECVE __ARCH_WANT_SYS_EXECVE None of them have __ARCH_WANT_KERNEL_EXECVE and there are only two callers of kernel_execve() (which is a trivial wrapper for do_execve() now) left. Kill the conditionals and make both callers use do_execve(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-12-19Merge tag 'modules-next-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module update from Rusty Russell: "Nothing all that exciting; a new module-from-fd syscall for those who want to verify the source of the module (ChromeOS) and/or use standard IMA on it or other security hooks." * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: MODSIGN: Fix kbuild output when using default extra_certificates MODSIGN: Avoid using .incbin in C source modules: don't hand 0 to vmalloc. module: Remove a extra null character at the top of module->strtab. ASN.1: Use the ASN1_LONG_TAG and ASN1_INDEFINITE_LENGTH constants ASN.1: Define indefinite length marker constant moduleparam: use __UNIQUE_ID() __UNIQUE_ID() MODSIGN: Add modules_sign make target powerpc: add finit_module syscall. ima: support new kernel module syscall add finit_module syscall to asm-generic ARM: add finit_module syscall to ARM security: introduce kernel_module_from_file hook module: add flags arg to sys_finit_module() module: add syscall to load module from fd
2012-12-18Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc update from Benjamin Herrenschmidt: "The main highlight is probably some base POWER8 support. There's more to come such as transactional memory support but that will wait for the next one. Overall it's pretty quiet, or rather I've been pretty poor at picking things up from patchwork and reviewing them this time around and Kumar no better on the FSL side it seems..." * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (73 commits) powerpc+of: Rename and fix OF reconfig notifier error inject module powerpc: mpc5200: Add a3m071 board support powerpc/512x: don't compile any platform DIU code if the DIU is not enabled powerpc/mpc52xx: use module_platform_driver macro powerpc+of: Export of_reconfig_notifier_[register,unregister] powerpc/dma/raidengine: add raidengine device powerpc/iommu/fsl: Add PAMU bypass enable register to ccsr_guts struct powerpc/mpc85xx: Change spin table to cached memory powerpc/fsl-pci: Add PCI controller ATMU PM support powerpc/86xx: fsl_pcibios_fixup_bus requires CONFIG_PCI drivers/virt: the Freescale hypervisor driver doesn't need to check MSR[GS] powerpc/85xx: p1022ds: Use NULL instead of 0 for pointers powerpc: Disable relocation on exceptions when kexecing powerpc: Enable relocation on during exceptions at boot powerpc: Move get_longbusy_msecs into hvcall.h and remove duplicate function powerpc: Add wrappers to enable/disable relocation on exceptions powerpc: Add set_mode hcall powerpc: Setup relocation on exceptions for bare metal systems powerpc: Move initial mfspr LPCR out of __init_LPCR powerpc: Add relocation on exception vector handlers ...
2012-12-17Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds
Merge misc patches from Andrew Morton: "Incoming: - lots of misc stuff - backlight tree updates - lib/ updates - Oleg's percpu-rwsem changes - checkpatch - rtc - aoe - more checkpoint/restart support I still have a pile of MM stuff pending - Pekka should be merging later today after which that is good to go. A number of other things are twiddling thumbs awaiting maintainer merges." * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (180 commits) scatterlist: don't BUG when we can trivially return a proper error. docs: update documentation about /proc/<pid>/fdinfo/<fd> fanotify output fs, fanotify: add @mflags field to fanotify output docs: add documentation about /proc/<pid>/fdinfo/<fd> output fs, notify: add procfs fdinfo helper fs, exportfs: add exportfs_encode_inode_fh() helper fs, exportfs: escape nil dereference if no s_export_op present fs, epoll: add procfs fdinfo helper fs, eventfd: add procfs fdinfo helper procfs: add ability to plug in auxiliary fdinfo providers tools/testing/selftests/kcmp/kcmp_test.c: print reason for failure in kcmp_test breakpoint selftests: print failure status instead of cause make error kcmp selftests: print fail status instead of cause make error kcmp selftests: make run_tests fix mem-hotplug selftests: print failure status instead of cause make error cpu-hotplug selftests: print failure status instead of cause make error mqueue selftests: print failure status instead of cause make error vm selftests: print failure status instead of cause make error ubifs: use prandom_bytes mtd: nandsim: use prandom_bytes ...
2012-12-17compat: generic compat_sys_sched_rr_get_interval() implementationCatalin Marinas
This function is used by sparc, powerpc tile and arm64 for compat support. The patch adds a generic implementation with a wrapper for PowerPC to do the u32->int sign extension. The reason for a single patch covering powerpc, tile, sparc and arm64 is to keep it bisectable, otherwise kernel building may fail with mismatched function declarations. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Chris Metcalf <cmetcalf@tilera.com> [for tile] Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-17include/linux/init.h: use the stringify operator for the __define_initcall macroMatthew Leach
Currently the __define_initcall() macro takes three arguments, fn, id and level. The level argument is exactly the same as the id argument but wrapped in quotes. To overcome this need to specify three arguments to the __define_initcall macro, where one argument is the stringification of another, we can just use the stringification macro instead. Signed-off-by: Matthew Leach <matthew@mattleach.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-17Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "While small this set of changes is very significant with respect to containers in general and user namespaces in particular. The user space interface is now complete. This set of changes adds support for unprivileged users to create user namespaces and as a user namespace root to create other namespaces. The tyranny of supporting suid root preventing unprivileged users from using cool new kernel features is broken. This set of changes completes the work on setns, adding support for the pid, user, mount namespaces. This set of changes includes a bunch of basic pid namespace cleanups/simplifications. Of particular significance is the rework of the pid namespace cleanup so it no longer requires sending out tendrils into all kinds of unexpected cleanup paths for operation. At least one case of broken error handling is fixed by this cleanup. The files under /proc/<pid>/ns/ have been converted from regular files to magic symlinks which prevents incorrect caching by the VFS, ensuring the files always refer to the namespace the process is currently using and ensuring that the ptrace_mayaccess permission checks are always applied. The files under /proc/<pid>/ns/ have been given stable inode numbers so it is now possible to see if different processes share the same namespaces. Through the David Miller's net tree are changes to relax many of the permission checks in the networking stack to allowing the user namespace root to usefully use the networking stack. Similar changes for the mount namespace and the pid namespace are coming through my tree. Two small changes to add user namespace support were commited here adn in David Miller's -net tree so that I could complete the work on the /proc/<pid>/ns/ files in this tree. Work remains to make it safe to build user namespaces and 9p, afs, ceph, cifs, coda, gfs2, ncpfs, nfs, nfsd, ocfs2, and xfs so the Kconfig guard remains in place preventing that user namespaces from being built when any of those filesystems are enabled. Future design work remains to allow root users outside of the initial user namespace to mount more than just /proc and /sys." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (38 commits) proc: Usable inode numbers for the namespace file descriptors. proc: Fix the namespace inode permission checks. proc: Generalize proc inode allocation userns: Allow unprivilged mounts of proc and sysfs userns: For /proc/self/{uid,gid}_map derive the lower userns from the struct file procfs: Print task uids and gids in the userns that opened the proc file userns: Implement unshare of the user namespace userns: Implent proc namespace operations userns: Kill task_user_ns userns: Make create_new_namespaces take a user_ns parameter userns: Allow unprivileged use of setns. userns: Allow unprivileged users to create new namespaces userns: Allow setting a userns mapping to your current uid. userns: Allow chown and setgid preservation userns: Allow unprivileged users to create user namespaces. userns: Ignore suid and sgid on binaries if the uid or gid can not be mapped userns: fix return value on mntns_install() failure vfs: Allow unprivileged manipulation of the mount namespace. vfs: Only support slave subtrees across different user namespaces vfs: Add a user namespace reference from struct mnt_namespace ...
2012-12-18Merge remote-tracking branch 'agust/next' into nextBenjamin Herrenschmidt
Brings some 52xx updates. Also manually merged tools/perf/perf.h. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-12-16Merge branches 'iommu/fixes', 'dma-debug', 'x86/amd', 'x86/vt-d', ↵Joerg Roedel
'arm/tegra' and 'arm/omap' into next
2012-12-14powerpc: add finit_module syscall.Rusty Russell
(This is just for Acks: this won't work without the actual syscall patches, sitting in my tree for -next at the moment). Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-13Merge tag 'kvm-3.8-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Marcelo Tosatti: "Considerable KVM/PPC work, x86 kvmclock vsyscall support, IA32_TSC_ADJUST MSR emulation, amongst others." Fix up trivial conflict in kernel/sched/core.c due to cross-cpu migration notifier added next to rq migration call-back. * tag 'kvm-3.8-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (156 commits) KVM: emulator: fix real mode segment checks in address linearization VMX: remove unneeded enable_unrestricted_guest check KVM: VMX: fix DPL during entry to protected mode x86/kexec: crash_vmclear_local_vmcss needs __rcu kvm: Fix irqfd resampler list walk KVM: VMX: provide the vmclear function and a bitmap to support VMCLEAR in kdump x86/kexec: VMCLEAR VMCSs loaded on all cpus if necessary KVM: MMU: optimize for set_spte KVM: PPC: booke: Get/set guest EPCR register using ONE_REG interface KVM: PPC: bookehv: Add EPCR support in mtspr/mfspr emulation KVM: PPC: bookehv: Add guest computation mode for irq delivery KVM: PPC: Make EPCR a valid field for booke64 and bookehv KVM: PPC: booke: Extend MAS2 EPN mask for 64-bit KVM: PPC: e500: Mask MAS2 EPN high 32-bits in 32/64 tlbwe emulation KVM: PPC: Mask ea's high 32-bits in 32/64 instr emulation KVM: PPC: e500: Add emulation helper for getting instruction ea KVM: PPC: bookehv64: Add support for interrupt handling KVM: PPC: bookehv: Remove GET_VCPU macro from exception handler KVM: PPC: booke: Fix get_tb() compile error on 64-bit KVM: PPC: e500: Silence bogus GCC warning in tlb code ...
2012-12-13Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds
Merge misc VM changes from Andrew Morton: "The rest of most-of-MM. The other MM bits await a slab merge. This patch includes the addition of a huge zero_page. Not a performance boost but it an save large amounts of physical memory in some situations. Also a bunch of Fujitsu engineers are working on memory hotplug. Which, as it turns out, was badly broken. About half of their patches are included here; the remainder are 3.8 material." However, this merge disables CONFIG_MOVABLE_NODE, which was totally broken. We don't add new features with "default y", nor do we add Kconfig questions that are incomprehensible to most people without any help text. Does the feature even make sense without compaction or memory hotplug? * akpm: (54 commits) mm/bootmem.c: remove unused wrapper function reserve_bootmem_generic() mm/memory.c: remove unused code from do_wp_page() asm-generic, mm: pgtable: consolidate zero page helpers mm/hugetlb.c: fix warning on freeing hwpoisoned hugepage hwpoison, hugetlbfs: fix RSS-counter warning hwpoison, hugetlbfs: fix "bad pmd" warning in unmapping hwpoisoned hugepage mm: protect against concurrent vma expansion memcg: do not check for mm in __mem_cgroup_count_vm_event tmpfs: support SEEK_DATA and SEEK_HOLE (reprise) mm: provide more accurate estimation of pages occupied by memmap fs/buffer.c: remove redundant initialization in alloc_page_buffers() fs/buffer.c: do not inline exported function writeback: fix a typo in comment mm: introduce new field "managed_pages" to struct zone mm, oom: remove statically defined arch functions of same name mm, oom: remove redundant sleep in pagefault oom handler mm, oom: cleanup pagefault oom handler memory_hotplug: allow online/offline memory to result movable node numa: add CONFIG_MOVABLE_NODE for movable-dedicated node mm, memcg: avoid unnecessary function call when memcg is disabled ...
2012-12-13Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial branch from Jiri Kosina: "Usual stuff -- comment/printk typo fixes, documentation updates, dead code elimination." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) HOWTO: fix double words typo x86 mtrr: fix comment typo in mtrr_bp_init propagate name change to comments in kernel source doc: Update the name of profiling based on sysfs treewide: Fix typos in various drivers treewide: Fix typos in various Kconfig wireless: mwifiex: Fix typo in wireless/mwifiex driver messages: i2o: Fix typo in messages/i2o scripts/kernel-doc: check that non-void fcts describe their return value Kernel-doc: Convention: Use a "Return" section to describe return values radeon: Fix typo and copy/paste error in comments doc: Remove unnecessary declarations from Documentation/accounting/getdelays.c various: Fix spelling of "asynchronous" in comments. Fix misspellings of "whether" in comments. eisa: Fix spelling of "asynchronous". various: Fix spelling of "registered" in comments. doc: fix quite a few typos within Documentation target: iscsi: fix comment typos in target/iscsi drivers treewide: fix typo of "suport" in various comments and Kconfig treewide: fix typo of "suppport" in various comments ...
2012-12-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking changes from David Miller: 1) Allow to dump, monitor, and change the bridge multicast database using netlink. From Cong Wang. 2) RFC 5961 TCP blind data injection attack mitigation, from Eric Dumazet. 3) Networking user namespace support from Eric W. Biederman. 4) tuntap/virtio-net multiqueue support by Jason Wang. 5) Support for checksum offload of encapsulated packets (basically, tunneled traffic can still be checksummed by HW). From Joseph Gasparakis. 6) Allow BPF filter access to VLAN tags, from Eric Dumazet and Daniel Borkmann. 7) Bridge port parameters over netlink and BPDU blocking support from Stephen Hemminger. 8) Improve data access patterns during inet socket demux by rearranging socket layout, from Eric Dumazet. 9) TIPC protocol updates and cleanups from Ying Xue, Paul Gortmaker, and Jon Maloy. 10) Update TCP socket hash sizing to be more in line with current day realities. The existing heurstics were choosen a decade ago. From Eric Dumazet. 11) Fix races, queue bloat, and excessive wakeups in ATM and associated drivers, from Krzysztof Mazur and David Woodhouse. 12) Support DOVE (Distributed Overlay Virtual Ethernet) extensions in VXLAN driver, from David Stevens. 13) Add "oops_only" mode to netconsole, from Amerigo Wang. 14) Support set and query of VEB/VEPA bridge mode via PF_BRIDGE, also allow DCB netlink to work on namespaces other than the initial namespace. From John Fastabend. 15) Support PTP in the Tigon3 driver, from Matt Carlson. 16) tun/vhost zero copy fixes and improvements, plus turn it on by default, from Michael S. Tsirkin. 17) Support per-association statistics in SCTP, from Michele Baldessari. And many, many, driver updates, cleanups, and improvements. Too numerous to mention individually. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits) net/mlx4_en: Add support for destination MAC in steering rules net/mlx4_en: Use generic etherdevice.h functions. net: ethtool: Add destination MAC address to flow steering API bridge: add support of adding and deleting mdb entries bridge: notify mdb changes via netlink ndisc: Unexport ndisc_{build,send}_skb(). uapi: add missing netconf.h to export list pkt_sched: avoid requeues if possible solos-pci: fix double-free of TX skb in DMA mode bnx2: Fix accidental reversions. bna: Driver Version Updated to 3.1.2.1 bna: Firmware update bna: Add RX State bna: Rx Page Based Allocation bna: TX Intr Coalescing Fix bna: Tx and Rx Optimizations bna: Code Cleanup and Enhancements ath9k: check pdata variable before dereferencing it ath5k: RX timestamp is reported at end of frame ath9k_htc: RX timestamp is reported at end of frame ...
2012-12-12mm, oom: remove statically defined arch functions of same nameDavid Rientjes
out_of_memory() is a globally defined function to call the oom killer. x86, sh, and powerpc all use a function of the same name within file scope in their respective fault.c unnecessarily. Inline the functions into the pagefault handlers to clean the code up. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Reviewed-by: Michal Hocko <mhocko@suse.cz> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull big execve/kernel_thread/fork unification series from Al Viro: "All architectures are converted to new model. Quite a bit of that stuff is actually shared with architecture trees; in such cases it's literally shared branch pulled by both, not a cherry-pick. A lot of ugliness and black magic is gone (-3KLoC total in this one): - kernel_thread()/kernel_execve()/sys_execve() redesign. We don't do syscalls from kernel anymore for either kernel_thread() or kernel_execve(): kernel_thread() is essentially clone(2) with callback run before we return to userland, the callbacks either never return or do successful do_execve() before returning. kernel_execve() is a wrapper for do_execve() - it doesn't need to do transition to user mode anymore. As a result kernel_thread() and kernel_execve() are arch-independent now - they live in kernel/fork.c and fs/exec.c resp. sys_execve() is also in fs/exec.c and it's completely architecture-independent. - daemonize() is gone, along with its parts in fs/*.c - struct pt_regs * is no longer passed to do_fork/copy_process/ copy_thread/do_execve/search_binary_handler/->load_binary/do_coredump. - sys_fork()/sys_vfork()/sys_clone() unified; some architectures still need wrappers (ones with callee-saved registers not saved in pt_regs on syscall entry), but the main part of those suckers is in kernel/fork.c now." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (113 commits) do_coredump(): get rid of pt_regs argument print_fatal_signal(): get rid of pt_regs argument ptrace_signal(): get rid of unused arguments get rid of ptrace_signal_deliver() arguments new helper: signal_pt_regs() unify default ptrace_signal_deliver flagday: kill pt_regs argument of do_fork() death to idle_regs() don't pass regs to copy_process() flagday: don't pass regs to copy_thread() bfin: switch to generic vfork, get rid of pointless wrappers xtensa: switch to generic clone() openrisc: switch to use of generic fork and clone unicore32: switch to generic clone(2) score: switch to generic fork/vfork/clone c6x: sanitize copy_thread(), get rid of clone(2) wrapper, switch to generic clone() take sys_fork/sys_vfork/sys_clone prototypes to linux/syscalls.h mn10300: switch to generic fork/vfork/clone h8300: switch to generic fork/vfork/clone tile: switch to generic clone() ... Conflicts: arch/microblaze/include/asm/Kbuild
2012-12-11Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The biggest change affects group scheduling: we now track the runnable average on a per-task entity basis, allowing a smoother, exponential decay average based load/weight estimation instead of the previous binary on-the-runqueue/off-the-runqueue load weight method. This will inevitably disturb workloads that were in some sort of borderline balancing state or unstable equilibrium, so an eye has to be kept on regressions. For that reason the new load average is only limited to group scheduling (shares distribution) at the moment (which was also hurting the most from the prior, crude weight calculation and whose scheduling quality wins most from this change) - but we plan to extend this to regular SMP balancing as well in the future, which will simplify and speed up things a bit. Other changes involve ongoing preparatory work to extend NOHZ to the scheduler as well, eventually allowing completely irq-free user-space execution." * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits) Revert "sched/autogroup: Fix crash on reboot when autogroup is disabled" cputime: Comment cputime's adjusting code cputime: Consolidate cputime adjustment code cputime: Rename thread_group_times to thread_group_cputime_adjusted cputime: Move thread_group_cputime() to sched code vtime: Warn if irqs aren't disabled on system time accounting APIs vtime: No need to disable irqs on vtime_account() vtime: Consolidate a bit the ctx switch code vtime: Explicitly account pending user time on process tick vtime: Remove the underscore prefix invasion sched/autogroup: Fix crash on reboot when autogroup is disabled cputime: Separate irqtime accounting from generic vtime cputime: Specialize irq vtime hooks kvm: Directly account vtime to system on guest switch vtime: Make vtime_account_system() irqsafe vtime: Gather vtime declarations to their own header file sched: Describe CFS load-balancer sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking sched: Make __update_entity_runnable_avg() fast sched: Update_cfs_shares at period edge ...
2012-12-11Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Lots of activity: 211 files changed, 8328 insertions(+), 4116 deletions(-) most of it on the tooling side. Main changes: * ftrace enhancements and fixes from Steve Rostedt. * uprobes fixes, cleanups and preparation for the ARM port from Oleg Nesterov. * UAPI fixes, from David Howels - prepares the arch/x86 UAPI transition * Separate perf tests into multiple objects, one per test, from Jiri Olsa. * Make hardware event translations available in sysfs, from Jiri Olsa. * Fixes to /proc/pid/maps parsing, preparatory to supporting data maps, from Namhyung Kim * Implement ui_progress for GTK, from Namhyung Kim * Add framework for automated perf_event_attr tests, where tools with different command line options will be run from a 'perf test', via python glue, and the perf syscall will be intercepted to verify that the perf_event_attr fields set by the tool are those expected, from Jiri Olsa * Add a 'link' method for hists, so that we can have the leader with buckets for all the entries in all the hists. This new method is now used in the default 'diff' output, making the sum of the 'baseline' column be 100%, eliminating blind spots. * libtraceevent fixes for compiler warnings trying to make perf it build on some distros, like fedora 14, 32-bit, some of the warnings really pointed to real bugs. * Add a browser for 'perf script' and make it available from the report and annotate browsers. It does filtering to find the scripts that handle events found in the perf.data file used. From Feng Tang * perf inject changes to allow showing where a task sleeps, from Andrew Vagin. * Makefile improvements from Namhyung Kim. * Add --pre and --post command hooks in 'stat', from Peter Zijlstra. * Don't stop synthesizing threads when one vanishes, this is for the existing threads when we start a tool like trace. * Use sched:sched_stat_runtime to provide a thread summary, this produces the same output as the 'trace summary' subcommand of tglx's original "trace" tool. * Support interrupted syscalls in 'trace' * Add an event duration column and filter in 'trace'. * There are references to the man pages in some tools, so try to build Documentation when installing, warning the user if that is not possible, from Borislav Petkov. * Give user better message if precise is not supported, from David Ahern. * Try to find cross-built objdump path by using the session environment information in the perf.data file header, from Irina Tirdea, original patch and idea by Namhyung Kim. * Diplays more output on features check for make V=1, so that one can figure out what is happening by looking at gcc output, etc. From Jiri Olsa. * Add on_exit implementation for systems without one, e.g. Android, from Bernhard Rosenkraenzer. * Only process events for vcpus of interest, helps handling large number of events, from David Ahern. * Cross compilation fixes for Android, from Irina Tirdea. * Add documentation on compiling for Android, from Irina Tirdea. * perf diff improvements from Jiri Olsa. * Target (task/user/cpu/syswide) handling improvements, from Namhyung Kim. * Add support in 'trace' for tracing workload given by command line, from Namhyung Kim. * ... and much more." * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (194 commits) uprobes: Use percpu_rw_semaphore to fix register/unregister vs dup_mmap() race perf evsel: Introduce is_group_member method perf powerpc: Use uapi/unistd.h to fix build error tools: Pass the target in descend tools: Honour the O= flag when tool build called from a higher Makefile tools: Define a Makefile function to do subdir processing perf ui: Always compile browser setup code perf ui: Add ui_progress__finish() perf ui gtk: Implement ui_progress functions perf ui: Introduce generic ui_progress helper perf ui tui: Move progress.c under ui/tui directory perf tools: Add basic event modifier sanity check perf tools: Omit group members from perf_evlist__disable/enable perf tools: Ensure single disable call per event in record comand perf tools: Fix 'disabled' attribute config for record command perf tools: Fix attributes for '{}' defined event groups perf tools: Use sscanf for parsing /proc/pid/maps perf tools: Add gtk.<command> config option for launching GTK browser perf tools: Fix compile error on NO_NEWT=1 build perf hists: Initialize all of he->stat with zeroes ...