path: root/arch/x86/kernel/cpu
Age  Commit message  Author
2012-11-29  x86, 386 removal: Remove CONFIG_BSWAP  (H. Peter Anvin)
All 486+ CPUs support BSWAP, so remove the fallback 386 support code. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1354132230-21854-5-git-send-email-hpa@linux.intel.com
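For reference, a minimal sketch of what a BSWAP-based 32-bit byte swap looks like once the 386 fallback is gone (generic illustration, not the exact kernel swab32() implementation):

    static inline unsigned int swab32_bswap_example(unsigned int x)
    {
            /* BSWAP reverses the byte order of a 32-bit register; available on 486 and later */
            asm("bswapl %0" : "+r" (x));
            return x;
    }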
2012-11-13  Merge tag 'please-pull-tangchen' of ↵  (Ingo Molnar)
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/urgent Pull MCE fix from Tony Luck: "Fix problem in CMCI rediscovery code that was illegally migrating worker threads to other cpus." Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-10-31  x86, amd: Disable way access filter on Piledriver CPUs  (Andre Przywara)
The Way Access Filter in recent AMD CPUs may hurt the performance of some workloads, caused by aliasing issues in the L1 cache. This patch disables it on the affected CPUs. The issue is similar to that one of last year: http://lkml.indiana.edu/hypermail/linux/kernel/1107.3/00041.html This new patch does not replace the old one, we just need another quirk for newer CPUs. The performance penalty without the patch depends on the circumstances, but is a bit less than the last year's 3%. The workloads affected would be those that access code from the same physical page under different virtual addresses, so different processes using the same libraries with ASLR or multiple instances of PIE-binaries. The code needs to be accessed simultaneously from both cores of the same compute unit. More details can be found here: http://developer.amd.com/Assets/SharedL1InstructionCacheonAMD15hCPU.pdf CPUs affected are anything with the core known as Piledriver. That includes the new parts of the AMD A-Series (aka Trinity) and the just released new CPUs of the FX-Series (aka Vishera). The model numbering is a bit odd here: FX CPUs have model 2, A-Series has model 10h, with possible extensions to 1Fh. Hence the range of model ids. Signed-off-by: Andre Przywara <osp@andrep.de> Link: http://lkml.kernel.org/r/1351700450-9277-1-git-send-email-osp@andrep.de Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
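A rough sketch of how such a model-ranged MSR quirk is typically wired up in the AMD CPU setup code; the MSR index and bit used here are placeholders, not values taken from the patch:

    #include <asm/msr.h>
    #include <asm/processor.h>

    /* Hypothetical names/values - the real quirk uses an AMD-specific config MSR and bit. */
    #define MSR_AMD_IC_CFG_EXAMPLE      0xc0011021UL
    #define IC_CFG_DIS_WAY_FILTER_BIT   14

    static void amd_disable_way_access_filter_example(struct cpuinfo_x86 *c)
    {
            u64 val;

            /* Piledriver: family 15h, model 2 (FX) and models 10h..1Fh (A-Series) */
            if (c->x86 != 0x15)
                    return;
            if (c->x86_model != 0x02 &&
                (c->x86_model < 0x10 || c->x86_model > 0x1f))
                    return;

            rdmsrl(MSR_AMD_IC_CFG_EXAMPLE, val);
            val |= 1ULL << IC_CFG_DIS_WAY_FILTER_BIT;
            wrmsrl(MSR_AMD_IC_CFG_EXAMPLE, val);
    }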
2012-10-30  x86/mce: Do not change worker's running cpu in cmci_rediscover().  (Tang Chen)
cmci_rediscover() used set_cpus_allowed_ptr() to change the current process's running cpu, and migrate itself to the dest cpu. But worker processes are not allowed to be migrated. If current is a worker, the worker will be migrated to another cpu, but the corresponding worker_pool is still on the original cpu. In this case, the following BUG_ON in try_to_wake_up_local() will be triggered: BUG_ON(rq != this_rq()); This will cause the kernel panic. The call trace is like the following: [ 6155.451107] ------------[ cut here ]------------ [ 6155.452019] kernel BUG at kernel/sched/core.c:1654! ...... [ 6155.452019] RIP: 0010:[<ffffffff810add15>] [<ffffffff810add15>] try_to_wake_up_local+0x115/0x130 ...... [ 6155.452019] Call Trace: [ 6155.452019] [<ffffffff8166fc14>] __schedule+0x764/0x880 [ 6155.452019] [<ffffffff81670059>] schedule+0x29/0x70 [ 6155.452019] [<ffffffff8166de65>] schedule_timeout+0x235/0x2d0 [ 6155.452019] [<ffffffff810db57d>] ? mark_held_locks+0x8d/0x140 [ 6155.452019] [<ffffffff810dd463>] ? __lock_release+0x133/0x1a0 [ 6155.452019] [<ffffffff81671c50>] ? _raw_spin_unlock_irq+0x30/0x50 [ 6155.452019] [<ffffffff810db8f5>] ? trace_hardirqs_on_caller+0x105/0x190 [ 6155.452019] [<ffffffff8166fefb>] wait_for_common+0x12b/0x180 [ 6155.452019] [<ffffffff810b0b30>] ? try_to_wake_up+0x2f0/0x2f0 [ 6155.452019] [<ffffffff8167002d>] wait_for_completion+0x1d/0x20 [ 6155.452019] [<ffffffff8110008a>] stop_one_cpu+0x8a/0xc0 [ 6155.452019] [<ffffffff810abd40>] ? __migrate_task+0x1a0/0x1a0 [ 6155.452019] [<ffffffff810a6ab8>] ? complete+0x28/0x60 [ 6155.452019] [<ffffffff810b0fd8>] set_cpus_allowed_ptr+0x128/0x130 [ 6155.452019] [<ffffffff81036785>] cmci_rediscover+0xf5/0x140 [ 6155.452019] [<ffffffff816643c0>] mce_cpu_callback+0x18d/0x19d [ 6155.452019] [<ffffffff81676187>] notifier_call_chain+0x67/0x150 [ 6155.452019] [<ffffffff810a03de>] __raw_notifier_call_chain+0xe/0x10 [ 6155.452019] [<ffffffff81070470>] __cpu_notify+0x20/0x40 [ 6155.452019] [<ffffffff810704a5>] cpu_notify_nofail+0x15/0x30 [ 6155.452019] [<ffffffff81655182>] _cpu_down+0x262/0x2e0 [ 6155.452019] [<ffffffff81655236>] cpu_down+0x36/0x50 [ 6155.452019] [<ffffffff813d3eaa>] acpi_processor_remove+0x50/0x11e [ 6155.452019] [<ffffffff813a6978>] acpi_device_remove+0x90/0xb2 [ 6155.452019] [<ffffffff8143cbec>] __device_release_driver+0x7c/0xf0 [ 6155.452019] [<ffffffff8143cd6f>] device_release_driver+0x2f/0x50 [ 6155.452019] [<ffffffff813a7870>] acpi_bus_remove+0x32/0x6d [ 6155.452019] [<ffffffff813a7932>] acpi_bus_trim+0x87/0xee [ 6155.452019] [<ffffffff813a7a21>] acpi_bus_hot_remove_device+0x88/0x16b [ 6155.452019] [<ffffffff813a33ee>] acpi_os_execute_deferred+0x27/0x34 [ 6155.452019] [<ffffffff81090589>] process_one_work+0x219/0x680 [ 6155.452019] [<ffffffff81090528>] ? process_one_work+0x1b8/0x680 [ 6155.452019] [<ffffffff813a33c7>] ? acpi_os_wait_events_complete+0x23/0x23 [ 6155.452019] [<ffffffff810923be>] worker_thread+0x12e/0x320 [ 6155.452019] [<ffffffff81092290>] ? manage_workers+0x110/0x110 [ 6155.452019] [<ffffffff81098396>] kthread+0xc6/0xd0 [ 6155.452019] [<ffffffff8167c4c4>] kernel_thread_helper+0x4/0x10 [ 6155.452019] [<ffffffff81671f30>] ? retint_restore_args+0x13/0x13 [ 6155.452019] [<ffffffff810982d0>] ? __init_kthread_worker+0x70/0x70 [ 6155.452019] [<ffffffff8167c4c0>] ? gs_change+0x13/0x13 This patch removes the set_cpus_allowed_ptr() call, and put the cmci rediscover jobs onto all the other cpus using system_wq. This could bring some delay for the jobs. 
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
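A minimal sketch of the resulting pattern - queueing the rediscover work on every other CPU via the system workqueue instead of migrating the current task; the helper names below are illustrative, not the exact mce_intel.c code:

    #include <linux/cpu.h>
    #include <linux/workqueue.h>

    static void cmci_rediscover_work_func_example(struct work_struct *work)
    {
            /* runs on whichever CPU the work item was queued on */
            cmci_discover_on_this_cpu();    /* placeholder for the real discovery call */
    }

    static void cmci_rediscover_example(int dying)
    {
            int cpu;
            struct work_struct work;

            INIT_WORK(&work, cmci_rediscover_work_func_example);

            for_each_online_cpu(cpu) {
                    if (cpu == dying)
                            continue;
                    /* schedule_work_on() pins the work item to 'cpu' on system_wq */
                    schedule_work_on(cpu, &work);
                    flush_work(&work);
            }
    }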
2012-10-30  x86, AMD: Change Boris' email address  (Borislav Petkov)
Move to private email and put in maintained status. Signed-off-by: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/1351532410-4887-1-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-10-24  perf/x86: Remove unused variable in nhmex_rbox_alter_er()  (Wei Yongjun)
The variable port is initialized but never used otherwise, so remove the unused variable. dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: Yan, Zheng <zheng.z.yan@intel.com> Cc: a.p.zijlstra@chello.nl Cc: paulus@samba.org Cc: acme@ghostprotocols.net Link: http://lkml.kernel.org/r/CAPgLHd8NZkYSkZm22FpZxiEh6HcA0q-V%3D29vdnheiDhgrJZ%2Byw@mail.gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-10-24  perf/x86: Enable overflow on Intel KNC with a custom knc_pmu_handle_irq()  (Vince Weaver)
Although based on the Intel P6 design, the interrupt mechanism for KNC more closely resembles the Intel architectural perfmon one. We can't just re-use that code though, because KNC has different MSR numbers for the status and ack registers. In this case we just cut-and-paste from perf_event_intel.c with some minor changes, as it looks like it would not be worth the trouble to change that code to be MSR-configurable. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: eranian@gmail.com Cc: Meadows Lawrence F <lawrence.f.meadows@intel.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1210171304410.23243@vincent-weaver-1.um.maine.edu [ Small stylistic edits. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-10-24  perf/x86: Remove cpuc->enabled check on Intel KNC event enable/disable  (Vince Weaver)
x86_pmu.enable() is called from x86_pmu_enable() with cpuc->enabled set to 0. This means we weren't re-enabling the counters after a context switch. This patch just removes the check, as it shouldn't be necessary (and the equivalent generic x86 code does not have the checks). The origin of this problem is the KNC driver being based on the P6 one. The P6 driver also has this issue, but works anyway due to various lucky accidents. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: eranian@gmail.com Cc: Meadows Lawrence F <lawrence.f.meadows@intel.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1210171303290.23243@vincent-weaver-1.um.maine.edu Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-10-24  perf/x86: Make Intel KNC use full 40-bit width of counters  (Vince Weaver)
Early versions of Intel KNC chips have a bug where bits above 32 were not properly set. We worked around this by only using the bottom 32 bits (out of 40 that should be available). It turns out this workaround breaks overflow handling. The buggy silicon will in theory never be used in production systems, so remove this workaround so we get proper overflow support. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: eranian@gmail.com Cc: Meadows Lawrence F <lawrence.f.meadows@intel.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1210171302140.23243@vincent-weaver-1.um.maine.edu Signed-off-by: Ingo Molnar <mingo@kernel.org>
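In terms of the driver structure, the change boils down to the counter-width fields of the KNC struct x86_pmu going from 32 to the full 40 bits, roughly:

    /* Sketch of the relevant fields (other initializers omitted): */
    static struct x86_pmu knc_pmu_example = {
            .cntval_bits    = 40,                   /* was 32 as a workaround */
            .cntval_mask    = (1ULL << 40) - 1,
    };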
2012-10-24  perf/x86/uncore: Handle pci_read_config_dword() errors  (Yan, Zheng)
This, beyond handling corner cases, also fixes some build warnings: arch/x86/kernel/cpu/perf_event_intel_uncore.c: In function ‘snbep_uncore_pci_disable_box’: arch/x86/kernel/cpu/perf_event_intel_uncore.c:124:9: warning: ‘config’ is used uninitialized in this function [-Wuninitialized] arch/x86/kernel/cpu/perf_event_intel_uncore.c: In function ‘snbep_uncore_pci_enable_box’: arch/x86/kernel/cpu/perf_event_intel_uncore.c:135:9: warning: ‘config’ is used uninitialized in this function [-Wuninitialized] arch/x86/kernel/cpu/perf_event_intel_uncore.c: In function ‘snbep_uncore_pci_read_counter’: arch/x86/kernel/cpu/perf_event_intel_uncore.c:164:2: warning: ‘count’ is used uninitialized in this function [-Wuninitialized] Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Cc: a.p.zijlstra@chello.nl Link: http://lkml.kernel.org/r/1351068140-13456-1-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
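The general shape of the fix is to use the config value only when the PCI config read succeeds; a simplified sketch (the register offset and flag name are assumptions, not the exact uncore code):

    #include <linux/pci.h>

    #define EXAMPLE_PMON_BOX_CTL_FRZ    (1U << 8)   /* assumed freeze-bit name/position */

    static void uncore_pci_enable_box_example(struct pci_dev *pdev, int box_ctl)
    {
            u32 config = 0;

            if (!pci_read_config_dword(pdev, box_ctl, &config)) {
                    config &= ~EXAMPLE_PMON_BOX_CTL_FRZ;
                    pci_write_config_dword(pdev, box_ctl, config);
            }
            /* on a failed read, 'config' is never used uninitialized */
    }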
2012-10-24  perf/x86: Remove P6 cpuc->enabled check  (Vince Weaver)
Between 2.6.33 and 2.6.34 the PMU code was made modular. The x86_pmu_enable() call was extended to disable cpuc->enabled and iterate the counters, enabling one at a time, before calling enable_all() at the end, followed by re-enabling cpuc->enabled. Since cpuc->enabled was set to 0, that change effectively caused the "val |= ARCH_PERFMON_EVENTSEL_ENABLE;" code in p6_pmu_enable_event() and p6_pmu_disable_event() to be dead code that was never called. This change removes this code (which was confusing) and adds some extra commentary to make it more clear what is going on. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1210191732000.14552@vincent-weaver-1.um.maine.edu Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-10-24  perf/x86: Update/fix generic events on P6 PMU  (Vince Weaver)
This patch updates the generic events on P6, including some new extended cache events. Values for these events were taken from the equivalent PAPI predefined events. Tested on a Pentium II. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1210191730080.14552@vincent-weaver-1.um.maine.edu Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-10-24  perf/x86: Fix P6 FP_ASSIST event constraint  (Vince Weaver)
According to Intel SDM Volume 3B, FP_ASSIST is limited to Counter 1 only, not Counter 0. Tested on a Pentium II. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1210191728570.14552@vincent-weaver-1.um.maine.edu Signed-off-by: Ingo Molnar <mingo@kernel.org>
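Expressed in the P6 constraint table, the fix amounts to pointing the FP_ASSIST event at counter 1's bit in the counter mask; roughly (event code and macros shown as commonly used in perf_event.h, treat as a sketch):

    static struct event_constraint p6_event_constraints_example[] =
    {
            INTEL_EVENT_CONSTRAINT(0xc1, 0x1),      /* FLOPS, counter 0 only */
            INTEL_EVENT_CONSTRAINT(0x11, 0x2),      /* FP_ASSIST, counter 1 only (was 0x1) */
            EVENT_CONSTRAINT_END
    };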
2012-10-24  x86/perf: Fix virtualization sanity check  (Andre Przywara)
In check_hw_exists() we try to detect non-emulated MSR accesses by writing an arbitrary value into one of the PMU registers and check if it's value after a readout is still the same. This algorithm silently assumes that the register does not contain the magic value already, which is wrong in at least one situation. Fix the algorithm to really do a read-modify-write cycle. This fixes a warning under Xen under some circumstances on AMD family 10h CPUs. The reasons in more details actually sound like a story from Believe It or Not!: First you need an AMD family 10h/12h CPU. These do not reset the PERF_CTR registers on a reboot. Now you boot bare metal Linux, which goes successfully through this check, but leaves the magic value of 0xabcd in the register. You don't use the performance counters, but do a reboot (warm reset). Then you choose to boot Xen. The check will be triggered with a recent Linux kernel as Dom0 again, trying to write 0xabcd into the MSR. Xen silently drops the write (expected), but the subsequent read will return the value in the register, which just happens to be the expected magic value. Thus the test misleadingly succeeds, leaving the kernel in the belief that the PMU is available. This will trigger the following message: [ 0.020294] ------------[ cut here ]------------ [ 0.020311] WARNING: at arch/x86/xen/enlighten.c:730 xen_apic_write+0x15/0x17() [ 0.020318] Hardware name: empty [ 0.020323] Modules linked in: [ 0.020334] Pid: 1, comm: swapper/0 Not tainted 3.3.8 #7 [ 0.020340] Call Trace: [ 0.020354] [<ffffffff81050379>] warn_slowpath_common+0x80/0x98 [ 0.020369] [<ffffffff810503a6>] warn_slowpath_null+0x15/0x17 [ 0.020378] [<ffffffff810034df>] xen_apic_write+0x15/0x17 [ 0.020392] [<ffffffff8101cb2b>] perf_events_lapic_init+0x2e/0x30 [ 0.020410] [<ffffffff81ee4dd0>] init_hw_perf_events+0x250/0x407 [ 0.020419] [<ffffffff81ee4b80>] ? check_bugs+0x2d/0x2d [ 0.020430] [<ffffffff81002181>] do_one_initcall+0x7a/0x131 [ 0.020444] [<ffffffff81edbbf9>] kernel_init+0x91/0x15d [ 0.020456] [<ffffffff817caaa4>] kernel_thread_helper+0x4/0x10 [ 0.020471] [<ffffffff817c347c>] ? retint_restore_args+0x5/0x6 [ 0.020481] [<ffffffff817caaa0>] ? gs_change+0x13/0x13 [ 0.020500] ---[ end trace a7919e7f17c0a725 ]--- The new code will change every of the 16 low bits read from the register and tries to write and read-back that modified number from the MSR. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Avi Kivity <avi@redhat.com> Link: http://lkml.kernel.org/r/1349797115-28346-2-git-send-email-andre.przywara@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
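The essence of the new check is a genuine read-modify-write: read the counter MSR, flip the 16 low bits, write the result back and verify the read-back; a simplified sketch:

    #include <asm/msr.h>

    static bool pmu_counter_msr_works_example(unsigned int msr)
    {
            u64 val, val_new;

            if (rdmsrl_safe(msr, &val))
                    return false;

            val ^= 0xffffUL;                        /* flip each of the 16 low bits we read */

            if (wrmsrl_safe(msr, val))
                    return false;
            if (rdmsrl_safe(msr, &val_new))
                    return false;

            /* if the write was silently dropped (e.g. by a hypervisor), this mismatches */
            return val_new == val;
    }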
2012-10-20  perf/x86: Disable uncore on virtualized CPUs  (Yan, Zheng)
Initializing the uncore PMU on a virtualized CPU may hang the kernel. This is because kvm does not emulate the entire hardware. There are lots of uncore-related MSRs, and making kvm enumerate them all is a non-trivial task. So just disable uncore on virtualized CPUs. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Tested-by: Pekka Enberg <penberg@kernel.org> Cc: a.p.zijlstra@chello.nl Cc: eranian@google.com Cc: andi@firstfloor.org Cc: avi@redhat.com Link: http://lkml.kernel.org/r/1345540117-14164-1-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
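The guard itself is a one-liner early in the uncore init path, roughly:

    #include <linux/init.h>
    #include <asm/cpufeature.h>

    static int __init intel_uncore_init_example(void)
    {
            /* kvm and other hypervisors do not emulate the uncore MSRs */
            if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
                    return -ENODEV;

            /* ... normal uncore PMU discovery and registration ... */
            return 0;
    }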
2012-10-19  Merge branch 'perf-urgent-for-linus' of ↵  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Assorted small fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf python: Properly link with libtraceevent perf hists browser: Add back callchain folding symbol perf tools: Fix build on sparc. perf python: Link with libtraceevent perf python: Initialize 'page_size' variable tools lib traceevent: Fix missed freeing of subargs in free_arg() in filter lib tools traceevent: Add back pevent assignment in __pevent_parse_format() perf hists browser: Fix off-by-two bug on the first column perf tools: Remove warnings on JIT samples for srcline sort key perf tools: Fix segfault when using srcline sort key perf: Require exclude_guest to use PEBS - kernel side enforcement perf tool: Precise mode requires exclude_guest
2012-10-20  Merge tag 'perf-urgent-for-mingo' of ↵  (Ingo Molnar)
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: * The python binding needs to link with libtraceevent and to initialize the 'page_size' variable so that mmaping works again. * The callchain folding character that appears on the TUI just before the overhead had disappeared due to recent changes, add it back. * Intel PEBS in VT-x context uses the DS address as a guest linear address, even though it's programmed by the host as a host linear address. This either results in guest memory corruption and/or the hardware faulting and 'crashing' the virtual machine. Therefore we have to disable PEBS on VT-x enter and re-enable on VT-x exit, enforcing a strict exclude_guest. Kernel side enforcement fix by Peter Zijlstra, tooling side fix by David Ahern. * Fix build on sparc due to UAPI, fix from David Miller. * Fixes for the srcline sort key for unresolved symbols and when processing samples in JITted code, where we don't have an ELF file, just a special symbol table, fixes from Namhyung Kim. * Fix some leaks in libtraceevent, from Steven Rostedt. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-10-19  Merge commit '5bc66170dc486556a1e36fd384463536573f4b82' into x86/urgent  (H. Peter Anvin)
From Borislav Petkov <bp@amd64.org>: Below is a RAS fix which reverts the addition of a sysfs attribute which we agreed is not needed, post-factum. And this should go in now because that sysfs attribute is going to end up in 3.7 otherwise and thus exposed to userspace; removing it then would be a lot harder. This is done as a merge rather than a simple patch/cherry-pick since the baseline for this patch was not in the previous x86/urgent. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-10-19  x86, MCE: Remove bios_cmci_threshold sysfs attribute  (Borislav Petkov)
450cc201038f3 ("x86/mce: Provide boot argument to honour bios-set CMCI threshold") added the bios_cmci_threshold sysfs attribute which was supposed to communicate to userspace tools that the BIOS-set CMCI threshold has been honoured. However, this info is of no real importance to userspace - it should rather read the actual, already-thresholded error count from MCi_STATUS[38:52]. So drop this before it becomes a used interface (good thing we caught this early in 3.7-rc1, right after the merge window closed). Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/20121017105940.GA14590@x1.osrc.amd.com Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-10-17  x86, amd, mce: Avoid NULL pointer reference on CPU northbridge lookup  (Daniel J Blueman)
When booting on a federated multi-server system (NumaScale), the processor Northbridge lookup returns NULL; add guards to prevent this causing an oops. On those systems, the northbridge is accessed through MMIO and the "normal" northbridge enumeration in amd_nb.c doesn't work since we're generating the northbridge ID from the initial APIC ID and the last is not unique on those systems. Long story short, we end up without northbridge descriptors. Signed-off-by: Daniel J Blueman <daniel@numascale-asia.com> Cc: stable@vger.kernel.org # 3.6 Link: http://lkml.kernel.org/r/1349073725-14093-1-git-send-email-daniel@numascale-asia.com [ Boris: beef up commit message ] Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
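The shape of the guard: look up the node's northbridge descriptor and bail out of the shared-bank path when it is missing; an illustrative helper, not the exact mce_amd.c hunk:

    #include <asm/amd_nb.h>

    static bool nb_descriptor_present_example(unsigned int cpu)
    {
            struct amd_northbridge *nb = node_to_amd_nb(amd_get_nb_id(cpu));

            /* NULL on federated (NumaScale-style) systems without enumerated NBs */
            return nb != NULL;
    }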
2012-10-16  perf: Require exclude_guest to use PEBS - kernel side enforcement  (Peter Zijlstra)
Intel PEBS in VT-x context uses the DS address as a guest linear address, even though it's programmed by the host as a host linear address. This either results in guest memory corruption and/or the hardware faulting and 'crashing' the virtual machine. Therefore we have to disable PEBS on VT-x enter and re-enable on VT-x exit, enforcing a strict exclude_guest. This patch enforces exclude_guest on the kernel side. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Avi Kivity <avi@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Robert Richter <robert.richter@amd.com> Link: http://lkml.kernel.org/r/1347569955-54626-3-git-send-email-dsahern@gmail.com Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
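Kernel-side, the enforcement boils down to rejecting precise (PEBS) events that do not exclude guest mode; approximately (a sketch - the exact hook point and error handling in the event setup path may differ):

    #include <linux/errno.h>
    #include <linux/perf_event.h>

    static int x86_pebs_exclude_guest_check_example(struct perf_event *event)
    {
            /* under VT-x, PEBS would write the DS area using guest linear addresses */
            if (event->attr.precise_ip && !event->attr.exclude_guest)
                    return -EOPNOTSUPP;

            return 0;
    }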
2012-10-13  Merge branch 'perf-urgent-for-linus' of ↵  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "This tree includes some late late perf items that missed the first round: tools: - Bash auto completion improvements, now we can auto complete the tools long options, tracepoint event names, etc, from Namhyung Kim. - Look up thread using tid instead of pid in 'perf sched'. - Move global variables into a perf_kvm struct, from David Ahern. - Hists refactorings, preparatory for improved 'diff' command, from Jiri Olsa. - Hists refactorings, preparatory for event group viewieng work, from Namhyung Kim. - Remove double negation on optional feature macro definitions, from Namhyung Kim. - Remove several cases of needless global variables, on most builtins. - misc fixes kernel: - sysfs support for IBS on AMD CPUs, from Robert Richter. - Support for an upcoming Intel CPU, the Xeon-Phi / Knights Corner HPC blade PMU, from Vince Weaver. - misc fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) perf: Fix perf_cgroup_switch for sw-events perf: Clarify perf_cpu_context::active_pmu usage by renaming it to ::unique_pmu perf/AMD/IBS: Add sysfs support perf hists: Add more helpers for hist entry stat perf hists: Move he->stat.nr_events initialization to a template perf hists: Introduce struct he_stat perf diff: Removing the total_period argument from output code perf tool: Add hpp interface to enable/disable hpp column perf tools: Removing hists pair argument from output path perf hists: Separate overhead and baseline columns perf diff: Refactor diff displacement possition info perf hists: Add struct hists pointer to struct hist_entry perf tools: Complete tracepoint event names perf/x86: Add support for Intel Xeon-Phi Knights Corner PMU perf evlist: Remove some unused methods perf evlist: Introduce add_newtp method perf kvm: Move global variables into a perf_kvm struct perf tools: Convert to BACKTRACE_SUPPORT perf tools: Long option completion support for each subcommands perf tools: Complete long option names of perf command ...
2012-10-05  perf/AMD/IBS: Add sysfs support  (Robert Richter)
Add sysfs format entries for AMD IBS PMUs: # find /sys/bus/event_source/devices/ibs_*/format /sys/bus/event_source/devices/ibs_fetch/format /sys/bus/event_source/devices/ibs_fetch/format/rand_en /sys/bus/event_source/devices/ibs_op/format /sys/bus/event_source/devices/ibs_op/format/cnt_ctl This allows specifying the following IBS options: $ perf record -e ibs_fetch/rand_en=1/GH ... $ perf record -e ibs_op/cnt_ctl=1/GH ... Option cnt_ctl is only enabled if the IBS_CAPS_OPCNT bit is set in the IBS cpuid feature flags (AMD family 10h RevC and above). Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1347447584-28405-1-git-send-email-robert.richter@amd.com [ Added small readability improvements. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-10-04  perf/x86: Add support for Intel Xeon-Phi Knights Corner PMU  (Vince Weaver)
The following patch adds perf_event support for the Xeon-Phi PMU, as documented in the "Intel Xeon Phi Coprocessor (codename: Knights Corner) Performance Monitoring Units" manual. Even though it is a co-processor, a Phi runs a full Linux environment and can support performance counters. This is just barebones support; it does not add support for interesting new features such as the SPFLT instruction that allows starting/stopping events without entering the kernel. The PMU internally is just like that of an original Pentium, but a "P6-like" MSR interface is provided. The interface is different enough from a real P6 that it's not easy (or practical) to re-use the code in perf_event_p6.c. Acked-by: Lawrence F Meadows <lawrence.f.meadows@intel.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: eranian@gmail.com Cc: Lawrence F <lawrence.f.meadows@intel.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1209261405320.8398@vincent-weaver-1.um.maine.edu Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-10-04  x86/cache_info: Use ARRAY_SIZE() in amd_l3_attrs()  (Dan Carpenter)
Using ARRAY_SIZE() is more readable. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Shai Fultheim <shai@scalemp.com> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Link: http://lkml.kernel.org/r/20121002083409.GM12398@elgon.mountain Signed-off-by: Ingo Molnar <mingo@kernel.org>
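The change is the usual open-coded-sizeof-division to ARRAY_SIZE() cleanup, i.e. (generic illustration with a hypothetical array, not the exact hunk):

    #include <linux/kernel.h>       /* ARRAY_SIZE() */
    #include <linux/sysfs.h>

    static struct attribute *l3_attrs_example[4];   /* illustrative array */

    static int count_l3_attrs_example(void)
    {
            /* before: sizeof(l3_attrs_example) / sizeof(l3_attrs_example[0]) */
            return ARRAY_SIZE(l3_attrs_example);
    }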
2012-10-02  UAPI: Partition the header include path sets and add uapi/ header directories  (David Howells)
Partition the header include path flags into two sets, one for kernelspace builds and one for userspace builds. Add the following directories to build after the ordinary include directories so that #include will pick up the UAPI header directly if the kernel header has been moved there. The userspace set (represented by the USERINCLUDE make variable) contains: -I $(srctree)/arch/$(hdr-arch)/include/uapi -I arch/$(hdr-arch)/include/generated/uapi -I $(srctree)/include/uapi -I include/generated/uapi -include $(srctree)/include/linux/kconfig.h and the kernelspace set (represented by the LINUXINCLUDE make variable) contains: -I $(srctree)/arch/$(hdr-arch)/include -I arch/$(hdr-arch)/include/generated -I $(srctree)/include -I include --- if not building in the source tree plus everything in the USERINCLUDE set. Then use USERINCLUDE in building the x86 boot code. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>
2012-10-01  Merge branch 'x86-smap-for-linus' of ↵  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/smap support from Ingo Molnar: "This adds support for the SMAP (Supervisor Mode Access Prevention) CPU feature on Intel CPUs: a hardware feature that prevents unintended user-space data access from kernel privileged code. It's turned on automatically when possible. This, in combination with SMEP, makes it even harder to exploit kernel bugs such as NULL pointer dereferences." Fix up trivial conflict in arch/x86/kernel/entry_64.S due to newly added includes right next to each other. * 'x86-smap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, smep, smap: Make the switching functions one-way x86, suspend: On wakeup always initialize cr4 and EFER x86-32: Start out eflags and cr4 clean x86, smap: Do not abuse the [f][x]rstor_checking() functions for user space x86-32, smap: Add STAC/CLAC instructions to 32-bit kernel entry x86, smap: Reduce the SMAP overhead for signal handling x86, smap: A page fault due to SMAP is an oops x86, smap: Turn on Supervisor Mode Access Prevention x86, smap: Add STAC and CLAC instructions to control user space access x86, uaccess: Merge prototypes for clear_user/__clear_user x86, smap: Add a header file with macros for STAC/CLAC x86, alternative: Add header guards to <asm/alternative-asm.h> x86, alternative: Use .pushsection/.popsection x86, smap: Add CR4 bit for SMAP x86-32, mm: The WP test should be done on a kernel page
2012-10-01  Merge branch 'x86-mm-for-linus' of ↵  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/mm changes from Ingo Molnar: "The biggest change is new TLB partial flushing code for AMD CPUs. (The v3.6 kernel had the Intel CPU side code, see commits e0ba94f14f74..effee4b9b3b.) There's also various other refinements around the TLB flush code" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Distinguish TLB shootdown interrupts from other functions call interrupts x86/mm: Fix range check in tlbflush debugfs interface x86, cpu: Preset default tlb_flushall_shift on AMD x86, cpu: Add AMD TLB size detection x86, cpu: Push TLB detection CPUID check down x86, cpu: Fixup tlb_flushall_shift formatting
2012-10-01  Merge branch 'x86-mce-for-linus' of ↵  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/MCE update from Ingo Molnar: "Various MCE robustness enhancements. One of the changes adds CMCI (Corrected Machine Check Interrupt) poll mode on Intel Nehalem+ CPUs, which mode is automatically entered when the rate of messages is too high - and exited once the storm is over. An MCE events storm will roughly look like this: [ 5342.740616] mce: [Hardware Error]: Machine check events logged [ 5342.746501] mce: [Hardware Error]: Machine check events logged [ 5342.757971] CMCI storm detected: switching to poll mode [ 5372.674957] CMCI storm subsided: switching to interrupt mode This should make such events more survivable" * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Provide boot argument to honour bios-set CMCI threshold x86, MCE: Remove unused defines x86, mce: Enable MCA support by default x86/mce: Add CMCI poll mode x86/mce: Make cmci_discover() quiet x86: mce: Remove the frozen cases in the hotplug code x86: mce: Split timer init x86: mce: Serialize mce injection x86: mce: Disable preemption when calling raise_local()
2012-10-01  Merge branch 'x86-fpu-for-linus' of ↵  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/fpu update from Ingo Molnar: "The biggest change is the addition of the non-lazy (eager) FPU saving support model and enabling it on CPUs with optimized xsaveopt/xrstor FPU state saving instructions. There are also various Sparse fixes" Fix up trivial add-add conflict in arch/x86/kernel/traps.c * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, kvm: fix kvm's usage of kernel_fpu_begin/end() x86, fpu: remove cpu_has_xmm check in the fx_finit() x86, fpu: make eagerfpu= boot param tri-state x86, fpu: enable eagerfpu by default for xsaveopt x86, fpu: decouple non-lazy/eager fpu restore from xsave x86, fpu: use non-lazy fpu restore for processors supporting xsave lguest, x86: handle guest TS bit for lazy/non-lazy fpu host models x86, fpu: always use kernel_fpu_begin/end() for in-kernel FPU usage x86, kvm: use kernel_fpu_begin/end() in kvm_load/put_guest_fpu() x86, fpu: remove unnecessary user_fpu_end() in save_xstate_sig() x86, fpu: drop_fpu() before restoring new state from sigframe x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels x86, fpu: Consolidate inline asm routines for saving/restoring fpu state x86, signal: Cleanup ifdefs and is_ia32, is_x32
2012-10-01  Merge branch 'x86-debug-for-linus' of ↵  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 debug update from Ingo Molnar: "Various small enhancements" * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/debug: Dump family, model, stepping of the boot CPU x86/iommu: Use NULL instead of plain 0 for __IOMMU_INIT x86/iommu: Drop duplicate const in __IOMMU_INIT x86/fpu/xsave: Keep __user annotation in casts x86/pci/probe_roms: Add missing __iomem annotation to pci_map_biosrom() x86/signals: ia32_signal.c: add __user casts to fix sparse warnings x86/vdso: Add __user annotation to VDSO32_SYMBOL x86: Fix __user annotations in asm/sys_ia32.h
2012-10-01  Merge branches 'x86-cpu-for-linus' and 'x86-cpufeature-for-linus' of ↵  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/cpu and x86/cpufeature from Ingo Molnar: "One tiny cleanup, and prepare for SMAP (Supervisor Mode Access Prevention) support on x86" * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Remove the useless branch in c_start() * 'x86-cpufeature-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, cpufeature: Add feature bit for SMAP
2012-10-01  Merge branch 'x86-asm-for-linus' of ↵  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/asm changes from Ingo Molnar: "The one change that stands out is the alternatives patching change that prevents us from ever patching back instructions from SMP to UP: this simplifies things and speeds up CPU hotplug. Other than that it's smaller fixes, cleanups and improvements." * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Unspaghettize do_trap() x86_64: Work around old GAS bug x86: Use REP BSF unconditionally x86: Prefer TZCNT over BFS x86/64: Adjust types of temporaries used by ffs()/fls()/fls64() x86: Drop unnecessary kernel_eflags variable on 64-bit x86/smp: Don't ever patch back to UP if we unplug cpus
2012-10-01  Merge branch 'perf-urgent-for-linus' of ↵  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Ingo Molnar: "Leftover perf/urgent fix from the v3.6 cycle" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Fix typo in uncore_pmu_to_box
2012-10-01  Merge branch 'perf-core-for-linus' of ↵  (Linus Torvalds)
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf update from Ingo Molnar: "Lots of changes in this cycle as well, with hundreds of commits from over 30 contributors. Most of the activity was on the tooling side. Higher level changes: - New 'perf kvm' analysis tool, from Xiao Guangrong. - New 'perf trace' system-wide tracing tool - uprobes fixes + cleanups from Oleg Nesterov. - Lots of patches to make perf build on Android out of box, from Irina Tirdea - Extend ftrace function tracing utility to be more dynamic for its users. It allows for data passing to the callback functions, as well as reading regs as if a breakpoint were to trigger at function entry. The main goal of this patch series was to allow kprobes to use ftrace as an optimized probe point when a probe is placed on an ftrace nop. With lots of help from Masami Hiramatsu, and going through lots of iterations, we finally came up with a good solution. - Add cpumask for uncore pmu, use it in 'stat', from Yan, Zheng. - Various tracing updates from Steve Rostedt - Clean up and improve 'perf sched' performance by elliminating lots of needless calls to libtraceevent. - Event group parsing support, from Jiri Olsa - UI/gtk refactorings and improvements from Namhyung Kim - Add support for non-tracepoint events in perf script python, from Feng Tang - Add --symbols to 'script', similar to the one in 'report', from Feng Tang. Infrastructure enhancements and fixes: - Convert the trace builtins to use the growing evsel/evlist tracepoint infrastructure, removing several open coded constructs like switch like series of strcmp to dispatch events, etc. Basically what had already been showcased in 'perf sched'. - Add evsel constructor for tracepoints, that uses libtraceevent just to parse the /format events file, use it in a new 'perf test' to make sure the libtraceevent format parsing regressions can be more readily caught. - Some strange errors were happening in some builds, but not on the next, reported by several people, problem was some parser related files, generated during the build, didn't had proper make deps, fix from Eric Sandeen. - Introduce struct and cache information about the environment where a perf.data file was captured, from Namhyung Kim. - Fix handling of unresolved samples when --symbols is used in 'report', from Feng Tang. - Add union member access support to 'probe', from Hyeoncheol Lee. - Fixups to die() removal, from Namhyung Kim. - Render fixes for the TUI, from Namhyung Kim. - Don't enable annotation in non symbolic view, from Namhyung Kim. - Fix pipe mode in 'report', from Namhyung Kim. - Move related stats code from stat to util/, will be used by the 'stat' kvm tool, from Xiao Guangrong. - Remove die()/exit() calls from several tools. - Resolve vdso callchains, from Jiri Olsa - Don't pass const char pointers to basename, so that we can unconditionally use libgen.h and thus avoid ifdef BIONIC lines, from David Ahern - Refactor hist formatting so that it can be reused with the GTK browser, From Namhyung Kim - Fix build for another rbtree.c change, from Adrian Hunter. - Make 'perf diff' command work with evsel hists, from Jiri Olsa. - Use the only field_sep var that is set up: symbol_conf.field_sep, fix from Jiri Olsa. - .gitignore compiled python binaries, from Namhyung Kim. - Get rid of die() in more libtraceevent places, from Namhyung Kim. 
- Rename libtraceevent 'private' struct member to 'priv' so that it works in C++, from Steven Rostedt - Remove lots of exit()/die() calls from tools so that the main perf exit routine can take place, from David Ahern - Fix x86 build on x86-64, from David Ahern. - {int,str,rb}list fixes from Suzuki K Poulose - perf.data header fixes from Namhyung Kim - Allow user to indicate objdump path, needed in cross environments, from Maciek Borzecki - Fix hardware cache event name generation, fix from Jiri Olsa - Add round trip test for sw, hw and cache event names, catching the problem Jiri fixed, after Jiri's patch, the test passes successfully. - Clean target should do clean for lib/traceevent too, fix from David Ahern - Check the right variable for allocation failure, fix from Namhyung Kim - Set up evsel->tp_format regardless of evsel->name being set already, fix from Namhyung Kim - Oprofile fixes from Robert Richter. - Remove perf_event_attr needless version inflation, from Jiri Olsa - Introduce libtraceevent strerror like error reporting facility, from Namhyung Kim - Add pmu mappings to perf.data header and use event names from cmd line, from Robert Richter - Fix include order for bison/flex-generated C files, from Ben Hutchings - Build fixes and documentation corrections from David Ahern - Assorted cleanups from Robert Richter - Let O= makes handle relative paths, from Steven Rostedt - perf script python fixes, from Feng Tang. - Initial bash completion support, from Frederic Weisbecker - Allow building without libelf, from Namhyung Kim. - Support DWARF CFI based unwind to have callchains when %bp based unwinding is not possible, from Jiri Olsa. - Symbol resolution fixes, while fixing support PPC64 files with an .opt ELF section was the end goal, several fixes for code that handles all architectures and cleanups are included, from Cody Schafer. - Assorted fixes for Documentation and build in 32 bit, from Robert Richter - Cache the libtraceevent event_format associated to each evsel early, so that we avoid relookups, i.e. calling pevent_find_event repeatedly when processing tracepoint events. [ This is to reduce the surface contact with libtraceevents and make clear what is that the perf tools needs from that lib: so far parsing the common and per event fields. ] - Don't stop the build if the audit libraries are not installed, fix from Namhyung Kim. - Fix bfd.h/libbfd detection with recent binutils, from Markus Trippelsdorf. 
- Improve warning message when libunwind devel packages not present, from Jiri Olsa" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (282 commits) perf trace: Add aliases for some syscalls perf probe: Print an enum type variable in "enum variable-name" format when showing accessible variables perf tools: Check libaudit availability for perf-trace builtin perf hists: Add missing period_* fields when collapsing a hist entry perf trace: New tool perf evsel: Export the event_format constructor perf evsel: Introduce rawptr() method perf tools: Use perf_evsel__newtp in the event parser perf evsel: The tracepoint constructor should store sys:name perf evlist: Introduce set_filter() method perf evlist: Renane set_filters method to apply_filters perf test: Add test to check we correctly parse and match syscall open parms perf evsel: Handle endianity in intval method perf evsel: Know if byte swap is needed perf tools: Allow handling a NULL cpu_map as meaning "all cpus" perf evsel: Improve tracepoint constructor setup tools lib traceevent: Fix error path on pevent_parse_event perf test: Fix build failure trace: Move trace event enable from fs_initcall to core_initcall tracing: Add an option for disabling markers ...
2012-09-27  x86/mce: Provide boot argument to honour bios-set CMCI threshold  (Naveen N. Rao)
The ACPI spec doesn't provide a way for the BIOS to pass down recommended thresholds to the OS on a _per-bank_ basis. This patch adds a new boot option which, if passed, tells Linux to use CMCI thresholds set by the BIOS. As a fail-safe, we initialize the threshold to 1 and warn the user if some banks have not been initialized by the BIOS. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
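The option is passed on the kernel command line; going by the bios_cmci_threshold attribute name used elsewhere in this log, the usage looks like the following (treat the exact spelling as an assumption):

    mce=bios_cmci_threshold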
2012-09-27  x86, smep, smap: Make the switching functions one-way  (H. Peter Anvin)
There is no fundamental reason why we should switch SMEP and SMAP on during early cpu initialization just to switch them off again. Now with %eflags and %cr4 forced to be initialized to a clean state, we only need the one-way enable. Also, make the functions inline to make them (somewhat) harder to abuse. This does mean that SMEP and SMAP do not get initialized anywhere near as early. Even using early_param() instead of __setup() doesn't give us control early enough to do this during the early cpu initialization phase. This seems reasonable to me, because SMEP and SMAP should not matter until we have userspace to protect ourselves from, but it does potentially make it possible for a bug involving a "leak of permissions to userspace" to go uncaught. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-09-27  perf/x86: Fix typo in uncore_pmu_to_box  (Yan, Zheng)
The variable box should not be declared as static. This could probably cause crashes with sufficiently parallel uncore PMU use. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Cc: eranian@google.com Cc: a.p.zijlstra@chello.nl Link: http://lkml.kernel.org/r/1348709606-2759-1-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-26  x86: Remove the useless branch in c_start()  (Michael Wang)
Since 'cpu == -1' in cpumask_next() is legal, no need to handle '*pos == 0' specially. About the comments: /* just in case, cpu 0 is not the first */ A test with a cpumask in which cpu 0 is not the first has been done, and it works well. This patch will remove that useless branch to clean the code. Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com> Cc: kjwinchester@gmail.com Cc: borislav.petkov@amd.com Cc: ak@linux.intel.com Link: http://lkml.kernel.org/r/1348033343-23658-1-git-send-email-wangyun@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-21  Merge branch 'x86/fpu' into x86/smap  (H. Peter Anvin)
Reason for merge: x86/fpu changed the structure of some of the code that x86/smap changes; mostly fpu-internal.h but also minor changes to the signal code. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Resolved Conflicts: arch/x86/ia32/ia32_signal.c arch/x86/include/asm/fpu-internal.h arch/x86/kernel/signal.c
2012-09-21  x86, smap: Turn on Supervisor Mode Access Prevention  (H. Peter Anvin)
If Supervisor Mode Access Prevention is available and not disabled by the user, turn it on. Also fix the expansion of SMEP (Supervisor Mode Execution Prevention.) Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1348256595-29119-10-git-send-email-hpa@linux.intel.com
2012-09-21  x86, smap: Add STAC and CLAC instructions to control user space access  (H. Peter Anvin)
When Supervisor Mode Access Prevention (SMAP) is enabled, access to userspace from the kernel is controlled by the AC flag. To make the performance of manipulating that flag acceptable, there are two new instructions, STAC and CLAC, to set and clear it. This patch adds those instructions, via alternative(), when the SMAP feature is enabled. It also adds X86_EFLAGS_AC unconditionally to the SYSCALL entry mask; there is simply no reason to make that one conditional. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1348256595-29119-9-git-send-email-hpa@linux.intel.com
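Conceptually, the new user-access helpers are alternatives-patched wrappers around the two instructions; a rough sketch in the style of asm/smap.h (the real header may emit the opcode bytes directly for older assemblers):

    #include <asm/alternative.h>
    #include <asm/cpufeature.h>

    static __always_inline void clac_example(void)
    {
            /* no-op unless the CPU has SMAP; the CLAC is patched in via alternatives */
            alternative("", "clac", X86_FEATURE_SMAP);
    }

    static __always_inline void stac_example(void)
    {
            alternative("", "stac", X86_FEATURE_SMAP);
    }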
2012-09-19  perf/x86: Fix Intel Ivy Bridge support  (Stephane Eranian)
This patch updates the existing Intel IvyBridge (model 58) support with proper PEBS event constraints. It cannot reuse the same constraints as SandyBridge because some events (0xd3) are specific to IvyBridge. Also, there is no UOPS_DISPATCHED.THREAD on IVB, so do not populate the PERF_COUNT_HW_STALLED_CYCLES_BACKEND mapping. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: ak@linux.intel.com Link: http://lkml.kernel.org/r/20120910230701.GA5898@quad Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-19  x86/debug: Dump family, model, stepping of the boot CPU  (Borislav Petkov)
When acting on a user bug report, we find ourselves constantly asking for /proc/cpuinfo in order to know the exact family, model, stepping of the CPU in question. Instead of having to ask this, add this to dmesg so that it is visible and no ambiguities can ensue from looking at the official name string of the CPU coming from CPUID and trying to map it to f/m/s. Output then looks like this: [ 0.146041] smpboot: CPU0: AMD FX(tm)-8100 Eight-Core Processor (fam: 15, model: 01, stepping: 02) Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Link: http://lkml.kernel.org/r/1347640666-13638-1-git-send-email-bp@amd64.org [ tweaked it minimally to add commas. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-19  Merge tag 'v3.6-rc6' into x86/mce  (Ingo Molnar)
Merge Linux v3.6-rc6, to refresh this tree. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-18  x86, fpu: decouple non-lazy/eager fpu restore from xsave  (Suresh Siddha)
Decouple the non-lazy/eager fpu restore policy from the existence of the xsave feature. Introduce a synthetic CPUID flag to represent the eagerfpu policy. The "eagerfpu=on" boot parameter will enable the policy. Requested-by: H. Peter Anvin <hpa@zytor.com> Requested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1347300665-6209-2-git-send-email-suresh.b.siddha@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
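The synthetic flag plus boot parameter roughly reduce to the following; an illustrative sketch - the real code also handles other parameter values, and the feature-bit name is assumed from this series:

    #include <linux/init.h>
    #include <linux/string.h>
    #include <asm/cpufeature.h>

    static int __init eager_fpu_setup_example(char *s)
    {
            if (!strcmp(s, "on"))
                    setup_force_cpu_cap(X86_FEATURE_EAGER_FPU);     /* assumed flag name */
            return 1;
    }
    __setup("eagerfpu=", eager_fpu_setup_example);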
2012-09-18  x86, fpu: use non-lazy fpu restore for processors supporting xsave  (Suresh Siddha)
The fundamental model of the current Linux kernel is to lazily init and restore the FPU instead of restoring the task state during context switch. This changes that fundamental lazy model to the non-lazy model for processors supporting the xsave feature. Reasons driving this model change are: i. Newer processors support optimized state save/restore using xsaveopt and xrstor by tracking the INIT state and MODIFIED state during context-switch. This is faster than modifying the cr0.TS bit which has serializing semantics. ii. Newer glibc versions use SSE for some of the optimized copy/clear routines. With certain workloads (like boot, kernel-compilation etc), the application completes its work within the first 5 task switches, thus taking up to 5 #DNA traps with the kernel not getting a chance to apply the above mentioned pre-load heuristic. iii. Some xstate features (like AMD's LWP feature) don't honor the cr0.TS bit and thus will not work correctly in the presence of lazy restore. Non-lazy state restore is needed for enabling such features. Some data on a two socket SNB system: * Saved 20K DNA exceptions during boot on a two socket SNB system. * Saved 50K DNA exceptions during kernel-compilation workload. * Improved throughput of the AVX based checksumming function inside the kernel by ~15% as xsave/xrstor is faster than the serializing clts/stts pair. Also now kernel_fpu_begin/end() relies on the patched alternative instructions. So move check_fpu() which uses the kernel_fpu_begin/end() after alternative_instructions(). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1345842782-24175-7-git-send-email-suresh.b.siddha@intel.com Merge 32-bit boot fix from, Link: http://lkml.kernel.org/r/1347300665-6209-4-git-send-email-suresh.b.siddha@intel.com Cc: Jim Kukunas <james.t.kukunas@linux.intel.com> Cc: NeilBrown <neilb@suse.de> Cc: Avi Kivity <avi@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-09-17  perf/x86: Add cpumask for uncore pmu  (Yan, Zheng)
This patch adds a cpumask file to the uncore pmu sysfs directory. The cpumask file contains one active cpu for every socket. Signed-off-by: "Yan, Zheng" <zheng.z.yan@intel.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: "Yan, Zheng" <zheng.z.yan@intel.com> Link: http://lkml.kernel.org/r/1347263631-23175-2-git-send-email-zheng.z.yan@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-09-13  x86: Drop unnecessary kernel_eflags variable on 64-bit  (Ian Campbell)
On 64 bit x86 we save the current eflags in cpu_init for use in ret_from_fork. Strictly speaking, reserved bits in EFLAGS should be read as written, but in practice it is unlikely that EFLAGS could ever be extended in this way and the kernel already clears any undefined flags early on. The equivalent 32 bit code simply hard codes 0x0202 as the new EFLAGS. This change makes 64 bit use the same mechanism to setup the initial EFLAGS on fork. Note that 64 bit resets EFLAGS before calling schedule_tail() as opposed to 32 bit which calls schedule_tail() first. Therefore the correct value for EFLAGS has the opposite IF bit. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Pekka Enberg <penberg@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/20120824195847.GA31628@moon Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-13  perf/x86/ibs: Check syscall attribute flags  (Robert Richter)
The current implementation simply ignores attribute flags. Thus, there is no notification to userland of unsupported features. Check the syscall's attribute flags to let userland know if a feature is supported by the kernel. This is also needed to distinguish this from future kernels that might support a feature. Cc: <stable@vger.kernel.org> v3.5.. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20120910093018.GO8285@erda.amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>