summaryrefslogtreecommitdiffstats
path: root/arch/x86/kernel/cpu
AgeCommit message (Collapse)Author
2010-05-07perf, x86: Move x86_setup_perfctr()Robert Richter
Move x86_setup_perfctr(), no other changes made. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1271190201-25705-3-git-send-email-robert.richter@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-07perf, x86: Move perfctr init code to x86_setup_perfctr()Robert Richter
Split __hw_perf_event_init() to configure pmu events other than perfctrs. Perfctr code is moved to a separate function x86_setup_perfctr(). This and the following patches refactor the code. Split in multiple patches for better review. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1271190201-25705-2-git-send-email-robert.richter@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-07Merge branch 'perf/urgent' into perf/coreIngo Molnar
Merge reason: Resolve patch dependency Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-06x86: Detect running on a Microsoft HyperV systemKy Srinivasan
This patch integrates HyperV detection within the framework currently used by VmWare. With this patch, we can avoid having to replicate the HyperV detection code in each of the Microsoft HyperV drivers. Reworked and tweaked by Greg K-H to build properly. Signed-off-by: K. Y. Srinivasan <ksrinivasan@novell.com> LKML-Reference: <20100506190841.GA1605@kroah.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Vadim Rozenfeld <vrozenfe@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "K.Prasad" <prasad@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Alan Cox <alan@linux.intel.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Hank Janssen <hjanssen@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-05-03x86, cpu: Make APERF/MPERF a normal table-driven flagH. Peter Anvin
APERF/MPERF can be handled via the table like all the other scattered CPU flags. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Renninger <trenn@suse.de> Cc: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1270065406-1814-4-git-send-email-bp@amd64.org>
2010-05-03powernow-k8: Fix frequency reportingMark Langsdorf
With F10, model 10, all valid frequencies are in the ACPI _PST table. Cc: <stable@kernel.org> # 33.x 32.x Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> LKML-Reference: <1270065406-1814-6-git-send-email-bp@amd64.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Reviewed-by: Thomas Renninger <trenn@suse.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-30Merge commit 'v2.6.34-rc6' into perf/coreIngo Molnar
Merge reason: update to the latest -rc. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-28Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip: x86: Disable large pages on CPUs with Atom erratum AAE44 x86-64: Clear a 64-bit FS/GS base on fork if selector is nonzero x86, mrst: Conditionally register cpu hotplug notifier for apbt
2010-04-28x86, asm: Introduce and use percpu_inc()Jan Beulich
... generating slightly smaller code. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4BCF261F020000780003B33C@vpn.id2.novell.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-24VMware Balloon driverDmitry Torokhov
This is a standalone version of VMware Balloon driver. Ballooning is a technique that allows hypervisor dynamically limit the amount of memory available to the guest (with guest cooperation). In the overcommit scenario, when hypervisor set detects that it needs to shuffle some memory, it instructs the driver to allocate certain number of pages, and the underlying memory gets returned to the hypervisor. Later hypervisor may return memory to the guest by reattaching memory to the pageframes and instructing the driver to "deflate" balloon. We are submitting a standalone driver because KVM maintainer (Avi Kivity) expressed opinion (rightly) that our transport does not fit well into virtqueue paradigm and thus it does not make much sense to integrate with virtio. There were also some concerns whether current ballooning technique is the right thing. If there appears a better framework to achieve this we are prepared to evaluate and switch to using it, but in the meantime we'd like to get this driver upstream. We want to get the driver accepted in distributions so that users do not have to deal with an out-of-tree module and many distributions have "upstream first" requirement. The driver has been shipping for a number of years and users running on VMware platform will have it installed as part of VMware Tools even if it will not come from a distribution, thus there should not be additional risk in pulling the driver into mainline. The driver will only activate if host is VMware so everyone else should not be affected at all. Signed-off-by: Dmitry Torokhov <dtor@vmware.com> Cc: Avi Kivity <avi@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-23x86: Disable large pages on CPUs with Atom erratum AAE44H. Peter Anvin
Atom erratum AAE44/AAF40/AAG38/AAH41: "If software clears the PS (page size) bit in a present PDE (page directory entry), that will cause linear addresses mapped through this PDE to use 4-KByte pages instead of using a large page after old TLB entries are invalidated. Due to this erratum, if a code fetch uses this PDE before the TLB entry for the large page is invalidated then it may fetch from a different physical address than specified by either the old large page translation or the new 4-KByte page translation. This erratum may also cause speculative code fetches from incorrect addresses." [http://download.intel.com/design/processor/specupdt/319536.pdf] Where as commit 211b3d03c7400f48a781977a50104c9d12f4e229 seems to workaround errata AAH41 (mixed 4K TLBs) it reduces the window of opportunity for the bug to occur and does not totally remove it. This patch disables mixed 4K/4MB page tables totally avoiding the page splitting and not tripping this processor issue. This is based on an original patch by Colin King. Originally-by: Colin Ian King <colin.king@canonical.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> LKML-Reference: <1269271251-19775-1-git-send-email-colin.king@canonical.com> Cc: <stable@kernel.org>
2010-04-22x86, cacheinfo: Disable index in all four subcachesBorislav Petkov
When disabling an L3 cache index, make sure we disable that index in all four subcaches of the L3. Clarify nomenclature while at it, wrt to disable slots versus disable index and rename accordingly. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1271945222-5283-6-git-send-email-bp@amd64.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-22x86, cacheinfo: Make L3 cache info per nodeBorislav Petkov
Currently, we're allocating L3 cache info and calculating indices for each online cpu which is clearly superfluous. Instead, we need to do this per-node as is each L3 cache. No functional change, only per-cpu memory savings. -v2: Allocate L3 cache descriptors array dynamically. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1271945222-5283-5-git-send-email-bp@amd64.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-22x86, cacheinfo: Reorganize AMD L3 cache structureBorislav Petkov
Add a struct representing L3 cache attributes (subcache sizes and indices count) and move the respective members out of _cpuid4_info. Also, stash the struct pci_dev ptr into the struct simplifying the code even more. There should be no functionality change resulting from this patch except slightly slimming the _cpuid4_info per-cpu vars. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1271945222-5283-4-git-send-email-bp@amd64.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-22x86, cacheinfo: Turn off L3 cache index disable feature in virtualized ↵Frank Arnold
environments When running a quest kernel on xen we get: BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 IP: [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x2ca/0x3df PGD 0 Oops: 0000 [#1] SMP last sysfs file: CPU 0 Modules linked in: Pid: 0, comm: swapper Tainted: G W 2.6.34-rc3 #1 /HVM domU RIP: 0010:[<ffffffff8142f2fb>] [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x 2ca/0x3df RSP: 0018:ffff880002203e08 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000060 RDX: 0000000000000000 RSI: 0000000000000040 RDI: 0000000000000000 RBP: ffff880002203ed8 R08: 00000000000017c0 R09: ffff880002203e38 R10: ffff8800023d5d40 R11: ffffffff81a01e28 R12: ffff880187e6f5c0 R13: ffff880002203e34 R14: ffff880002203e58 R15: ffff880002203e68 FS: 0000000000000000(0000) GS:ffff880002200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000038 CR3: 0000000001a3c000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a44020) Stack: ffffffff810d7ecb ffff880002203e20 ffffffff81059140 ffff880002203e30 <0> ffffffff810d7ec9 0000000002203e40 000000000050d140 ffff880002203e70 <0> 0000000002008140 0000000000000086 ffff880040020140 ffffffff81068b8b Call Trace: <IRQ> [<ffffffff810d7ecb>] ? sync_supers_timer_fn+0x0/0x1c [<ffffffff81059140>] ? mod_timer+0x23/0x25 [<ffffffff810d7ec9>] ? arm_supers_timer+0x34/0x36 [<ffffffff81068b8b>] ? hrtimer_get_next_event+0xa7/0xc3 [<ffffffff81058e85>] ? get_next_timer_interrupt+0x19a/0x20d [<ffffffff8142fa23>] get_cpu_leaves+0x5c/0x232 [<ffffffff8106a7b1>] ? sched_clock_local+0x1c/0x82 [<ffffffff8106a9a0>] ? sched_clock_tick+0x75/0x7a [<ffffffff8107748c>] generic_smp_call_function_single_interrupt+0xae/0xd0 [<ffffffff8101f6ef>] smp_call_function_single_interrupt+0x18/0x27 [<ffffffff8100a773>] call_function_single_interrupt+0x13/0x20 <EOI> [<ffffffff8143c468>] ? notifier_call_chain+0x14/0x63 [<ffffffff810295c6>] ? native_safe_halt+0xc/0xd [<ffffffff810114eb>] ? default_idle+0x36/0x53 [<ffffffff81008c22>] cpu_idle+0xaa/0xe4 [<ffffffff81423a9a>] rest_init+0x7e/0x80 [<ffffffff81b10dd2>] start_kernel+0x40e/0x419 [<ffffffff81b102c8>] x86_64_start_reservations+0xb3/0xb7 [<ffffffff81b103c4>] x86_64_start_kernel+0xf8/0x107 Code: 14 d5 40 ff ae 81 8b 14 02 31 c0 3b 15 47 1c 8b 00 7d 0e 48 8b 05 36 1c 8b 00 48 63 d2 48 8b 04 d0 c7 85 5c ff ff ff 00 00 00 00 <8b> 70 38 48 8d 8d 5c ff ff ff 48 8b 78 10 ba c4 01 00 00 e8 eb RIP [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x2ca/0x3df RSP <ffff880002203e08> CR2: 0000000000000038 ---[ end trace a7919e7f17c0a726 ]--- The L3 cache index disable feature of AMD CPUs has to be disabled if the kernel is running as guest on top of a hypervisor because northbridge devices are not available to the guest. Currently, this fixes a boot crash on top of Xen. In the future this will become an issue on KVM as well. Check if northbridge devices are present and do not enable the feature if there are none. Signed-off-by: Frank Arnold <frank.arnold@amd.com> LKML-Reference: <1271945222-5283-3-git-send-email-bp@amd64.org> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-22x86, cacheinfo: Unify AMD L3 cache index disable checkingBorislav Petkov
All F10h CPUs starting with model 8 resp. 9, stepping 1, support L3 cache index disable. Concentrate the family, model, stepping checking at one place and enable the feature implicitly on upcoming Fam10h models. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1271945222-5283-2-git-send-email-bp@amd64.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-20perf & kvm: Clean up some of the guest profiling callback API detailsZhang, Yanmin
Fix some build bug and programming style issues: - use valid C - fix up various style details Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com> Cc: Avi Kivity <avi@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Sheng Yang <sheng@linux.intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: oerg Roedel <joro@8bytes.org> Cc: Jes Sorensen <Jes.Sorensen@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Zachary Amsden <zamsden@redhat.com> Cc: zhiteng.huang@intel.com Cc: tim.c.chen@intel.com Cc: Arnaldo Carvalho de Melo <acme@infradead.org> LKML-Reference: <1271729638.2078.624.camel@ymzhang.sh.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-19perf: Enhance perf to allow for guest statistic collection from hostZhang, Yanmin
Below patch introduces perf_guest_info_callbacks and related register/unregister functions. Add more PERF_RECORD_MISC_XXX bits meaning guest kernel and guest user space. Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-09powernow-k8: Fix frequency reportingMark Langsdorf
With F10, model 10, all valid frequencies are in the ACPI _PST table. Cc: <stable@kernel.org> # 33.x 32.x Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> LKML-Reference: <1270065406-1814-6-git-send-email-bp@amd64.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Reviewed-by: Thomas Renninger <trenn@suse.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-09x86, cpufreq: Add APERF/MPERF support for AMD processorsMark Langsdorf
Starting with model 10 of Family 0x10, AMD processors may have support for APERF/MPERF. Add support for identifying it and using it within cpufreq. Move the APERF/MPERF functions out of the acpi-cpufreq code and into their own file so they can easily be shared. Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> LKML-Reference: <20100401141956.GA1930@aftab> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Reviewed-by: Thomas Renninger <trenn@suse.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-09x86: Unify APERF/MPERF supportBorislav Petkov
Initialize this CPUID flag feature in common code. It could be made a standalone function later, maybe, if more functionality is duplicated. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1270065406-1814-4-git-send-email-bp@amd64.org> Reviewed-by: Thomas Renninger <trenn@suse.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-09powernow-k8: Add core performance boost supportBorislav Petkov
Starting with F10h, revE, AMD processors add support for a dynamic core boosting feature called Core Performance Boost. When a specific condition is present, a subset of the cores on a system are boosted beyond their P0 operating frequency to speed up the performance of single-threaded applications. In the normal case, the system comes out of reset with core boosting enabled. This patch adds a sysfs knob with which core boosting can be switched on or off for benchmarking purposes. While at it, make the CPB code hotplug-aware so that taking cores offline wouldn't interfere with boosting the remaining online cores. Furthermore, add cpu_online_mask hotplug protection as suggested by Andrew. Finally, cleanup the driver init codepath and update copyrights. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1270065406-1814-3-git-send-email-bp@amd64.org> Reviewed-by: Thomas Renninger <trenn@suse.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-09x86, cpu: Add AMD core boosting feature flag to /proc/cpuinfoBorislav Petkov
By semi-popular demand, this adds the Core Performance Boost feature flag to /proc/cpuinfo. Possible use case for this is userspace tools like cpufreq-aperf, for example, so that they don't have to jump through hoops of accessing "/dev/cpu/%d/cpuid" in order to check for CPB hw support, or call cpuid from userspace. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1270065406-1814-2-git-send-email-bp@amd64.org> Reviewed-by: Thomas Renninger <trenn@suse.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-08Merge branch 'linus' into perf/coreIngo Molnar
Semantic conflict: arch/x86/kernel/cpu/perf_event_intel_ds.c Merge reason: pick up latest fixes, fix the conflict Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-06perf, x86: Enable Nehalem-EX supportVince Weaver
According to Intel Software Devel Manual Volume 3B, the Nehalem-EX PMU is just like regular Nehalem (except for the uncore support, which is completely different). Signed-off-by: Vince Weaver <vweaver1@eecs.utk.edu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Lin Ming <ming.m.lin@intel.com> LKML-Reference: <alpine.DEB.2.00.1004060956580.1417@cl320.eecs.utk.edu> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-05Merge branch 'master' into export-slabhTejun Heo
2010-04-04perf: Drop the frame reliablity checkFrederic Weisbecker
It is useless now that we have a pure stack frame walker, as given addr are always reliable. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu>
2010-04-02perf, x86: Add Nehalem programming quirk to WestmerePeter Zijlstra
According to the Xeon-5600 errata the Westmere suffers the same PMU programming bug as the original Nehalem did. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-02perf, x86: Fix __initconst vs constPeter Zijlstra
All variables that have __initconst should also be const. Suggested-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-02perf, x86: Fix up the ANY flag stuffPeter Zijlstra
Stephane noticed that the ANY flag was in generic arch code, and Cyrill reported that it broke the P4 code. Solve this by merging x86_pmu::raw_event into x86_pmu::hw_config and provide intel_pmu and amd_pmu specific versions of this callback. The intel_pmu one deals with the ANY flag, the amd_pmu adds the few extra event bits AMD64 has. Reported-by: Stephane Eranian <eranian@google.com> Reported-by: Cyrill Gorcunov <gorcunov@gmail.com> Acked-by: Robert Richter <robert.richter@amd.com> Acked-by: Cyrill Gorcunov <gorcunov@gmail.com> Acked-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1269968113.5258.442.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-02perf, x86: implement ARCH_PERFMON_EVENTSEL bit masksRobert Richter
ARCH_PERFMON_EVENTSEL bit masks are often used in the kernel. This patch adds macros for the bit masks and removes local defines. The function intel_pmu_raw_event() becomes x86_pmu_raw_event() which is generic for x86 models and same also for p6. Duplicate code is removed. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100330092821.GH11907@erda.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-02perf, x86: Undo some some *_counter* -> *_event* renamesRobert Richter
The big rename: cdd6c48 perf: Do the big rename: Performance Counters -> Performance Events accidentally renamed some members of stucts that were named after registers in the spec. To avoid confusion this patch reverts some changes. The related specs are MSR descriptions in AMD's BKDGs and the ARCHITECTURAL PERFORMANCE MONITORING section in the Intel 64 and IA-32 Architectures Software Developer's Manuals. This patch does: $ sed -i -e 's:num_events:num_counters:g' \ arch/x86/include/asm/perf_event.h \ arch/x86/kernel/cpu/perf_event_amd.c \ arch/x86/kernel/cpu/perf_event.c \ arch/x86/kernel/cpu/perf_event_intel.c \ arch/x86/kernel/cpu/perf_event_p6.c \ arch/x86/kernel/cpu/perf_event_p4.c \ arch/x86/oprofile/op_model_ppro.c $ sed -i -e 's:event_bits:cntval_bits:g' -e 's:event_mask:cntval_mask:g' \ arch/x86/kernel/cpu/perf_event_amd.c \ arch/x86/kernel/cpu/perf_event.c \ arch/x86/kernel/cpu/perf_event_intel.c \ arch/x86/kernel/cpu/perf_event_p6.c \ arch/x86/kernel/cpu/perf_event_p4.c Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1269880612-25800-2-git-send-email-robert.richter@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-02Merge branch 'perf/urgent' into perf/coreIngo Molnar
Conflicts: arch/x86/kernel/cpu/perf_event.c Merge reason: Resolve the conflict, pick up fixes Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-02perf, x86: Fix callgraphs of 32-bit processes on 64-bit kernelsTorok Edwin
When profiling a 32-bit process on a 64-bit kernel, callgraph tracing stopped after the first function, because it has seen a garbage memory address (tried to interpret the frame pointer, and return address as a 64-bit pointer). Fix this by using a struct stack_frame with 32-bit pointers when the TIF_IA32 flag is set. Note that TIF_IA32 flag must be used, and not is_compat_task(), because the latter is only set when the 32-bit process is executing a syscall, which may not always be the case (when tracing page fault events for example). Signed-off-by: Török Edwin <edwintorok@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paul Mackerras <paulus@samba.org> Cc: x86@kernel.org Cc: linux-kernel@vger.kernel.org LKML-Reference: <1268820436-13145-1-git-send-email-edwintorok@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-02perf, x86: Fix AMD hotplug & constraint initializationPeter Zijlstra
Commit 3f6da39 ("perf: Rework and fix the arch CPU-hotplug hooks") moved the amd northbridge allocation from CPUS_ONLINE to CPUS_PREPARE_UP however amd_nb_id() doesn't work yet on prepare so it would simply bail basically reverting to a state where we do not properly track node wide constraints - causing weird perf results. Fix up the AMD NorthBridge initialization code by allocating from CPU_UP_PREPARE and installing it from CPU_STARTING once we have the proper nb_id. It also properly deals with the allocation failing. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> [ robustify using amd_has_nb() ] Signed-off-by: Stephane Eranian <eranian@google.com> LKML-Reference: <1269353485.5109.48.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-01perf: Use hot regs with software sched switch/migrate eventsFrederic Weisbecker
Scheduler's task migration events don't work because they always pass NULL regs perf_sw_event(). The event hence gets filtered in perf_swevent_add(). Scheduler's context switches events use task_pt_regs() to get the context when the event occured which is a wrong thing to do as this won't give us the place in the kernel where we went to sleep but the place where we left userspace. The result is even more wrong if we switch from a kernel thread. Use the hot regs snapshot for both events as they belong to the non-interrupt/exception based events family. Unlike page faults or so that provide the regs matching the exact origin of the event, we need to save the current context. This makes the task migration event working and fix the context switch callchains and origin ip. Example: perf record -a -e cs Before: 10.91% ksoftirqd/0 0 [k] 0000000000000000 | --- (nil) perf_callchain perf_prepare_sample __perf_event_overflow perf_swevent_overflow perf_swevent_add perf_swevent_ctx_event do_perf_sw_event __perf_sw_event perf_event_task_sched_out schedule run_ksoftirqd kthread kernel_thread_helper After: 23.77% hald-addon-stor [kernel.kallsyms] [k] schedule | --- schedule | |--60.00%-- schedule_timeout | wait_for_common | wait_for_completion | blk_execute_rq | scsi_execute | scsi_execute_req | sr_test_unit_ready | | | |--66.67%-- sr_media_change | | media_changed | | cdrom_media_changed | | sr_block_media_changed | | check_disk_change | | cdrom_open v2: Always build perf_arch_fetch_caller_regs() now that software events need that too. They don't need it from modules, unlike trace events, so we keep the EXPORT_SYMBOL in trace_event_perf.c Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-26perf, x86: Add Nehelem PMU programming errata workaroundPeter Zijlstra
Implement the workaround for Intel Errata AAK100 and AAP53. Also, remove the Core-i7 name for Nehalem events since there are also Westmere based i7 chips. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1269608924.12097.147.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-26x86, perf, bts, mm: Delete the never used BTS-ptrace codePeter Zijlstra
Support for the PMU's BTS features has been upstreamed in v2.6.32, but we still have the old and disabled ptrace-BTS, as Linus noticed it not so long ago. It's buggy: TIF_DEBUGCTLMSR is trampling all over that MSR without regard for other uses (perf) and doesn't provide the flexibility needed for perf either. Its users are ptrace-block-step and ptrace-bts, since ptrace-bts was never used and ptrace-block-step can be implemented using a much simpler approach. So axe all 3000 lines of it. That includes the *locked_memory*() APIs in mm/mlock.c as well. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Markus Metzger <markus.t.metzger@intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <20100325135413.938004390@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-26perf, x86: Clean up debugctlmsr bit definitionsPeter Zijlstra
Move all debugctlmsr thingies into msr-index.h Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100325135413.861425293@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-26x86, perf: Add raw events support for the P4 PMUCyrill Gorcunov
The adding of raw event support lead to complete code refactoring. I hope is became more readable then it was. The list of changes: 1) The 64bit config field is enough to hold all information we need to track event details. To achieve it we used *own* enum for events selection in ESCR register and map this key into proper value at moment of event enabling. For the same reason we use 12LSB bits in CCCR register -- to track which exactly cache trace event was requested. And we cear this bits at real 'write' moment. 2) There is no per-cpu area reserved for P4 PMU anymore. We don't need it. All is held by config. 3) Now we may use any available counter, ie we try to grab any possible counter. v2: - Lin Ming reported the lack of ESCR selector in CCCR for cache events v3: - Don't loose cache event codes at config unpacking procedure, we may need it one day so no obscure hack behind our back, better to clear reserved bits explicitly when needed (thanks Ming for pointing out) - Lin Ming fixed misplaced opcodes in cache events Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Tested-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <1269403766.3409.6.camel@minggr.sh.intel.com> [ v4: did a few whitespace fixlets ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-22Merge commit 'v2.6.34-rc2' into perf/coreIngo Molnar
Merge reason: Pick up latest perf fixes from upstream. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-22x86 / perf: Fix suspend to RAM on HP nx6325Rafael J. Wysocki
Commit 3f6da3905398826d85731247e7fbcf53400c18bd (perf: Rework and fix the arch CPU-hotplug hooks) broke suspend to RAM on my HP nx6325 (and most likely on other AMD-based boxes too) by allowing amd_pmu_cpu_offline() to be executed for CPUs that are going offline as part of the suspend process. The problem is that cpuhw->amd_nb may be NULL already, so the function should make sure it's not NULL before accessing the object pointed to by it. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-18Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (35 commits) perf: Fix unexported generic perf_arch_fetch_caller_regs perf record: Don't try to find buildids in a zero sized file perf: export perf_trace_regs and perf_arch_fetch_caller_regs perf, x86: Fix hw_perf_enable() event assignment perf, ppc: Fix compile error due to new cpu notifiers perf: Make the install relative to DESTDIR if specified kprobes: Calculate the index correctly when freeing the out-of-line execution slot perf tools: Fix sparse CPU numbering related bugs perf_event: Fix oops triggered by cpu offline/online perf: Drop the obsolete profile naming for trace events perf: Take a hot regs snapshot for trace events perf: Introduce new perf_fetch_caller_regs() for hot regs snapshot perf/x86-64: Use frame pointer to walk on irq and process stacks lockdep: Move lock events under lockdep recursion protection perf report: Print the map table just after samples for which no map was found perf report: Add multiple event support perf session: Change perf_session post processing functions to take histogram tree perf session: Add storage for seperating event types in report perf session: Change add_hist_entry to take the tree root instead of session perf record: Add ID and to recorded event data when recording multiple events ...
2010-03-18x86, perf: Fix few cosmetic dabs for P4 pmu (comments and constantify)Cyrill Gorcunov
- A few ESCR have escaped fixing at previous attempt. - p4_escr_map is read only, make it const. Nothing serious. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Lin Ming <ming.m.lin@intel.com> LKML-Reference: <20100318211256.GH5062@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-18perf_events: Fix resource leak in x86 __hw_perf_event_init()Stephane Eranian
If reserve_pmc_hardware() succeeds but reserve_ds_buffers() fails, then we need to release_pmc_hardware. It won't be done by the destroy() callback because we return before setting it in case of error. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: <stable@kernel.org> Cc: peterz@infradead.org Cc: paulus@samba.org Cc: davem@davemloft.net Cc: fweisbec@gmail.com Cc: robert.richter@amd.com Cc: perfmon2-devel@lists.sf.net LKML-Reference: <4ba1568b.15185e0a.182a.7802@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> -- arch/x86/kernel/cpu/perf_event.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
2010-03-18perf, x86: Add cache events for the Pentium-4 PMULin Ming
Move the HT bit setting code from p4_pmu_event_map to p4_hw_config. So the cache events can get HT bit set correctly. Tested on my P4 desktop, below 6 cache events work: L1-dcache-load-misses LLC-load-misses dTLB-load-misses dTLB-store-misses iTLB-loads iTLB-load-misses Signed-off-by: Lin Ming <ming.m.lin@intel.com> Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <1268908392.13901.128.camel@minggr.sh.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-18perf, x86: Add a key to simplify template lookup in Pentium-4 PMULin Ming
Currently, we use opcode(Event and Event-Selector) + emask to look up template in p4_templates. But cache events (L1-dcache-load-misses, LLC-load-misses, etc) use the same event(P4_REPLAY_EVENT) to do the counting, ie, they have the same opcode and emask. So we can not use current lookup mechanism to find the template for cache events. This patch introduces a "key", which is the index into p4_templates. The low 12 bits of CCCR are reserved, so we can hide the "key" in the low 12 bits of hwc->config. We extract the key from hwc->config and then quickly find the template. Signed-off-by: Lin Ming <ming.m.lin@intel.com> Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <1268908387.13901.127.camel@minggr.sh.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-18x86, perf: Use apic_write unconditionallyCyrill Gorcunov
Since apic_write() maps to a plain noop in the !CONFIG_X86_LOCAL_APIC case we're safe to remove this conditional compilation and clean up the code a bit. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: fweisbec@gmail.com Cc: acme@redhat.com Cc: eranian@google.com Cc: peterz@infradead.org LKML-Reference: <20100317104356.232371479@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-17perf/core, x86: Remove duplicate perf_event_mask variableRobert Richter
The same information is stored also in x86_pmu.intel_ctrl. This patch removes perf_event_mask and instead uses x86_pmu.intel_ctrl directly. Signed-off-by: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1268826553-19518-5-git-send-email-robert.richter@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>