asmadeus/linux.git - The linux kernel

Age	Commit message (Collapse)	Author
2010-03-09	MAINTAINERS: Add Arnaldo as tools/perf/ co-maintainer	Ingo Molnar
	Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Paul Mackerras <paulus@samba.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-04	Merge branch 'perf/urgent' of ↵	Ingo Molnar
	git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/urgent
2010-03-04	perf trace: Don't use pager if scripting	Tom Zanussi
	It's useful for paging through raw traces, but just gets in the way when scripting. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org LKML-Reference: <1267599873-8193-3-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-04	perf trace/scripting: Remove extraneous header read	Tom Zanussi
	perf_header__read() is already done in perf_session__open(), so remove it from the script gen case. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org LKML-Reference: <1267599873-8193-2-git-send-email-tzanussi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-04	perf, ARM: Modify kuser rmb() call to compile for Thumb-2	Will Deacon
	The Thumb-2 instruction set does not provide an encoding for sub pc, r0, #95 as present in the rmb() definition used by perf. This results in compilation failure when using a compiler targetting an instruction set other than ARM. This patch redefines rmb() for ARM by casting the address of the kuser helper to a function pointer, therefore getting the compiler to take care of making the call. Patch taken against tip/master. Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: Russell King - ARM Linux <linux@arm.linux.org.uk> Cc: Jamie Iles <jamie.iles@picochip.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1267616878-2154-1-git-send-email-will.deacon@arm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-04	Merge branch 'perf/core' into perf/urgent	Ingo Molnar
	Merge reason: Switch from pre-merge topical split to the post-merge urgent track Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-03	x86/stacktrace: Don't dereference bad frame pointers	Frederic Weisbecker
	Callers of a stacktrace might pass bad frame pointers. Those are usually checked for safety in stack walking helpers before any dereferencing, but this is not the case when we need to go through one more frame pointer that backlinks the irq stack to the previous one, as we don't have any reliable address boudaries to compare this frame pointer against. This raises crashes when we record callchains for ftrace events with perf because we don't use the right helpers to capture registers there. We get wrong frame pointers as we call task_pt_regs() even on kernel threads, which is a wrong thing as it gives us the initial state of any kernel threads freshly created. This is even not what we want for user tasks. What we want is a hot snapshot of registers when the ftrace event triggers, not the state before a task entered the kernel. This requires more thoughts to do it correctly though. So first put a guardian to ensure the given frame pointer can be dereferenced to avoid crashes. We'll think about how to fix the callers in a subsequent patch. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: 2.6.33.x <stable@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-03-02	perf archive: Don't try to collect files without a build-id	Arnaldo Carvalho de Melo
	To avoid these error: [root@doppio ~]# perf archive tar: .build-id/00/00000000000000000000000000000000000000: Cannot stat: No such file or directory tar: .build-id/00/00000000000000000000000000000000000000: Cannot stat: No such file or directory tar: .build-id/00/00000000000000000000000000000000000000: Cannot stat: No such file or directory tar: .build-id/00/00000000000000000000000000000000000000: Cannot stat: No such file or directory tar: Exiting with failure status due to previous errors [root@doppio ~]# More work is needed to support archiving symtabs for binaries without a build-id, perhaps creating a perf.data UUID + adding build-ids for the binaries copied into the cache and then have this perf.data session UUID be a directory with symlinks to the by now calculated build-id of the files inside it. Or just do an extra pass and insert the calculated build-ids in the perf.data header. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-02	perf_events, x86: Fixup fixed counter constraints	Peter Zijlstra
	Patch 1da53e0230 ("perf_events, x86: Improve x86 event scheduling") lost us one of the fixed purpose counters and then ed8777fc13 ("perf_events, x86: Fix event constraint masks") broke it even further. Widen the fixed event mask to event+umask and specify the full config for each of the 3 fixed purpose counters. Then let the init code fill out the placement for the GP regs based on the cpuid info. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-02	perf, x86: Restrict the ANY flag	Peter Zijlstra
	The ANY flag can show SMT data of another task (like 'top'), so we want to disable it when system-wide profiling is disabled. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-01	perf, x86: rename macro in ARCH_PERFMON_EVENTSEL_ENABLE	Robert Richter
	For consistency reasons this patch renames ARCH_PERFMON_EVENTSEL0_ENABLE to ARCH_PERFMON_EVENTSEL_ENABLE. The following is performed: $ sed -i -e s/ARCH_PERFMON_EVENTSEL0_ENABLE/ARCH_PERFMON_EVENTSEL_ENABLE/g \ arch/x86/include/asm/perf_event.h arch/x86/kernel/cpu/perf_event.c \ arch/x86/kernel/cpu/perf_event_p6.c \ arch/x86/kernel/cpu/perfctr-watchdog.c \ arch/x86/oprofile/op_model_amd.c arch/x86/oprofile/op_model_ppro.c Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-03-01	perf, x86: add some IBS macros to perf_event.h	Robert Richter
	Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-03-01	perf, x86: make IBS macros available in perf_event.h	Robert Richter
	This patch moves code from oprofile to perf_event.h to make it also available for usage by perf. Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-03-01	Merge remote branch 'tip/oprofile' into tip/perf/core	Robert Richter
	Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-02-28	hw-breakpoints: Remove stub unthrottle callback	Frederic Weisbecker
	We support event unthrottling in breakpoint events. It means that if we have more than sysctl_perf_event_sample_rate/HZ, perf will throttle, ignoring subsequent events until the next tick. So if ptrace exceeds this max rate, it will omit events, which breaks the ptrace determinism that is supposed to report every triggered breakpoints. This is likely to happen if we set sysctl_perf_event_sample_rate to 1. This patch removes support for unthrottling in breakpoint events to break throttling and restore ptrace determinism. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: 2.6.33.x <stable@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org>
2010-02-27	x86/hw-breakpoints: Remove the name field	Frederic Weisbecker
	Remove the name field from the arch_hw_breakpoint. We never deal with target symbols in the arch level, neither do we need to ever store it. It's a legacy for the previous version of the x86 breakpoint backend. Let's remove it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>
2010-02-27	perf: Remove pointless breakpoint union	Frederic Weisbecker
	Remove pointless union in the breakpoint field of hw_perf_event. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org>
2010-02-27	perf lock: Drop the buffers multiplexing dependency	Frederic Weisbecker
	We need to deal with time ordered events to build a correct state machine of lock events. This is why we multiplex the lock events buffers. But the ordering is done from the kernel, on the tracing fast path, leading to high contention between cpus. Without multiplexing, the events appears in a weak order. If we have four events, each split per cpu, perf record will read the events buffers in the following order: [ CPU0 ev0, CPU0 ev1, CPU0 ev3, CPU0 ev4, CPU1 ev0, CPU1 ev0....] To handle a post processing reordering, we could just read and sort the whole in memory, but it just doesn't scale with high amounts of events: lock events can fill huge amounts in few times. Basically we need to sort in memory and find a "grace period" point when we know that a given slice of previously sorted events can be committed for post-processing, so that we can unload the memory usage step by step and keep a scalable sorting list. There is no strong rules about how to define such "grace period". What does this patch is: We define a FLUSH_PERIOD value that defines a grace period in seconds. We want to have a slice of events covering 2 * FLUSH_PERIOD in our sorted list. If FLUSH_PERIOD is big enough, it ensures every events that occured in the first half of the timeslice have all been buffered and there are none remaining and there won't be further to put inside this first timeslice. Then once we reach the 2 * FLUSH_PERIOD timeslice, we flush the first half to be gentle with the memory (the second half can still get new events in the middle, so wait another period to flush it) FLUSH_PERIOD is defined to 5 seconds. Say the first event started on time t0. We can safely assume that at the time we are processing events of t0 + 10 seconds, ther won't be anymore events to read from perf.data that occured between t0 and t0 + 5 seconds. Hence we can safely flush the first half. To point out funky bugs, we have a guardian that checks a new event timestamp is not below the last event's timestamp flushed and that displays a warning in this case. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com>
2010-02-27	perf lock: Fix and add misc documentally things	Hitoshi Mitake
	I've forgot to add 'perf lock' line to command-list.txt, so users of perf could not find perf lock when they type 'perf'. Fixing command-list.txt requires document (tools/perf/Documentation/perf-lock.txt). But perf lock is too much "under construction" to write a stable document, so this is something like pseudo document for now. And I wrote description of perf lock at help section of CONFIG_LOCK_STAT, this will navigate users of lock trace events. Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> LKML-Reference: <1265267295-8388-1-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-02-27	percpu: Add __percpu sparse annotations to hw_breakpoint	Tejun Heo
	Add __percpu sparse annotations to hw_breakpoint. These annotations are to make sparse consider percpu variables to be in a different address space and warn if accessed without going through percpu accessors. This patch doesn't affect normal builds. In kernel/hw_breakpoint.c, per_cpu(nr_task_bp_pinned, cpu)'s will trigger spurious noderef related warnings from sparse. Changing it to &per_cpu(nr_task_bp_pinned[0], cpu) will work around the problem but deemed to ugly by the maintainer. Leave it alone until better solution can be found. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: K.Prasad <prasad@linux.vnet.ibm.com> LKML-Reference: <4B7B4B7A.9050902@kernel.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-02-27	Merge commit 'v2.6.33' into perf/core	Frederic Weisbecker
	Merge reason: __percpu annotations need the corresponding sparse address space definition upstream. Conflicts: tools/perf/util/probe-event.c (trivial)
2010-02-26	perf_event, amd: Fix spinlock initialization	Peter Zijlstra
	Avoid kernels from exploding on AMD machines when they have any lock debugging bits enabled. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-26	perf_event: Fix preempt warning in perf_clock()	Peter Zijlstra
	A recent commit introduced a preemption warning for perf_clock(), use raw_smp_processor_id() to avoid this, it really doesn't matter which cpu we use here. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1267198583.22519.684.camel@laptop> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-26	perf tools: Flush maps on COMM events	David S. Miller
	Even though we don't register the counters until the child is right about to exec(), we're still going to get at least a few events while the fork()'d child is still executing 'perf' and in particular we're going to get the MMAP events. We can't distinguish the ones in the newly executed process because the PID will be the same. One way to solve this would be to have a PERF_RECORD_EXEC event, and when this is seen 'perf' can flush it's map cache. We can't use PERF_RECORD_COMM since that's generated by other things, not just exec(). Actually, thinking about it some more, using PERF_RECORD_COMM might be a good enough approximation. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1267196914-16238-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-26	perf_events, x86: Split PMU definitions into separate files	Peter Zijlstra
	Split amd,p6,intel into separate files so that we can easily deal with CONFIG_CPU_SUP_* things, needed to make things build now that perf_event.c relies on symbols from amd.c Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-26	perf annotate: Handle samples not at objdump output addr boundaries	Arnaldo Carvalho de Melo
	Without this patch we get this for need_resched: [root@mica ~]# perf annotate need_resched ------------------------------------------------ Percent \| Source code & Disassembly of vmlinux ------------------------------------------------ : : : Disassembly of section .text: : : ffffffff810095ed <need_resched>: : return (state & TASK_INTERRUPTIBLE) \|\| __fatal_signal_pending(p); : } : : static inline int need_resched(void) : { 0.00 : ffffffff810095ed: 55 push %rbp : return unlikely(test_thread_flag(TIF_NEED_RESCHED)); 0.00 : ffffffff810095ee: be 03 00 00 00 mov $0x3,%esi : : static inline struct thread_info current_thread_info(void) : { : struct thread_info ti; : ti = (void )(percpu_read_stable(kernel_stack) + 0.00 : ffffffff810095f3: 65 48 8b 3c 25 48 b5 mov %gs:0xb548,%rdi 0.00 : ffffffff810095fa: 00 00 : return (state & TASK_INTERRUPTIBLE) \|\| __fatal_signal_pending(p); : } : : static inline int need_resched(void) : { 0.00 : ffffffff810095fc: 48 89 e5 mov %rsp,%rbp : return unlikely(test_thread_flag(TIF_NEED_RESCHED)); 0.00 : ffffffff810095ff: 48 81 ef d8 1f 00 00 sub $0x1fd8,%rdi 0.00 : ffffffff81009606: e8 9d ff ff ff callq ffffffff810095a8 <test_ti_thread_flag> : } 0.00 : ffffffff8100960b: c9 leaveq 0.00 : ffffffff8100960c: 85 c0 test %eax,%eax 0.00 : ffffffff8100960e: 0f 95 c0 setne %al 0.00 : ffffffff81009611: 0f b6 c0 movzbl %al,%eax : Disassembly of section .vsyscall_0: : Disassembly of section .vsyscall_fn: : Disassembly of section .vsyscall_1: : Disassembly of section .vsyscall_2: : Disassembly of section .init.text: : Disassembly of section .altinstr_replacement: : Disassembly of section .exit.text: [root@mica ~]# But from the 'perf report' result we know that there are hits for need_resched on a 4 way machine mostly doing nothing, so after adding code to show what is in each hist offset and collapsing IP hits for what happens between objdump lines we get, for the same perf.data file: [root@mica ~]# perf annotate -v need_resched ------------------------------------------------ Percent \| Source code & Disassembly of vmlinux ------------------------------------------------ : : : Disassembly of section .text: : : ffffffff810095ed <need_resched>: : return (state & TASK_INTERRUPTIBLE) \|\| __fatal_signal_pending(p); : } : : static inline int need_resched(void) : { 0.00 : ffffffff810095ed: 55 push %rbp : return unlikely(test_thread_flag(TIF_NEED_RESCHED)); 52.78 : ffffffff810095ee: be 03 00 00 00 mov $0x3,%esi : : static inline struct thread_info current_thread_info(void) : { : struct thread_info ti; : ti = (void )(percpu_read_stable(kernel_stack) + 0.00 : ffffffff810095f3: 65 48 8b 3c 25 48 b5 mov %gs:0xb548,%rdi 0.00 : ffffffff810095fa: 00 00 : return (state & TASK_INTERRUPTIBLE) \|\| __fatal_signal_pending(p); : } : : static inline int need_resched(void) : { 0.00 : ffffffff810095fc: 48 89 e5 mov %rsp,%rbp : return unlikely(test_thread_flag(TIF_NEED_RESCHED)); 9.72 : ffffffff810095ff: 48 81 ef d8 1f 00 00 sub $0x1fd8,%rdi 0.00 : ffffffff81009606: e8 9d ff ff ff callq ffffffff810095a8 <test_ti_thread_flag> : } 0.00 : ffffffff8100960b: c9 leaveq 0.00 : ffffffff8100960c: 85 c0 test %eax,%eax 37.50 : ffffffff8100960e: 0f 95 c0 setne %al 0.00 : ffffffff81009611: 0f b6 c0 movzbl %al,%eax : Disassembly of section .vsyscall_0: : Disassembly of section .vsyscall_fn: : Disassembly of section .vsyscall_1: : Disassembly of section .vsyscall_2: : Disassembly of section .init.text: : Disassembly of section .altinstr_replacement: : Disassembly of section .exit.text: [root@mica ~]# And now 'perf annotate -v', verbose mode, will show the hits per precise IP, so that one can make sense of the attribution to each objdumop line: [root@mica ~]# perf annotate -v need_resched Looking at the vmlinux_path (5 entries long) Using /lib/modules/2.6.33-rc8-tip-00784-g3471df5-dirty/build/vmlinux for symbols annotate_sym: filename=/lib/modules/2.6.33-rc8-tip-00784-g3471df5-dirty/build/vmlinux, sym=need_resched, start=0xffffffff810095ed, end=0xffffffff81009614 ------------------------------------------------ Percent \| Source code & Disassembly of vmlinux ------------------------------------------------ ffffffff810095f1: 152 ffffffff81009603: 28 ffffffff8100960f: 55 ffffffff81009610: 53 h->sum: 288 <SNIP same annotation> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Miller <davem@davemloft.net> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1267194194-15670-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-26	oprofile/x86: fix msr access to reserved counters	Robert Richter
	During switching virtual counters there is access to perfctr msrs. If the counter is not available this fails due to an invalid address. This patch fixes this. Cc: stable@kernel.org Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-02-26	oprofile/x86: use kzalloc() instead of kmalloc()	Robert Richter
	Cc: stable@kernel.org Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-02-26	oprofile/x86: fix perfctr nmi reservation for mulitplexing	Robert Richter
	Multiple virtual counters share one physical counter. The reservation of virtual counters fails due to duplicate allocation of the same counter. The counters are already reserved. Thus, virtual counter reservation may removed at all. This also makes the code easier. Cc: stable@kernel.org Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-02-26	oprofile/x86: add comment to counter-in-use warning	Naga Chumbalkar
	Currently, oprofile fails silently on platforms where a non-OS entity such as the system firmware "enables" and uses a performance counter. There is a warning in the code for this case. The warning indicates an already running counter. If oprofile doesn't collect data, then try using a different performance counter on your platform to monitor the desired event. Delete the counter from the desired event by editing the /usr/share/oprofile/<cpu_type>/<cpu>/events file. If the event cannot be monitored by any other counter, contact your hardware or BIOS vendor. Cc: Shashi Belur <shashi-kiran.belur@hp.com> Cc: Tony Jones <tonyj@suse.de> Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-02-26	oprofile/x86: warn user if a counter is already active	Robert Richter
	This patch generates a warning if a counter is already active. Implemented for AMD and P6 models. P4 is not supported. Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Cc: Shashi Belur <shashi-kiran.belur@hp.com> Cc: Tony Jones <tonyj@suse.de> Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-02-26	oprofile/x86: implement randomization for IBS periodic op counter	Robert Richter
	IBS selects an op (execution operation) for sampling by counting either cycles or dispatched ops. Better statistical samples can be produced by adding a software generated random offset to the periodic op counter value with each sample. This patch adds software randomization to the IBS periodic op counter. The lower 12 bits of the 20 bit counter are randomized. IbsOpCurCnt is initialized with a 12 bit random value. There is a work around if the hw can not write to IbsOpCurCnt. Then the lower 8 bits of the 16 bit IbsOpMaxCnt [15:0] value are randomized in the range of -128 to +127 by adding/subtracting an offset to the maximum count (IbsOpMaxCnt). The linear feedback shift register (LFSR) algorithm is used for pseudo-random number generation to have low impact to the memory system. Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-02-26	oprofile/x86: implement lsfr pseudo-random number generator for IBS	Suravee Suthikulpanit
	This patch implements a linear feedback shift register (LFSR) for pseudo-random number generation for IBS. For IBS measurements it would be good to minimize memory traffic in the interrupt handler since every access pollutes the data caches. Computing a maximal period LFSR just needs shifts and ORs. The LFSR method is good enough to randomize the ops at low overhead. 16 pseudo-random bits are enough for the implementation and it doesn't matter that the pattern repeats with a fairly short cycle. It only needs to break up (hard) periodic sampling behavior. The logic was designed by Paul Drongowski. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-02-26	oprofile/x86: implement IBS cpuid feature detection	Robert Richter
	This patch adds IBS feature detection using cpuid flags. An IBS capability mask is introduced to test for certain IBS features. The bit mask is the same as for IBS cpuid feature flags (Fn8000_001B_EAX), but bit 0 is used to indicate the existence of IBS. The patch also changes the handling of the IbsOpCntCtl bit (periodic op counter count control). The oprofilefs file for this feature (ibs_op/dispatched_ops) will be only exposed if the feature is available, also the default for the bit is set to count clock cycles. In general, the userland can detect the availability of a feature by checking for the corresponding file in oprofilefs. If it exists, the feature also exists. This may lead to a dynamic file layout depending on the cpu type with that the userland has to deal with. Current opcontrol is compatible. Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-02-26	oprofile/x86: remove node check in AMD IBS initialization	Robert Richter
	Standard AMD systems have the same number of nodes as there are northbridge devices. However, there may kernel configurations (especially for 32 bit) or system setups exist, where the node number is different or it can not be detected properly. Thus the check is not reliable and may fail though IBS setup was fine. For this reason it is better to remove the check. Cc: stable <stable@kernel.org> Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-02-26	oprofile/x86: remove OPROFILE_IBS config option	Robert Richter
	OProfile support for IBS is now for several versions in the kernel. The feature is stable now and the code can be activated permanently. As a side effect IBS now works also on nosmp configs. Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-02-26	oprofile: remove EXPERIMENTAL from the config option description	Robert Richter
	OProfile is already used for a long time and no longer experimental. Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-02-26	oprofile: remove tracing build dependency	Robert Richter
	The commit 1155de4 ring-buffer: Make it generally available already made ring-buffer available without the TRACING option enabled. This patch removes the TRACING dependency from oprofile. Fixes also oprofile configuration on ia64. The patch also applies to the 2.6.32-stable kernel. Reported-by: Tony Jones <tonyj@suse.de> Cc: stable@kernel.org Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-02-26	perf_events, x86: Remove superflous MSR writes	Peter Zijlstra
	We re-program the event control register every time we reset the count, this appears to be superflous, hence remove it. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arjan van de Ven <arjan@linux.intel.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-26	perf_events: Simplify code by removing cpu argument to hw_perf_group_sched_in()	Peter Zijlstra
	Since the cpu argument to hw_perf_group_sched_in() is always smp_processor_id(), simplify the code a little by removing this argument and using the current cpu where needed. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: David Miller <davem@davemloft.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1265890918.5396.3.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-26	perf_events, x86: AMD event scheduling	Stephane Eranian
	This patch adds correct AMD NorthBridge event scheduling. NB events are events measuring L3 cache, Hypertransport traffic. They are identified by an event code >= 0xe0. They measure events on the Northbride which is shared by all cores on a package. NB events are counted on a shared set of counters. When a NB event is programmed in a counter, the data actually comes from a shared counter. Thus, access to those counters needs to be synchronized. We implement the synchronization such that no two cores can be measuring NB events using the same counters. Thus, we maintain a per-NB allocation table. The available slot is propagated using the event_constraint structure. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4b703957.0702d00a.6bf2.7b7d@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-26	perf_events: Add new start/stop PMU callbacks	Stephane Eranian
	In certain situations, the kernel may need to stop and start the same event rapidly. The current PMU callbacks do not distinguish between stop and release (i.e., stop + free the resource). Thus, a counter may be released, then it will be immediately re-acquired. Event scheduling will again take place with no guarantee to assign the same counter. On some processors, this may event yield to failure to assign the event back due to competion between cores. This patch is adding a new pair of callback to stop and restart a counter without actually release the underlying counter resource. On stop, the counter is stopped, its values saved and that's it. On start, the value is reloaded and counter is restarted (on x86, actual restart is delayed until perf_enable()). Signed-off-by: Stephane Eranian <eranian@google.com> [ added fallback to ->enable/->disable for all other PMUs fixed x86_pmu_start() to call x86_pmu.enable() merged __x86_pmu_disable into x86_pmu_stop() ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4b703875.0a04d00a.7896.ffffb824@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-26	perf_events: Report the MMAP pgoff value in bytes	Peter Zijlstra
	DaveM reported that currently perf interprets the pgoff value reported by the MMAP events as a byte range, but the kernel reports it as a page offset. Since its broken (and unusable) anyway, change the kernel behaviour (ABI) to report bytes indeed, avoiding the need for userspace to deal with PAGE_SIZE things. Reported-by: David Miller <davem@davemloft.net> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-25	perf annotate: Defer allocating sym_priv->hist array	Arnaldo Carvalho de Melo
	Because symbol->end is not fixed up at symbol_filter time, only after all symbols for a DSO are loaded, and that, for asm symbols, may be bogus, causing segfaults when hits happen in these symbols. Reported-by: David Miller <davem@davemloft.net> Reported-by: Anton Blanchard <anton@samba.org> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: <stable@kernel.org> # for .33.x. Does not apply cleanly, needs backport. LKML-Reference: <20100225155740.GB8553@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-25	perf symbols: Improve debugging information about symtab origins	Arnaldo Carvalho de Melo
	Be more clear about DSO long names and tell from which file kernel symbols were obtained, all in --verbose mode: [root@mica ~]# perf report -v > /dev/null Looking at the vmlinux_path (5 entries long) Using /lib/modules/2.6.33-rc8-tip-00777-g0918527-dirty/build/vmlinux for symbols [root@mica ~]# mv /lib/modules/2.6.33-rc8-tip-00777-g0918527-dirty/build/vmlinux /tmp/dd [root@mica ~]# perf report -v > /dev/null Looking at the vmlinux_path (5 entries long) Using /proc/kallsyms for symbols [root@mica ~]# Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1266866139-6361-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-25	perf top: Use a macro instead of a constant variable	Arnaldo Carvalho de Melo
	To overcome a silly gcc warning: cc1: warnings being treated as errors builtin-top.c: In function ‘lookup_sym_source’: builtin-top.c:291: warning: not protecting local variables: variable length buffer make: * [builtin-top.o] Error 1 make: * Waiting for unfinished jobs.... That is emitted for this: const size_t pattern_len = BITS_PER_LONG / 4 + 2; char pattern[pattern_len + 1]; Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1266866062-6287-1-git-send-email-acme@infradead.org> [ -v2: macroify the naming style ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-25	perf symbols: Check the right return variable	Zhang, Yanmin
	In function dso__split_kallsyms(), curr_map saves the return value of map__new2. So check it instead of var map after the call returns. Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com> Acked-by: David S. Miller <davem@davemloft.net> Cc: <stable@kernel.org> # for .33.x Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1267066851.1726.9.camel@localhost> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-25	perf/scripts: Tag syscall_name helper as not yet available	Frederic Weisbecker
	syscall_name() helper, which resolves a syscall arch number to its name, is not yet available as we first need to implement event injection for it to work. Remove it from the documentation or tag its references as unavailable yet. Once it's implemented, we can just revert the current patch. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Keiichi KII <k-keiichi@bx.jp.nec.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-02-25	perf/scripts: Add perf-trace-python Documentation	Tom Zanussi
	Also small update to perf-trace-perl and perf-trace docs. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Keiichi KII <k-keiichi@bx.jp.nec.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1264580883-15324-13-git-send-email-tzanussi@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-02-25	perf/scripts: Remove unnecessary PyTuple resizes	Tom Zanussi
	If we know the size of a tuple in advance, there's no need to resize it - start out with the known size in the first place. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Keiichi KII <k-keiichi@bx.jp.nec.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1266822779.6426.4.camel@tropicana> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>