summaryrefslogtreecommitdiffstats
path: root/tools/perf/builtin-report.c
AgeCommit message (Collapse)Author
2012-03-19perf report: Add a simple GTK2-based 'perf report' browserPekka Enberg
This patch adds a simple GTK2-based browser to 'perf report' that's based on the TTY-based browser in builtin-report.c. To launch "perf report" using the new GTK interface just type: $ perf report --gtk The interface is somewhat limited in features at the moment: - No callgraph support - No KVM guest profiling support - No color coding for percentages - No sorting from the UI - ..and many, many more! That said, I think this patch a reasonable start to build future features on. Signed-off-by: Pekka Enberg <penberg@kernel.org> Cc: Colin Walters <walters@verbum.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1202231952410.6689@tux.localdomain [ committer note: Added #pragma to make gtk no strict prototype problem go away as suggested by Colin Walters modulo avoiding push/pop ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-16perf report: Treat an argument as a symbol filterNamhyung Kim
As Ingo requested, it'd be better off treating first (and the only) argument as a symbol filter, so that user doesn't need to input the symbol on the dialog window on TUI. Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1331887855-874-5-git-send-email-namhyung.kim@lge.com Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-16perf report: Add --symbol-filter optionNamhyung Kim
Add new --symbol-filter command line option to set appropriate filter string. Its short version is missing as I couldn't find an ideal one and --filter option of perf record also has no short version. Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1331887855-874-4-git-send-email-namhyung.kim@lge.com Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-03-09perf report: Enable TUI in branch view modeStephane Eranian
This patch updates perf report to support TUI mode when the perf.data file contains samples with branch stacks. For each row in the report, it is possible to annotate either the source or target of each branch. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: acme@redhat.com Cc: asharma@fb.com Cc: ravitillo@lbl.gov Cc: vweaver1@eecs.utk.edu Cc: khandual@linux.vnet.ibm.com Cc: dsahern@gmail.com Link: http://lkml.kernel.org/r/1331246868-19905-5-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-09perf report: Auto-detect branch stack sampling modeStephane Eranian
This patch enhances perf report to auto-detect when the perf.data file contains samples with branch stacks. That way it is not necessary to use the -b option. To force branch view mode to off, simply use --no-branch-stack. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: acme@redhat.com Cc: asharma@fb.com Cc: ravitillo@lbl.gov Cc: vweaver1@eecs.utk.edu Cc: khandual@linux.vnet.ibm.com Cc: dsahern@gmail.com Link: http://lkml.kernel.org/r/1331246868-19905-4-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-09perf report: Add support for taken branch samplingRoberto Agostino Vitillo
This patch adds support for taken branch sampling, i.e, the PERF_SAMPLE_BRANCH_STACK feature to perf report. In other words, to display histograms based on taken branches rather than executed instructions addresses. The new option is called -b and it takes no argument. To generate meaningful output, the perf.data must have been obtained using perf record -b xxx ... where xxx is a branch filter option. The output shows symbols, modules, sorted by 'who branches where' the most often. The percentages reported in the first column refer to the total number of branches captured and not the usual number of samples. Here is a quick example. Here branchy is simple test program which looks as follows: void f2(void) {} void f3(void) {} void f1(unsigned long n) { if (n & 1UL) f2(); else f3(); } int main(void) { unsigned long i; for (i=0; i < N; i++) f1(i); return 0; } Here is the output captured on Nehalem, if we are only interested in user level function calls. $ perf record -b any_call,u -e cycles:u branchy $ perf report -b --sort=symbol 52.34% [.] main [.] f1 24.04% [.] f1 [.] f3 23.60% [.] f1 [.] f2 0.01% [k] _IO_new_file_xsputn [k] _IO_file_overflow 0.01% [k] _IO_vfprintf_internal [k] _IO_new_file_xsputn 0.01% [k] _IO_vfprintf_internal [k] strchrnul 0.01% [k] __printf [k] _IO_vfprintf_internal 0.01% [k] main [k] __printf About half (52%) of the call branches captured are from main() -> f1(). The second half (24%+23%) is split in two equal shares between f1() -> f2(), f1() ->f3(). The output is as expected given the code. It should be noted, that using -b in perf record does not eliminate information in the perf.data file. Consequently, a typical profile can also be obtained by perf report by simply not using its -b option. It is possible to sort on branch related columns: - dso_from, symbol_from - dso_to, symbol_to - mispredict Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov> Signed-off-by: Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: acme@redhat.com Cc: robert.richter@amd.com Cc: ming.m.lin@intel.com Cc: andi@firstfloor.org Cc: asharma@fb.com Cc: vweaver1@eecs.utk.edu Cc: khandual@linux.vnet.ibm.com Cc: dsahern@gmail.com Link: http://lkml.kernel.org/r/1328826068-11713-14-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-23perf report: Accept fifos as input fileRobert Richter
The default input file for perf report is not handled the same way as perf record does it for its output file. This leads to unexpected behavior of perf report, etc. E.g.: # perf record -a -e cpu-cycles sleep 2 | perf report | cat failed to open perf.data: No such file or directory (try 'perf record' first) While perf record writes to a fifo, perf report expects perf.data to be read. This patch changes this to accept fifos as input file. Applies to the following commands: perf annotate perf buildid-list perf evlist perf kmem perf lock perf report perf sched perf script perf timechart Also fixes char const* -> const char* type declaration for filename strings. v2: * Prevent potential null pointer access to input_name in builtin-report.c. Needed due to removal of patch "perf report: Setup browser if stdout is a pipe" Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1323248577-11268-5-git-send-email-robert.richter@amd.com Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-12-22perf report: Fix usage stringNamhyung Kim
perf report does not take a command from command line. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1323703017-6060-8-git-send-email-namhyung@gmail.com Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-12-20perf report: Document '--call-graph' for optional print_limit argumentNamhyung Kim
The '--call-graph' command line option can receive undocumented optional print_limit argument. Besides, use strtoul() to parse the option since its type is u32. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1323703017-6060-2-git-send-email-namhyung@gmail.com Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-11-28perf tools: make -C consistent across commands (for cpu list arg)David Ahern
Currently the meaning of -C varies by perf command: for perf-top, perf-stat, perf-record it means cpu list. For perf-report it means comm list. Then perf-annotate, perf-report and perf-script use -c for cpu list. Fix annotate, report and script to use -C for cpu list to be consistent with top, stat and record. This means report needs to use -c for comm list which does introduce a backward compatibility change. v1 -> v2 - update perf-script.txt too Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1321209008-7004-1-git-send-email-dsahern@gmail.com Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-11-28perf tools: Rename perf_event_ops to perf_toolArnaldo Carvalho de Melo
To better reflect that it became the base class for all tools, that must be in each tool struct and where common stuff will be put. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-qgpc4msetqlwr8y2k7537cxe@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-11-28perf tools: Resolve machine earlier and pass it to perf_event_opsArnaldo Carvalho de Melo
Reducing the exposure of perf_session further, so that we can use the classes in cases where no perf.data file is created. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-stua66dcscsezzrcdugvbmvd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-11-28perf tools: Pass tool context in the the perf_event_ops functionsArnaldo Carvalho de Melo
So that we don't need to have that many globals. Next steps will remove the 'session' pointer, that in most cases is not needed. Then we can rename perf_event_ops to 'perf_tool' that better describes this class hierarchy. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-wp4djox7x6w1i2bab1pt4xxp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-11-28perf report: Group options in a structArnaldo Carvalho de Melo
Paving the way to remove these globals when we change the perf_event_ops to receive as a first parameter a pointer to a perf_event_ops that will then provide access to perf_report via container_of. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-2eh2vi2nb5z3tg1lvoxv09xu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-11-28perf session: Remove superfluous callchain_cursor memberArnaldo Carvalho de Melo
Since we have it in evsel->hists.callchain_cursor, remove it from perf_session. One more step in disentangling several places from requiring a perf_session pointer. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-rxr5dj3di7ckyfmnz0naku1z@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-11-28perf symbols: Add nr_events to symbol_confArnaldo Carvalho de Melo
Since symbol__alloc_hists need it, to avoid passing it around in many functions have it in the symbol_conf struct. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-cwv8ysvpywzjq4v3xtbd4zwv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-07perf tools: Make --no-asm-raw the defaultArnaldo Carvalho de Melo
And add the annotation output knobs to all the tools that have integrated annotation (top, report). Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-gnlob67mke6sji2kf4nstp7m@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-07perf tools: Make perf.data more self-descriptive (v8)Stephane Eranian
The goal of this patch is to include more information about the host environment into the perf.data so it is more self-descriptive. Overtime, profiles are captured on various machines and it becomes hard to track what was recorded, on what machine and when. This patch provides a way to solve this by extending the perf.data file with basic information about the host machine. To add those extensions, we leverage the feature bits capabilities of the perf.data format. The change is backward compatible with existing perf.data files. We define the following useful new extensions: - HEADER_HOSTNAME: the hostname - HEADER_OSRELEASE: the kernel release number - HEADER_ARCH: the hw architecture - HEADER_CPUDESC: generic CPU description - HEADER_NRCPUS: number of online/avail cpus - HEADER_CMDLINE: perf command line - HEADER_VERSION: perf version - HEADER_TOPOLOGY: cpu topology - HEADER_EVENT_DESC: full event description (attrs) - HEADER_CPUID: easy-to-parse low level CPU identication The small granularity for the entries is to make it easier to extend without breaking backward compatiblity. Many entries are provided as ASCII strings. Perf report/script have been modified to print the basic information as easy-to-parse ASCII strings. Extended information about CPU and NUMA topology may be requested with the -I option. Thanks to David Ahern for reviewing and testing the many versions of this patch. $ perf report --stdio # ======== # captured on : Mon Sep 26 15:22:14 2011 # hostname : quad # os release : 3.1.0-rc4-tip # perf version : 3.1.0-rc4 # arch : x86_64 # nrcpus online : 4 # nrcpus avail : 4 # cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz # cpuid : GenuineIntel,6,15,11 # total memory : 8105360 kB # cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31, # HEADER_CPU_TOPOLOGY info available, use -I to display # HEADER_NUMA_TOPOLOGY info available, use -I to display # ======== # ... $ perf report --stdio -I # ======== # captured on : Mon Sep 26 15:22:14 2011 # hostname : quad # os release : 3.1.0-rc4-tip # perf version : 3.1.0-rc4 # arch : x86_64 # nrcpus online : 4 # nrcpus avail : 4 # cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz # cpuid : GenuineIntel,6,15,11 # total memory : 8105360 kB # cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31, # sibling cores : 0-3 # sibling threads : 0 # sibling threads : 1 # sibling threads : 2 # sibling threads : 3 # node0 meminfo : total = 8320608 kB, free = 7571024 kB # node0 cpu list : 0-3 # ======== # ... Reviewed-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@amd.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/20110930134040.GA5575@quad Signed-off-by: Stephane Eranian <eranian@google.com> [ committer notes: Use --show-info in the tools as was in the docs, rename perf_header_fprintf_info to perf_file_section__fprintf_info, fixup conflict with f69b64f7 "perf: Support setting the disassembler style" ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-07perf browsers: Add live mode to the hists, annotate browsersArnaldo Carvalho de Melo
This allows passing a timer to be run periodically, which will update the hists tree that then gers refreshed on the screen, just like the Live mode (symbol entries, annotation) we already have in 'perf top --tui'. Will be used by the new hist_entry/hists based 'top' tool. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-2r44qd8oe4sagzcgoikl8qzc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-07perf report: Add option to show total periodArnaldo Carvalho de Melo
Just like --show-nr-samples, to help in diagnosing problems in the tools. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-1lr7ejdjfvy2uwy2wkmatcpq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-10-07perf hists: Allow limiting the number of rows and columns in fprintfArnaldo Carvalho de Melo
So that we can reuse hists__fprintf for in the new perf top tool. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-huazw48x05h8r9niz5cf63za@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29perf report: Fix stdio event name header printingArnaldo Carvalho de Melo
In the past we tried to avoid printing the name of the event when just one event was found in the perf.data file, after some refactorings it ended up not printing the event name if just one hist_entry was found in one of the events. Fix it by always printing the name of the event, even if just one is found. Reported-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-kikr0c7ou55bd9caok8569rf@git.kernel.org Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29perf: Support setting the disassembler styleAndi Kleen
Add -M option to report/annotate to pass directly to objdump. This allows to use -M intel for intel style disassembler syntax, which is useful for people who are very used to the Intel syntax. Link: http://lkml.kernel.org/r/1316122302-24306-2-git-send-email-andi@firstfloor.org [committer note: Add missing Documentation bits, fixup conflicts with 3e6a2a7] Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Stephane Eranian <eranian@google.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-03perf report: Use ui__warning in some more placesArnaldo Carvalho de Melo
So that we get a proper warning in the TUI in cases like: $ perf report --stdio -g fractal,0.5,caller --sort pid Selected -g but no callchain data. Did you call 'perf record' without -g? $ The --stdio case is ok because it uses fprintf, ui__warning is needed to figure out if --stdio or --tui is being used. Cc: Arun Sharma <asharma@fb.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sam Liao <phyomh@gmail.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ag9fz2wd17mbbfjsbznq1wms@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-07-05perf report/annotate/script: Add option to specify a CPU rangeAnton Blanchard
Add an option to perf report/annotate/script to specify which CPUs to operate on. This enables us to take a single system wide profile and analyse each CPU (or group of CPUs) in isolation. This was useful when profiling a multiprocess workload where the bottleneck was on one CPU but this was hidden in the overall profile. Per process and per thread breakdowns didn't help because multiple processes were running on each CPU and no single process consumed an entire CPU. The patch converts the list of CPUs returned by cpu_map__new into a bitmap for fast lookup. I wanted to use -C to be consistent with perf top/record/stat, but unfortunately perf report already uses -C <comms>. v2: Incorporate suggestions from David Ahern: - Added -c to perf script - Check that SAMPLE_CPU is set when -c is used - Update documentation v3: Create perf_session__cpu_bitmap() Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: David Ahern <dsahern@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Link: http://lkml.kernel.org/r/20110704215750.11647eb9@kryten Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-30perf tools: Only display parent field if explictly sortedFrederic Weisbecker
We don't need to display the parent field if the parent sorting machinery is only used for parent filtering (as in "-p foo"). However if parent filtering is used in combination with explicit parent sorting ( -s parent), we want to display it. Result with: perf report -p kernel_thread -s parent Before: # Overhead Parent symbol # ........ ............. # 0.07% | --- ioread8 ata_sff_check_status ata_sff_tf_load ata_sff_qc_issue ata_bmdma_qc_issue ata_qc_issue ata_scsi_translate ata_scsi_queuecmd scsi_dispatch_cmd scsi_request_fn __blk_run_queue __make_request generic_make_request submit_bio submit_bh journal_submit_commit_record jbd2_journal_commit_transaction kjournald2 kthread kernel_thread_helpe After: # Overhead Parent symbol # ........ ............. # 0.07% kernel_thread_helper | --- ioread8 ata_sff_check_status ata_sff_tf_load ata_sff_qc_issue ata_bmdma_qc_issue ata_qc_issue ata_scsi_translate ata_scsi_queuecmd scsi_dispatch_cmd scsi_request_fn __blk_run_queue __make_request generic_make_request submit_bio submit_bh journal_submit_commit_record jbd2_journal_commit_transaction kjournald2 kthread kernel_thread_helper Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: David Ahern <dsahern@gmail.com> Cc: Sam Liao <phyomh@gmail.com>
2011-06-30perf tools: Add inverted call graph report support.Sam Liao
Add "caller/callee" option to support inverted butterfly report, in the inverted report (with caller option), the call graph start from the callee's ancestor. Users can use such view to catch system's performance bottleneck from a sysprof like view. Using this option with specified sort order like pid gives us high level view of call graph statistics. Also add "-G" alias for inverted call graph. Signed-off-by: Sam Liao <phyomh@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2011-05-27perf tools: Make sure kptr_restrict warnings fit 80 col termsArnaldo Carvalho de Melo
Suggested-by: Ingo Molnar <mingo@elte.hu> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/n/tip-i1p8vrhq7xveyui6t1sc914e@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-26perf symbols: Handle /proc/sys/kernel/kptr_restrictArnaldo Carvalho de Melo
Perf uses /proc/modules to figure out where kernel modules are loaded. With the advent of kptr_restrict, non root users get zeroes for all module start addresses. So check if kptr_restrict is non zero and don't generate the syntethic PERF_RECORD_MMAP events for them. Warn the user about it in perf record and in perf report. In perf report the reference relocation symbol being zero means that kptr_restrict was set, thus /proc/kallsyms has only zeroed addresses, so don't use it to fixup symbol addresses when using a valid kallsyms (in the buildid cache) or vmlinux (in the vmlinux path) build-id located automatically or specified by the user. Provide an explanation about it in 'perf report' if kernel samples were taken, checking if a suitable vmlinux or kallsyms was found/specified. Restricted /proc/kallsyms don't go to the buildid cache anymore. Example: [acme@emilia ~]$ perf record -F 100000 sleep 1 WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted, check /proc/sys/kernel/kptr_restrict. Samples in kernel functions may not be resolved if a suitable vmlinux file is not found in the buildid cache or in the vmlinux path. Samples in kernel modules won't be resolved at all. If some relocation was applied (e.g. kexec) symbols may be misresolved even with a suitable vmlinux or kallsyms file. [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.005 MB perf.data (~231 samples) ] [acme@emilia ~]$ [acme@emilia ~]$ perf report --stdio Kernel address maps (/proc/{kallsyms,modules}) were restricted, check /proc/sys/kernel/kptr_restrict before running 'perf record'. If some relocation was applied (e.g. kexec) symbols may be misresolved. Samples in kernel modules can't be resolved as well. # Events: 13 cycles # # Overhead Command Shared Object Symbol # ........ ....... ................. ..................... # 20.24% sleep [kernel.kallsyms] [k] page_fault 20.04% sleep [kernel.kallsyms] [k] filemap_fault 19.78% sleep [kernel.kallsyms] [k] __lru_cache_add 19.69% sleep ld-2.12.so [.] memcpy 14.71% sleep [kernel.kallsyms] [k] dput 4.70% sleep [kernel.kallsyms] [k] flush_signal_handlers 0.73% sleep [kernel.kallsyms] [k] perf_event_comm 0.11% sleep [kernel.kallsyms] [k] native_write_msr_safe # # (For a higher level overview, try: perf report --sort comm,dso) # [acme@emilia ~]$ This is because it found a suitable vmlinux (build-id checked) in /lib/modules/2.6.39-rc7+/build/vmlinux (use -v in perf report to see the long file name). If we remove that file from the vmlinux path: [root@emilia ~]# mv /lib/modules/2.6.39-rc7+/build/vmlinux \ /lib/modules/2.6.39-rc7+/build/vmlinux.OFF [acme@emilia ~]$ perf report --stdio [kernel.kallsyms] with build id 57298cdbe0131f6871667ec0eaab4804dcf6f562 not found, continuing without symbols Kernel address maps (/proc/{kallsyms,modules}) were restricted, check /proc/sys/kernel/kptr_restrict before running 'perf record'. As no suitable kallsyms nor vmlinux was found, kernel samples can't be resolved. Samples in kernel modules can't be resolved as well. # Events: 13 cycles # # Overhead Command Shared Object Symbol # ........ ....... ................. ...... # 80.31% sleep [kernel.kallsyms] [k] 0xffffffff8103425a 19.69% sleep ld-2.12.so [.] memcpy # # (For a higher level overview, try: perf report --sort comm,dso) # [acme@emilia ~]$ Reported-by: Stephane Eranian <eranian@google.com> Suggested-by: David Miller <davem@davemloft.net> Cc: Dave Jones <davej@redhat.com> Cc: David Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Kees Cook <kees.cook@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/n/tip-mt512joaxxbhhp1odop04yit@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-23perf session: Pass evsel in event_ops->sample()Arnaldo Carvalho de Melo
Resolving the sample->id to an evsel since the most advanced tools, report and annotate, and the others will too when they evolve to properly support multi-event perf.data files. Good also because it does an extra validation, checking that the ID is valid when present. When that is not the case, the overhead is just a branch + function call (perf_evlist__id2evsel). Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-10perf session: Use evlist/evsel for managing perf.data attributesArnaldo Carvalho de Melo
So that we can reuse things like the id to attr lookup routine (perf_evlist__id2evsel) that uses a hash table instead of the linear lookup done in the older perf_header_attr routines, etc. Also to make evsels/evlist more pervasive an API, simplyfing using the emerging perf lib. cc: Arun Sharma <arun@sharma-home.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-06perf report tui: Improve multi event session supportArnaldo Carvalho de Melo
When multiple events were used in 'perf record', allow the user to choose which one is wanted before showing the per event histograms. Annotations will be performed on the chosen event. Allow going back and forth from event to event quickly using just the arrow keys and enter. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: William Cohen <wcohen@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-06perf tools: Improve support for sessions with multiple eventsArnaldo Carvalho de Melo
By creating an perf_evlist out of the attributes in the perf.data file header, so that we can use evlists and evsels when reading recorded sessions in addition to when we record sessions. More work is needed to allow tools to allow the user to select which events are wanted when browsing sessions, be it just one or a subset of them, aggregated or showed at the same time but with different indications on the UI to allow seeing workloads thru different views at the same time. But the overall goal/trend is to more uniformly use evsels and evlists. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-17perf report: Tell the user when a perf.data file has no samplesArnaldo Carvalho de Melo
[root@emilia ~]# perf report --stdio The perf.data file has no samples! [root@emilia ~]# The TUI shows a popup warning message with the same message. Reported-by: Ingo Molnar <mingo@elte.hu> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-11perf report: Fix initializion of annotate symbol priv areaArnaldo Carvalho de Melo
We only allocate it when in TUI mode. In --stdio mode unconditionally initializing this area leads to memory corruption. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-08perf annotate: Move locking to struct annotationArnaldo Carvalho de Melo
Since we'll need it when implementing the live annotate TUI browser. This also simplifies things a bit by having the list head for the source code to be in the dynamicly allocated part of struct annotation, that way we don't have to pass it around, it can be found from the struct symbol that is passed everywhere. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-05perf annotate: Support multiple histograms in annotationArnaldo Carvalho de Melo
The perf annotate tool continues aggregating everything on just one histograms, but to support the top model add support for one histogram perf evsel in the evlist. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-05perf annotate: Move annotate functions to util/Arnaldo Carvalho de Melo
They will be used by perf top, so that we have just one set of routines to do annotation. Rename "struct sym_priv" to "struct annotation", etc, to clarify this code a bit. Rename "struct sym_ext" to "struct source_line", to give it a meaningful name, that clarifies that it is a the result of an addr2line call, that is sorted by percentage one particular source code line appeared in the annotation. And since we're moving things around also rename 'sym_hist->ip' to 'sym_hist->addr' as we want to do data structure annotation at some point. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-31perf tools: Don't fallback to setup_pager unconditionallyArnaldo Carvalho de Melo
Because in tools like 'top' we don't want the pager. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-29perf tools: Kill event_t typedef, use 'union perf_event' insteadArnaldo Carvalho de Melo
And move the event_t methods to the perf_event__ too. No code changes, just namespace consistency. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-29perf tools: Rename 'struct sample_data' to 'struct perf_sample'Arnaldo Carvalho de Melo
Making the namespace more uniform. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22perf callchain: Rename register_callchain_param into callchain_register_paramFrederic Weisbecker
To make the callchain API naming more consistent. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294977121-5700-4-git-send-email-fweisbec@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22perf callchain: Feed callchains into a cursorFrederic Weisbecker
The callchains are fed with an array of a fixed size. As a result we iterate over each callchains three times: - 1st to resolve symbols - 2nd to filter out context boundaries - 3rd for the insertion into the tree This also involves some pairs of memory allocation/deallocation everytime we insert a callchain, for the filtered out array of addresses and for the array of symbols that comes along. Instead, feed the callchains through a linked list with persistent allocations. It brings several pros like: - Merge the 1st and 2nd iterations in one. That was possible before but in a way that would involve allocating an array slightly taller than necessary because we don't know in advance the number of context boundaries to filter out. - Much lesser allocations/deallocations. The linked list keeps persistent empty entries for the next usages and is extendable at will. - Makes it easier for multiple sources of callchains to feed a stacktrace together. This is deemed to pave the way for cfi based callchains wherein traditional frame pointer based kernel stacktraces will precede cfi based user ones, producing an overall callchain which size is hardly predictable. This requirement makes the static array obsolete and makes a linked list based iterator a much more flexible fit. Basic testing on a big perf file containing callchains (~ 176 MB) has shown a throughput gain of about 11% with perf report. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294977121-5700-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22perf tools: Fix 64 bit integer format stringsArnaldo Carvalho de Melo
Using %L[uxd] has issues in some architectures, like on ppc64. Fix it by making our 64 bit integers typedefs of stdint.h types and using PRI[ux]64 like, for instance, git does. Reported by Denis Kirjanov that provided a patch for one case, I went and changed all cases. Reported-by: Denis Kirjanov <dkirjanov@kernel.org> Tested-by: Denis Kirjanov <dkirjanov@kernel.org> LKML-Reference: <20110120093246.GA8031@hera.kernel.org> Cc: Denis Kirjanov <dkirjanov@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pingtian Han <phan@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21perf symbols: Add symfs option for off-box analysis using specified treeDavid Ahern
The symfs argument allows analysis of perf.data file using a locally accessible filesystem tree with debug symbols - e.g., tree created during image builds, sshfs mount, loop mounted KVM disk images, USB keys, initrds, etc. Anything with an OS tree can be analyzed from anywhere without the need to populate a local data store with build-ids. Commiter notes: o Fixed up symfs="/" variants handling. o prefixed DSO__ORIG_GUEST_KMODULE case with symfs too, avoiding use of files outside the symfs directory. LKML-Reference: <1291926427-28846-1-git-send-email-daahern@cisco.com> Signed-off-by: David Ahern <daahern@cisco.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21perf record,report,annotate,diff: Process events in orderIan Munsie
This patch changes perf report to ask for the ID info on all events be default if recording from multiple CPUs. Perf report, annotate and diff will now process the events in order if the kernel is able to provide timestamps on all events. This ensures that events such as COMM and MMAP which are necessary to correctly interpret samples are processed prior to those samples so that they are attributed correctly. Before: # perf record ./cachetest # perf report # Events: 6K cycles # # Overhead Command Shared Object Symbol # ........ ....... ................. ............................... # 74.11% :3259 [unknown] [k] 0x4a6c 1.50% cachetest ld-2.11.2.so [.] 0x1777c 1.46% :3259 [kernel.kallsyms] [k] .perf_event_mmap_ctx 1.25% :3259 [kernel.kallsyms] [k] restore 0.74% :3259 [kernel.kallsyms] [k] ._raw_spin_lock 0.71% :3259 [kernel.kallsyms] [k] .filemap_fault 0.66% :3259 [kernel.kallsyms] [k] .memset 0.54% cachetest [kernel.kallsyms] [k] .sha_transform 0.54% :3259 [kernel.kallsyms] [k] .copy_4K_page 0.54% :3259 [kernel.kallsyms] [k] .find_get_page 0.52% :3259 [kernel.kallsyms] [k] .trace_hardirqs_off 0.50% :3259 [kernel.kallsyms] [k] .__do_fault <SNIP> After: # perf report # Events: 6K cycles # # Overhead Command Shared Object Symbol # ........ ....... ................. ............................... # 44.28% cachetest cachetest [.] sumArrayNaive 22.53% cachetest cachetest [.] sumArrayOptimal 6.59% cachetest ld-2.11.2.so [.] 0x1777c 2.13% cachetest [unknown] [k] 0x340 1.46% cachetest [kernel.kallsyms] [k] .perf_event_mmap_ctx 1.25% cachetest [kernel.kallsyms] [k] restore 0.74% cachetest [kernel.kallsyms] [k] ._raw_spin_lock 0.71% cachetest [kernel.kallsyms] [k] .filemap_fault 0.66% cachetest [kernel.kallsyms] [k] .memset 0.54% cachetest [kernel.kallsyms] [k] .copy_4K_page 0.54% cachetest [kernel.kallsyms] [k] .find_get_page 0.54% cachetest [kernel.kallsyms] [k] .sha_transform 0.52% cachetest [kernel.kallsyms] [k] .trace_hardirqs_off 0.50% cachetest [kernel.kallsyms] [k] .__do_fault <SNIP> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <1291872833-839-1-git-send-email-imunsie@au1.ibm.com> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21perf session: Fallback to unordered processing if no sample_id_allIan Munsie
If we are running the new perf on an old kernel without support for sample_id_all, we should fall back to the old unordered processing of events. If we didn't than we would *always* process events without timestamps out of order, whether or not we hit a reordering race. In other words, instead of there being a chance of not attributing samples correctly, we would guarantee that samples would not be attributed. While processing all events without timestamps before events with timestamps may seem like an intuitive solution, it falls down as PERF_RECORD_EXIT events would also be processed before any samples. Even with a workaround for that case, samples before/after an exec would not be attributed correctly. This patch allows commands to indicate whether they need to fall back to unordered processing, so that commands that do not care about timestamps on every event will not be affected. If we do fallback, this will print out a warning if report -D was invoked. This patch adds the test in perf_session__new so that we only need to test once per session. Commands that do not use an event_ops (such as record and top) can simply pass NULL in it's place. Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <1291951882-sup-6069@au1.ibm.com> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-09perf report: Allow user to specify path to kallsyms fileDavid Ahern
This is useful for analyzing a perf data file on a different system than the one data was collected on and still include symbols from loaded kernel modules in the output. Commiter note: Updated the man page accordingly. LKML-Reference: <1291775986-16475-1-git-send-email-daahern@cisco.com> Signed-off-by: David Ahern <daahern@cisco.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-04perf session: Parse sample earlierArnaldo Carvalho de Melo
At perf_session__process_event, so that we reduce the number of lines in eache tool sample processing routine that now receives a sample_data pointer already parsed. This will also be useful in the next patch, where we'll allow sample the identity fields in MMAP, FORK, EXIT, etc, when it will be possible to see (cpu, timestamp) just after before every event. Also validate callchains in perf_session__process_event, i.e. as early as possible, and keep a counter of the number of events discarded due to invalid callchains, warning the user about it if it happens. There is an assumption that was kept that all events have the same sample_type, that will be dealt with in the future, when this preexisting limitation will be removed. Tested-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1291318772-30880-4-git-send-email-acme@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-08-22perf: Rename append_callchain into callchain_appendFrederic Weisbecker
Do that to start a consistant callchain API namespace. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Christoph Hellwig <hch@infradead.org>