From 97e1c18e8d17bd87e1e383b2e9d9fc740332c8e2 Mon Sep 17 00:00:00 2001 From: Mathieu Desnoyers Date: Fri, 18 Jul 2008 12:16:16 -0400 Subject: tracing: Kernel Tracepoints Implementation of kernel tracepoints. Inspired from the Linux Kernel Markers. Allows complete typing verification by declaring both tracing statement inline functions and probe registration/unregistration static inline functions within the same macro "DEFINE_TRACE". No format string is required. See the tracepoint Documentation and Samples patches for usage examples. Taken from the documentation patch : "A tracepoint placed in code provides a hook to call a function (probe) that you can provide at runtime. A tracepoint can be "on" (a probe is connected to it) or "off" (no probe is attached). When a tracepoint is "off" it has no effect, except for adding a tiny time penalty (checking a condition for a branch) and space penalty (adding a few bytes for the function call at the end of the instrumented function and adds a data structure in a separate section). When a tracepoint is "on", the function you provide is called each time the tracepoint is executed, in the execution context of the caller. When the function provided ends its execution, it returns to the caller (continuing from the tracepoint site). You can put tracepoints at important locations in the code. They are lightweight hooks that can pass an arbitrary number of parameters, which prototypes are described in a tracepoint declaration placed in a header file." Addition and removal of tracepoints is synchronized by RCU using the scheduler (and preempt_disable) as guarantees to find a quiescent state (this is really RCU "classic"). The update side uses rcu_barrier_sched() with call_rcu_sched() and the read/execute side uses "preempt_disable()/preempt_enable()". We make sure the previous array containing probes, which has been scheduled for deletion by the rcu callback, is indeed freed before we proceed to the next update. It therefore limits the rate of modification of a single tracepoint to one update per RCU period. The objective here is to permit fast batch add/removal of probes on _different_ tracepoints. Changelog : - Use #name ":" #proto as string to identify the tracepoint in the tracepoint table. This will make sure not type mismatch happens due to connexion of a probe with the wrong type to a tracepoint declared with the same name in a different header. - Add tracepoint_entry_free_old. - Change __TO_TRACE to get rid of the 'i' iterator. Masami Hiramatsu : Tested on x86-64. Performance impact of a tracepoint : same as markers, except that it adds about 70 bytes of instructions in an unlikely branch of each instrumented function (the for loop, the stack setup and the function call). It currently adds a memory read, a test and a conditional branch at the instrumentation site (in the hot path). Immediate values will eventually change this into a load immediate, test and branch, which removes the memory read which will make the i-cache impact smaller (changing the memory read for a load immediate removes 3-4 bytes per site on x86_32 (depending on mov prefixes), or 7-8 bytes on x86_64, it also saves the d-cache hit). About the performance impact of tracepoints (which is comparable to markers), even without immediate values optimizations, tests done by Hideo Aoki on ia64 show no regression. His test case was using hackbench on a kernel where scheduler instrumentation (about 5 events in code scheduler code) was added. Quoting Hideo Aoki about Markers : I evaluated overhead of kernel marker using linux-2.6-sched-fixes git tree, which includes several markers for LTTng, using an ia64 server. While the immediate trace mark feature isn't implemented on ia64, there is no major performance regression. So, I think that we don't have any issues to propose merging marker point patches into Linus's tree from the viewpoint of performance impact. I prepared two kernels to evaluate. The first one was compiled without CONFIG_MARKERS. The second one was enabled CONFIG_MARKERS. I downloaded the original hackbench from the following URL: http://devresources.linux-foundation.org/craiger/hackbench/src/hackbench.c I ran hackbench 5 times in each condition and calculated the average and difference between the kernels. The parameter of hackbench: every 50 from 50 to 800 The number of CPUs of the server: 2, 4, and 8 Below is the results. As you can see, major performance regression wasn't found in any case. Even if number of processes increases, differences between marker-enabled kernel and marker- disabled kernel doesn't increase. Moreover, if number of CPUs increases, the differences doesn't increase either. Curiously, marker-enabled kernel is better than marker-disabled kernel in more than half cases, although I guess it comes from the difference of memory access pattern. * 2 CPUs Number of | without | with | diff | diff | processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] | -------------------------------------------------------------- 50 | 4.811 | 4.872 | +0.061 | +1.27 | 100 | 9.854 | 10.309 | +0.454 | +4.61 | 150 | 15.602 | 15.040 | -0.562 | -3.6 | 200 | 20.489 | 20.380 | -0.109 | -0.53 | 250 | 25.798 | 25.652 | -0.146 | -0.56 | 300 | 31.260 | 30.797 | -0.463 | -1.48 | 350 | 36.121 | 35.770 | -0.351 | -0.97 | 400 | 42.288 | 42.102 | -0.186 | -0.44 | 450 | 47.778 | 47.253 | -0.526 | -1.1 | 500 | 51.953 | 52.278 | +0.325 | +0.63 | 550 | 58.401 | 57.700 | -0.701 | -1.2 | 600 | 63.334 | 63.222 | -0.112 | -0.18 | 650 | 68.816 | 68.511 | -0.306 | -0.44 | 700 | 74.667 | 74.088 | -0.579 | -0.78 | 750 | 78.612 | 79.582 | +0.970 | +1.23 | 800 | 85.431 | 85.263 | -0.168 | -0.2 | -------------------------------------------------------------- * 4 CPUs Number of | without | with | diff | diff | processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] | -------------------------------------------------------------- 50 | 2.586 | 2.584 | -0.003 | -0.1 | 100 | 5.254 | 5.283 | +0.030 | +0.56 | 150 | 8.012 | 8.074 | +0.061 | +0.76 | 200 | 11.172 | 11.000 | -0.172 | -1.54 | 250 | 13.917 | 14.036 | +0.119 | +0.86 | 300 | 16.905 | 16.543 | -0.362 | -2.14 | 350 | 19.901 | 20.036 | +0.135 | +0.68 | 400 | 22.908 | 23.094 | +0.186 | +0.81 | 450 | 26.273 | 26.101 | -0.172 | -0.66 | 500 | 29.554 | 29.092 | -0.461 | -1.56 | 550 | 32.377 | 32.274 | -0.103 | -0.32 | 600 | 35.855 | 35.322 | -0.533 | -1.49 | 650 | 39.192 | 38.388 | -0.804 | -2.05 | 700 | 41.744 | 41.719 | -0.025 | -0.06 | 750 | 45.016 | 44.496 | -0.520 | -1.16 | 800 | 48.212 | 47.603 | -0.609 | -1.26 | -------------------------------------------------------------- * 8 CPUs Number of | without | with | diff | diff | processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] | -------------------------------------------------------------- 50 | 2.094 | 2.072 | -0.022 | -1.07 | 100 | 4.162 | 4.273 | +0.111 | +2.66 | 150 | 6.485 | 6.540 | +0.055 | +0.84 | 200 | 8.556 | 8.478 | -0.078 | -0.91 | 250 | 10.458 | 10.258 | -0.200 | -1.91 | 300 | 12.425 | 12.750 | +0.325 | +2.62 | 350 | 14.807 | 14.839 | +0.032 | +0.22 | 400 | 16.801 | 16.959 | +0.158 | +0.94 | 450 | 19.478 | 19.009 | -0.470 | -2.41 | 500 | 21.296 | 21.504 | +0.208 | +0.98 | 550 | 23.842 | 23.979 | +0.137 | +0.57 | 600 | 26.309 | 26.111 | -0.198 | -0.75 | 650 | 28.705 | 28.446 | -0.259 | -0.9 | 700 | 31.233 | 31.394 | +0.161 | +0.52 | 750 | 34.064 | 33.720 | -0.344 | -1.01 | 800 | 36.320 | 36.114 | -0.206 | -0.57 | -------------------------------------------------------------- Signed-off-by: Mathieu Desnoyers Acked-by: Masami Hiramatsu Acked-by: 'Peter Zijlstra' Signed-off-by: Ingo Molnar --- init/Kconfig | 7 +++++++ 1 file changed, 7 insertions(+) (limited to 'init') diff --git a/init/Kconfig b/init/Kconfig index c11da38837e..70082678a91 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -771,6 +771,13 @@ config PROFILING Say Y here to enable the extended profiling support mechanisms used by profilers such as OProfile. +config TRACEPOINTS + bool "Activate tracepoints" + default y + help + Place an empty function call at each tracepoint site. Can be + dynamically changed for a probe function. + config MARKERS bool "Activate markers" help -- cgit v1.2.3-70-g09d2 From fa340d9c050e78fb21a142b617304214ae5e0c2d Mon Sep 17 00:00:00 2001 From: Ingo Molnar Date: Wed, 23 Jul 2008 13:38:00 +0200 Subject: tracing: disable tracepoints by default while it's arguably low overhead, we dont enable new features by default. Signed-off-by: Ingo Molnar --- init/Kconfig | 1 - 1 file changed, 1 deletion(-) (limited to 'init') diff --git a/init/Kconfig b/init/Kconfig index 70082678a91..d5994490b0b 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -773,7 +773,6 @@ config PROFILING config TRACEPOINTS bool "Activate tracepoints" - default y help Place an empty function call at each tracepoint site. Can be dynamically changed for a probe function. -- cgit v1.2.3-70-g09d2 From 5f87f1121895dc09d2d1c1db5f14af6aa4ce3e94 Mon Sep 17 00:00:00 2001 From: Ingo Molnar Date: Wed, 23 Jul 2008 14:15:22 +0200 Subject: tracing: clean up tracepoints kconfig structure do not expose users to CONFIG_TRACEPOINTS - tracers can select it just fine. update ftrace to select CONFIG_TRACEPOINTS. Signed-off-by: Ingo Molnar --- init/Kconfig | 9 +++++---- kernel/trace/Kconfig | 1 + 2 files changed, 6 insertions(+), 4 deletions(-) (limited to 'init') diff --git a/init/Kconfig b/init/Kconfig index d5994490b0b..031344f954f 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -771,11 +771,12 @@ config PROFILING Say Y here to enable the extended profiling support mechanisms used by profilers such as OProfile. +# +# Place an empty function call at each tracepoint site. Can be +# dynamically changed for a probe function. +# config TRACEPOINTS - bool "Activate tracepoints" - help - Place an empty function call at each tracepoint site. Can be - dynamically changed for a probe function. + bool config MARKERS bool "Activate markers" diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig index 263e9e6bbd6..cae2637d5e6 100644 --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -14,6 +14,7 @@ config TRACING bool select DEBUG_FS select STACKTRACE + select TRACEPOINTS config FTRACE bool "Kernel Function Tracer" -- cgit v1.2.3-70-g09d2 From 68bf21aa15c85d2e9b623dcda2b1ed8893275fa1 Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Thu, 14 Aug 2008 15:45:08 -0400 Subject: ftrace: mcount call site on boot nops core This is the infrastructure to the converting the mcount call sites recorded by the __mcount_loc section into nops on boot. It also allows for using these sites to enable tracing as normal. When the __mcount_loc section is used, the "ftraced" kernel thread is disabled. This uses the current infrastructure to record the mcount call sites as well as convert them to nops. The mcount function is kept as a stub on boot up and not converted to the ftrace_record_ip function. We use the ftrace_record_ip to only record from the table. This patch does not handle modules. That comes with a later patch. Signed-off-by: Steven Rostedt Signed-off-by: Ingo Molnar --- include/asm-x86/ftrace.h | 10 ++++ include/linux/ftrace.h | 6 ++ init/main.c | 3 + kernel/trace/ftrace.c | 148 +++++++++++++++++++++++++++++++++-------------- 4 files changed, 124 insertions(+), 43 deletions(-) (limited to 'init') diff --git a/include/asm-x86/ftrace.h b/include/asm-x86/ftrace.h index be0e004ad14..1bb6f9bbe1a 100644 --- a/include/asm-x86/ftrace.h +++ b/include/asm-x86/ftrace.h @@ -7,6 +7,16 @@ #ifndef __ASSEMBLY__ extern void mcount(void); + +static inline unsigned long ftrace_call_adjust(unsigned long addr) +{ + /* + * call mcount is "e8 <4 byte offset>" + * The addr points to the 4 byte offset and the caller of this + * function wants the pointer to e8. Simply subtract one. + */ + return addr - 1; +} #endif #endif /* CONFIG_FTRACE */ diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index bb384068272..d4d6ab453b7 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -162,4 +162,10 @@ static inline void ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3) { } #endif +#ifdef CONFIG_FTRACE_MCOUNT_RECORD +extern void ftrace_init(void); +#else +static inline void ftrace_init(void) { } +#endif + #endif /* _LINUX_FTRACE_H */ diff --git a/init/main.c b/init/main.c index 3820323c4c8..ded1fae965a 100644 --- a/init/main.c +++ b/init/main.c @@ -60,6 +60,7 @@ #include #include #include +#include #include #include @@ -687,6 +688,8 @@ asmlinkage void __init start_kernel(void) acpi_early_init(); /* before LAPIC and SMP init */ + ftrace_init(); + /* Do the rest non-__init'ed, we're now alive */ rest_init(); } diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index f6e3af31b40..df96d5990c0 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -792,47 +792,7 @@ static int ftrace_update_code(void) return 1; } -static int ftraced(void *ignore) -{ - unsigned long usecs; - - while (!kthread_should_stop()) { - - set_current_state(TASK_INTERRUPTIBLE); - - /* check once a second */ - schedule_timeout(HZ); - - if (unlikely(ftrace_disabled)) - continue; - - mutex_lock(&ftrace_sysctl_lock); - mutex_lock(&ftraced_lock); - if (!ftraced_suspend && !ftraced_stop && - ftrace_update_code()) { - usecs = nsecs_to_usecs(ftrace_update_time); - if (ftrace_update_tot_cnt > 100000) { - ftrace_update_tot_cnt = 0; - pr_info("hm, dftrace overflow: %lu change%s" - " (%lu total) in %lu usec%s\n", - ftrace_update_cnt, - ftrace_update_cnt != 1 ? "s" : "", - ftrace_update_tot_cnt, - usecs, usecs != 1 ? "s" : ""); - ftrace_disabled = 1; - WARN_ON_ONCE(1); - } - } - mutex_unlock(&ftraced_lock); - mutex_unlock(&ftrace_sysctl_lock); - - ftrace_shutdown_replenish(); - } - __set_current_state(TASK_RUNNING); - return 0; -} - -static int __init ftrace_dyn_table_alloc(void) +static int __init ftrace_dyn_table_alloc(unsigned long num_to_init) { struct ftrace_page *pg; int cnt; @@ -859,7 +819,9 @@ static int __init ftrace_dyn_table_alloc(void) pg = ftrace_pages = ftrace_pages_start; - cnt = NR_TO_INIT / ENTRIES_PER_PAGE; + cnt = num_to_init / ENTRIES_PER_PAGE; + pr_info("ftrace: allocating %ld hash entries in %d pages\n", + num_to_init, cnt); for (i = 0; i < cnt; i++) { pg->next = (void *)get_zeroed_page(GFP_KERNEL); @@ -1556,6 +1518,104 @@ static __init int ftrace_init_debugfs(void) fs_initcall(ftrace_init_debugfs); +#ifdef CONFIG_FTRACE_MCOUNT_RECORD +static int ftrace_convert_nops(unsigned long *start, + unsigned long *end) +{ + unsigned long *p; + unsigned long addr; + unsigned long flags; + + p = start; + while (p < end) { + addr = ftrace_call_adjust(*p++); + ftrace_record_ip(addr); + ftrace_shutdown_replenish(); + } + + /* p is ignored */ + local_irq_save(flags); + __ftrace_update_code(p); + local_irq_restore(flags); + + return 0; +} + +extern unsigned long __start_mcount_loc[]; +extern unsigned long __stop_mcount_loc[]; + +void __init ftrace_init(void) +{ + unsigned long count, addr, flags; + int ret; + + /* Keep the ftrace pointer to the stub */ + addr = (unsigned long)ftrace_stub; + + local_irq_save(flags); + ftrace_dyn_arch_init(&addr); + local_irq_restore(flags); + + /* ftrace_dyn_arch_init places the return code in addr */ + if (addr) + goto failed; + + count = __stop_mcount_loc - __start_mcount_loc; + + ret = ftrace_dyn_table_alloc(count); + if (ret) + goto failed; + + last_ftrace_enabled = ftrace_enabled = 1; + + ret = ftrace_convert_nops(__start_mcount_loc, + __stop_mcount_loc); + + return; + failed: + ftrace_disabled = 1; +} +#else /* CONFIG_FTRACE_MCOUNT_RECORD */ +static int ftraced(void *ignore) +{ + unsigned long usecs; + + while (!kthread_should_stop()) { + + set_current_state(TASK_INTERRUPTIBLE); + + /* check once a second */ + schedule_timeout(HZ); + + if (unlikely(ftrace_disabled)) + continue; + + mutex_lock(&ftrace_sysctl_lock); + mutex_lock(&ftraced_lock); + if (!ftraced_suspend && !ftraced_stop && + ftrace_update_code()) { + usecs = nsecs_to_usecs(ftrace_update_time); + if (ftrace_update_tot_cnt > 100000) { + ftrace_update_tot_cnt = 0; + pr_info("hm, dftrace overflow: %lu change%s" + " (%lu total) in %lu usec%s\n", + ftrace_update_cnt, + ftrace_update_cnt != 1 ? "s" : "", + ftrace_update_tot_cnt, + usecs, usecs != 1 ? "s" : ""); + ftrace_disabled = 1; + WARN_ON_ONCE(1); + } + } + mutex_unlock(&ftraced_lock); + mutex_unlock(&ftrace_sysctl_lock); + + ftrace_shutdown_replenish(); + } + __set_current_state(TASK_RUNNING); + return 0; +} + static int __init ftrace_dynamic_init(void) { struct task_struct *p; @@ -1572,7 +1632,7 @@ static int __init ftrace_dynamic_init(void) goto failed; } - ret = ftrace_dyn_table_alloc(); + ret = ftrace_dyn_table_alloc(NR_TO_INIT); if (ret) goto failed; @@ -1593,6 +1653,8 @@ static int __init ftrace_dynamic_init(void) } core_initcall(ftrace_dynamic_init); +#endif /* CONFIG_FTRACE_MCOUNT_RECORD */ + #else # define ftrace_startup() do { } while (0) # define ftrace_shutdown() do { } while (0) -- cgit v1.2.3-70-g09d2 From aa5d9151f745b6ee6a236a1f109118034277eb92 Mon Sep 17 00:00:00 2001 From: Arjan van de Ven Date: Sat, 13 Sep 2008 09:36:06 -0700 Subject: tracing/fastboot: add a script to visualize the kernel boot process / time When optimizing the kernel boot time, it's very valuable to visualize what is going on at which time. In addition, with the fastboot asynchronous initcall level, it's very valuable to see which initcall gets run where and when. This patch adds a script to turn a dmesg into a SVG graph (that can be shown with tools such as InkScape, Gimp or Firefox) and a small change to the initcall code to print the PID of the thread calling the initcall (so that the script can work out the parallelism). Signed-off-by: Arjan van de Ven --- init/main.c | 3 +- scripts/bootgraph.pl | 138 +++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 140 insertions(+), 1 deletion(-) create mode 100644 scripts/bootgraph.pl (limited to 'init') diff --git a/init/main.c b/init/main.c index ded1fae965a..16abba05c82 100644 --- a/init/main.c +++ b/init/main.c @@ -711,7 +711,8 @@ int do_one_initcall(initcall_t fn) int result; if (initcall_debug) { - printk("calling %pF\n", fn); + printk("calling %pF", fn); + printk(" @ %i\n", task_pid_nr(current)); t0 = ktime_get(); } diff --git a/scripts/bootgraph.pl b/scripts/bootgraph.pl new file mode 100644 index 00000000000..d459b8bdef0 --- /dev/null +++ b/scripts/bootgraph.pl @@ -0,0 +1,138 @@ +#!/usr/bin/perl + +# Copyright 2008, Intel Corporation +# +# This file is part of the Linux kernel +# +# This program file is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by the +# Free Software Foundation; version 2 of the License. +# +# This program is distributed in the hope that it will be useful, but WITHOUT +# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or +# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +# for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program in a file named COPYING; if not, write to the +# Free Software Foundation, Inc., +# 51 Franklin Street, Fifth Floor, +# Boston, MA 02110-1301 USA +# +# Authors: +# Arjan van de Ven + + +# +# This script turns a dmesg output into a SVG graphic that shows which +# functions take how much time. You can view SVG graphics with various +# programs, including Inkscape, The Gimp and Firefox. +# +# +# For this script to work, the kernel needs to be compiled with the +# CONFIG_PRINTK_TIME configuration option enabled, and with +# "initcall_debug" passed on the kernel command line. +# +# usage: +# dmesg | perl scripts/bootgraph.pl > output.svg +# + +my @rows; +my %start, %end, %row; +my $done = 0; +my $rowcount = 0; +my $maxtime = 0; +my $count = 0; +while (<>) { + my $line = $_; + if ($line =~ /([0-9\.]+)\] calling ([a-zA-Z\_]+)\+/) { + my $func = $2; + if ($done == 0) { + $start{$func} = $1; + } + $row{$func} = 1; + if ($line =~ /\@ ([0-9]+)/) { + my $pid = $1; + if (!defined($rows[$pid])) { + $rowcount = $rowcount + 1; + $rows[$pid] = $rowcount; + } + $row{$func} = $rows[$pid]; + } + $count = $count + 1; + } + + if ($line =~ /([0-9\.]+)\] initcall ([a-zA-Z\_]+)\+.*returned/) { + if ($done == 0) { + $end{$2} = $1; + $maxtime = $1; + } + } + if ($line =~ /Write protecting the/) { + $done = 1; + } +} + +if ($count == 0) { + print "No data found in the dmesg. Make sure CONFIG_PRINTK_TIME is enabled and\n"; + print "that initcall_debug is passed on the kernel command line.\n\n"; + print "Usage: \n"; + print " dmesg | perl scripts/bootgraph.pl > output.svg\n\n"; + exit; +} + +print " \n"; +print "\n"; + +my @styles; + +$styles[0] = "fill:rgb(0,0,255);fill-opacity:0.5;stroke-width:1;stroke:rgb(0,0,0)"; +$styles[1] = "fill:rgb(0,255,0);fill-opacity:0.5;stroke-width:1;stroke:rgb(0,0,0)"; +$styles[2] = "fill:rgb(255,0,20);fill-opacity:0.5;stroke-width:1;stroke:rgb(0,0,0)"; +$styles[3] = "fill:rgb(255,255,20);fill-opacity:0.5;stroke-width:1;stroke:rgb(0,0,0)"; +$styles[4] = "fill:rgb(255,0,255);fill-opacity:0.5;stroke-width:1;stroke:rgb(0,0,0)"; +$styles[5] = "fill:rgb(0,255,255);fill-opacity:0.5;stroke-width:1;stroke:rgb(0,0,0)"; +$styles[6] = "fill:rgb(0,128,255);fill-opacity:0.5;stroke-width:1;stroke:rgb(0,0,0)"; +$styles[7] = "fill:rgb(0,255,128);fill-opacity:0.5;stroke-width:1;stroke:rgb(0,0,0)"; +$styles[8] = "fill:rgb(255,0,128);fill-opacity:0.5;stroke-width:1;stroke:rgb(0,0,0)"; +$styles[9] = "fill:rgb(255,255,128);fill-opacity:0.5;stroke-width:1;stroke:rgb(0,0,0)"; +$styles[10] = "fill:rgb(255,128,255);fill-opacity:0.5;stroke-width:1;stroke:rgb(0,0,0)"; +$styles[11] = "fill:rgb(128,255,255);fill-opacity:0.5;stroke-width:1;stroke:rgb(0,0,0)"; + +my $mult = 950.0 / $maxtime; +my $threshold = 0.0500 / $maxtime; +my $stylecounter = 0; +while (($key,$value) = each %start) { + my $duration = $end{$key} - $start{$key}; + + if ($duration >= $threshold) { + my $s, $s2, $e, $y; + $s = $value * $mult; + $s2 = $s + 6; + $e = $end{$key} * $mult; + $w = $e - $s; + + $y = $row{$key} * 150; + $y2 = $y + 4; + + $style = $styles[$stylecounter]; + $stylecounter = $stylecounter + 1; + if ($stylecounter > 11) { + $stylecounter = 0; + }; + + print "\n"; + print "$key\n"; + } +} + + +# print the time line on top +my $time = 0.0; +while ($time < $maxtime) { + my $s2 = $time * $mult; + print "$time\n"; + $time = $time + 0.1; +} + +print "\n"; -- cgit v1.2.3-70-g09d2 From 3bf77af6e1fef1124bf71d81f9f84885f0ee0dea Mon Sep 17 00:00:00 2001 From: Frédéric Weisbecker Date: Tue, 23 Sep 2008 11:38:18 +0100 Subject: tracing/ftrace: launch boot tracing after pre-smp initcalls Launch the boot tracing inside the initcall_debug area. Old printk have not been removed to keep the old way of initcall tracing for backward compatibility. [ mingo@elte.hu: resolved conflicts ] Signed-off-by: Frederic Weisbecker Signed-off-by: Ingo Molnar --- init/main.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) (limited to 'init') diff --git a/init/main.c b/init/main.c index 16abba05c82..1e39a1eab19 100644 --- a/init/main.c +++ b/init/main.c @@ -709,10 +709,12 @@ int do_one_initcall(initcall_t fn) ktime_t t0, t1, delta; char msgbuf[64]; int result; + struct boot_trace it; if (initcall_debug) { - printk("calling %pF", fn); - printk(" @ %i\n", task_pid_nr(current)); + it.caller = task_pid_nr(current); + it.func = fn; + printk("calling %pF @ %i\n", fn, it.caller); t0 = ktime_get(); } @@ -721,10 +723,11 @@ int do_one_initcall(initcall_t fn) if (initcall_debug) { t1 = ktime_get(); delta = ktime_sub(t1, t0); - - printk("initcall %pF returned %d after %Ld msecs\n", - fn, result, - (unsigned long long) delta.tv64 >> 20); + it.result = result; + it.duration = (unsigned long long) delta.tv64 >> 20; + printk("initcall %pF returned %d after %Ld msecs\n", fn, + result, it.duration); + trace_boot(&it); } msgbuf[0] = 0; @@ -859,6 +862,7 @@ static int __init kernel_init(void * unused) smp_prepare_cpus(setup_max_cpus); do_pre_smp_initcalls(); + start_boot_trace(); smp_init(); sched_init_smp(); -- cgit v1.2.3-70-g09d2 From cb5ab74204a6e2579d1119bf1348eb806526b12b Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Thu, 2 Oct 2008 12:59:20 +0200 Subject: tracing/fastboot: change the printing of boot tracer according to bootgraph.pl Change the boot tracer printing to make it parsable for the scripts/bootgraph.pl script. We have now to output two lines for each initcall, according to the printk in do_one_initcall() in init/main.c We need now the call's time and the return's time. Signed-off-by: Frederic Weisbecker Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 2 ++ init/main.c | 20 +++++++++----------- kernel/trace/trace_boot.c | 22 +++++++++++++++------- 3 files changed, 26 insertions(+), 18 deletions(-) (limited to 'init') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 91954eb6460..4455490d91b 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -216,6 +216,8 @@ struct boot_trace { initcall_t func; int result; unsigned long long duration; + ktime_t calltime; + ktime_t rettime; }; #ifdef CONFIG_BOOT_TRACER diff --git a/init/main.c b/init/main.c index 1e39a1eab19..61eb6615939 100644 --- a/init/main.c +++ b/init/main.c @@ -706,34 +706,32 @@ __setup("initcall_debug", initcall_debug_setup); int do_one_initcall(initcall_t fn) { int count = preempt_count(); - ktime_t t0, t1, delta; + ktime_t delta; char msgbuf[64]; - int result; struct boot_trace it; if (initcall_debug) { it.caller = task_pid_nr(current); it.func = fn; printk("calling %pF @ %i\n", fn, it.caller); - t0 = ktime_get(); + it.calltime = ktime_get(); } - result = fn(); + it.result = fn(); if (initcall_debug) { - t1 = ktime_get(); - delta = ktime_sub(t1, t0); - it.result = result; + it.rettime = ktime_get(); + delta = ktime_sub(it.rettime, it.calltime); it.duration = (unsigned long long) delta.tv64 >> 20; printk("initcall %pF returned %d after %Ld msecs\n", fn, - result, it.duration); + it.result, it.duration); trace_boot(&it); } msgbuf[0] = 0; - if (result && result != -ENODEV && initcall_debug) - sprintf(msgbuf, "error code %d ", result); + if (it.result && it.result != -ENODEV && initcall_debug) + sprintf(msgbuf, "error code %d ", it.result); if (preempt_count() != count) { strlcat(msgbuf, "preemption imbalance ", sizeof(msgbuf)); @@ -747,7 +745,7 @@ int do_one_initcall(initcall_t fn) printk("initcall %pF returned with %s\n", fn, msgbuf); } - return result; + return it.result; } diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c index f2dac6f1cf0..7c15f3e68ba 100644 --- a/kernel/trace/trace_boot.c +++ b/kernel/trace/trace_boot.c @@ -52,16 +52,24 @@ static enum print_line_t initcall_print_line(struct trace_iterator *iter) struct trace_boot *field = (struct trace_boot *)entry; struct boot_trace *it = &field->initcall; struct trace_seq *s = &iter->seq; + struct timespec calltime = ktime_to_timespec(it->calltime); + struct timespec rettime = ktime_to_timespec(it->rettime); if (entry->type == TRACE_BOOT) { - ret = trace_seq_printf(s, "%pF called from %i " - "returned %d after %lld msecs\n", - it->func, it->caller, it->result, - it->duration); - if (ret) - return TRACE_TYPE_HANDLED; - else + ret = trace_seq_printf(s, "[%5ld.%06ld] calling %pF @ %i\n", + calltime.tv_sec, + calltime.tv_nsec, + it->func, it->caller); + if (!ret) return TRACE_TYPE_PARTIAL_LINE; + ret = trace_seq_printf(s, "[%5ld.%06ld] initcall %pF " + "returned %d after %lld msecs\n", + rettime.tv_sec, + rettime.tv_nsec, + it->func, it->result, it->duration); + if (!ret) + return TRACE_TYPE_PARTIAL_LINE; + return TRACE_TYPE_HANDLED; } return TRACE_TYPE_UNHANDLED; } -- cgit v1.2.3-70-g09d2 From 5601020feb0c3010e9e3e0131e9697ac6a06777b Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Thu, 2 Oct 2008 13:26:05 +0200 Subject: tracing/fastboot: get the initcall name before it disappears After some initcall traces, some initcall names may be inconsistent. That's because these functions will disappear from the .init section and also their name from the symbols table. So we have to copy the name of the function in a buffer large enough during the trace appending. It is not costly for the ring_buffer because the number of initcall entries is commonly not really large. Signed-off-by: Frederic Weisbecker Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 7 ++++--- init/main.c | 3 +-- kernel/trace/trace_boot.c | 14 ++++++++++---- 3 files changed, 15 insertions(+), 9 deletions(-) (limited to 'init') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 4455490d91b..e672e51c40a 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -7,6 +7,7 @@ #include #include #include +#include extern int ftrace_enabled; extern int @@ -213,7 +214,7 @@ ftrace_init_module(unsigned long *start, unsigned long *end) { } struct boot_trace { pid_t caller; - initcall_t func; + char func[KSYM_NAME_LEN]; int result; unsigned long long duration; ktime_t calltime; @@ -221,10 +222,10 @@ struct boot_trace { }; #ifdef CONFIG_BOOT_TRACER -extern void trace_boot(struct boot_trace *it); +extern void trace_boot(struct boot_trace *it, initcall_t fn); extern void start_boot_trace(void); #else -static inline void trace_boot(struct boot_trace *it) { } +static inline void trace_boot(struct boot_trace *it, initcall_t fn) { } static inline void start_boot_trace(void) { } #endif diff --git a/init/main.c b/init/main.c index 61eb6615939..8e96a0ef17f 100644 --- a/init/main.c +++ b/init/main.c @@ -712,7 +712,6 @@ int do_one_initcall(initcall_t fn) if (initcall_debug) { it.caller = task_pid_nr(current); - it.func = fn; printk("calling %pF @ %i\n", fn, it.caller); it.calltime = ktime_get(); } @@ -725,7 +724,7 @@ int do_one_initcall(initcall_t fn) it.duration = (unsigned long long) delta.tv64 >> 20; printk("initcall %pF returned %d after %Ld msecs\n", fn, it.result, it.duration); - trace_boot(&it); + trace_boot(&it, fn); } msgbuf[0] = 0; diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c index 7c15f3e68ba..b9dc2c0093a 100644 --- a/kernel/trace/trace_boot.c +++ b/kernel/trace/trace_boot.c @@ -8,6 +8,7 @@ #include #include #include +#include #include "trace.h" @@ -56,17 +57,19 @@ static enum print_line_t initcall_print_line(struct trace_iterator *iter) struct timespec rettime = ktime_to_timespec(it->rettime); if (entry->type == TRACE_BOOT) { - ret = trace_seq_printf(s, "[%5ld.%06ld] calling %pF @ %i\n", + ret = trace_seq_printf(s, "[%5ld.%06ld] calling %s @ %i\n", calltime.tv_sec, calltime.tv_nsec, it->func, it->caller); if (!ret) return TRACE_TYPE_PARTIAL_LINE; - ret = trace_seq_printf(s, "[%5ld.%06ld] initcall %pF " + + ret = trace_seq_printf(s, "[%5ld.%06ld] initcall %s " "returned %d after %lld msecs\n", rettime.tv_sec, rettime.tv_nsec, it->func, it->result, it->duration); + if (!ret) return TRACE_TYPE_PARTIAL_LINE; return TRACE_TYPE_HANDLED; @@ -83,8 +86,7 @@ struct tracer boot_tracer __read_mostly = .print_line = initcall_print_line, }; - -void trace_boot(struct boot_trace *it) +void trace_boot(struct boot_trace *it, initcall_t fn) { struct ring_buffer_event *event; struct trace_boot *entry; @@ -95,6 +97,10 @@ void trace_boot(struct boot_trace *it) if (!trace_boot_enabled) return; + /* Get its name now since this function could + * disappear because it is in the .init section. + */ + sprint_symbol(it->func, (unsigned long)fn); preempt_disable(); data = tr->data[smp_processor_id()]; -- cgit v1.2.3-70-g09d2 From 097d036a2f25eecc42435c57e010aaf4a2eed2d9 Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Fri, 3 Oct 2008 15:39:21 +0200 Subject: tracing/fastboot: only trace non-module initcalls At this time, only built-in initcalls interest us. We can't really produce a relevant graph if we include the modules initcall too. I had good results after this patch (see svg in attachment). Signed-off-by: Frederic Weisbecker Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 2 ++ init/main.c | 1 + kernel/trace/trace_boot.c | 11 ++++++++--- 3 files changed, 11 insertions(+), 3 deletions(-) (limited to 'init') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index ed53265d1f6..5812dba4ee2 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -225,9 +225,11 @@ struct boot_trace { #ifdef CONFIG_BOOT_TRACER extern void trace_boot(struct boot_trace *it, initcall_t fn); extern void start_boot_trace(void); +extern void stop_boot_trace(void); #else static inline void trace_boot(struct boot_trace *it, initcall_t fn) { } static inline void start_boot_trace(void) { } +static inline void stop_boot_trace(void) { } #endif diff --git a/init/main.c b/init/main.c index 8e96a0ef17f..e7939de80f3 100644 --- a/init/main.c +++ b/init/main.c @@ -886,6 +886,7 @@ static int __init kernel_init(void * unused) * we're essentially up and running. Get rid of the * initmem segments and start the user-mode stuff.. */ + stop_boot_trace(); init_post(); return 0; } diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c index b9dc2c0093a..a7efe355965 100644 --- a/kernel/trace/trace_boot.c +++ b/kernel/trace/trace_boot.c @@ -22,11 +22,16 @@ void start_boot_trace(void) trace_boot_enabled = 1; } -void stop_boot_trace(struct trace_array *tr) +void stop_boot_trace(void) { trace_boot_enabled = 0; } +void reset_boot_trace(struct trace_array *tr) +{ + stop_boot_trace(); +} + static void boot_trace_init(struct trace_array *tr) { int cpu; @@ -43,7 +48,7 @@ static void boot_trace_ctrl_update(struct trace_array *tr) if (tr->ctrl) start_boot_trace(); else - stop_boot_trace(tr); + stop_boot_trace(); } static enum print_line_t initcall_print_line(struct trace_iterator *iter) @@ -81,7 +86,7 @@ struct tracer boot_tracer __read_mostly = { .name = "initcall", .init = boot_trace_init, - .reset = stop_boot_trace, + .reset = reset_boot_trace, .ctrl_update = boot_trace_ctrl_update, .print_line = initcall_print_line, }; -- cgit v1.2.3-70-g09d2 From ca538f6bbe583406f941f3041d40c41f9a13d1de Mon Sep 17 00:00:00 2001 From: Tim Bird Date: Thu, 9 Oct 2008 15:23:05 -0700 Subject: tracing/fastboot: add better resolution to initcall debug/tracing Change the time resolution for initcall_debug to microseconds, from milliseconds. This is handy to determine which initcalls you want to work on for faster booting. One one of my test machines, over 90% of the initcalls are less than a millisecond and (without this patch) these are all reported as 0 msecs. Working on the 900 us ones is more important than the 4 us ones. With 'quiet' on the kernel command line, this adds no significant overhead to kernel boot time. Signed-off-by: Tim Bird Signed-off-by: Andrew Morton Signed-off-by: Ingo Molnar --- include/linux/ftrace.h | 4 ++-- init/main.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) (limited to 'init') diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 5812dba4ee2..a3d46151be1 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -215,9 +215,9 @@ ftrace_init_module(unsigned long *start, unsigned long *end) { } struct boot_trace { pid_t caller; - char func[KSYM_NAME_LEN]; + char func[KSYM_NAME_LEN]; int result; - unsigned long long duration; + unsigned long long duration; /* usecs */ ktime_t calltime; ktime_t rettime; }; diff --git a/init/main.c b/init/main.c index e7939de80f3..b2e7ff4a534 100644 --- a/init/main.c +++ b/init/main.c @@ -721,8 +721,8 @@ int do_one_initcall(initcall_t fn) if (initcall_debug) { it.rettime = ktime_get(); delta = ktime_sub(it.rettime, it.calltime); - it.duration = (unsigned long long) delta.tv64 >> 20; - printk("initcall %pF returned %d after %Ld msecs\n", fn, + it.duration = (unsigned long long) delta.tv64 >> 10; + printk("initcall %pF returned %d after %Ld usecs\n", fn, it.result, it.duration); trace_boot(&it, fn); } -- cgit v1.2.3-70-g09d2