summaryrefslogtreecommitdiffstats
path: root/kernel
AgeCommit message (Collapse)Author
2009-12-14locking: Convert raw_spinlock to arch_spinlockThomas Gleixner
The raw_spin* namespace was taken by lockdep for the architecture specific implementations. raw_spin_* would be the ideal name space for the spinlocks which are not converted to sleeping locks in preempt-rt. Linus suggested to convert the raw_ to arch_ locks and cleanup the name space instead of using an artifical name like core_spin, atomic_spin or whatever No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: linux-arch@vger.kernel.org
2009-12-14locking: Reorder functions in spinlock.cThomas Gleixner
Separate spin_lock and rw_lock functions. Preempt-RT needs to exclude the rw_lock functions from being compiled. The reordering allows to do that with a single #ifdef. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits) m68k: rename global variable vmalloc_end to m68k_vmalloc_end percpu: add missing per_cpu_ptr_to_phys() definition for UP percpu: Fix kdump failure if booted with percpu_alloc=page percpu: make misc percpu symbols unique percpu: make percpu symbols in ia64 unique percpu: make percpu symbols in powerpc unique percpu: make percpu symbols in x86 unique percpu: make percpu symbols in xen unique percpu: make percpu symbols in cpufreq unique percpu: make percpu symbols in oprofile unique percpu: make percpu symbols in tracer unique percpu: make percpu symbols under kernel/ and mm/ unique percpu: remove some sparse warnings percpu: make alloc_percpu() handle array types vmalloc: fix use of non-existent percpu variable in put_cpu_var() this_cpu: Use this_cpu_xx in trace_functions_graph.c this_cpu: Use this_cpu_xx for ftrace this_cpu: Use this_cpu_xx in nmi handling this_cpu: Use this_cpu operations in RCU this_cpu: Use this_cpu ops for VM statistics ... Fix up trivial (famous last words) global per-cpu naming conflicts in arch/x86/kvm/svm.c mm/slab.c
2009-12-14Merge branch 'tip/tracing/core' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/urgent
2009-12-14sched: Use rcu in sched_get_rr_param()Thomas Gleixner
read_lock(&tasklist_lock) does not protect sys_sched_get_rr_param() against a concurrent update of the policy or scheduler parameters as do_sched_scheduler() does not take the tasklist_lock. The access to task->sched_class->get_rr_interval is protected by task_rq_lock(task). Use rcu_read_lock() to protect find_task_by_vpid() and prevent the task struct from going away. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20091209100706.862897167@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-14sched: Use rcu in sched_get/set_affinity()Thomas Gleixner
tasklist_lock is held read locked to protect the find_task_by_vpid() call and to prevent the task going away. sched_setaffinity acquires a task struct ref and drops tasklist lock right away. The access to the cpus_allowed mask is protected by rq->lock. rcu_read_lock() provides the same protection here. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20091209100706.789059966@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-14sched: Use rcu in sys_sched_getscheduler/sys_sched_getparam()Thomas Gleixner
read_lock(&tasklist_lock) does not protect sys_sched_getscheduler and sys_sched_getparam() against a concurrent update of the policy or scheduler parameters as do_sched_setscheduler() does not take the tasklist_lock. The accessed integers can be retrieved w/o locking and are snapshots anyway. Using rcu_read_lock() to protect find_task_by_vpid() and prevent the task struct from going away is not changing the above situation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20091209100706.753790977@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-14Merge branch 'linus' into tracing/urgentIngo Molnar
Conflicts: kernel/trace/trace_kprobe.c Merge reason: resolve the conflict. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-13ksym_tracer: Fix bad castLi Zefan
Fix this warning: kernel/trace/trace_ksym.c: In function 'ksym_trace_filter_read': kernel/trace/trace_ksym.c:239: warning: cast to pointer from integer of different size Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: "K.Prasad" <prasad@linux.vnet.ibm.com> LKML-Reference: <4B1DC578.9020909@cn.fujitsu.com> [remove the strstrip fix as tglx already fixed that] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13tracing/power: Remove two exportsLi Zefan
trace_power_start and trace_power_end are used in arch/x86/kernel/power.c, and this file can't be compiled as a module, so these two tracepoints don't need to be exported. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <4B1DC55F.7060305@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13tracing: Change event->profile_count to be int typeLi Zefan
Like total_profile_count, struct ftrace_event_call::profile_count is protected by event_mutex, so it doesn't need to be atomic_t. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4B1DC549.5010705@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13tracing: Simplify trace_option_write()Li Zefan
- remove duplicate code inside trace_options_write() - extract duplicate code in trace_options_write() and set_tracer_option() Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <4B1DC532.9010802@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13tracing: Remove useless trace optionLi Zefan
Since commit 4d9493c90f8e6e1b164aede3814010a290161abb ("ftrace: remove add-hoc code"), option "sched-tree" has become useless. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <4B1DC50A.7040402@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13tracing: Use seq file for trace_clockLi Zefan
The buffer for the output is as small as 64 bytes, so it'll overflow if we add more clock type. Use seq file instead. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <4B1DC4FB.5030407@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13tracing: Use seq file for trace_optionsLi Zefan
Code simplification for reading trace_options. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-reference: <4B1DC4EF.3090106@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13function-graph: Allow writing the same val to set_graph_functionLi Zefan
# echo 'do_open' > set_graph_function # echo 'do_open' >> set_graph_function bash: echo: write error: Invalid argument Make it valid to write the same value to set_graph_function, which is consistent with set_ftrace_filter interface. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-reference: <4B1DC4E1.1060303@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13ftrace: Call trace_parser_clear() properlyLi Zefan
I found a weird behavior: # echo 'fuse:*' > set_ftrace_filter bash: echo: write error: Invalid argument # cat set_ftrace_filter fuse_dev_fasync fuse_dev_poll fuse_copy_do We should call trace_parser_clear() no matter ftrace_process_regex() returns 0 or -errno, otherwise we will actually take the unaccepted records from ftrace_regex_release(). Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <4B1DC4D2.3000406@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13ftrace: Return EINVAL when writing invalid val to set_ftrace_filterLi Zefan
Currently it doesn't warn user on invald value: # echo nonexist_symbol > set_ftrace_filter or: # echo 'nonexist_symbol:mod:fuse' > set_ftrace_filter Better make it return failure. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <4B1DC4BF.2070003@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13tracing: Move a printk out of ftrace_raw_reg_event_foo()Li Zefan
Move the printk from each ftrace_raw_reg_event_foo() to its caller ftrace_event_enable_disable(). This avoids each regfunc trace event callbacks to handle a same error report that can be carried from the caller. See how much space this saves: text data bss dec hex filename 5345151 1961864 7103260 14410275 dbe223 vmlinux.o.old 5331487 1961864 7103260 14396611 dbacc3 vmlinux.o Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Jason Baron <jbaron@redhat.com> LKML-Reference: <4B1DC4AC.802@cn.fujitsu.com> [start cmdline record before calling regfunc to avoid lost window of pid to comm resolution] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13tracing: Pull up calls to trace_define_common_fields()Li Zefan
Call trace_define_common_fields() in event_create_dir() only. This avoids trace events to handle it from their define_fields callbacks and shrinks the kernel code size: text data bss dec hex filename 5346802 1961864 7103260 14411926 dbe896 vmlinux.o.old 5345151 1961864 7103260 14410275 dbe223 vmlinux.o Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jason Baron <jbaron@redhat.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> LKML-Reference: <4B1DC49C.8000107@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13tracing: Extract duplicate ftrace_raw_init_event_foo()Li Zefan
Use a generic trace_event_raw_init() function for all event's raw_init callbacks (but kprobes) instead of defining the same version for each of these. This shrinks the kernel code: text data bss dec hex filename 5355293 1961928 7103260 14420481 dc0a01 vmlinux.o.old 5346802 1961864 7103260 14411926 dbe896 vmlinux.o raw_init can't be removed, because ftrace events and kprobe events use different raw_init callbacks. Though it's possible to totally remove raw_init, I choose to leave it as it is for now. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <4B1DC48C.7080603@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-12-13sched: Use pr_fmt() and pr_<level>()Joe Perches
- Convert printk(KERN_<level> to pr_<level> (not KERN_DEBUG) - Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt - Coalesce long format strings - Add missing \n to "ERROR: !SD_LOAD_BALANCE domain has parent" Signed-off-by: Joe Perches <joe@perches.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1260655047.2637.7.camel@Joe-Laptop.home> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-13sched: Make wakeup side and atomic variants of completion API irq safeRafael J. Wysocki
Alan Stern noticed that all the wakeup side (and atomic) variants of the completion APIs should be irq safe, but the newly introduced completion_done() and try_wait_for_completion() aren't. The use of the irq unsafe variants in IRQ contexts can cause crashes/hangs. Fix the problem by making them use spin_lock_irqsave() and spin_lock_irqrestore(). Reported-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: pm list <linux-pm@lists.linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: David Chinner <david@fromorbit.com> Cc: Lachlan McIlroy <lachlan@sgi.com> LKML-Reference: <200912130007.30541.rjw@sisk.pl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-12Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (21 commits) sched: Remove forced2_migrations stats sched: Fix memory leak in two error corner cases sched: Fix build warning in get_update_sysctl_factor() sched: Update normalized values on user updates via proc sched: Make tunable scaling style configurable sched: Fix missing sched tunable recalculation on cpu add/remove sched: Fix task priority bug sched: cgroup: Implement different treatment for idle shares sched: Remove unnecessary RCU exclusion sched: Discard some old bits sched: Clean up check_preempt_wakeup() sched: Move update_curr() in check_preempt_wakeup() to avoid redundant call sched: Sanitize fork() handling sched: Clean up ttwu() rq locking sched: Remove rq->clock coupling from set_task_cpu() sched: Consolidate select_task_rq() callers sched: Remove sysctl.sched_features sched: Protect sched_rr_get_param() access to task->sched_class sched: Protect task->cpus_allowed access in sched_getaffinity() sched: Fix balance vs hotplug race ... Fixed up conflicts in kernel/sysctl.c (due to sysctl cleanup)
2009-12-12kbuild: move utsrelease.h to include/generatedSam Ravnborg
Fix up all users of utsrelease.h Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Michal Marek <mmarek@suse.cz>
2009-12-12kbuild: move bounds.h to include/generatedSam Ravnborg
Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Michal Marek <mmarek@suse.cz>
2009-12-11Merge branch 'timers-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: itimer: Fix the itimer trace print format hrtimer: move timer stats helper functions to hrtimer.c hrtimer: Tune hrtimer_interrupt hang logic
2009-12-11Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: lockdep: Avoid out of bounds array reference in save_trace() futex: Take mmap_sem for get_user_pages in fault_in_user_writeable lockstat: Add usage info to Documentation/lockstat.txt lockstat: Fix min, max times in /proc/lock_stats
2009-12-11Merge branch 'tracing-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: Remove comparing of NULL to va_list in trace_array_vprintk() tracing: Fix function graph trace_pipe to properly display failed entries tracing: Add full state to trace_seq tracing: Buffer the output of seq_file in case of filled buffer tracing: Only call pipe_close if pipe_close is defined tracing: Add pipe_close interface
2009-12-11Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (57 commits) x86, perf events: Check if we have APIC enabled perf_event: Fix variable initialization in other codepaths perf kmem: Fix unused argument build warning perf symbols: perf_header__read_build_ids() offset'n'size should be u64 perf symbols: dsos__read_build_ids() should read both user and kernel buildids perf tools: Align long options which have no short forms perf kmem: Show usage if no option is specified sched: Mark sched_clock() as notrace perf sched: Add max delay time snapshot perf tools: Correct size given to memset perf_event: Fix perf_swevent_hrtimer() variable initialization perf sched: Fix for getting task's execution time tracing/kprobes: Fix field creation's bad error handling perf_event: Cleanup for cpu_clock_perf_event_update() perf_event: Allocate children's perf_event_ctxp at the right time perf_event: Clean up __perf_event_init_context() hw-breakpoints: Modify breakpoints without unregistering them perf probe: Update perf-probe document perf probe: Support --del option trace-kprobe: Support delete probe syntax ...
2009-12-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: (58 commits) tty: split the lock up a bit further tty: Move the leader test in disassociate tty: Push the bkl down a bit in the hangup code tty: Push the lock down further into the ldisc code tty: push the BKL down into the handlers a bit tty: moxa: split open lock tty: moxa: Kill the use of lock_kernel tty: moxa: Fix modem op locking tty: moxa: Kill off the throttle method tty: moxa: Locking clean up tty: moxa: rework the locking a bit tty: moxa: Use more tty_port ops tty: isicom: fix deadlock on shutdown tty: mxser: Use the new locking rules to fix setserial properly tty: mxser: use the tty_port_open method tty: isicom: sort out the board init logic tty: isicom: switch to the new tty_port_open helper tty: tty_port: Add a kref object to the tty port tty: istallion: tty port open/close methods tty: stallion: Convert to the tty_port_open/close methods ...
2009-12-11Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb: kgdb: Always process the whole breakpoint list on activate or deactivate kgdb: continue and warn on signal passing from gdb kgdb,x86: do not set kgdb_single_step on x86 kgdb: allow for cpu switch when single stepping kgdb,i386: Fix corner case access to ss with NMI watch dog exception kgdb: Replace strstr() by strchr() for single-character needles kgdbts: Read buffer overflow kgdb: Read buffer overflow kgdb,x86: remove redundant test
2009-12-11tty: Move the leader test in disassociateAlan Cox
There are two call points, both want to check that tty->signal->leader is set. Move the test into disassociate_ctty() as that will make locking changes easier in a bit Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-11Merge branch 'linux-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (109 commits) PCI: fix coding style issue in pci_save_state() PCI: add pci_request_acs PCI: fix BUG_ON triggered by logical PCIe root port removal PCI: remove ifdefed pci_cleanup_aer_correct_error_status PCI: unconditionally clear AER uncorr status register during cleanup x86/PCI: claim SR-IOV BARs in pcibios_allocate_resource PCI: portdrv: remove redundant definitions PCI: portdrv: remove unnecessary struct pcie_port_data PCI: portdrv: minor cleanup for pcie_port_device_register PCI: portdrv: add missing irq cleanup PCI: portdrv: enable device before irq initialization PCI: portdrv: cleanup service irqs initialization PCI: portdrv: check capabilities first PCI: portdrv: move PME capability check PCI: portdrv: remove redundant pcie type calculation PCI: portdrv: cleanup pcie_device registration PCI: portdrv: remove redundant pcie_port_device_probe PCI: Always set prefetchable base/limit upper32 registers PCI: read-modify-write the pcie device control register when initiating pcie flr PCI: show dma_mask bits in /sys ... Fixed up conflicts in: arch/x86/kernel/amd_iommu_init.c drivers/pci/dmar.c drivers/pci/hotplug/acpiphp_glue.c
2009-12-11tracing: Add stack trace to irqsoff tracerSteven Rostedt
The irqsoff and friends tracers help in finding causes of latency in the kernel. The also work with the function tracer to show what was happening when interrupts or preemption are disabled. But the function tracer has a bit of an overhead and can cause exagerated readings. Currently, when tracing with /proc/sys/kernel/ftrace_enabled = 0, where the function tracer is disabled, the information that is provided can end up being useless. For example, a 2 and a half millisecond latency only showed: # tracer: preemptirqsoff # # preemptirqsoff latency trace v1.1.5 on 2.6.32 # -------------------------------------------------------------------- # latency: 2463 us, #4/4, CPU#2 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) # ----------------- # | task: -4242 (uid:0 nice:0 policy:0 rt_prio:0) # ----------------- # => started at: _spin_lock_irqsave # => ended at: remove_wait_queue # # # _------=> CPU# # / _-----=> irqs-off # | / _----=> need-resched # || / _---=> hardirq/softirq # ||| / _--=> preempt-depth # |||| /_--=> lock-depth # |||||/ delay # cmd pid |||||| time | caller # \ / |||||| \ | / hackbenc-4242 2d.... 0us!: trace_hardirqs_off <-_spin_lock_irqsave hackbenc-4242 2...1. 2463us+: _spin_unlock_irqrestore <-remove_wait_queue hackbenc-4242 2...1. 2466us : trace_preempt_on <-remove_wait_queue The above lets us know that hackbench with pid 2463 grabbed a spin lock somewhere and enabled preemption at remove_wait_queue. This helps a little but where this actually happened is not informative. This patch adds the stack dump to the end of the irqsoff tracer. This provides the following output: hackbenc-4242 2d.... 0us!: trace_hardirqs_off <-_spin_lock_irqsave hackbenc-4242 2...1. 2463us+: _spin_unlock_irqrestore <-remove_wait_queue hackbenc-4242 2...1. 2466us : trace_preempt_on <-remove_wait_queue hackbenc-4242 2...1. 2467us : <stack trace> => sub_preempt_count => _spin_unlock_irqrestore => remove_wait_queue => free_poll_entry => poll_freewait => do_sys_poll => sys_poll => system_call_fastpath Now we see that the culprit of this latency was the free_poll_entry code. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-12-11tracing: Add trace_dump_stack()Steven Rostedt
I've been asked a few times about how to find out what is calling some location in the kernel. One way is to use dynamic function tracing and implement the func_stack_trace. But this only finds out who is calling a particular function. It does not tell you who is calling that function and entering a specific if conditional. I have myself implemented a quick version of trace_dump_stack() for this purpose a few times, and just needed it now. This is when I realized that this would be a good tool to have in the kernel like trace_printk(). Using trace_dump_stack() is similar to dump_stack() except that it writes to the trace buffer instead and can be used in critical locations. For example: @@ -5485,8 +5485,12 @@ need_resched_nonpreemptible: if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) { if (unlikely(signal_pending_state(prev->state, prev))) prev->state = TASK_RUNNING; - else + else { deactivate_task(rq, prev, 1); + trace_printk("Deactivating task %s:%d\n", + prev->comm, prev->pid); + trace_dump_stack(); + } switch_count = &prev->nvcsw; } Produces: <...>-3249 [001] 296.105269: schedule: Deactivating task ntpd:3249 <...>-3249 [001] 296.105270: <stack trace> => schedule => schedule_hrtimeout_range => poll_schedule_timeout => do_select => core_sys_select => sys_select => system_call_fastpath Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-12-11kgdb: Always process the whole breakpoint list on activate or deactivateJason Wessel
This patch fixes 2 edge cases in using kgdb in conjunction with gdb. 1) kgdb_deactivate_sw_breakpoints() should process the entire array of breakpoints. The failure to do so results in breakpoints that you cannot remove, because a break point can only be removed if its state flag is set to BP_SET. The easy way to duplicate this problem is to plant a break point in a kernel module and then unload the kernel module. 2) kgdb_activate_sw_breakpoints() should process the entire array of breakpoints. The failure to do so results in missed breakpoints when a breakpoint cannot be activated. Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2009-12-11kgdb: continue and warn on signal passing from gdbJason Wessel
On some architectures for the segv trap, gdb wants to pass the signal back on continue. For kgdb this is not the default behavior, because it can cause the kernel to crash if you arbitrarily pass back a exception outside of kgdb. Instead of causing instability, pass a message back to gdb about the supported kgdb signal passing and execute a standard kgdb continue operation. Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2009-12-11kgdb: allow for cpu switch when single steppingJason Wessel
The kgdb core should not assume that a single step operation of a kernel thread will complete on the same CPU. The single step flag is set at the "thread" level and it is possible in a multi cpu system that a kernel thread can get scheduled on another cpu the next time it is run. As a further safety net in case a slave cpu is hung, the debug master cpu will try 100 times before giving up and assuming control of the slave cpus is no longer possible. It is more useful to be able to get some information out of kgdb instead of spinning forever. Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2009-12-11kgdb: Read buffer overflowJason Wessel
Roel Kluin reported an error found with Parfait. Where we want to ensure that that kgdb_info[-1] never gets accessed. Also check to ensure any negative tid does not exceed the size of the shadow CPU array, else report critical debug context because it is an internal kgdb failure. Reported-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2009-12-11clockevents: Prevent clockevent_devices list corruption on cpu hotplugThomas Gleixner
Xiaotian Feng triggered a list corruption in the clock events list on CPU hotplug and debugged the root cause. If a CPU registers more than one per cpu clock event device, then only the active clock event device is removed on CPU_DEAD. The unused devices are kept in the clock events device list. On CPU up the clock event devices are registered again, which means that we list_add an already enqueued list_head. That results in list corruption. Resolve this by removing all devices which are associated to the dead CPU on CPU_DEAD. Reported-by: Xiaotian Feng <dfeng@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Xiaotian Feng <dfeng@redhat.com> Cc: stable@kernel.org
2009-12-10ring-buffer: Move resize integrity check under reader lockSteven Rostedt
While using an application that does splice on the ftrace ring buffer at start up, I triggered an integrity check failure. Looking into this, I discovered that resizing the buffer performs an integrity check after the buffer is resized. This check unfortunately is preformed after it releases the reader lock. If a reader is reading the buffer it may cause the integrity check to trigger a false failure. This patch simply moves the integrity checker under the protection of the ring buffer reader lock. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-12-10ring-buffer: Use sync sched protection on ring buffer resizingSteven Rostedt
There was a comment in the ring buffer code that says the calling layers should prevent tracing or reading of the ring buffer while resizing. I have discovered that the tracers do not honor this arrangement. This patch moves the disabling and synchronizing the ring buffer to a higher layer during resizing. This guarantees that no writes are occurring while the resize takes place. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-12-11tracing: Fix wrong usage of strstrip in trace_ksymsThomas Gleixner
strstrip returns a pointer to the first non space character, but the code in parse_ksym_trace_str() ignores that. strstrip is now must_check and therefor we get the correct warning: kernel/trace/trace_ksym.c:294: warning: ignoring return value of ‘strstrip’, declared with attribute warn_unused_result We are really not interested in leading whitespace here. Fix that and cleanup the dozen kfree() exit pathes. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org>
2009-12-10sys: Fix missing rcu protection for __task_cred() accessThomas Gleixner
commit c69e8d9 (CRED: Use RCU to access another task's creds and to release a task's own creds) added non rcu_read_lock() protected access to task creds of the target task in set_prio_one(). The comment above the function says: * - the caller must hold the RCU read lock The calling code in sys_setpriority does read_lock(&tasklist_lock) but not rcu_read_lock(). This works only when CONFIG_TREE_PREEMPT_RCU=n. With CONFIG_TREE_PREEMPT_RCU=y the rcu_callbacks can run in the tick interrupt when they see no read side critical section. There is another instance of __task_cred() in sys_setpriority() itself which is equally unprotected. Wrap the whole code section into a rcu read side critical section to fix this quick and dirty. Will be revisited in course of the read_lock(&tasklist_lock) -> rcu crusade. Oleg noted further: This also fixes another bug here. find_task_by_vpid() is not safe without rcu_read_lock(). I do not mean it is not safe to use the result, just find_pid_ns() by itself is not safe. Usually tasklist gives enough protection, but if copy_process() fails it calls free_pid() lockless and does call_rcu(delayed_put_pid(). This means, without rcu lock find_pid_ns() can't scan the hash table safely. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <20091210004703.029784964@linutronix.de> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2009-12-10signals: Fix more rcu assumptionsThomas Gleixner
1) Remove the misleading comment in __sigqueue_alloc() which claims that holding a spinlock is equivalent to rcu_read_lock(). 2) Add a rcu_read_lock/unlock around the __task_cred() access in __sigqueue_alloc() This needs to be revisited to remove the remaining users of read_lock(&tasklist_lock) but that's outside the scope of this patch. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <20091210004703.269843657@linutronix.de>
2009-12-10signal: Fix racy access to __task_cred in kill_pid_info_as_uid()Thomas Gleixner
kill_pid_info_as_uid() accesses __task_cred() without being in a RCU read side critical section. tasklist_lock is not protecting that when CONFIG_TREE_PREEMPT_RCU=y. Convert the whole tasklist_lock section to rcu and use lock_task_sighand to prevent the exit race. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <20091210004703.232302055@linutronix.de> Acked-by: Oleg Nesterov <oleg@redhat.com>
2009-12-10sched: Remove forced2_migrations statsIngo Molnar
This build warning: kernel/sched.c: In function 'set_task_cpu': kernel/sched.c:2070: warning: unused variable 'old_rq' Made me realize that the forced2_migrations stat looks pretty pointless (and a misnomer) - remove it. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-10Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: Add debugobjects support
2009-12-10perf_event: Fix variable initialization in other codepathsXiao Guangrong
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <4B20BAA6.7010609@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>