summaryrefslogtreecommitdiffstats
path: root/include/linux/syscalls.h
AgeCommit message (Collapse)Author
2009-12-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1815 commits) mac80211: fix reorder buffer release iwmc3200wifi: Enable wimax core through module parameter iwmc3200wifi: Add wifi-wimax coexistence mode as a module parameter iwmc3200wifi: Coex table command does not expect a response iwmc3200wifi: Update wiwi priority table iwlwifi: driver version track kernel version iwlwifi: indicate uCode type when fail dump error/event log iwl3945: remove duplicated event logging code b43: fix two warnings ipw2100: fix rebooting hang with driver loaded cfg80211: indent regulatory messages with spaces iwmc3200wifi: fix NULL pointer dereference in pmkid update mac80211: Fix TX status reporting for injected data frames ath9k: enable 2GHz band only if the device supports it airo: Fix integer overflow warning rt2x00: Fix padding bug on L2PAD devices. WE: Fix set events not propagated b43legacy: avoid PPC fault during resume b43: avoid PPC fault during resume tcp: fix a timewait refcnt race ... Fix up conflicts due to sysctl cleanups (dead sysctl_check code and CTL_UNNUMBERED removed) in kernel/sysctl_check.c net/ipv4/sysctl_net_ipv4.c net/ipv6/addrconf.c net/sctp/sysctl.c
2009-12-02tracing/syscalls: Make syscall events print callbacks staticFrederic Weisbecker
enter_syscall_print_##sname and exit_syscall_print_##sname don't need to have a global scope. Make them static. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> LKML-Reference: <1259734990-9034-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-01trace_syscalls: Simplify syscall profileLai Jiangshan
use only one prof_sysenter_enable() instead of prof_sysenter_enable_##sname() use only one prof_sysenter_disable() instead of prof_sysenter_disable_##sname() use only one prof_sysexit_enable() instead of prof_sysexit_enable_##sname() use only one prof_sysexit_disable() instead of prof_sysexit_disable_##sname() Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Jason Baron <jbaron@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4B14D2A1.8060304@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-01trace_syscalls: Remove duplicate init_enter_##sname()Lai Jiangshan
use only one init_syscall_trace instead of many init_enter_##sname()/init_exit_##sname() Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Jason Baron <jbaron@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4B14D29B.6090708@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-01trace_syscalls: Add syscall_nr field to struct syscall_metadataLai Jiangshan
Add syscall_nr field to struct syscall_metadata, it helps us to get syscall number easier. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Jason Baron <jbaron@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4B14D293.6090800@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-01trace_syscalls: Remove enter_id exit_idLai Jiangshan
use ->enter_event->id instead of ->enter_id use ->exit_event->id instead of ->exit_id Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Jason Baron <jbaron@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4B14D288.7030001@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-01trace_syscalls: Set event_enter_##sname->data to its metadataLai Jiangshan
Set event_enter_##sname->data to its metadata, it makes codes simpler. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Jason Baron <jbaron@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4B14D282.7050709@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-01trace_syscalls: Remove unused event_syscall_enter and event_syscall_exitLai Jiangshan
fix event_enter_##sname->event fix event_exit_##sname->event remove unused event_syscall_enter and event_syscall_exit Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Jason Baron <jbaron@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4B14D278.4090209@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12net: Introduce recvmmsg socket syscallArnaldo Carvalho de Melo
Meaning receive multiple messages, reducing the number of syscalls and net stack entry/exit operations. Next patches will introduce mechanisms where protocols that want to optimize this operation will provide an unlocked_recvmsg operation. This takes into account comments made by: . Paul Moore: sock_recvmsg is called only for the first datagram, sock_recvmsg_nosec is used for the rest. . Caitlin Bestler: recvmmsg now has a struct timespec timeout, that works in the same fashion as the ppoll one. If the underlying protocol returns a datagram with MSG_OOB set, this will make recvmmsg return right away with as many datagrams (+ the OOB one) it has received so far. . Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen datagrams and then recvmsg returns an error, recvmmsg will return the successfully received datagrams, store the error and return it in the next call. This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg, where we will be able to acquire the lock only at batch start and end, not at every underlying recvmsg call. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-23Merge commit 'linus/master' into tracing/kprobesFrederic Weisbecker
Conflicts: kernel/trace/Makefile kernel/trace/trace.h kernel/trace/trace_event_types.h kernel/trace/trace_export.c Merge reason: Sync with latest significant tracing core changes.
2009-09-23fs: change sys_truncate length parameter typeHeiko Carstens
For this system call user space passes a signed long length parameter, while the kernel side takes an unsigned long parameter and converts it later to signed long again. This has led to bugs in compat wrappers see e.g. dd90bbd5 "powerpc: Add compat_sys_truncate". The s390 compat wrapper for this functions is broken as well since it also performs zero extension instead of sign extension for the length parameter. In addition if hpa comes up with an automated way of generating compat wrappers it would generate a wrong one here. So change the length parameter from unsigned long to long. Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-21Merge branch 'perfcounters-rename-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-rename-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf: Tidy up after the big rename perf: Do the big rename: Performance Counters -> Performance Events perf_counter: Rename 'event' to event_id/hw_event perf_counter: Rename list_entry -> group_entry, counter_list -> group_list Manually resolved some fairly trivial conflicts with the tracing tree in include/trace/ftrace.h and kernel/trace/trace_syscalls.c.
2009-09-21perf: Do the big rename: Performance Counters -> Performance EventsIngo Molnar
Bye-bye Performance Counters, welcome Performance Events! In the past few months the perfcounters subsystem has grown out its initial role of counting hardware events, and has become (and is becoming) a much broader generic event enumeration, reporting, logging, monitoring, analysis facility. Naming its core object 'perf_counter' and naming the subsystem 'perfcounters' has become more and more of a misnomer. With pending code like hw-breakpoints support the 'counter' name is less and less appropriate. All in one, we've decided to rename the subsystem to 'performance events' and to propagate this rename through all fields, variables and API names. (in an ABI compatible fashion) The word 'event' is also a bit shorter than 'counter' - which makes it slightly more convenient to write/handle as well. Thanks goes to Stephane Eranian who first observed this misnomer and suggested a rename. User-space tooling and ABI compatibility is not affected - this patch should be function-invariant. (Also, defconfigs were not touched to keep the size down.) This patch has been generated via the following script: FILES=$(find * -type f | grep -vE 'oprofile|[^K]config') sed -i \ -e 's/PERF_EVENT_/PERF_RECORD_/g' \ -e 's/PERF_COUNTER/PERF_EVENT/g' \ -e 's/perf_counter/perf_event/g' \ -e 's/nb_counters/nb_events/g' \ -e 's/swcounter/swevent/g' \ -e 's/tpcounter_event/tp_event/g' \ $FILES for N in $(find . -name perf_counter.[ch]); do M=$(echo $N | sed 's/perf_counter/perf_event/g') mv $N $M done FILES=$(find . -name perf_event.*) sed -i \ -e 's/COUNTER_MASK/REG_MASK/g' \ -e 's/COUNTER/EVENT/g' \ -e 's/\<event\>/event_id/g' \ -e 's/counter/event/g' \ -e 's/Counter/Event/g' \ $FILES ... to keep it as correct as possible. This script can also be used by anyone who has pending perfcounters patches - it converts a Linux kernel tree over to the new naming. We tried to time this change to the point in time where the amount of pending patches is the smallest: the end of the merge window. Namespace clashes were fixed up in a preparatory patch - and some stylistic fallout will be fixed up in a subsequent patch. ( NOTE: 'counters' are still the proper terminology when we deal with hardware registers - and these sed scripts are a bit over-eager in renaming them. I've undone some of that, but in case there's something left where 'counter' would be better than 'event' we can undo that on an individual basis instead of touching an otherwise nicely automated patch. ) Suggested-by: Stephane Eranian <eranian@google.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-18tracing: Factorize the events profile accountingFrederic Weisbecker
Factorize the events enabling accounting in a common tracing core helper. This reduces the size of the profile_enable() and profile_disable() callbacks for each trace events. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Ingo Molnar <mingo@elte.hu>
2009-09-11Merge commit 'tracing/core' into tracing/kprobesFrederic Weisbecker
Conflicts: kernel/trace/trace_export.c kernel/trace/trace_kprobe.c Merge reason: This topic branch lacks an important build fix in tracing/core: 0dd7b74787eaf7858c6c573353a83c3e2766e674: tracing: Fix double CPP substitution in TRACE_EVENT_FN that prevents from multiple tracepoint headers inclusion crashes. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-08-31tracing/filters: Defer pred allocationLi Zefan
init_preds() allocates about 5392 bytes of memory (on x86_32) for a TRACE_EVENT. With my config, at system boot total memory occupied is: 5392 * (642 + 15) == 3459KB 642 == cat available_events | wc -l 15 == number of dirs in events/ftrace That's quite a lot, so we'd better defer memory allocation util it's needed, that's when filter is used. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> LKML-Reference: <4A9B8EA5.6020700@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-27tracing: Ftrace dynamic ftrace_event_call supportMasami Hiramatsu
Add dynamic ftrace_event_call support to ftrace. Trace engines can add new ftrace_event_call to ftrace on the fly. Each operator function of the call takes an ftrace_event_call data structure as an argument, because these functions may be shared among several ftrace_event_calls. Changes from v13: - Define remove_subsystem_dir() always (revirt a2ca5e03), because trace_remove_event_call() uses it. - Modify syscall tracer because of ftrace_event_call change. [fweisbec@gmail.com: Fixed conflict against latest tracing/core] Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Avi Kivity <avi@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jason Baron <jbaron@redhat.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Przemysław Pawełczyk <przemyslaw@pawelczyk.it> Cc: Roland McGrath <roland@redhat.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Vegard Nossum <vegard.nossum@gmail.com> LKML-Reference: <20090813203453.31965.71901.stgit@localhost.localdomain> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-08-19tracing/syscalls: Add filtering supportLi Zefan
Add filtering support for syscall events: # echo 'mode == 0666' > events/syscalls/sys_enter_open # echo 'ret == 0' > events/syscalls/sys_exit_open # echo 1 > events/syscalls/sys_enter_open # echo 1 > events/syscalls/sys_exit_open # cat trace ... modprobe-3084 [001] 117.463140: sys_open(filename: 917d3e8, flags: 0, mode: 1b6) modprobe-3084 [001] 117.463176: sys_open -> 0x0 less-3086 [001] 117.510455: sys_open(filename: 9c6bdb8, flags: 8000, mode: 1b6) sendmail-2574 [001] 122.145840: sys_open(filename: b807a365, flags: 0, mode: 1b6) ... Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4A8BAFCB.1040006@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-19tracing/syscalls: Add fields format for exit eventsLi Zefan
Add "format" file for syscall exit events: # cat events/syscalls/sys_exit_open/format name: sys_exit_open ID: 344 format: field:unsigned short common_type; offset:0; size:2; field:unsigned char common_flags; offset:2; size:1; field:unsigned char common_preempt_count; offset:3; size:1; field:int common_pid; offset:4; size:4; field:int common_tgid; offset:8; size:4; field:int nr; offset:12; size:4; field:unsigned long ret; offset:16; size:4; Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4A8BAF61.3060307@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-11tracing: Add fields format definition for syscall eventsFrederic Weisbecker
Define the format of the syscall trace fields to parse the binary values from a raw trace using the syscall events "format" file. This is defined dynamically using the syscalls metadata. It prepares the export of syscall event raw records to perf counters. Example: $ cat /debug/tracing/events/syscalls/sys_enter_sched_getparam/format name: sys_enter_sched_getparam ID: 39 format: field:unsigned short common_type; offset:0; size:2; field:unsigned char common_flags; offset:2; size:1; field:unsigned char common_preempt_count; offset:3; size:1; field:int common_pid; offset:4; size:4; field:int common_tgid; offset:8; size:4; field:pid_t pid; offset:12; size:8; field:struct sched_param * param; offset:20; size:8; print fmt: "pid: 0x%08lx, param: 0x%08lx", ((unsigned long)(REC->pid)), ((unsigned long)(REC->param)) Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Jiaying Zhang <jiayingz@google.com> Cc: Martin Bligh <mbligh@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Jason Baron <jbaron@redhat.com>
2009-08-11tracing: Add perf counter support for syscalls tracingJason Baron
The perf counter support is automated for usual trace events. But we have to define specific callbacks for this to handle syscalls trace events Make 'perf stat -e syscalls:sys_enter_blah' work with syscall style tracepoints. Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Jiaying Zhang <jiayingz@google.com> Cc: Martin Bligh <mbligh@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-08-11tracing: Add individual syscalls tracepoint id supportJason Baron
The current state of syscalls tracepoints generates only one event id for every syscall events. This patch associates an id with each syscall trace event, so that we can identify each syscall trace event using the 'perf' tool. Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Jiaying Zhang <jiayingz@google.com> Cc: Martin Bligh <mbligh@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-08-11tracing: Add trace events for each syscall entry/exitJason Baron
Layer Frederic's syscall tracer on tracepoints. We create trace events via hooking into the SYSCALL_DEFINE macros. This allows us to individually toggle syscall entry and exit points on/off. Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Jiaying Zhang <jiayingz@google.com> Cc: Martin Bligh <mbligh@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-07-07signals: declare sys_rt_tgsigqueueinfo in syscalls.hArnd Bergmann
sys_rt_tgsigqueueinfo needs to be declared in linux/syscalls.h so that architectures defining the system call table in C can reference it. Signed-off-by: Arnd Bergmann <arnd@arndb.de> LKML-Reference: <200907071023.44008.arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-06-16syscalls.h: remove duplicated declarations for sys_pipe2Masatake YAMATO
sys_pipe2 is declared twice in include/linux/syscalls.h. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-12perf_counter: Add forward/backward attribute ABI compatibilityPeter Zijlstra
Provide for means of extending the perf_counter_attr in a 'natural' way. We allow growing the structure by appending fields at the end by specifying the full structure size inside it. When a new kernel sees a smaller (old) structure, it will 0 pad the tail. When an old kernel sees a larger (new) structure, it will verify the tail consists of 0s, otherwise fail. If we fail due to a size-mismatch, we return -E2BIG and write the kernel's native attribe size back into the provided structure. Furthermore, add some attribute verification, so that we'll fail counter creation when unknown bits are present (PERF_SAMPLE, PERF_FORMAT, or in the __reserved fields). (This ABI detail is introduced while keeping the existing syscall ABI.) Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-02perf_counter: Rename perf_counter_hw_event => perf_counter_attrPeter Zijlstra
The structure isn't hw only and when I read event, I think about those things that fall out the other end. Rename the thing. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Kacur <jkacur@redhat.com> Cc: Stephane Eranian <eranian@googlemail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-18Merge commit 'v2.6.30-rc6' into perfcounters/coreIngo Molnar
Merge reason: this branch was on an -rc4 base, merge it up to -rc6 to get the latest upstream fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-12syscalls.h add the missing sys_pipe2 declarationArnd Bergmann
In order to build the generic syscall table, we need a declaration for every system call. sys_pipe2 was added without a proper declaration, so add this to syscalls.h now. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-29Merge branch 'linus' into perfcounters/coreIngo Molnar
Merge reason: This brach was on -rc1, refresh it to almost-rc4 to pick up the latest upstream fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-21Fix SYSCALL_ALIAS for older MIPS assemblerThomas Bogendoerfer
Older MIPS assembler don't support .set for defining aliases. Using = works for old and new assembers. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-09tracing/syscalls: use a dedicated file headerFrederic Weisbecker
Impact: fix build warnings and possibe compat misbehavior on IA64 Building a kernel on ia64 might trigger these ugly build warnings: CC arch/ia64/ia32/sys_ia32.o In file included from arch/ia64/ia32/sys_ia32.c:55: arch/ia64/ia32/ia32priv.h:290:1: warning: "elf_check_arch" redefined In file included from include/linux/elf.h:7, from include/linux/module.h:14, from include/linux/ftrace.h:8, from include/linux/syscalls.h:68, from arch/ia64/ia32/sys_ia32.c:18: arch/ia64/include/asm/elf.h:19:1: warning: this is the location of the previous definition [...] sys_ia32.c includes linux/syscalls.h which in turn includes linux/ftrace.h to import the syscalls tracing prototypes. But including ftrace.h can pull too much things for a low level file, especially on ia64 where the ia32 private headers conflict with higher level headers. Now we isolate the syscall tracing headers in their own lightweight file. Reported-by: Tony Luck <tony.luck@intel.com> Tested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Jason Baron <jbaron@redhat.com> Cc: "Frank Ch. Eigler" <fche@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Jiaying Zhang <jiayingz@google.com> Cc: Michael Rubin <mrubin@google.com> Cc: Martin Bligh <mbligh@google.com> Cc: Michael Davidson <md@google.com> LKML-Reference: <20090408184058.GB6017@nowhere> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06Merge branch 'linus' into perfcounters/core-v2Ingo Molnar
Merge reason: we have gathered quite a few conflicts, need to merge upstream Conflicts: arch/powerpc/kernel/Makefile arch/x86/ia32/ia32entry.S arch/x86/include/asm/hardirq.h arch/x86/include/asm/unistd_32.h arch/x86/include/asm/unistd_64.h arch/x86/kernel/cpu/common.c arch/x86/kernel/irq.c arch/x86/kernel/syscall_table_32.S arch/x86/mm/iomap_32.c include/linux/sched.h kernel/Makefile Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-05Merge branch 'tracing-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (413 commits) tracing, net: fix net tree and tracing tree merge interaction tracing, powerpc: fix powerpc tree and tracing tree interaction ring-buffer: do not remove reader page from list on ring buffer free function-graph: allow unregistering twice trace: make argument 'mem' of trace_seq_putmem() const tracing: add missing 'extern' keywords to trace_output.h tracing: provide trace_seq_reserve() blktrace: print out BLK_TN_MESSAGE properly blktrace: extract duplidate code blktrace: fix memory leak when freeing struct blk_io_trace blktrace: fix blk_probes_ref chaos blktrace: make classic output more classic blktrace: fix off-by-one bug blktrace: fix the original blktrace blktrace: fix a race when creating blk_tree_root in debugfs blktrace: fix timestamp in binary output tracing, Text Edit Lock: cleanup tracing: filter fix for TRACE_EVENT_FORMAT events ftrace: Using FTRACE_WARN_ON() to check "freed record" in ftrace_release() x86: kretprobe-booster interrupt emulation code fix ... Fix up trivial conflicts in arch/parisc/include/asm/ftrace.h include/linux/memory.h kernel/extable.c kernel/module.c
2009-04-04Make non-compat preadv/pwritev use native register sizeLinus Torvalds
Instead of always splitting the file offset into 32-bit 'high' and 'low' parts, just split them into the largest natural word-size - which in C terms is 'unsigned long'. This allows 64-bit architectures to avoid the unnecessary 32-bit shifting and masking for native format (while the compat interfaces will obviously always have to do it). This also changes the order of 'high' and 'low' to be "low first". Why? Because when we have it like this, the 64-bit system calls now don't use the "pos_high" argument at all, and it makes more sense for the native system call to simply match the user-mode prototype. This results in a much more natural calling convention, and allows the compiler to generate much more straightforward code. On x86-64, we now generate testq %rcx, %rcx # pos_l js .L122 #, movq %rcx, -48(%rbp) # pos_l, pos from the C source loff_t pos = pos_from_hilo(pos_h, pos_l); ... if (pos < 0) return -EINVAL; and the 'pos_h' register isn't even touched. It used to generate code like mov %r8d, %r8d # pos_low, pos_low salq $32, %rcx #, tmp71 movq %r8, %rax # pos_low, pos.386 orq %rcx, %rax # tmp71, pos.386 js .L122 #, movq %rax, -48(%rbp) # pos.386, pos which isn't _that_ horrible, but it does show how the natural word size is just a more sensible interface (same arguments will hold in the user level glibc wrapper function, of course, so the kernel side is just half of the equation!) Note: in all cases the user code wrapper can again be the same. You can just do #define HALF_BITS (sizeof(unsigned long)*4) __syscall(PWRITEV, fd, iov, count, offset, (offset >> HALF_BITS) >> HALF_BITS); or something like that. That way the user mode wrapper will also be nicely passing in a zero (it won't actually have to do the shifts, the compiler will understand what is going on) for the last argument. And that is a good idea, even if nobody will necessarily ever care: if we ever do move to a 128-bit lloff_t, this particular system call might be left alone. Of course, that will be the least of our worries if we really ever need to care, so this may not be worth really caring about. [ Fixed for lost 'loff_t' cast noticed by Andrew Morton ] Acked-by: Gerd Hoffmann <kraxel@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: Ingo Molnar <mingo@elte.hu> Cc: Ralf Baechle <ralf@linux-mips.org>> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02preadv/pwritev: Add preadv and pwritev system calls.Gerd Hoffmann
This patch adds preadv and pwritev system calls. These syscalls are a pretty straightforward combination of pread and readv (same for write). They are quite useful for doing vectored I/O in threaded applications. Using lseek+readv instead opens race windows you'll have to plug with locking. Other systems have such system calls too, for example NetBSD, check here: http://www.daemon-systems.org/man/preadv.2.html The application-visible interface provided by glibc should look like this to be compatible to the existing implementations in the *BSD family: ssize_t preadv(int d, const struct iovec *iov, int iovcnt, off_t offset); ssize_t pwritev(int d, const struct iovec *iov, int iovcnt, off_t offset); This prototype has one problem though: On 32bit archs is the (64bit) offset argument unaligned, which the syscall ABI of several archs doesn't allow to do. At least s390 needs a wrapper in glibc to handle this. As we'll need a wrappers in glibc anyway I've decided to push problem to glibc entriely and use a syscall prototype which works without arch-specific wrappers inside the kernel: The offset argument is explicitly splitted into two 32bit values. The patch sports the actual system call implementation and the windup in the x86 system call tables. Other archs follow as separate patches. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <linux-api@vger.kernel.org> Cc: <linux-arch@vger.kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-13tracing/syscalls: core infrastructure for syscalls tracing, enhancementsFrederic Weisbecker
Impact: new feature This adds the generic support for syscalls tracing. This is currently exploited through a devoted tracer but other tracing engines can use it. (They just have to play with {start,stop}_ftrace_syscalls() and use the display callbacks unless they want to override them.) The syscalls prototypes definitions are abused here to steal some metadata informations: - syscall name, param types, param names, number of params The syscall addr is not directly saved during this definition because we don't know if its prototype is available in the namespace. But we don't really need it. The arch has just to build a function able to resolve the syscall number to its metadata struct. The current tracer prints the syscall names, parameters names and values (and their types optionally). Currently the value is a raw hex but higher level values diplaying is on my TODO list. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1236955332-10133-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-04perfcounters: provide expansion room in the ABIPaul Mackerras
Impact: ABI change This expands several fields in the perf_counter_hw_event struct and adds a "flags" argument to the perf_counter_open system call, in order that features can be added in future without ABI changes. In particular the record_type field is expanded to 64 bits, and the space for flag bits has been expanded from 32 to 64 bits. This also adds some new fields: * read_format (64 bits) is intended to provide a way to specify what userspace wants to get back when it does a read() on a simple (non-interrupting) counter; * exclude_idle (1 bit) provides a way for userspace to ask that events that occur when the cpu is idle be excluded; * extra_config_len will provide a way for userspace to supply an arbitrary amount of extra machine-specific PMU configuration data immediately following the perf_counter_hw_event struct, to allow sophisticated users to program things such as instruction matching CAMs and address range registers; * __reserved_3 and __reserved_4 provide space for future expansion. Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-02-26perfcounters: fix a few minor cleanliness issuesPaul Mackerras
This fixes three issues noticed by Arnd Bergmann: - Add #ifdef __KERNEL__ and move some things around in perf_counter.h to make sure only the bits that userspace needs are exported to userspace. - Use __u64, __s64, __u32 types in the structs exported to userspace rather than u64, s64, u32. - Make the sys_perf_counter_open syscall available to the SPUs on Cell platforms. And one issue that I noticed in looking at the code again: - Wrap the perf_counter_open syscall with SYSCALL_DEFINE4 so we get the proper handling of int arguments on ppc64 (and some other 64-bit architectures). Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-02-13Merge branch 'linus' into perfcounters/coreIngo Molnar
Conflicts: arch/x86/kernel/acpi/boot.c
2009-02-11syscall define: fix uml compile bugHeiko Carstens
With the new system call defines we get this on uml: arch/um/sys-i386/built-in.o: In function `sys_call_table': (.rodata+0x308): undefined reference to `sys_sigprocmask' Reason for this is that uml passes the preprocessor option -Dsigprocmask=kernel_sigprocmask to gcc when compiling the kernel. This causes SYSCALL_DEFINE3(sigprocmask, ...) to be expanded to SYSCALL_DEFINEx(3, kernel_sigprocmask, ...) and finally to a system call named sys_kernel_sigprocmask. However sys_sigprocmask is missing because of this. To avoid macro expansion for the system call name just concatenate the name at first define instead of carrying it through severel levels. This was pointed out by Al Viro. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: WANG Cong <wangcong@zeuux.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-11Merge commit 'v2.6.29-rc4' into perfcounters/coreIngo Molnar
Conflicts: arch/x86/kernel/setup_percpu.c arch/x86/mm/fault.c drivers/acpi/processor_idle.c kernel/irq/handle.c
2009-01-29alpha: use syscall wrappersIvan Kokshaysky
Convert OSF syscalls and add alpha specific SYSCALL_ALIAS() macro. Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Richard Henderson <rth@twiddle.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-21Merge commit 'v2.6.29-rc2' into perfcounters/coreIngo Molnar
Conflicts: include/linux/syscalls.h
2009-01-14[CVE-2009-0029] System call wrappers part 33Heiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2009-01-14[CVE-2009-0029] System call wrappers part 32Heiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2009-01-14[CVE-2009-0029] powerpc: Enable syscall wrappers for 64-bitBenjamin Herrenschmidt
This enables the use of syscall wrappers to do proper sign extension for 64-bit programs. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2009-01-14[CVE-2009-0029] System call wrapper infrastructureHeiko Carstens
From: Martin Schwidefsky <schwidefsky@de.ibm.com> By selecting HAVE_SYSCALL_WRAPPERS architectures can activate system call wrappers in order to sign extend system call arguments. All architectures where the ABI defines that the caller of a function has to perform sign extension probably need this. Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2009-01-14[CVE-2009-0029] Rename old_readdir to sys_old_readdirHeiko Carstens
This way it matches the generic system call name convention. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2009-01-14[CVE-2009-0029] Convert all system calls to return a longHeiko Carstens
Convert all system calls to return a long. This should be a NOP since all converted types should have the same size anyway. With the exception of sys_exit_group which returned void. But that doesn't matter since the system call doesn't return. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>