From 88ac2921a71f788ed693bcd44731dd6bc1994640 Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Fri, 25 Jul 2008 19:45:43 -0700 Subject: tracehook: add linux/tracehook.h This patch series introduces the "tracehook" interface layer of inlines in . There are more details in the log entry for patch 01/23 and in the header file comments inside that patch. Most of these changes move code around with little or no change, and they should not break anything or change any behavior. This sets a new standard for uniform arch support to enable clean arch-independent implementations of new debugging and tracing stuff, denoted by CONFIG_HAVE_ARCH_TRACEHOOK. Patch 20/23 adds that symbol to arch/Kconfig, with comments listing everything an arch has to do before setting "select HAVE_ARCH_TRACEHOOK". These are elaborted a bit at: http://sourceware.org/systemtap/wiki/utrace/arch/HowTo The new inlines that arch code must define or call have detailed kerneldoc comments in the generic header files that say what is required. No arch is obligated to do any work, and no arch's build should be broken by these changes. There are several steps that each arch should take so it can set HAVE_ARCH_TRACEHOOK. Most of these are simple. Providing this support will let new things people add for doing debugging and tracing of user-level threads "just work" for your arch in the future. For an arch that does not provide HAVE_ARCH_TRACEHOOK, some new options for such features will not be available for config. I have done some arch work and will submit this to the arch maintainers after the generic tracehook series settles in. For now, that work is available in my GIT repositories, and in patch and mbox-of-patches form at http://people.redhat.com/roland/utrace/2.6-current/ This paves the way for my "utrace" work, to be submitted later. But it is not innately tied to that. I hope that the tracehook series can go in soon regardless of what eventually does or doesn't go on top of it. For anyone implementing any kind of new tracing/debugging plan, or just understanding all the context of the existing ptrace implementation, having tracehook.h makes things much easier to find and understand. This patch: This adds the new kernel-internal header file . This is not yet used at all. The comments in the header introduce what the following series of patches is about. The aim is to formalize and consolidate all the places that the core kernel code and the arch code now ties into the ptrace implementation. These patches mostly don't cause any functional change. They just move the details of ptrace logic out of core code into tracehook.h inlines, where they are mostly compiled away to the same as before. All that changes is that everything is thoroughly documented and any future reworking of ptrace, or addition of something new, would not have to touch core code all over, just change the tracehook.h inlines. The new linux/ptrace.h inlines are used by the following patches in the new tracehook_*() inlines. Using these helpers for the ptrace event stops makes it simple to change or disable the old ptrace implementation of these stops conditionally later. Signed-off-by: Roland McGrath Cc: Oleg Nesterov Reviewed-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/tracehook.h | 52 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+) create mode 100644 include/linux/tracehook.h (limited to 'include/linux/tracehook.h') diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h new file mode 100644 index 00000000000..bea0f3eeff5 --- /dev/null +++ b/include/linux/tracehook.h @@ -0,0 +1,52 @@ +/* + * Tracing hooks + * + * Copyright (C) 2008 Red Hat, Inc. All rights reserved. + * + * This copyrighted material is made available to anyone wishing to use, + * modify, copy, or redistribute it subject to the terms and conditions + * of the GNU General Public License v.2. + * + * This file defines hook entry points called by core code where + * user tracing/debugging support might need to do something. These + * entry points are called tracehook_*(). Each hook declared below + * has a detailed kerneldoc comment giving the context (locking et + * al) from which it is called, and the meaning of its return value. + * + * Each function here typically has only one call site, so it is ok + * to have some nontrivial tracehook_*() inlines. In all cases, the + * fast path when no tracing is enabled should be very short. + * + * The purpose of this file and the tracehook_* layer is to consolidate + * the interface that the kernel core and arch code uses to enable any + * user debugging or tracing facility (such as ptrace). The interfaces + * here are carefully documented so that maintainers of core and arch + * code do not need to think about the implementation details of the + * tracing facilities. Likewise, maintainers of the tracing code do not + * need to understand all the calling core or arch code in detail, just + * documented circumstances of each call, such as locking conditions. + * + * If the calling core code changes so that locking is different, then + * it is ok to change the interface documented here. The maintainer of + * core code changing should notify the maintainers of the tracing code + * that they need to work out the change. + * + * Some tracehook_*() inlines take arguments that the current tracing + * implementations might not necessarily use. These function signatures + * are chosen to pass in all the information that is on hand in the + * caller and might conceivably be relevant to a tracer, so that the + * core code won't have to be updated when tracing adds more features. + * If a call site changes so that some of those parameters are no longer + * already on hand without extra work, then the tracehook_* interface + * can change so there is no make-work burden on the core code. The + * maintainer of core code changing should notify the maintainers of the + * tracing code that they need to work out the change. + */ + +#ifndef _LINUX_TRACEHOOK_H +#define _LINUX_TRACEHOOK_H 1 + +#include +#include + +#endif /* */ -- cgit v1.2.3-70-g09d2 From 6341c393fcc37d58727865f1ee2f65e632e9d4f0 Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Fri, 25 Jul 2008 19:45:44 -0700 Subject: tracehook: exec This moves all the ptrace hooks related to exec into tracehook.h inlines. This also lifts the calls for tracing out of the binfmt load_binary hooks into search_binary_handler() after it calls into the binfmt module. This change has no effect, since all the binfmt modules' load_binary functions did the call at the end on success, and now search_binary_handler() does it immediately after return if successful. We consolidate the repeated code, and binfmt modules no longer need to import ptrace_notify(). Signed-off-by: Roland McGrath Cc: Oleg Nesterov Reviewed-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- arch/x86/ia32/ia32_aout.c | 6 ------ fs/binfmt_aout.c | 6 ------ fs/binfmt_elf.c | 6 ------ fs/binfmt_elf_fdpic.c | 7 ------- fs/binfmt_flat.c | 3 --- fs/binfmt_som.c | 2 -- fs/exec.c | 12 ++++-------- include/linux/tracehook.h | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 8 files changed, 50 insertions(+), 38 deletions(-) (limited to 'include/linux/tracehook.h') diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c index 58cccb6483b..a0e1dbe67dc 100644 --- a/arch/x86/ia32/ia32_aout.c +++ b/arch/x86/ia32/ia32_aout.c @@ -441,12 +441,6 @@ beyond_if: regs->r8 = regs->r9 = regs->r10 = regs->r11 = regs->r12 = regs->r13 = regs->r14 = regs->r15 = 0; set_fs(USER_DS); - if (unlikely(current->ptrace & PT_PTRACED)) { - if (current->ptrace & PT_TRACE_EXEC) - ptrace_notify((PTRACE_EVENT_EXEC << 8) | SIGTRAP); - else - send_sig(SIGTRAP, current, 0); - } return 0; } diff --git a/fs/binfmt_aout.c b/fs/binfmt_aout.c index ba4cddb92f1..204cfd1d767 100644 --- a/fs/binfmt_aout.c +++ b/fs/binfmt_aout.c @@ -444,12 +444,6 @@ beyond_if: regs->gp = ex.a_gpvalue; #endif start_thread(regs, ex.a_entry, current->mm->start_stack); - if (unlikely(current->ptrace & PT_PTRACED)) { - if (current->ptrace & PT_TRACE_EXEC) - ptrace_notify ((PTRACE_EVENT_EXEC << 8) | SIGTRAP); - else - send_sig(SIGTRAP, current, 0); - } return 0; } diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 3b6ff854d98..655ed8d30a8 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -1003,12 +1003,6 @@ static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs) #endif start_thread(regs, elf_entry, bprm->p); - if (unlikely(current->ptrace & PT_PTRACED)) { - if (current->ptrace & PT_TRACE_EXEC) - ptrace_notify ((PTRACE_EVENT_EXEC << 8) | SIGTRAP); - else - send_sig(SIGTRAP, current, 0); - } retval = 0; out: kfree(loc); diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c index 1b59b1edf26..fdeadab2f18 100644 --- a/fs/binfmt_elf_fdpic.c +++ b/fs/binfmt_elf_fdpic.c @@ -433,13 +433,6 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm, entryaddr = interp_params.entry_addr ?: exec_params.entry_addr; start_thread(regs, entryaddr, current->mm->start_stack); - if (unlikely(current->ptrace & PT_PTRACED)) { - if (current->ptrace & PT_TRACE_EXEC) - ptrace_notify((PTRACE_EVENT_EXEC << 8) | SIGTRAP); - else - send_sig(SIGTRAP, current, 0); - } - retval = 0; error: diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c index 2cb1acda3a8..56372ecf169 100644 --- a/fs/binfmt_flat.c +++ b/fs/binfmt_flat.c @@ -920,9 +920,6 @@ static int load_flat_binary(struct linux_binprm * bprm, struct pt_regs * regs) start_thread(regs, start_addr, current->mm->start_stack); - if (current->ptrace & PT_PTRACED) - send_sig(SIGTRAP, current, 0); - return 0; } diff --git a/fs/binfmt_som.c b/fs/binfmt_som.c index fdc36bfd6a7..68be580ba28 100644 --- a/fs/binfmt_som.c +++ b/fs/binfmt_som.c @@ -274,8 +274,6 @@ load_som_binary(struct linux_binprm * bprm, struct pt_regs * regs) map_hpux_gateway_page(current,current->mm); start_thread_som(regs, som_entry, bprm->p); - if (current->ptrace & PT_PTRACED) - send_sig(SIGTRAP, current, 0); return 0; /* error cleanup */ diff --git a/fs/exec.c b/fs/exec.c index 5e559013e30..b8792a13153 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -42,13 +42,13 @@ #include #include #include -#include #include #include #include #include #include #include +#include #include #include @@ -1071,13 +1071,8 @@ EXPORT_SYMBOL(prepare_binprm); static int unsafe_exec(struct task_struct *p) { - int unsafe = 0; - if (p->ptrace & PT_PTRACED) { - if (p->ptrace & PT_PTRACE_CAP) - unsafe |= LSM_UNSAFE_PTRACE_CAP; - else - unsafe |= LSM_UNSAFE_PTRACE; - } + int unsafe = tracehook_unsafe_exec(p); + if (atomic_read(&p->fs->count) > 1 || atomic_read(&p->files->count) > 1 || atomic_read(&p->sighand->count) > 1) @@ -1214,6 +1209,7 @@ int search_binary_handler(struct linux_binprm *bprm,struct pt_regs *regs) read_unlock(&binfmt_lock); retval = fn(bprm, regs); if (retval >= 0) { + tracehook_report_exec(fmt, bprm, regs); put_binfmt(fmt); allow_write_access(bprm->file); if (bprm->file) diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index bea0f3eeff5..6276353709c 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -48,5 +48,51 @@ #include #include +#include +struct linux_binprm; + +/** + * tracehook_unsafe_exec - check for exec declared unsafe due to tracing + * @task: current task doing exec + * + * Return %LSM_UNSAFE_* bits applied to an exec because of tracing. + * + * Called with task_lock() held on @task. + */ +static inline int tracehook_unsafe_exec(struct task_struct *task) +{ + int unsafe = 0; + int ptrace = task_ptrace(task); + if (ptrace & PT_PTRACED) { + if (ptrace & PT_PTRACE_CAP) + unsafe |= LSM_UNSAFE_PTRACE_CAP; + else + unsafe |= LSM_UNSAFE_PTRACE; + } + return unsafe; +} + +/** + * tracehook_report_exec - a successful exec was completed + * @fmt: &struct linux_binfmt that performed the exec + * @bprm: &struct linux_binprm containing exec details + * @regs: user-mode register state + * + * An exec just completed, we are shortly going to return to user mode. + * The freshly initialized register state can be seen and changed in @regs. + * The name, file and other pointers in @bprm are still on hand to be + * inspected, but will be freed as soon as this returns. + * + * Called with no locks, but with some kernel resources held live + * and a reference on @fmt->module. + */ +static inline void tracehook_report_exec(struct linux_binfmt *fmt, + struct linux_binprm *bprm, + struct pt_regs *regs) +{ + if (!ptrace_event(PT_TRACE_EXEC, PTRACE_EVENT_EXEC, 0) && + unlikely(task_ptrace(current) & PT_PTRACED)) + send_sig(SIGTRAP, current, 0); +} #endif /* */ -- cgit v1.2.3-70-g09d2 From 30199f5a46aee204bf437a4f5b0740f3efe448b7 Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Fri, 25 Jul 2008 19:45:46 -0700 Subject: tracehook: exit This moves the PTRACE_EVENT_EXIT tracing into a tracehook.h inline, tracehook_report_exec(). The change has no effect, just clean-up. Signed-off-by: Roland McGrath Cc: Oleg Nesterov Reviewed-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/tracehook.h | 15 +++++++++++++++ kernel/exit.c | 6 ++---- 2 files changed, 17 insertions(+), 4 deletions(-) (limited to 'include/linux/tracehook.h') diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index 6276353709c..967ab473afb 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -95,4 +95,19 @@ static inline void tracehook_report_exec(struct linux_binfmt *fmt, send_sig(SIGTRAP, current, 0); } +/** + * tracehook_report_exit - task has begun to exit + * @exit_code: pointer to value destined for @current->exit_code + * + * @exit_code points to the value passed to do_exit(), which tracing + * might change here. This is almost the first thing in do_exit(), + * before freeing any resources or setting the %PF_EXITING flag. + * + * Called with no locks held. + */ +static inline void tracehook_report_exit(long *exit_code) +{ + ptrace_event(PT_TRACE_EXIT, PTRACE_EVENT_EXIT, *exit_code); +} + #endif /* */ diff --git a/kernel/exit.c b/kernel/exit.c index ad933bb29ec..c3691cbc220 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -46,6 +46,7 @@ #include #include #include +#include #include #include @@ -1029,10 +1030,7 @@ NORET_TYPE void do_exit(long code) if (unlikely(!tsk->pid)) panic("Attempted to kill the idle task!"); - if (unlikely(current->ptrace & PT_TRACE_EXIT)) { - current->ptrace_message = code; - ptrace_notify((PTRACE_EVENT_EXIT << 8) | SIGTRAP); - } + tracehook_report_exit(&code); /* * We're taking recursive faults here in do_exit. Safest is to just -- cgit v1.2.3-70-g09d2 From 09a05394fe2448a4139b014936330af23fa7ec83 Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Fri, 25 Jul 2008 19:45:47 -0700 Subject: tracehook: clone This moves all the ptrace initialization and tracing logic for task creation into tracehook.h and ptrace.h inlines. It reorganizes the code slightly, but should not change any behavior. There are four tracehook entry points, at each important stage of task creation. This keeps the interface from the core fork.c code fairly clean, while supporting the complex setup required for ptrace or something like it. Signed-off-by: Roland McGrath Cc: Oleg Nesterov Reviewed-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/ptrace.h | 22 ++++++++++ include/linux/tracehook.h | 100 ++++++++++++++++++++++++++++++++++++++++++++++ kernel/fork.c | 69 +++++++++++++------------------- 3 files changed, 150 insertions(+), 41 deletions(-) (limited to 'include/linux/tracehook.h') diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h index c74abfc4c7e..dae6d85520f 100644 --- a/include/linux/ptrace.h +++ b/include/linux/ptrace.h @@ -154,6 +154,28 @@ static inline int ptrace_event(int mask, int event, unsigned long message) return 1; } +/** + * ptrace_init_task - initialize ptrace state for a new child + * @child: new child task + * @ptrace: true if child should be ptrace'd by parent's tracer + * + * This is called immediately after adding @child to its parent's children + * list. @ptrace is false in the normal case, and true to ptrace @child. + * + * Called with current's siglock and write_lock_irq(&tasklist_lock) held. + */ +static inline void ptrace_init_task(struct task_struct *child, bool ptrace) +{ + INIT_LIST_HEAD(&child->ptrace_entry); + INIT_LIST_HEAD(&child->ptraced); + child->parent = child->real_parent; + child->ptrace = 0; + if (unlikely(ptrace)) { + child->ptrace = current->ptrace; + __ptrace_link(child, current->parent); + } +} + #ifndef force_successful_syscall_return /* * System call handlers that, upon successful completion, need to return a diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index 967ab473afb..3ebc58b5976 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -110,4 +110,104 @@ static inline void tracehook_report_exit(long *exit_code) ptrace_event(PT_TRACE_EXIT, PTRACE_EVENT_EXIT, *exit_code); } +/** + * tracehook_prepare_clone - prepare for new child to be cloned + * @clone_flags: %CLONE_* flags from clone/fork/vfork system call + * + * This is called before a new user task is to be cloned. + * Its return value will be passed to tracehook_finish_clone(). + * + * Called with no locks held. + */ +static inline int tracehook_prepare_clone(unsigned clone_flags) +{ + if (clone_flags & CLONE_UNTRACED) + return 0; + + if (clone_flags & CLONE_VFORK) { + if (current->ptrace & PT_TRACE_VFORK) + return PTRACE_EVENT_VFORK; + } else if ((clone_flags & CSIGNAL) != SIGCHLD) { + if (current->ptrace & PT_TRACE_CLONE) + return PTRACE_EVENT_CLONE; + } else if (current->ptrace & PT_TRACE_FORK) + return PTRACE_EVENT_FORK; + + return 0; +} + +/** + * tracehook_finish_clone - new child created and being attached + * @child: new child task + * @clone_flags: %CLONE_* flags from clone/fork/vfork system call + * @trace: return value from tracehook_clone_prepare() + * + * This is called immediately after adding @child to its parent's children list. + * The @trace value is that returned by tracehook_prepare_clone(). + * + * Called with current's siglock and write_lock_irq(&tasklist_lock) held. + */ +static inline void tracehook_finish_clone(struct task_struct *child, + unsigned long clone_flags, int trace) +{ + ptrace_init_task(child, (clone_flags & CLONE_PTRACE) || trace); +} + +/** + * tracehook_report_clone - in parent, new child is about to start running + * @trace: return value from tracehook_clone_prepare() + * @regs: parent's user register state + * @clone_flags: flags from parent's system call + * @pid: new child's PID in the parent's namespace + * @child: new child task + * + * Called after a child is set up, but before it has been started running. + * The @trace value is that returned by tracehook_clone_prepare(). + * This is not a good place to block, because the child has not started yet. + * Suspend the child here if desired, and block in tracehook_clone_complete(). + * This must prevent the child from self-reaping if tracehook_clone_complete() + * uses the @child pointer; otherwise it might have died and been released by + * the time tracehook_report_clone_complete() is called. + * + * Called with no locks held, but the child cannot run until this returns. + */ +static inline void tracehook_report_clone(int trace, struct pt_regs *regs, + unsigned long clone_flags, + pid_t pid, struct task_struct *child) +{ + if (unlikely(trace)) { + /* + * The child starts up with an immediate SIGSTOP. + */ + sigaddset(&child->pending.signal, SIGSTOP); + set_tsk_thread_flag(child, TIF_SIGPENDING); + } +} + +/** + * tracehook_report_clone_complete - new child is running + * @trace: return value from tracehook_clone_prepare() + * @regs: parent's user register state + * @clone_flags: flags from parent's system call + * @pid: new child's PID in the parent's namespace + * @child: child task, already running + * + * This is called just after the child has started running. This is + * just before the clone/fork syscall returns, or blocks for vfork + * child completion if @clone_flags has the %CLONE_VFORK bit set. + * The @child pointer may be invalid if a self-reaping child died and + * tracehook_report_clone() took no action to prevent it from self-reaping. + * + * Called with no locks held. + */ +static inline void tracehook_report_clone_complete(int trace, + struct pt_regs *regs, + unsigned long clone_flags, + pid_t pid, + struct task_struct *child) +{ + if (unlikely(trace)) + ptrace_event(0, trace, pid); +} + #endif /* */ diff --git a/kernel/fork.c b/kernel/fork.c index 80e83e459b1..b42f8ed2361 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -37,6 +37,7 @@ #include #include #include +#include #include #include #include @@ -865,8 +866,7 @@ static void copy_flags(unsigned long clone_flags, struct task_struct *p) new_flags &= ~PF_SUPERPRIV; new_flags |= PF_FORKNOEXEC; - if (!(clone_flags & CLONE_PTRACE)) - p->ptrace = 0; + new_flags |= PF_STARTING; p->flags = new_flags; clear_freeze_flag(p); } @@ -907,7 +907,8 @@ static struct task_struct *copy_process(unsigned long clone_flags, struct pt_regs *regs, unsigned long stack_size, int __user *child_tidptr, - struct pid *pid) + struct pid *pid, + int trace) { int retval; struct task_struct *p; @@ -1163,8 +1164,6 @@ static struct task_struct *copy_process(unsigned long clone_flags, */ p->group_leader = p; INIT_LIST_HEAD(&p->thread_group); - INIT_LIST_HEAD(&p->ptrace_entry); - INIT_LIST_HEAD(&p->ptraced); /* Now that the task is set up, run cgroup callbacks if * necessary. We need to run them before the task is visible @@ -1195,7 +1194,6 @@ static struct task_struct *copy_process(unsigned long clone_flags, p->real_parent = current->real_parent; else p->real_parent = current; - p->parent = p->real_parent; spin_lock(¤t->sighand->siglock); @@ -1237,8 +1235,7 @@ static struct task_struct *copy_process(unsigned long clone_flags, if (likely(p->pid)) { list_add_tail(&p->sibling, &p->real_parent->children); - if (unlikely(p->ptrace & PT_PTRACED)) - __ptrace_link(p, current->parent); + tracehook_finish_clone(p, clone_flags, trace); if (thread_group_leader(p)) { if (clone_flags & CLONE_NEWPID) @@ -1323,29 +1320,13 @@ struct task_struct * __cpuinit fork_idle(int cpu) struct pt_regs regs; task = copy_process(CLONE_VM, 0, idle_regs(®s), 0, NULL, - &init_struct_pid); + &init_struct_pid, 0); if (!IS_ERR(task)) init_idle(task, cpu); return task; } -static int fork_traceflag(unsigned clone_flags) -{ - if (clone_flags & CLONE_UNTRACED) - return 0; - else if (clone_flags & CLONE_VFORK) { - if (current->ptrace & PT_TRACE_VFORK) - return PTRACE_EVENT_VFORK; - } else if ((clone_flags & CSIGNAL) != SIGCHLD) { - if (current->ptrace & PT_TRACE_CLONE) - return PTRACE_EVENT_CLONE; - } else if (current->ptrace & PT_TRACE_FORK) - return PTRACE_EVENT_FORK; - - return 0; -} - /* * Ok, this is the main fork-routine. * @@ -1380,14 +1361,14 @@ long do_fork(unsigned long clone_flags, } } - if (unlikely(current->ptrace)) { - trace = fork_traceflag (clone_flags); - if (trace) - clone_flags |= CLONE_PTRACE; - } + /* + * When called from kernel_thread, don't do user tracing stuff. + */ + if (likely(user_mode(regs))) + trace = tracehook_prepare_clone(clone_flags); p = copy_process(clone_flags, stack_start, regs, stack_size, - child_tidptr, NULL); + child_tidptr, NULL, trace); /* * Do this prior waking up the new thread - the thread pointer * might get invalid after that point, if the thread exits quickly. @@ -1405,24 +1386,30 @@ long do_fork(unsigned long clone_flags, init_completion(&vfork); } - if ((p->ptrace & PT_PTRACED) || (clone_flags & CLONE_STOPPED)) { + tracehook_report_clone(trace, regs, clone_flags, nr, p); + + /* + * We set PF_STARTING at creation in case tracing wants to + * use this to distinguish a fully live task from one that + * hasn't gotten to tracehook_report_clone() yet. Now we + * clear it and set the child going. + */ + p->flags &= ~PF_STARTING; + + if (unlikely(clone_flags & CLONE_STOPPED)) { /* * We'll start up with an immediate SIGSTOP. */ sigaddset(&p->pending.signal, SIGSTOP); set_tsk_thread_flag(p, TIF_SIGPENDING); - } - - if (!(clone_flags & CLONE_STOPPED)) - wake_up_new_task(p, clone_flags); - else __set_task_state(p, TASK_STOPPED); - - if (unlikely (trace)) { - current->ptrace_message = nr; - ptrace_notify ((trace << 8) | SIGTRAP); + } else { + wake_up_new_task(p, clone_flags); } + tracehook_report_clone_complete(trace, regs, + clone_flags, nr, p); + if (clone_flags & CLONE_VFORK) { freezer_do_not_count(); wait_for_completion(&vfork); -- cgit v1.2.3-70-g09d2 From daded34be96b1975ff8539ff62ad8b158ce7d842 Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Fri, 25 Jul 2008 19:45:47 -0700 Subject: tracehook: vfork-done This moves the PTRACE_EVENT_VFORK_DONE tracing into a tracehook.h inline, tracehook_report_vfork_done(). The change has no effect, just clean-up. Signed-off-by: Roland McGrath Cc: Oleg Nesterov Reviewed-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/tracehook.h | 18 ++++++++++++++++++ kernel/fork.c | 5 +---- 2 files changed, 19 insertions(+), 4 deletions(-) (limited to 'include/linux/tracehook.h') diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index 3ebc58b5976..830e6e16097 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -210,4 +210,22 @@ static inline void tracehook_report_clone_complete(int trace, ptrace_event(0, trace, pid); } +/** + * tracehook_report_vfork_done - vfork parent's child has exited or exec'd + * @child: child task, already running + * @pid: new child's PID in the parent's namespace + * + * Called after a %CLONE_VFORK parent has waited for the child to complete. + * The clone/vfork system call will return immediately after this. + * The @child pointer may be invalid if a self-reaping child died and + * tracehook_report_clone() took no action to prevent it from self-reaping. + * + * Called with no locks held. + */ +static inline void tracehook_report_vfork_done(struct task_struct *child, + pid_t pid) +{ + ptrace_event(PT_TRACE_VFORK_DONE, PTRACE_EVENT_VFORK_DONE, pid); +} + #endif /* */ diff --git a/kernel/fork.c b/kernel/fork.c index b42f8ed2361..abb3ed6298f 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1414,10 +1414,7 @@ long do_fork(unsigned long clone_flags, freezer_do_not_count(); wait_for_completion(&vfork); freezer_count(); - if (unlikely (current->ptrace & PT_TRACE_VFORK_DONE)) { - current->ptrace_message = nr; - ptrace_notify ((PTRACE_EVENT_VFORK_DONE << 8) | SIGTRAP); - } + tracehook_report_vfork_done(p, nr); } } else { nr = PTR_ERR(p); -- cgit v1.2.3-70-g09d2 From dae33574dcf5211e1f43c7e45fa29f73ba3e00cb Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Fri, 25 Jul 2008 19:45:48 -0700 Subject: tracehook: release_task This moves the ptrace-related logic from release_task into tracehook.h and ptrace.h inlines. It provides clean hooks both before and after locking tasklist_lock, for future tracing logic to do more cleanup without the lock. This also changes release_task() itself in the rare "zap_leader" case to set the leader to EXIT_DEAD before iterating. This maintains the invariant that release_task() only ever handles a task in EXIT_DEAD. This is a common-sense invariant that is already always true except in this one arcane case of zombie leader whose parent ignores SIGCHLD. This change is harmless and only costs one store in this one rare case. It keeps the expected state more consisently sane, which is nicer when debugging weirdness in release_task(). It also lets some future code in the tracehook entry points rely on this invariant for bookkeeping. Signed-off-by: Roland McGrath Cc: Oleg Nesterov Reviewed-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/ptrace.h | 13 +++++++++++++ include/linux/tracehook.h | 28 ++++++++++++++++++++++++++++ kernel/exit.c | 21 +++++++++------------ 3 files changed, 50 insertions(+), 12 deletions(-) (limited to 'include/linux/tracehook.h') diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h index dae6d85520f..ed69c03692d 100644 --- a/include/linux/ptrace.h +++ b/include/linux/ptrace.h @@ -176,6 +176,19 @@ static inline void ptrace_init_task(struct task_struct *child, bool ptrace) } } +/** + * ptrace_release_task - final ptrace-related cleanup of a zombie being reaped + * @task: task in %EXIT_DEAD state + * + * Called with write_lock(&tasklist_lock) held. + */ +static inline void ptrace_release_task(struct task_struct *task) +{ + BUG_ON(!list_empty(&task->ptraced)); + ptrace_unlink(task); + BUG_ON(!list_empty(&task->ptrace_entry)); +} + #ifndef force_successful_syscall_return /* * System call handlers that, upon successful completion, need to return a diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index 830e6e16097..9a5b3be2503 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -228,4 +228,32 @@ static inline void tracehook_report_vfork_done(struct task_struct *child, ptrace_event(PT_TRACE_VFORK_DONE, PTRACE_EVENT_VFORK_DONE, pid); } +/** + * tracehook_prepare_release_task - task is being reaped, clean up tracing + * @task: task in %EXIT_DEAD state + * + * This is called in release_task() just before @task gets finally reaped + * and freed. This would be the ideal place to remove and clean up any + * tracing-related state for @task. + * + * Called with no locks held. + */ +static inline void tracehook_prepare_release_task(struct task_struct *task) +{ +} + +/** + * tracehook_finish_release_task - task is being reaped, clean up tracing + * @task: task in %EXIT_DEAD state + * + * This is called in release_task() when @task is being in the middle of + * being reaped. After this, there must be no tracing entanglements. + * + * Called with write_lock_irq(&tasklist_lock) held. + */ +static inline void tracehook_finish_release_task(struct task_struct *task) +{ + ptrace_release_task(task); +} + #endif /* */ diff --git a/kernel/exit.c b/kernel/exit.c index c3691cbc220..da28745f7c3 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -163,27 +163,17 @@ static void delayed_put_task_struct(struct rcu_head *rhp) put_task_struct(container_of(rhp, struct task_struct, rcu)); } -/* - * Do final ptrace-related cleanup of a zombie being reaped. - * - * Called with write_lock(&tasklist_lock) held. - */ -static void ptrace_release_task(struct task_struct *p) -{ - BUG_ON(!list_empty(&p->ptraced)); - ptrace_unlink(p); - BUG_ON(!list_empty(&p->ptrace_entry)); -} void release_task(struct task_struct * p) { struct task_struct *leader; int zap_leader; repeat: + tracehook_prepare_release_task(p); atomic_dec(&p->user->processes); proc_flush_task(p); write_lock_irq(&tasklist_lock); - ptrace_release_task(p); + tracehook_finish_release_task(p); __exit_signal(p); /* @@ -205,6 +195,13 @@ repeat: * that case. */ zap_leader = task_detached(leader); + + /* + * This maintains the invariant that release_task() + * only runs on a task in EXIT_DEAD, just for sanity. + */ + if (zap_leader) + leader->exit_state = EXIT_DEAD; } write_unlock_irq(&tasklist_lock); -- cgit v1.2.3-70-g09d2 From 0d094efeb1e98010c6b99923f1eb7e17bf1e3a74 Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Fri, 25 Jul 2008 19:45:49 -0700 Subject: tracehook: tracehook_tracer_task This adds the tracehook_tracer_task() hook to consolidate all forms of "Who is using ptrace on me?" logic. This is used for "TracerPid:" in /proc and for permission checks. We also clean up the selinux code the called an identical accessor. Signed-off-by: Roland McGrath Cc: Oleg Nesterov Reviewed-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- fs/proc/array.c | 9 +++++++-- fs/proc/base.c | 13 +++++++++---- include/linux/tracehook.h | 18 ++++++++++++++++++ security/selinux/hooks.c | 22 +++------------------- 4 files changed, 37 insertions(+), 25 deletions(-) (limited to 'include/linux/tracehook.h') diff --git a/fs/proc/array.c b/fs/proc/array.c index 797d775e035..0d6eb33597c 100644 --- a/fs/proc/array.c +++ b/fs/proc/array.c @@ -80,6 +80,7 @@ #include #include #include +#include #include #include @@ -168,8 +169,12 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns, rcu_read_lock(); ppid = pid_alive(p) ? task_tgid_nr_ns(rcu_dereference(p->real_parent), ns) : 0; - tpid = pid_alive(p) && p->ptrace ? - task_pid_nr_ns(rcu_dereference(p->parent), ns) : 0; + tpid = 0; + if (pid_alive(p)) { + struct task_struct *tracer = tracehook_tracer_task(p); + if (tracer) + tpid = task_pid_nr_ns(tracer, ns); + } seq_printf(m, "State:\t%s\n" "Tgid:\t%d\n" diff --git a/fs/proc/base.c b/fs/proc/base.c index a891fe4cb43..4b74dba69a6 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -69,6 +69,7 @@ #include #include #include +#include #include #include #include @@ -231,10 +232,14 @@ static int check_mem_permission(struct task_struct *task) * If current is actively ptrace'ing, and would also be * permitted to freshly attach with ptrace now, permit it. */ - if (task->parent == current && (task->ptrace & PT_PTRACED) && - task_is_stopped_or_traced(task) && - ptrace_may_access(task, PTRACE_MODE_ATTACH)) - return 0; + if (task_is_stopped_or_traced(task)) { + int match; + rcu_read_lock(); + match = (tracehook_tracer_task(task) == current); + rcu_read_unlock(); + if (match && ptrace_may_access(task, PTRACE_MODE_ATTACH)) + return 0; + } /* * Noone else is allowed. diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index 9a5b3be2503..6468ca0fe69 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -72,6 +72,24 @@ static inline int tracehook_unsafe_exec(struct task_struct *task) return unsafe; } +/** + * tracehook_tracer_task - return the task that is tracing the given task + * @tsk: task to consider + * + * Returns NULL if noone is tracing @task, or the &struct task_struct + * pointer to its tracer. + * + * Must called under rcu_read_lock(). The pointer returned might be kept + * live only by RCU. During exec, this may be called with task_lock() + * held on @task, still held from when tracehook_unsafe_exec() was called. + */ +static inline struct task_struct *tracehook_tracer_task(struct task_struct *tsk) +{ + if (task_ptrace(tsk) & PT_PTRACED) + return rcu_dereference(tsk->parent); + return NULL; +} + /** * tracehook_report_exec - a successful exec was completed * @fmt: &struct linux_binfmt that performed the exec diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 63f131fc42e..3481cde5bf1 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -25,7 +25,7 @@ #include #include -#include +#include #include #include #include @@ -1971,22 +1971,6 @@ static int selinux_vm_enough_memory(struct mm_struct *mm, long pages) return __vm_enough_memory(mm, pages, cap_sys_admin); } -/** - * task_tracer_task - return the task that is tracing the given task - * @task: task to consider - * - * Returns NULL if noone is tracing @task, or the &struct task_struct - * pointer to its tracer. - * - * Must be called under rcu_read_lock(). - */ -static struct task_struct *task_tracer_task(struct task_struct *task) -{ - if (task->ptrace & PT_PTRACED) - return rcu_dereference(task->parent); - return NULL; -} - /* binprm security operations */ static int selinux_bprm_alloc_security(struct linux_binprm *bprm) @@ -2238,7 +2222,7 @@ static void selinux_bprm_apply_creds(struct linux_binprm *bprm, int unsafe) u32 ptsid = 0; rcu_read_lock(); - tracer = task_tracer_task(current); + tracer = tracehook_tracer_task(current); if (likely(tracer != NULL)) { sec = tracer->security; ptsid = sec->sid; @@ -5247,7 +5231,7 @@ static int selinux_setprocattr(struct task_struct *p, Otherwise, leave SID unchanged and fail. */ task_lock(p); rcu_read_lock(); - tracer = task_tracer_task(p); + tracer = tracehook_tracer_task(p); if (tracer != NULL) { struct task_security_struct *ptsec = tracer->security; u32 ptsid = ptsec->sid; -- cgit v1.2.3-70-g09d2 From fa8e26ccd485216fc45c8c2dd1ec3b7ef1a0a2f8 Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Fri, 25 Jul 2008 19:45:50 -0700 Subject: tracehook: tracehook_expect_breakpoints This adds tracehook_expect_breakpoints() as a formal hook for the nommu code to use for its, "Is text-poking likely?" check at mmap time. This names the actual semantics the code means to test, and documents it. Signed-off-by: Roland McGrath Cc: Oleg Nesterov Reviewed-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/tracehook.h | 15 +++++++++++++++ mm/nommu.c | 4 ++-- 2 files changed, 17 insertions(+), 2 deletions(-) (limited to 'include/linux/tracehook.h') diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index 6468ca0fe69..e113e09b034 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -51,6 +51,21 @@ #include struct linux_binprm; +/** + * tracehook_expect_breakpoints - guess if task memory might be touched + * @task: current task, making a new mapping + * + * Return nonzero if @task is expected to want breakpoint insertion in + * its memory at some point. A zero return is no guarantee it won't + * be done, but this is a hint that it's known to be likely. + * + * May be called with @task->mm->mmap_sem held for writing. + */ +static inline int tracehook_expect_breakpoints(struct task_struct *task) +{ + return (task_ptrace(task) & PT_PTRACED) != 0; +} + /** * tracehook_unsafe_exec - check for exec declared unsafe due to tracing * @task: current task doing exec diff --git a/mm/nommu.c b/mm/nommu.c index 4462b6a3fcb..5edccd9c921 100644 --- a/mm/nommu.c +++ b/mm/nommu.c @@ -22,7 +22,7 @@ #include #include #include -#include +#include #include #include #include @@ -745,7 +745,7 @@ static unsigned long determine_vm_flags(struct file *file, * it's being traced - otherwise breakpoints set in it may interfere * with another untraced process */ - if ((flags & MAP_PRIVATE) && (current->ptrace & PT_PTRACED)) + if ((flags & MAP_PRIVATE) && tracehook_expect_breakpoints(current)) vm_flags &= ~VM_MAYSHARE; return vm_flags; -- cgit v1.2.3-70-g09d2 From c45aea27617d6a1e0aacddc3b0233f704222fcbd Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Fri, 25 Jul 2008 19:45:50 -0700 Subject: tracehook: tracehook_signal_handler This defines tracehook_signal_handler() as a hook for the arch signal handling code to call. It gives ptrace the opportunity to stop for a pseudo-single-step trap immediately after signal handler setup is done. Signed-off-by: Roland McGrath Cc: Oleg Nesterov Reviewed-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/tracehook.h | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) (limited to 'include/linux/tracehook.h') diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index e113e09b034..2d1426f8e33 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -289,4 +289,27 @@ static inline void tracehook_finish_release_task(struct task_struct *task) ptrace_release_task(task); } +/** + * tracehook_signal_handler - signal handler setup is complete + * @sig: number of signal being delivered + * @info: siginfo_t of signal being delivered + * @ka: sigaction setting that chose the handler + * @regs: user register state + * @stepping: nonzero if debugger single-step or block-step in use + * + * Called by the arch code after a signal handler has been set up. + * Register and stack state reflects the user handler about to run. + * Signal mask changes have already been made. + * + * Called without locks, shortly before returning to user mode + * (or handling more signals). + */ +static inline void tracehook_signal_handler(int sig, siginfo_t *info, + const struct k_sigaction *ka, + struct pt_regs *regs, int stepping) +{ + if (stepping) + ptrace_notify(SIGTRAP); +} + #endif /* */ -- cgit v1.2.3-70-g09d2 From 35de254dc60f91004b3b5ebb1fc7b2c3093d6032 Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Fri, 25 Jul 2008 19:45:51 -0700 Subject: tracehook: tracehook_consider_ignored_signal This defines tracehook_consider_ignored_signal() has a fine-grained hook for deciding to prevent the normal short-circuit of sending an ignored signal, as ptrace does. There is no change, only cleanup. Signed-off-by: Roland McGrath Cc: Oleg Nesterov Reviewed-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/tracehook.h | 19 +++++++++++++++++++ kernel/signal.c | 27 ++++++++++++++++----------- 2 files changed, 35 insertions(+), 11 deletions(-) (limited to 'include/linux/tracehook.h') diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index 2d1426f8e33..8cffd34f88d 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -312,4 +312,23 @@ static inline void tracehook_signal_handler(int sig, siginfo_t *info, ptrace_notify(SIGTRAP); } +/** + * tracehook_consider_ignored_signal - suppress short-circuit of ignored signal + * @task: task receiving the signal + * @sig: signal number being sent + * @handler: %SIG_IGN or %SIG_DFL + * + * Return zero iff tracing doesn't care to examine this ignored signal, + * so it can short-circuit normal delivery and never even get queued. + * Either @handler is %SIG_DFL and @sig's default is ignore, or it's %SIG_IGN. + * + * Called with @task->sighand->siglock held. + */ +static inline int tracehook_consider_ignored_signal(struct task_struct *task, + int sig, + void __user *handler) +{ + return (task_ptrace(task) & PT_PTRACED) != 0; +} + #endif /* */ diff --git a/kernel/signal.c b/kernel/signal.c index 8715c18b27b..9efd1cee6d0 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -22,6 +22,7 @@ #include #include #include +#include #include #include #include @@ -39,24 +40,21 @@ static struct kmem_cache *sigqueue_cachep; -static int __sig_ignored(struct task_struct *t, int sig) +static void __user *sig_handler(struct task_struct *t, int sig) { - void __user *handler; + return t->sighand->action[sig - 1].sa.sa_handler; +} +static int sig_handler_ignored(void __user *handler, int sig) +{ /* Is it explicitly or implicitly ignored? */ - - handler = t->sighand->action[sig - 1].sa.sa_handler; return handler == SIG_IGN || (handler == SIG_DFL && sig_kernel_ignore(sig)); } static int sig_ignored(struct task_struct *t, int sig) { - /* - * Tracers always want to know about signals.. - */ - if (t->ptrace & PT_PTRACED) - return 0; + void __user *handler; /* * Blocked signals are never ignored, since the @@ -66,7 +64,14 @@ static int sig_ignored(struct task_struct *t, int sig) if (sigismember(&t->blocked, sig) || sigismember(&t->real_blocked, sig)) return 0; - return __sig_ignored(t, sig); + handler = sig_handler(t, sig); + if (!sig_handler_ignored(handler, sig)) + return 0; + + /* + * Tracers may want to know about even ignored signals. + */ + return !tracehook_consider_ignored_signal(t, sig, handler); } /* @@ -2298,7 +2303,7 @@ int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact) * (for example, SIGCHLD), shall cause the pending signal to * be discarded, whether or not it is blocked" */ - if (__sig_ignored(t, sig)) { + if (sig_handler_ignored(sig_handler(t, sig), sig)) { sigemptyset(&mask); sigaddset(&mask, sig); rm_from_queue_full(&mask, &t->signal->shared_pending); -- cgit v1.2.3-70-g09d2 From 445a91d2fe3667fb8fc251433645f686933cf56a Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Fri, 25 Jul 2008 19:45:52 -0700 Subject: tracehook: tracehook_consider_fatal_signal This defines tracehook_consider_fatal_signal() has a fine-grained hook for deciding to skip the special cases for a fatal signal, as ptrace does. There is no change, only cleanup. Signed-off-by: Roland McGrath Cc: Oleg Nesterov Reviewed-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/tracehook.h | 21 +++++++++++++++++++++ kernel/signal.c | 9 +++++---- 2 files changed, 26 insertions(+), 4 deletions(-) (limited to 'include/linux/tracehook.h') diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index 8cffd34f88d..8b4c15e208f 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -331,4 +331,25 @@ static inline int tracehook_consider_ignored_signal(struct task_struct *task, return (task_ptrace(task) & PT_PTRACED) != 0; } +/** + * tracehook_consider_fatal_signal - suppress special handling of fatal signal + * @task: task receiving the signal + * @sig: signal number being sent + * @handler: %SIG_DFL or %SIG_IGN + * + * Return nonzero to prevent special handling of this termination signal. + * Normally @handler is %SIG_DFL. It can be %SIG_IGN if @sig is ignored, + * in which case force_sig() is about to reset it to %SIG_DFL. + * When this returns zero, this signal might cause a quick termination + * that does not give the debugger a chance to intercept the signal. + * + * Called with or without @task->sighand->siglock held. + */ +static inline int tracehook_consider_fatal_signal(struct task_struct *task, + int sig, + void __user *handler) +{ + return (task_ptrace(task) & PT_PTRACED) != 0; +} + #endif /* */ diff --git a/kernel/signal.c b/kernel/signal.c index 9efd1cee6d0..1a942ce32ba 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -300,12 +300,12 @@ flush_signal_handlers(struct task_struct *t, int force_default) int unhandled_signal(struct task_struct *tsk, int sig) { + void __user *handler = tsk->sighand->action[sig-1].sa.sa_handler; if (is_global_init(tsk)) return 1; - if (tsk->ptrace & PT_PTRACED) + if (handler != SIG_IGN && handler != SIG_DFL) return 0; - return (tsk->sighand->action[sig-1].sa.sa_handler == SIG_IGN) || - (tsk->sighand->action[sig-1].sa.sa_handler == SIG_DFL); + return !tracehook_consider_fatal_signal(tsk, sig, handler); } @@ -761,7 +761,8 @@ static void complete_signal(int sig, struct task_struct *p, int group) if (sig_fatal(p, sig) && !(signal->flags & (SIGNAL_UNKILLABLE | SIGNAL_GROUP_EXIT)) && !sigismember(&t->real_blocked, sig) && - (sig == SIGKILL || !(t->ptrace & PT_PTRACED))) { + (sig == SIGKILL || + !tracehook_consider_fatal_signal(t, sig, SIG_DFL))) { /* * This signal will be fatal to the whole group. */ -- cgit v1.2.3-70-g09d2 From 283d7559e7712f95a05331eb0a85394c6368101b Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Fri, 25 Jul 2008 19:45:52 -0700 Subject: tracehook: syscall This adds standard tracehook.h inlines for arch code to call when TIF_SYSCALL_TRACE has been set. This replaces having each arch implement the ptrace guts for its syscall tracing support. Signed-off-by: Roland McGrath Cc: Oleg Nesterov Reviewed-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/tracehook.h | 70 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 70 insertions(+) (limited to 'include/linux/tracehook.h') diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index 8b4c15e208f..3548694a24d 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -66,6 +66,76 @@ static inline int tracehook_expect_breakpoints(struct task_struct *task) return (task_ptrace(task) & PT_PTRACED) != 0; } +/* + * ptrace report for syscall entry and exit looks identical. + */ +static inline void ptrace_report_syscall(struct pt_regs *regs) +{ + int ptrace = task_ptrace(current); + + if (!(ptrace & PT_PTRACED)) + return; + + ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0)); + + /* + * this isn't the same as continuing with a signal, but it will do + * for normal use. strace only continues with a signal if the + * stopping signal is not SIGTRAP. -brl + */ + if (current->exit_code) { + send_sig(current->exit_code, current, 1); + current->exit_code = 0; + } +} + +/** + * tracehook_report_syscall_entry - task is about to attempt a system call + * @regs: user register state of current task + * + * This will be called if %TIF_SYSCALL_TRACE has been set, when the + * current task has just entered the kernel for a system call. + * Full user register state is available here. Changing the values + * in @regs can affect the system call number and arguments to be tried. + * It is safe to block here, preventing the system call from beginning. + * + * Returns zero normally, or nonzero if the calling arch code should abort + * the system call. That must prevent normal entry so no system call is + * made. If @task ever returns to user mode after this, its register state + * is unspecified, but should be something harmless like an %ENOSYS error + * return. + * + * Called without locks, just after entering kernel mode. + */ +static inline __must_check int tracehook_report_syscall_entry( + struct pt_regs *regs) +{ + ptrace_report_syscall(regs); + return 0; +} + +/** + * tracehook_report_syscall_exit - task has just finished a system call + * @regs: user register state of current task + * @step: nonzero if simulating single-step or block-step + * + * This will be called if %TIF_SYSCALL_TRACE has been set, when the + * current task has just finished an attempted system call. Full + * user register state is available here. It is safe to block here, + * preventing signals from being processed. + * + * If @step is nonzero, this report is also in lieu of the normal + * trap that would follow the system call instruction because + * user_enable_block_step() or user_enable_single_step() was used. + * In this case, %TIF_SYSCALL_TRACE might not be set. + * + * Called without locks, just before checking for pending signals. + */ +static inline void tracehook_report_syscall_exit(struct pt_regs *regs, int step) +{ + ptrace_report_syscall(regs); +} + /** * tracehook_unsafe_exec - check for exec declared unsafe due to tracing * @task: current task doing exec -- cgit v1.2.3-70-g09d2 From 7bcf6a2ca5f639b038c48711ebe6c4eca2036641 Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Fri, 25 Jul 2008 19:45:53 -0700 Subject: tracehook: get_signal_to_deliver This defines the tracehook_get_signal() hook to allow tracing code to slip in before normal signal dequeuing. This lays the groundwork for new tracing features that can inject synthetic signals outside the normal queue or control the disposition of delivered signals. The calling convention lets tracehook_get_signal() decide both exactly what will happen and what signal number to report in the handler/exit. Signed-off-by: Roland McGrath Cc: Oleg Nesterov Reviewed-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/tracehook.h | 29 +++++++++++++++++++++++++++++ kernel/signal.c | 38 +++++++++++++++++++++++++++----------- 2 files changed, 56 insertions(+), 11 deletions(-) (limited to 'include/linux/tracehook.h') diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index 3548694a24d..42a0d7b1195 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -422,4 +422,33 @@ static inline int tracehook_consider_fatal_signal(struct task_struct *task, return (task_ptrace(task) & PT_PTRACED) != 0; } +/** + * tracehook_get_signal - deliver synthetic signal to traced task + * @task: @current + * @regs: task_pt_regs(@current) + * @info: details of synthetic signal + * @return_ka: sigaction for synthetic signal + * + * Return zero to check for a real pending signal normally. + * Return -1 after releasing the siglock to repeat the check. + * Return a signal number to induce an artifical signal delivery, + * setting *@info and *@return_ka to specify its details and behavior. + * + * The @return_ka->sa_handler value controls the disposition of the + * signal, no matter the signal number. For %SIG_DFL, the return value + * is a representative signal to indicate the behavior (e.g. %SIGTERM + * for death, %SIGQUIT for core dump, %SIGSTOP for job control stop, + * %SIGTSTP for stop unless in an orphaned pgrp), but the signal number + * reported will be @info->si_signo instead. + * + * Called with @task->sighand->siglock held, before dequeuing pending signals. + */ +static inline int tracehook_get_signal(struct task_struct *task, + struct pt_regs *regs, + siginfo_t *info, + struct k_sigaction *return_ka) +{ + return 0; +} + #endif /* */ diff --git a/kernel/signal.c b/kernel/signal.c index 1a942ce32ba..10b31ecdd9c 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -1754,17 +1754,33 @@ relock: do_signal_stop(0)) goto relock; - signr = dequeue_signal(current, ¤t->blocked, info); - if (!signr) - break; /* will return 0 */ + /* + * Tracing can induce an artifical signal and choose sigaction. + * The return value in @signr determines the default action, + * but @info->si_signo is the signal number we will report. + */ + signr = tracehook_get_signal(current, regs, info, return_ka); + if (unlikely(signr < 0)) + goto relock; + if (unlikely(signr != 0)) + ka = return_ka; + else { + signr = dequeue_signal(current, ¤t->blocked, + info); - if (signr != SIGKILL) { - signr = ptrace_signal(signr, info, regs, cookie); if (!signr) - continue; + break; /* will return 0 */ + + if (signr != SIGKILL) { + signr = ptrace_signal(signr, info, + regs, cookie); + if (!signr) + continue; + } + + ka = &sighand->action[signr-1]; } - ka = &sighand->action[signr-1]; if (ka->sa.sa_handler == SIG_IGN) /* Do nothing. */ continue; if (ka->sa.sa_handler != SIG_DFL) { @@ -1812,7 +1828,7 @@ relock: spin_lock_irq(&sighand->siglock); } - if (likely(do_signal_stop(signr))) { + if (likely(do_signal_stop(info->si_signo))) { /* It released the siglock. */ goto relock; } @@ -1833,7 +1849,7 @@ relock: if (sig_kernel_coredump(signr)) { if (print_fatal_signals) - print_fatal_signal(regs, signr); + print_fatal_signal(regs, info->si_signo); /* * If it was able to dump core, this kills all * other threads in the group and synchronizes with @@ -1842,13 +1858,13 @@ relock: * first and our do_group_exit call below will use * that value and ignore the one we pass it. */ - do_coredump((long)signr, signr, regs); + do_coredump(info->si_signo, info->si_signo, regs); } /* * Death signals, no core dump. */ - do_group_exit(signr); + do_group_exit(info->si_signo); /* NOTREACHED */ } spin_unlock_irq(&sighand->siglock); -- cgit v1.2.3-70-g09d2 From fa00b80b3c41a845b3d56f866fb40a2e98754c51 Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Fri, 25 Jul 2008 19:45:54 -0700 Subject: tracehook: job control This defines the tracehook_notify_jctl() hook to formalize the ptrace effects on the job control notifications. There is no change, only cleanup. Signed-off-by: Roland McGrath Cc: Oleg Nesterov Reviewed-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/tracehook.h | 20 ++++++++++++++++++++ kernel/signal.c | 10 +++++----- 2 files changed, 25 insertions(+), 5 deletions(-) (limited to 'include/linux/tracehook.h') diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index 42a0d7b1195..6dc428dd2f3 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -451,4 +451,24 @@ static inline int tracehook_get_signal(struct task_struct *task, return 0; } +/** + * tracehook_notify_jctl - report about job control stop/continue + * @notify: nonzero if this is the last thread in the group to stop + * @why: %CLD_STOPPED or %CLD_CONTINUED + * + * This is called when we might call do_notify_parent_cldstop(). + * It's called when about to stop for job control; we are already in + * %TASK_STOPPED state, about to call schedule(). It's also called when + * a delayed %CLD_STOPPED or %CLD_CONTINUED report is ready to be made. + * + * Return nonzero to generate a %SIGCHLD with @why, which is + * normal if @notify is nonzero. + * + * Called with no locks held. + */ +static inline int tracehook_notify_jctl(int notify, int why) +{ + return notify || (current->ptrace & PT_PTRACED); +} + #endif /* */ diff --git a/kernel/signal.c b/kernel/signal.c index 10b31ecdd9c..e9e699f4b1b 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -596,9 +596,6 @@ static int check_kill_permission(int sig, struct siginfo *info, return security_task_kill(t, info, sig, 0); } -/* forward decl */ -static void do_notify_parent_cldstop(struct task_struct *tsk, int why); - /* * Handle magic process-wide effects of stop/continue signals. Unlike * the signal actions, these happen immediately at signal-generation @@ -1605,7 +1602,7 @@ finish_stop(int stop_count) * a group stop in progress and we are the last to stop, * report to the parent. When ptraced, every thread reports itself. */ - if (stop_count == 0 || (current->ptrace & PT_PTRACED)) { + if (tracehook_notify_jctl(stop_count == 0, CLD_STOPPED)) { read_lock(&tasklist_lock); do_notify_parent_cldstop(current, CLD_STOPPED); read_unlock(&tasklist_lock); @@ -1741,6 +1738,9 @@ relock: signal->flags &= ~SIGNAL_CLD_MASK; spin_unlock_irq(&sighand->siglock); + if (unlikely(!tracehook_notify_jctl(1, why))) + goto relock; + read_lock(&tasklist_lock); do_notify_parent_cldstop(current->group_leader, why); read_unlock(&tasklist_lock); @@ -1906,7 +1906,7 @@ void exit_signals(struct task_struct *tsk) out: spin_unlock_irq(&tsk->sighand->siglock); - if (unlikely(group_stop)) { + if (unlikely(group_stop) && tracehook_notify_jctl(1, CLD_STOPPED)) { read_lock(&tasklist_lock); do_notify_parent_cldstop(tsk, CLD_STOPPED); read_unlock(&tasklist_lock); -- cgit v1.2.3-70-g09d2 From 2b2a1ff64afbadac842bbc58c5166962cf4f7664 Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Fri, 25 Jul 2008 19:45:54 -0700 Subject: tracehook: death This moves the ptrace logic in task death (exit_notify) into tracehook.h inlines. Some code is rearranged slightly to make things nicer. There is no change, only cleanup. There is one hook called with the tasklist_lock write-locked, as ptrace needs. There is also a new hook called after exit_state changes and without locks. This is a better place for tracing work to be in the future, since it doesn't delay the whole system with locking. Signed-off-by: Roland McGrath Cc: Oleg Nesterov Reviewed-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/sched.h | 2 +- include/linux/tracehook.h | 52 +++++++++++++++++++++++++++++++++++++++++++++++ kernel/exit.c | 26 ++++++++---------------- kernel/signal.c | 10 ++++++--- 4 files changed, 69 insertions(+), 21 deletions(-) (limited to 'include/linux/tracehook.h') diff --git a/include/linux/sched.h b/include/linux/sched.h index adb8077dc46..a95d84d0da9 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1796,7 +1796,7 @@ extern int kill_pid_info_as_uid(int, struct siginfo *, struct pid *, uid_t, uid_ extern int kill_pgrp(struct pid *pid, int sig, int priv); extern int kill_pid(struct pid *pid, int sig, int priv); extern int kill_proc_info(int, struct siginfo *, pid_t); -extern void do_notify_parent(struct task_struct *, int); +extern int do_notify_parent(struct task_struct *, int); extern void force_sig(int, struct task_struct *); extern void force_sig_specific(int, struct task_struct *); extern int send_sig(int, struct task_struct *, int); diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index 6dc428dd2f3..4c50e1b5734 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -471,4 +471,56 @@ static inline int tracehook_notify_jctl(int notify, int why) return notify || (current->ptrace & PT_PTRACED); } +/** + * tracehook_notify_death - task is dead, ready to notify parent + * @task: @current task now exiting + * @death_cookie: value to pass to tracehook_report_death() + * @group_dead: nonzero if this was the last thread in the group to die + * + * Return the signal number to send our parent with do_notify_parent(), or + * zero to send no signal and leave a zombie, or -1 to self-reap right now. + * + * Called with write_lock_irq(&tasklist_lock) held. + */ +static inline int tracehook_notify_death(struct task_struct *task, + void **death_cookie, int group_dead) +{ + if (task->exit_signal == -1) + return task->ptrace ? SIGCHLD : -1; + + /* + * If something other than our normal parent is ptracing us, then + * send it a SIGCHLD instead of honoring exit_signal. exit_signal + * only has special meaning to our real parent. + */ + if (thread_group_empty(task) && !ptrace_reparented(task)) + return task->exit_signal; + + return task->ptrace ? SIGCHLD : 0; +} + +/** + * tracehook_report_death - task is dead and ready to be reaped + * @task: @current task now exiting + * @signal: signal number sent to parent, or 0 or -1 + * @death_cookie: value passed back from tracehook_notify_death() + * @group_dead: nonzero if this was the last thread in the group to die + * + * Thread has just become a zombie or is about to self-reap. If positive, + * @signal is the signal number just sent to the parent (usually %SIGCHLD). + * If @signal is -1, this thread will self-reap. If @signal is 0, this is + * a delayed_group_leader() zombie. The @death_cookie was passed back by + * tracehook_notify_death(). + * + * If normal reaping is not inhibited, @task->exit_state might be changing + * in parallel. + * + * Called without locks. + */ +static inline void tracehook_report_death(struct task_struct *task, + int signal, void *death_cookie, + int group_dead) +{ +} + #endif /* */ diff --git a/kernel/exit.c b/kernel/exit.c index da28745f7c3..6cdf60712bd 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -885,7 +885,8 @@ static void forget_original_parent(struct task_struct *father) */ static void exit_notify(struct task_struct *tsk, int group_dead) { - int state; + int signal; + void *cookie; /* * This does two things: @@ -922,22 +923,11 @@ static void exit_notify(struct task_struct *tsk, int group_dead) !capable(CAP_KILL)) tsk->exit_signal = SIGCHLD; - /* If something other than our normal parent is ptracing us, then - * send it a SIGCHLD instead of honoring exit_signal. exit_signal - * only has special meaning to our real parent. - */ - if (!task_detached(tsk) && thread_group_empty(tsk)) { - int signal = ptrace_reparented(tsk) ? - SIGCHLD : tsk->exit_signal; - do_notify_parent(tsk, signal); - } else if (tsk->ptrace) { - do_notify_parent(tsk, SIGCHLD); - } + signal = tracehook_notify_death(tsk, &cookie, group_dead); + if (signal > 0) + signal = do_notify_parent(tsk, signal); - state = EXIT_ZOMBIE; - if (task_detached(tsk) && likely(!tsk->ptrace)) - state = EXIT_DEAD; - tsk->exit_state = state; + tsk->exit_state = signal < 0 ? EXIT_DEAD : EXIT_ZOMBIE; /* mt-exec, de_thread() is waiting for us */ if (thread_group_leader(tsk) && @@ -947,8 +937,10 @@ static void exit_notify(struct task_struct *tsk, int group_dead) write_unlock_irq(&tasklist_lock); + tracehook_report_death(tsk, signal, cookie, group_dead); + /* If the process is dead, release it - nobody will wait for it */ - if (state == EXIT_DEAD) + if (signal < 0) release_task(tsk); } diff --git a/kernel/signal.c b/kernel/signal.c index e9e699f4b1b..0e862d3130f 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -1326,9 +1326,11 @@ static inline void __wake_up_parent(struct task_struct *p, /* * Let a parent know about the death of a child. * For a stopped/continued status change, use do_notify_parent_cldstop instead. + * + * Returns -1 if our parent ignored us and so we've switched to + * self-reaping, or else @sig. */ - -void do_notify_parent(struct task_struct *tsk, int sig) +int do_notify_parent(struct task_struct *tsk, int sig) { struct siginfo info; unsigned long flags; @@ -1399,12 +1401,14 @@ void do_notify_parent(struct task_struct *tsk, int sig) */ tsk->exit_signal = -1; if (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN) - sig = 0; + sig = -1; } if (valid_signal(sig) && sig > 0) __group_send_sig_info(sig, &info, tsk->parent); __wake_up_parent(tsk, tsk->parent); spin_unlock_irqrestore(&psig->siglock, flags); + + return sig; } static void do_notify_parent_cldstop(struct task_struct *tsk, int why) -- cgit v1.2.3-70-g09d2 From b787f7ba677840da16a2228c16571ce8a1fcb799 Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Fri, 25 Jul 2008 19:45:55 -0700 Subject: tracehook: force signal_pending() This defines a new hook tracehook_force_sigpending() that lets tracing code decide to force TIF_SIGPENDING on in recalc_sigpending(). This is not used yet, so it compiles away to nothing for now. It lays the groundwork for new tracing code that can interrupt a task synthetically without actually sending a signal. Signed-off-by: Roland McGrath Cc: Oleg Nesterov Reviewed-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/tracehook.h | 14 ++++++++++++++ kernel/signal.c | 4 +++- 2 files changed, 17 insertions(+), 1 deletion(-) (limited to 'include/linux/tracehook.h') diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index 4c50e1b5734..43bc51b6bd3 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -422,6 +422,20 @@ static inline int tracehook_consider_fatal_signal(struct task_struct *task, return (task_ptrace(task) & PT_PTRACED) != 0; } +/** + * tracehook_force_sigpending - let tracing force signal_pending(current) on + * + * Called when recomputing our signal_pending() flag. Return nonzero + * to force the signal_pending() flag on, so that tracehook_get_signal() + * will be called before the next return to user mode. + * + * Called with @current->sighand->siglock held. + */ +static inline int tracehook_force_sigpending(void) +{ + return 0; +} + /** * tracehook_get_signal - deliver synthetic signal to traced task * @task: @current diff --git a/kernel/signal.c b/kernel/signal.c index 0e862d3130f..954f77d7e3b 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -134,7 +134,9 @@ void recalc_sigpending_and_wake(struct task_struct *t) void recalc_sigpending(void) { - if (!recalc_sigpending_tsk(current) && !freezing(current)) + if (unlikely(tracehook_force_sigpending())) + set_thread_flag(TIF_SIGPENDING); + else if (!recalc_sigpending_tsk(current) && !freezing(current)) clear_thread_flag(TIF_SIGPENDING); } -- cgit v1.2.3-70-g09d2 From 64b1208d5b0ef8859fd52ea7ae286a3eb994669b Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Fri, 25 Jul 2008 19:45:56 -0700 Subject: tracehook: TIF_NOTIFY_RESUME This adds tracehook.h inlines to enable a new arch feature in support of user debugging/tracing. This is not used yet, but it lays the groundwork for a debugger to be able to wrangle a task that's possibly running, without interrupting its syscalls in progress. Each arch should define TIF_NOTIFY_RESUME, and in their entry.S code treat it much like TIF_SIGPENDING. That is, it causes you to take the slow path when returning to user mode, where you get the full user-mode state accessible as for signal handling or ptrace. The arch code should check TIF_NOTIFY_RESUME after handling TIF_SIGPENDING. When it's set, clear it and then call tracehook_notify_resume(). In future, tracing code will call set_notify_resume() when it wants to get a callback in tracehook_notify_resume(). Signed-off-by: Roland McGrath Cc: Oleg Nesterov Reviewed-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/tracehook.h | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) (limited to 'include/linux/tracehook.h') diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index 43bc51b6bd3..32867ab86c7 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -537,4 +537,38 @@ static inline void tracehook_report_death(struct task_struct *task, { } +#ifdef TIF_NOTIFY_RESUME +/** + * set_notify_resume - cause tracehook_notify_resume() to be called + * @task: task that will call tracehook_notify_resume() + * + * Calling this arranges that @task will call tracehook_notify_resume() + * before returning to user mode. If it's already running in user mode, + * it will enter the kernel and call tracehook_notify_resume() soon. + * If it's blocked, it will not be woken. + */ +static inline void set_notify_resume(struct task_struct *task) +{ + if (!test_and_set_tsk_thread_flag(task, TIF_NOTIFY_RESUME)) + kick_process(task); +} + +/** + * tracehook_notify_resume - report when about to return to user mode + * @regs: user-mode registers of @current task + * + * This is called when %TIF_NOTIFY_RESUME has been set. Now we are + * about to return to user mode, and the user state in @regs can be + * inspected or adjusted. The caller in arch code has cleared + * %TIF_NOTIFY_RESUME before the call. If the flag gets set again + * asynchronously, this will be called again before we return to + * user mode. + * + * Called without locks. + */ +static inline void tracehook_notify_resume(struct pt_regs *regs) +{ +} +#endif /* TIF_NOTIFY_RESUME */ + #endif /* */ -- cgit v1.2.3-70-g09d2 From 828c365cc8b8d38c346fccb19fa80d28f2240831 Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Fri, 25 Jul 2008 19:45:57 -0700 Subject: tracehook: asm/syscall.h This adds asm-generic/syscall.h, which documents what a real asm-ARCH/syscall.h file should define. This is not used yet, but will provide all the machine-dependent details of examining a user system call about to begin, in progress, or just ended. Each arch should add an asm-ARCH/syscall.h that defines all the entry points documented in asm-generic/syscall.h, as short inlines if possible. This lets us write new tracing code that understands user system call registers, without any new arch-specific work. Signed-off-by: Roland McGrath Cc: Oleg Nesterov Reviewed-by: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/asm-generic/syscall.h | 141 ++++++++++++++++++++++++++++++++++++++++++ include/linux/tracehook.h | 3 +- 2 files changed, 143 insertions(+), 1 deletion(-) create mode 100644 include/asm-generic/syscall.h (limited to 'include/linux/tracehook.h') diff --git a/include/asm-generic/syscall.h b/include/asm-generic/syscall.h new file mode 100644 index 00000000000..abcf34c2fdc --- /dev/null +++ b/include/asm-generic/syscall.h @@ -0,0 +1,141 @@ +/* + * Access to user system call parameters and results + * + * Copyright (C) 2008 Red Hat, Inc. All rights reserved. + * + * This copyrighted material is made available to anyone wishing to use, + * modify, copy, or redistribute it subject to the terms and conditions + * of the GNU General Public License v.2. + * + * This file is a stub providing documentation for what functions + * asm-ARCH/syscall.h files need to define. Most arch definitions + * will be simple inlines. + * + * All of these functions expect to be called with no locks, + * and only when the caller is sure that the task of interest + * cannot return to user mode while we are looking at it. + */ + +#ifndef _ASM_SYSCALL_H +#define _ASM_SYSCALL_H 1 + +struct task_struct; +struct pt_regs; + +/** + * syscall_get_nr - find what system call a task is executing + * @task: task of interest, must be blocked + * @regs: task_pt_regs() of @task + * + * If @task is executing a system call or is at system call + * tracing about to attempt one, returns the system call number. + * If @task is not executing a system call, i.e. it's blocked + * inside the kernel for a fault or signal, returns -1. + * + * It's only valid to call this when @task is known to be blocked. + */ +long syscall_get_nr(struct task_struct *task, struct pt_regs *regs); + +/** + * syscall_rollback - roll back registers after an aborted system call + * @task: task of interest, must be in system call exit tracing + * @regs: task_pt_regs() of @task + * + * It's only valid to call this when @task is stopped for system + * call exit tracing (due to TIF_SYSCALL_TRACE or TIF_SYSCALL_AUDIT), + * after tracehook_report_syscall_entry() returned nonzero to prevent + * the system call from taking place. + * + * This rolls back the register state in @regs so it's as if the + * system call instruction was a no-op. The registers containing + * the system call number and arguments are as they were before the + * system call instruction. This may not be the same as what the + * register state looked like at system call entry tracing. + */ +void syscall_rollback(struct task_struct *task, struct pt_regs *regs); + +/** + * syscall_get_error - check result of traced system call + * @task: task of interest, must be blocked + * @regs: task_pt_regs() of @task + * + * Returns 0 if the system call succeeded, or -ERRORCODE if it failed. + * + * It's only valid to call this when @task is stopped for tracing on exit + * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT. + */ +long syscall_get_error(struct task_struct *task, struct pt_regs *regs); + +/** + * syscall_get_return_value - get the return value of a traced system call + * @task: task of interest, must be blocked + * @regs: task_pt_regs() of @task + * + * Returns the return value of the successful system call. + * This value is meaningless if syscall_get_error() returned nonzero. + * + * It's only valid to call this when @task is stopped for tracing on exit + * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT. + */ +long syscall_get_return_value(struct task_struct *task, struct pt_regs *regs); + +/** + * syscall_set_return_value - change the return value of a traced system call + * @task: task of interest, must be blocked + * @regs: task_pt_regs() of @task + * @error: negative error code, or zero to indicate success + * @val: user return value if @error is zero + * + * This changes the results of the system call that user mode will see. + * If @error is zero, the user sees a successful system call with a + * return value of @val. If @error is nonzero, it's a negated errno + * code; the user sees a failed system call with this errno code. + * + * It's only valid to call this when @task is stopped for tracing on exit + * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT. + */ +void syscall_set_return_value(struct task_struct *task, struct pt_regs *regs, + int error, long val); + +/** + * syscall_get_arguments - extract system call parameter values + * @task: task of interest, must be blocked + * @regs: task_pt_regs() of @task + * @i: argument index [0,5] + * @n: number of arguments; n+i must be [1,6]. + * @args: array filled with argument values + * + * Fetches @n arguments to the system call starting with the @i'th argument + * (from 0 through 5). Argument @i is stored in @args[0], and so on. + * An arch inline version is probably optimal when @i and @n are constants. + * + * It's only valid to call this when @task is stopped for tracing on + * entry to a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT. + * It's invalid to call this with @i + @n > 6; we only support system calls + * taking up to 6 arguments. + */ +void syscall_get_arguments(struct task_struct *task, struct pt_regs *regs, + unsigned int i, unsigned int n, unsigned long *args); + +/** + * syscall_set_arguments - change system call parameter value + * @task: task of interest, must be in system call entry tracing + * @regs: task_pt_regs() of @task + * @i: argument index [0,5] + * @n: number of arguments; n+i must be [1,6]. + * @args: array of argument values to store + * + * Changes @n arguments to the system call starting with the @i'th argument. + * @n'th argument to @val. Argument @i gets value @args[0], and so on. + * An arch inline version is probably optimal when @i and @n are constants. + * + * It's only valid to call this when @task is stopped for tracing on + * entry to a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT. + * It's invalid to call this with @i + @n > 6; we only support system calls + * taking up to 6 arguments. + */ +void syscall_set_arguments(struct task_struct *task, struct pt_regs *regs, + unsigned int i, unsigned int n, + const unsigned long *args); + +#endif /* _ASM_SYSCALL_H */ diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index 32867ab86c7..589f429619c 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -103,7 +103,8 @@ static inline void ptrace_report_syscall(struct pt_regs *regs) * the system call. That must prevent normal entry so no system call is * made. If @task ever returns to user mode after this, its register state * is unspecified, but should be something harmless like an %ENOSYS error - * return. + * return. It should preserve enough information so that syscall_rollback() + * can work (see asm-generic/syscall.h). * * Called without locks, just after entering kernel mode. */ -- cgit v1.2.3-70-g09d2 From a9906a19193db69ad0158f289f839edf8aaf103f Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Sat, 26 Jul 2008 14:41:26 -0700 Subject: tracehook: comment fixes This fixes some typos and errors in comments. No code changes. Signed-off-by: Roland McGrath --- include/linux/tracehook.h | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) (limited to 'include/linux/tracehook.h') diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index 589f429619c..b1875582c1a 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -244,7 +244,7 @@ static inline int tracehook_prepare_clone(unsigned clone_flags) * tracehook_finish_clone - new child created and being attached * @child: new child task * @clone_flags: %CLONE_* flags from clone/fork/vfork system call - * @trace: return value from tracehook_clone_prepare() + * @trace: return value from tracehook_prepare_clone() * * This is called immediately after adding @child to its parent's children list. * The @trace value is that returned by tracehook_prepare_clone(). @@ -259,19 +259,20 @@ static inline void tracehook_finish_clone(struct task_struct *child, /** * tracehook_report_clone - in parent, new child is about to start running - * @trace: return value from tracehook_clone_prepare() + * @trace: return value from tracehook_prepare_clone() * @regs: parent's user register state * @clone_flags: flags from parent's system call * @pid: new child's PID in the parent's namespace * @child: new child task * - * Called after a child is set up, but before it has been started running. - * The @trace value is that returned by tracehook_clone_prepare(). - * This is not a good place to block, because the child has not started yet. - * Suspend the child here if desired, and block in tracehook_clone_complete(). - * This must prevent the child from self-reaping if tracehook_clone_complete() - * uses the @child pointer; otherwise it might have died and been released by - * the time tracehook_report_clone_complete() is called. + * Called after a child is set up, but before it has been started + * running. @trace is the value returned by tracehook_prepare_clone(). + * This is not a good place to block, because the child has not started + * yet. Suspend the child here if desired, and then block in + * tracehook_report_clone_complete(). This must prevent the child from + * self-reaping if tracehook_report_clone_complete() uses the @child + * pointer; otherwise it might have died and been released by the time + * tracehook_report_report_clone_complete() is called. * * Called with no locks held, but the child cannot run until this returns. */ @@ -290,7 +291,7 @@ static inline void tracehook_report_clone(int trace, struct pt_regs *regs, /** * tracehook_report_clone_complete - new child is running - * @trace: return value from tracehook_clone_prepare() + * @trace: return value from tracehook_prepare_clone() * @regs: parent's user register state * @clone_flags: flags from parent's system call * @pid: new child's PID in the parent's namespace @@ -347,7 +348,7 @@ static inline void tracehook_prepare_release_task(struct task_struct *task) } /** - * tracehook_finish_release_task - task is being reaped, clean up tracing + * tracehook_finish_release_task - final tracing clean-up * @task: task in %EXIT_DEAD state * * This is called in release_task() when @task is being in the middle of -- cgit v1.2.3-70-g09d2 From 5c7edcd7ee6b77b88252fe4096dce1a46a60c829 Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Thu, 31 Jul 2008 02:04:09 -0700 Subject: tracehook: fix exit_signal=0 case My commit 2b2a1ff64afbadac842bbc58c5166962cf4f7664 introduced a regression (sorry about that) for the odd case of exit_signal=0 (e.g. clone_flags=0). This is not a normal use, but it's used by a case in the glibc test suite. Dying with exit_signal=0 sends no signal, but it's supposed to wake up a parent's blocked wait*() calls (unlike the delayed_group_leader case). This fixes tracehook_notify_death() and its caller to distinguish a "signal 0" wakeup from the delayed_group_leader case (with no wakeup). Signed-off-by: Roland McGrath Tested-by: Serge Hallyn Signed-off-by: Linus Torvalds --- include/linux/tracehook.h | 21 +++++++++++++-------- kernel/exit.c | 6 +++--- 2 files changed, 16 insertions(+), 11 deletions(-) (limited to 'include/linux/tracehook.h') diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index b1875582c1a..12532839f50 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -493,16 +493,21 @@ static inline int tracehook_notify_jctl(int notify, int why) * @death_cookie: value to pass to tracehook_report_death() * @group_dead: nonzero if this was the last thread in the group to die * - * Return the signal number to send our parent with do_notify_parent(), or - * zero to send no signal and leave a zombie, or -1 to self-reap right now. + * A return value >= 0 means call do_notify_parent() with that signal + * number. Negative return value can be %DEATH_REAP to self-reap right + * now, or %DEATH_DELAYED_GROUP_LEADER to a zombie without notifying our + * parent. Note that a return value of 0 means a do_notify_parent() call + * that sends no signal, but still wakes up a parent blocked in wait*(). * * Called with write_lock_irq(&tasklist_lock) held. */ +#define DEATH_REAP -1 +#define DEATH_DELAYED_GROUP_LEADER -2 static inline int tracehook_notify_death(struct task_struct *task, void **death_cookie, int group_dead) { if (task->exit_signal == -1) - return task->ptrace ? SIGCHLD : -1; + return task->ptrace ? SIGCHLD : DEATH_REAP; /* * If something other than our normal parent is ptracing us, then @@ -512,21 +517,21 @@ static inline int tracehook_notify_death(struct task_struct *task, if (thread_group_empty(task) && !ptrace_reparented(task)) return task->exit_signal; - return task->ptrace ? SIGCHLD : 0; + return task->ptrace ? SIGCHLD : DEATH_DELAYED_GROUP_LEADER; } /** * tracehook_report_death - task is dead and ready to be reaped * @task: @current task now exiting - * @signal: signal number sent to parent, or 0 or -1 + * @signal: return value from tracheook_notify_death() * @death_cookie: value passed back from tracehook_notify_death() * @group_dead: nonzero if this was the last thread in the group to die * * Thread has just become a zombie or is about to self-reap. If positive, * @signal is the signal number just sent to the parent (usually %SIGCHLD). - * If @signal is -1, this thread will self-reap. If @signal is 0, this is - * a delayed_group_leader() zombie. The @death_cookie was passed back by - * tracehook_notify_death(). + * If @signal is %DEATH_REAP, this thread will self-reap. If @signal is + * %DEATH_DELAYED_GROUP_LEADER, this is a delayed_group_leader() zombie. + * The @death_cookie was passed back by tracehook_notify_death(). * * If normal reaping is not inhibited, @task->exit_state might be changing * in parallel. diff --git a/kernel/exit.c b/kernel/exit.c index eb4d6470d1d..38ec4063014 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -911,10 +911,10 @@ static void exit_notify(struct task_struct *tsk, int group_dead) tsk->exit_signal = SIGCHLD; signal = tracehook_notify_death(tsk, &cookie, group_dead); - if (signal > 0) + if (signal >= 0) signal = do_notify_parent(tsk, signal); - tsk->exit_state = signal < 0 ? EXIT_DEAD : EXIT_ZOMBIE; + tsk->exit_state = signal == DEATH_REAP ? EXIT_DEAD : EXIT_ZOMBIE; /* mt-exec, de_thread() is waiting for us */ if (thread_group_leader(tsk) && @@ -927,7 +927,7 @@ static void exit_notify(struct task_struct *tsk, int group_dead) tracehook_report_death(tsk, signal, cookie, group_dead); /* If the process is dead, release it - nobody will wait for it */ - if (signal < 0) + if (signal == DEATH_REAP) release_task(tsk); } -- cgit v1.2.3-70-g09d2 From 115a326c1e5cab457924356123bbfd7d783ecf9d Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Mon, 4 Aug 2008 13:56:01 -0700 Subject: tracehook: kerneldoc fix My last change to tracehook.h made it confuse the kerneldoc parser. Move the #define's before the comment so it's happy again. Signed-off-by: Roland McGrath Acked-by: Randy Dunlap Signed-off-by: Linus Torvalds --- include/linux/tracehook.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) (limited to 'include/linux/tracehook.h') diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index 12532839f50..ab3ef7aefa9 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -487,6 +487,9 @@ static inline int tracehook_notify_jctl(int notify, int why) return notify || (current->ptrace & PT_PTRACED); } +#define DEATH_REAP -1 +#define DEATH_DELAYED_GROUP_LEADER -2 + /** * tracehook_notify_death - task is dead, ready to notify parent * @task: @current task now exiting @@ -501,8 +504,6 @@ static inline int tracehook_notify_jctl(int notify, int why) * * Called with write_lock_irq(&tasklist_lock) held. */ -#define DEATH_REAP -1 -#define DEATH_DELAYED_GROUP_LEADER -2 static inline int tracehook_notify_death(struct task_struct *task, void **death_cookie, int group_dead) { -- cgit v1.2.3-70-g09d2 From 5861bbfcc10fc0358abf52c7d22850c8d180f0b0 Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Thu, 7 Aug 2008 16:55:03 -0700 Subject: tracehook: fix CLONE_PTRACE In the change in commit 09a05394fe2448a4139b014936330af23fa7ec83, I overlooked two nits in the logic and this broke using CLONE_PTRACE when PTRACE_O_TRACE* are not being used. A parent that is itself traced at all but not using PTRACE_O_TRACE*, using CLONE_PTRACE would have its new child fail to be traced. A parent that is not itself traced at all that uses CLONE_PTRACE (which should be a no-op in this case) would confuse the bookkeeping and lead to a crash at exit time. This restores the missing checks and fixes both failure modes. Reported-by: Eduardo Habkost Signed-off-by: Roland McGrath --- include/linux/ptrace.h | 2 +- include/linux/tracehook.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) (limited to 'include/linux/tracehook.h') diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h index fd31756e1a0..ea7416c901d 100644 --- a/include/linux/ptrace.h +++ b/include/linux/ptrace.h @@ -172,7 +172,7 @@ static inline void ptrace_init_task(struct task_struct *child, bool ptrace) child->ptrace = 0; if (unlikely(ptrace)) { child->ptrace = current->ptrace; - __ptrace_link(child, current->parent); + ptrace_link(child, current->parent); } } diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h index ab3ef7aefa9..b48d8196957 100644 --- a/include/linux/tracehook.h +++ b/include/linux/tracehook.h @@ -280,7 +280,7 @@ static inline void tracehook_report_clone(int trace, struct pt_regs *regs, unsigned long clone_flags, pid_t pid, struct task_struct *child) { - if (unlikely(trace)) { + if (unlikely(trace) || unlikely(clone_flags & CLONE_PTRACE)) { /* * The child starts up with an immediate SIGSTOP. */ -- cgit v1.2.3-70-g09d2