diff options
author | Peter Zijlstra <peterz@infradead.org> | 2013-10-24 12:52:06 +0200 |
---|---|---|
committer | Ingo Molnar <mingo@kernel.org> | 2013-10-29 12:02:54 +0100 |
commit | e00b12e64be9a34ef071de7b6052ca9ea29dd460 (patch) | |
tree | 2f3395d06d639550039f3c9aa69c4ad0a4854327 /arch/x86/mm/fault.c | |
parent | 2c42cfbfe10872929c2ba1f8130e31063ff59b94 (diff) |
perf/x86: Further optimize copy_from_user_nmi()
Now that we can deal with nested NMI due to IRET re-enabling NMIs and
can deal with faults from NMI by making sure we preserve CR2 over NMIs
we can in fact simply access user-space memory from NMI context.
So rewrite copy_from_user_nmi() to use __copy_from_user_inatomic() and
rework the fault path to do the minimal required work before taking
the in_atomic() fault handler.
In particular avoid perf_sw_event() which would make perf recurse on
itself (it should be harmless as our recursion protections should be
able to deal with this -- but why tempt fate).
Also rename notify_page_fault() to kprobes_fault() as that is a much
better name; there is no notifier in it and its specific to kprobes.
Don measured that his worst case NMI path shrunk from ~300K cycles to
~150K cycles.
Cc: Stephane Eranian <eranian@google.com>
Cc: jmario@redhat.com
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: dave.hansen@linux.intel.com
Tested-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131024105206.GM2490@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Diffstat (limited to 'arch/x86/mm/fault.c')
-rw-r--r-- | arch/x86/mm/fault.c | 41 |
1 files changed, 21 insertions, 20 deletions
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 3aaeffcfd67..7a517bb4106 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -51,7 +51,7 @@ kmmio_fault(struct pt_regs *regs, unsigned long addr) return 0; } -static inline int __kprobes notify_page_fault(struct pt_regs *regs) +static inline int __kprobes kprobes_fault(struct pt_regs *regs) { int ret = 0; @@ -1048,7 +1048,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code) return; /* kprobes don't want to hook the spurious faults: */ - if (notify_page_fault(regs)) + if (kprobes_fault(regs)) return; /* * Don't take the mm semaphore here. If we fixup a prefetch @@ -1060,23 +1060,8 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code) } /* kprobes don't want to hook the spurious faults: */ - if (unlikely(notify_page_fault(regs))) + if (unlikely(kprobes_fault(regs))) return; - /* - * It's safe to allow irq's after cr2 has been saved and the - * vmalloc fault has been handled. - * - * User-mode registers count as a user access even for any - * potential system fault or CPU buglet: - */ - if (user_mode_vm(regs)) { - local_irq_enable(); - error_code |= PF_USER; - flags |= FAULT_FLAG_USER; - } else { - if (regs->flags & X86_EFLAGS_IF) - local_irq_enable(); - } if (unlikely(error_code & PF_RSVD)) pgtable_bad(regs, error_code, address); @@ -1088,8 +1073,6 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code) } } - perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address); - /* * If we're in an interrupt, have no user context or are running * in an atomic region then we must not take the fault: @@ -1099,6 +1082,24 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code) return; } + /* + * It's safe to allow irq's after cr2 has been saved and the + * vmalloc fault has been handled. + * + * User-mode registers count as a user access even for any + * potential system fault or CPU buglet: + */ + if (user_mode_vm(regs)) { + local_irq_enable(); + error_code |= PF_USER; + flags |= FAULT_FLAG_USER; + } else { + if (regs->flags & X86_EFLAGS_IF) + local_irq_enable(); + } + + perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address); + if (error_code & PF_WRITE) flags |= FAULT_FLAG_WRITE; |