path: root/arch/x86
Age | Commit message | Author
2014-12-01ftrace/x86: Have save_mcount_regs store RIP in %rdi for first parameterSteven Rostedt (Red Hat)
Instead of having save_mcount_regs store the RIP in %rdx as a temp register before placing it in the proper location of the pt_regs on the stack, use the %rdi register as the temp register. This lets us remove the extra store in the ftrace_caller_setup macro. Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-01ftrace/x86: Rename MCOUNT_SAVE_FRAME and add more detailed commentsSteven Rostedt (Red Hat)
The name MCOUNT_SAVE_FRAME is rather confusing: it isn't really a function frame that is saved, just the mcount registers that must be saved before C code may be called. The word "frame" wrongly suggests a function frame, which it is not. Rename MCOUNT_SAVE_FRAME and MCOUNT_RESTORE_FRAME to save_mcount_regs and restore_mcount_regs respectively. Note the lower case, which keeps the names from screaming at reviewers. Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-01ftrace/x86: Move MCOUNT_SAVE_FRAME out of header fileSteven Rostedt (Red Hat)
Linus pointed out that MCOUNT_SAVE_FRAME is used in only a single file and that there's no reason that it should be in a header file. Move the macro to the code that uses it. Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-01ftrace/x86: Have static tracing also use ftrace_caller_setupSteven Rostedt (Red Hat)
Linus pointed out that there were locations that did the hard coded update of the parent and rip parameters. One of them was the static tracer which could also use the ftrace_caller_setup to do that work. In fact, because it did not use it, it is prone to bugs, and since the static tracer is hardly ever used (who wants function tracing code always being called?) it doesn't get tested very often. I only run a few "does it still work" tests on it. But I do not run stress tests on that code. Although, since it is never turned off, just having it on should be stressful enough. (especially for the performance folks) There's no reason that the static tracer can't also use ftrace_caller_setup. Have it do so. Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1411262304010.3961@nanos Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-12-01x86, microcode, AMD: Do not use smp_processor_id() in preemptible contextBorislav Petkov
Hand down the cpu number instead, otherwise lockdep screams when doing echo 1 > /sys/devices/system/cpu/microcode/reload. BUG: using smp_processor_id() in preemptible [00000000] code: amd64-microcode/2470 caller is debug_smp_processor_id+0x12/0x20 CPU: 1 PID: 2470 Comm: amd64-microcode Not tainted 3.18.0-rc6+ #26 ... Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1417428741-4501-1-git-send-email-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
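A minimal sketch of the pattern behind the fix (illustrative names, not the actual driver code): resolve the CPU once while preemption is disabled and hand it down, instead of calling smp_processor_id() from preemptible code.

    #include <linux/kernel.h>
    #include <linux/smp.h>

    /* illustrative helper -- takes the CPU number from its caller */
    static void reload_for_cpu_sketch(int cpu)
    {
            /* uses the handed-down 'cpu'; no smp_processor_id() here */
            pr_info("reloading microcode state for CPU %d\n", cpu);
    }

    static void reload_current_cpu_sketch(void)
    {
            int cpu = get_cpu();    /* disables preemption */

            reload_for_cpu_sketch(cpu);
            put_cpu();
    }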
2014-12-01x86, microcode: Limit the microcode reloading to 64-bit for nowBorislav Petkov
First, there was this: https://bugzilla.kernel.org/show_bug.cgi?id=88001 The problem there was that microcode patches are not being reapplied after suspend-to-ram. It was important to reapply them, though, because of, for example, Haswell's TSX erratum, for which a microcode patch disables the TSX instructions. A simple fix was fb86b97300d9 ("x86, microcode: Update BSPs microcode on resume") but, as is often the case, simple fixes are too simple. This one causes 32-bit resume to fail: https://bugzilla.kernel.org/show_bug.cgi?id=88391 Properly fixing this would require more involved changes for which it is too late now, right before the merge window. Thus, limit this to 64-bit only temporarily. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1417353999-32236-1-git-send-email-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-12-01Merge back earlier cpufreq material for 3.19-rc1.Rafael J. Wysocki
2014-11-26kvm: fix kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn()Ard Biesheuvel
This reverts commit 85c8555ff0 ("KVM: check for !is_zero_pfn() in kvm_is_mmio_pfn()") and renames the function to kvm_is_reserved_pfn. The problem being addressed by the patch above was that some ARM code based the memory mapping attributes of a pfn on the return value of kvm_is_mmio_pfn(), whose name indeed suggests that such pfns should be mapped as device memory. However, kvm_is_mmio_pfn() doesn't do quite what it says on the tin, and the existing non-ARM users were already using it in a way which suggests that its name should probably have been 'kvm_is_reserved_pfn' from the beginning, e.g., whether or not to call get_page/put_page on it etc. This means that returning false for the zero page is a mistake and the patch above should be reverted. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-26crypto: include crypto- module prefix in templateKees Cook
This adds the module loading prefix "crypto-" to the template lookup as well. For example, attempting to load 'vfat(blowfish)' via AF_ALG now correctly includes the "crypto-" prefix at every level, correctly rejecting "vfat": net-pf-38 algif-hash crypto-vfat(blowfish) crypto-vfat(blowfish)-all crypto-vfat Reported-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-11-25x86/nmi: Fix use of unallocated cpumask_var_tSasha Levin
Commit "x86/nmi: Perform a safe NMI stack trace on all CPUs" has introduced a cpumask_var_t variable: +static cpumask_var_t printtrace_mask; But never allocated it before using it, which caused a NULL ptr deref when trying to print the stack trace: [ 1110.296154] BUG: unable to handle kernel NULL pointer dereference at (null) [ 1110.296169] IP: __memcpy (arch/x86/lib/memcpy_64.S:151) [ 1110.296178] PGD 4c34b3067 PUD 4c351b067 PMD 0 [ 1110.296186] Oops: 0002 [#1] PREEMPT SMP KASAN [ 1110.296234] Dumping ftrace buffer: [ 1110.296330] (ftrace buffer empty) [ 1110.296339] Modules linked in: [ 1110.296345] CPU: 1 PID: 10538 Comm: trinity-c99 Not tainted 3.18.0-rc5-next-20141124-sasha-00058-ge2a8c09-dirty #1499 [ 1110.296348] task: ffff880152650000 ti: ffff8804c3560000 task.ti: ffff8804c3560000 [ 1110.296357] RIP: __memcpy (arch/x86/lib/memcpy_64.S:151) [ 1110.296360] RSP: 0000:ffff8804c3563870 EFLAGS: 00010246 [ 1110.296363] RAX: 0000000000000000 RBX: ffffe8fff3c4a809 RCX: 0000000000000000 [ 1110.296366] RDX: 0000000000000008 RSI: ffffffff9e254040 RDI: 0000000000000000 [ 1110.296369] RBP: ffff8804c3563908 R08: 0000000000ffffff R09: 0000000000ffffff [ 1110.296371] R10: 0000000000000000 R11: 0000000000000006 R12: 0000000000000000 [ 1110.296375] R13: 0000000000000000 R14: ffffffff9e254040 R15: ffffe8fff3c4a809 [ 1110.296379] FS: 00007f9e43b0b700(0000) GS:ffff880107e00000(0000) knlGS:0000000000000000 [ 1110.296382] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1110.296385] CR2: 0000000000000000 CR3: 00000004e4334000 CR4: 00000000000006a0 [ 1110.296400] Stack: [ 1110.296406] ffffffff81b1e46c 0000000000000000 ffff880107e03fb8 000000000000000b [ 1110.296413] ffff880107dfffc0 ffff880107e03fc0 0000000000000008 ffffffff93f2e9c8 [ 1110.296419] 0000000000000000 ffffda0020fc07f7 0000000000000008 ffff8804c3563901 [ 1110.296420] Call Trace: [ 1110.296429] ? memcpy (mm/kasan/kasan.c:275) [ 1110.296437] ? arch_trigger_all_cpu_backtrace (include/linux/bitmap.h:215 include/linux/cpumask.h:506 arch/x86/kernel/apic/hw_nmi.c:76) [ 1110.296444] arch_trigger_all_cpu_backtrace (include/linux/bitmap.h:215 include/linux/cpumask.h:506 arch/x86/kernel/apic/hw_nmi.c:76) [ 1110.296451] ? dump_stack (./arch/x86/include/asm/preempt.h:95 lib/dump_stack.c:55) [ 1110.296458] do_raw_spin_lock (./arch/x86/include/asm/spinlock.h:86 kernel/locking/spinlock_debug.c:130 kernel/locking/spinlock_debug.c:137) [ 1110.296468] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) [ 1110.296474] ? __page_check_address (include/linux/spinlock.h:309 mm/rmap.c:630) [ 1110.296481] __page_check_address (include/linux/spinlock.h:309 mm/rmap.c:630) [ 1110.296487] ? preempt_count_sub (kernel/sched/core.c:2615) [ 1110.296493] try_to_unmap_one (include/linux/rmap.h:202 mm/rmap.c:1146) [ 1110.296504] ? anon_vma_interval_tree_iter_next (mm/interval_tree.c:72 mm/interval_tree.c:103) [ 1110.296514] rmap_walk (mm/rmap.c:1653 mm/rmap.c:1725) [ 1110.296521] ? page_get_anon_vma (include/linux/rcupdate.h:423 include/linux/rcupdate.h:935 mm/rmap.c:435) [ 1110.296530] try_to_unmap (mm/rmap.c:1545) [ 1110.296536] ? page_get_anon_vma (mm/rmap.c:437) [ 1110.296545] ? try_to_unmap_nonlinear (mm/rmap.c:1138) [ 1110.296551] ? SyS_msync (mm/rmap.c:1501) [ 1110.296558] ? page_remove_rmap (mm/rmap.c:1409) [ 1110.296565] ? page_get_anon_vma (mm/rmap.c:448) [ 1110.296571] ? anon_vma_ctor (mm/rmap.c:1496) [ 1110.296579] migrate_pages (mm/migrate.c:913 mm/migrate.c:956 mm/migrate.c:1136) [ 1110.296586] ? 
_raw_spin_unlock_irq (./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:169 kernel/locking/spinlock.c:199) [ 1110.296593] ? buffer_migrate_lock_buffers (mm/migrate.c:1584) [ 1110.296601] ? handle_mm_fault (mm/memory.c:3163 mm/memory.c:3223 mm/memory.c:3336 mm/memory.c:3365) [ 1110.296607] migrate_misplaced_page (mm/migrate.c:1738) [ 1110.296614] handle_mm_fault (mm/memory.c:3170 mm/memory.c:3223 mm/memory.c:3336 mm/memory.c:3365) [ 1110.296623] __do_page_fault (arch/x86/mm/fault.c:1246) [ 1110.296630] ? vtime_account_user (kernel/sched/cputime.c:701) [ 1110.296638] ? get_parent_ip (kernel/sched/core.c:2559) [ 1110.296646] ? context_tracking_user_exit (kernel/context_tracking.c:144) [ 1110.296656] trace_do_page_fault (arch/x86/mm/fault.c:1329 include/linux/jump_label.h:114 include/linux/context_tracking_state.h:27 include/linux/context_tracking.h:45 arch/x86/mm/fault.c:1330) [ 1110.296664] do_async_page_fault (arch/x86/kernel/kvm.c:280) [ 1110.296670] async_page_fault (arch/x86/kernel/entry_64.S:1285) [ 1110.296755] Code: 08 4c 8b 54 16 f0 4c 8b 5c 16 f8 4c 89 07 4c 89 4f 08 4c 89 54 17 f0 4c 89 5c 17 f8 c3 90 83 fa 08 72 1b 4c 8b 06 4c 8b 4c 16 f8 <4c> 89 07 4c 89 4c 17 f8 c3 66 2e 0f 1f 84 00 00 00 00 00 83 fa All code ======== 0: 08 4c 8b 54 or %cl,0x54(%rbx,%rcx,4) 4: 16 (bad) 5: f0 4c 8b 5c 16 f8 lock mov -0x8(%rsi,%rdx,1),%r11 b: 4c 89 07 mov %r8,(%rdi) e: 4c 89 4f 08 mov %r9,0x8(%rdi) 12: 4c 89 54 17 f0 mov %r10,-0x10(%rdi,%rdx,1) 17: 4c 89 5c 17 f8 mov %r11,-0x8(%rdi,%rdx,1) 1c: c3 retq 1d: 90 nop 1e: 83 fa 08 cmp $0x8,%edx 21: 72 1b jb 0x3e 23: 4c 8b 06 mov (%rsi),%r8 26: 4c 8b 4c 16 f8 mov -0x8(%rsi,%rdx,1),%r9 2b:* 4c 89 07 mov %r8,(%rdi) <-- trapping instruction 2e: 4c 89 4c 17 f8 mov %r9,-0x8(%rdi,%rdx,1) 33: c3 retq 34: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 3b: 00 00 00 3e: 83 fa 00 cmp $0x0,%edx Code starting with the faulting instruction =========================================== 0: 4c 89 07 mov %r8,(%rdi) 3: 4c 89 4c 17 f8 mov %r9,-0x8(%rdi,%rdx,1) 8: c3 retq 9: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 10: 00 00 00 13: 83 fa 00 cmp $0x0,%edx [ 1110.296760] RIP __memcpy (arch/x86/lib/memcpy_64.S:151) [ 1110.296763] RSP <ffff8804c3563870> [ 1110.296765] CR2: 0000000000000000 Link: http://lkml.kernel.org/r/1416931560-10603-1-git-send-email-sasha.levin@oracle.com Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
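The fix itself is a one-time allocation. A minimal sketch of the missing step (init function name hypothetical): with CONFIG_CPUMASK_OFFSTACK=y a cpumask_var_t is just a pointer, so copying into it before allocation writes through NULL, which is the oops above.

    #include <linux/cpumask.h>
    #include <linux/errno.h>
    #include <linux/gfp.h>
    #include <linux/init.h>

    static cpumask_var_t printtrace_mask;   /* the variable from the commit */

    static int __init backtrace_mask_init_sketch(void)
    {
            /* must run before the mask is first written to */
            if (!alloc_cpumask_var(&printtrace_mask, GFP_KERNEL))
                    return -ENOMEM;
            return 0;
    }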
2014-11-25crypto: sha-mb - remove a bogus NULL checkDan Carpenter
This can't be NULL and we dereferenced it earlier. Smatch used to ignore these cases where the pointer was obviously non-NULL, but I've found that sometimes the intention was to check something else, so we were maybe missing bugs. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-11-25kvm: fix kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn()Ard Biesheuvel
This reverts commit 85c8555ff0 ("KVM: check for !is_zero_pfn() in kvm_is_mmio_pfn()") and renames the function to kvm_is_reserved_pfn. The problem being addressed by the patch above was that some ARM code based the memory mapping attributes of a pfn on the return value of kvm_is_mmio_pfn(), whose name indeed suggests that such pfns should be mapped as device memory. However, kvm_is_mmio_pfn() doesn't do quite what it says on the tin, and the existing non-ARM users were already using it in a way which suggests that its name should probably have been 'kvm_is_reserved_pfn' from the beginning, e.g., whether or not to call get_page/put_page on it etc. This means that returning false for the zero page is a mistake and the patch above should be reverted. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2014-11-25x86/asm/traps: Disable tracing and kprobes in fixup_bad_iret and sync_regsAndy Lutomirski
These functions can be executed on the int3 stack, so kprobes are dangerous. Tracing is probably a bad idea, too. Fixes: b645af2d5905 ("x86_64, traps: Rework bad_iret") Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: <stable@vger.kernel.org> # Backport as far back as it would apply Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/50e33d26adca60816f3ba968875801652507d0c4.1416870125.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-24ftrace/x86: Have static function tracing always test for function graphSteven Rostedt (Red Hat)
New updates to the generic ftrace code had ftrace_stub not always being called when ftrace is off. This caused the static tracer to always save and restore functions. But it also showed that when the function tracer is running, the function graph tracer cannot run. We should always check whether function graph tracing is running, even if the function tracer is running too. The function tracer code is not the only user of the mcount hook. Cc: Markos Chandras <Markos.Chandras@imgtec.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-24kvm: x86: avoid warning about potential shift wrapping bugPaolo Bonzini
cs.base is declared as a __u64 variable and vector is a u32 so this causes a static checker warning. The user indeed can set "sipi_vector" to any u32 value in kvm_vcpu_ioctl_x86_set_vcpu_events(), but the value should really have 8-bit precision only. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
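A plain-C illustration of the warning class (not the KVM code): the shift of a u32 is evaluated in 32 bits before being widened to the u64 destination, so the checker sees a possible wrap; masking to the architectural 8-bit SIPI precision (or casting first) resolves it.

    #include <stdint.h>

    uint64_t sipi_cs_base(uint32_t sipi_vector)
    {
            /* without the cast/mask, 'sipi_vector << 12' is computed
             * as a 32-bit value and could wrap before widening */
            return (uint64_t)(sipi_vector & 0xffu) << 12;
    }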
2014-11-24KVM: x86: move device assignment out of kvm_host.hPaolo Bonzini
Create a new header, and hide the device assignment functions there. Move struct kvm_assigned_dev_kernel to assigned-dev.c by modifying arch/x86/kvm/iommu.c to take a PCI device struct. Based on a patch by Radim Krcmar <rkrcmar@redhat.com>. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-24crypto: prefix module autoloading with "crypto-"Kees Cook
This prefixes all crypto module loading with "crypto-" so we never run the risk of exposing module auto-loading to userspace via a crypto API, as demonstrated by Mathias Krause: https://lkml.org/lkml/2013/3/4/70 Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
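For a module author the change boils down to one macro. A sketch assuming the MODULE_ALIAS_CRYPTO helper this series introduces:

    #include <linux/module.h>
    #include <linux/crypto.h>

    /* emits the alias with the "crypto-" prefix (alongside the bare
     * name), so only crypto API lookups such as "crypto-aes" resolve
     * to this module rather than arbitrary userspace-supplied strings */
    MODULE_ALIAS_CRYPTO("aes");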
2014-11-23uprobes, x86: Fix _TIF_UPROBE vs _TIF_NOTIFY_RESUMEAndy Lutomirski
x86 calls do_notify_resume on paranoid returns if TIF_UPROBE is set, but not on non-paranoid returns. I suspect that this is a mistake and that the code only works because int3 is paranoid. Setting _TIF_NOTIFY_RESUME in the uprobe code was probably a workaround for the x86 bug. With that bug fixed, we can remove _TIF_NOTIFY_RESUME from the uprobes code. Reported-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-23Merge branch 'x86-traps' (trap handling from Andy Lutomirski)Linus Torvalds
Merge x86-64 iret fixes from Andy Lutomirski:
 "This addresses the following issues:
   - an unrecoverable double-fault triggerable with modify_ldt.
   - invalid stack usage in espfix64 failed IRET recovery from IST context.
   - invalid stack usage in non-espfix64 failed IRET recovery from IST context.
  It also makes a good but IMO scary change: non-espfix64 failed IRET will now report the correct error. Hopefully nothing depended on the old incorrect behavior, but maybe Wine will get confused in some obscure corner case"
* emailed patches from Andy Lutomirski <luto@amacapital.net>:
  x86_64, traps: Rework bad_iret
  x86_64, traps: Stop using IST for #SS
  x86_64, traps: Fix the espfix64 #DF fixup and rewrite it in C
2014-11-23x86_64, traps: Rework bad_iretAndy Lutomirski
It's possible for iretq to userspace to fail. This can happen because of a bad CS, SS, or RIP. Historically, we've handled it by fixing up an exception from iretq to land at bad_iret, which pretends that the failed iret frame was really the hardware part of #GP(0) from userspace. To make this work, there's an extra fixup to fudge the gs base into a usable state. This is suboptimal because it loses the original exception. It's also buggy because there's no guarantee that we were on the kernel stack to begin with. For example, if the failing iret happened on return from an NMI, then we'll end up executing general_protection on the NMI stack. This is bad for several reasons, the most immediate of which is that general_protection, as a non-paranoid idtentry, will try to deliver signals and/or schedule from the wrong stack. This patch throws out bad_iret entirely. As a replacement, it augments the existing swapgs fudge into a full-blown iret fixup, mostly written in C. It should be clearer and more correct. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
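A simplified sketch of the C replacement described, condensed from the commit's approach (frame layout and helper details are approximations): copy the failed iret frame onto the task's normal kernel stack and resume the exception there.

    #include <linux/kernel.h>
    #include <linux/sched.h>
    #include <linux/string.h>

    struct bad_iret_stack {
            void *error_entry_ret;
            struct pt_regs regs;
    };

    struct bad_iret_stack *fixup_bad_iret_sketch(struct bad_iret_stack *s)
    {
            /* target the task's normal kernel stack, not the IST/NMI
             * stack the fault may have hit */
            struct bad_iret_stack *new = container_of(
                    task_pt_regs(current), struct bad_iret_stack, regs);

            /* copy the hardware frame the failed iret left behind ... */
            memmove(&new->regs.ip, (void *)s->regs.sp, 5 * 8);
            /* ... and the saved registers below it */
            memmove(new, s, offsetof(struct bad_iret_stack, regs.ip));

            return new;     /* the entry asm switches RSP to this */
    }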
2014-11-23x86_64, traps: Stop using IST for #SSAndy Lutomirski
On a 32-bit kernel, this has no effect, since there are no IST stacks. On a 64-bit kernel, #SS can only happen in user code, on a failed iret to user space, a canonical violation on access via RSP or RBP, or a genuine stack segment violation in 32-bit kernel code. The first two cases don't need IST, and the latter two cases are unlikely fatal bugs, and promoting them to double faults would be fine. This fixes a bug in which the espfix64 code mishandles a stack segment violation. This saves 4k of memory per CPU and a tiny bit of code. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-23x86_64, traps: Fix the espfix64 #DF fixup and rewrite it in CAndy Lutomirski
There's nothing special enough about the espfix64 double fault fixup to justify writing it in assembly. Move it to C. This also fixes a bug: if the double fault came from an IST stack, the old asm code would return to a partially uninitialized stack frame. Fixes: 3891a04aafd668686239349ea58f3314ea2af86b Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-23x86: Use $(OBJDUMP) instead of plain objdumpChris Clayton
commit e6023367d779 'x86, kaslr: Prevent .bss from overlaping initrd' broke the cross compile of x86. It added a objdump invocation, which invokes the host native objdump and ignores an active cross tool chain. Use $(OBJDUMP) instead which takes the CROSS_COMPILE prefix into account. [ tglx: Massage changelog and use $(OBJDUMP) ] Fixes: e6023367d779 'x86, kaslr: Prevent .bss from overlaping initrd' Signed-off-by: Chris Clayton <chris2553@googlemail.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Borislav Petkov <bp@suse.de> Cc: Junjie Mao <eternal.n08@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/54705C8E.1080400@googlemail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-23kvm: x86: mask out XSAVESPaolo Bonzini
This feature is not supported inside KVM guests yet, because we do not emulate MSR_IA32_XSS. Mask it out. Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-23kvm: x86: move assigned-dev.c and iommu.c to arch/x86/Radim Krčmář
Now that ia64 is gone, we can hide deprecated device assignment in x86. Notable changes:
 - kvm_vm_ioctl_assigned_device() was moved to x86/kvm_arch_vm_ioctl()
The easy parts were removed from generic kvm code; remaining:
 - kvm_iommu_(un)map_pages() would require new code to be moved
 - struct kvm_assigned_dev_kernel depends on struct kvm_irq_ack_notifier
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-23PCI/MSI: Rename mask/unmask_msi_irq treewideThomas Gleixner
The PCI/MSI irq chip callbacks mask/unmask_msi_irq have been renamed to pci_msi_mask/unmask_irq to mark them PCI specific. Rename all usage sites. The conversion helper functions are kept around to avoid conflicts in next and will be removed after merging into mainline. Coccinelle assisted conversion. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: x86@kernel.org Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Murali Karicheri <m-karicheri2@ti.com> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Mohit Kumar <mohit.kumar@st.com> Cc: Simon Horman <horms@verge.net.au> Cc: Michal Simek <michal.simek@xilinx.com> Cc: Yijing Wang <wangyijing@huawei.com>
2014-11-23PCI/MSI: Rename write_msi_msg() to pci_write_msi_msg()Jiang Liu
Rename write_msi_msg() to pci_write_msi_msg() to mark it as PCI specific. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Yingjoe Chen <yingjoe.chen@mediatek.com> Cc: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-23PCI/MSI: Rename __read_msi_msg() to __pci_read_msi_msg()Jiang Liu
Rename __read_msi_msg() to __pci_read_msi_msg() and kill unused read_msi_msg(). It's a preparation to separate generic MSI code from PCI core. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Yingjoe Chen <yingjoe.chen@mediatek.com> Cc: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-21Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds
Pull x86 fixes from Thomas Gleixner:
 "Misc fixes:
   - gold linker build fix
   - noxsave command line parsing fix
   - bugfix for NX setup
   - microcode resume path bug fix
   - _TIF_NOHZ versus TIF_NOHZ bugfix as discussed in the mysterious lockup thread"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, syscall: Fix _TIF_NOHZ handling in syscall_trace_enter_phase1
  x86, kaslr: Handle Gold linker for finding bss/brk
  x86, mm: Set NX across entire PMD at boot
  x86, microcode: Update BSPs microcode on resume
  x86: Require exact match for 'noxsave' command line option
2014-11-21Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds
Pull perf fixes from Ingo Molnar:
 "Misc fixes: two Intel uncore driver fixes, a CPU-hotplug fix and a build dependencies fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/uncore: Fix boot crash on SBOX PMU on Haswell-EP
  perf/x86/intel/uncore: Fix IRP uncore register offsets on Haswell EP
  perf: Fix corruption of sibling list with hotplug
  perf/x86: Fix embarrasing typo
2014-11-21kvm: remove CONFIG_X86 #ifdefs from files formerly shared with ia64Radim Krcmar
Signed-off-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-21kvm: x86: move ioapic.c and irq_comm.c back to arch/x86/Paolo Bonzini
ia64 does not need them anymore. Ack notifiers become x86-specific too. Suggested-by: Gleb Natapov <gleb@kernel.org> Reviewed-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-21Merge back earlier cpuidle material for 3.19-rc1.Rafael J. Wysocki
Conflicts: drivers/cpuidle/dt_idle_states.c
2014-11-20x86, syscall: Fix _TIF_NOHZ handling in syscall_trace_enter_phase1Andy Lutomirski
TIF_NOHZ is 19 (i.e. _TIF_SYSCALL_TRACE | _TIF_NOTIFY_RESUME | _TIF_SINGLESTEP), not (1<<19). This code is involved in Dave's trinity lockup, but I don't see why it would cause any of the problems he's seeing, except inadvertently by causing a different path through entry_64.S's syscall handling. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Don Zickus <dzickus@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Jones <davej@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/a6cd3b60a3f53afb6e1c8081b0ec30ff19003dd7.1416434075.git.luto@amacapital.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
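A small C illustration of the bug class (generic, not the entry code): TIF_NOHZ is a bit number and _TIF_NOHZ the derived mask; testing flags against the bare number instead tests bits 0, 1, and 4 (19 = 1 | 2 | 16), which is exactly the _TIF_SYSCALL_TRACE | _TIF_NOTIFY_RESUME | _TIF_SINGLESTEP coincidence noted above.

    #define TIF_NOHZ        19                      /* bit number */
    #define _TIF_NOHZ       (1UL << TIF_NOHZ)       /* bit mask */

    static inline int nohz_work_pending(unsigned long flags)
    {
            return !!(flags & _TIF_NOHZ);   /* mask, not TIF_NOHZ */
    }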
2014-11-20kprobes/ftrace: Recover original IP if pre_handler doesn't change itMasami Hiramatsu
Recover the original IP register if the pre_handler doesn't change it. Since current kprobes doesn't expect that another ftrace handler may change regs->ip, it sets kprobe.addr + MCOUNT_INSN_SIZE to regs->ip and returns to ftrace. This seems like wrong behavior, since kprobes can recover regs->ip and safely pass it to another handler. This adds code which recovers the original regs->ip passed from ftrace right before returning to ftrace, so that another ftrace user can change regs->ip. Link: http://lkml.kernel.org/r/20141009130106.4698.26362.stgit@kbuild-f20.novalocal Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
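A condensed sketch of the handler flow described (simplified from the x86 kprobe/ftrace handler; single-step emulation and post_handler handling omitted):

    #include <linux/kprobes.h>
    #include <asm/ftrace.h>         /* MCOUNT_INSN_SIZE */

    static void kprobe_ftrace_sketch(unsigned long ip, struct pt_regs *regs,
                                     struct kprobe *p)
    {
            unsigned long orig_ip = regs->ip;

            /* make it look like we trapped on the probed instruction */
            regs->ip = ip + MCOUNT_INSN_SIZE;
            if (!p->pre_handler || !p->pre_handler(p, regs))
                    /* pre_handler left ip alone: restore the original
                     * so a later ftrace user may modify it itself */
                    regs->ip = orig_ip;
    }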
2014-11-19x86/nmi: Perform a safe NMI stack trace on all CPUsSteven Rostedt (Red Hat)
When trigger_all_cpu_backtrace() is called on x86, it will trigger an NMI on each CPU and call show_regs(). But this can lead to a hard lockup if the NMI comes in during another printk(). In order to avoid this, when the NMI triggers, it switches the printk routine for that CPU to call an NMI-safe printk function that records the printk in a per_cpu seq_buf descriptor. After all NMIs have finished recording their data, the seq_bufs are printed in a safe context. Link: http://lkml.kernel.org/p/20140619213952.360076309@goodmis.org Link: http://lkml.kernel.org/r/20141115050605.055232587@goodmis.org Tested-by: Jiri Kosina <jkosina@suse.cz> Acked-by: Jiri Kosina <jkosina@suse.cz> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Petr Mladek <pmladek@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
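A condensed sketch of the mechanism (buffer size and names approximate; the real code also handles initialization, overflow, and ordering): each CPU appends into its own seq_buf from NMI context, and a safe context drains the buffers afterwards.

    #include <linux/percpu.h>
    #include <linux/seq_buf.h>

    struct nmi_seq_buf {
            unsigned char buffer[4096];
            struct seq_buf seq;     /* seq_buf_init()ed over 'buffer' */
    };
    static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);

    /* NMI-side printk replacement: append only, take no locks */
    static int nmi_vprintk_sketch(const char *fmt, va_list args)
    {
            struct nmi_seq_buf *s = this_cpu_ptr(&nmi_print_seq);
            unsigned int len = seq_buf_used(&s->seq);

            seq_buf_vprintf(&s->seq, fmt, args);
            return seq_buf_used(&s->seq) - len;
    }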
2014-11-19x86/kvm/tracing: Use helper function trace_seq_buffer_ptr()Steven Rostedt (Red Hat)
To allow for the restructuring of the trace_seq code, we need users of it to use the helper functions instead of accessing the internals of the trace_seq structure itself. Link: http://lkml.kernel.org/r/20141104160221.585025609@goodmis.org Tested-by: Jiri Kosina <jkosina@suse.cz> Acked-by: Jiri Kosina <jkosina@suse.cz> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Mark Rustad <mark.d.rustad@intel.com> Reviewed-by: Petr Mladek <pmladek@suse.cz> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-19ftrace/x86/extable: Add is_ftrace_trampoline() functionSteven Rostedt (Red Hat)
Stack traces that happen from function tracing check if the address on the stack is a __kernel_text_address(). That is, is the address kernel code? This calls core_kernel_text(), which returns true if the address is part of the builtin kernel code. It also calls is_module_text_address(), which returns true if the address belongs to module code. But what is missing is ftrace's dynamically allocated trampolines. These trampolines are allocated for individual ftrace_ops that call the ftrace_ops callback functions directly. But if they do a stack trace, the code checking the stack won't detect them, as they are neither core kernel code nor module address space. By adding another field to ftrace_ops that stores the size of the trampoline assigned to it, we can create a new function called is_ftrace_trampoline() that returns true if the address is a dynamically allocated ftrace trampoline. Note, it ignores trampolines that are not dynamically allocated, as they will return true with the core_kernel_text() function. Link: http://lkml.kernel.org/r/20141119034829.497125839@goodmis.org Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
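The core of the new test is a range check against the two ftrace_ops fields. A sketch of the per-ops check (the real is_ftrace_trampoline() walks all registered ftrace_ops under RCU):

    #include <linux/ftrace.h>

    static bool ops_trampoline_contains(struct ftrace_ops *ops,
                                        unsigned long addr)
    {
            /* only dynamically allocated trampolines are recorded;
             * a zero size means there is nothing to match */
            if (!ops->trampoline || !ops->trampoline_size)
                    return false;
            return addr >= ops->trampoline &&
                   addr <  ops->trampoline + ops->trampoline_size;
    }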
2014-11-19ftrace/x86: Add frames pointers to trampoline as necessarySteven Rostedt (Red Hat)
When CONFIG_FRAME_POINTER is enabled, it is required that the ftrace_caller and ftrace_regs_caller trampolines set up frame pointers, otherwise a stack trace from a function call won't print the functions that called the trampoline. This is due to a check in __save_stack_address(): #ifdef CONFIG_FRAME_POINTER if (!reliable) return; #endif The "reliable" variable is only set if the function address is equal to the contents of the address just before the address the frame pointer register points to. If the frame pointer is not set up for the ftrace caller, then this test fails and the function that called the trampoline is missed. Worse yet, if fentry is used (gcc 4.6 and beyond), it will also miss the parent, as fentry is called before the stack frame is set up. That means the bp frame pointer points to the stack of just before the parent function was called. Link: http://lkml.kernel.org/r/20141119034829.355440340@goodmis.org Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: stable@vger.kernel.org # 3.7+ Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-19x86, mce: Support memory error recovery for both UCNA and Deferred error in machine_check_pollChen Yucong
Uncorrected no action required (UCNA) - is an uncorrected recoverable machine check error that is not signaled via a machine check exception and, instead, is reported to system software as a corrected machine check error. UCNA errors indicate that some data in the system is corrupted, but the data has not been consumed and the processor state is valid and you may continue execution on this processor. UCNA errors require no action from system software to continue execution. Note that UCNA errors are supported by the processor only when IA32_MCG_CAP[24] (MCG_SER_P) is set. -- Intel SDM Volume 3B Deferred errors are errors that cannot be corrected by hardware, but do not cause an immediate interruption in program flow, loss of data integrity, or corruption of processor state. These errors indicate that data has been corrupted but not consumed. Hardware writes information to the status and address registers in the corresponding bank that identifies the source of the error if deferred errors are enabled for logging. Deferred errors are not reported via machine check exceptions; they can be seen by polling the MCi_STATUS registers. -- AMD64 APM Volume 2 As the two quotes above show, both UCNA and Deferred errors are detected errors that cannot be corrected by hardware, which is very similar to Software Recoverable Action Optional (SRAO) errors. Therefore, we can take some of the actions used for handling SRAO errors to handle UCNA and Deferred errors. Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Chen Yucong <slaoub@gmail.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2014-11-19x86, mce, severity: Extend the mce_severity mechanism to handle UCNA/DEFERRED errorChen Yucong
Until now, the mce_severity mechanism can only identify the severity of a UCNA error as MCE_KEEP_SEVERITY. Meanwhile, it is not able to filter out DEFERRED errors on AMD platforms. This patch extends the mce_severity mechanism to handle UCNA/DEFERRED errors. In order to do this, the patch introduces a new severity level - MCE_UCNA/DEFERRED_SEVERITY. In addition, mce_severity is specific to the machine check exception, and it checks the MCIP/EIPV/RIPV bits. In order to use the mce_severity mechanism in non-exception context, the patch also introduces a new argument (is_excp) for mce_severity. `is_excp' is used to explicitly specify the calling context of mce_severity. Reviewed-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Signed-off-by: Chen Yucong <slaoub@gmail.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
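In caller terms the change looks roughly like the following (signature and severity name taken from the commit text; the surrounding helper is hypothetical):

    #include <linux/types.h>
    #include <asm/mce.h>

    /* extended interface per the commit: is_excp distinguishes machine
     * check exception context from polling context (approximate) */
    int mce_severity(struct mce *m, int tolerant, char **msg, bool is_excp);

    static void poll_one_bank_sketch(struct mce *m)
    {
            char *msg;
            /* polling context: is_excp=false skips MCIP/EIPV/RIPV checks */
            int sev = mce_severity(m, 0, &msg, false);

            if (sev == MCE_DEFERRED_SEVERITY)       /* name per the commit */
                    mce_log(m);     /* record it; data was not consumed */
    }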
2014-11-19assorted conversions to %p[dD]Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-19KVM: x86: Remove FIXMEs in emulate.cNicholas Krause
Remove FIXME comments about needing fault addresses to be returned. These are propagated from walk_addr_generic to gva_to_gpa and from there to ops->read_std and ops->write_std. Signed-off-by: Nicholas Krause <xerofoify@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19KVM: emulator: remove duplicated limit checkPaolo Bonzini
The check on the higher limit of the segment and the check on the maximum accessible size are the same for both expand-up and expand-down segments. Only the computation of "lim" varies. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19KVM: emulator: remove code duplication in register_address{,_increment}Paolo Bonzini
register_address has been a duplicate of address_mask ever since the ancestor of __linearize was born in 90de84f50b42 (KVM: x86 emulator: preserve an operand's segment identity, 2010-11-17). However, we can put it to a better use by including the call to reg_read in register_address. Similarly, the call to reg_rmw can be moved to register_address_increment. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
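After the refactor the address helper collapses to roughly the following (shape per the commit description; address_mask() and reg_read() are the emulator's existing internal helpers):

    /* callers now pass a register index; the read is folded in */
    static unsigned long register_address(struct x86_emulate_ctxt *ctxt,
                                          int reg)
    {
            return address_mask(ctxt, reg_read(ctxt, reg));
    }
    /* register_address_increment() likewise takes the index and does
     * the reg_rmw() itself before applying the masked increment */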
2014-11-19KVM: x86: Move __linearize masking of la into switchNadav Amit
In __linearize there is a check of whether masking of the linear address is needed. It occurs immediately after a switch that evaluates the same condition. Merge them. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19KVM: x86: Non-canonical access using SS should cause #SSNadav Amit
When SS is used with a non-canonical address, an #SS exception is generated on real hardware. The KVM emulator causes a #GP instead. Fix it to behave like a real x86 CPU. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19KVM: x86: Perform limit checks when assigning EIPNadav Amit
If a branch (e.g., jmp, ret) causes a limit violation because the target IP > limit, the #GP exception occurs before the branch. In other words, the RIP pushed on the stack should be that of the branch and not that of the target. To do so, we can call __linearize with the new EIP, which also saves us the code that performs the canonical address checks. In the case of assigning an EIP >= 2^32 (when switching cs.l), we are also safe, as __linearize will check that the new EIP does not exceed the limit and will trigger #GP(0) otherwise. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19KVM: x86: Emulator performs privilege checks on __linearizeNadav Amit
When a segment is accessed, real hardware does not perform any privilege level checks. In contrast, the KVM emulator does. This causes some discrepancies from real hardware. For instance, reading from a readable code segment may fail due to incorrect segment checks. In addition, it introduces unnecessary overhead. To reference Intel SDM 5.5 ("Privilege Levels"): "Privilege levels are checked when the segment selector of a segment descriptor is loaded into a segment register." The SDM never mentions privilege level checks during memory access, except for loading far pointers in section 5.10 ("Pointer Validation"). Those are actually segment selector loads and are emulated similarly (i.e., regardless of __linearize checks). This behavior was also checked using sysexit: a data segment whose DPL=0 was loaded, and after sysexit (CPL=3) it is still accessible. Therefore, all the privilege level checks in __linearize are removed. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19KVM: x86: Stack size is overridden by __linearizeNadav Amit
When performing a segmented read/write in the emulator for stack operations, it ignores the stack size and uses ad_bytes as the indication of the pointer size. As a result, a wrong address may be accessed. To fix this behavior, we can remove the masking of the address in __linearize and perform it beforehand. It is already done for the operands (so currently it is inefficiently done twice). It is missing in two cases: 1. When using rip_relative. 2. In fetch_bit_operand, which changes the address. This patch masks the address on these two occasions, and removes the masking from __linearize. Note that it does not mask EIP during fetch. In protected/legacy mode, a code fetch when RIP >= 2^32 should result in #GP, not wrap-around. Since we make limit checks within __linearize, this is the expected behavior. Partial revert of commit 518547b32ab4 (KVM: x86: Emulator does not calculate address correctly, 2014-09-30). Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>