summaryrefslogtreecommitdiffstats
path: root/arch/x86
AgeCommit message (Collapse)Author
2015-01-21kvm: Fix CR3_PCID_INVD type on 32-bitBorislav Petkov
arch/x86/kvm/emulate.c: In function ‘check_cr_write’: arch/x86/kvm/emulate.c:3552:4: warning: left shift count >= width of type rsvd = CR3_L_MODE_RESERVED_BITS & ~CR3_PCID_INVD; happens because sizeof(UL) on 32-bit is 4 bytes but we shift it 63 bits to the left. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-01-21Merge branch 'for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Documentation updates. - Miscellaneous fixes. - Preemptible-RCU fixes, including fixing an old bug in the interaction of RCU priority boosting and CPU hotplug. - SRCU updates. - RCU CPU stall-warning updates. - RCU torture-test updates. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-20KVM: x86: workaround SuSE's 2.6.16 pvclock vs masterclock issueMarcelo Tosatti
SuSE's 2.6.16 kernel fails to boot if the delta between tsc_timestamp and rdtsc is larger than a given threshold: * If we get more than the below threshold into the future, we rerequest * the real time from the host again which has only little offset then * that we need to adjust using the TSC. * * For now that threshold is 1/5th of a jiffie. That should be good * enough accuracy for completely broken systems, but also give us swing * to not call out to the host all the time. */ #define PVCLOCK_DELTA_MAX ((1000000000ULL / HZ) / 5) Disable masterclock support (which increases said delta) in case the boot vcpu does not use MSR_KVM_SYSTEM_TIME_NEW. Upstreams kernels which support pvclock vsyscalls (and therefore make use of PVCLOCK_STABLE_BIT) use MSR_KVM_SYSTEM_TIME_NEW. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-01-20KVM: fix "Should it be static?" warnings from sparseFengguang Wu
arch/x86/kvm/x86.c:495:5: sparse: symbol 'kvm_read_nested_guest_page' was not declared. Should it be static? arch/x86/kvm/x86.c:646:5: sparse: symbol '__kvm_set_xcr' was not declared. Should it be static? arch/x86/kvm/x86.c:1183:15: sparse: symbol 'max_tsc_khz' was not declared. Should it be static? arch/x86/kvm/x86.c:1237:6: sparse: symbol 'kvm_track_tsc_matching' was not declared. Should it be static? Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-01-20x86/xen: prefer TSC over xen clocksource for dom0Palik, Imre
In Dom0's the use of the TSC clocksource (whenever it is stable enough to be used) instead of the Xen clocksource should not cause any issues, as Dom0 VMs never live-migrated. The TSC clocksource is somewhat more efficient than the Xen paravirtualised clocksource, thus it should have higher rating. This patch decreases the rating of the Xen clocksource in Dom0s to 275. Which is half-way between the rating of the TSC clocksource (300) and the hpet clocksource (250). Cc: Anthony Liguori <aliguori@amazon.com> Signed-off-by: Imre Palik <imrep@amazon.de> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-01-20livepatch: change ARCH_HAVE_LIVE_PATCHING to HAVE_LIVE_PATCHINGMiroslav Benes
Change ARCH_HAVE_LIVE_PATCHING to HAVE_LIVE_PATCHING in Kconfigs. HAVE_ bools are prevalent there and we should go with the flow. Suggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2015-01-20x86, hyperv: Mark the Hyper-V clocksource as being continuousK. Y. Srinivasan
The Hyper-V clocksource is continuous; mark it accordingly. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Acked-by: jasowang@redhat.com Cc: gregkh@linuxfoundation.org Cc: devel@linuxdriverproject.org Cc: olaf@aepfle.de Cc: apw@canonical.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1421108762-3331-1-git-send-email-kys@microsoft.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-20x86: Don't rely on VMWare emulating PAT MSR correctlyJuergen Gross
VMWare seems not to emulate the PAT MSR correctly: reaeding MSR_IA32_CR_PAT returns 0 even after writing another value to it. Commit bd809af16e3ab triggers this VMWare bug when the kernel is booted as a VMWare guest. Detect this bug and don't use the read value if it is 0. Fixes: bd809af16e3ab "x86: Enable PAT to use cache mode translation tables" Reported-and-tested-by: Jongman Heo <jongman.heo@samsung.com> Acked-by: Alok N Kataria <akataria@vmware.com> Signed-off-by: Juergen Gross <jgross@suse.com> Link: http://lkml.kernel.org/r/1421039745-14335-1-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-20x86, fpu: Fix math_state_restore() race with kernel_fpu_begin()Oleg Nesterov
math_state_restore() can race with kernel_fpu_begin() if irq comes right after __thread_fpu_begin(), __save_init_fpu() will overwrite fpu->state we are going to restore. Add 2 simple helpers, kernel_fpu_disable() and kernel_fpu_enable() which simply set/clear in_kernel_fpu, and change math_state_restore() to exclude kernel_fpu_begin() in between. Alternatively we could use local_irq_save/restore, but probably these new helpers can have more users. Perhaps they should disable/enable preemption themselves, in this case we can remove preempt_disable() in __restore_xstate_sig(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: matt.fleming@intel.com Cc: bp@suse.de Cc: pbonzini@redhat.com Cc: luto@amacapital.net Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Suresh Siddha <sbsiddha@gmail.com> Link: http://lkml.kernel.org/r/20150115192028.GD27332@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-20x86, fpu: Don't abuse has_fpu in __kernel_fpu_begin/end()Oleg Nesterov
Now that we have in_kernel_fpu we can remove __thread_clear_has_fpu() in __kernel_fpu_begin(). And this allows to replace the asymmetrical and nontrivial use_eager_fpu + tsk_used_math check in kernel_fpu_end() with the same __thread_has_fpu() check. The logic becomes really simple; if _begin() does save() then _end() needs restore(), this is controlled by __thread_has_fpu(). Otherwise they do clts/stts unless use_eager_fpu(). Not only this makes begin/end symmetrical and imo more understandable, potentially this allows to change irq_fpu_usable() to avoid all other checks except "in_kernel_fpu". Also, with this patch __kernel_fpu_end() does restore_fpu_checking() and WARNs if it fails instead of math_state_restore(). I think this looks better because we no longer need __thread_fpu_begin(), and it would be better to report the failure in this case. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: matt.fleming@intel.com Cc: bp@suse.de Cc: pbonzini@redhat.com Cc: luto@amacapital.net Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Suresh Siddha <sbsiddha@gmail.com> Link: http://lkml.kernel.org/r/20150115192005.GC27332@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-20x86, fpu: Introduce per-cpu in_kernel_fpu stateOleg Nesterov
interrupted_kernel_fpu_idle() tries to detect if kernel_fpu_begin() is safe or not. In particular it should obviously deny the nested kernel_fpu_begin() and this logic looks very confusing. If use_eager_fpu() == T we rely on a) __thread_has_fpu() check in interrupted_kernel_fpu_idle(), and b) on the fact that _begin() does __thread_clear_has_fpu(). Otherwise we demand that the interrupted task has no FPU if it is in kernel mode, this works because __kernel_fpu_begin() does clts() and interrupted_kernel_fpu_idle() checks X86_CR0_TS. Add the per-cpu "bool in_kernel_fpu" variable, and change this code to check/set/clear it. This allows to do more cleanups and fixes, see the next changes. The patch also moves WARN_ON_ONCE() under preempt_disable() just to make this_cpu_read() look better, this is not really needed. And in fact I think we should move it into __kernel_fpu_begin(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: matt.fleming@intel.com Cc: bp@suse.de Cc: pbonzini@redhat.com Cc: luto@amacapital.net Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Suresh Siddha <sbsiddha@gmail.com> Link: http://lkml.kernel.org/r/20150115191943.GB27332@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-20x86: pmc_atom: Expose contents of PSSAndy Shevchenko
The PSS register reflects the power state of each island on SoC. It would be useful to know which of the islands is on or off at the momemnt. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Aubrey Li <aubrey.li@linux.intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Kumar P. Mahesh <mahesh.kumar.p@intel.com> Link: http://lkml.kernel.org/r/1421253575-22509-6-git-send-email-andriy.shevchenko@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-20x86: pmc_atom: Clean up init functionAndy Shevchenko
There is no need to use err variable. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Aubrey Li <aubrey.li@linux.intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Kumar P. Mahesh <mahesh.kumar.p@intel.com> Link: http://lkml.kernel.org/r/1421253575-22509-5-git-send-email-andriy.shevchenko@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-20x86: pmc-atom: Remove unused macroAndy Shevchenko
DRIVER_NAME seems unused. This patch just removes it. There is no functional change. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Aubrey Li <aubrey.li@linux.intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Kumar P. Mahesh <mahesh.kumar.p@intel.com> Link: http://lkml.kernel.org/r/1421253575-22509-4-git-send-email-andriy.shevchenko@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-20x86: pmc_atom: don%27t check for NULL twiceAndy Shevchenko
debugfs_remove_recursive() is NULL-aware, thus, we may safely remove the check here. There is no need to assing NULL to variable since it will be not used anywhere. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Aubrey Li <aubrey.li@linux.intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Kumar P. Mahesh <mahesh.kumar.p@intel.com> Link: http://lkml.kernel.org/r/1421253575-22509-3-git-send-email-andriy.shevchenko@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-20x86: pmc-atom: Assign debugfs node as soon as possibleAndy Shevchenko
pmc_dbgfs_unregister() will be called when pmc->dbgfs_dir is unconditionally NULL on error path in pmc_dbgfs_register(). To prevent this we move the assignment to where is should be. Fixes: f855911c1f48 (x86/pmc_atom: Expose PMC device state and platform sleep state) Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Aubrey Li <aubrey.li@linux.intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Kumar P. Mahesh <mahesh.kumar.p@intel.com> Link: http://lkml.kernel.org/r/1421253575-22509-2-git-send-email-andriy.shevchenko@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-20x86, irq: Properly tag virtualization entry in /proc/interruptsJan Beulich
The mis-naming likely was a copy-and-paste effect. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/54B9408B0200007800055E8B@mail.emea.novell.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-20x86, boot: Skip relocs when load address unchangedKees Cook
On 64-bit, relocation is not required unless the load address gets changed. Without this, relocations do unexpected things when the kernel is above 4G. Reported-by: Baoquan He <bhe@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Thomas D. <whissi@whissi.de> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Junjie Mao <eternal.n08@gmail.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20150116005146.GA4212@www.outflux.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-20x86/xen: Override ACPI IRQ management callback __acpi_unregister_gsiJiang Liu
Xen overrides __acpi_register_gsi and leaves __acpi_unregister_gsi as is. That means, an IRQ allocated by acpi_register_gsi_xen_hvm() or acpi_register_gsi_xen() will be freed by acpi_unregister_gsi_ioapic(), which may cause undesired effects. So override __acpi_unregister_gsi to NULL for safety. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Cc: Tony Luck <tony.luck@intel.com> Cc: xen-devel@lists.xenproject.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Graeme Gregory <graeme.gregory@linaro.org> Cc: Lv Zheng <lv.zheng@intel.com> Link: http://lkml.kernel.org/r/1421720467-7709-4-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-20x86/xen: Treat SCI interrupt as normal GSI interruptJiang Liu
Currently Xen Domain0 has special treatment for ACPI SCI interrupt, that is initialize irq for ACPI SCI at early stage in a special way as: xen_init_IRQ() ->pci_xen_initial_domain() ->xen_setup_acpi_sci() Allocate and initialize irq for ACPI SCI Function xen_setup_acpi_sci() calls acpi_gsi_to_irq() to get an irq number for ACPI SCI. But unfortunately acpi_gsi_to_irq() depends on IOAPIC irqdomains through following path acpi_gsi_to_irq() ->mp_map_gsi_to_irq() ->mp_map_pin_to_irq() ->check IOAPIC irqdomain For PV domains, it uses Xen event based interrupt manangement and doesn't make uses of native IOAPIC, so no irqdomains created for IOAPIC. This causes Xen domain0 fail to install interrupt handler for ACPI SCI and all ACPI events will be lost. Please refer to: https://lkml.org/lkml/2014/12/19/178 So the fix is to get rid of special treatment for ACPI SCI, just treat ACPI SCI as normal GSI interrupt as: acpi_gsi_to_irq() ->acpi_register_gsi() ->acpi_register_gsi_xen() ->xen_register_gsi() With above change, there's no need for xen_setup_acpi_sci() anymore. The above change also works with bare metal kernel too. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Cc: Tony Luck <tony.luck@intel.com> Cc: xen-devel@lists.xenproject.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Len Brown <len.brown@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Bjorn Helgaas <bhelgaas@google.com> Link: http://lkml.kernel.org/r/1421720467-7709-2-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
Pull crypto fix from Herbert Xu: "This fixes a regression that arose from the change to add a crypto prefix to module names which was done to prevent the loading of arbitrary modules through the Crypto API. In particular, a number of modules were missing the crypto prefix which meant that they could no longer be autoloaded" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: add missing crypto module aliases
2015-01-20module: remove mod arg from module_free, rename module_memfree().Rusty Russell
Nothing needs the module pointer any more, and the next patch will call it from RCU, where the module itself might no longer exist. Removing the arg is the safest approach. This just codifies the use of the module_alloc/module_free pattern which ftrace and bpf use. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Ley Foon Tan <lftan@altera.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: x86@kernel.org Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: linux-cris-kernel@axis.com Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: nios2-dev@lists.rocketboards.org Cc: linuxppc-dev@lists.ozlabs.org Cc: sparclinux@vger.kernel.org Cc: netdev@vger.kernel.org
2015-01-19x86/spinlock: Leftover conversion ACCESS_ONCE->READ_ONCEChristian Borntraeger
commit 78bff1c8684f ("x86/ticketlock: Fix spin_unlock_wait() livelock") introduced two additional ACCESS_ONCE cases in x86 spinlock.h. Lets change those as well. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com>
2015-01-19x86/xen/p2m: Replace ACCESS_ONCE with READ_ONCEChristian Borntraeger
ACCESS_ONCE does not work reliably on non-scalar types. For example gcc 4.6 and 4.7 might remove the volatile tag for such accesses during the SRA (scalar replacement of aggregates) step (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145) Change the p2m code to replace ACCESS_ONCE with READ_ONCE. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: David Vrabel <david.vrabel@citrix.com>
2015-01-19Optimize TLB flush in kvm_mmu_slot_remove_write_access.Kai Huang
No TLB flush is needed when there's no valid rmap in memory slot. Signed-off-by: Kai Huang <kai.huang@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-01-19x86: kvm: vmx: Remove some unused functionsRickard Strandqvist
Removes some functions that are not used anywhere: cpu_has_vmx_eptp_writeback() cpu_has_vmx_eptp_uncacheable() This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-01-18Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Mostly tooling fixes, but also two PMU driver fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf tools powerpc: Use dwfl_report_elf() instead of offline. perf tools: Fix segfault for symbol annotation on TUI perf test: Fix dwarf unwind using libunwind. perf tools: Avoid build splat for syscall numbers with uclibc perf tools: Elide strlcpy warning with uclibc perf tools: Fix statfs.f_type data type mismatch build error with uclibc tools: Remove bitops/hweight usage of bits in tools/perf perf machine: Fix __machine__findnew_thread() error path perf tools: Fix building error in x86_64 when dwarf unwind is on perf probe: Propagate error code when write(2) failed perf/x86/intel: Fix bug for "cycles:p" and "cycles:pp" on SLM perf/rapl: Fix sysfs_show() initialization for RAPL PMU
2015-01-17x86_64 entry: Fix RCX for ptraced syscallsAndy Lutomirski
The int_ret_from_sys_call and syscall tracing code disagrees with the sysret path as to the value of RCX. The Intel SDM, the AMD APM, and my laptop all agree that sysret returns with RCX == RIP. The syscall tracing code does not respect this property. For example, this program: int main() { extern const char syscall_rip[]; unsigned long rcx = 1; unsigned long orig_rcx = rcx; asm ("mov $-1, %%eax\n\t" "syscall\n\t" "syscall_rip:" : "+c" (rcx) : : "r11"); printf("syscall: RCX = %lX RIP = %lX orig RCX = %lx\n", rcx, (unsigned long)syscall_rip, orig_rcx); return 0; } prints: syscall: RCX = 400556 RIP = 400556 orig RCX = 1 Running it under strace gives this instead: syscall: RCX = FFFFFFFFFFFFFFFF RIP = 400556 orig RCX = 1 This changes FIXUP_TOP_OF_STACK to match sysret, causing the test to show RCX == RIP even under strace. It looks like this is a partial revert of: 88e4bc32686e ("[PATCH] x86-64 architecture specific sync for 2.5.8") from the historic git tree. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/c9a418c3dc3993cb88bb7773800225fd318a4c67.1421453410.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-17Merge tag 'trace-fixes-v3.19-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull ftrace fixes from Steven Rostedt: "This holds a few fixes to the ftrace infrastructure as well as the mixture of function graph tracing and kprobes. When jprobes and function graph tracing is enabled at the same time it will crash the system: # modprobe jprobe_example # echo function_graph > /sys/kernel/debug/tracing/current_tracer After the first fork (jprobe_example probes it), the system will crash. This is due to the way jprobes copies the stack frame and does not do a normal function return. This messes up with the function graph tracing accounting which hijacks the return address from the stack and replaces it with a hook function. It saves the return addresses in a separate stack to put back the correct return address when done. But because the jprobe functions do not do a normal return, their stack addresses are not put back until the function they probe is called, which means that the probed function will get the return address of the jprobe handler instead of its own. The simple fix here was to disable function graph tracing while the jprobe handler is being called. While debugging this I found two minor bugs with the function graph tracing. The first was about the function graph tracer sharing its function hash with the function tracer (they both get filtered by the same input). The changing of the set_ftrace_filter would not sync the function recording records after a change if the function tracer was disabled but the function graph tracer was enabled. This was due to the update only checking one of the ops instead of the shared ops to see if they were enabled and should perform the sync. This caused the ftrace accounting to break and a ftrace_bug() would be triggered, disabling ftrace until a reboot. The second was that the check to update records only checked one of the filter hashes. It needs to test both the "filter" and "notrace" hashes. The "filter" hash determines what functions to trace where as the "notrace" hash determines what functions not to trace (trace all but these). Both hashes need to be passed to the update code to find out what change is being done during the update. This also broke the ftrace record accounting and triggered a ftrace_bug(). This patch set also include two more fixes that were reported separately from the kprobe issue. One was that init_ftrace_syscalls() was called twice at boot up. This is not a major bug, but that call performed a rather large kmalloc (NR_syscalls * sizeof(*syscalls_metadata)). The second call made the first one a memory leak, and wastes memory. The other fix is a regression caused by an update in the v3.19 merge window. The moving to enable events early, moved the enabling before PID 1 was created. The syscall events require setting the TIF_SYSCALL_TRACEPOINT for all tasks. But for_each_process_thread() does not include the swapper task (PID 0), and ended up being a nop. A suggested fix was to add the init_task() to have its flag set, but I didn't really want to mess with PID 0 for this minor bug. Instead I disable and re-enable events again at early_initcall() where it use to be enabled. This also handles any other event that might have its own reg function that could break at early boot up" * tag 'trace-fixes-v3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix enabling of syscall events on the command line tracing: Remove extra call to init_ftrace_syscalls() ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing ftrace: Check both notrace and filter for old hash ftrace: Fix updating of filters for shared global_ops filters
2015-01-16x86/PCI: Clip bridge windows to fit in upstream windowsYinghai Lu
Every PCI-PCI bridge window should fit inside an upstream bridge window because orphaned address space is unreachable from the primary side of the upstream bridge. If we inherit invalid bridge windows that overlap an upstream window from firmware, clip them to fit and update the bridge accordingly. [bhelgaas: changelog] Link: https://bugzilla.kernel.org/show_bug.cgi?id=85491 Reported-by: Marek Kordik <kordikmarek@gmail.com> Tested-by: Marek Kordik <kordikmarek@gmail.com> Fixes: 5b28541552ef ("PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources") Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@redhat.com> CC: "H. Peter Anvin" <hpa@zytor.com> CC: x86@kernel.org CC: stable@vger.kernel.org # v3.16+
2015-01-16KVM: x86: switch to kvm_get_dirty_log_protectPaolo Bonzini
We now have a generic function that does most of the work of kvm_vm_ioctl_get_dirty_log, now use it. Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
2015-01-16perf/x86/intel: Fix bug for "cycles:p" and "cycles:pp" on SLMKan Liang
cycles:p and cycles:pp do not work on SLM since commit: 86a04461a99f ("perf/x86: Revamp PEBS event selection") UOPS_RETIRED.ALL is not a PEBS capable event, so it should not be used to count cycle number. Actually SLM calls intel_pebs_aliases_core2() which uses INST_RETIRED.ANY_P to count the number of cycles. It's a PEBS capable event. But inv and cmask must be set to count cycles. Considering SLM allows all events as PEBS with no flags, only INST_RETIRED.ANY_P, inv=1, cmask=16 needs to handled specially. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1421084541-31639-1-git-send-email-kan.liang@intel.com Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-16perf/rapl: Fix sysfs_show() initialization for RAPL PMUStephane Eranian
This patch fixes a problem with the initialization of the sysfs_show() routine for the RAPL PMU. The current code was wrongly relying on the EVENT_ATTR_STR() macro which uses the events_sysfs_show() function in the x86 PMU code. That function itself was relying on the x86_pmu data structure. Yet RAPL and the core PMU (x86_pmu) have nothing to do with each other. They should therefore not interact with each other. The x86_pmu structure is initialized at boot time based on the host CPU model. When the host CPU is not supported, the x86_pmu remains uninitialized and some of the callbacks it contains are NULL. The false dependency with x86_pmu could potentially cause crashes in case the x86_pmu is not initialized while the RAPL PMU is. This may, for instance, be the case in virtualized environments. This patch fixes the problem by using a private sysfs_show() routine for exporting the RAPL PMU events. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150113225953.GA21525@thinkpad Cc: vincent.weaver@maine.edu Cc: jolsa@redhat.com Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-15ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracingSteven Rostedt (Red Hat)
If the function graph tracer traces a jprobe callback, the system will crash. This can easily be demonstrated by compiling the jprobe sample module that is in the kernel tree, loading it and running the function graph tracer. # modprobe jprobe_example.ko # echo function_graph > /sys/kernel/debug/tracing/current_tracer # ls The first two commands end up in a nice crash after the first fork. (do_fork has a jprobe attached to it, so "ls" just triggers that fork) The problem is caused by the jprobe_return() that all jprobe callbacks must end with. The way jprobes works is that the function a jprobe is attached to has a breakpoint placed at the start of it (or it uses ftrace if fentry is supported). The breakpoint handler (or ftrace callback) will copy the stack frame and change the ip address to return to the jprobe handler instead of the function. The jprobe handler must end with jprobe_return() which swaps the stack and does an int3 (breakpoint). This breakpoint handler will then put back the saved stack frame, simulate the instruction at the beginning of the function it added a breakpoint to, and then continue on. For function tracing to work, it hijakes the return address from the stack frame, and replaces it with a hook function that will trace the end of the call. This hook function will restore the return address of the function call. If the function tracer traces the jprobe handler, the hook function for that handler will not be called, and its saved return address will be used for the next function. This will result in a kernel crash. To solve this, pause function tracing before the jprobe handler is called and unpause it before it returns back to the function it probed. Some other updates: Used a variable "saved_sp" to hold kcb->jprobe_saved_sp. This makes the code look a bit cleaner and easier to understand (various tries to fix this bug required this change). Note, if fentry is being used, jprobes will change the ip address before the function graph tracer runs and it will not be able to trace the function that the jprobe is probing. Link: http://lkml.kernel.org/r/20150114154329.552437962@goodmis.org Cc: stable@vger.kernel.org # 2.6.30+ Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-01-15Merge tag 'x86_queue' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp ↵Ingo Molnar
into x86/cleanups Pull minor x86 cleanups from Borislav Petkov. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-15Merge tag 'please-pull-einj-mmcfg' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/ras Pull RAS update from Tony Luck: "When checking addresses in APEI action entries for validity, allow access to the mmcfg space - some error injection functions need to do this." Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-15Merge tag 'ras_for_3.20' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/ras Pull RAS updates from Borislav Petkov: "Nothing special this time, just an error messages improvement from Andy and a cleanup from me." Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-15iommu/irq_remapping: Kill function irq_remapping_supported() and related codeJiang Liu
Simplify irq_remapping code by killing irq_remapping_supported() and related interfaces. Joerg posted a similar patch at https://lkml.org/lkml/2014/12/15/490, so assume an signed-off from Joerg. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Tested-by: Joerg Roedel <joro@8bytes.org> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: David Rientjes <rientjes@google.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Oren Twaig <oren@scalemp.com> Link: http://lkml.kernel.org/r/1420615903-28253-14-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-15x86/apic: Only disable CPU x2apic mode when necessaryJiang Liu
When interrupt remapping hardware is not in X2APIC, CPU X2APIC mode will be disabled if: 1) Maximum CPU APIC ID is bigger than 255 2) hypervisior doesn't support x2apic mode. But we should only check whether hypervisor supports X2APIC mode when hypervisor(CONFIG_HYPERVISOR_GUEST) is enabled, otherwise X2APIC will always be disabled when CONFIG_HYPERVISOR_GUEST is disabled and IR doesn't work in X2APIC mode. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Tested-by: Joerg Roedel <joro@8bytes.org> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: David Rientjes <rientjes@google.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Oren Twaig <oren@scalemp.com> Link: http://lkml.kernel.org/r/1420615903-28253-12-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-15x86/apic: Handle XAPIC remap mode proper.Jiang Liu
If remapping is in XAPIC mode, the setup code just skips X2APIC initialization without checking max CPU APIC ID in system, which may cause problem if system has a CPU with APIC ID bigger than 255. Handle IR in XAPIC mode the same way as if remapping is disabled. [ tglx: Split out from previous patch ] Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: David Rientjes <rientjes@google.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Oren Twaig <oren@scalemp.com> Link: http://lkml.kernel.org/r/1420615903-28253-8-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-15x86/apic: Refine enable_IR_x2apic() and related functionsJiang Liu
Refine enable_IR_x2apic() and related functions for better readability. [ tglx: Removed the XAPIC mode change and split it out into a seperate patch. Added comments. ] Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: David Rientjes <rientjes@google.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Oren Twaig <oren@scalemp.com> Link: http://lkml.kernel.org/r/1420615903-28253-8-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-15x86/apic: Correctly detect X2APIC status in function enable_IR()Jiang Liu
X2APIC will be disabled if user specifies "nox2apic" on kernel command line, even when x2apic_preenabled is true. So correctly detect X2APIC status by using x2apic_enabled() instead of x2apic_preenabled. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: David Rientjes <rientjes@google.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Oren Twaig <oren@scalemp.com> Link: http://lkml.kernel.org/r/1420615903-28253-7-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-15x86/apic: Kill useless variable x2apic_enabled in function enable_IR_x2apic()Jiang Liu
Local variable x2apic_enabled has been assigned to but never referred, so kill it. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: David Rientjes <rientjes@google.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Oren Twaig <oren@scalemp.com> Link: http://lkml.kernel.org/r/1420615903-28253-6-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-15x86/apic: Panic if kernel doesn't support x2apic but BIOS has enabled x2apicJiang Liu
When kernel doesn't support X2APIC but BIOS has enabled X2APIC, system may panic or hang without useful messages. On the other hand, it's hard to dynamically disable X2APIC when CONFIG_X86_X2APIC is disabled. So panic with a clear message in such a case. Now system panics as below when X2APIC is disabled and interrupt remapping is enabled: [ 0.316118] LAPIC pending interrupts after 512 EOI [ 0.322126] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1 [ 0.368655] Kernel panic - not syncing: timer doesn't work through Interrupt-remapped IO-APIC [ 0.378300] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.18.0+ #340 [ 0.385300] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRIVTIN1.86B.0051.L05.1406240953 06/24/2014 [ 0.396997] ffff88046dc03000 ffff88046c307dd8 ffffffff8179dada 00000000000043f2 [ 0.405629] ffffffff81a92158 ffff88046c307e58 ffffffff8179b757 0000000000000002 [ 0.414261] 0000000000000008 ffff88046c307e68 ffff88046c307e08 ffffffff813ad82b [ 0.422890] Call Trace: [ 0.425711] [<ffffffff8179dada>] dump_stack+0x45/0x57 [ 0.431533] [<ffffffff8179b757>] panic+0xc1/0x1f5 [ 0.436978] [<ffffffff813ad82b>] ? delay_tsc+0x3b/0x70 [ 0.442910] [<ffffffff8166fa2c>] panic_if_irq_remap+0x1c/0x20 [ 0.449524] [<ffffffff81d73645>] setup_IO_APIC+0x405/0x82e [ 0.464979] [<ffffffff81d6fcc2>] native_smp_prepare_cpus+0x2d9/0x31c [ 0.472274] [<ffffffff81d5d0ac>] kernel_init_freeable+0xd6/0x223 [ 0.479170] [<ffffffff81792ad0>] ? rest_init+0x80/0x80 [ 0.485099] [<ffffffff81792ade>] kernel_init+0xe/0xf0 [ 0.490932] [<ffffffff817a537c>] ret_from_fork+0x7c/0xb0 [ 0.497054] [<ffffffff81792ad0>] ? rest_init+0x80/0x80 [ 0.502983] ---[ end Kernel panic - not syncing: timer doesn't work through Interrupt-remapped IO-APIC System hangs as below when X2APIC and interrupt remapping are both disabled: [ 1.102782] pci 0000:00:02.0: System wakeup disabled by ACPI [ 1.109351] pci 0000:00:03.0: System wakeup disabled by ACPI [ 1.115915] pci 0000:00:03.2: System wakeup disabled by ACPI [ 1.122479] pci 0000:00:03.3: System wakeup disabled by ACPI [ 1.132274] pci 0000:00:1c.0: Enabling MPC IRBNCE [ 1.137620] pci 0000:00:1c.0: Intel PCH root port ACS workaround enabled [ 1.145239] pci 0000:00:1c.0: System wakeup disabled by ACPI [ 1.151790] pci 0000:00:1c.7: Enabling MPC IRBNCE [ 1.157128] pci 0000:00:1c.7: Intel PCH root port ACS workaround enabled [ 1.164748] pci 0000:00:1c.7: System wakeup disabled by ACPI [ 1.171447] pci 0000:00:1e.0: System wakeup disabled by ACPI [ 1.178612] acpiphp: Slot [8] registered [ 1.183095] pci 0000:00:02.0: PCI bridge to [bus 01] [ 1.188867] acpiphp: Slot [2] registered With this patch applied, the system panics in both cases with a proper panic message. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: David Rientjes <rientjes@google.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Oren Twaig <oren@scalemp.com> Link: http://lkml.kernel.org/r/1420615903-28253-5-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-15x86/apic: Clear stale x2apic modeThomas Gleixner
If x2apic got disabled on the kernel command line, then the following issue can happen: enable_IR_x2apic() .... x2apic_mode = 1; enable_x2apic(); if (x2apic_disabled) { __disable_x2apic(); return; } That leaves X2APIC disabled in hardware, but x2apic_mode stays 1. So all other code which checks x2apic_mode gets the wrong information. Set x2apic_mode to 0 after disabling it in hardware. This is just a hotfix. The proper solution is to rework this code so it has seperate functions for the initial setup on the boot processor and the secondary cpus, but that's beyond the scope of this fix. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: David Rientjes <rientjes@google.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Oren Twaig <oren@scalemp.com>
2015-01-15iommu, x86: Restructure setup of the irq remapping featureThomas Gleixner
enable_IR_x2apic() calls setup_irq_remapping_ops() which by default installs the intel dmar remapping ops and then calls the amd iommu irq remapping prepare callback to figure out whether we are running on an AMD machine with irq remapping hardware. Right after that it calls irq_remapping_prepare() which pointlessly checks: if (!remap_ops || !remap_ops->prepare) return -ENODEV; and then calls remap_ops->prepare() which is silly in the AMD case as it got called from setup_irq_remapping_ops() already a few microseconds ago. Simplify this and just collapse everything into irq_remapping_prepare(). The irq_remapping_prepare() remains still silly as it assigns blindly the intel ops, but that's not scope of this patch. The scope here is to move the preperatory work, i.e. memory allocations out of the atomic section which is required to enable irq remapping. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Borislav Petkov <bp@alien8.de> Acked-and-tested-by: Joerg Roedel <joro@8bytes.org> Cc: Tony Luck <tony.luck@intel.com> Cc: iommu@lists.linux-foundation.org Cc: Joerg Roedel <jroedel@suse.de> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Oren Twaig <oren@scalemp.com> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20141205084147.232633738@linutronix.de Link: http://lkml.kernel.org/r/1420615903-28253-2-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-01-14Merge tag 'uaccess_for_upstream' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost into asm-generic Merge "uaccess: fix sparse warning on get/put_user for bitwise types" from Michael S. Tsirkin: At the moment, if p and x are both tagged as bitwise types, some of get_user(x, p), put_user(x, p), __get_user(x, p), __put_user(x, p) might produce a sparse warning on many architectures. This is a false positive: *p on these architectures is loaded into long (typically using asm), then cast back to typeof(*p). When typeof(*p) is a bitwise type (which is uncommon), such a cast needs __force, otherwise sparse produces a warning. Some architectures already have the __force tag, add it where it's missing. I verified that adding these __force casts does not supress any useful warnings. Specifically, vhost wants to read/write bitwise types in userspace memory using get_user/put_user. At the moment this triggers sparse errors, since the value is passed through an integer. For example: __le32 __user *p; __u32 x; both put_user(x, p); and get_user(x, p); should be safe, but produce warnings on some architectures. While there, I noticed that a bunch of architectures violated coding style rules within uaccess macros. Included patches to fix them up. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> * tag 'uaccess_for_upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (37 commits) sparc32: nocheck uaccess coding style tweaks sparc64: nocheck uaccess coding style tweaks xtensa: macro whitespace fixes sh: macro whitespace fixes parisc: macro whitespace fixes m68k: macro whitespace fixes m32r: macro whitespace fixes frv: macro whitespace fixes cris: macro whitespace fixes avr32: macro whitespace fixes arm64: macro whitespace fixes arm: macro whitespace fixes alpha: macro whitespace fixes blackfin: macro whitespace fixes sparc64: uaccess_64 macro whitespace fixes sparc32: uaccess_32 macro whitespace fixes avr32: whitespace fix sh: fix put_user sparse errors metag: fix put_user sparse errors ia64: fix put_user sparse errors ...
2015-01-14crypto: aesni - Add support for 192 & 256 bit keys to AESNI RFC4106Timothy McCaffrey
These patches fix the RFC4106 implementation in the aesni-intel module so it supports 192 & 256 bit keys. Since the AVX support that was added to this module also only supports 128 bit keys, and this patch only affects the SSE implementation, changes were also made to use the SSE version if key sizes other than 128 are specified. RFC4106 specifies that 192 & 256 bit keys must be supported (section 8.4). Also, this should fix Strongswan issue 341 where the aesni module needs to be unloaded if 256 bit keys are used: http://wiki.strongswan.org/issues/341 This patch has been tested with Sandy Bridge and Haswell processors. With 128 bit keys and input buffers > 512 bytes a slight performance degradation was noticed (~1%). For input buffers of less than 512 bytes there was no performance impact. Compared to 128 bit keys, 256 bit key size performance is approx. .5 cycles per byte slower on Sandy Bridge, and .37 cycles per byte slower on Haswell (vs. SSE code). This patch has also been tested with StrongSwan IPSec connections where it worked correctly. I created this diff from a git clone of crypto-2.6.git. Any questions, please feel free to contact me. Signed-off-by: Timothy McCaffrey <timothy.mccaffrey@unisys.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-01-13x86: entry_64.S: fold SAVE_ARGS_IRQ macro into its sole userDenys Vlasenko
No code changes. This is a preparatory patch for change in "struct pt_regs" handling. CC: Linus Torvalds <torvalds@linux-foundation.org> CC: Oleg Nesterov <oleg@redhat.com> CC: "H. Peter Anvin" <hpa@zytor.com> CC: Andy Lutomirski <luto@amacapital.net> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: X86 ML <x86@kernel.org> CC: Alexei Starovoitov <ast@plumgrid.com> CC: Will Drewry <wad@chromium.org> CC: Kees Cook <keescook@chromium.org> CC: linux-kernel@vger.kernel.org Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net>
2015-01-13x86: ia32entry.S: fix wrong symbolic constant usage: R11->ARGOFFSETDenys Vlasenko
The values of these two constants are the same, the meaning is different. Acked-by: Borislav Petkov <bp@suse.de> CC: Linus Torvalds <torvalds@linux-foundation.org> CC: Oleg Nesterov <oleg@redhat.com> CC: "H. Peter Anvin" <hpa@zytor.com> CC: Borislav Petkov <bp@alien8.de> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: X86 ML <x86@kernel.org> CC: Alexei Starovoitov <ast@plumgrid.com> CC: Will Drewry <wad@chromium.org> CC: Kees Cook <keescook@chromium.org> CC: linux-kernel@vger.kernel.org Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net>