summaryrefslogtreecommitdiffstats
path: root/arch/x86
AgeCommit message (Collapse)Author
2014-11-16x86: Support PAT bit in pagetable dump for lower levelsJuergen Gross
Dumping page table protection bits is not correct for entries on levels 2 and 3 regarding the PAT bit, which is at a different position as on level 4. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-16-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-16x86: Clean up pgtable_types.hJuergen Gross
Remove no longer used defines from pgtable_types.h as they are not used any longer. Switch __PAGE_KERNEL_NOCACHE to use cache mode type instead of pte bits. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-15-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-16x86: Use new cache mode type in memtype related functionsJuergen Gross
Instead of directly using the cache mode bits in the pte switch to using the cache mode type. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-14-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-16x86: Use new cache mode type in mm/ioremap.cJuergen Gross
Instead of directly using the cache mode bits in the pte switch to using the cache mode type. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-13-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-16x86: Use new cache mode type in setting page attributesJuergen Gross
Instead of directly using the cache mode bits in the pte switch to using the cache mode type in the functions for modifying page attributes. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-12-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-16x86: Remove looking for setting of _PAGE_PAT_LARGE in pageattr.cJuergen Gross
When modifying page attributes via change_page_attr_set_clr() don't test for setting _PAGE_PAT_LARGE, as this is - never done - PAT support for large pages is not included in the kernel up to now Signed-off-by: Juergen Gross <jgross@suse.com> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-11-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-16x86: Use new cache mode type in track_pfn_remap() and track_pfn_insert()Juergen Gross
Instead of directly using the cache mode bits in the pte switch to using the cache mode type. As those are the main callers of lookup_memtype(), change this as well. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-10-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-16x86: Use new cache mode type in mm/iomap_32.cJuergen Gross
Instead of directly using the cache mode bits in the pte switch to using the cache mode type. This requires to change io_reserve_memtype() as well. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-9-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-16x86: Use new cache mode type in asm/pgtable.hJuergen Gross
Instead of directly using the cache mode bits in the pte switch to using the cache mode type. This requires changing some callers of is_new_memtype_allowed() to be changed as well. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-8-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-16x86: Use new cache mode type in arch/x86/mm/init_64.cJuergen Gross
Instead of directly using the cache mode bits in the pte switch to using the cache mode type. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-7-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-16x86: Use new cache mode type in arch/x86/pciJuergen Gross
Instead of directly using the cache mode bits in the pte switch to using the cache mode type. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-6-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-16x86: Use new cache mode type in include/asm/fb.hJuergen Gross
Instead of directly using cache mode bits in the pte switch to usage of the new cache mode type. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-3-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-16x86: Make page cache mode a real typeJuergen Gross
At the moment there are a lot of places that handle setting or getting the page cache mode by treating the pgprot bits equal to the cache mode. This is only true because there are a lot of assumptions about the setup of the PAT MSR. Otherwise the cache type needs to get translated into pgprot bits and vice versa. This patch tries to prepare for that by introducing a separate type for the cache mode and adding functions to translate between those and pgprot values. To avoid too much performance penalty the translation between cache mode and pgprot values is done via tables which contain the relevant information. Write-back cache mode is hard-wired to be 0, all other modes are configurable via those tables. For large pages there are translation functions as the PAT bit is located at different positions in the ptes of 4k and large pages. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-2-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-16Merge branch 'sched/urgent' into sched/core, to pick up fixes before ↵Ingo Molnar
applying more changes Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-16Merge tag 'efi-next' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/efi Pull EFI updates for v3.19 from Matt Fleming: - Support module unload for efivarfs - Mathias Krause - Another attempt at moving x86 to libstub taking advantage of the __pure attribute - Ard Biesheuvel - Add EFI runtime services section to ptdump - Mathias Krause Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-16perf/x86/intel/uncore: Fix boot crash on SBOX PMU on Haswell-EPAndi Kleen
There were several reports that on some systems writing the SBOX0 PMU initialization MSR would #GP at boot. This did not happen on all systems -- my two test systems booted fine. Writing the three initialization bits bit-by-bit seems to avoid the problem. So add a special callback to do just that. This replaces an earlier patch that disabled the SBOX. Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Reported-and-Tested-by: Patrick Lu <patrick.lu@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Link: http://lkml.kernel.org/r/1415062828-19759-4-git-send-email-andi@firstfloor.org [ Fixed a whitespace error and added attribution tags that were left out inexplicably. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-16perf/x86/intel/uncore: Fix IRP uncore register offsets on Haswell EPAndi Kleen
The counter register offsets for the IRP box PMU for Haswell-EP were incorrect. The offsets actually changed over IvyBridge EP. Fix them to the correct values. For this we need to fork the read function from the IVB and use an own counter array. Tested-by: patrick.lu@intel.com Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Link: http://lkml.kernel.org/r/1415062828-19759-3-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-14kvm: x86: increase user memory slots to 509Igor Mammedov
With the 3 private slots, this gives us 512 slots total. Motivation for this is in addition to assigned devices support more memory hotplug slots, where 1 slot is used by a hotplugged memory stick. It will allow to support upto 256 hotplug memory slots and leave 253 slots for assigned devices and other devices that use them. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-13kvm: svm: move WARN_ON in svm_adjust_tsc_offsetChris J Arges
When running the tsc_adjust kvm-unit-test on an AMD processor with the IA32_TSC_ADJUST feature enabled, the WARN_ON in svm_adjust_tsc_offset can be triggered. This WARN_ON checks for a negative adjustment in case __scale_tsc is called; however it may trigger unnecessary warnings. This patch moves the WARN_ON to trigger only if __scale_tsc will actually be called from svm_adjust_tsc_offset. In addition make adj in kvm_set_msr_common s64 since this can have signed values. Signed-off-by: Chris J Arges <chris.j.arges@canonical.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-12cpuidle: Invert CPUIDLE_FLAG_TIME_VALID logicDaniel Lezcano
The only place where the time is invalid is when the ACPI_CSTATE_FFH entry method is not set. Otherwise for all the drivers, the time can be correctly measured. Instead of duplicating the CPUIDLE_FLAG_TIME_VALID flag in all the drivers for all the states, just invert the logic by replacing it by the flag CPUIDLE_FLAG_TIME_INVALID, hence we can set this flag only for the acpi idle driver, remove the former flag from all the drivers and invert the logic with this flag in the different governor. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-11-12x86, kvm, vmx: Don't set LOAD_IA32_EFER when host and guest matchAndy Lutomirski
There's nothing to switch if the host and guest values are the same. I am unable to find evidence that this makes any difference whatsoever. Signed-off-by: Andy Lutomirski <luto@amacapital.net> [I could see a difference on Nehalem. From 5 runs: userspace exit, guest!=host 12200 11772 12130 12164 12327 userspace exit, guest=host 11983 11780 11920 11919 12040 lightweight exit, guest!=host 3214 3220 3238 3218 3337 lightweight exit, guest=host 3178 3193 3193 3187 3220 This passes the t-test with 99% confidence for userspace exit, 98.5% confidence for lightweight exit. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-12perf/x86/amd/ibs: Update IBS MSRs and feature definitionsAravind Gopalakrishnan
New Fam15h models carry extra feature bits and extend the MSR register space for IBS ops. Adding them here. While at it, add functionality to read IbsBrTarget and OpData4 depending on their availability if user wants a PERF_SAMPLE_RAW. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Len Brown <len.brown@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <paulus@samba.org> Cc: <acme@kernel.org> Link: http://lkml.kernel.org/r/1415651066-13523-1-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linuxHerbert Xu
Merging 3.18-rc4 in order to pick up the memzero_explicit helper.
2014-11-12Merge tag 'x86_queue_for_3.19' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/cleanups Pull two minor cleanups from Borislav Petkov. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-12Merge tag 'v3.18-rc4' into x86/cleanups, to refresh the tree before pulling ↵Ingo Molnar
new changes. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-12x86, kvm, vmx: Always use LOAD_IA32_EFER if availableAndy Lutomirski
At least on Sandy Bridge, letting the CPU switch IA32_EFER is much faster than switching it manually. I benchmarked this using the vmexit kvm-unit-test (single run, but GOAL multiplied by 5 to do more iterations): Test Before After Change cpuid 2000 1932 -3.40% vmcall 1914 1817 -5.07% mov_from_cr8 13 13 0.00% mov_to_cr8 19 19 0.00% inl_from_pmtimer 19164 10619 -44.59% inl_from_qemu 15662 10302 -34.22% inl_from_kernel 3916 3802 -2.91% outl_to_kernel 2230 2194 -1.61% mov_dr 172 176 2.33% ipi (skipped) (skipped) ipi+halt (skipped) (skipped) ple-round-robin 13 13 0.00% wr_tsc_adjust_msr 1920 1845 -3.91% rd_tsc_adjust_msr 1892 1814 -4.12% mmio-no-eventfd:pci-mem 16394 11165 -31.90% mmio-wildcard-eventfd:pci-mem 4607 4645 0.82% mmio-datamatch-eventfd:pci-mem 4601 4610 0.20% portio-no-eventfd:pci-io 11507 7942 -30.98% portio-wildcard-eventfd:pci-io 2239 2225 -0.63% portio-datamatch-eventfd:pci-io 2250 2234 -0.71% I haven't explicitly computed the significance of these numbers, but this isn't subtle. Signed-off-by: Andy Lutomirski <luto@amacapital.net> [The results were reproducible on all of Nehalem, Sandy Bridge and Ivy Bridge. The slowness of manual switching is because writing to EFER with WRMSR triggers a TLB flush, even if the only bit you're touching is SCE (so the page table format is not affected). Doing the write as part of vmentry/vmexit, instead, does not flush the TLB, probably because all processors that have EPT also have VPID. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-12Merge tag 'v3.18-rc4' into drm-nextDave Airlie
backmerge to get vmwgfx locking changes into next as the conflict with per-plane locking.
2014-11-12intel_pstate: Add support for HWPDirk Brandewie
Add support of Hardware Managed Performance States (HWP) described in Volume 3 section 14.4 of the SDM. With HWP enbaled intel_pstate will no longer be responsible for selecting P states for the processor. intel_pstate will continue to register to the cpufreq core as the scaling driver for CPUs implementing HWP. In HWP mode intel_pstate provides three functions reporting frequency to the cpufreq core, support for the set_policy() interface from the core and maintaining the intel_pstate sysfs interface in /sys/devices/system/cpu/intel_pstate. User preferences expressed via the set_policy() interface or the sysfs interface are forwared to the CPU via the HWP MSR interface. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-11-12x86: Add support for Intel HWP feature detection.Dirk Brandewie
Add support of Hardware Managed Performance States (HWP) described in Volume 3 section 14.4 of the SDM. One bit CPUID.06H:EAX[bit 7] expresses the presence of the HWP feature on the processor. The remaining bits CPUID.06H:EAX[bit 8-11] denote the presense of various HWP features. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-11-11x86, ptdump: Add section for EFI runtime servicesMathias Krause
In commit 3891a04aafd6 ("x86-64, espfix: Don't leak bits 31:16 of %esp returning..") the "ESPFix Area" was added to the page table dump special sections. That area, though, has a limited amount of entries printed. The EFI runtime services are, unfortunately, located in-between the espfix area and the high kernel memory mapping. Due to the enforced limitation for the espfix area, the EFI mappings won't be printed in the page table dump. To make the ESP runtime service mappings visible again, provide them a dedicated entry. Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-11-11efi/x86: Move x86 back to libstubArd Biesheuvel
This reverts commit 84be880560fb, which itself reverted my original attempt to move x86 from #include'ing .c files from across the tree to using the EFI stub built as a static library. The issue that affected the original approach was that splitting the implementation into several .o files resulted in the variable 'efi_early' becoming a global with external linkage, which under -fPIC implies that references to it must go through the GOT. However, dealing with this additional GOT entry turned out to be troublesome on some EFI implementations. (GCC's visibility=hidden attribute is supposed to lift this requirement, but it turned out not to work on the 32-bit build.) Instead, use a pure getter function to get a reference to efi_early. This approach results in no additional GOT entries being generated, so there is no need for any changes in the early GOT handling. Tested-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-11-11Revert "PCI: Add x86_msi.msi_mask_irq() and msix_mask_irq()"Yijing Wang
The problem fixed by 0e4ccb1505a9 ("PCI: Add x86_msi.msi_mask_irq() and msix_mask_irq()") has been fixed in a simpler way by a previous commit ("PCI/MSI: Add pci_msi_ignore_mask to prevent writes to MSI/MSI-X Mask Bits"). The msi_mask_irq() and msix_mask_irq() x86_msi_ops added by 0e4ccb1505a9 are no longer needed, so revert the commit. default_msi_mask_irq() and default_msix_mask_irq() were added by 0e4ccb1505a9 and are still used by s390, so keep them for now. [bhelgaas: changelog] Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: David Vrabel <david.vrabel@citrix.com> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> CC: xen-devel@lists.xenproject.org
2014-11-11Merge branch 'io' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into asm-generic * 'io' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux: documentation: memory-barriers: clarify relaxed io accessor semantics x86: io: implement dummy relaxed accessor macros for writes tile: io: implement dummy relaxed accessor macros for writes sparc: io: implement dummy relaxed accessor macros for writes powerpc: io: implement dummy relaxed accessor macros for writes parisc: io: implement dummy relaxed accessor macros for writes mn10300: io: implement dummy relaxed accessor macros for writes m68k: io: implement dummy relaxed accessor macros for writes m32r: io: implement dummy relaxed accessor macros for writes ia64: io: implement dummy relaxed accessor macros for writes cris: io: implement dummy relaxed accessor macros for writes frv: io: implement dummy relaxed accessor macros for writes xtensa: io: remove dummy relaxed accessor macros for reads s390: io: remove dummy relaxed accessor macros for reads microblaze: io: remove dummy relaxed accessor macros asm-generic: io: implement relaxed accessor macros as conditional wrappers Conflicts: include/asm-generic/io.h Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2014-11-11ftrace: Add more information to ftrace_bug() outputSteven Rostedt (Red Hat)
With the introduction of the dynamic trampolines, it is useful that if things go wrong that ftrace_bug() produces more information about what the current state is. This can help debug issues that may arise. Ftrace has lots of checks to make sure that the state of the system it touchs is exactly what it expects it to be. When it detects an abnormality it calls ftrace_bug() and disables itself to prevent any further damage. It is crucial that ftrace_bug() produces sufficient information that can be used to debug the situation. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Borislav Petkov <bp@suse.de> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-11ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolinesSteven Rostedt (Red Hat)
When the static ftrace_ops (like function tracer) enables tracing, and it is the only callback that is referencing a function, a trampoline is dynamically allocated to the function that calls the callback directly instead of calling a loop function that iterates over all the registered ftrace ops (if more than one ops is registered). But when it comes to dynamically allocated ftrace_ops, where they may be freed, on a CONFIG_PREEMPT kernel there's no way to know when it is safe to free the trampoline. If a task was preempted while executing on the trampoline, there's currently no way to know when it will be off that trampoline. But this is not true when it comes to !CONFIG_PREEMPT. The current method of calling schedule_on_each_cpu() will force tasks off the trampoline, becaues they can not schedule while on it (kernel preemption is not configured). That means it is safe to free a dynamically allocated ftrace ops trampoline when CONFIG_PREEMPT is not configured. Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Borislav Petkov <bp@suse.de> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-11x86, CPU, AMD: Move K8 TLB flush filter workaround to K8 codeBorislav Petkov
This belongs with the rest of the code in init_amd_k8() which gets executed on family 0xf. Signed-off-by: Borislav Petkov <bp@suse.de>
2014-11-11x86, espfix: Remove stale ptemaskBorislav Petkov
Previous versions of the espfix had a single function which did setup the pagetables. It was later split into BSP and AP version. Drop unused leftovers after that split. Signed-off-by: Borislav Petkov <bp@suse.de>
2014-11-10Merge tag 'microcode_fixes_for_3.18' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/urgent Pull two fixes for early microcode loader on 32-bit from Borislav Petkov: - access the dis_ucode_ldr chicken bit properly - fix patch stashing on AMD on 32-bit Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-10/dev/mem: Use more consistent data typesThierry Reding
The xlate_dev_{kmem,mem}_ptr() functions take either a physical address or a kernel virtual address, so data types should be phys_addr_t and void *. They both return a kernel virtual address which is only ever used in calls to copy_{from,to}_user(), so make variables that store it void * rather than char * for consistency. Also only define a weak unxlate_dev_mem_ptr() function if architectures haven't overridden them in the asm/io.h header file. Signed-off-by: Thierry Reding <treding@nvidia.com>
2014-11-10KVM: x86: fix warning on 32-bit compilationPaolo Bonzini
PCIDs are only supported in 64-bit mode. No need to clear bit 63 of CR3 unless the host is 64-bit. Reported by Fengguang Wu's autobuilder. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-10x86, microcode, AMD: Fix ucode patch stashing on 32-bitBorislav Petkov
Save the patch while we're running on the BSP instead of later, before the initrd has been jettisoned. More importantly, on 32-bit we need to access the physical address instead of the virtual. This way we actually do find it on the APs instead of having to go through the initrd each time. Tested-by: Richard Hendershot <rshendershot@mchsi.com> Fixes: 5335ba5cf475 ("x86, microcode, AMD: Fix early ucode loading") Cc: <stable@vger.kernel.org> # v3.13+ Signed-off-by: Borislav Petkov <bp@suse.de>
2014-11-10x86/core, x86/xen/smp: Use 'die_complete' completion when taking CPU downBoris Ostrovsky
Commit 2ed53c0d6cc9 ("x86/smpboot: Speed up suspend/resume by avoiding 100ms sleep for CPU offline during S3") introduced completions to CPU offlining process. These completions are not initialized on Xen kernels causing a panic in play_dead_common(). Move handling of die_complete into common routines to make them available to Xen guests. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Cc: tianyu.lan@intel.com Cc: konrad.wilk@oracle.com Cc: xen-devel@lists.xenproject.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1414770572-7950-1-git-send-email-boris.ostrovsky@oracle.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-10x86_64/vsyscall: Restore orig_ax after vsyscall seccompAndy Lutomirski
The vsyscall emulation code sets orig_ax for seccomp's benefit, but it forgot to set it back. I'm not sure that this is observable at all, but it could cause confusion to various /proc or ptrace users, and it's possible that it could cause minor artifacts if a signal were to be delivered on return from vsyscall emulation. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/cdc6a564517a4df09235572ee5f530ccdcf933f7.1415144089.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-10x86_64: Add a comment explaining the TASK_SIZE_MAX guard pageAndy Lutomirski
That guard page is absolutely necessary; explain why for posterity. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/23320cb5017c2da8475ec20fcde8089d82aa2699.1415144745.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-08kvm: x86: add trace event for pvclock updatesDavid Matlack
The new trace event records: * the id of vcpu being updated * the pvclock_vcpu_time_info struct being written to guest memory This is useful for debugging pvclock bugs, such as the bug fixed by "[PATCH] kvm: x86: Fix kvm clock versioning.". Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-08kvm: x86: Fix kvm clock versioning.Owen Hofmann
kvm updates the version number for the guest paravirt clock structure by incrementing the version of its private copy. It does not read the guest version, so will write version = 2 in the first update for every new VM, including after restoring a saved state. If guest state is saved during reading the clock, it could read and accept struct fields and guest TSC from two different updates. This changes the code to increment the guest version and write it back. Signed-off-by: Owen Hofmann <osh@google.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-08KVM: x86: MOVNTI emulation min opsize is not respectedNadav Amit
Commit 3b32004a66e9 ("KVM: x86: movnti minimum op size of 32-bit is not kept") did not fully fix the minimum operand size of MONTI emulation. Still, MOVNTI may be mistakenly performed using 16-bit opsize. This patch add No16 flag to mark an instruction does not support 16-bits operand size. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-08KVM: x86: update masterclock values on TSC writesMarcelo Tosatti
When the guest writes to the TSC, the masterclock TSC copy must be updated as well along with the TSC_OFFSET update, otherwise a negative tsc_timestamp is calculated at kvm_guest_time_update. Once "if (!vcpus_matched && ka->use_master_clock)" is simplified to "if (ka->use_master_clock)", the corresponding "if (!ka->use_master_clock)" becomes redundant, so remove the do_request boolean and collapse everything into a single condition. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-08KVM: x86: Return UNHANDLABLE on unsupported SYSENTERNadav Amit
Now that KVM injects #UD on "unhandlable" error, it makes better sense to return such error on sysenter instead of directly injecting #UD to the guest. This allows to track more easily the unhandlable cases the emulator does not support. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-08KVM: x86: Warn on APIC base relocationNadav Amit
APIC base relocation is unsupported by KVM. If anyone uses it, the least should be to report a warning in the hypervisor. Note that KVM-unit-tests uses this feature for some reason, so running the tests triggers the warning. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>