summaryrefslogtreecommitdiffstats
path: root/arch/x86
AgeCommit message (Collapse)Author
2011-01-12KVM: Send async PF when guest is not in userspace too.Gleb Natapov
If guest indicates that it can handle async pf in kernel mode too send it, but only if interrupts are enabled. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: Let host know whether the guest can handle async PF in non-userspace ↵Gleb Natapov
context. If guest can detect that it runs in non-preemptable context it can handle async PFs at any time, so let host know that it can send async PF even if guest cpu is not in userspace. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM paravirt: Handle async PF in non preemptable contextGleb Natapov
If async page fault is received by idle task or when preemp_count is not zero guest cannot reschedule, so do sti; hlt and wait for page to be ready. vcpu can still process interrupts while it waits for the page to be ready. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: Inject asynchronous page fault into a PV guest if page is swapped out.Gleb Natapov
Send async page fault to a PV guest if it accesses swapped out memory. Guest will choose another task to run upon receiving the fault. Allow async page fault injection only when guest is in user mode since otherwise guest may be in non-sleepable context and will not be able to reschedule. Vcpu will be halted if guest will fault on the same page again or if vcpu executes kernel code. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: Handle async PF in a guest.Gleb Natapov
When async PF capability is detected hook up special page fault handler that will handle async page fault events and bypass other page faults to regular page fault handler. Also add async PF handling to nested SVM emulation. Async PF always generates exit to L1 where vcpu thread will be scheduled out until page is available. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM paravirt: Add async PF initialization to PV guest.Gleb Natapov
Enable async PF in a guest if async PF capability is discovered. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: Add PV MSR to enable asynchronous page faults delivery.Gleb Natapov
Guest enables async PF vcpu functionality using this MSR. Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM paravirt: Move kvm_smp_prepare_boot_cpu() from kvmclock.c to kvm.c.Gleb Natapov
Async PF also needs to hook into smp_prepare_boot_cpu so move the hook into generic code. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: Add memory slot versioning and use it to provide fast guest write interfaceGleb Natapov
Keep track of memslots changes by keeping generation number in memslots structure. Provide kvm_write_guest_cached() function that skips gfn_to_hva() translation if memslots was not changed since previous invocation. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: Retry fault before vmentryGleb Natapov
When page is swapped in it is mapped into guest memory only after guest tries to access it again and generate another fault. To save this fault we can map it immediately since we know that guest is going to access the page. Do it only when tdp is enabled for now. Shadow paging case is more complicated. CR[034] and EFER registers should be switched before doing mapping and then switched back. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12KVM: Halt vcpu if page it tries to access is swapped outGleb Natapov
If a guest accesses swapped out memory do not swap it in from vcpu thread context. Schedule work to do swapping and put vcpu into halted state instead. Interrupts will still be delivered to the guest and if interrupt will cause reschedule guest will continue to run another task. [avi: remove call to get_user_pages_noio(), nacked by Linus; this makes everything synchrnous again] Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12ACPI, APEI, Generic Hardware Error Source POLL/IRQ/NMI notification type supportHuang Ying
Generic Hardware Error Source provides a way to report platform hardware errors (such as that from chipset). It works in so called "Firmware First" mode, that is, hardware errors are reported to firmware firstly, then reported to Linux by firmware. This way, some non-standard hardware error registers or non-standard hardware link can be checked by firmware to produce more valuable hardware error information for Linux. This patch adds POLL/IRQ/NMI notification types support. Because the memory area used to transfer hardware error information from BIOS to Linux can be determined only in NMI, IRQ or timer handler, but general ioremap can not be used in atomic context, so a special version of atomic ioremap is implemented for that. Known issue: - Error information can not be printed for recoverable errors notified via NMI, because printk is not NMI-safe. Will fix this via delay printing to IRQ context via irq_work or make printk NMI-safe. v2: - adjust printk format per comments. Signed-off-by: Huang Ying <ying.huang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-11xen/p2m: Fix module linking error.Konrad Rzeszutek Wilk
Fixes: ERROR: "m2p_find_override_pfn" [drivers/block/xen-blkfront.ko] undefined! Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-01-11xen p2m: clear the old pte when adding a page to m2p_overrideJeremy Fitzhardinge
When adding a page to m2p_override we change the p2m of the page so we need to also clear the old pte of the kernel linear mapping because it doesn't correspond anymore. When we remove the page from m2p_override we restore the original p2m of the page and we also restore the old pte of the kernel linear mapping. Before changing the p2m mappings in m2p_add_override and m2p_remove_override, check that the page passed as argument is valid and return an error if it is not. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-01-11xen p2m: transparently change the p2m mappings in the m2p overrideStefano Stabellini
In m2p_add_override store the original mfn into page->index and then change the p2m mapping, setting mfns as FOREIGN_FRAME. In m2p_remove_override restore the original mapping. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-01-11xen: add m2p override mechanismJeremy Fitzhardinge
Add a simple hashtable based mechanism to override some portions of the m2p, so that we can find out the pfn corresponding to an mfn of a granted page. In fact entries corresponding to granted pages in the m2p hold the original pfn value of the page in the source domain that granted it. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-01-11xen: move p2m handling to separate fileJeremy Fitzhardinge
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-01-11Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix Moorestown VRTC fixmap placement x86/gpio: Implement x86 gpio_to_irq convert function x86, UV: Fix APICID shift for Westmere processors x86: Use PCI method for enabling AMD extended config space before MSR method x86: tsc: Prevent delayed init if initial tsc calibration failed x86, lapic-timer: Increase the max_delta to 31 bits x86: Fix sparse non-ANSI function warnings in smpboot.c x86, numa: Fix CONFIG_DEBUG_PER_CPU_MAPS without NUMA emulation x86, AMD, PCI: Add AMD northbridge PCI device id for CPU families 12h and 14h x86, numa: Fix cpu to node mapping for sparse node ids x86, numa: Fake node-to-cpumask for NUMA emulation x86, numa: Fake apicid and pxm mappings for NUMA emulation x86, numa: Avoid compiling NUMA emulation functions without CONFIG_NUMA_EMU x86, numa: Reduce minimum fake node size to 32M Fix up trivial conflict in arch/x86/kernel/apic/x2apic_uv_x.c
2011-01-11Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (28 commits) perf session: Fix infinite loop in __perf_session__process_events perf evsel: Support perf_evsel__open(cpus > 1 && threads > 1) perf sched: Use PTHREAD_STACK_MIN to avoid pthread_attr_setstacksize() fail perf tools: Emit clearer message for sys_perf_event_open ENOENT return perf stat: better error message for unsupported events perf sched: Fix allocation result check perf, x86: P4 PMU - Fix unflagged overflows handling dynamic debug: Fix build issue with older gcc tracing: Fix TRACE_EVENT power tracepoint creation tracing: Fix preempt count leak tracepoint: Add __rcu annotation tracing: remove duplicate null-pointer check in skb tracepoint tracing/trivial: Add missing comma in TRACE_EVENT comment tracing: Include module.h in define_trace.h x86: Save rbp in pt_regs on irq entry x86, dumpstack: Fix unused variable warning x86, NMI: Clean-up default_do_nmi() x86, NMI: Allow NMI reason io port (0x61) to be processed on any CPU x86, NMI: Remove DIE_NMI_IPI x86, NMI: Add priorities to handlers ...
2011-01-11x86,percpu: Move out of place 64 bit ops into X86_64 sectionChristoph Lameter
Some operations that operate on 64 bit operands are defined for 32 bit. Move them into the correct section. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2011-01-11x86: Fix Moorestown VRTC fixmap placementArjan van de Ven
The x86 fixmaps need to be all together... unfortunately the VRTC one was misplaced. This patch makes sure the MRST VRTC fixmap is put prior to the __end_of_permanent_fixed_addresses marker. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> LKML-Reference: <20110111105544.24448.27607.stgit@bob.linux.org.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-11x86/gpio: Implement x86 gpio_to_irq convert functionAlek Du
We need this for x86 MID platforms where GPIO interrupts are used. No special magic is needed so the default 1:1 behaviour will do nicely. Signed-off-by: Alek Du <alek.du@intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> LKML-Reference: <20110111105439.24448.69863.stgit@bob.linux.org.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-11x86, UV: Fix APICID shift for Westmere processorsJack Steiner
Westmere processors use a different algorithm for assigning APICIDs on SGI UV systems. The location of the node number within the apicid is now a function of the processor type. Signed-off-by: Jack Steiner <steiner@sgi.com> LKML-Reference: <20110110195210.GA18737@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-11x86: Use PCI method for enabling AMD extended config space before MSR methodJan Beulich
While both methods should work equivalently well for the native case, the Xen Dom0 case can't reliably work with the MSR one, since there's no guarantee that the virtual CPUs it has available fully cover all necessary physical ones. As per the suggestion of Robert Richter the patch only adds the PCI method, but leaves the MSR one as a fallback to cover new systems the PCI IDs of which may not have got added to the code base yet. The only change in v2 is the breaking out of the new CPI initialization method into a separate function, as requested by Ingo. Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Robert Richter <robert.richter@amd.com> Cc: Andreas Herrmann3 <Andreas.Herrmann3@amd.com> Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> LKML-Reference: <4D2B3FD7020000780002B67D@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-11x86: tsc: Prevent delayed init if initial tsc calibration failedThomas Gleixner
commit a8760ec (x86: Check tsc available/disabled in the delayed init function) missed to prevent the setup of the delayed init function in case the initial tsc calibration failed. This results in the same divide by zero bug as we have seen without the tsc disabled check. Skip the delayed work setup when tsc_khz (the initial calibration value) is 0. Bisected-and-tested-by: Kirill A. Shutemov <kas@openvz.org> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-01-10Merge branch 'stable/bug-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/bug-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/event: validate irq before get evtchn by irq xen/fb: fix potential memory leak xen/fb: fix xenfb suspend/resume race. xen: disable ACPI NUMA for PV guests xen/irq: Cleanup the find_unbound_irq
2011-01-10Merge branch 'stable/generic' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/generic' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: HVM X2APIC support apic: Move hypervisor detection of x2apic to hypervisor.h
2011-01-10xen: disable ACPI NUMA for PV guestsIan Campbell
Xen does not currently expose PV-NUMA information to PV guests. Therefore disable NUMA for the time being to prevent the kernel picking up on an host-level NUMA information which it might come across in the firmware. [ Added comment - Jeremy ] Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-01-10x86, lapic-timer: Increase the max_delta to 31 bitsPierre Tardy
Latest atom socs(penwell) does not have hpet timer. As their local APIC timer is clocked at 400KHZ, and the current code limit their Initial Counter register to 23 bits, they cannot sleep more than 1.34 seconds which leads to ~2 spurious wakeup per second (1 per thread) These SOCs support 32bit timer so we change the max_delta to at least 31bits. So we can at least sleep for 300 seconds. We could not find any previous chip errata where lapic would only have 23 bit precision As powertop is suggesting to activate HPET to "sleep longer", this could mean this problem is already known. Problem is here since very first implementation of lapic timer as a clock event e9e2cdb [PATCH] clockevents: i386 drivers. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Pierre Tardy <pierre.tardy@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Adrian Bunk <bunk@stusta.de> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Andi Kleen <ak@suse.de> LKML-Reference: <1294327409-19426-1-git-send-email-pierre.tardy@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-10Merge branch 'x86/numa' into x86/urgentIngo Molnar
Merge reason: Topic is ready for upstream. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-10Merge branch 'x86/apic-cleanups' into x86/urgentIngo Molnar
Merge reason: Topic is ready for upstream. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-09Merge branch 'tip/perf/core' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent
2011-01-09perf, x86: P4 PMU - Fix unflagged overflows handlingCyrill Gorcunov
Don found that P4 PMU reads CCCR register instead of counter itself (in attempt to catch unflagged event) this makes P4 NMI handler to consume all NMIs it observes. So the other NMI users such as kgdb simply have no chance to get NMI on their hands. Side note: at moment there is no way to run nmi-watchdog together with perf tool. This is because both 'perf top' and nmi-watchdog use same event. So while nmi-watchdog reserves one event/counter for own needs there is no room for perf tool left (there is a way to disable nmi-watchdog on boot of course). Ming has tested this patch with the following results | 1. watchdog disabled | | kgdb tests on boot OK | perf works OK | | 2. watchdog enabled, without patch perf-x86-p4-nmi-4 | | kgdb tests on boot hang | | 3. watchdog enabled, without patch perf-x86-p4-nmi-4 and do not run kgdb | tests on boot | | "perf top" partialy works | cpu-cycles no | instructions yes | cache-references no | cache-misses no | branch-instructions no | branch-misses yes | bus-cycles no | | 4. watchdog enabled, with patch perf-x86-p4-nmi-4 applied | | kgdb tests on boot OK | perf does not work, NMI "Dazed and confused" messages show up | Which means we still have problems with p4 box due to 'unknown' nmi happens but at least it should fix kgdb test cases. Reported-by: Jason Wessel <jason.wessel@windriver.com> Reported-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Don Zickus <dzickus@redhat.com> Acked-by: Lin Ming <ming.m.lin@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4D275E7E.3040903@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-09x86: Fix sparse non-ANSI function warnings in smpboot.cRandy Dunlap
Fix sparse warning for non-ANSI function declaration: arch/x86/kernel/smpboot.c:100:30: warning: non-ANSI function declaration of function 'cpu_hotplug_driver_lock' arch/x86/kernel/smpboot.c:105:32: warning: non-ANSI function declaration of function 'cpu_hotplug_driver_unlock' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> LKML-Reference: <20110108195914.95d366ea.randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07Merge branch 'for-2.6.38' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (30 commits) gameport: use this_cpu_read instead of lookup x86: udelay: Use this_cpu_read to avoid address calculation x86: Use this_cpu_inc_return for nmi counter x86: Replace uses of current_cpu_data with this_cpu ops x86: Use this_cpu_ops to optimize code vmstat: User per cpu atomics to avoid interrupt disable / enable irq_work: Use per cpu atomics instead of regular atomics cpuops: Use cmpxchg for xchg to avoid lock semantics x86: this_cpu_cmpxchg and this_cpu_xchg operations percpu: Generic this_cpu_cmpxchg() and this_cpu_xchg support percpu,x86: relocate this_cpu_add_return() and friends connector: Use this_cpu operations xen: Use this_cpu_inc_return taskstats: Use this_cpu_ops random: Use this_cpu_inc_return fs: Use this_cpu_inc_return in buffer.c highmem: Use this_cpu_xx_return() operations vmstat: Use this_cpu_inc_return for vm statistics x86: Support for this_cpu_add, sub, dec, inc_return percpu: Generic support for this_cpu_add, sub, dec, inc_return ... Fixed up conflicts: in arch/x86/kernel/{apic/nmi.c, apic/x2apic_uv_x.c, process.c} as per Tejun.
2011-01-07x86: Save rbp in pt_regs on irq entryFrederic Weisbecker
From the x86_64 low level interrupt handlers, the frame pointer is saved right after the partial pt_regs frame. rbp is not supposed to be part of the irq partial saved registers, but it only requires to extend the pt_regs frame by 8 bytes to do so, plus a tiny stack offset fixup on irq exit. This changes a bit the semantics or get_irq_entry() that is supposed to provide only the value of caller saved registers and the cpu saved frame. However it's a win for unwinders that can walk through stack frames on top of get_irq_regs() snapshots. A noticeable impact is that it makes perf events cpu-clock and task-clock events based callchains working on x86_64. Let's then save rbp into the irq pt_regs. As a result with: perf record -e cpu-clock perf bench sched messaging perf report --stdio Before: 20.94% perf [kernel.kallsyms] [k] lock_acquire | --- lock_acquire | |--44.01%-- __write_nocancel | |--43.18%-- __read | |--6.08%-- fork | create_worker | |--0.88%-- _dl_fixup | |--0.65%-- do_lookup_x | |--0.53%-- __GI___libc_read --4.67%-- [...] After: 19.23% perf [kernel.kallsyms] [k] __lock_acquire | --- __lock_acquire | |--97.74%-- lock_acquire | | | |--21.82%-- _raw_spin_lock | | | | | |--37.26%-- unix_stream_recvmsg | | | sock_aio_read | | | do_sync_read | | | vfs_read | | | sys_read | | | system_call | | | __read | | | | | |--24.09%-- unix_stream_sendmsg | | | sock_aio_write | | | do_sync_write | | | vfs_write | | | sys_write | | | system_call | | | __write_nocancel v2: Fix cfi annotations. Reported-by: Soeren Sandmann Pedersen <sandmann@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: Jan Beulich <JBeulich@novell.com>
2011-01-07x86, dumpstack: Fix unused variable warningRakib Mullick
In dump_stack function, bp isn't used anymore, which is introduced by commit 9c0729dc8062bed96189bd14ac6d4920f3958743. This patch removes bp completely. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Cc: Soeren Sandmann <sandmann@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <AANLkTik9U_Z0WSZ7YjrykER_pBUfPDdgUUmtYx=R74nL@mail.gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2011-01-07xen: HVM X2APIC supportSheng Yang
This patch is similiar to Gleb Natapov's patch for KVM, which enable the hypervisor to emulate x2apic feature for the guest. By this way, the emulation of lapic would be simpler with x2apic interface(MSR), and faster. [v2: Re-organized 'xen_hvm_need_lapic' per Ian Campbell suggestion] Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-01-07apic: Move hypervisor detection of x2apic to hypervisor.hSheng Yang
Then we can reuse it for Xen later. Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Avi Kivity <avi@redhat.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-01-07x86, NMI: Clean-up default_do_nmi()Don Zickus
Just re-arrange the code a bit to make it easier to follow what is going on. Basically un-negating the if-statement and swapping the code inside the if-statement with code outside. No functional changes. Originally-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-7-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07x86, NMI: Allow NMI reason io port (0x61) to be processed on any CPUDon Zickus
In original NMI handler, NMI reason io port (0x61) is only processed on BSP. This makes it impossible to hot-remove BSP. To solve the issue, a raw spinlock is used to allow the port to be processed on any CPU. Originally-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-6-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07x86, NMI: Remove DIE_NMI_IPIDon Zickus
With priorities in place and no one really understanding the difference between DIE_NMI and DIE_NMI_IPI, just remove DIE_NMI_IPI and convert everyone to DIE_NMI. This also simplifies default_do_nmi() a little bit. Instead of calling the die_notifier in both the if and else part, just pull it out and call it before the if-statement. This has the side benefit of avoiding a call to the ioport to see if there is an external NMI sitting around until after the (more frequent) internal NMIs are dealt with. Patch-Inspired-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-5-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07x86, NMI: Add priorities to handlersDon Zickus
In order to consolidate the NMI die_chain events, we need to setup the priorities for the die notifiers. I started by defining a bunch of common priorities that can be used by the notifier blocks. Then I modified the notifier blocks to use the newly created priorities. Now that the priorities are straightened out, it should be easier to remove the event DIE_NMI_IPI. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-4-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07x86: Convert some devices to use DIE_NMIUNKNOWNDon Zickus
They are a handful of places in the code that register a die_notifier as a catch all in case no claims the NMI. Unfortunately, they trigger on events like DIE_NMI and DIE_NMI_IPI, which depending on when they registered may collide with other handlers that have the ability to determine if the NMI is theirs or not. The function unknown_nmi_error() makes one last effort to walk the die_chain when no one else has claimed the NMI before spitting out messages that the NMI is unknown. This is a better spot for these devices to execute any code without colliding with the other handlers. The two drivers modified are only compiled on x86 arches I believe, so they shouldn't be affected by other arches that may not have DIE_NMIUNKNOWN defined. Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Russ Anderson <rja@sgi.com> Cc: Corey Minyard <minyard@acm.org> Cc: openipmi-developer@lists.sourceforge.net Cc: dann frazier <dannf@hp.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-3-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07x86, NMI: Add NMI symbol constants and rename memory parity to PCI SERRHuang Ying
Replace the NMI related magic numbers with symbol constants. Memory parity error is only valid for IBM PC-AT, newer machine use bit 7 (0x80) of 0x61 port for PCI SERR. While memory error is usually reported via MCE. So corresponding function name and kernel log string is changed. But on some machines, PCI SERR line is still used to report memory errors. This is used by EDAC, so corresponding EDAC call is reserved. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-2-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07Merge branch 'linus' into x86/apic-cleanupsIngo Molnar
Conflicts: arch/x86/include/asm/io_apic.h Merge reason: Resolve the conflict, update to a more recent -rc base Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07x86, numa: Fix CONFIG_DEBUG_PER_CPU_MAPS without NUMA emulationDavid Rientjes
"x86, numa: Fake node-to-cpumask for NUMA emulation" broke the build when CONFIG_DEBUG_PER_CPU_MAPS is set and CONFIG_NUMA_EMU is not. This is because it is possible to map a cpu to multiple nodes when NUMA emulation is used; the patch required a physical node address table to find those nodes that was only available when CONFIG_NUMA_EMU was enabled. This extracts the common debug functionality to its own function for CONFIG_DEBUG_PER_CPU_MAPS and uses it regardless of whether CONFIG_NUMA_EMU is set or not. NUMA emulation will now iterate over the set of possible nodes for each cpu and call the new debug function whereas only the cpu's node will be used without NUMA emulation enabled. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <alpine.DEB.2.00.1012301053590.12995@chino.kir.corp.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07PM / ACPI: Move NVS saving and restoring code to drivers/acpiRafael J. Wysocki
The saving of the ACPI NVS area during hibernation and suspend and restoring it during the subsequent resume is entirely specific to ACPI, so move it to drivers/acpi and drop the CONFIG_SUSPEND_NVS configuration option which is redundant. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-06Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, mm: Initialize initial_page_table before paravirt jumps
2011-01-06Merge branches 'x86-alternatives-for-linus', 'x86-fpu-for-linus', ↵Linus Torvalds
'x86-hwmon-for-linus', 'x86-paravirt-for-linus', 'core-locking-for-linus' and 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-alternatives-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, suspend: Avoid unnecessary smp alternatives switch during suspend/resume * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86-64, asm: Use fxsaveq/fxrestorq in more places * 'x86-hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, hwmon: Add core threshold notification to therm_throt.c * 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, paravirt: Use native_halt on a halt, not native_safe_halt * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: locking, lockdep: Convert sprintf_symbol to %pS * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: irq: Better struct irqaction layout