summaryrefslogtreecommitdiffstats
path: root/arch/x86/kvm/x86.c
AgeCommit message (Collapse)Author
2010-08-01KVM: x86 emulator: advance RIP outside x86 emulator codeGleb Natapov
Return new RIP as part of instruction emulation result instead of updating KVM's RIP from x86 emulator code. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01KVM: handle emulation failure case firstGleb Natapov
If emulation failed return immediately. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01KVM: do not inject #PF in (read|write)_emulated() callbacksGleb Natapov
Return error to x86 emulator instead of injection exception behind its back. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01KVM: remove export of emulator_write_emulated()Gleb Natapov
It is not called directly outside of the file it's defined in anymore. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01KVM: x86 emulator: x86_emulate_insn() return -1 only in case of emulation ↵Gleb Natapov
failure Currently emulator returns -1 when emulation failed or IO is needed. Caller tries to guess whether emulation failed by looking at other variables. Make it easier for caller to recognise error condition by always returning -1 in case of failure. For this new emulator internal return value X86EMUL_IO_NEEDED is introduced. It is used to distinguish between error condition (which returns X86EMUL_UNHANDLEABLE) and condition that requires IO exit to userspace to continue emulation. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01KVM: fill in run->mmio details in (read|write)_emulated functionGleb Natapov
Fill in run->mmio details in (read|write)_emulated function just like pio does. There is no point in filling only vcpu fields there just to copy them into vcpu->run a little bit later. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01KVM: x86 emulator: make (get|set)_dr() callback return error if it failsGleb Natapov
Make (get|set)_dr() callback return error if it fails instead of injecting exception behind emulator's back. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01KVM: x86 emulator: make set_cr() callback return error if it failsGleb Natapov
Make set_cr() callback return error if it fails instead of injecting #GP behind emulator's back. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01KVM: x86 emulator: add get_cached_segment_base() callback to x86_emulate_opsGleb Natapov
On VMX it is expensive to call get_cached_descriptor() just to get segment base since multiple vmcs_reads are done instead of only one. Introduce new call back get_cached_segment_base() for efficiency. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01KVM: x86 emulator: add (set|get)_msr callbacks to x86_emulate_opsGleb Natapov
Add (set|get)_msr callbacks to x86_emulate_ops instead of calling them directly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01KVM: x86 emulator: add (set|get)_dr callbacks to x86_emulate_opsGleb Natapov
Add (set|get)_dr callbacks to x86_emulate_ops instead of calling them directly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01KVM: VMX: Avoid writing HOST_CR0 every entryAvi Kivity
cr0.ts may change between entries, so we copy cr0 to HOST_CR0 before each entry. That is slow, so instead, set HOST_CR0 to have TS set unconditionally (which is a safe value), and issue a clts() just before exiting vcpu context if the task indeed owns the fpu. Saves ~50 cycles/exit. Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-01KVM: x86: avoid unnecessary bitmap allocation when memslot is cleanTakuya Yoshikawa
Although we always allocate a new dirty bitmap in x86's get_dirty_log(), it is only used as a zero-source of copy_to_user() and freed right after that when memslot is clean. This patch uses clear_user() instead of doing this unnecessary zero-source allocation. Performance improvement: as we can expect easily, the time needed to allocate a bitmap is completely reduced. In my test, the improved ioctl was about 4 to 10 times faster than the original one for clean slots. Furthermore, reducing memory allocations and copies will produce good effects to caches too. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-07-27Merge remote branch 'origin/x86/urgent' into x86/asmH. Peter Anvin
2010-07-23KVM: Use kmalloc() instead of vmalloc() for KVM_[GS]ET_MSRAvi Kivity
We don't need more than a page, and vmalloc() is slower (much slower recently due to a regression). Signed-off-by: Avi Kivity <avi@redhat.com>
2010-07-21x86: Remove redundant K6 MSRsBrian Gerst
MSR_K6_EFER is unused, and MSR_K6_STAR is redundant with MSR_STAR. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1279371808-24804-1-git-send-email-brgerst@gmail.com> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-05-21Merge branch 'kvm-updates/2.6.35' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
* 'kvm-updates/2.6.35' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (269 commits) KVM: x86: Add missing locking to arch specific vcpu ioctls KVM: PPC: Add missing vcpu_load()/vcpu_put() in vcpu ioctls KVM: MMU: Segregate shadow pages with different cr0.wp KVM: x86: Check LMA bit before set_efer KVM: Don't allow lmsw to clear cr0.pe KVM: Add cpuid.txt file KVM: x86: Tell the guest we'll warn it about tsc stability x86, paravirt: don't compute pvclock adjustments if we trust the tsc x86: KVM guest: Try using new kvm clock msrs KVM: x86: export paravirtual cpuid flags in KVM_GET_SUPPORTED_CPUID KVM: x86: add new KVMCLOCK cpuid feature KVM: x86: change msr numbers for kvmclock x86, paravirt: Add a global synchronization point for pvclock x86, paravirt: Enable pvclock flags in vcpu_time_info structure KVM: x86: Inject #GP with the right rip on efer writes KVM: SVM: Don't allow nested guest to VMMCALL into host KVM: x86: Fix exception reinjection forced to true KVM: Fix wallclock version writing race KVM: MMU: Don't read pdptrs with mmu spinlock held in mmu_alloc_roots KVM: VMX: enable VMXON check with SMX enabled (Intel TXT) ...
2010-05-19KVM: x86: Add missing locking to arch specific vcpu ioctlsAvi Kivity
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-19KVM: x86: Check LMA bit before set_eferSheng Yang
kvm_x86_ops->set_efer() would execute vcpu->arch.efer = efer, so the checking of LMA bit didn't work. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-19KVM: Don't allow lmsw to clear cr0.peAvi Kivity
The current lmsw implementation allows the guest to clear cr0.pe, contrary to the manual, which breaks EMM386.EXE. Fix by ORing the old cr0.pe with lmsw's operand. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-19KVM: x86: Tell the guest we'll warn it about tsc stabilityGlauber Costa
This patch puts up the flag that tells the guest that we'll warn it about the tsc being trustworthy or not. By now, we also say it is not. Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-19KVM: x86: export paravirtual cpuid flags in KVM_GET_SUPPORTED_CPUIDGlauber Costa
Right now, we were using individual KVM_CAP entities to communicate userspace about which cpuids we support. This is suboptimal, since it generates a delay between the feature arriving in the host, and being available at the guest. A much better mechanism is to list para features in KVM_GET_SUPPORTED_CPUID. This makes userspace automatically aware of what we provide. And if we ever add a new cpuid bit in the future, we have to do that again, which create some complexity and delay in feature adoption. Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-19KVM: x86: change msr numbers for kvmclockGlauber Costa
Avi pointed out a while ago that those MSRs falls into the pentium PMU range. So the idea here is to add new ones, and after a while, deprecate the old ones. Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-19KVM: x86: Inject #GP with the right rip on efer writesRoedel, Joerg
This patch fixes a bug in the KVM efer-msr write path. If a guest writes to a reserved efer bit the set_efer function injects the #GP directly. The architecture dependent wrmsr function does not see this, assumes success and advances the rip. This results in a #GP in the guest with the wrong rip. This patch fixes this by reporting efer write errors back to the architectural wrmsr function. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-19KVM: x86: Fix exception reinjection forced to trueJoerg Roedel
The patch merged recently which allowed to mark an exception as reinjected has a bug as it always marks the exception as reinjected. This breaks nested-svm shadow-on-shadow implementation. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-19KVM: Fix wallclock version writing raceAvi Kivity
Wallclock writing uses an unprotected global variable to hold the version; this can cause one guest to interfere with another if both write their wallclock at the same time. Acked-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-19KVM: x86: properly update ready_for_interrupt_injectionMarcelo Tosatti
The recent changes to emulate string instructions without entering guest mode exposed a bug where pending interrupts are not properly reflected in ready_for_interrupt_injection. The result is that userspace overwrites a previously queued interrupt, when irqchip's are emulated in userspace. Fix by always updating state before returning to userspace. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-18Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (311 commits) perf tools: Add mode to build without newt support perf symbols: symbol inconsistency message should be done only at verbose=1 perf tui: Add explicit -lslang option perf options: Type check all the remaining OPT_ variants perf options: Type check OPT_BOOLEAN and fix the offenders perf options: Check v type in OPT_U?INTEGER perf options: Introduce OPT_UINTEGER perf tui: Add workaround for slang < 2.1.4 perf record: Fix bug mismatch with -c option definition perf options: Introduce OPT_U64 perf tui: Add help window to show key associations perf tui: Make <- exit menus too perf newt: Add single key shortcuts for zoom into DSO and threads perf newt: Exit browser unconditionally when CTRL+C, q or Q is pressed perf newt: Fix the 'A'/'a' shortcut for annotate perf newt: Make <- exit the ui_browser x86, perf: P4 PMU - fix counters management logic perf newt: Make <- zoom out filters perf report: Report number of events, not samples perf hist: Clarify events_stats fields usage ... Fix up trivial conflicts in kernel/fork.c and tools/perf/builtin-record.c
2010-05-17KVM: x86: Allow marking an exception as reinjectedJoerg Roedel
This patch adds logic to kvm/x86 which allows to mark an injected exception as reinjected. This allows to remove an ugly hack from svm_complete_interrupts that prevented exceptions from being reinjected at all in the nested case. The hack was necessary because an reinjected exception into the nested guest could cause a nested vmexit emulation. But reinjected exceptions must not intercept. The downside of the hack is that a exception that in injected could get lost. This patch fixes the problem and puts the code for it into generic x86 files because. Nested-VMX will likely have the same problem and could reuse the code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: x86: Add callback to let modules decide over some supported cpuid bitsJoerg Roedel
This patch adds the get_supported_cpuid callback to kvm_x86_ops. It will be used in do_cpuid_ent to delegate the decission about some supported cpuid bits to the architecture modules. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17Merge remote branch 'tip/perf/core'Avi Kivity
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: MMU: Drop cr4.pge from shadow page roleAvi Kivity
Since commit bf47a760f66ad, we no longer handle ptes with the global bit set specially, so there is no reason to distinguish between shadow pages created with cr4.gpe set and clear. Such tracking is expensive when the guest toggles cr4.pge, so drop it. Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: use the correct RCU API for PROVE_RCU=yLai Jiangshan
The RCU/SRCU API have already changed for proving RCU usage. I got the following dmesg when PROVE_RCU=y because we used incorrect API. This patch coverts rcu_deference() to srcu_dereference() or family API. =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- arch/x86/kvm/mmu.c:3020 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by qemu-system-x86/8550: #0: (&kvm->slots_lock){+.+.+.}, at: [<ffffffffa011a6ac>] kvm_set_memory_region+0x29/0x50 [kvm] #1: (&(&kvm->mmu_lock)->rlock){+.+...}, at: [<ffffffffa012262d>] kvm_arch_commit_memory_region+0xa6/0xe2 [kvm] stack backtrace: Pid: 8550, comm: qemu-system-x86 Not tainted 2.6.34-rc4-tip-01028-g939eab1 #27 Call Trace: [<ffffffff8106c59e>] lockdep_rcu_dereference+0xaa/0xb3 [<ffffffffa012f6c1>] kvm_mmu_calculate_mmu_pages+0x44/0x7d [kvm] [<ffffffffa012263e>] kvm_arch_commit_memory_region+0xb7/0xe2 [kvm] [<ffffffffa011a5d7>] __kvm_set_memory_region+0x636/0x6e2 [kvm] [<ffffffffa011a6ba>] kvm_set_memory_region+0x37/0x50 [kvm] [<ffffffffa015e956>] vmx_set_tss_addr+0x46/0x5a [kvm_intel] [<ffffffffa0126592>] kvm_arch_vm_ioctl+0x17a/0xcf8 [kvm] [<ffffffff810a8692>] ? unlock_page+0x27/0x2c [<ffffffff810bf879>] ? __do_fault+0x3a9/0x3e1 [<ffffffffa011b12f>] kvm_vm_ioctl+0x364/0x38d [kvm] [<ffffffff81060cfa>] ? up_read+0x23/0x3d [<ffffffff810f3587>] vfs_ioctl+0x32/0xa6 [<ffffffff810f3b19>] do_vfs_ioctl+0x495/0x4db [<ffffffff810e6b2f>] ? fget_light+0xc2/0x241 [<ffffffff810e416c>] ? do_sys_open+0x104/0x116 [<ffffffff81382d6d>] ? retint_swapgs+0xe/0x13 [<ffffffff810f3ba6>] sys_ioctl+0x47/0x6a [<ffffffff810021db>] system_call_fastpath+0x16/0x1b Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17Merge branch 'perf'Avi Kivity
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: fix emulator_task_switch() return value.Gleb Natapov
emulator_task_switch() should return -1 for failure and 0 for success to the caller, just like x86_emulate_insn() does. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: x86: Push potential exception error code on task switchesJan Kiszka
When a fault triggers a task switch, the error code, if existent, has to be pushed on the new task's stack. Implement the missing bits. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: x86: get rid of mmu_only parameter in emulator_write_emulated()Gleb Natapov
We can call kvm_mmu_pte_write() directly from emulator_cmpxchg_emulated() instead of passing mmu_only down to emulator_write_emulated_onepage() and call it there. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: move DR register access handling into generic codeGleb Natapov
Currently both SVM and VMX have their own DR handling code. Move it to x86.c. Acked-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: Fix MAXPHYADDR calculation when cpuid does not support itAvi Kivity
MAXPHYADDR is derived from cpuid 0x80000008, but when that isn't present, we get some random value. Fix by checking first that cpuid 0x80000008 is supported. Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: Trace emulated instructionsAvi Kivity
Log emulated instructions in ftrace, especially if they failed. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: x86 emulator: commit rflags as part of registers commitGleb Natapov
Make sure that rflags is committed only after successful instruction emulation. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: x86: Fix 32-bit build breakage due to typoJan Kiszka
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17KVM: small kvm_arch_vcpu_ioctl_run() cleanup.Gleb Natapov
Unify all conditions that get us back into emulator after returning from userspace. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: x86 emulator: restart string instruction without going back to a guest.Gleb Natapov
Currently when string instruction is only partially complete we go back to a guest mode, guest tries to reexecute instruction and exits again and at this point emulation continues. Avoid all of this by restarting instruction without going back to a guest mode, but return to a guest mode each 1024 iterations to allow interrupt injection. Pending exception causes immediate guest entry too. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: x86 emulator: Move string pio emulation into emulator.cGleb Natapov
Currently emulation is done outside of emulator so things like doing ins/outs to/from mmio are broken it also makes it hard (if not impossible) to implement single stepping in the future. The implementation in this patch is not efficient since it exits to userspace for each IO while previous implementation did 'ins' in batches. Further patch that implements pio in string read ahead address this problem. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: x86 emulator: fix in/out emulation.Gleb Natapov
in/out emulation is broken now. The breakage is different depending on where IO device resides. If it is in userspace emulator reports emulation failure since it incorrectly interprets kvm_emulate_pio() return value. If IO device is in the kernel emulation of 'in' will do nothing since kvm_emulate_pio() stores result directly into vcpu registers, so emulator will overwrite result of emulation during commit of shadowed register. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: Use task switch from emulator.cGleb Natapov
Remove old task switch code from x86.c Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: x86 emulator: Provide more callbacks for x86 emulator.Gleb Natapov
Provide get_cached_descriptor(), set_cached_descriptor(), get_segment_selector(), set_segment_selector(), get_gdt(), write_std() callbacks. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: Provide current eip as part of emulator context.Gleb Natapov
Eliminate the need to call back into KVM to get it from emulator. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17KVM: Provide x86_emulate_ctxt callback to get current cplGleb Natapov
Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>