summaryrefslogtreecommitdiffstats
path: root/drivers/kvm/kvm_main.c
AgeCommit message (Collapse)Author
2007-10-13KVM: In-kernel I/O APIC modelEddie Dong
This allows in-kernel host-side device drivers to raise guest interrupts without going to userspace. [avi: fix level-triggered interrupt redelivery on eoi] [avi: add missing #include] [avi: avoid redelivery of edge-triggered interrupt] [avi: implement polarity] [avi: don't deliver edge-triggered interrupts when unmasking] [avi: fix host oops on invalid guest access] Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Emulate local APIC in kernelEddie Dong
Because lightweight exits (exits which don't involve userspace) are many times faster than heavyweight exits, it makes sense to emulate high usage devices in the kernel. The local APIC is one such device, especially for Windows and for SMP, so we add an APIC model to kvm. It also allows in-kernel host-side drivers to inject interrupts without going through userspace. [compile fix on i386 from Jindrich Makovicka] Signed-off-by: Yaozu (Eddie) Dong <Eddie.Dong@intel.com> Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Define and use cr8 access functionsEddie Dong
This patch is to wrap APIC base register and CR8 operation which can provide a unique API for user level irqchip and kernel irqchip. This is a preparation of merging lapic/ioapic patch. Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Add support for in-kernel PIC emulationEddie Dong
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Clean up kvm_setup_pio()Laurent Vivier
Split kvm_setup_pio() into two functions, one to setup in/out pio (kvm_emulate_pio()) and one to setup ins/outs pio (kvm_emulate_pio_string()). Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Cleanup string I/O instruction emulationLaurent Vivier
Both vmx and svm decode the I/O instructions, and both botch the job, requiring the instruction prefixes to be fetched in order to completely decode the instruction. So, if we see a string I/O instruction, use the x86 emulator to decode it, as it already has all the prefix decoding machinery. This patch defines ins/outs opcodes in x86_emulate.c and calls emulate_instruction() from io_interception() (svm.c) and from handle_io() (vmx.c). It removes all vmx/svm prefix instruction decoders (get_addr_size(), io_get_override(), io_address(), get_io_count()) Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Remove useless assignmentLaurent Vivier
Line 1809 of kvm_main.c is useless, value is overwritten in line 1815: 1809 now = min(count, PAGE_SIZE / size); 1810 1811 if (!down) 1812 in_page = PAGE_SIZE - offset_in_page(address); 1813 else 1814 in_page = offset_in_page(address) + size; 1815 now = min(count, (unsigned long)in_page / size); 1816 if (!now) { Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Add and use pr_unimpl for standard formatting of unimplemented featuresRusty Russell
All guest-invokable printks should be ratelimited to prevent malicious guests from flooding logs. This is a start. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Remove unneeded kvm_dev_open and kvm_dev_release functions.Rusty Russell
Devices don't need open or release functions. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Remove stat_set from debugfsRusty Russell
We shouldn't define stat_set on the debug attributes, since that will cause silent failure on writing: without a set argument, userspace will get -EACCESS. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Cleanup mark_page_dirtyRusty Russell
For some reason, mark_page_dirty open-codes __gfn_to_memslot(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Don't assign vcpu->cr3 if it's invalid: check first, set lastRusty Russell
sSigned-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: VMX: Add cpu consistency checkYang, Sheng
All the physical CPUs on the board should support the same VMX feature set. Add check_processor_compatibility to kvm_arch_ops for the consistency check. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: kvm_vm_ioctl_get_dirty_log restore "nothing dirty" optimizationRusty Russell
kvm_vm_ioctl_get_dirty_log scans bitmap to see it it's all zero, but doesn't use that information. Avi says: Looks like it was used to guard kvm_mmu_slot_remove_write_access(); optimizing the case where the guest just leaves the screen alone (which it usually does, especially in benchmarks). I'd rather reinstate that optimization. See 90cb0529dd230548a7f0d6b315997be854caea1b where the damage was done. It's pretty simple: if the bitmap is all zero, we don't need to do anything to clean it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Use alignment properties of vcpu to simplify FPU opsRusty Russell
Now we use a kmem cache for allocating vcpus, we can get the 16-byte alignment required by fxsave & fxrstor instructions, and avoid manually aligning the buffer. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Use kmem cache for allocating vcpusRusty Russell
Avi wants the allocations of vcpus centralized again. The easiest way is to add a "size" arg to kvm_init_arch, and expose the thus-prepared cache to the modules. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Remove kvm_{read,write}_guest()Laurent Vivier
... in favor of the more general emulator_{read,write}_*. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Change the emulator_{read,write,cmpxchg}_* functions to take a vcpuLaurent Vivier
... instead of a x86_emulate_ctxt, so that other callers can use it easily. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Remove three magic numbersRusty Russell
There are several places where hardcoded numbers are used in place of the easily-available constant, which is poor form. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: fx_init() needs preemption disabled while it plays with the FPU stateRusty Russell
Now that kvm generally runs with preemption enabled, we need to protect the fpu intialization sequence. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Convert vm lock to a mutexShaohua Li
This allows the kvm mmu to perform sleepy operations, such as memory allocation. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Use the scheduler preemption notifiers to make kvm preemptibleAvi Kivity
Current kvm disables preemption while the new virtualization registers are in use. This of course is not very good for latency sensitive workloads (one use of virtualization is to offload user interface and other latency insensitive stuff to a container, so that it is easier to analyze the remaining workload). This patch re-enables preemption for kvm; preemption is now only disabled when switching the registers in and out, and during the switch to guest mode and back. Contains fixes from Shaohua Li <shaohua.li@intel.com>. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: add hypercall nr to kvm_runJeff Dike
Add the hypercall number to kvm_run and initialize it. This changes the ABI, but as this particular ABI was unusable before this no users are affected. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Dynamically allocate vcpusRusty Russell
This patch converts the vcpus array in "struct kvm" to a pointer array, and changes the "vcpu_create" and "vcpu_setup" hooks into one "vcpu_create" call which does the allocation and initialization of the vcpu (calling back into the kvm_vcpu_init core helper). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Remove arch specific components from the general codeGregory Haskins
struct kvm_vcpu has vmx-specific members; remove them to a private structure. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: load_pdptrs() cleanupsRusty Russell
load_pdptrs can be handed an invalid cr3, and it should not oops. This can happen because we injected #gp in set_cr3() after we set vcpu->cr3 to the invalid value, or from kvm_vcpu_ioctl_set_sregs(), or memory configuration changes after the guest did set_cr3(). We should also copy the pdpte array once, before checking and assigning, otherwise an SMP guest can potentially alter the values between the check and the set. Finally one nitpick: ret = 1 should be done as late as possible: this allows GCC to check for unset "ret" should the function change in future. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Move gfn_to_page out of kmap/unmap pairsShaohua Li
gfn_to_page might sleep with swap support. Move it out of the kmap calls. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Return if the pdptrs are invalid when the guest turns on PAE.Rusty Russell
Don't fall through and turn on PAE in this case. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Use standard CR8 flags, and fix TPR definitionRusty Russell
Intel manual (and KVM definition) say the TPR is 4 bits wide. Also fix CR8_RESEVED_BITS typo. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Set exit_reason to KVM_EXIT_MMIO where run->mmio is initialized.Jeff Dike
Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Trivial: Use standard BITMAP macros, open-code userspace-exposed headerRusty Russell
Creating one's own BITMAP macro seems suboptimal: if we use manual arithmetic in the one place exposed to userspace, we can use standard macros elsewhere. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Use standard CR4 flags, tighten checkingRusty Russell
On this machine (Intel), writing to the CR4 bits 0x00000800 and 0x00001000 cause a GPF. The Intel manual is a little unclear, but AFIACT they're reserved, too. Also fix spelling of CR4_RESEVED_BITS. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Use standard CR3 flags, tighten checkingRusty Russell
The kernel now has asm/cpu-features.h: use those macros instead of inventing our own. Also spell out definition of CR3_RESEVED_BITS, fix spelling and tighten it for the non-PAE case. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Trivial: Use standard CR0 flags macros from asm/cpu-features.hRusty Russell
The kernel now has asm/cpu-features.h: use those macros instead of inventing our own. Also spell out definition of CR0_RESEVED_BITS (no code change) and fix typo. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Trivial: Avoid hardware_disable predeclarationRusty Russell
Don't pre-declare hardware_disable: shuffle the reboot hook down. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: In-kernel string pio write supportEddie Dong
Add string pio write support to support some version of Windows. Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: SMP: Add vcpu_id field in struct vcpuQing He
This patch adds a `vcpu_id' field in `struct vcpu', so we can differentiate BSP and APs without pointer comparison or arithmetic. Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13KVM: Fix *nopage() in kvm_main.cNguyen Anh Quynh
*nopage() in kvm_main.c should only store the type of mmap() fault if the pointers are not NULL. This patch fixes the problem. Signed-off-by: Nguyen Anh Quynh <aquynh@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-08-19KVM: Avoid calling smp_call_function_single() with interrupts disabledAvi Kivity
When taking a cpu down, we need to hardware_disable() it. Unfortunately, the CPU_DYING notifier is called with interrupts disabled, which means we can't use smp_call_function_single(). Fortunately, the CPU_DYING notifier is always called on the dying cpu, so we don't need to use the function at all and can simply call hardware_disable() directly. Tested-by: Paolo Ornati <ornati@fastwebnet.it> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-25KVM: Fix removal of nx capability from guest cpuidAvi Kivity
Testing the wrong bit caused kvm not to disable nx on the guest when it is disabled on the host (an mmu optimization relies on the nx bits being the same in the guest and host). This allows Windows to boot when nx is disabled on te host (e.g. when host pae is disabled). Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-25Revert "KVM: Avoid useless memory write when possible"Avi Kivity
This reverts commit a3c870bdce4d34332ebdba7eb9969592c4c6b243. While it does save useless updates, it (probably) defeats the fork detector, causing a massive performance loss. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-25KVM: Fix unlikely kvm_create vs decache_vcpus_on_cpu raceRusty Russell
We add the kvm to the vm_list before initializing the vcpu mutexes, which can be mutex_trylock()'ed by decache_vcpus_on_cpu(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-25KVM: Correctly handle writes crossing a page boundaryAvi Kivity
Writes that are contiguous in virtual memory may not be contiguous in physical memory; so split writes that straddle a page boundary. Thanks to Aurelien for reporting the bug, patient testing, and a fix to this very patch. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-20KVM: x86 emulator: implement rdmsr and wrmsrAvi Kivity
Allow real-mode emulation of rdmsr and wrmsr. This allows smp Windows to boot, presumably for its sipi trampoline. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-20KVM: Fix memory slot management functions for guest smpAvi Kivity
The memory slot management functions were oriented against vcpu 0, where they should be kvm-wide. This causes hangs starting X on guest smp. Fix by making the functions (and resultant tail in the mmu) non-vcpu-specific. Unfortunately this reduces the efficiency of the mmu object cache a bit. We may have to revisit this later. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Use CPU_DYING for disabling virtualizationAvi Kivity
Only at the CPU_DYING stage can we be sure that no user process will be scheduled onto the cpu and oops when trying to use virtualization extensions. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Tune hotplug/suspend IPIsAvi Kivity
The hotplug IPIs can be called from the cpu on which we are currently running on, so use on_cpu(). Similarly, drop on_each_cpu() for the suspend/resume callbacks, as we're in atomic context here and only one cpu is up anyway. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Keep track of which cpus have virtualization enabledAvi Kivity
By keeping track of which cpus have virtualization enabled, we prevent double-enable or double-disable during hotplug, which is a very fatal oops. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Clean up #includesAvi Kivity
Remove unnecessary ones, and rearange the remaining in the standard order. Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-07-16KVM: Remove kvmfs in favor of the anonymous inodes sourceAvi Kivity
kvm uses a pseudo filesystem, kvmfs, to generate inodes, a job that the new anonymous inodes source does much better. Cc: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Avi Kivity <avi@qumranet.com>