path: root/arch/x86/kvm/mmu.c
Age  Commit message  Author
2011-07-24  KVM: MMU: abstract some functions to handle fault pfn  (Xiao Guangrong)
Introduce handle_abnormal_pfn to handle a fault pfn on the page fault path, and mmu_invalid_pfn to handle a fault pfn on the prefetch path. This is preparatory work for mmio page fault support.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
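A minimal sketch of the shape of the two helpers; the names come from the commit, but the signatures and the is_error_pfn()/kvm_handle_bad_page() plumbing are assumptions drawn from the description, not the exact patch:

    /* Page-fault path: on a fault pfn, report how the fault was handled
     * instead of trying to map it. */
    static bool handle_abnormal_pfn(struct kvm_vcpu *vcpu, gfn_t gfn,
                                    pfn_t pfn, int *ret_val)
    {
            if (unlikely(is_error_pfn(pfn))) {
                    *ret_val = kvm_handle_bad_page(vcpu->kvm, gfn, pfn);
                    return true;    /* fault consumed, do not map */
            }
            return false;           /* normal pfn, continue mapping */
    }

    /* Prefetch path: a fault pfn simply means "skip this gfn". */
    static bool mmu_invalid_pfn(pfn_t pfn)
    {
            return unlikely(is_error_pfn(pfn));
    }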
2011-07-24KVM: MMU: filter out the mmio pfn from the fault pfnXiao Guangrong
If the page fault is caused by mmio, the gfn can not be found in memslots, and 'bad_pfn' is returned on gfn_to_hva path, so we can use 'bad_pfn' to identify the mmio page fault. And, to clarify the meaning of mmio pfn, we return fault page instead of bad page when the gfn is not allowd to prefetch Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-24  KVM: MMU: remove bypass_guest_pf  (Xiao Guangrong)
The idea is from Avi:
| Maybe it's time to kill off bypass_guest_pf=1. It's not as effective as
| it used to be, since unsync pages always use shadow_trap_nonpresent_pte,
| and since we convert between the two nonpresent_ptes during sync and unsync.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-24  KVM: MMU: split kvm_mmu_free_page  (Xiao Guangrong)
Split kvm_mmu_free_page into kvm_mmu_isolate_page and kvm_mmu_free_page: one removes the page from the cache under the mmu lock, the other frees the page table outside the mmu lock.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
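A sketch of the resulting two-phase shape, assuming the sp->spt page and the mmu_page_header_cache slab that mmu.c uses; the unlink details are paraphrased, not the exact patch:

    /* Phase 1, under mmu_lock: unlink the page from the MMU caches,
     * but do not touch the page-table memory yet. */
    static void kvm_mmu_isolate_page(struct kvm_mmu_page *sp)
    {
            hlist_del(&sp->hash_link);
            list_del(&sp->link);
    }

    /* Phase 2, outside mmu_lock: actually release the memory. */
    static void kvm_mmu_free_page(struct kvm_mmu_page *sp)
    {
            free_page((unsigned long)sp->spt);
            kmem_cache_free(mmu_page_header_cache, sp);
    }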
2011-07-24  KVM: MMU: count used shadow pages on preparing path  (Xiao Guangrong)
Move the counting of used shadow pages from the committing path to the preparing path, to reduce tlb flushes on some paths.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-24  KVM: MMU: rename 'pt_write' to 'emulate'  (Xiao Guangrong)
If 'pt_write' is true, we need to emulate the fault. A later patch will need to emulate the fault even when it is not a pt_write event, so rename the variable to better fit its meaning.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-24  KVM: MMU: optimize to handle dirty bit  (Xiao Guangrong)
If the dirty bit is not set, we can make the pte access read-only, which avoids having to handle the dirty bit everywhere.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
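The idea in one step, sketched with the PT_DIRTY_MASK/ACC_WRITE_MASK names from mmu.c; gpte_access here stands in for the guest-pte permission extraction, and the exact site of the change is an assumption:

    unsigned access = gpte_access(vcpu, gpte);
    if (!(gpte & PT_DIRTY_MASK))
            access &= ~ACC_WRITE_MASK;  /* map read-only: the first guest
                                         * write faults, sets the dirty
                                         * bit, and regains write access */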
2011-07-24  KVM: MMU: cache mmio info on page fault path  (Xiao Guangrong)
If the page fault is caused by mmio, we can cache the mmio info; later, we do not need to walk the guest page table and can quickly recognize an mmio fault while emulating the mmio instruction.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
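Conceptually, the cache is just a few fields on the vcpu, filled on the fault path; a sketch, with the field names assumed from the description:

    static void vcpu_cache_mmio_info(struct kvm_vcpu *vcpu,
                                     gva_t gva, gfn_t gfn, unsigned access)
    {
            vcpu->arch.mmio_gva = gva & PAGE_MASK;  /* enough to match later */
            vcpu->arch.access = access;
            vcpu->arch.mmio_gfn = gfn;
    }

The emulator can then compare the gva it is emulating against this cache and skip the guest page-table walk on a hit.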
2011-07-24  KVM: MMU: do not update slot bitmap if spte is nonpresent  (Xiao Guangrong)
Set the slot bitmap only if the spte is present.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-24  KVM: MMU: fix walking shadow page table  (Xiao Guangrong)
Properly check for the last mapping, and do not walk to the next level once the last spte has been reached.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-12  Revert "KVM: MMU: make kvm_mmu_reset_context() flush the guest TLB"  (Marcelo Tosatti)
This reverts commit bee931d31e588b8eb86b7edee32fac2d16930cd7. The TLB flush should be done lazily during guest entry, in kvm_mmu_load().
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-07-12  KVM: MMU: make kvm_mmu_reset_context() flush the guest TLB  (Avi Kivity)
kvm_set_cr0() and kvm_set_cr4(), and possibly other functions, assume that kvm_mmu_reset_context() flushes the guest TLB. However, it does not. Fix by flushing the tlb (and syncing the new root as well).
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-12  KVM: MMU: Adjust shadow paging to work when SMEP=1 and CR0.WP=0  (Avi Kivity)
When CR0.WP=0, we sometimes map user pages as kernel pages (to allow the kernel to write to them). Unfortunately this also allows the kernel to fetch from these pages, even if CR4.SMEP is set. Adjust for this by also setting NX on the spte in these circumstances.
Signed-off-by: Avi Kivity <avi@redhat.com>
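The adjustment, sketched at the point in set_spte() where the user bit is dropped; kvm_read_cr4_bits(), PT_USER_MASK and PT64_NX_MASK are real KVM names, but the surrounding condition is paraphrased from the description:

    if (!(pte_access & ACC_WRITE_MASK)) {
            /* A user page became a kernel page so the kernel can write
             * to it under CR0.WP=0; keep SMEP honest by forbidding the
             * kernel from fetching instructions from it. */
            spte &= ~PT_USER_MASK;
            if (kvm_read_cr4_bits(vcpu, X86_CR4_SMEP))
                    spte |= PT64_NX_MASK;
    }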
2011-07-12  KVM: MMU: cleanup for dropping parent pte  (Xiao Guangrong)
Introduce drop_parent_pte to remove the rmap of a parent pte and clear the parent pte.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
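The helper pairs the two steps that previously appeared separately at each call site; a sketch, using the shadow_trap_nonpresent_pte value mentioned in the bypass_guest_pf commit above:

    static void drop_parent_pte(struct kvm_mmu_page *sp, u64 *parent_pte)
    {
            mmu_page_remove_parent_pte(sp, parent_pte);         /* drop rmap */
            __set_spte(parent_pte, shadow_trap_nonpresent_pte); /* clear pte */
    }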
2011-07-12  KVM: MMU: cleanup for kvm_mmu_page_unlink_children  (Xiao Guangrong)
Consolidate the operation shared by kvm_mmu_page_unlink_children and mmu_pte_write_zap_pte.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-07-12  KVM: MMU: remove the arithmetic of parent pte rmap  (Xiao Guangrong)
Parent pte rmap and page rmap are very similar, so use the same arithmetic for them.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-07-12  KVM: MMU: abstract the operation of rmap  (Xiao Guangrong)
Abstract the rmap operations onto spte_list, so that a later patch can reuse them for the reverse mapping of parent ptes.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-07-12  KVM: MMU: optimize pte write path if don't have protected sp  (Xiao Guangrong)
Simply return from the kvm_mmu_pte_write path if no shadow page is write-protected; this avoids walking all shadow pages and taking the mmu-lock.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
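A sketch of the fast exit at the top of kvm_mmu_pte_write(); the per-VM indirect_shadow_pages counter and the trimmed parameter list are assumptions based on the description:

    void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
                           const u8 *new, int bytes)
    {
            /* No write-protected (indirect) shadow pages exist, so this
             * guest pte write cannot hit one: skip the walk entirely and
             * never take mmu_lock. */
            if (!vcpu->kvm->arch.indirect_shadow_pages)
                    return;
            /* ... slow path: walk and update affected shadow ptes ... */
    }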
2011-07-11  Merge branch 'master' into for-next  (Jiri Kosina)
Sync with Linus' tree to be able to apply pending patches that are based on newer code already present upstream.
2011-06-20  treewide: remove duplicate includes  (Vitaliy Ivanov)
Many stupid corrections of duplicated includes, based on the output of scripts/checkincludes.pl.
Signed-off-by: Vitaliy Ivanov <vitalivanov@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-06-19  KVM: MMU: fix opposite condition in mapping_level_dirty_bitmap  (Steve)
The condition is reversed: it always maps a huge page for the dirty-tracked page.
Reported-by: Steve <stefan.bosak@gmail.com>
Signed-off-by: Steve <stefan.bosak@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
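For context, the whole fix amounts to negating the helper's result; a sketch of the corrected function, with the helper name taken from the "no-dirty-logged slot" commit below and the argument list assumed:

    static bool mapping_level_dirty_bitmap(struct kvm_vcpu *vcpu, gfn_t gfn)
    {
            /* helper returns NULL when the gfn is dirty-tracked, so a
             * NULL result must mean "true: do not map a huge page" */
            return !gfn_to_memslot_dirty_bitmap(vcpu, gfn, true);
    }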
2011-05-25  vmscan: change shrinker API by passing shrink_control struct  (Ying Han)
Change each shrinker's API by consolidating the existing parameters into a shrink_control struct. This will simplify any further features added without touching each shrinker file.
[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: fix warning]
[kosaki.motohiro@jp.fujitsu.com: fix up new shrinker API]
[akpm@linux-foundation.org: fix xfs warning]
[akpm@linux-foundation.org: update gfs2]
Signed-off-by: Ying Han <yinghan@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
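What the new API looks like from mmu.c's side, sketched; the struct fields match the merged API, while the mmu_shrink body is abbreviated to its entry and exit:

    struct shrink_control {
            gfp_t gfp_mask;           /* allocation context of the caller */
            unsigned long nr_to_scan; /* objects the VM asks us to free   */
    };

    /* old: static int mmu_shrink(struct shrinker *, int nr, gfp_t mask);
     * new: all tuning knobs travel in one struct */
    static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
    {
            int nr_to_scan = sc->nr_to_scan;

            /* ... scan VMs and free up to nr_to_scan shadow pages ... */
            return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
    }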
2011-05-11  KVM: MMU: remove mmu_seq verification on pte update path  (Xiao Guangrong)
The mmu_seq verification can be removed, since we get the pfn under the protection of mmu_lock.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17  KVM: MMU: cleanup pte write path  (Xiao Guangrong)
This patch:
- calls vcpu->arch.mmu.update_pte directly
- uses gfn_to_pfn_atomic in the update_pte path
The suggestion is from Avi.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17  KVM: MMU: introduce a common function to get no-dirty-logged slot  (Xiao Guangrong)
Clean up the code of pte_prefetch_gfn_to_memslot and mapping_level_dirty_bitmap.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17  KVM: MMU: remove unused macros  (Xiao Guangrong)
These macros are not used, so remove them.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17  KVM: MMU: cleanup page alloc and free  (Xiao Guangrong)
Use __get_free_page instead of alloc_page plus page_address, and free_page instead of __free_page plus virt_to_page.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
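The pattern, before and after; these are standard kernel page-allocator calls, shown out of context:

    /* before: allocate a struct page, then derive its virtual address */
    struct page *page = alloc_page(GFP_KERNEL);
    u64 *table = page_address(page);
    /* ... and free by converting back to the struct page */
    __free_page(virt_to_page(table));

    /* after: work in virtual addresses the whole way */
    u64 *table = (u64 *)__get_free_page(GFP_KERNEL);
    /* ... */
    free_page((unsigned long)table);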
2011-03-17  KVM: MMU: do not record gfn in kvm_mmu_pte_write  (Xiao Guangrong)
There is no need to record the gfn to verify that the pte has the same mode as the current vcpu, because we only speculatively update the pte if the pte and the vcpu have the same mode.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17  KVM: MMU: set spte accessed bit properly  (Xiao Guangrong)
Set the spte accessed bit only if guest_initiated == 1, which means the spte was really accessed.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17  KVM: MMU: fix kvm_mmu_slot_remove_write_access dropping intermediate W bits  (Xiao Guangrong)
Only remove write access in the last sptes.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17  KVM: Convert kvm_lock to raw_spinlock  (Jan Kiszka)
Code under this lock requires non-preemptibility. Ensure this also holds on -rt by converting it to a raw spinlock.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
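The conversion pattern, sketched; the declaration site is paraphrased rather than quoted from the patch:

    /* before */
    static DEFINE_SPINLOCK(kvm_lock);
    spin_lock(&kvm_lock);
    spin_unlock(&kvm_lock);

    /* after: raw_spinlock_t keeps spinning even on PREEMPT_RT, where a
     * plain spinlock_t becomes a sleepable lock */
    static DEFINE_RAW_SPINLOCK(kvm_lock);
    raw_spin_lock(&kvm_lock);
    raw_spin_unlock(&kvm_lock);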
2011-03-17  KVM: MMU: Don't flush shadow when enabling dirty tracking  (Avi Kivity)
Instead, drop large mappings, which were the reason we dropped shadow.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-13  thp: mmu_notifier_test_young  (Andrea Arcangeli)
For GRU and EPT, we need gup-fast to set the referenced bit too (this is why it's correct to return 0 when shadow_access_mask is zero: it requires gup-fast to set the referenced bit). A qemu-kvm access already sets the young bit in the pte if it isn't zero-copy; if it is zero-copy, or a shadow-paging/EPT minor fault, we rely on gup-fast to signal that the page is in use. We also need to check the young bits on the secondary pagetables for NPT, and not for the nested shadow mmu, as the data may never be accessed again by the primary pte. Without this closer accuracy, we'd have to remove the heuristic that avoids collapsing hugepages in hugepage virtual regions that have not even a single subpage in use. ->test_young is fully backward compatible with GRU and other usages that don't have young bits in pagetables set by the hardware and that should nuke the secondary mmu mappings when ->clear_flush_young runs, just as EPT does. Completely removing the heuristic that checks the young bit in khugepaged/collapse_huge_page probably wouldn't be so bad either, but I thought this was worth it, and it makes the heuristic reliable.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-13  thp: kvm mmu transparent hugepage support  (Andrea Arcangeli)
This should work for both hugetlbfs and transparent hugepages.
[akpm@linux-foundation.org: bring forward PageTransCompound() addition for bisectability]
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-12  KVM: MMU: handle 'map_writable' in set_spte() function  (Xiao Guangrong)
Move the handling of 'writable' into set_spte() to clean up the code.
[avi: remove unneeded booleanification]
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12  KVM: MMU: audit: allow audit more guests at the same time  (Xiao Guangrong)
Only one guest in the system can be audited, since:
- 'audit_point' is a global variable
- mmu_audit_disable() is called in kvm_mmu_destroy(), so auditing is disabled after a guest exits
This patch fixes those issues and allows auditing more guests at the same time.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12  KVM: Replace reads of vcpu->arch.cr3 by an accessor  (Avi Kivity)
This allows us to keep cr3 in the VMCS, later on.
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12  KVM: MMU: only write protect mappings at pagetable level  (Marcelo Tosatti)
If a pagetable contains a writeable large spte, all of its sptes will be write protected, including non-leaf ones, leading to endless pagefaults. Do not write protect pages above PT_PAGE_TABLE_LEVEL, as the spte fault paths assume non-leaf sptes are writable.
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12  KVM: MMU: Initialize base_role for tdp mmus  (Avi Kivity)
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12  KVM: SVM: copy instruction bytes from VMCB  (Andre Przywara)
In case of a nested page fault or an intercepted #PF, newer SVM implementations provide a copy of the faulting instruction bytes in the VMCB. Use these bytes to feed the instruction emulator and avoid the costly guest instruction fetch in this case.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12  KVM: cleanup emulate_instruction  (Andre Przywara)
emulate_instruction had many callers, but only one used all of its parameters. One parameter was unused; another is now hidden by a wrapper function (required for a future addition anyway), so most callers now use a shorter parameter list.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12  KVM: MMU: Make the way of accessing lpage_info more generic  (Takuya Yoshikawa)
Large page information has two elements, but only one of them, write_count, is accessed through a helper function. This patch replaces that helper function with a more generic one which returns the newly named kvm_lpage_info structure, and uses it to access the other element, rmap_pde.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12  KVM: MMU: retry #PF for softmmu  (Xiao Guangrong)
Retry a #PF for softmmu only when the current vcpu has the same cr3 as at the time the #PF occurred.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12  KVM: MMU: fix accessed bit set on prefault path  (Xiao Guangrong)
A retried #PF is the speculative path, so don't set the accessed bit.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12  KVM: MMU: rename 'no_apf' to 'prefault'  (Xiao Guangrong)
It is the speculative path if 'no_apf = 1', and a later patch will handle this speculative path specially, so 'prefault' better fits the meaning.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12  KVM: MMU: Avoid dropping accessed bit while removing write access  (Takuya Yoshikawa)
One more "KVM: MMU: Don't drop accessed bit while updating an spte." Sptes are accessed by both kvm and the hardware. This patch uses update_spte() to fix the way write access is removed.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12  KVM: Pull extra page fault information into struct x86_exception  (Avi Kivity)
Currently the page fault cr2 and nesting information are carried outside the fault data structure; they are placed in the vcpu struct instead, which results in confusion, as global variables are manipulated instead of parameters being passed. Fix this issue by adding address and nested fields to struct x86_exception, so this struct can carry all information associated with a fault.
Signed-off-by: Avi Kivity <avi@redhat.com>
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Tested-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
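The resulting structure, sketched from the description; field order and widths may differ from the merged code:

    struct x86_exception {
            u8 vector;
            bool error_code_valid;
            u16 error_code;
            bool nested_page_fault;  /* new: nesting info, was in the vcpu */
            u64 address;             /* new: cr2, was in the vcpu */
    };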
2011-01-12  KVM: Push struct x86_exception into the various gva_to_gpa variants  (Avi Kivity)
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12  KVM: MMU: abstract invalid guest pte mapping  (Xiao Guangrong)
Introduce a common function to map an invalid gpte.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12  KVM: MMU: remove 'clear_unsync' parameter  (Xiao Guangrong)
Remove it, since we can judge it by using sp->unsync.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>