asmadeus/linux.git - The linux kernel

Age	Commit message (Collapse)	Author
2011-01-12	KVM: fix searching async gfn in kvm_async_pf_gfn_slot	Xiao Guangrong
	Don't search later slots if the slot is empty Acked-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: fix tracing kvm_try_async_get_page	Xiao Guangrong
	Tracing 'async' and pfn is useless, since 'async' is always true, and 'pfn' is always "fault_pfn' We can trace 'gva' and 'gfn' instead, it can help us to see the life-cycle of an async_pf Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: replace vmalloc and memset with vzalloc	Takuya Yoshikawa
	Let's use newly introduced vzalloc(). Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: handle exit due to INVD in VMX	Gleb Natapov
	Currently the exit is unhandled, so guest halts with error if it tries to execute INVD instruction. Call into emulator when INVD instruction is executed by a guest instead. This instruction is not needed by ordinary guests, but firmware (like OpenBIOS) use it and fail. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: x86: Avoid issuing wbinvd twice	Jan Kiszka
	Micro optimization to avoid calling wbinvd twice on the CPU that has to emulate it. As we might be preempted between smp_call_function_many and the local wbinvd, the cache might be filled again so that real work could be done uselessly. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: pre-allocate one more dirty bitmap to avoid vmalloc()	Takuya Yoshikawa
	Currently x86's kvm_vm_ioctl_get_dirty_log() needs to allocate a bitmap by vmalloc() which will be used in the next logging and this has been causing bad effect to VGA and live-migration: vmalloc() consumes extra systime, triggers tlb flush, etc. This patch resolves this issue by pre-allocating one more bitmap and switching between two bitmaps during dirty logging. Performance improvement: I measured performance for the case of VGA update by trace-cmd. The result was 1.5 times faster than the original one. In the case of live migration, the improvement ratio depends on the workload and the guest memory size. In general, the larger the memory size is the more benefits we get. Note: This does not change other architectures's logic but the allocation size becomes twice. This will increase the actual memory consumption only when the new size changes the number of pages allocated by vmalloc(). Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: propagate fault r/w information to gup(), allow read-only memory	Marcelo Tosatti
	As suggested by Andrea, pass r/w error code to gup(), upgrading read fault to writable if host pte allows it. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12	KVM: MMU: flush TLBs on writable -> read-only spte overwrite	Marcelo Tosatti
	This can happen in the following scenario: vcpu0 vcpu1 read fault gup(.write=0) gup(.write=1) reuse swap cache, no COW set writable spte use writable spte set read-only spte Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12	KVM: MMU: remove kvm_mmu_set_base_ptes	Marcelo Tosatti
	Unused. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12	KVM: VMX: remove setting of shadow_base_ptes for EPT	Marcelo Tosatti
	The EPT present/writable bits use the same position as normal pagetable bits. Since direct_map passes ACC_ALL to mmu_set_spte, thus always setting the writable bit on sptes, use the generic PT_PRESENT shadow_base_pte. Also pass present/writable error code information from EPT violation to generic pagefault handler. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12	KVM: Avoid double interrupt injection with vapic	Avi Kivity
	After an interrupt injection, the PPR changes, and we have to reflect that into the vapic. This causes a KVM_REQ_EVENT to be set, which causes the whole interrupt injection routine to be run again (harmlessly). Optimize by only setting KVM_REQ_EVENT if the ppr was lowered; otherwise there is no chance that a new injection is needed. Signed-off-by: Avi Kivity <avi@redhat.com>
2011-01-12	KVM: SVM: Fold save_host_msrs() and load_host_msrs() into their callers	Avi Kivity
	This abstraction only serves to obfuscate. Remove. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: SVM: Move fs/gs/ldt save/restore to heavyweight exit path	Avi Kivity
	ldt is never used in the kernel context; same goes for fs (x86_64) and gs (i386). So save/restore them in the heavyweight exit path instead of the lightweight path. By itself, this doesn't buy us much, but it paves the way for moving vmload and vmsave to the heavyweight exit path, since they modify the same registers. [jan: fix copy/pase mistake on i386] Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: SVM: Move svm->host_gs_base into a separate structure	Avi Kivity
	More members will join it soon. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: SVM: Move guest register save out of interrupts disabled section	Avi Kivity
	Saving guest registers is just a memory copy, and does not need to be in the critical section. Move outside the critical section to improve latency a bit. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: x86: Add missing inline tag to kvm_read_and_reset_pf_reason	Jan Kiszka
	May otherwise generates build warnings about unused kvm_read_and_reset_pf_reason if included without CONFIG_KVM_GUEST enabled. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: Move KVM context switch into own function	Andi Kleen
	gcc 4.5 with some special options is able to duplicate the VMX context switch asm in vmx_vcpu_run(). This results in a compile error because the inline asm sequence uses an on local label. The non local label is needed because other code wants to set up the return address. This patch moves the asm code into an own function and marks that explicitely noinline to avoid this problem. Better would be probably to just move it into an .S file. The diff looks worse than the change really is, it's all just code movement and no logic change. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: x86: Mark kvm_arch_setup_async_pf static	Jan Kiszka
	It has no user outside mmu.c and also no prototype. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: Send async PF when guest is not in userspace too.	Gleb Natapov
	If guest indicates that it can handle async pf in kernel mode too send it, but only if interrupts are enabled. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: Let host know whether the guest can handle async PF in non-userspace ↵	Gleb Natapov
	context. If guest can detect that it runs in non-preemptable context it can handle async PFs at any time, so let host know that it can send async PF even if guest cpu is not in userspace. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM paravirt: Handle async PF in non preemptable context	Gleb Natapov
	If async page fault is received by idle task or when preemp_count is not zero guest cannot reschedule, so do sti; hlt and wait for page to be ready. vcpu can still process interrupts while it waits for the page to be ready. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: Inject asynchronous page fault into a PV guest if page is swapped out.	Gleb Natapov
	Send async page fault to a PV guest if it accesses swapped out memory. Guest will choose another task to run upon receiving the fault. Allow async page fault injection only when guest is in user mode since otherwise guest may be in non-sleepable context and will not be able to reschedule. Vcpu will be halted if guest will fault on the same page again or if vcpu executes kernel code. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: Handle async PF in a guest.	Gleb Natapov
	When async PF capability is detected hook up special page fault handler that will handle async page fault events and bypass other page faults to regular page fault handler. Also add async PF handling to nested SVM emulation. Async PF always generates exit to L1 where vcpu thread will be scheduled out until page is available. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM paravirt: Add async PF initialization to PV guest.	Gleb Natapov
	Enable async PF in a guest if async PF capability is discovered. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: Add PV MSR to enable asynchronous page faults delivery.	Gleb Natapov
	Guest enables async PF vcpu functionality using this MSR. Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM paravirt: Move kvm_smp_prepare_boot_cpu() from kvmclock.c to kvm.c.	Gleb Natapov
	Async PF also needs to hook into smp_prepare_boot_cpu so move the hook into generic code. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: Add memory slot versioning and use it to provide fast guest write interface	Gleb Natapov
	Keep track of memslots changes by keeping generation number in memslots structure. Provide kvm_write_guest_cached() function that skips gfn_to_hva() translation if memslots was not changed since previous invocation. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: Retry fault before vmentry	Gleb Natapov
	When page is swapped in it is mapped into guest memory only after guest tries to access it again and generate another fault. To save this fault we can map it immediately since we know that guest is going to access the page. Do it only when tdp is enabled for now. Shadow paging case is more complicated. CR[034] and EFER registers should be switched before doing mapping and then switched back. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	KVM: Halt vcpu if page it tries to access is swapped out	Gleb Natapov
	If a guest accesses swapped out memory do not swap it in from vcpu thread context. Schedule work to do swapping and put vcpu into halted state instead. Interrupts will still be delivered to the guest and if interrupt will cause reschedule guest will continue to run another task. [avi: remove call to get_user_pages_noio(), nacked by Linus; this makes everything synchrnous again] Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-01-12	[S390] Randomize PIEs	Heiko Carstens
	Randomize ELF_ET_DYN_BASE, which is used when loading position independent executables. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-12	[S390] Randomise the brk region	Heiko Carstens
	Randomize heap address like other architectures do already. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-12	[S390] Add is_32bit_task() helper function	Heiko Carstens
	Helper function which tells us if a task is running in ESA mode. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-12	[S390] Randomize lower bits of stack address	Heiko Carstens
	Randomize the lower bits of the stack address like x86 and powerpc. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-12	[S390] Randomize mmap start address	Heiko Carstens
	Randomize mmap start address with 8MB. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-12	[S390] Rearrange mmap.c	Heiko Carstens
	Shuffle code around so it looks more like x86 and powerpc. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-12	[S390] Enable flexible mmap layout for 64 bit processes	Heiko Carstens
	Historically 64 bit processes use the legacy address layout. However there is no reason why 64 bit processes shouldn't benefit from the flexible mmap layout advantages. Therefore just enable it. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-12	[S390] vdso: dont map at mmap_base	Heiko Carstens
	The vdso object is currently always mapped with mm->mmap_base used as requested address. In case of flexible mmap layout this means it gets mapped above mmap_base and therefore potentially stealing a bit of address space that is reserved for the stack. In case of flexible mmap layout the object should be mapped below mmap base. For legacy mmap layout above. To fix this just don't request any specific address and let the mmap code figure out an address that fits. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-12	[S390] reduce miminum gap between stack and mmap_base	Heiko Carstens
	Reduce minimum gap between stack and mmap_base to 32MB. That way there is a bit more space for heap and mmap for tight 31 bit address spaces. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-12	[S390] mmap: consider stack address randomization	Heiko Carstens
	Consider stack address randomization when calulating mmap_base for flexible mmap layout . Because of address randomization the stack address can be up to 8MB lower than STACK_TOP. When calculating mmap_base this isn't taken into account, which could lead to the case that the gap between the real stack top and mmap_base is lower than what ulimit specifies for the maximum stack size. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-12	[S390] Update default configuration	Martin Schwidefsky
	Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-12	ACPI, APEI, Generic Hardware Error Source POLL/IRQ/NMI notification type support	Huang Ying
	Generic Hardware Error Source provides a way to report platform hardware errors (such as that from chipset). It works in so called "Firmware First" mode, that is, hardware errors are reported to firmware firstly, then reported to Linux by firmware. This way, some non-standard hardware error registers or non-standard hardware link can be checked by firmware to produce more valuable hardware error information for Linux. This patch adds POLL/IRQ/NMI notification types support. Because the memory area used to transfer hardware error information from BIOS to Linux can be determined only in NMI, IRQ or timer handler, but general ioremap can not be used in atomic context, so a special version of atomic ioremap is implemented for that. Known issue: - Error information can not be printed for recoverable errors notified via NMI, because printk is not NMI-safe. Will fix this via delay printing to IRQ context via irq_work or make printk NMI-safe. v2: - adjust printk format per comments. Signed-off-by: Huang Ying <ying.huang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12	Merge branch 'rmobile/sdio' into rmobile-latest	Paul Mundt

2011-01-12	ARM: mach-shmobile: sh7372 Enable SDIO IRQs for Mackerel	Magnus Damm
	Enable SDIO IRQ support for the Mackerel board. Signed-off-by: Magnus Damm <damm@opensource.se> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2011-01-12	ARM: mach-shmobile: sh7377 Enable SDIO IRQs	Magnus Damm
	This patch enables interrupt generation for SDIO IRQs of the SDHI block on the sh7377 aka G4 processor. Use together with a recent SDHI driver using TMIO_MMC_SDIO_IRQ and with the MMC_CAP_SDIO_IRQ flag in the board code. The G4EVM specific SDHI platform data is also updated to flag SDIO capabilities. Signed-off-by: Magnus Damm <damm@opensource.se> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2011-01-12	ARM: mach-shmobile: sh7367 Enable SDIO IRQs	Magnus Damm
	This patch enables interrupt generation for SDIO IRQs of the SDHI block on the sh7367 aka G3 processor. Use together with a recent SDHI driver using TMIO_MMC_SDIO_IRQ and with the MMC_CAP_SDIO_IRQ flag in the board code. Signed-off-by: Magnus Damm <damm@opensource.se> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2011-01-12	ARM: mach-shmobile: sh7372 Enable SDIO IRQs	Arnd Hannemann
	This patch enables the interrupt generation for SDIO IRQs of the sdhi controllers of the SoC. To make sure interrupts are handled announce the MMC_CAP_SDIO_IRQ capability on AP4EVB. Tested with a b43-based SDIO wireless card. Signed-off-by: Arnd Hannemann <arnd@arndnet.de> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2011-01-12	Merge branch 'sh/sdio' into sh-latest	Paul Mundt

2011-01-12	sh: sh7366 Enable SDIO IRQs	Magnus Damm
	This patch enables interrupt generation for SDIO IRQs of the SDHI block on the sh7366 processor. Use together with a recent SDHI driver using TMIO_MMC_SDIO_IRQ and with the MMC_CAP_SDIO_IRQ flag in the board code. Signed-off-by: Magnus Damm <damm@opensource.se> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2011-01-12	sh: sh7343 Enable SDIO IRQs	Magnus Damm
	This patch enables interrupt generation for SDIO IRQs of the SDHI block on the sh7343 processor. Use together with a recent SDHI driver using TMIO_MMC_SDIO_IRQ and with the MMC_CAP_SDIO_IRQ flag in the board code. Signed-off-by: Magnus Damm <damm@opensource.se> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2011-01-12	sh: mach-ecovec24: enable runtime PM for SDHI	Arnd Hannemann
	This patch enables runtime PM for SDHI on ecovec. Tested with a b43 based SDIO card. Signed-off-by: Arnd Hannemann <arnd@arndnet.de> Signed-off-by: Paul Mundt <lethal@linux-sh.org>