path: root/arch/x86
Age  Commit message  (Author)
2013-03-13 KVM: nVMX: Clean up and fix pin-based execution controls (Jan Kiszka)
Only interrupt and NMI exiting are mandatory for KVM to work and can thus be exposed to the guest unconditionally. Virtual NMI exiting is optional, so we must not advertise it unless the host supports it. Introduce the symbolic constant PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR while at it. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-03-13 KVM: x86: Rework INIT and SIPI handling (Jan Kiszka)
A VCPU sending INIT or SIPI to some other VCPU races for setting the remote VCPU's mp_state. When we were unlucky, KVM_MP_STATE_INIT_RECEIVED was overwritten by kvm_emulate_halt and, thus, got lost.

This introduces APIC events for those two signals, keeping them in kvm_apic until kvm_apic_accept_events is run over the target vcpu context. kvm_apic_has_events reports to kvm_arch_vcpu_runnable whether there are pending events, and thus whether vcpu blocking should end.

The patch comes with the side effect of effectively obsoleting KVM_MP_STATE_SIPI_RECEIVED. We still accept it from user space, but immediately translate it to KVM_MP_STATE_INIT_RECEIVED + KVM_APIC_SIPI. The vcpu itself will no longer enter the KVM_MP_STATE_SIPI_RECEIVED state. That also means we no longer exit to user space after receiving a SIPI event.

Furthermore, we already reset the VCPU on INIT, only fixing up the code segment later on when SIPI arrives. Moreover, we fix INIT handling for the BSP: it never enters wait-for-SIPI but directly starts over on INIT.

Tested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-03-13 x86: Use tick broadcast expired check (Thomas Gleixner)
Avoid going back into deep idle if the tick broadcast IPI is about to fire. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Arjan van de Veen <arjan@infradead.org> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20130306111537.702278273@linutronix.de
2013-03-13 KVM: MMU: make kvm_mmu_available_pages robust against n_used_mmu_pages > n_max_mmu_pages (Marcelo Tosatti)
As noticed by Ulrich Obergfell <uobergfe@redhat.com>, the mmu counters are for beancounting purposes only - so n_used_mmu_pages and n_max_mmu_pages could be relaxed (example: before f0f5933a1626c8df7b), resulting in n_used_mmu_pages > n_max_mmu_pages. Make code robust against n_used_mmu_pages > n_max_mmu_pages. Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
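A minimal userspace sketch of the kind of clamping this message describes, assuming the counters are unsigned. The struct and function names are hypothetical stand-ins, not the kernel's definitions: when the used count exceeds the maximum, the result is clamped to zero instead of wrapping around.

```c
/* Hypothetical stand-in for the two counters named in the commit message. */
struct mmu_counters {
    unsigned int n_used_mmu_pages;
    unsigned int n_max_mmu_pages;
};

/*
 * Return how many pages are still available. If the used count has
 * overtaken the maximum, report zero rather than letting the unsigned
 * subtraction wrap to a huge value.
 */
static unsigned int available_pages(const struct mmu_counters *c)
{
    if (c->n_max_mmu_pages > c->n_used_mmu_pages)
        return c->n_max_mmu_pages - c->n_used_mmu_pages;
    return 0;
}
```

The explicit comparison keeps the helper safe for any pair of counter values, which is the point of the commit: callers no longer need the invariant used <= max to hold.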
2013-03-12 Select VIRT_TO_BUS directly where needed (Stephen Rothwell)
In commit 887cbce0adea ("arch Kconfig: centralise ARCH_NO_VIRT_TO_BUS") I introduced the config symbol HAVE_VIRT_TO_BUS and selected that where needed. I am not sure what I was thinking. Instead, just directly select VIRT_TO_BUS where it is needed. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-12 KVM: x86: Drop unused return code from VCPU reset callback (Jan Kiszka)
Neither vmx nor svm nor the common part may generate an error on kvm_vcpu_reset. So drop the return code. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-03-12 VMX: x86: handle host TSC calibration failure (Marcelo Tosatti)
If the host TSC calibration fails, tsc_khz is zero (see tsc_init.c). Handle such case properly in KVM (instead of dividing by zero). https://bugzilla.redhat.com/show_bug.cgi?id=859282 Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-03-12 x86/platform/intel/mrst: Remove cast for kmalloc() return value (Zhang Yanfei)
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/513EB5DA.2010300@cn.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-11 x86: Constify a few items (Jan Beulich)
This in particular re-does the compiler warning fix 9faec5b ("perf/x86: Fix P6 driver section warning"), tightening the section attributes rather than relaxing them. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Shaun Ruffell <sruffell@digium.com> Cc: yangyongqiang <yangyongqiang01@baidu.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/513DB84502000078000C4880@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-11 x86: Drop always empty .text..page_aligned section (Jan Beulich)
Commit e44b7b7 ("x86: move suspend wakeup code to C") didn't care to also eliminate the side effects that the earlier 4c49156 ("x86: make arch/x86/kernel/acpi/wakeup_32.S use a separate") had, thus leaving a now pointless, almost page size gap at the beginning of .text. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Pavel Machek <pavel@ucw.cz> Link: http://lkml.kernel.org/r/513DBAA402000078000C4896@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-11 kvm: remove cast for kmalloc return value (Ioan Orghici)
Signed-off-by: Ioan Orghici <ioan.orghici@gmail.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-03-11 x86/platform/uv: Replace kmalloc() & memset with kzalloc() (Alexandru Gheorghiu)
This was found using coccicheck. Signed-off-by: Alexandru Gheorghiu <gheorghiuandru@gmail.com> Link: http://lkml.kernel.org/r/1362822043-15559-1-git-send-email-gheorghiuandru@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-10 crypto: crc32c - Update the links to the white papers on CRC32C calculations with PCLMULQDQ instructions. (Tim Chen)
Herbert, The following patch updates the stale link to the CRC32C white paper that was referenced. Tim Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2013-03-07 x86: Do not try to sync identity map for non-mapped pages (Dave Hansen)
kernel_map_sync_memtype() is called from a variety of contexts. The pat.c code that calls it seems to ensure that it is not called for non-ram areas by checking via pat_pagerange_is_ram(). It is important that it only be called on the actual identity map because there *IS* no map to sync for highmem pages, or for memory holes. The ioremap.c uses are not as careful as those from pat.c, and call kernel_map_sync_memtype() on PCI space which is in the middle of the kernel identity map _range_, but is not actually mapped. This patch adds a check to kernel_map_sync_memtype() which probably duplicates some of the checks already in pat.c. But, it is necessary for the ioremap.c uses and shouldn't hurt other callers. I have reproduced this bug and this patch fixes it for me and the original bug reporter: https://lkml.org/lkml/2013/2/5/396 Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20130307163151.D9B58C4E@kernel.stglabs.ibm.com Signed-off-by: Dave Hansen <dave@sr71.net> Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-03-07 KVM: MMU: Introduce a helper function for FIFO zapping (Takuya Yoshikawa)
Make the code for zapping the oldest mmu page, placed at the tail of the active list, a separate function. Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-07 KVM: MMU: Use list_for_each_entry_safe in kvm_mmu_commit_zap_page() (Takuya Yoshikawa)
We are traversing the linked list, invalid_list, deleting each entry by kvm_mmu_free_page(). _safe version is there for such a case. Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
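A small userspace sketch of why a "safe" traversal is needed when each entry is freed during the walk. It uses a hand-rolled singly linked list with hypothetical names rather than the kernel's list.h macros; the idea it illustrates is the same: read the next pointer before the current entry is freed.

```c
#include <stdlib.h>

/* Hypothetical list node, standing in for an entry on invalid_list. */
struct node {
    int value;
    struct node *next;
};

/* Push a new node on the front of the list; returns the new head. */
static struct node *push(struct node *head, int value)
{
    struct node *n = malloc(sizeof(*n));
    if (!n)
        return head;    /* allocation failed: list is left unchanged */
    n->value = value;
    n->next = head;
    return n;
}

/*
 * Free every node and return how many were freed. The successor is saved
 * in 'next' before free(cur), so the loop never dereferences freed memory.
 * A plain "cur = cur->next" in the loop header would read cur after free.
 */
static int free_all(struct node *head)
{
    int freed = 0;
    struct node *cur, *next;

    for (cur = head; cur != NULL; cur = next) {
        next = cur->next;
        free(cur);
        freed++;
    }
    return freed;
}
```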
2013-03-07 KVM: MMU: Fix and clean up for_each_gfn_* macros (Takuya Yoshikawa)
The expression (sp)->gfn should not be expanded using @gfn. Although no user of these macros passes a string other than gfn now, this should be fixed before anyone sees strange errors.

Note: ignored the following checkpatch errors:
  ERROR: Macros with complex values should be enclosed in parenthesis
  ERROR: trailing statements should be on next line

Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
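The pitfall described above can be reproduced in a few lines. This is a hedged sketch with hypothetical macro and struct names, not the kernel's for_each_gfn_* macros: when a macro parameter has the same name as a struct member used in the body, the preprocessor substitutes the member name too.

```c
/* Hypothetical stand-in for a shadow page descriptor. */
struct page_desc {
    unsigned long gfn;
};

/*
 * Fragile: the parameter is called 'gfn', so the member access (sp)->gfn
 * is rewritten as well. SP_MATCHES_FRAGILE(p, wanted) would expand to
 * ((p)->wanted == (wanted)) and fail to compile. It only works when the
 * caller's argument happens to be spelled 'gfn'.
 */
#define SP_MATCHES_FRAGILE(sp, gfn)  ((sp)->gfn == (gfn))

/* Robust: parameter names cannot collide with the member name. */
#define SP_MATCHES(_sp, _gfn)        ((_sp)->gfn == (_gfn))

/* The robust macro accepts an argument with any name. */
static int matches(const struct page_desc *sp, unsigned long wanted)
{
    return SP_MATCHES(sp, wanted);
}
```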
2013-03-07 KVM: nVMX: Fix setting of CR0 and CR4 in guest mode (Jan Kiszka)
The logic for calculating the value with which we call kvm_set_cr0/4 was broken (will definitely be visible with nested unrestricted guest mode support). Also, we performed the check regarding CR0_ALWAYSON too early when in guest mode. What really needs to be done on both CR0 and CR4 is to mask out L1-owned bits and merge them in from L1's guest_cr0/4. In contrast, arch.cr0/4 and arch.cr0/4_guest_owned_bits contain the mangled L0+L1 state and, thus, are not suited as input. For both CRs, we can then apply the check against VMXON_CRx_ALWAYSON and refuse the update if it fails. To be fully consistent, we implement this check now also for CR4. For CR4, we move the check into vmx_set_cr4 while we keep it in handle_set_cr0. This is because the CR0 checks for vmxon vs. guest mode will diverge soon when adding unrestricted guest mode support. Finally, we have to set the shadow to the value L2 wanted to write originally. Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
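The masking and merging step described in this message can be sketched as plain bit arithmetic. The function and parameter names below are hypothetical and the sketch ignores everything else the real handlers do; it only shows "take the L1-owned bits from L1's value, the rest from what L2 wrote, then check the always-on bits".

```c
/*
 * Combine the value L2 tried to write with L1's view of the register:
 * bits L1 owns (set in l1_owned_mask) come from l1_guest_cr, all other
 * bits come from l2_val.
 */
static unsigned long merge_l1_owned_bits(unsigned long l2_val,
                                         unsigned long l1_guest_cr,
                                         unsigned long l1_owned_mask)
{
    return (l2_val & ~l1_owned_mask) | (l1_guest_cr & l1_owned_mask);
}

/* Non-zero if every bit required by always_on is set in val. */
static int has_always_on_bits(unsigned long val, unsigned long always_on)
{
    return (val & always_on) == always_on;
}
```

An update would be refused when has_always_on_bits() returns zero for the merged value, which mirrors the check against VMXON_CRx_ALWAYSON mentioned above.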
2013-03-07 KVM: nVMX: Fix content of MSR_IA32_VMX_ENTRY/EXIT_CTLS (Jan Kiszka)
Properly set those bits to 1 that the spec demands in case bit 55 of VMX_BASIC is 0 - like in our case. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-07 context_tracking: Restore correct previous context state on exception exit (Frederic Weisbecker)
On exception exit, we restore the previous context tracking state based on the regs of the interrupted frame. Iff that frame is in user mode as stated by user_mode() helper, we restore the context tracking user mode.

However there is a tiny chunk of low level arch code after we pass through user_enter() and until the CPU eventually resumes userspace. If an exception happens in this tiny area, exception_enter() correctly exits the context tracking user mode but exception_exit() won't restore it because of the value returned by user_mode(regs). As a result we may return to userspace with the wrong context tracking state.

To fix this, change exception_enter() to return the context tracking state prior to its call and pass this saved state to exception_exit(). This restores the real context tracking state of the interrupted frame.

(Maybe this patch was suggested to me, I don't recall exactly. If so, sorry for the missing credit).

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Kevin Hilman <khilman@linaro.org> Cc: Mats Liljegren <mats.liljegren@enea.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
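The save-and-restore pattern this message describes can be sketched as follows. This is a simplified single-threaded model with hypothetical names (the real code tracks state per CPU and does more work on each transition); it only shows the entry hook returning the prior state and the exit hook restoring exactly that state.

```c
/* Hypothetical two-state model of context tracking. */
enum ctx_state { CTX_KERNEL = 0, CTX_USER = 1 };

/* In this sketch one global stands in for the per-CPU tracking state. */
static enum ctx_state tracked_state = CTX_KERNEL;

/*
 * On exception entry: remember what the tracking state was, switch to
 * kernel state, and hand the remembered value back to the caller.
 */
static enum ctx_state exception_enter_sketch(void)
{
    enum ctx_state prev = tracked_state;

    tracked_state = CTX_KERNEL;
    return prev;
}

/*
 * On exception exit: restore the state saved at entry, instead of
 * guessing it from the interrupted register frame.
 */
static void exception_exit_sketch(enum ctx_state prev)
{
    tracked_state = prev;
}
```

Because the restored value comes from the tracking state itself, an exception taken in the window after user_enter() but before the CPU actually resumes userspace still ends with the tracking state set back to user.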
2013-03-07 context_tracking: Move exception handling to generic code (Frederic Weisbecker)
Exception handling on context tracking should share common treatment: on entry we exit user mode if the exception triggered in that context. Then on exception exit we return to that previous context. Generalize this to avoid duplication across archs. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Kevin Hilman <khilman@linaro.org> Cc: Mats Liljegren <mats.liljegren@enea.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-03-06 x86, doc: Be explicit about what the x86 struct boot_params requires (Peter Jones)
If the sentinel triggers, we do not want the boot loader authors to just poke it and make the error go away, we want them to actually fix the problem. This should help avoid making the incorrect change in non-compliant bootloaders. [ hpa: dropped the Documentation/x86/boot.txt hunk pending clarifications ] Signed-off-by: Peter Jones <pjones@redhat.com> Link: http://lkml.kernel.org/r/1362592823-28967-1-git-send-email-pjones@redhat.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-03-06 x86: Don't clear efi_info even if the sentinel hits (Josh Boyer)
When boot_params->sentinel is set, all we really know is that some undefined set of fields in struct boot_params contain garbage. In the particular case of efi_info, however, there is a private magic for that substructure, so it is generally safe to leave it even if the bootloader is broken. kexec (for which we did the initial analysis) did not initialize this field, but of course all the EFI bootloaders do, and most EFI bootloaders are broken in this respect (and should be fixed.) Reported-by: Robin Holt <holt@sgi.com> Link: http://lkml.kernel.org/r/CA%2B5PVA51-FT14p4CRYKbicykugVb=PiaEycdQ57CK2km_OQuRQ@mail.gmail.com Tested-by: Josh Boyer <jwboyer@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-03-06 x86, mm: Make sure to find a 2M free block for the first mapped area (Yinghai Lu)
Henrik reported that his MacAir 3.1 would not boot with

| commit 8d57470d8f859635deffe3919d7d4867b488b85a
| Date: Fri Nov 16 19:38:58 2012 -0800
|
| x86, mm: setup page table in top-down

It turns out that we do not calculate the real_end properly: we try to get 2M size with 4K alignment, and later will round down to 2M, so we will get less than 2M for the first mapping; in the extreme case it could be only 4K.

In Henrik's system it has (1M-32K), as the last usable range is [mem 0x7f9db000-0x7fef8fff].

The problem is exposed when EFI booting has several holes, and it will force mapping to use PTE instead, as we only map usable areas.

To fix it, just make it be 2M aligned, so we can be guaranteed to be able to use large pages to map it.

Reported-by: Henrik Rydberg <rydberg@euromail.se> Bisected-by: Henrik Rydberg <rydberg@euromail.se> Tested-by: Henrik Rydberg <rydberg@euromail.se> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/CAE9FiQX4nQ7_1kg5RL_vh56rmcSHXUi1ExrZX7CwED4NGMnHfg@mail.gmail.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>
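The alignment effect described above can be illustrated with a small arithmetic model. This is a simplified sketch, not the kernel's memblock code, and all names are hypothetical: a 2M block is picked top-down below a limit with a given start alignment, and the "first chunk" is whatever lies between the 2M boundary below the block's end and that end.

```c
#include <stdint.h>

#define SZ_4K 0x1000ULL
#define SZ_2M 0x200000ULL

/* Round x down to a multiple of align (align must be a power of two). */
static uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
    return x & ~(align - 1);
}

/*
 * Model of a top-down search: place a 2M block as high as possible below
 * 'limit' with its start aligned to 'align', and return the block's end.
 */
static uint64_t pick_real_end(uint64_t limit, uint64_t align)
{
    uint64_t start = round_down_pow2(limit - SZ_2M, align);

    return start + SZ_2M;
}

/*
 * Size of the first mapped chunk if mapping begins at the 2M boundary
 * just below real_end and runs up to real_end.
 */
static uint64_t first_chunk_size(uint64_t real_end)
{
    uint64_t chunk_start = round_down_pow2(real_end - 1, SZ_2M);

    return real_end - chunk_start;
}
```

In this model, a limit that is only 4K aligned yields a first chunk smaller than 2M when the block start is 4K aligned, while a 2M-aligned start always yields a full 2M chunk that a large page can cover.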
2013-03-06 x86: Fix 32-bit *_cpu_data initializers (Krzysztof Mazur)
The commit 27be457000211a6903968dfce06d5f73f051a217 ('x86 idle: remove 32-bit-only "no-hlt" parameter, hlt_works_ok flag') removed the hlt_works_ok flag from struct cpuinfo_x86, but the boot_cpu_data and new_cpu_data initializers were not changed, so they set the f00f_bug flag instead of fdiv_bug. If CONFIG_X86_F00F_BUG is not set, the f00f_bug flag is never cleared. To avoid such problems in the future, C99-style initialization is now used. Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net> Acked-by: Borislav Petkov <bp@suse.de> Cc: len.brown@intel.com Link: http://lkml.kernel.org/r/1362266082-2227-1-git-send-email-krzysiek@podlesie.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
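A short sketch of the failure mode and of the C99 designated-initializer remedy. The struct below is a hypothetical three-field reduction, not the real struct cpuinfo_x86: once a field is removed from the middle, a positional initializer silently shifts its values onto the following fields, while a designated initializer keeps naming the field it means.

```c
/* Hypothetical, reduced layout after the hlt_works_ok field was removed. */
struct cpu_quirks {
    int wp_works_ok;
    /* int hlt_works_ok;   removed from the struct */
    int fdiv_bug;
    int f00f_bug;
};

/*
 * Stale positional initializer: the third value was meant for fdiv_bug
 * when the struct still had hlt_works_ok as its second field. With that
 * field gone, the same value now lands on f00f_bug.
 */
static const struct cpu_quirks positional = { 0, 0, 1 };

/*
 * C99 designated initializer: each value is tied to a field name, so
 * removing or reordering other fields cannot redirect it. Fields that
 * are not named are zero-initialized.
 */
static const struct cpu_quirks designated = {
    .wp_works_ok = 0,
    .fdiv_bug = 1,
};
```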
2013-03-05 KVM: nVMX: Reset RFLAGS on VM-exit (Jan Kiszka)
Ouch, how could this work so well that far? We need to clear RFLAGS to the reset value as specified by the SDM. Particularly, IF must be off after VM-exit! Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-05 x86, smpboot: Remove unused variable (Borislav Petkov)
The cpuinfo_x86 ptr is unused now. Drop it. Got obsolete by 69fb3676df33 ("x86 idle: remove mwait_idle() and "idle=mwait" cmdline param") removing its only user. [ hpa: fixes gcc warning ] Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1362428180-8865-2-git-send-email-bp@alien8.de Cc: Len Brown <len.brown@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-03-04 KVM: nVMX: Fix switching of debug state (Jan Kiszka)
First of all, do not blindly overwrite GUEST_DR7 on L2 entry. The host may have guest debugging enabled. Then properly reset DR7 and DEBUG_CTL on L2->L1 switch as specified in the SDM. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-04 KVM: set_memory_region: Refactor commit_memory_region() (Takuya Yoshikawa)
This patch makes the parameter old a const pointer to the old memory slot and adds a new parameter named change to know the change being requested: the former is for removing extra copying and the latter is for cleaning up the code. Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-04 KVM: set_memory_region: Refactor prepare_memory_region() (Takuya Yoshikawa)
This patch drops the parameter old, a copy of the old memory slot, and adds a new parameter named change to know the change being requested. This not only cleans up the code but also removes extra copying of the memory slot structure. Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-04 KVM: set_memory_region: Drop user_alloc from set_memory_region() (Takuya Yoshikawa)
Except ia64's stale code, KVM_SET_MEMORY_REGION support, this is only used for sanity checks in __kvm_set_memory_region() which can easily be changed to use slot id instead. Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-04 KVM: set_memory_region: Drop user_alloc from prepare/commit_memory_region() (Takuya Yoshikawa)
X86 does not use this any more. The remaining user, s390's !user_alloc check, can be simply removed since KVM_SET_MEMORY_REGION ioctl is no longer supported. Note: fixed powerpc's indentations with spaces to suppress checkpatch errors. Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-04 Merge branch 'master' into queue (Marcelo Tosatti)
* master: (15791 commits)
  Linux 3.9-rc1
  btrfs/raid56: Add missing #include <linux/vmalloc.h>
  fix compat_sys_rt_sigprocmask()
  SUNRPC: One line comment fix
  ext4: enable quotas before orphan cleanup
  ext4: don't allow quota mount options when quota feature enabled
  ext4: fix a warning from sparse check for ext4_dir_llseek
  ext4: convert number of blocks to clusters properly
  ext4: fix possible memory leak in ext4_remount()
  jbd2: fix ERR_PTR dereference in jbd2__journal_start
  metag: Provide dma_get_sgtable()
  metag: prom.h: remove declaration of metag_dt_memblock_reserve()
  metag: copy devicetree to non-init memory
  metag: cleanup metag_ksyms.c includes
  metag: move mm/init.c exports out of metag_ksyms.c
  metag: move usercopy.c exports out of metag_ksyms.c
  metag: move setup.c exports out of metag_ksyms.c
  metag: move kick.c exports out of metag_ksyms.c
  metag: move traps.c exports out of metag_ksyms.c
  metag: move irq enable out of irqflags.h on SMP
  ...

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Conflicts:
  arch/x86/kernel/kvmclock.c
2013-03-04 x86: Make Linux guest support optional (Borislav Petkov)
Put all config options needed to run Linux as a guest behind a CONFIG_HYPERVISOR_GUEST menu so that they don't get built-in by default but be selectable by the user. Also, make all units which depend on x86_hyper, depend on this new symbol so that compilation doesn't fail when CONFIG_HYPERVISOR_GUEST is disabled but those units assume its presence. Sort options in the new HYPERVISOR_GUEST menu, adapt config text and drop redundant select. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1362428421-9244-3-git-send-email-bp@alien8.de Cc: Dmitry Torokhov <dtor@vmware.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-03-04 x86, Kconfig: Move PARAVIRT_DEBUG into the paravirt menu (Borislav Petkov)
This should be under the PARAVIRT_GUEST menu. Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1362428421-9244-2-git-send-email-bp@alien8.de Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-03-04 x86, efi: Make efi_memblock_x86_reserve_range more readable (Borislav Petkov)
So basically this function copies EFI memmap stuff from boot_params into the EFI memmap descriptor and reserves memory for it. Make it much more readable. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Matthew Garret <mjg59@srcf.ucam.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-03-03 x86: trim sys_ia32.h (Al Viro)
remove the externs for functions that don't exist anymore Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-03-03 x86: sys32_kill and sys32_mprotect are pointless (Al Viro)
their argument types are identical to those of sys_kill and sys_mprotect resp., so we are not doing any kind of argument validation, etc. in those - they turn into unconditional branches to corresponding syscalls. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-03-03 merge compat sys_ipc instances (Al Viro)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-03-03 consolidate compat lookup_dcookie() (Al Viro)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-03-03 convert sendfile{,64} to COMPAT_SYSCALL_DEFINE (Al Viro)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-03-03 make SYSCALL_DEFINE<n>-generated wrappers do asmlinkage_protect (Al Viro)
... and switch i386 to HAVE_SYSCALL_WRAPPERS, killing open-coded uses of asmlinkage_protect() in a bunch of syscalls. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-03-03 consolidate cond_syscall and SYSCALL_ALIAS declarations (Al Viro)
take them to asm/linkage.h, with default in linux/linkage.h Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-03-03 Merge tag 'stable/for-linus-3.9-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen (Linus Torvalds)
Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
 - Update the Xen ACPI memory and CPU hotplug locking mechanism.
 - Fix PAT issues wherein various applications would not start
 - Fix handling of multiple MSI as AHCI now does it.
 - Fix ARM compile failures.

* tag 'stable/for-linus-3.9-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xenbus: fix compile failure on ARM with Xen enabled
  xen/pci: We don't do multiple MSI's.
  xen/pat: Disable PAT using pat_enabled value.
  xen/acpi: xen cpu hotplug minor updates
  xen/acpi: xen memory hotplug minor updates
2013-03-03 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs (Linus Torvalds)
Pull more VFS bits from Al Viro:
 "Unfortunately, it looks like xattr series will have to wait until the next cycle ;-/

  This pile contains 9p cleanups and fixes (races in v9fs_fid_add() etc), fixup for nommu breakage in shmem.c, several cleanups and a bit more file_inode() work"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  constify path_get/path_put and fs_struct.c stuff
  fix nommu breakage in shmem.c
  cache the value of file_inode() in struct file
  9p: if v9fs_fid_lookup() gets to asking server, it'd better have hashed dentry
  9p: make sure ->lookup() adds fid to the right dentry
  9p: untangle ->lookup() a bit
  9p: double iput() in ->lookup() if d_materialise_unique() fails
  9p: v9fs_fid_add() can't fail now
  v9fs: get rid of v9fs_dentry
  9p: turn fid->dlist into hlist
  9p: don't bother with private lock in ->d_fsdata; dentry->d_lock will do just fine
  more file_inode() open-coded instances
  selinux: opened file can't have NULL or negative ->f_path.dentry

(In the meantime, the hlist traversal macros have changed, so this required a semantic conflict fixup for the newly hlistified fid->dlist)
2013-03-02 x86, ACPI, mm: Revert movablemem_map support (Yinghai Lu)
Tim found:

  WARNING: at arch/x86/kernel/smpboot.c:324 topology_sane.isra.2+0x6f/0x80()
  Hardware name: S2600CP
  sched: CPU #1's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
  smpboot: Booting Node 1, Processors #1
  Modules linked in:
  Pid: 0, comm: swapper/1 Not tainted 3.9.0-0-generic #1
  Call Trace:
    set_cpu_sibling_map+0x279/0x449
    start_secondary+0x11d/0x1e5

Don Morris reproduced on a HP z620 workstation, and bisected it to commit e8d195525809 ("acpi, memory-hotplug: parse SRAT before memblock is ready")

It turns out movable_map has some problems, and it breaks several things:

1. numa_init is called several times, NOT just for srat. So those
     nodes_clear(numa_nodes_parsed)
     memset(&numa_meminfo, 0, sizeof(numa_meminfo))
   can not be just removed. Need to consider that the sequence is: numaq, srat, amd, dummy, and make the fall back path work.

2. simply split acpi_numa_init to early_parse_srat.
   a. early_parse_srat is NOT called for ia64, so you break ia64.
   b. for (i = 0; i < MAX_LOCAL_APIC; i++) set_apicid_to_node(i, NUMA_NO_NODE) is still left in numa_init, so it will just clear the result from early_parse_srat. It should be moved before that....
   c. it breaks ACPI_TABLE_OVERIDE... as the acpi table scan is moved early, before the override from INITRD is settled.

3. that patch TITLE is totally misleading: there is NO x86 in the title, but it changes critical x86 code. That caused the x86 guys to not pay attention and find the problem early. Those patches really should be routed via tip/x86/mm.

4. after that commit, the following ranges can not use movable ram:
   a. real_mode code.... well..funny, legacy Node0 [0,1M) could be hot-removed?
   b. initrd... it will be freed after booting, so it could be on movable...
   c. crashkernel for kdump...: looks like we can not put kdump kernel above 4G anymore.
   d. init_mem_mapping: can not put page table high anymore.
   e. initmem_init: vmemmap can not be high local node anymore. That is not good.

If a node is hotpluggable, the mem related ranges like page table and vmemmap could be on that node without problem, and should be on that node.

We have a workaround patch that could fix some problems, but some can not be fixed.

So just remove that offending commit and related ones including:

  f7210e6c4ac7 ("mm/memblock.c: use CONFIG_HAVE_MEMBLOCK_NODE_MAP to protect movablecore_map in memblock_overlaps_region().")
  01a178a94e8e ("acpi, memory-hotplug: support getting hotplug info from SRAT")
  27168d38fa20 ("acpi, memory-hotplug: extend movablemem_map ranges to the end of node")
  e8d195525809 ("acpi, memory-hotplug: parse SRAT before memblock is ready")
  fb06bc8e5f42 ("page_alloc: bootmem limit with movablecore_map")
  42f47e27e761 ("page_alloc: make movablemem_map have higher priority")
  6981ec31146c ("page_alloc: introduce zone_movable_limit[] to keep movable limit for nodes")
  34b71f1e04fc ("page_alloc: add movable_memmap kernel parameter")
  4d59a75125d5 ("x86: get pg_data_t's memory from other node")

Later we should have patches that make sure the kernel puts page table and vmemmap on local node ram instead of pushing them down to node0. We also need to find a way to put other kernel used ram on local node ram.

Reported-by: Tim Gardner <tim.gardner@canonical.com> Reported-by: Don Morris <don.morris@hp.com> Bisected-by: Don Morris <don.morris@hp.com> Tested-by: Don Morris <don.morris@hp.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Thomas Renninger <trenn@suse.de> Cc: Tejun Heo <tj@kernel.org> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-02 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal (Linus Torvalds)
Pull signal/compat fixes from Al Viro:
 "Fixes for several regressions introduced in the last signal.git pile, along with fixing bugs in truncate and ftruncate compat (on just about anything biarch at least one of those two had been done wrong)."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  compat: restore timerfd settime and gettime compat syscalls
  [regression] braino in "sparc: convert to ksignal"
  fix compat truncate/ftruncate
  switch lseek to COMPAT_SYSCALL_DEFINE
  lseek() and truncate() on sparc really need sign extension
2013-03-01 x86_64: Use __BOOT_DS instead of __KERNEL_DS for safety (Lans Zhang)
In startup_32, the running code still uses the initial GDT located in setup. Thus, __BOOT_DS is preferred. Currently __KERNEL_DS happens to equal __BOOT_DS, but relying on that is not safe. Signed-off-by: Lans Zhang <lans.zhang2008@gmail.com> Link: http://lkml.kernel.org/r/51300267.6000008@gmail.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-03-01 xen/pci: We don't do multiple MSI's. (Konrad Rzeszutek Wilk)
There is no hypercall to setup multiple MSI per PCI device. As such with these two new commits:
 - 08261d87f7d1b6253ab3223756625a5c74532293 PCI/MSI: Enable multiple MSIs with pci_enable_msi_block_auto()
 - 5ca72c4f7c412c2002363218901eba5516c476b1 AHCI: Support multiple MSIs

we would call the PHYSDEVOP_map_pirq 'nvec' times with the same contents of the PCI device. Sander discovered that we would get the same PIRQ value 'nvec' times and return said values to the caller. That of course meant that the device was configured only with one MSI and AHCI would fail with:

  ahci 0000:00:11.0: version 3.0
  xen: registering gsi 19 triggering 0 polarity 1
  xen: --> pirq=19 -> irq=19 (gsi=19)
  (XEN) [2013-02-27 19:43:07] IOAPIC[0]: Set PCI routing entry (6-19 -> 0x99 -> IRQ 19 Mode:1 Active:1)
  ahci 0000:00:11.0: AHCI 0001.0200 32 slots 4 ports 6 Gbps 0xf impl SATA mode
  ahci 0000:00:11.0: flags: 64bit ncq sntf ilck pm led clo pmp pio slum part
  ahci: probe of 0000:00:11.0 failed with error -22

That is b/c in ahci_host_activate the second call to devm_request_threaded_irq would return -EINVAL as we passed in (on the second run) an IRQ that was never initialized.

CC: stable@vger.kernel.org Reported-and-Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-28 Merge git://git.kernel.org/pub/scm/virt/kvm/kvm (Linus Torvalds)
Pull one kvm bugfix from Gleb Natapov.

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  x86/kvm: Fix pvclock vsyscall fixmap