summaryrefslogtreecommitdiffstats
path: root/arch
AgeCommit message (Collapse)Author
2014-05-03sparc64: Use 'ILOG2_4MB' instead of constant '22'.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-03sparc64: Fix range check in kern_addr_valid().David S. Miller
In commit b2d438348024b75a1ee8b66b85d77f569a5dfed8 ("sparc64: Make PAGE_OFFSET variable."), the MAX_PHYS_ADDRESS_BITS value was increased (to 47). This constant reference to '41UL' was missed. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-03sparc64: Fix top-level fault handling bugs.David S. Miller
Make get_user_insn() able to cope with huge PMDs. Next, make do_fault_siginfo() more robust when get_user_insn() can't actually fetch the instruction. In particular, use the MMU announced fault address when that happens, instead of calling compute_effective_address() and computing garbage. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-03sparc64: Handle 32-bit tasks properly in compute_effective_address().David S. Miller
If we have a 32-bit task we must chop off the top 32-bits of the 64-bit value just as the cpu would. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-03sparc64: Don't use _PAGE_PRESENT in pte_modify() mask.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-03sparc64: Fix hex values in comment above pte_modify().David S. Miller
When _PAGE_SPECIAL and _PAGE_PMD_HUGE were added to the mask, the comment was not updated. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-03sparc64: Fix bugs in get_user_pages_fast() wrt. THP.David S. Miller
The large PMD path needs to check _PAGE_VALID not _PAGE_PRESENT, to decide if it needs to bail and return 0. pmd_large() should therefore just check _PAGE_PMD_HUGE. Calls to gup_huge_pmd() are guarded with a check of pmd_large(), so we just need to add a valid bit check. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-03sparc64: Fix huge PMD invalidation.David S. Miller
On sparc64 "present" and "valid" are seperate PTE bits, this allows us to naturally distinguish between the user explicitly asking for PROT_NONE with mprotect() and other situations. However we weren't handling this properly in the huge PMD paths. First of all, the page table walker in the TSB miss path only checks for _PAGE_PMD_HUGE. So the generic pmdp_invalidate() would clear _PAGE_PRESENT but the TLB miss paths would still load it into the TLB as a valid huge PMD. Fix this by clearing the valid bit in pmdp_invalidate(), and also checking the valid bit in USER_PGTABLE_CHECK_PMD_HUGE using "brgez" since _PAGE_VALID is bit 63 in both the sun4u and sun4v pte layouts. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-03sparc64: Fix executable bit testing in set_pmd_at() paths.David S. Miller
This code was mistakenly using the exec bit from the PMD in all cases, even when the PMD isn't a huge PMD. If it's not a huge PMD, test the exec bit in the individual ptes down in tlb_batch_pmd_scan(). Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-03sparc64: Normalize NMI watchdog logging and behavior.David S. Miller
Bring this code in line with the perf based generic NMI watchdog in kernel/watchdog.c (which we should convert over to at some point). In particular, don't do anything super fancy when the watchdog triggers, and specifically don't do a do_exit() which only makes things worse. Either panic(), or WARN(). The latter of which will do all of the actions such as give us a stack backtrace. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-03arm64: Mark the Applied Micro X-Gene SATA controller as DMA coherentCatalin Marinas
Since the default DMA ops for arm64 are non-coherent, mark the X-Gene controller explicitly as dma-coherent to avoid additional cache maintenance. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Loc Ho <lho@apm.com>
2014-05-03arm64: Use bus notifiers to set per-device coherent DMA opsCatalin Marinas
Recently, the default DMA ops have been changed to non-coherent for alignment with 32-bit ARM platforms (and DT files). This patch adds bus notifiers to be able to set the coherent DMA ops (with no cache maintenance) for devices explicitly marked as coherent via the "dma-coherent" DT property. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-05-03arm64: Make default dma_ops to be noncoherentRitesh Harjani
Currently arm64 dma_ops is by default made coherent which makes it opposite in default policy from arm. Make default dma_ops to be noncoherent (same as arm), as currently there aren't any dma-capable drivers which assumes coherent ops Signed-off-by: Ritesh Harjani <ritesh.harjani@gmail.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-05-03arm64: fixmap: fix missing sub-page offset for earlyprintkMarc Zyngier
Commit d57c33c5daa4 (add generic fixmap.h) added (among other similar things) set_fixmap_io to deal with early ioremap of devices. More recently, commit bf4b558eba92 (arm64: add early_ioremap support) converted the arm64 earlyprintk to use set_fixmap_io. A side effect of this conversion is that my virtual machines have stopped booting when I pass "earlyprintk=uart8250-8bit,0x3f8" to the guest kernel. Turns out that the new earlyprintk code doesn't care at all about sub-page offsets, and just assumes that the earlyprintk device will be page-aligned. Obviously, that doesn't play well with the above example. Further investigation shows that set_fixmap_io uses __set_fixmap instead of __set_fixmap_offset. A fix is to introduce a set_fixmap_offset_io that uses the latter, and to remove the superflous call to fix_to_virt (which only returns the value that set_fixmap_io has already given us). With this applied, my VMs are back in business. Tested on a Cortex-A57 platform with kvmtool as platform emulation. Cc: Will Deacon <will.deacon@arm.com> Acked-by: Mark Salter <msalter@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-05-03arm64: Fix for the arm64 kern_addr_valid() functionDave Anderson
Fix for the arm64 kern_addr_valid() function to recognize virtual addresses in the kernel logical memory map. The function fails as written because it does not check whether the addresses in that region are mapped at the pmd level to 2MB or 512MB pages, continues the page table walk to the pte level, and issues a garbage value to pfn_valid(). Tested on 4K-page and 64K-page kernels. Signed-off-by: Dave Anderson <anderson@redhat.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-05-03Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "This udpate delivers: - A fix for dynamic interrupt allocation on x86 which is required to exclude the GSI interrupts from the dynamic allocatable range. This was detected with the newfangled tablet SoCs which have GPIOs and therefor allocate a range of interrupts. The MSI allocations already excluded the GSI range, so we never noticed before. - The last missing set_irq_affinity() repair, which was delayed due to testing issues - A few bug fixes for the armada SoC interrupt controller - A memory allocation fix for the TI crossbar interrupt controller - A trivial kernel-doc warning fix" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip: irq-crossbar: Not allocating enough memory irqchip: armanda: Sanitize set_irq_affinity() genirq: x86: Ensure that dynamic irq allocation does not conflict linux/interrupt.h: fix new kernel-doc warnings irqchip: armada-370-xp: Fix releasing of MSIs irqchip: armada-370-xp: implement the ->check_device() msi_chip operation irqchip: armada-370-xp: fix invalid cast of signed value into unsigned variable
2014-05-03x86/efi: earlyprintk=efi,keep fixDave Young
earlyprintk=efi,keep will cause kernel hangs while freeing initmem like below: VFS: Mounted root (ext4 filesystem) readonly on device 254:2. devtmpfs: mounted Freeing unused kernel memory: 880K (ffffffff817d4000 - ffffffff818b0000) It is caused by efi earlyprintk use __init function which will be freed later. Such as early_efi_write is marked as __init, also it will use early_ioremap which is init function as well. To fix this issue, I added early initcall early_efi_map_fb which maps the whole efi fb for later use. OTOH, adding a wrapper function early_efi_map which calls early_ioremap before ioremap is available. With this patch applied efi boot ok with earlyprintk=efi,keep console=efi Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-05-02Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "Two very small changes: one fix for the vSMP Foundation platform, and one to help LLVM not choke on options it doesn't understand (although it probably should)" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vsmp: Fix irq routing x86: LLVMLinux: Wrap -mno-80387 with cc-option
2014-05-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: - Fix for a Haswell regression in nested virtualization, introduced during the merge window. - A fix from Oleg to async page faults. - A bunch of small ARM changes. - A trivial patch to use the new MSI-X API introduced during the merge window. * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: ARM: vgic: Fix the overlap check action about setting the GICD & GICC base address. KVM: arm/arm64: vgic: fix GICD_ICFGR register accesses KVM: async_pf: mm->mm_users can not pin apf->mm KVM: ARM: vgic: Fix sgi dispatch problem MAINTAINERS: co-maintainance of KVM/{arm,arm64} arm: KVM: fix possible misalignment of PGDs and bounce page KVM: x86: Check for host supported fields in shadow vmcs kvm: Use pci_enable_msix_exact() instead of pci_enable_msix() ARM: KVM: disable KVM in Kconfig on big-endian systems
2014-05-02Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "Two bug fixes, one to fix a potential information leak in the BPF jit and common-io-layer fix for old firmware levels" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/bpf,jit: initialize A register if 1st insn is BPF_S_LDX_B_MSH s390/chsc: fix SEI usage on old FW levels
2014-05-02x86/efi: earlyprintk=efi,keep fixDave Young
earlyprintk=efi,keep will cause kernel hangs while freeing initmem like below: VFS: Mounted root (ext4 filesystem) readonly on device 254:2. devtmpfs: mounted Freeing unused kernel memory: 880K (ffffffff817d4000 - ffffffff818b0000) It is caused by efi earlyprintk use __init function which will be freed later. Such as early_efi_write is marked as __init, also it will use early_ioremap which is init function as well. To fix this issue, I added early initcall early_efi_map_fb which maps the whole efi fb for later use. OTOH, adding a wrapper function early_efi_map which calls early_ioremap before ioremap is available. With this patch applied efi boot ok with earlyprintk=efi,keep console=efi Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-05-02sparc64: Make itc_sync_lock rawKirill Tkhai
One more place where we must not be able to be preempted or to be interrupted in RT. Always actually disable interrupts during synchronization cycle. Signed-off-by: Kirill Tkhai <tkhai@yandex.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-02sparc64: Fix argument sign extension for compat_sys_futex().David S. Miller
Only the second argument, 'op', is signed. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-01ARM: sunxi: Enable GMAC in sunxi_defconfigMaxime Ripard
Since the support of the GMAC has been merged, we're using it as the ethernet controller on the A20 devices. However, sunxi_defconfig wasn't selecting it hence breaking the NFS boot. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
2014-05-01Merge branch 'parisc-3.15-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: "Drop the architecture-specifc value for_STK_LIM_MAX to fix stack related problems with GNU make" * 'parisc-3.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Use generic uapi/asm/resource.h file parisc: remove _STK_LIM_MAX override
2014-05-01parisc: Use generic uapi/asm/resource.h fileHelge Deller
Signed-off-by: Helge Deller <deller@gmx.de>
2014-05-01parisc: remove _STK_LIM_MAX overrideJohn David Anglin
There are only a couple of architectures that override _STK_LIM_MAX to a non-infinity value. This changes the stack allocation semantics in subtle ways. For example, GNU make changes its stack allocation to the hard maximum defined by _STK_LIM_MAX. As a results, threads executed by processes running under make are allocated a stack size of _STK_LIM_MAX rather than a sensible default value. This causes various thread stress tests to fail when they can't muster more than about 50 threads. The attached change implements the default behavior used by the majority of architectures. Signed-off-by: John David Anglin <dave.anglin@bell.net> Reviewed-by: Carlos O'Donell <carlos@systemhalted.org> Cc: stable@vger.kernel.org # 3.14 Signed-off-by: Helge Deller <deller@gmx.de>
2014-05-01Hexagon: Delete stale barrier.hVineet Gupta
Commit 93ea02bb8435 ("arch: Clean up asm/barrier.h implementations") wired generic barrier.h for hexagon, but failed to delete the existing file. Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Compile-tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-05-01Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Mostly tooling fixes, plus an Intel RAPL PMU driver fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf tests x86: Fix stack map lookup in dwarf unwind test perf x86: Fix perf to use non-executable stack, again perf tools: Remove extra '/' character in events file path perf machine: Search for modules in %s/lib/modules/%s perf tests: Add static build make test perf tools: Fix bfd dependency libraries detection perf tools: Use LDFLAGS instead of ALL_LDFLAGS perf/x86: Fix RAPL rdmsrl_safe() usage tools lib traceevent: Fix memory leak in pretty_print() tools lib traceevent: Fix backward compatibility macros for pevent filter enums perf tools: Disable libdw unwind for all but x86 arch perf tests x86: Fix memory leak in sample_ustack()
2014-04-30Merge tag 'kvm-arm-for-3.15-rc4' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master First round of KVM/ARM Fixes for 3.15 Includes vgic fixes, a possible kernel corruption bug due to misalignment of pages and disabling of KVM in KConfig on big-endian systems, because the last one breaks the build.
2014-04-30ARC: !PREEMPT: Ensure Return to kernel mode is IRQ safeVineet Gupta
There was a very small race window where resume to kernel mode from a Exception Path (or pure kernel mode which is true for most of ARC exceptions anyways), was not disabling interrupts in restore_regs, clobbering the exception regs Anton found the culprit call flow (after many sleepless nights) | 1. we got a Trap from user land | 2. started to service it. | 3. While doing some stuff on user-land memory (I think it is padzero()), | we got a DataTlbMiss | 4. On return from it we are taking "resume_kernel_mode" path | 5. NEED_RESHED is not set, so we go to "return from exception" path in | restore regs. | 6. there seems to be IRQ happening Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: <stable@vger.kernel.org> #3.10, 3.12, 3.13, 3.14 Cc: Anton Kolesov <Anton.Kolesov@synopsys.com> Cc: Francois Bedard <Francois.Bedard@synopsys.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-29ARM: common: edma: Fix xbar mappingThomas Gleixner
This is another great example of trainwreck engineering: commit 2646a0e529 (ARM: edma: Add EDMA crossbar event mux support) added support for using EDMA on peripherals which have no direct EDMA event mapping. The code compiles and does not explode in your face, but that's it. 1) Reading an u16 array from an u32 device tree array simply does not work. Even if the function is named "edma_of_read_u32_to_s16_array". It merily calls of_property_read_u16_array. So the resulting 16bit array will have every other entry = 0. 2) The DT entry for the xbar registers related to xbar has length 0x10 instead of the real length: 0xfd0 - 0xf90 = 0x40. Not a real problem as it does not cross a page boundary, but wrong nevertheless. 3) But none of this matters as the mapping never happens: After reading nonsense edma_of_read_u32_to_s16_array() invalidates the first array entry pair, so nobody can ever notice the braindamage by immediate explosion. Seems the QA criteria for this code was solely not to explode when someone adds edma-xbar-event-map entries to the DT. Goal achieved, congratulations! Not really helpful if someone wants to use edma on a device which requires a xbar mapping. Fix the issues by: - annotating the device tree entry with "/bits/ 16" as documented in the of_property_read_u16_array kernel doc - make the size of the xbar register mapping correct - invalidating the end of the array and not the start This convoluted mess wants to be completely rewritten as there is no point to keep the xbar_chan array memory and the iomapping of the xbar regs around forever. Marking the xbar mapped channels as used should be done right there. But that's a different issue and this patch is small enough to make it work and allows a simple backport for stable. Cc: stable@vger.kernel.org # v3.12+ Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sekhar Nori <nsekhar@ti.com>
2014-04-29KVM guest: Make pv trampoline code executableAlexander Graf
Our PV guest patching code assembles chunks of instructions on the fly when it encounters more complicated instructions to hijack. These instructions need to live in a section that we don't mark as non-executable, as otherwise we fault when jumping there. Right now we put it into the .bss section where it automatically gets marked as non-executable. Add a check to the NX setting function to ensure that we leave these particular pages executable. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-04-28Merge tag 'dt-for-linus' of git://git.secretlab.ca/git/linuxLinus Torvalds
Pull devicetree bug fixes from Grant Likely: "These are some important bug fixes that need to get into v3.15. This branch contains a pair of important bug fixes for the DT code: - Fix some incorrect binding property names before they enter common usage - Fix bug where some platform devices will be unable to get their interrupt number when they depend on an interrupt controller that is not available at device creation time. This is a problem causing mainline to fail on a number of ARM platforms" * tag 'dt-for-linus' of git://git.secretlab.ca/git/linux: of/irq: do irq resolution in platform_get_irq of: selftest: add deferred probe interrupt test dt: Fix binding typos in clock-names and interrupt-names
2014-04-28ARM: OMAP3: clock: Back-propagate rate change from cam_mclk to dpll4_m5 on ↵Laurent Pinchart
all OMAP3 platforms Commit 7b2e1277598e4187c9be3e61fd9b0f0423f97986 ("ARM: OMAP3: clock: Back-propagate rate change from cam_mclk to dpll4_m5") enabled clock rate back-propagation from cam_mclk do dpll4_m5 on OMAP3630 only. Perform back-propagation on other OMAP3 platforms as well. Reported-by: Jean-Philippe François <jp.francois@cynove.com> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: <stable@kernel.org> Signed-off-by: Paul Walmsley <paul@pwsan.com>
2014-04-28Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fixes from Ben Herrenschmidt: "Here is a bunch of post-merge window fixes that have been accumulating in patchwork while I was on vacation or buried under other stuff last week. We have the now usual batch of LE fixes from Anton (sadly some new stuff that went into this merge window had endian issues, we'll try to make sure we do better next time) Some fixes and cleanups to the new 24x7 performance monitoring stuff (mostly typos and cleaning up printk's) A series of fixes for an issue with our runlatch bit, which wasn't set properly for offlined threads/cores and under KVM, causing potentially some counters to misbehave along with possible power management issues. A fix for kexec nasty race where the new kernel wouldn't "see" the secondary processors having reached back into firmware in time. And finally a few other misc (and pretty simple) bug fixes" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (33 commits) powerpc/4xx: Fix section mismatch in ppc4xx_pci.c ppc/kvm: Clear the runlatch bit of a vcpu before napping ppc/kvm: Set the runlatch bit of a CPU just before starting guest ppc/powernv: Set the runlatch bits correctly for offline cpus powerpc/pseries: Protect remove_memory() with device hotplug lock powerpc: Fix error return in rtas_flash module init powerpc: Bump BOOT_COMMAND_LINE_SIZE to 2048 powerpc: Bump COMMAND_LINE_SIZE to 2048 powerpc: Rename duplicate COMMAND_LINE_SIZE define powerpc/perf/hv-24x7: Catalog version number is be64, not be32 powerpc/perf/hv-24x7: Remove [static 4096], sparse chokes on it powerpc/perf/hv-24x7: Use (unsigned long) not (u32) values when calling plpar_hcall_norets() powerpc/perf/hv-gpci: Make device attr static powerpc/perf/hv_gpci: Probe failures use pr_debug(), and padding reduced powerpc/perf/hv_24x7: Probe errors changed to pr_debug(), padding fixed powerpc/mm: Fix tlbie to add AVAL fields for 64K pages powerpc/powernv: Fix little endian issues in OPAL dump code powerpc/powernv: Create OPAL sglist helper functions and fix endian issues powerpc/powernv: Fix little endian issues in OPAL error log code powerpc/powernv: Fix little endian issues with opal_do_notifier calls ...
2014-04-28ARM: sun7i: Fix i2c4 base addressMaxime Ripard
For some reason, the base address of the fifth I2C adapter in the A20 was incorrect. Change this to the actual base address. Reported-by: Marcus Cooper <codekipper@gmail.com> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Acked-by: Hans de Goede <hdegoede@redhat.com>
2014-04-28KVM: PPC: Book3S: ifdef on CONFIG_KVM_BOOK3S_32_HANDLER for 32bitAlexander Graf
The book3s_32 target can get built as module which means we don't see the config define for it in code. Instead, check on the bool define CONFIG_KVM_BOOK3S_32_HANDLER whenever we want to know whether we're building for a book3s_32 host. This fixes running book3s_32 kvm as a module for me. Signed-off-by: Alexander Graf <agraf@suse.de> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2014-04-28KVM: PPC: Book3S HV: Add missing code for transaction reclaim on guest exitPaul Mackerras
Testing by Michael Neuling revealed that commit e4e38121507a ("KVM: PPC: Book3S HV: Add transactional memory support") is missing the code that saves away the checkpointed state of the guest when switching to the host. This adds that code, which was in earlier versions of the patch but went missing somehow. Reported-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-04-28KVM: PPC: Book3S: HV: make _PAGE_NUMA take effectpingfank@linux.vnet.ibm.com
Numa fault is a method which help to achieve auto numa balancing. When such a page fault takes place, the page fault handler will check whether the page is placed correctly. If not, migration should be involved to cut down the distance between the cpu and pages. A pte with _PAGE_NUMA help to implement numa fault. It means not to allow the MMU to access the page directly. So a page fault is triggered and numa fault handler gets the opportunity to run checker. As for the access of MMU, we need special handling for the powernv's guest. When we mark a pte with _PAGE_NUMA, we already call mmu_notifier to invalidate it in guest's htab, but when we tried to re-insert them, we firstly try to map it in real-mode. Only after this fails, we fallback to virt mode, and most of important, we run numa fault handler in virt mode. This patch guards the way of real-mode to ensure that if a pte is marked with _PAGE_NUMA, it will NOT be mapped in real mode, instead, it will be mapped in virt mode and have the opportunity to be checked with placement. Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-04-28arm: KVM: fix possible misalignment of PGDs and bounce pageMark Salter
The kvm/mmu code shared by arm and arm64 uses kalloc() to allocate a bounce page (if hypervisor init code crosses page boundary) and hypervisor PGDs. The problem is that kalloc() does not guarantee the proper alignment. In the case of the bounce page, the page sized buffer allocated may also cross a page boundary negating the purpose and leading to a hang during kvm initialization. Likewise the PGDs allocated may not meet the minimum alignment requirements of the underlying MMU. This patch uses __get_free_page() to guarantee the worst case alignment needs of the bounce page and PGDs on both arm and arm64. Cc: <stable@vger.kernel.org> # 3.10+ Signed-off-by: Mark Salter <msalter@redhat.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-04-28genirq: x86: Ensure that dynamic irq allocation does not conflictThomas Gleixner
On x86 the allocation of irq descriptors may allocate interrupts which are in the range of the GSI interrupts. That's wrong as those interrupts are hardwired and we don't have the irq domain translation like PPC. So one of these interrupts can be hooked up later to one of the devices which are hard wired to it and the io_apic init code for that particular interrupt line happily reuses that descriptor with a completely different configuration so hell breaks lose. Inside x86 we allocate dynamic interrupts from above nr_gsi_irqs, except for a few usage sites which have not yet blown up in our face for whatever reason. But for drivers which need an irq range, like the GPIO drivers, we have no limit in place and we don't want to expose such a detail to a driver. To cure this introduce a function which an architecture can implement to impose a lower bound on the dynamic interrupt allocations. Implement it for x86 and set the lower bound to nr_gsi_irqs, which is the end of the hardwired interrupt space, so all dynamic allocations happen above. That not only allows the GPIO driver to work sanely, it also protects the bogus callsites of create_irq_nr() in hpet, uv, irq_remapping and htirq code. They need to be cleaned up as well, but that's a separate issue. Reported-by: Jin Yao <yao.jin@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Mathias Nyman <mathias.nyman@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Grant Likely <grant.likely@linaro.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Krogerus Heikki <heikki.krogerus@intel.com> Cc: Linus Walleij <linus.walleij@linaro.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1404241617360.28206@ionos.tec.linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-28arm/xen: Remove definiition of virt_to_pfn in asm/xen/page.hJulien Grall
virt_to_pfn has been defined in asm/memory.h by the commit e26a9e0 "ARM: Better virt_to_page() handling" This will result of a compilation warning when CONFIG_XEN is enabled. arch/arm/include/asm/xen/page.h:80:0: warning: "virt_to_pfn" redefined [enabled by default] #define virt_to_pfn(v) (PFN_DOWN(__pa(v))) ^ In file included from arch/arm/include/asm/page.h:163:0, from arch/arm/include/asm/xen/page.h:4, from include/xen/page.h:4, from arch/arm/xen/grant-table.c:33: The definition in memory.h is nearly the same (it directly expand PFN_DOWN), so we can safely drop virt_to_pfn in xen include. Signed-off-by: Julien Grall <julien.grall@linaro.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-04-28Merge commit 'e26a9e0' into stable/for-linus-3.15David Vrabel
2014-04-28KVM: x86: Check for host supported fields in shadow vmcsBandan Das
We track shadow vmcs fields through two static lists, one for read only and another for r/w fields. However, with addition of new vmcs fields, not all fields may be supported on all hosts. If so, copy_vmcs12_to_shadow() trying to vmwrite on unsupported hosts will result in a vmwrite error. For example, commit 36be0b9deb23161 introduced GUEST_BNDCFGS, which is not supported by all processors. Filter out host unsupported fields before letting guests use shadow vmcs Signed-off-by: Bandan Das <bsd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-04-28x86/vsmp: Fix irq routingOren Twaig
Correct IRQ routing in case a vSMP box is detected but the Interrupt Routing Comply (IRC) value is set to "comply", which leads to incorrect IRQ routing. Before the patch: When a vSMP box was detected and IRC was set to "comply", users (and the kernel) couldn't effectively set the destination of the IRQs. This is because the hook inside vsmp_64.c always setup all CPUs as the IRQ destination using cpumask_setall() as the return value for IRQ allocation mask. Later, this "overrided" mask caused the kernel to set the IRQ destination to the lowest online CPU in the mask (CPU0 usually). After the patch: When the IRC is set to "comply", users (and the kernel) can control the destination of the IRQs as we will not be changing the default "apic->vector_allocation_domain". Signed-off-by: Oren Twaig <oren@scalemp.com> Acked-by: Shai Fultheim <shai@scalemp.com> Link: http://lkml.kernel.org/r/1398669697-2123-1-git-send-email-oren@scalemp.com [ Minor readability edits. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-28powerpc/4xx: Fix section mismatch in ppc4xx_pci.cAlistair Popple
This patch fixes this section mismatch: WARNING: vmlinux.o(.text+0x1efc4): Section mismatch in reference from the function apm821xx_pciex_init_port_hw() to the function .init.text:ppc4xx_pciex_wait_on_sdr.isra.9() The function apm821xx_pciex_init_port_hw() references the function __init ppc4xx_pciex_wait_on_sdr.isra.9(). This is often because apm821xx_pciex_init_port_hw lacks a __init annotation or the annotation of ppc4xx_pciex_wait_on_sdr.isra.9 is wrong. apm821xx_pciex_init_port_hw is only referenced by a struct in __initdata, so it should be safe to add __init to apm821xx_pciex_init_port_hw. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28ppc/kvm: Clear the runlatch bit of a vcpu before nappingPreeti U Murthy
When the guest cedes the vcpu or the vcpu has no guest to run it naps. Clear the runlatch bit of the vcpu before napping to indicate an idle cpu. Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28ppc/kvm: Set the runlatch bit of a CPU just before starting guestPreeti U Murthy
The secondary threads in the core are kept offline before launching guests in kvm on powerpc: "371fefd6f2dc4666:KVM: PPC: Allow book3s_hv guests to use SMT processor modes." Hence their runlatch bits are cleared. When the secondary threads are called in to start a guest, their runlatch bits need to be set to indicate that they are busy. The primary thread has its runlatch bit set though, but there is no harm in setting this bit once again. Hence set the runlatch bit for all threads before they start guest. Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28ppc/powernv: Set the runlatch bits correctly for offline cpusPreeti U Murthy
Up until now we have been setting the runlatch bits for a busy CPU and clearing it when a CPU enters idle state. The runlatch bit has thus been consistent with the utilization of a CPU as long as the CPU is online. However when a CPU is hotplugged out the runlatch bit is not cleared. It needs to be cleared to indicate an unused CPU. Hence this patch has the runlatch bit cleared for an offline CPU just before entering an idle state and sets it immediately after it exits the idle state. Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>