path: root/arch/arm/include/asm
2013-10-22 arm/arm64: KVM: PSCI: use MPIDR to identify a target CPU (Marc Zyngier)
The KVM PSCI code blindly assumes that vcpu_id and MPIDR are the same thing. This is true when vcpus are organized as a flat topology, but is wrong when trying to emulate any other topology (such as A15 clusters). Change the KVM PSCI CPU_ON code to look at the MPIDR instead of the vcpu_id to pick a target CPU. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
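A sketch of the lookup this implies; the helper names (vcpu_reg, kvm_vcpu_get_mpidr, MPIDR_HWID_BITMASK) follow the style of the KVM/ARM code of the era but are used illustratively here, not as the exact patch:

    /* PSCI CPU_ON: the target MPIDR arrives in r1; find the vcpu whose
     * affinity bits match, instead of indexing the vcpu array directly. */
    static struct kvm_vcpu *psci_find_target(struct kvm_vcpu *source_vcpu)
    {
        unsigned long cpu_id = *vcpu_reg(source_vcpu, 1) & MPIDR_HWID_BITMASK;
        struct kvm_vcpu *tmp;
        int i;

        kvm_for_each_vcpu(i, tmp, source_vcpu->kvm) {
            if ((kvm_vcpu_get_mpidr(tmp) & MPIDR_HWID_BITMASK) == cpu_id)
                return tmp;
        }
        return NULL;  /* no matching CPU: CPU_ON fails with INVALID_PARAMETERS */
    }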
2013-10-19 Merge tag 'v3.12-rc6' into devel (Linus Walleij)
Linux 3.12-rc6 Conflicts: drivers/gpio/gpio-lynxpoint.c
2013-10-19 ARM: tlb: ASID macro should give 32bit result for BE correct operation (Victor Kamensky)
In order for the ASID macro to be usable as an expression passed to inline asm as an 'r' operand, it needs to produce a 32-bit unsigned result, not an unsigned 64-bit expression. Otherwise, when the 64-bit ASID is passed to an inline assembler statement as a (32-bit) 'r' operand, compiler behavior is not well specified. For example, when the __flush_tlb_mm function is compiled for the big-endian case and the ASID is passed to the tlb_op macro directly, 0 is passed in r4 as the argument of 'mcr 15, 0, r4, cr8, cr3, {2}', unless the ASID macro is changed to produce a 32-bit result. Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
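A minimal sketch of the resulting macro, assuming the generation-in-upper-bits ASID scheme from <asm/mmu.h>: the cast truncates the 64-bit context ID to a well-defined 32-bit value.

    #define ASID_BITS  8
    #define ASID_MASK  ((~0ULL) << ASID_BITS)
    /* cast down to 32 bits so the macro is safe as an 'r' operand */
    #define ASID(mm)   ((unsigned int)((mm)->context.id.counter & ~ASID_MASK))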
2013-10-19 ARM: atomic64: fix endian-ness in atomic.h (Victor Kamensky)
Fix the inline asm for the atomic64_xxx functions in the ARM atomic.h. Instead of the %H operand specifier, the code should use %Q for the least significant part of the value and %R for the most significant part. %H always returns the higher of the two register numbers and is therefore not endian neutral; %H should only be used with the ldrexd and strexd instructions. Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
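Roughly what a fixed routine looks like (patterned on the kernel's atomic64_add; treat as a sketch): ldrexd/strexd keep %H because it names the second register of the pair, while the per-half arithmetic uses %Q/%R.

    static inline void atomic64_add(u64 i, atomic64_t *v)
    {
        u64 result;
        unsigned long tmp;

        __asm__ __volatile__("@ atomic64_add\n"
    "1: ldrexd  %0, %H0, [%3]\n"    /* register pair: %H is correct here */
    "   adds    %Q0, %Q0, %Q4\n"    /* %Q = least significant half */
    "   adc     %R0, %R0, %R4\n"    /* %R = most significant half */
    "   strexd  %1, %0, %H0, [%3]\n"
    "   teq     %1, #0\n"
    "   bne     1b"
        : "=&r" (result), "=&r" (tmp), "+Qo" (v->counter)
        : "r" (&v->counter), "r" (i)
        : "cc");
    }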
2013-10-19 ARM: kgdb: use <asm/opcodes.h> for data to be assembled as instruction (Ben Dooks)
The arch_kgdb_breakpoint() function uses an inline assembly directive to assemble a specific instruction using .word. This means the linker will not treat it as an instruction, and will therefore incorrectly swap the endian-ness when running BE8. As noted, this code means that kgdb is really only usable on ARM32 kernels, and should be made dependent on not being a Thumb-2 kernel until that is fixed; however, that is not something to be added in this patch. Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Reviewed-by: Dave Martin <Dave.Martin@arm.com>
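The shape of the fix, as a sketch (the breakpoint opcode value is the one already used by the header):

    #include <asm/opcodes.h>

    static inline void arch_kgdb_breakpoint(void)
    {
        /* __inst_arm() emits the word in instruction endianness, so the
         * linker no longer byte-swaps it on BE8 */
        asm(__inst_arm(0xe7ffdeff));
    }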
2013-10-19 ARM: Correct BUG() assembly to ensure it is endian-agnostic (Ben Dooks)
Currently BUG() uses .word or .hword to create the necessary illegal instructions. However, if we are building BE8, these get swapped by the linker into different illegal instructions in the text, which means the BUG() macro does not get trapped properly. Change to using <asm/opcodes.h> to provide the necessary ARM instruction building, as we cannot rely on gcc/gas having the `.inst` directives which were added to try and resolve this issue (reported by Dave Martin <Dave.Martin@arm.com>). Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Reviewed-by: Dave Martin <Dave.Martin@arm.com>
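Sketched from <asm/bug.h> after the change; the instruction values come from the existing header, with __inst_arm()/__inst_thumb16() doing the endian-safe emission:

    #include <asm/opcodes.h>

    #ifdef CONFIG_THUMB2_KERNEL
    #define BUG_INSTR_VALUE 0xde02
    #define BUG_INSTR(__value) __inst_thumb16(__value)
    #else
    #define BUG_INSTR_VALUE 0xe7f001f2
    #define BUG_INSTR(__value) __inst_arm(__value)
    #endif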
2013-10-19 ARM: hardware: fix endian-ness in <hardware/coresight.h> (Ben Dooks)
The <hardware/coresight.h> header needs to take the endian-ness of the processor into account when reading and writing data, so change from the raw readl/writel accessors to the relaxed variants. Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
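Illustrative shape of the change for the accessor macros (names per the coresight header of the era; treat as a sketch):

    /* was: __raw_writel((v), (t)->etm_regs + (x)) -- no endian handling */
    #define etm_writel(t, v, x) (writel_relaxed((v), (t)->etm_regs + (x)))
    /* was: __raw_readl((t)->etm_regs + (x)) */
    #define etm_readl(t, x)     (readl_relaxed((t)->etm_regs + (x)))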
2013-10-19 ARM: asm: Add ARM_BE8() assembly helper (Ben Dooks)
Add the ARM_BE8() helper to wrap any code that is conditional on being compiled when CONFIG_ARM_ENDIAN_BE8 is selected, and convert the existing places that open-code this to use it. Acked-by: Nicolas Pitre <nico@linaro.org> Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
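The helper itself is tiny; a sketch (note the Kconfig symbol in mainline's <asm/assembler.h> is CONFIG_CPU_ENDIAN_BE8):

    #ifdef CONFIG_CPU_ENDIAN_BE8
    #define ARM_BE8(code...) code      /* emit the wrapped code on BE8 builds */
    #else
    #define ARM_BE8(code...)           /* drop it everywhere else */
    #endif

Typical usage in assembly is ARM_BE8(rev r0, r0) to byte-swap a value only when building a BE8 kernel.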
2013-10-18 Merge branch 'for-rmk/arm-mm-lpae' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone into devel-stable (Russell King)
This series extends the existing ARM v2p runtime patching to 64 bit, needed for LPAE machines which have physical memory beyond 4GB.
2013-10-17 KVM: ARM: Support hugetlbfs backed huge pages (Christoffer Dall)
Support huge pages in KVM/ARM and KVM/ARM64. The pud_huge checking on the unmap path may feel a bit silly as the pud_huge check is always defined to false, but the compiler should be smart about this. Note: This deals only with VMAs marked as huge which are allocated by users through hugetlbfs only. Transparent huge pages can only be detected by looking at the underlying pages (or the page tables themselves) and this patch so far simply maps these on a page-by-page level in the Stage-2 page tables. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-17 KVM: ARM: Update comments for kvm_handle_wfi (Christoffer Dall)
Update the comments to reflect what is really going on, and add the TWE bit to the comments in kvm_arm.h. Also rename the function to kvm_handle_wfx, as is done on arm64, for consistency and uber-correctness. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-17 ARM: KVM: Yield CPU when vcpu executes a WFE (Marc Zyngier)
On an (even slightly) oversubscribed system, spinlocks quickly become a bottleneck, as some vcpus are spinning, waiting for a lock to be released, while the vcpu holding the lock may not be running at all. This creates contention, and the observed slowdown is 40x for hackbench. No, this isn't a typo. The solution is to trap blocking WFEs and tell KVM that we're now spinning. This ensures that other vcpus will get a scheduling boost, allowing the lock to be released more quickly. Also, using CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT slightly improves the performance when the VM is severely overcommitted. Quick test to estimate the performance, hackbench 1 process 1000:
2xA15 host (baseline): 1.843s
2xA15 guest w/o patch: 2.083s
4xA15 guest w/o patch: 80.212s
8xA15 guest w/o patch: could not be bothered to find out
2xA15 guest w/ patch: 2.102s
4xA15 guest w/ patch: 3.205s
8xA15 guest w/ patch: 6.887s
So we go from a 40x degradation to 1.5x in the 2x overcommit case, which is vaguely more acceptable. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
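The handler change amounts to something like this (sketched from the KVM/ARM code; HSR_WFI_IS_WFE is a bit this series introduces to distinguish the two traps):

    static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
    {
        if (kvm_vcpu_get_hsr(vcpu) & HSR_WFI_IS_WFE)
            kvm_vcpu_on_spin(vcpu);   /* trapped WFE: likely a spinlock, yield */
        else
            kvm_vcpu_block(vcpu);     /* trapped WFI: sleep until an event */
        return 1;
    }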
2013-10-17 Powerpc KVM work is based on a commit after rc4. (Gleb Natapov)
Merging master into next to satisfy the dependencies. Conflicts: arch/arm/kvm/reset.c
2013-10-16 Merge tag 'kvm-arm-for-3.13-1' of git://git.linaro.org/people/cdall/linux-kvm-arm into next (Gleb Natapov)
Updates for KVM/ARM including cpu=host and Cortex-A7 support
2013-10-16 Merge tag 'v3.12-rc4' into devel (Linus Walleij)
Linux 3.12-rc4
2013-10-14 Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm (Linus Torvalds)
Pull ARM fixes from Russell King: "Some more ARM fixes, nothing particularly major here. The biggest change is to fix the SMP_ON_UP code so that it works with TI's Aegis cores"
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7851/1: check for number of arguments in syscall_get/set_arguments()
ARM: 7846/1: Update SMP_ON_UP code to detect A9MPCore with 1 CPU devices
ARM: 7845/1: sharpsl_param.c: fix invalid memory access for pxa devices
ARM: 7843/1: drop asm/types.h from generic-y
ARM: 7842/1: MCPM: don't explode if invoked without being initialized first
2013-10-14 KVM: ARM: Get rid of KVM_HPAGE defines (Christoffer Dall)
The KVM_HPAGE defines are a little artificial on ARM, since the huge page size is statically defined at compile time and there is only a single huge page size. Now that the main kvm code relying on these defines has been moved to the x86-specific part of the world, we can get rid of them. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-13 ARM: 7851/1: check for number of arguments in syscall_get/set_arguments() (AKASHI Takahiro)
In ftrace_syscall_enter(), syscall_get_arguments(..., 0, n, ...) does:
if (i == 0) { <handle ORIG_r0> ...; n--; }
memcpy(..., n * sizeof(args[0]));
If the number of arguments (n) is zero and the argument index (i) is also zero in syscall_get_arguments(), none of the arguments should be copied by memcpy(). Otherwise 'n--' can become a big positive number and an unexpected amount of data will be copied. Tracing system calls which take no argument, say sync(void), may hit this case and eventually corrupt the system. This patch fixes the issue in both syscall_get_arguments() and syscall_set_arguments(). Cc: <stable@vger.kernel.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
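A simplified sketch of the fixed getter (the real function also range-checks i + n against the maximum argument count):

    static inline void syscall_get_arguments(struct task_struct *task,
                                             struct pt_regs *regs,
                                             unsigned int i, unsigned int n,
                                             unsigned long *args)
    {
        if (n == 0)
            return;                 /* the added guard: nothing to copy */

        if (i == 0) {               /* argument 0 lives in ORIG_r0, not r0 */
            args[0] = regs->ARM_ORIG_r0;
            args++;
            i++;
            n--;                    /* safe now: n >= 1 at this point */
        }

        memcpy(args, &regs->ARM_r0 + i, n * sizeof(args[0]));
    }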
2013-10-12 KVM: ARM: Add support for Cortex-A7 (Jonathan Austin)
This patch adds support for running Cortex-A7 guests on Cortex-A7 hosts. As the Cortex-A7 is architecturally compatible with the A15, this patch is largely just a generalisation of existing code. Code for areas where 'implementation defined' behaviour is identical on A7 and A15 is moved so that it can be used by both cores. The check to ensure that coprocessor register tables are sorted correctly is also moved into 'common' code, to avoid each new cpu doing its own check (and possibly forgetting to do so!) Signed-off-by: Jonathan Austin <jonathan.austin@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-12 KVM: ARM: fix the size of TTBCR_{T0SZ,T1SZ} masks (Jonathan Austin)
The T{0,1}SZ fields of TTBCR are 3 bits wide when using the long descriptor format. Likewise, the T0SZ field of the HTCR is 3 bits. KVM currently defines TTBCR_T{0,1}SZ as 3, not 7. The T0SZ mask is used to calculate the value for the HTCR, both to pick out TTBCR.T0SZ and to mask off the equivalent field in the HTCR during read-modify-write. The incorrect mask size causes the (UNKNOWN) reset value of HTCR.T0SZ to leak into the calculated HTCR value, and Linux will hang when initializing KVM if HTCR's reset value has bit 2 set (sometimes the case on A7/TC2). Fixing T0SZ allows A7 cores to boot; T1SZ is also fixed for completeness. Signed-off-by: Jonathan Austin <jonathan.austin@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
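The corrected masks for the 3-bit LPAE fields, per the description above (field positions follow the TTBCR layout):

    #define TTBCR_T1SZ  (7 << 16)   /* was (3 << 16): masked only 2 of 3 bits */
    #define TTBCR_T0SZ  (7 << 0)    /* was (3 << 0) */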
2013-10-11 ARM: Remove temporary sched_clock.h header (Stephen Boyd)
This header file is no longer needed now that the ARM sched_clock framework is generic and all users have moved to linux/sched_clock.h instead of asm/sched_clock.h. Remove it. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Kevin Hilman <khilman@linaro.org>
2013-10-11 Merge branch 'core/urgent' into sched/core (Ingo Molnar)
Merge in asm goto fix, to be able to apply the asm/rmwcc.h fix. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-11 compiler/gcc4: Add quirk for 'asm goto' miscompilation bug (Ingo Molnar)
Fengguang Wu, Oleg Nesterov and Peter Zijlstra tracked down a kernel crash to a GCC bug: GCC miscompiles certain 'asm goto' constructs, as outlined here: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670 Implement a workaround suggested by Jakub Jelinek. Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com> Reported-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Suggested-by: Jakub Jelinek <jakub@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
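The workaround itself is a one-liner in compiler-gcc4.h: force an empty asm statement after every asm goto, which blocks the problematic optimization:

    /* Work around GCC PR58670: asm goto may be miscompiled unless it is
     * immediately followed by another asm statement. */
    #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)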
2013-10-10 ARM: mm: Recreate kernel mappings in early_paging_init() (Santosh Shilimkar)
This patch adds a step in the init sequence to recreate the kernel code/data page table mappings prior to full paging initialization. This is necessary on LPAE systems that run from a physical address space located outside the 4G limit. On these systems, this implementation provides a machine descriptor hook that allows the PHYS_OFFSET to be overridden in a machine-specific fashion. Cc: Russell King <linux@arm.linux.org.uk> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: R Sricharan <r.sricharan@ti.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
2013-10-10 ARM: mm: Correct virt_to_phys patching for 64 bit physical addresses (Sricharan R)
The current phys_to_virt patching mechanism works only for 32-bit physical addresses; this patch extends the idea to 64-bit physical addresses. The 64-bit v2p patching mechanism patches the higher 8 bits of the physical address with a constant using a 'mov' instruction, and the lower 32 bits are patched using 'add'. While this is correct, on those platforms where the lowmem addressable physical memory spans across the 4GB boundary, a carry bit can be produced by the addition of the lower 32 bits. This has to be taken into account and added into the upper 32 bits. The patched __pv_offset and va are added in the lower 32 bits, where __pv_offset can be in two's complement form when PA_START < VA_START, and that can result in a false carry bit.
e.g.
1) PA = 0x80000000; VA = 0xC0000000
__pv_offset = PA - VA = 0xC0000000 (2's complement)
2) PA = 0x2 80000000; VA = 0xC0000000
__pv_offset = PA - VA = 0x1 C0000000
So adding __pv_offset + VA should never result in a true overflow for (1). So, in order to differentiate a true carry, __pv_offset is extended to 64 bits and the upper 32 bits will be 0xffffffff when __pv_offset is in two's complement form; for the same reason, 'mvn #0' is inserted instead of 'mov' while patching. Since the mov, add and sub instructions are patched with different constants inside the same stub, the rotation field of the opcode is used to differentiate between them. The above examples of v2p translation thus become, for VA=0xC0000000:
1) PA[63:32] = 0xffffffff
PA[31:0] = VA + 0xC0000000 --> results in a carry
PA[63:32] = PA[63:32] + carry
PA[63:0] = 0x0 80000000
2) PA[63:32] = 0x1
PA[31:0] = VA + 0xC0000000 --> results in a carry
PA[63:32] = PA[63:32] + carry
PA[63:0] = 0x2 80000000
The above ideas were suggested by Nicolas Pitre <nico@linaro.org> as part of the review of the first and second versions of this patch. There is no corresponding change on the phys_to_virt() side, because computations on the upper 32 bits would be discarded anyway. Cc: Russell King <linux@arm.linux.org.uk> Reviewed-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Sricharan R <r.sricharan@ti.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
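A C model of the patched arithmetic may make the carry handling clearer (illustrative only; the real mechanism patches mov/mvn, adds and adc instructions in place):

    /* high word of __pv_offset is 0xffffffff when the offset is negative
     * (two's complement) -- hence 'mvn #0' in the patched code */
    extern u64 __pv_offset;

    static phys_addr_t v2p_model(unsigned long va)
    {
        u32 lo, hi, carry;

        lo    = (u32)va + (u32)__pv_offset;        /* patched 'add' */
        carry = lo < (u32)va;                      /* carry out of the low add */
        hi    = (u32)(__pv_offset >> 32) + carry;  /* patched 'mov'/'mvn' + 'adc' */
        return ((phys_addr_t)hi << 32) | lo;
    }

Running example (1) through this gives lo = 0x80000000 with carry = 1 and hi = 0xffffffff + 1 = 0, so the false carry is absorbed; example (2) gives hi = 0x1 + 1 = 0x2, the true carry.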
2013-10-10 ARM: mm: Introduce virt_to_idmap() with an arch hook (Santosh Shilimkar)
On some PAE systems (e.g. TI Keystone), memory is above the 32-bit addressable limit, and the interconnect provides an aliased view of parts of physical memory in the 32-bit addressable space. This alias is strictly for boot-time usage, and is not otherwise usable because of coherency limitations. On such systems, the idmap mechanism needs to take this aliased mapping into account. This patch introduces virt_to_idmap() and an arch function pointer which can be populated by platforms that need it, and converts the necessary idmap spots to use the now-available virt_to_idmap(). An #ifdef approach was avoided to stay compatible with multi-platform builds; most platforms won't touch the pointer, in which case virt_to_idmap() falls back to the existing virt_to_phys() macro. Cc: Russell King <linux@arm.linux.org.uk> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
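A simplified sketch of the hook, per the shape described above:

    /* populated only by platforms that need a boot-time alias */
    extern unsigned long (*arch_virt_to_idmap)(unsigned long x);

    static inline unsigned long __virt_to_idmap(unsigned long x)
    {
        if (arch_virt_to_idmap)
            return arch_virt_to_idmap(x);   /* platform-specific alias */
        return __virt_to_phys(x);           /* default: plain v2p */
    }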
2013-10-10 ARM: mm: use phys_addr_t appropriately in p2v and v2p conversions (Santosh Shilimkar)
Fix remainder types used when converting back and forth between physical and virtual addresses. Cc: Russell King <linux@arm.linux.org.uk> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
2013-10-09 of: remove HAVE_ARCH_DEVTREE_FIXUPS (Rob Herring)
HAVE_ARCH_DEVTREE_FIXUPS appears to always be needed except on sparc, but it is only used for /proc/device-tree, and sparc does not enable /proc/device-tree; the option is therefore redundant. Remove the option and always enable the functionality. This has the side effect of fixing /proc/device-tree on arches such as arm64 which failed to define the option. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Grant Likely <grant.likely@linaro.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: James Hogan <james.hogan@imgtec.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com>
2013-10-09 xen: introduce xen_alloc/free_coherent_pages (Stefano Stabellini)
xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu and devices. On native x86 it is sufficient to call __get_free_pages in order to get a coherent buffer, while on ARM (and potentially ARM64) we need to call the native dma_ops->alloc implementation. Introduce xen_alloc_coherent_pages to abstract the arch-specific buffer allocation. Similarly, introduce xen_free_coherent_pages to free a coherent buffer: on x86 it is simply a call to free_pages, while on ARM and ARM64 it is arm_dma_ops.free. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Changes in v7:
- rename __get_dma_ops to __generic_dma_ops;
- call __generic_dma_ops(hwdev)->alloc/free on arm64 too.
Changes in v6:
- call __get_dma_ops to get the native dma_ops pointer on arm.
2013-10-18 arm/xen: get_dma_ops: return xen_dma_ops if we are running as xen_initial_domain (Stefano Stabellini)
We can't simply override arm_dma_ops with xen_dma_ops, because devices are allowed to have their own dma_ops and they take precedence over arm_dma_ops. When running on Xen as the initial domain, we always want xen_dma_ops to be the one in use. We introduce __generic_dma_ops to allow xen_dma_ops functions to call back to the native implementation. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> CC: will.deacon@arm.com CC: linux@arm.linux.org.uk
Changes in v7:
- return xen_dma_ops only if we are the initial domain;
- rename __get_dma_ops to __generic_dma_ops.
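In outline, the dispatch looks like this (sketched from the description above):

    static inline struct dma_map_ops *__generic_dma_ops(struct device *dev)
    {
        if (dev && dev->archdata.dma_ops)
            return dev->archdata.dma_ops;   /* per-device ops take precedence */
        return &arm_dma_ops;
    }

    static inline struct dma_map_ops *get_dma_ops(struct device *dev)
    {
        if (xen_initial_domain())
            return xen_dma_ops;             /* dom0: always go through Xen */
        return __generic_dma_ops(dev);
    }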
2013-10-10 xen/arm,arm64: enable SWIOTLB_XEN (Stefano Stabellini)
Xen on arm and arm64 needs SWIOTLB_XEN: when running on Xen we need to program the hardware with mfns rather than pfns for dma addresses. Remove the SWIOTLB_XEN dependency on X86 and PCI, and make XEN select SWIOTLB_XEN on arm and arm64. For the moment we always rely on swiotlb-xen, but once Xen starts supporting hardware IOMMUs we'll be able to avoid it conditionally on the presence of an IOMMU on the platform. Implement xen_create_contiguous_region on arm and arm64: for the moment we assume that dom0 has been mapped 1:1 (physical addresses == machine addresses), therefore we don't need to call XENMEM_exchange; simply return the physical address as the dma address. Initialize the xen-swiotlb from xen_early_init (before the native dma_ops are initialized) and set xen_dma_ops to &xen_swiotlb_dma_ops. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Changes in v8:
- assume dom0 is mapped 1:1, no need to call XENMEM_exchange.
Changes in v7:
- call __set_phys_to_machine_multi from xen_create_contiguous_region and xen_destroy_contiguous_region to update the P2M;
- don't call XENMEM_unpin, it has been removed;
- call XENMEM_exchange instead of XENMEM_exchange_and_pin;
- set nr_exchanged to 0 before calling the hypercall.
Changes in v6:
- introduce and export xen_dma_ops;
- call xen_mm_init as an arch_initcall.
Changes in v4:
- remove redefinition of DMA_ERROR_CODE;
- update the code to use XENMEM_exchange_and_pin and XENMEM_unpin;
- add a note about hardware IOMMU in the commit message.
Changes in v3:
- code style changes;
- warn on XENMEM_put_dma_buf failures.
2013-10-17 arm/xen,arm64/xen: introduce p2m (Stefano Stabellini)
Introduce physical-to-machine and machine-to-physical tracking mechanisms based on rbtrees for arm/xen and arm64/xen. We need this because all guests on ARM are autotranslated guests, so a physical address is potentially different from a machine address. When programming a device to do DMA we need to be extra careful to use machine addresses rather than physical addresses to program the device; therefore we need to know the physical-to-machine mappings. For the moment we assume that dom0 starts with a 1:1 physical-to-machine mapping, in other words physical addresses correspond to machine addresses. However, when mapping a foreign grant reference, the 1:1 model obviously doesn't work anymore, so at the very least we need to be able to track grant mappings. We need locking to protect accesses to the two trees. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Changes in v8:
- move pfn_to_mfn and mfn_to_pfn to page.h as static inline functions;
- no need to walk the tree if phys_to_mach.rb_node is NULL;
- correctly handle multipage p2m entries;
- substitute the spin_lock with a rwlock.
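The lookup fast path then reads roughly as follows (a sketch, with __pfn_to_mfn standing for the rbtree walk):

    static inline unsigned long pfn_to_mfn(unsigned long pfn)
    {
        unsigned long mfn;

        if (phys_to_mach.rb_node != NULL) {   /* skip the walk if the tree is empty */
            mfn = __pfn_to_mfn(pfn);
            if (mfn != INVALID_P2M_ENTRY)
                return mfn;                   /* explicit mapping, e.g. a grant */
        }
        return pfn;                           /* dom0 1:1 default */
    }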
2013-10-15 arm: make SWIOTLB available (Stefano Stabellini)
IOMMU_HELPER is needed because SWIOTLB calls iommu_is_span_boundary, provided by lib/iommu_helper.c. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> CC: will.deacon@arm.com CC: linux@arm.linux.org.uk
Changes in v8:
- use __phys_to_pfn and __pfn_to_phys.
Changes in v7:
- dma_mark_clean: empty implementation;
- in dma_capable use coherent_dma_mask if dma_mask hasn't been allocated.
Changes in v6:
- check for dev->dma_mask being NULL in dma_capable.
Changes in v5:
- implement dma_mark_clean using dmac_flush_range.
Changes in v3:
- dma_capable: do not treat dma_mask as a limit;
- remove SWIOTLB dependency on NEED_SG_DMA_LENGTH.
2013-10-09 Merge tag 'v3.12-rc4' into sched/core (Ingo Molnar)
Merge Linux v3.12-rc4 to fix a conflict and also to refresh the tree before applying more scheduler patches. Conflicts: arch/avr32/include/asm/Kbuild Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-07 Merge branch 'arm-aesbs' of git://git.linaro.org/people/ardbiesheuvel/linux-arm into devel-stable (Russell King)
2013-10-04 ARM: pull in <asm/simd.h> from asm-generic (Ard Biesheuvel)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
2013-10-03 Merge branch 'timer_evtstrm' of git://linux-arm.org/linux-skn into clockevents/3.13 (Daniel Lezcano)
Adds support to configure the rate of, and enable, the event stream for the architected timer. The event streams can be used to impose a timeout on a wfe, to safeguard against any programming error in case an expected event is not generated, or even to implement wfe-based timeouts for userspace locking implementations. This feature can be disabled (it is enabled by default). Since the timer control register is reset to zero on warm boot, a CPU PM notifier is added to save and restore the value. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2013-10-03 ARM: 7843/1: drop asm/types.h from generic-y (Ard Biesheuvel)
Commit 09096f6 (ARM: 7822/1: add workaround for ambiguous C99 stdint.h types) introduced an ARM specific 'asm/types.h' to work around some ambiguities in the definitions of 32 bit types. Hence, we will not be needing the generic version anymore. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-10-03 ARM: 7842/1: MCPM: don't explode if invoked without being initialized first (Nicolas Pitre)
Currently mcpm_cpu_power_down() and mcpm_cpu_suspend() trigger BUG() if mcpm_platform_register() is not called beforehand. This may occur for many reasons such as some incomplete device tree passed to the kernel or the like. Let's be nicer to users and avoid killing the kernel if that happens by logging a warning and returning to the caller. The mcpm_cpu_suspend() user is already set to deal with this situation, and so is cpu_die() invoking mcpm_cpu_die(). The problematic case would have been the B.L switcher's usage of mcpm_cpu_power_down(), however it has to call mcpm_cpu_power_up() first which is already set to catch an error resulting from a missing mcpm_platform_register() call. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
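The gist of the change, sketched (the mainline function does more around this, e.g. IRQ-state checks):

    void mcpm_cpu_power_down(void)
    {
        /* was: BUG_ON(!platform_ops); */
        if (WARN_ON_ONCE(!platform_ops || !platform_ops->power_down))
            return;                    /* warn once and return to the caller */

        platform_ops->power_down();    /* remainder of the routine unchanged */
    }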
2013-10-02 ARM: KVM: Implement kvm_vcpu_preferred_target() function (Anup Patel)
This patch implements kvm_vcpu_preferred_target() function for KVM ARM which will help us implement KVM_ARM_PREFERRED_TARGET ioctl for user space. Signed-off-by: Anup Patel <anup.patel@linaro.org> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-09-30 ARM: atomics: prefetch the destination word for write prior to strex (Will Deacon)
The cost of changing a cacheline from shared to exclusive state can be significant, especially when this is triggered by an exclusive store, since it may result in having to retry the transaction. This patch prefixes our atomic access implementations with pldw instructions (on CPUs which support them) to try and grab the line in exclusive state from the start. Only the barrier-less functions are updated, since memory barriers can limit the usefulness of prefetching data. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-09-30 ARM: locks: prefetch the destination word for write prior to strex (Will Deacon)
The cost of changing a cacheline from shared to exclusive state can be significant, especially when this is triggered by an exclusive store, since it may result in having to retry the transaction. This patch prefixes our {spin,read,write}_[try]lock implementations with pldw instructions (on CPUs which support them) to try and grab the line in exclusive state from the start. arch_rwlock_t is changed to avoid using a volatile member, since this generates compiler warnings when falling back on the __builtin_prefetch intrinsic which expects a const void * argument. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-09-30 ARM: prefetch: add support for prefetchw using pldw on SMP ARMv7+ CPUs (Will Deacon)
SMP ARMv7 CPUs implement the pldw instruction, which allows them to prefetch data cachelines in an exclusive state. This patch defines the prefetchw macro using pldw for CPUs that support it. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
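A simplified version of the definition this adds (the mainline one additionally patches down to a plain pld on UP systems via the SMP alternatives):

    #if __LINUX_ARM_ARCH__ >= 7 && defined(CONFIG_SMP)
    #define ARCH_HAS_PREFETCHW
    static inline void prefetchw(const void *ptr)
    {
        __asm__ __volatile__(
            ".arch_extension mp\n"   /* pldw is part of the MP extensions */
            "pldw\t%a0"
            :: "p" (ptr));
    }
    #endif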
2013-09-30 ARM: smp_on_up: move inline asm ALT_SMP patching macro out of spinlock.h (Will Deacon)
Patching UP/SMP alternatives inside inline assembly blocks is useful outside of the spinlock implementation, where it is used for sev and wfe. This patch lifts the macro into processor.h and gives it a scarier name to (a) avoid conflicts in the global namespace and (b) try and deter its usage unless you "know what you're doing". The W macro for generating wide instructions when targeting Thumb-2 is also made available under the name WASM, to reduce the potential for conflicts with other headers. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-09-30 ARM: prefetch: remove redundant "cc" clobber (Will Deacon)
The pld instruction does not affect the condition flags, so don't bother clobbering them. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-09-26 ARM: arch_timer: add support to configure and enable event stream (Sudeep KarkadaNagesha)
This patch adds support for configuring the event stream frequency and enabling it. It also adds hwcap definitions so that userspace can detect the event stream feature. Cc: Russell King <linux@arm.linux.org.uk> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
2013-09-26 ARM/ARM64: arch_timer: add macros for bits in control register (Sudeep KarkadaNagesha)
Add macros describing the bitfields in the ARM architected timer control register, to make the code easier to understand. Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
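The definitions in question are small; a sketch, with names as used by the arch timer code of the era:

    enum {
        ARCH_TIMER_CTRL_ENABLE  = 1 << 0,  /* timer enabled */
        ARCH_TIMER_CTRL_IT_MASK = 1 << 1,  /* timer interrupt masked */
        ARCH_TIMER_CTRL_IT_STAT = 1 << 2,  /* interrupt status (read-only) */
    };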
2013-09-25 sched, arch: Create asm/preempt.h (Peter Zijlstra)
In order to prepare for per-arch implementations of preempt_count, move the required bits into an asm-generic header and use this for all archs. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-h5j0c1r3e3fk015m30h8f1zx@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-23 ARM: bL_switcher: Add query interface to discover CPU affinities (Dave Martin)
When the switcher is active, there is no straightforward way to figure out which logical CPU a given physical CPU maps to. This patch provides a function bL_switcher_get_logical_index(mpidr), which is analogous to get_logical_index(). This function returns the logical CPU on which the specified physical CPU is grouped (or -EINVAL if unknown). If the switcher is inactive or not present, -EUNATCH is returned instead. Signed-off-by: Dave Martin <dave.martin@linaro.org> Signed-off-by: Nicolas Pitre <nico@linaro.org>
2013-09-23 ARM: bL_switcher/trace: Add kernel trace trigger interface (Dave Martin)
This patch exports a bL_switcher_trace_trigger() function to provide a means for drivers using the trace events to get the current status when starting a trace session. Calling this function is equivalent to pinging the trace_trigger file in sysfs. Signed-off-by: Dave Martin <dave.martin@linaro.org>