summaryrefslogtreecommitdiffstats
path: root/arch
AgeCommit message (Collapse)Author
2014-07-28kvm: ppc: booke: Use the shared struct helpers of SPRN_DEARBharat Bhushan
Uses kvmppc_set_dar() and kvmppc_get_dar() helper functions Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28kvm: ppc: booke: Use the shared struct helpers of SRR0 and SRR1Bharat Bhushan
Use kvmppc_set_srr0/srr1() and kvmppc_get_srr0/srr1() helper functions Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28kvm: ppc: bookehv: Added wrapper macros for shadow registersBharat Bhushan
There are shadow registers like, GSPRG[0-3], GSRR0, GSRR1 etc on BOOKE-HV and these shadow registers are guest accessible. So these shadow registers needs to be updated on BOOKE-HV. This patch adds new macro for get/set helper of shadow register . Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S: Make magic page properly 4k mappableAlexander Graf
The magic page is defined as a 4k page of per-vCPU data that is shared between the guest and the host to accelerate accesses to privileged registers. However, when the host is using 64k page size granularity we weren't quite as strict about that rule anymore. Instead, we partially treated all of the upper 64k as magic page and mapped only the uppermost 4k with the actual magic contents. This works well enough for Linux which doesn't use any memory in kernel space in the upper 64k, but Mac OS X got upset. So this patch makes magic page actually stay in a 4k range even on 64k page size hosts. This patch fixes magic page usage with Mac OS X (using MOL) on 64k PAGE_SIZE hosts for me. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S: Add hack for split real modeAlexander Graf
Today we handle split real mode by mapping both instruction and data faults into a special virtual address space that only exists during the split mode phase. This is good enough to catch 32bit Linux guests that use split real mode for copy_from/to_user. In this case we're always prefixed with 0xc0000000 for our instruction pointer and can map the user space process freely below there. However, that approach fails when we're running KVM inside of KVM. Here the 1st level last_inst reader may well be in the same virtual page as a 2nd level interrupt handler. It also fails when running Mac OS X guests. Here we have a 4G/4G split, so a kernel copy_from/to_user implementation can easily overlap with user space addresses. The architecturally correct way to fix this would be to implement an instruction interpreter in KVM that kicks in whenever we go into split real mode. This interpreter however would not receive a great amount of testing and be a lot of bloat for a reasonably isolated corner case. So I went back to the drawing board and tried to come up with a way to make split real mode work with a single flat address space. And then I realized that we could get away with the same trick that makes it work for Linux: Whenever we see an instruction address during split real mode that may collide, we just move it higher up the virtual address space to a place that hopefully does not collide (keep your fingers crossed!). That approach does work surprisingly well. I am able to successfully run Mac OS X guests with KVM and QEMU (no split real mode hacks like MOL) when I apply a tiny timing probe hack to QEMU. I'd say this is a win over even more broken split real mode :). Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S: Stop PTE lookup on write errorsAlexander Graf
When a page lookup failed because we're not allowed to write to the page, we should not overwrite that value with another lookup on the second PTEG which will return "page not found". Instead, we should just tell the caller that we had a permission problem. This fixes Mac OS X guests looping endlessly in page lookup code for me. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Deflect page write faults properly in kvmppc_stAlexander Graf
When we have a page that we're not allowed to write to, xlate() will already tell us -EPERM on lookup of that page. With the code as is we change it into a "page missing" error which a guest may get confused about. Instead, just tell the caller about the -EPERM directly. This fixes Mac OS X guests when run with DCBZ32 emulation. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S: Move vcore definition to end of kvm_arch structAlexander Graf
When building KVM with a lot of vcores (NR_CPUS is big), we can potentially get out of the ld immediate range for dereferences inside that struct. Move the array to the end of our kvm_arch struct. This fixes compilation issues with NR_CPUS=2048 for me. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: e500: Emulate power management control SPRMihai Caraman
For FSL e6500 core the kernel uses power management SPR register (PWRMGTCR0) to enable idle power down for cores and devices by setting up the idle count period at boot time. With the host already controlling the power management configuration the guest could simply benefit from it, so emulate guest request as a general store. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S HV: Enable for little endian hostsAlexander Graf
Now that we've fixed all the issues that HV KVM code had on little endian hosts, we can enable it in the kernel configuration for users to play with. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S HV: Fix ABIv2 on LEAlexander Graf
For code that doesn't live in modules we can just branch to the real function names, giving us compatibility with ABIv1 and ABIv2. Do this for the compiled-in code of HV KVM. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S HV: Access XICS in BEAlexander Graf
On the exit path from the guest we check what type of interrupt we received if we received one. This means we're doing hardware access to the XICS interrupt controller. However, when running on a little endian system, this access is byte reversed. So let's make sure to swizzle the bytes back again and virtually make XICS accesses big endian. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S HV: Access host lppaca and shadow slb in BEAlexander Graf
Some data structures are always stored in big endian. Among those are the LPPACA fields as well as the shadow slb. These structures might be shared with a hypervisor. So whenever we access those fields, make sure we do so in big endian byte order. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S HV: Access guest VPA in BEAlexander Graf
There are a few shared data structures between the host and the guest. Most of them get registered through the VPA interface. These data structures are defined to always be in big endian byte order, so let's make sure we always access them in big endian. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S HV: Make HTAB code LE host awareAlexander Graf
When running on an LE host all data structures are kept in little endian byte order. However, the HTAB still needs to be maintained in big endian. So every time we access any HTAB we need to make sure we do so in the right byte order. Fix up all accesses to manually byte swap. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28PPC: Add asm helpers for BE 32bit load/storeAlexander Graf
From assembly code we might not only have to explicitly BE access 64bit values, but sometimes also 32bit ones. Add helpers that allow for easy use of lwzx/stwx in their respective byte-reverse or native form. Signed-off-by: Alexander Graf <agraf@suse.de> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-28KVM: PPC: e500: Fix default tlb for victim hintMihai Caraman
Tlb search operation used for victim hint relies on the default tlb set by the host. When hardware tablewalk support is enabled in the host, the default tlb is TLB1 which leads KVM to evict the bolted entry. Set and restore the default tlb when searching for victim hint. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Reviewed-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S HV: Add H_SET_MODE hcall handlingMichael Neuling
This adds support for the H_SET_MODE hcall. This hcall is a multiplexer that has several functions, some of which are called rarely, and some which are potentially called very frequently. Here we add support for the functions that set the debug registers CIABR (Completed Instruction Address Breakpoint Register) and DAWR/DAWRX (Data Address Watchpoint Register and eXtension), since they could be updated by the guest as often as every context switch. This also adds a kvmppc_power8_compatible() function to test to see if a guest is compatible with POWER8 or not. The CIABR and DAWR/X only exist on POWER8. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabledPaul Mackerras
This adds code to check that when the KVM_CAP_PPC_ENABLE_HCALL capability is used to enable or disable in-kernel handling of an hcall, that the hcall is actually implemented by the kernel. If not an EINVAL error is returned. This also checks the default-enabled list of hcalls and prints a warning if any hcall there is not actually implemented. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handlingPaul Mackerras
This provides a way for userspace controls which sPAPR hcalls get handled in the kernel. Each hcall can be individually enabled or disabled for in-kernel handling, except for H_RTAS. The exception for H_RTAS is because userspace can already control whether individual RTAS functions are handled in-kernel or not via the KVM_PPC_RTAS_DEFINE_TOKEN ioctl, and because the numeric value for H_RTAS is out of the normal sequence of hcall numbers. Hcalls are enabled or disabled using the KVM_ENABLE_CAP ioctl for the KVM_CAP_PPC_ENABLE_HCALL capability on the file descriptor for the VM. The args field of the struct kvm_enable_cap specifies the hcall number in args[0] and the enable/disable flag in args[1]; 0 means disable in-kernel handling (so that the hcall will always cause an exit to userspace) and 1 means enable. Enabling or disabling in-kernel handling of an hcall is effective across the whole VM. The ability for KVM_ENABLE_CAP to be used on a VM file descriptor on PowerPC is new, added by this commit. The KVM_CAP_ENABLE_CAP_VM capability advertises that this ability exists. When a VM is created, an initial set of hcalls are enabled for in-kernel handling. The set that is enabled is the set that have an in-kernel implementation at this point. Any new hcall implementations from this point onwards should not be added to the default set without a good reason. No distinction is made between real-mode and virtual-mode hcall implementations; the one setting controls them both. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu scheduleMihai Caraman
On vcpu schedule, the condition checked for tlb pollution is too loose. The tlb entries of a vcpu become polluted (vs stale) only when a different vcpu within the same logical partition runs in-between. Optimize the tlb invalidation condition keeping last_vcpu per logical partition id. With the new invalidation condition, a guest shows 4% performance improvement on P5020DS while running a memory stress application with the cpu oversubscribed, the other guest running a cpu intensive workload. Guest - old invalidation condition real 3.89 user 3.87 sys 0.01 Guest - enhanced invalidation condition real 3.75 user 3.73 sys 0.01 Host real 3.70 user 1.85 sys 0.00 The memory stress application accesses 4KB pages backed by 75% of available TLB0 entries: char foo[ENTRIES][4096] __attribute__ ((aligned (4096))); int main() { char bar; int i, j; for (i = 0; i < ITERATIONS; i++) for (j = 0; j < ENTRIES; j++) bar = foo[j][0]; return 0; } Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Reviewed-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S PR: Fix sparse endian checksAlexander Graf
While sending sparse with endian checks over the code base, it triggered at some places that were missing casts or had wrong types. Fix them up. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S PR: Fix ABIv2 on LEAlexander Graf
We switched to ABIv2 on Little Endian systems now which gets rid of the dotted function names. Branch to the actual functions when we see such a system. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC()Anton Blanchard
Both kvmppc_hv_entry_trampoline and kvmppc_entry_trampoline are assembly functions that are exported to modules and also require a valid r2. As such we need to use _GLOBAL_TOC so we provide a global entry point that establishes the TOC (r2). Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issueAnton Blanchard
To establish addressability quickly, ABIv2 requires the target address of the function being called to be in r12. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S PR: Handle hyp doorbell exitsAlexander Graf
If we're running PR KVM in HV mode, we may get hypervisor doorbell interrupts. Handle those the same way we treat normal doorbells. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3s HV: Fix tlbie compile errorAlexander Graf
Some compilers complain about uninitialized variables in the compute_tlbie_rb function. When you follow the code path you'll realize that we'll never get to that point, but the compiler isn't all that smart. So just default to 4k page sizes for everything, making the compiler happy and the code slightly easier to read. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Paul Mackerras <paulus@samba.org>
2014-07-28KVM: PPC: Book3s PR: Disable AIL mode with OPALAlexander Graf
When we're using PR KVM we must not allow the CPU to take interrupts in virtual mode, as the SLB does not contain host kernel mappings when running inside the guest context. To make sure we get good performance for non-KVM tasks but still properly functioning PR KVM, let's just disable AIL whenever a vcpu is scheduled in. This is fundamentally different from how we deal with AIL on pSeries type machines where we disable AIL for the whole machine as soon as a single KVM VM is up. The reason for that is easy - on pSeries we do not have control over per-cpu configuration of AIL. We also don't want to mess with CPU hotplug races and AIL configuration, so setting it per CPU is easier and more flexible. This patch fixes running PR KVM on POWER8 bare metal for me. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Paul Mackerras <paulus@samba.org>
2014-07-28KVM: PPC: BOOK3S: PR: Emulate instruction counterAneesh Kumar K.V
Writing to IC is not allowed in the privileged mode. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: BOOK3S: PR: Emulate virtual timebase registerAneesh Kumar K.V
virtual time base register is a per VM, per cpu register that needs to be saved and restored on vm exit and entry. Writing to VTB is not allowed in the privileged mode. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [agraf: fix compile error] Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28Merge tag 'v3.17-rockchip-rk3xxx-dts' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into next/dt Merge "ARM: dts: changes for existing rockchip boards" from Heiko Stuebner: Collected changes for existing Rockchip boards - convert to new clock driver - bring structure in line with recent rk3288 comments (no soc-nodes, using phandles when adding changes, sorted by address) - i2c, board-pmic and pwm nodes nodes - sd card slot and ir receiver on radxa rock * tag 'v3.17-rockchip-rk3xxx-dts' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: ARM: dts: rk3188-radxarock: add GPIO IR receiver node ARM: dts: rockchip: add pwm nodes ARM: dts: rockchip: add both clocks to uart nodes ARM: dts: rk3188-radxarock: enable sd-card slot ARM: dts: add i2c and regulator nodes to rk3188-radxarock ARM: dts: rockchip: add tps65910 regulator for bqcurie2 ARM: dts: add rk3066 and rk3188 i2c device nodes and pinctrl settings ARM: dts: rockchip: oder nodes by register address ARM: dts: rockchip: remove address from pinctrl nodes ARM: dts: uses handles to reference nodes for changes ARM: dts: rockchip: add handles for shared nodes that don't have one yet ARM: dts: rockchip: remove soc subnodes arm: dts: rockchip: remove obsolete clock gate definitions ARM: dts: rockchip: move oscillator input clock into main dtsi ARM: dts: rockchip: add cru nodes and update device clocks to use it Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2014-07-28Merge tag 'v3.17-rockchip-rk3288' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into next/dt Merge "Initial support for Rockchip RK3288 SoCs" from Heiko Stuebner: * tag 'v3.17-rockchip-rk3288' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: ARM: dts: Build dtbs for Rockchip boards ARM: dts: add rk3288 evaluation board ARM: dts: rockchip: add core rk3288 dtsi ARM: rockchip: enable support for RK3288 SoCs ARM: Kconfig: set default gpio number for rockchip SoCs ARM: rockchip: add debug uart used by rk3288 ARM: rockchip: clarify usability of DEBUG_RK3X_UART debug_ll options dt-bindings: arm: add cortex-a12 and cortex-a17 cpu compatible properties Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2014-07-28Merge branch 'clk-rockchip' of ↵Arnd Bergmann
git://git.linaro.org/people/mike.turquette/linux into next/soc This is a dependency for the rk3288 DT updates, the branch should first get merged through Mike's clk git. * 'clk-rockchip' of git://git.linaro.org/people/mike.turquette/linux: ARM: rockchip: Select ARCH_HAS_RESET_CONTROLLER clk: rockchip: add clock controller for rk3288 dt-bindings: add documentation for rk3288 cru clk: rockchip: add clock driver for rk3188 and rk3066 clocks dt-bindings: add documentation for rk3188 clock and reset unit clk: rockchip: add reset controller clk: rockchip: add clock type for pll clocks and pll used on rk3066 clk: rockchip: add basic infrastructure for clock branches clk: composite: improve rate_hw sanity check logic clk: composite: allow read-only clocks clk: composite: support determine_rate using rate_ops->round_rate + mux_ops->set_parent Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2014-07-28Merge branch 'clk-rockchip' of ↵Arnd Bergmann
git://git.linaro.org/people/mike.turquette/linux into next/soc This is a dependency for the rk3288 DT updates, the branch should first get merged through Mike's clk git. * 'clk-rockchip' of git://git.linaro.org/people/mike.turquette/linux: ARM: rockchip: Select ARCH_HAS_RESET_CONTROLLER clk: rockchip: add clock controller for rk3288 dt-bindings: add documentation for rk3288 cru clk: rockchip: add clock driver for rk3188 and rk3066 clocks dt-bindings: add documentation for rk3188 clock and reset unit clk: rockchip: add reset controller clk: rockchip: add clock type for pll clocks and pll used on rk3066 clk: rockchip: add basic infrastructure for clock branches clk: composite: improve rate_hw sanity check logic clk: composite: allow read-only clocks clk: composite: support determine_rate using rate_ops->round_rate + mux_ops->set_parent Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2014-07-28ARM: edma: Add edma_assign_channel_eventq() to move channel to a give queuePeter Ujfalusi
In some cases it is desired to move a channel to a specific event queue. Such a use case is audio, where it is preferred that it is served with highest priority compared to other DMA clients. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Acked-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2014-07-28ARM: edma: Set default queue to lowest priorityPeter Ujfalusi
Use the lowest priority queue as default for clients. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Acked-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2014-07-28gpio: split gpiod board registration into machine headerLinus Walleij
As per example from the regulator subsystem: put all defines and functions related to registering board info for GPIO descriptors into a separate <linux/gpio/machine.h> header. Cc: Andrew Victor <linux@maxim.org.za> Cc: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Thierry Reding <thierry.reding@gmail.com> Acked-by: Stephen Warren <swarren@wwwdotorg.org> Reviewed-by: Alexandre Courbot <gnurou@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2014-07-28ARM: dt: sun6i: Add #address-cells and #size-cells to i2c controller nodesChen-Yu Tsai
dtc was giving warnings for missing #address-cells and #size-cells for the new sun6i-a31-hummingbird.dts, which has a i2c-based rtc device. This patch adds the properties for all i2c controller nodes for sun6i. Signed-off-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
2014-07-28ARM: dts: zynq: Add SPIAndreas Färber
Signed-off-by: Andreas Färber <afaerber@suse.de> Reviewed-by: Soren Brinkmann <soren.brinkmann@xilinx.com> Signed-off-by: Michal Simek <michal.simek@xilinx.com>
2014-07-28s390/irq: improve displayed interrupt order in /proc/interruptsHendrik Brueckner
Rework the irqclass_main_desc and irqclass_sub_desc data structures which are used to report detaild IRQ statistics in /proc/interrupts. When called from the procfs ops, the entries in the structures are processed one by one. The index of an IRQ in the structures is identical to its definition in the "enum interruption_class". To control and (re)order the displayed sequence, introduce an irq member in each entry. This helps to display related IRQs together without changing the assigned number in the interruption_class enumeration. That means, adding and displaying new IRQs are independent. Finally, this new behavior improves to maintain a kernel ABI. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-07-28s390/seccomp: fix error return for filtered system callsJan Willeke
The syscall_set_return_value function of s390 negates the error argument before storing the value to the return register gpr2. This is incorrect, the seccomp code already passes the negative error value. Store the unmodified error value to gpr2. Signed-off-by: Jan Willeke <willeke@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-07-28Merge tag 'v3.16-rc7' into perf/core, to merge in the latest fixes before ↵Ingo Molnar
applying new changes Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-27Revert "ARC: [arcfpga] stdout-path now suffices for earlycon/console"Greg Kroah-Hartman
This reverts commit 9da433c0a0b5f71ac92dc9dca778295649675f08. Vineet writes: Could you please revert this single patch from tty-next for 3.17 as the needed core changes are not yet finalized. Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-07-28powerpc/perf: Add per-event excludes on Power8Michael Ellerman
Power8 has a new register (MMCR2), which contains individual freeze bits for each counter. This is an improvement on previous chips as it means we can have multiple events on the PMU at the same time with different exclude_{user,kernel,hv} settings. Previously we had to ensure all events on the PMU had the same exclude settings. The core of the patch is fairly simple. We use the 207S feature flag to indicate that the PMU backend supports per-event excludes, if it's set we skip the generic logic that enforces the equality of excludes between events. We also use that flag to skip setting the freeze bits in MMCR0, the PMU backend is expected to have handled setting them in MMCR2. The complication arises with EBB. The FCxP bits in MMCR2 are accessible R/W to a task using EBB. Which means a task using EBB will be able to see that we are using MMCR2 for freezing, whereas the old logic which used MMCR0 is not user visible. The task can not see or affect exclude_kernel & exclude_hv, so we only need to consider exclude_user. The table below summarises the behaviour both before and after this commit is applied: exclude_user true false ------------------------------------ | User visible | N N Before | Can freeze | Y Y | Can unfreeze | N Y ------------------------------------ | User visible | Y Y After | Can freeze | Y Y | Can unfreeze | Y/N Y ------------------------------------ So firstly I assert that the simple visibility of the exclude_user setting in MMCR2 is a non-issue. The event belongs to the task, and was most likely created by the task. So the exclude_user setting is not privileged information in any way. Secondly, the behaviour in the exclude_user = false case is unchanged. This is important as it is the case that is actually useful, ie. the event is created with no exclude setting and the task uses MMCR2 to implement exclusion manually. For exclude_user = true there is no meaningful change to freezing the event. Previously the task could use MMCR2 to freeze the event, though it was already frozen with MMCR0. With the new code the task can use MMCR2 to freeze the event, though it was already frozen with MMCR2. The only real change is when exclude_user = true and the task tries to use MMCR2 to unfreeze the event. Previously this had no effect, because the event was already frozen in MMCR0. With the new code the task can unfreeze the event in MMCR2, but at some indeterminate time in the future the kernel will overwrite its setting and refreeze the event. Therefore my final assertion is that any task using exclude_user = true and also fiddling with MMCR2 was deeply confused before this change, and remains so after it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-28powerpc/perf: Pass the struct perf_events down to compute_mmcr()Michael Ellerman
To support per-event exclude settings on Power8 we need access to the struct perf_events in compute_mmcr(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-28powerpc/perf: Clear all MMCR settings before calling compute_mmcr()Michael Ellerman
Because we reuse cpuhw->mmcr on each call to compute_mmcr() there's a risk that we could forget to set one of the values and use whatever value was in there previously. Currently all the implementations are careful to set all the values, but it's safer to clear them all before we call compute_mmcr(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-28powerpc: Document how we set AIL on guest kernelsMichael Ellerman
I spent ten minutes scratching my head, trying to work out where we enabled relocation on interrupts for guest kernels. Expand the doco to make it clear. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-28powerpc/pseries: Switch pseries drivers to use machine_xxx_initcall()Michael Ellerman
A lot of the code in platforms/pseries is using non-machine initcalls. That means if a kernel built with pseries support runs on another platform, for example powernv, the initcalls will still run. Most of these cases are OK, though sometimes only due to luck. Some were having more effect: * hcall_inst_init - Checking FW_FEATURE_LPAR which is set on ps3 & celleb. * mobility_sysfs_init - created sysfs files unconditionally - but no effect due to ENOSYS from rtas_ibm_suspend_me() * apo_pm_init - created sysfs, allows write - nothing checks the value written to though * alloc_dispatch_log_kmem_cache - creating kmem_cache on non-pseries machines Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-28powerpc/powernv: Switch powernv drivers to use machine_xxx_initcall()Michael Ellerman
A lot of the code in platforms/powernv is using non-machine initcalls. That means if a kernel built with powernv support runs on another platform, for example pseries, the initcalls will still run. That is usually OK, because the initcalls will check for something in the device tree or elsewhere before doing anything, so on other platforms they will usually just return. But it's fishy for powernv code to be running on other platforms, so switch them all to be machine initcalls. If we want any of them to run on other platforms in future they should move to sysdev. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-28powerpc: Add machine_early_initcall()Michael Ellerman
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>