summaryrefslogtreecommitdiffstats
path: root/arch/powerpc
AgeCommit message (Collapse)Author
2014-07-11powerpc/perf: Never program book3s PMCs with values >= 0x80000000Anton Blanchard
We are seeing a lot of PMU warnings on POWER8: Can't find PMC that caused IRQ Looking closer, the active PMC is 0 at this point and we took a PMU exception on the transition from negative to 0. Some versions of POWER8 have an issue where they edge detect and not level detect PMC overflows. A number of places program the PMC with (0x80000000 - period_left), where period_left can be negative. We can either fix all of these or just ensure that period_left is always >= 1. This patch takes the second option. Cc: <stable@vger.kernel.org> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-11powerpc: Disable RELOCATABLE for COMPILE_TEST with PPC64Guenter Roeck
powerpc:allmodconfig has been failing for some time with the following error. arch/powerpc/kernel/exceptions-64s.S: Assembler messages: arch/powerpc/kernel/exceptions-64s.S:1312: Error: attempt to move .org backwards make[1]: *** [arch/powerpc/kernel/head_64.o] Error 1 A number of attempts to fix the problem by moving around code have been unsuccessful and resulted in failed builds for some configurations and the discovery of toolchain bugs. Fix the problem by disabling RELOCATABLE for COMPILE_TEST builds instead. While this is less than perfect, it avoids substantial code changes which would otherwise be necessary just to make COMPILE_TEST builds happy and might have undesired side effects. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-11powerpc/perf: Clear MMCR2 when enabling PMUJoel Stanley
On POWER8 when switching to a KVM guest we set bits in MMCR2 to freeze the PMU counters. Aside from on boot they are then never reset, resulting in stuck perf counters for any user in the guest or host. We now set MMCR2 to 0 whenever enabling the PMU, which provides a sane state for perf to use the PMU counters under either the guest or the host. This was manifesting as a bug with ppc64_cpu --frequency: $ sudo ppc64_cpu --frequency WARNING: couldn't run on cpu 0 WARNING: couldn't run on cpu 8 ... WARNING: couldn't run on cpu 144 WARNING: couldn't run on cpu 152 min: 18446744073.710 GHz (cpu -1) max: 0.000 GHz (cpu -1) avg: 0.000 GHz The command uses a perf counter to measure CPU cycles over a fixed amount of time, in order to approximate the frequency of the machine. The counters were returning zero once a guest was started, regardless of weather it was still running or had been shut down. By dumping the value of MMCR2, it was observed that once a guest is running MMCR2 is set to 1s - which stops counters from running: $ sudo sh -c 'echo p > /proc/sysrq-trigger' CPU: 0 PMU registers, ppmu = POWER8 n_counters = 6 PMC1: 5b635e38 PMC2: 00000000 PMC3: 00000000 PMC4: 00000000 PMC5: 1bf5a646 PMC6: 5793d378 PMC7: deadbeef PMC8: deadbeef MMCR0: 0000000080000000 MMCR1: 000000001e000000 MMCRA: 0000040000000000 MMCR2: fffffffffffffc00 EBBHR: 0000000000000000 EBBRR: 0000000000000000 BESCR: 0000000000000000 SIAR: 00000000000a51cc SDAR: c00000000fc40000 SIER: 0000000001000000 This is done unconditionally in book3s_hv_interrupts.S upon entering the guest, and the original value is only save/restored if the host has indicated it was using the PMU. This is okay, however the user of the PMU needs to ensure that it is in a defined state when it starts using it. Fixes: e05b9b9e5c10 ("powerpc/perf: Power8 PMU support") Cc: stable@vger.kernel.org Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-11powerpc/perf: Add PPMU_ARCH_207S defineJoel Stanley
Instead of separate bits for every POWER8 PMU feature, have a single one for v2.07 of the architecture. This saves us adding a MMCR2 define for a future patch. Cc: stable@vger.kernel.org Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-11powerpc/kvm: Remove redundant save of SIER AND MMCR2Joel Stanley
These two registers are already saved in the block above. Aside from being unnecessary, by the time we get down to the second save location r8 no longer contains MMCR2, so we are clobbering the saved value with PMC5. MMCR2 primarily consists of counter freeze bits. So restoring the value of PMC5 into MMCR2 will most likely have the effect of freezing counters. Fixes: 72cde5a88d37 ("KVM: PPC: Book3S HV: Save/restore host PMU registers that are new in POWER8") Cc: stable@vger.kernel.org Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Alexander Graf <agraf@suse.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-11powerpc/powernv: Check for IRQHAPPENED before sleepingPreeti U Murthy
Commit 8d6f7c5a: "powerpc/powernv: Make it possible to skip the IRQHAPPENED check in power7_nap()" added code that prevents cpus from checking for pending interrupts just before entering sleep state, which is wrong. These interrupts are delivered during the soft irq disabled state of the cpu. A cpu cannot enter any idle state with pending interrupts because they will never be serviced until the next time the cpu is woken up by some other interrupt. Its only then that the pending interrupts are replayed. This can result in device timeouts or warnings about this cpu being stuck. This patch fixes ths issue by ensuring that cpus check for pending interrupts just before entering any idle state as long as they are not in the path of split core operations. Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-11powerpc: Clean up MMU_FTRS_A2 and MMU_FTR_TYPE_3EMichael Ellerman
In fb5a515704d7 "powerpc: Remove platforms/wsp and associated pieces", we removed the last user of MMU_FTRS_A2. So remove it. MMU_FTRS_A2 was the last user of MMU_FTR_TYPE_3E, so remove it also. This leaves some unreachable code in mmu_context_nohash.c, so remove that also. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-11powerpc/cell: Fix compilation with CONFIG_COREDUMP=nMichael Ellerman
Commit 046d662f4818 "coredump: make core dump functionality optional" made the coredump optional, but didn't update the spufs code that depends on it. That leads to build errors such as: arch/powerpc/platforms/built-in.o: In function `.spufs_arch_write_note': coredump.c:(.text+0x22cd4): undefined reference to `.dump_emit' coredump.c:(.text+0x22cf4): undefined reference to `.dump_emit' coredump.c:(.text+0x22d0c): undefined reference to `.dump_align' coredump.c:(.text+0x22d48): undefined reference to `.dump_emit' coredump.c:(.text+0x22e7c): undefined reference to `.dump_skip' Fix it by adding some ifdefs in the cell code. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-25powerpc: Don't skip ePAPR spin-table CPUsScott Wood
Commit 59a53afe70fd530040bdc69581f03d880157f15a "powerpc: Don't setup CPUs with bad status" broke ePAPR SMP booting. ePAPR says that CPUs that aren't presently running shall have status of disabled, with enable-method being used to determine whether the CPU can be enabled. Fix by checking for spin-table, which is currently the only supported enable-method. Signed-off-by: Scott Wood <scottwood@freescale.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Emil Medve <Emilian.Medve@Freescale.com> Cc: stable@vger.kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-25powerpc/module: Fix TOC symbol CRCLaurent Dufour
The commit 71ec7c55ed91 introduced the magic symbol ".TOC." for ELFv2 ABI. This symbol is built manually and has no CRC value computed. A zero value is put in the CRC section to avoid modpost complaining about a missing CRC. Unfortunately, this breaks the kernel module loading when the kernel is relocated (kdump case for instance) because of the relocation applied to the kcrctab values. This patch compute a CRC value for the TOC symbol which will match the one compute by the kernel when it is relocated - aka '0 - relocate_start' done in maybe_relocated called by check_version (module.c). Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Cc: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-25powerpc/powernv: Remove OPAL v1 takeoverMichael Ellerman
In commit 27f4488872d9 "Add OPAL takeover from PowerVM" we added support for "takeover" on OPAL v1 machines. This was a mode of operation where we would boot under pHyp, and query for the presence of OPAL. If detected we would then do a special sequence to take over the machine, and the kernel would end up running in hypervisor mode. OPAL v1 was never a supported product, and was never shipped outside IBM. As far as we know no one is still using it. Newer versions of OPAL do not use the takeover mechanism. Although the query for OPAL should be harmless on machines with newer OPAL, we have seen a machine where it causes a crash in Open Firmware. The code in early_init_devtree() to copy boot_command_line into cmd_line was added in commit 817c21ad9a1f "Get kernel command line accross OPAL takeover", and AFAIK is only used by takeover, so should also be removed. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-24powerpc/kmemleak: Do not scan the DART tableCatalin Marinas
The DART table allocation is registered to kmemleak via the memblock_alloc_base() call. However, the DART table is later unmapped and dart_tablebase VA no longer accessible. This patch tells kmemleak not to scan this block and avoid an unhandled paging request. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-24powerpc/cell: cbe_thermal.c: Cleaning up a variable is of the wrong typeRickard Strandqvist
This variable is of the wrong type, everywhere it is used it should be an unsigned int rather than a int. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-24powerpc/kprobes: Fix jprobes on ABI v2 (LE)Michael Ellerman
In commit 721aeaa9 "Build little endian ppc64 kernel with ABIv2", we missed some updates required in the kprobes code to make jprobes work when the kernel is built with ABI v2. Firstly update arch_deref_entry_point() to do the right thing. Now that we have added ppc_global_function_entry() we can just always use that, it will do the right thing for 32 & 64 bit and ABI v1 & v2. Secondly we need to update the code that sets up the register state before calling the jprobe handler. On ABI v1 we setup r2 to hold the TOC, on ABI v2 we need to populate r12 with the function entry point address. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-24powerpc/ftrace: Use pr_fmt() to namespace error messagesMichael Ellerman
The printks() in our ftrace code have no prefix, so they appear on the console with very little context, eg: Branch out of range Use pr_fmt() & pr_err() to add a prefix. While we're at it, collapse a few split lines that don't need to be, and add a missing newline to one message. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-24powerpc/ftrace: Fix nop of modules on 64bit LE (ABIv2)Michael Ellerman
There is a bug in the handling of the function entry when we are nopping out a branch from a module in ftrace. We compare the result of module_trampoline_target() with the value of ppc_function_entry(), and expect them to be true. But they never will be. module_trampoline_target() will always return the global entry point of the function, whereas ppc_function_entry() will always return the local. Fix it by using the newly added ppc_global_function_entry(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-24powerpc/ftrace: Fix inverted check of create_branch()Michael Ellerman
In commit 24a1bdc35, "Fix ABIv2 issues with __ftrace_make_call", Anton changed the logic that creates and patches the branch, and added a thinko in the check of create_branch(). create_branch() returns the instruction that was generated, so if we get zero then it succeeded. The result is we can't ftrace modules: Branch out of range WARNING: at ../kernel/trace/ftrace.c:1638 ftrace failed to modify [<d000000004ba001c>] fuse_req_init_context+0x1c/0x90 [fuse] We should probably fix patch_instruction() to do that check and make the API saner, but that's a separate patch. For now just invert the test. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-24powerpc/ftrace: Fix typo in mask of opcodeMichael Ellerman
In commit 24a1bdc35, "Fix ABIv2 issues with __ftrace_make_call", Anton changed the logic that checks for the expected code sequence when patching a module. We missed the typo in the mask, 0xffff00000 should be 0xffff0000, which has the effect of making the test always true. That makes it impossible to ftrace against modules, eg: Unexpected call sequence: 48000008 e8410018 WARNING: at ../kernel/trace/ftrace.c:1638 ftrace failed to modify [<d000000007cf001c>] rng_dev_open+0x1c/0x70 [rng_core] Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-24powerpc: Add ppc_global_function_entry()Michael Ellerman
ABIv2 has the concept of a global and local entry point to a function. In most cases we are interested in the local entry point, and so that is what ppc_function_entry() returns. However we have a case in the ftrace code where we want the global entry point, and there may be other places we need it too. Rather than special casing each, add an accessor. For ABIv1 and 32-bit there is only a single entry point, so we return that. That means it's safe for the caller to use this without also checking the ABI version. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-24powerpc: Remove __arch_swab*Benjamin Herrenschmidt
The generic code uses gcc built-ins which work fine so there's no benefit in implementing our own anymore. We can't completely remove the ld/st_le* functions as some historical cruft still uses them, but that's next on the radar Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-24powerpc: Remove ancient DEBUG_SIG codeMichael Ellerman
We have some compile-time disabled debug code in signal_xx.c. It's from some ancient time BG, almost certainly part of the original port, given the very similar code on other arches. The show_unhandled_signal logic, added in d0c3d534a438 (2.6.24) is cleaner and prints more useful information, so drop the debug code. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-24powerpc/kerenl: Enable EEH for IO accessorsGavin Shan
In arch/powerpc/kernel/iomap.c, lots of IO reading accessors missed to check EEH error as Ben pointed. The patch fixes it. For the writing accessors, we change the called functions only for making them look similar to the reading counterparts. Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-12Merge branch 'kbuild' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kbuild updates from Michal Marek: "Kbuild changes for v3.16-rc1: - cross-compilation fix so that cc-option is testing the right compiler - Fix for make defconfig all - Using relative paths to the object and source directory where possible, plus fixes for the fallout of the change - several cleanups in the Makefiles and scripts The powerpc fix is from today, because it was only discovered recently. The rest has been in linux-next for some time" * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: powerpc: Avoid circular dependency with zImage.% kbuild: create include/config directory in scripts/kconfig/Makefile kbuild: do not create include/linux directory Makefile: Fix unrecognized cross-compiler command line options kbuild: do not add "selinux" to subdir- twice um: Fix for relative objtree when generating x86 headers kbuild: Use relative path when building in a subdir of the source tree kbuild: Use relative path when building in the source tree kbuild: Use relative path for $(objtree) firmware: Use $(quote) in the Makefile firmware: Simplify directory creation kbuild: trivial - fix comment block indent kbuild: trivial - remove trailing spaces kbuild: support simultaneous "make %config" and "make all" kbuild: move extra gcc checks to scripts/Makefile.extrawarn
2014-06-12Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull more powerpc updates from Ben Herrenschmidt: "Here are the remaining bits I was mentioning earlier. Mostly bug fixes and new selftests from Michael (yay !). He also removed the WSP platform and A2 core support which were dead before release, so less clutter. One little "feature" I snuck in is the doorbell IPI support for non-virtualized P8 which speeds up IPIs significantly between threads of a core" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (34 commits) powerpc/book3s: Fix some ABIv2 issues in machine check code powerpc/book3s: Fix guest MC delivery mechanism to avoid soft lockups in guest. powerpc/book3s: Increment the mce counter during machine_check_early call. powerpc/book3s: Add stack overflow check in machine check handler. powerpc/book3s: Fix machine check handling for unhandled errors powerpc/eeh: Dump PE location code powerpc/powernv: Enable POWER8 doorbell IPIs powerpc/cpuidle: Only clear LPCR decrementer wakeup bit on fast sleep entry powerpc/powernv: Fix killed EEH event powerpc: fix typo 'CONFIG_PMAC' powerpc: fix typo 'CONFIG_PPC_CPU' powerpc/powernv: Don't escalate non-existing frozen PE powerpc/eeh: Report frozen parent PE prior to child PE powerpc/eeh: Clear frozen state for child PE powerpc/powernv: Reduce panic timeout from 180s to 10s powerpc/xmon: avoid format string leaking to printk selftests/powerpc: Add tests of PMU EBBs selftests/powerpc: Add support for skipping tests selftests/powerpc: Put the test in a separate process group selftests/powerpc: Fix instruction loop for ABIv2 (LE) ...
2014-06-12Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull more scheduler updates from Ingo Molnar: "Second round of scheduler changes: - try-to-wakeup and IPI reduction speedups, from Andy Lutomirski - continued power scheduling cleanups and refactorings, from Nicolas Pitre - misc fixes and enhancements" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/deadline: Delete extraneous extern for to_ratio() sched/idle: Optimize try-to-wake-up IPI sched/idle: Simplify wake_up_idle_cpu() sched/idle: Clear polling before descheduling the idle thread sched, trace: Add a tracepoint for IPI-less remote wakeups cpuidle: Set polling in poll_idle sched: Remove redundant assignment to "rt_rq" in update_curr_rt(...) sched: Rename capacity related flags sched: Final power vs. capacity cleanups sched: Remove remaining dubious usage of "power" sched: Let 'struct sched_group_power' care about CPU capacity sched/fair: Disambiguate existing/remaining "capacity" usage sched/fair: Change "has_capacity" to "has_free_capacity" sched/fair: Remove "power" from 'struct numa_stats' sched: Fix signedness bug in yield_to() sched/fair: Use time_after() in record_wakee() sched/balancing: Reduce the rate of needless idle load balancing sched/fair: Fix unlocked reads of some cfs_b->quota/period
2014-06-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov. 2) Multiqueue support in xen-netback and xen-netfront, from Andrew J Benniston. 3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn Mork. 4) BPF now has a "random" opcode, from Chema Gonzalez. 5) Add more BPF documentation and improve test framework, from Daniel Borkmann. 6) Support TCP fastopen over ipv6, from Daniel Lee. 7) Add software TSO helper functions and use them to support software TSO in mvneta and mv643xx_eth drivers. From Ezequiel Garcia. 8) Support software TSO in fec driver too, from Nimrod Andy. 9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli. 10) Handle broadcasts more gracefully over macvlan when there are large numbers of interfaces configured, from Herbert Xu. 11) Allow more control over fwmark used for non-socket based responses, from Lorenzo Colitti. 12) Do TCP congestion window limiting based upon measurements, from Neal Cardwell. 13) Support busy polling in SCTP, from Neal Horman. 14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru. 15) Bridge promisc mode handling improvements from Vlad Yasevich. 16) Don't use inetpeer entries to implement ID generation any more, it performs poorly, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits) rtnetlink: fix userspace API breakage for iproute2 < v3.9.0 tcp: fixing TLP's FIN recovery net: fec: Add software TSO support net: fec: Add Scatter/gather support net: fec: Increase buffer descriptor entry number net: fec: Factorize feature setting net: fec: Enable IP header hardware checksum net: fec: Factorize the .xmit transmit function bridge: fix compile error when compiling without IPv6 support bridge: fix smatch warning / potential null pointer dereference via-rhine: fix full-duplex with autoneg disable bnx2x: Enlarge the dorq threshold for VFs bnx2x: Check for UNDI in uncommon branch bnx2x: Fix 1G-baseT link bnx2x: Fix link for KR with swapped polarity lane sctp: Fix sk_ack_backlog wrap-around problem net/core: Add VF link state control policy net/fsl: xgmac_mdio is dependent on OF_MDIO net/fsl: Make xgmac_mdio read error message useful net_sched: drr: warn when qdisc is not work conserving ...
2014-06-12Merge commit '3cf2f34' into sched/core, to fix build errorIngo Molnar
Fix this dependency on the locking tree's smp_mb*() API changes: kernel/sched/idle.c:247:3: error: implicit declaration of function ‘smp_mb__after_atomic’ [-Werror=implicit-function-declaration] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-12powerpc: Avoid circular dependency with zImage.%Michal Marek
The rule to create the final images uses a zImage.% pattern. Unfortunately, this also matches the names of the zImage.*.lds linker scripts, which appear as a dependency of the final images. This somehow worked when $(srctree) used to be an absolute path, but now the pattern matches too much. List only the images from $(image-y) as the target of the rule, to avoid the circular dependency. Reported-and-tested-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
2014-06-12powerpc/book3s: Fix some ABIv2 issues in machine check codeAnton Blanchard
Commit 2749a2f26a7c (powerpc/book3s: Fix machine check handling for unhandled errors) introduced a few ABIv2 issues. We can maintain ABIv1 and ABIv2 compatibility by branching to the function rather than the dot symbol. Fixes: 2749a2f26a7c ("powerpc/book3s: Fix machine check handling for unhandled errors") Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/book3s: Fix guest MC delivery mechanism to avoid soft lockups in guest.Mahesh Salgaonkar
Currently we forward MCEs to guest which have been recovered by guest. And for unhandled errors we do not deliver the MCE to guest. It looks like with no support of FWNMI in qemu, guest just panics whenever we deliver the recovered MCEs to guest. Also, the existig code used to return to host for unhandled errors which was casuing guest to hang with soft lockups inside guest and makes it difficult to recover guest instance. This patch now forwards all fatal MCEs to guest causing guest to crash/panic. And, for recovered errors we just go back to normal functioning of guest instead of returning to host. This fixes soft lockup issues in guest. This patch also fixes an issue where guest MCE events were not logged to host console. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/book3s: Increment the mce counter during machine_check_early call.Mahesh Salgaonkar
We don't see MCE counter getting increased in /proc/interrupts which gives false impression of no MCE occurred even when there were MCE events. The machine check early handling was added for PowerKVM and we missed to increment the MCE count in the early handler. We also increment mce counters in the machine_check_exception call, but in most cases where we handle the error hypervisor never reaches there unless its fatal and we want to crash. Only during fatal situation we may see double increment of mce count. We need to fix that. But for now it always good to have some count increased instead of zero. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/book3s: Add stack overflow check in machine check handler.Mahesh Salgaonkar
Currently machine check handler does not check for stack overflow for nested machine check. If we hit another MCE while inside the machine check handler repeatedly from same address then we get into risk of stack overflow which can cause huge memory corruption. This patch limits the nested MCE level to 4 and panic when we cross level 4. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/book3s: Fix machine check handling for unhandled errorsMahesh Salgaonkar
Current code does not check for unhandled/unrecovered errors and return from interrupt if it is recoverable exception which in-turn triggers same machine check exception in a loop causing hypervisor to be unresponsive. This patch fixes this situation and forces hypervisor to panic for unhandled/unrecovered errors. This patch also fixes another issue where unrecoverable_exception routine was called in real mode in case of unrecoverable exception (MSR_RI = 0). This causes another exception vector 0x300 (data access) during system crash leading to confusion while debugging cause of the system crash. Also turn ME bit off while going down, so that when another MCE is hit during panic path, system will checkstop and hypervisor will get restarted cleanly by SP. With the above fixes we now throw correct console messages (see below) while crashing the system in case of unhandled/unrecoverable machine checks. -------------- Severe Machine check interrupt [[Not recovered] Initiator: CPU Error type: UE [Instruction fetch] Effective address: 0000000030002864 Oops: Machine check, sig: 7 [#1] SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: bork(O) bridge stp llc kvm [last unloaded: bork] CPU: 36 PID: 55162 Comm: bash Tainted: G O 3.14.0mce #1 task: c000002d72d022d0 ti: c000000007ec0000 task.ti: c000002d72de4000 NIP: 0000000030002864 LR: 00000000300151a4 CTR: 000000003001518c REGS: c000000007ec3d80 TRAP: 0200 Tainted: G O (3.14.0mce) MSR: 9000000000041002 <SF,HV,ME,RI> CR: 28222848 XER: 20000000 CFAR: 0000000030002838 DAR: d0000000004d0000 DSISR: 00000000 SOFTE: 1 GPR00: 000000003001512c 0000000031f92cb0 0000000030078af0 0000000030002864 GPR04: d0000000004d0000 0000000000000000 0000000030002864 ffffffffffffffc9 GPR08: 0000000000000024 0000000030008af0 000000000000002c c00000000150e728 GPR12: 9000000000041002 0000000031f90000 0000000010142550 0000000040000000 GPR16: 0000000010143cdc 0000000000000000 00000000101306fc 00000000101424dc GPR20: 00000000101424e0 000000001013c6f0 0000000000000000 0000000000000000 GPR24: 0000000010143ce0 00000000100f6440 c000002d72de7e00 c000002d72860250 GPR28: c000002d72860240 c000002d72ac0038 0000000000000008 0000000000040000 NIP [0000000030002864] 0x30002864 LR [00000000300151a4] 0x300151a4 Call Trace: Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX ---[ end trace 7285f0beac1e29d3 ]--- Sending IPI to other CPUs IPI complete OPAL V3 detected ! -------------- Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/eeh: Dump PE location codeGavin Shan
As Ben suggested, it's meaningful to dump PE's location code for site engineers when hitting EEH errors. The patch introduces function eeh_pe_loc_get() to retireve the location code from dev-tree so that we can output it when hitting EEH errors. If primary PE bus is root bus, the PHB's dev-node would be tried prior to root port's dev-node. Otherwise, the upstream bridge's dev-node of the primary PE bus will be check for the location code directly. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Enable POWER8 doorbell IPIsMichael Neuling
This patch enables POWER8 doorbell IPIs on powernv. Since doorbells can only IPI within a core, we test to see when we can use doorbells and if not we fall back to XICS. This also enables hypervisor doorbells to wakeup us up from nap/sleep via the LPCR PECEDH bit. Based on tests by Anton, the best case IPI latency between two threads dropped from 894ns to 512ns. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Fix killed EEH eventGavin Shan
On PowerNV platform, EEH errors are reported by IO accessors or poller driven by interrupt. After the PE is isolated, we won't produce EEH event for the PE. The current implementation has possibility of EEH event lost in this way: The interrupt handler queues one "special" event, which drives the poller. EEH thread doesn't pick the special event yet. IO accessors kicks in, the frozen PE is marked as "isolated" and EEH event is queued to the list. EEH thread runs because of special event and purge all existing EEH events. However, we never produce an other EEH event for the frozen PE. Eventually, the PE is marked as "isolated" and we don't have EEH event to recover it. The patch fixes the issue to keep EEH events for PEs that have been marked as "isolated" with the help of additional "force" help to eeh_remove_event(). Reported-by: Rolf Brudeseth <rolfb@us.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc: fix typo 'CONFIG_PMAC'Paul Bolle
Commit b0d278b7d3ae ("powerpc/perf_event: Reduce latency of calling perf_event_do_pending") added a check for CONFIG_PMAC were a check for CONFIG_PPC_PMAC was clearly intended. Fixes: b0d278b7d3ae ("powerpc/perf_event: Reduce latency of calling perf_event_do_pending") Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc: fix typo 'CONFIG_PPC_CPU'Paul Bolle
Commit cd64d1697cf0 ("powerpc: mtmsrd not defined") added a check for CONFIG_PPC_CPU were a check for CONFIG_PPC_FPU was clearly intended. Fixes: cd64d1697cf0 ("powerpc: mtmsrd not defined") Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Don't escalate non-existing frozen PEGavin Shan
Commit cb5b242c ("powerpc/eeh: Escalate error on non-existing PE") escalates the frozen state on non-existing PE to fenced PHB. It was to improve kdump reliability. After that, commit 361f2a2a ("powrpc/powernv: Reset PHB in kdump kernel") was introduced to issue complete reset on all PHBs to increase the reliability of kdump kernel. Commit cb5b242c becomes unuseful and it would be reverted. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/eeh: Report frozen parent PE prior to child PEGavin Shan
When we have the corner case of frozen parent and child PE at the same time, we have to handle the frozen parent PE prior to the child. Without clearning the frozen state on parent PE, the child PE can't be recovered successfully. The patch searches the EEH PE hierarchy tree and returns the toppest frozen PE to be handled. It ensures the frozen parent PE will be handled prior to child PE. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/eeh: Clear frozen state for child PEGavin Shan
Since commit cb523e09 ("powerpc/eeh: Avoid I/O access during PE reset"), the PE is kept as frozen state on hardware level until the PE reset is done completely. After that, we explicitly clear the frozen state of the affected PE. However, there might have frozen child PEs of the affected PE and we also need clear their frozen state as well. Otherwise, the recovery is going to fail. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Reduce panic timeout from 180s to 10sAnton Blanchard
We've already dropped the default pseries timeout to 10s, do the same for powernv. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/xmon: avoid format string leaking to printkKees Cook
This makes sure format strings cannot leak into printk (the string has already been correctly processed for format arguments). Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/perf: Ensure all EBB register state is cleared on fork()Michael Ellerman
In commit 330a1eb "Core EBB support for 64-bit book3s" I messed up clear_task_ebb(). It clears some but not all of the task's Event Based Branch (EBB) registers when we duplicate a task struct. That allows a child task to observe the EBBHR & EBBRR of its parent, which it should not be able to do. Fix it by clearing EBBHR & EBBRR. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Cc: stable@vger.kernel.org [v3.11+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Fix reading of OPAL msglogJoel Stanley
memory_return_from_buffer returns a signed value, so ret should be ssize_t. Fixes the following issue reported by David Binderman: [linux-3.15/arch/powerpc/platforms/powernv/opal-msglog.c:65]: (style) Checking if unsigned variable 'ret' is less than zero. [linux-3.15/arch/powerpc/platforms/powernv/opal-msglog.c:82]: (style) Checking if unsigned variable 'ret' is less than zero. Local variable "ret" is of type size_t. This is always unsigned, so it is pointless to check if it is less than zero. https://bugzilla.kernel.org/show_bug.cgi?id=77551 Fixing this exposes a real bug for the case where the entire count bytes is successfully read from the POS_WRAP case. The second memory_read_from_buffer will return EINVAL, causing the entire read to return EINVAL to userspace, despite the data being copied correctly. The fix is to test for the case where the data has been read and return early. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/spufs: Remove duplicate SPUFS_CNTL_MAP_SIZE defineDan Carpenter
The SPUFS_CNTL_MAP_SIZE define is cut and pasted twice so we can delete the second instance. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Jeremy Kerr <jk@ozlabs.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/cpm: Remove duplicate FCC_GFMR_TTX defineDan Carpenter
The FCC_GFMR_TTX define is cut and pasted twice so we can remove the second instance. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv: Fix endianness problems in EEHGuo Chao
EEH information fetched from OPAL need fix before using in LE environment. To be included in sparse's endian check, declare them as __beXX and access them by accessors. Cc: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powernv: Fix permissions on sysparam sysfs entriesAnton Blanchard
Everyone can write to these files, which is not what we want. Cc: stable@vger.kernel.org # 3.15 Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc/powernv : Disable subcore for UP configsShreyas B. Prabhu
Build throws following errors when CONFIG_SMP=n arch/powerpc/platforms/powernv/subcore.c: In function ‘cpu_update_split_mode’: arch/powerpc/platforms/powernv/subcore.c:274:15: error: ‘setup_max_cpus’ undeclared (first use in this function) arch/powerpc/platforms/powernv/subcore.c:285:5: error: lvalue required as left operand of assignment 'setup_max_cpus' variable is relevant only on SMP, so there is no point working around it for UP. Furthermore, subcore itself is relevant only on SMP and hence the better solution is to exclude subcore.o and subcore-asm.o for UP builds. Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>