summaryrefslogtreecommitdiffstats
path: root/arch/powerpc/kernel
AgeCommit message (Collapse)Author
2010-07-27perf, powerpc: Use perf_sample_data_init() for the FSL codePeter Zijlstra
We should use perf_sample_data_init() to initialize struct perf_sample_data. As explained in the description of commit dc1d628a ("perf: Provide generic perf_sample_data initialization"), it is possible for userspace to get the kernel to dereference data.raw, so if it is not initialized, that means that unprivileged userspace can possibly oops the kernel. Using perf_sample_data_init makes sure it gets initialized to NULL. This conversion should have been included in commit dc1d628a, but it got missed. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Kumar Gala <kumar.gala@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-07-23powerpc: Fix erroneous lmb->memblock conversionsBenjamin Herrenschmidt
Oooops... we missed these. We incorrectly converted strings used when parsing the device-tree on pseries, thus breaking access to drconf memory and hotplug memory. While at it, also revert some variable names that represent something the FW calls "lmb" and thus don't need to be converted to "memblock". Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> ---
2010-07-14Merge branch 'lmb-to-memblock' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'lmb-to-memblock' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: lmb: rename to memblock
2010-07-14lmb: rename to memblockYinghai Lu
via following scripts FILES=$(find * -type f | grep -vE 'oprofile|[^K]config') sed -i \ -e 's/lmb/memblock/g' \ -e 's/LMB/MEMBLOCK/g' \ $FILES for N in $(find . -name lmb.[ch]); do M=$(echo $N | sed 's/lmb/memblock/g') mv $N $M done and remove some wrong change like lmbench and dlmb etc. also move memblock.c from lib/ to mm/ Suggested-by: Ingo Molnar <mingo@elte.hu> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-11powerpc/fsl-booke: Fix address issue when using relocatable kernelsMatthew McClintock
When booting a relocatable kernel it needs to jump to the correct start address, which for BookE parts is usually unchanged regardless of the physical memory offset. Recent changes cause problems with how we calculate the start address, it was always adding the RMO into the start address which is incorrect. This patch only adds in the RMO offset if we are in the kexec code path, as it needs the RMO to work correctly. Instead of adding the RMO offset in in the common code path, we can just set r6 to the RMO offset in the kexec code path instead of to zero, and finally perform the masking in the common code path Signed-off-by: Matthew McClintock <msm@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-07-08powerpc: Fix default_machine_crash_shutdown #ifdef botchPaul E. McKenney
crash_kexec_wait_realmode() is defined only if CONFIG_PPC_STD_MMU_64 and CONFIG_SMP, but is called if CONFIG_PPC_STD_MMU_64 even if !CONFIG_SMP. Fix the conditional compilation around the invocation. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-08powerpc: Fix logic error in fixup_irqsJohannes Berg
When SPARSE_IRQ is set, irq_to_desc() can return NULL. While the code here has a check for NULL, it's not really correct. Fix it by separating the check for it. This fixes CPU hot unplug for me. Reported-by: Alastair Bridgewater <alastair.bridgewater@gmail.com> Cc: stable@kernel.org [2.6.32+] Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-08powerpc: Linux cannot run with 0 coresAnton Blanchard
If we configure with CONFIG_SMP=n or set NR_CPUS less than the number of SMT threads we will set the max cores property to 0 in the ibm,client-architecture-support structure. On new versions of firmware that understand this property it obliges and terminates our partition. Use DIV_ROUND_UP so we handle not only the CONFIG_SMP=n case but also the case where NR_CPUS isn't a multiple of the number of SMT threads. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-08powerpc: Fix compile errors in prom_init_check for gcc 4.5Stephen Rothwell
Just whitelist these extra compiler generated symbols. Fixes these errors: Error: External symbol '_restgpr0_14' referenced from prom_init.c Error: External symbol '_restgpr0_20' referenced from prom_init.c Error: External symbol '_restgpr0_22' referenced from prom_init.c Error: External symbol '_restgpr0_24' referenced from prom_init.c Error: External symbol '_restgpr0_25' referenced from prom_init.c Error: External symbol '_restgpr0_26' referenced from prom_init.c Error: External symbol '_restgpr0_27' referenced from prom_init.c Error: External symbol '_restgpr0_28' referenced from prom_init.c Error: External symbol '_restgpr0_29' referenced from prom_init.c Error: External symbol '_restgpr0_31' referenced from prom_init.c Error: External symbol '_savegpr0_14' referenced from prom_init.c Error: External symbol '_savegpr0_20' referenced from prom_init.c Error: External symbol '_savegpr0_22' referenced from prom_init.c Error: External symbol '_savegpr0_24' referenced from prom_init.c Error: External symbol '_savegpr0_25' referenced from prom_init.c Error: External symbol '_savegpr0_26' referenced from prom_init.c Error: External symbol '_savegpr0_27' referenced from prom_init.c Error: External symbol '_savegpr0_28' referenced from prom_init.c Error: External symbol '_savegpr0_29' referenced from prom_init.c Error: External symbol '_savegpr0_31' referenced from prom_init.c Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-08powerpc/perf_event: Fix for power_pmu_disable()Matt Evans
When power_pmu_disable() removes the given event from a particular index into cpuhw->event[], it shuffles down higher event[] entries. But, this array is paired with cpuhw->events[] and cpuhw->flags[] so should shuffle them similarly. If these arrays get out of sync, code such as power_check_constraints() will fail. This caused a bug where events were temporarily disabled and then failed to be re-enabled; subsequent code tried to write_pmc() with its (disabled) idx of 0, causing a message "oops trying to write PMC0". This triggers this bug on POWER7, running a miss-heavy test: perf record -e L1-dcache-load-misses -e L1-dcache-store-misses ./misstest Signed-off-by: Matt Evans <matt@ozlabs.org> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-06-15powerpc: rtas_flash needs to use rtas_data_bufMilton Miller
When trying to flash a machine via the update_flash command, Anton received the following error: Restarting system. FLASH: kernel bug...flash list header addr above 4GB The code in question has a comment that the flash list should be in the kernel data and therefore under 4GB: /* NOTE: the "first" block list is a global var with no data * blocks in the kernel data segment. We do this because * we want to ensure this block_list addr is under 4GB. */ Unfortunately the Kconfig option is marked tristate which means the variable may not be in the kernel data and could be above 4GB. Instead of relying on the data segment being below 4GB, use the static data buffer allocated by the kernel for use by rtas. Since we don't use the header struct directly anymore, convert it to a simple pointer. Reported-By: Anton Blanchard <anton@samba.org> Signed-Off-By: Milton Miller <miltonm@bga.com Tested-By: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-06-15powerpc: Unconditionally enabled irq stacksChristoph Hellwig
Irq stacks provide an essential protection from stack overflows through external interrupts, at the cost of two additionals stacks per CPU. Enable them unconditionally to simplify the kernel build and prevent people from accidentally disabling them. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-06-15powerpc/kexec: Wait for online/possible CPUs only.Matt Evans
kexec_perpare_cpus_wait() iterates i through NR_CPUS to check paca[i].kexec_state of each to make sure they have quiesced. However now we have dynamic PACA allocation, paca[NR_CPUS] is not necessarily valid and we overrun the array; spurious "cpu is not possible, ignoring" errors result. This patch iterates for_each_online_cpu so stays within the bounds of paca[] -- and every CPU is now 'possible'. Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-06-11Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: clear bridge resource range if BIOS assigned bad one PCI: hotplug/cpqphp, fix NULL dereference Revert "PCI: create function symlinks in /sys/bus/pci/slots/N/" PCI: change resource collision messages from KERN_ERR to KERN_INFO
2010-06-11PCI: clear bridge resource range if BIOS assigned bad oneYinghai Lu
Yannick found that video does not work with 2.6.34. The cause of this bug was that the BIOS had assigned the wrong range to the PCI bridge above the video device. Before 2.6.34 the kernel would have shrunk the size of the bridge window, but since d65245c PCI: don't shrink bridge resources the kernel will avoid shrinking BIOS ranges. So zero out the old range if we fail to claim it at boot time; this will cause us to allocate a new range at startup, restoring the 2.6.34 behavior. Fixes regression https://bugzilla.kernel.org/show_bug.cgi?id=16009. Reported-by: Yannick <yannick.roehlly@free.fr> Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-06-02powerpc/kprobes: Remove resume_execution() in kprobesAnanth N Mavinakayanahalli
emulate_step() in kprobe_handler() would've already determined if the probed instruction can be emulated. We single-step in hardware only if the instruction couldn't be emulated. resume_execution() therefore is superfluous -- all we need is to fix up the instruction pointer after single-stepping. Thanks to Paul Mackerras for catching this. Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-06-01Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc: Don't export cvt_fd & _df when CONFIG_PPC_FPU is not set powerpc/44x: icon: select SM502 and frame buffer console support powerpc/85xx: Add P1021MDS board support powerpc/85xx: Change MPC8572DS camp dtses for MSI sharing powerpc/fsl_msi: add removal path and probe failing path powerpc/fsl_msi: enable msi sharing through AMP OSes powerpc/fsl_msi: enable msi allocation in all banks powerpc/fsl_msi: fix the conflict of virt_msir's chip_data powerpc/fsl_msi: Add multiple MSI bank support powerpc/kexec: Add support for FSL-BookE powerpc/fsl-booke: Move the entry setup code into a seperate file powerpc/fsl-booke: fix the case where we are not in the first page powerpc/85xx: Enable support for ports 3 and 4 on 8548 CDS powerpc/fsl-booke: Add hibernation support for FSL BookE processors powerpc/e500mc: Implement machine check handler. powerpc/44x: Add basic ICON PPC440SPe board support powerpc/44x: Fix UART clocks on 440SPe powerpc/44x: Add reset-type to katmai.dts powerpc/44x: Adding PCI-E support for PowerPC 460SX based SOC.
2010-06-01Merge branch 'for-35' of git://repo.or.cz/linux-kbuildLinus Torvalds
* 'for-35' of git://repo.or.cz/linux-kbuild: (81 commits) kbuild: Revert part of e8d400a to resolve a conflict kbuild: Fix checking of scm-identifier variable gconfig: add support to show hidden options that have prompts menuconfig: add support to show hidden options which have prompts gconfig: remove show_debug option gconfig: remove dbg_print_ptype() and dbg_print_stype() kconfig: fix zconfdump() kconfig: some small fixes add random binaries to .gitignore kbuild: Include gen_initramfs_list.sh and the file list in the .d file kconfig: recalc symbol value before showing search results .gitignore: ignore *.lzo files headerdep: perlcritic warning scripts/Makefile.lib: Align the output of LZO kbuild: Generate modules.builtin in make modules_install Revert "kbuild: specify absolute paths for cscope" kbuild: Do not unnecessarily regenerate modules.builtin headers_install: use local file handles headers_check: fix perl warnings export_report: fix perl warnings ...
2010-05-31powerpc: Don't export cvt_fd & _df when CONFIG_PPC_FPU is not setBenjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-31Merge commit 'kumar/next' into nextBenjamin Herrenschmidt
Conflicts: arch/powerpc/sysdev/fsl_msi.c
2010-05-27powerpc: remove unnecessary sync_single_range_* in swiotlb_dma_opsFUJITA Tomonori
sync_single_range_for_cpu and sync_single_range_for_device hooks in swiotlb_dma_ops are unnecessary because sync_single_for_cpu and sync_single_for_device are used there. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Becky Bruce <beckyb@kernel.crashing.org> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-24powerpc/kexec: Add support for FSL-BookESebastian Andrzej Siewior
This adds support kexec on FSL-BookE where the MMU can not be simply switched off. The code borrows the initial MMU-setup code to create the identical mapping mapping. The only difference to the original boot code is the size of the mapping(s) and the executeable address. The kexec code maps the first 2 GiB of memory in 256 MiB steps. This should work also on e500v1 boxes. SMP support is still not available. (Kumar: Added minor change to build to ifdef CONFIG_PPC_STD_MMU_64 some code that was PPC64 specific) Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-05-24powerpc/fsl-booke: Move the entry setup code into a seperate fileSebastian Andrzej Siewior
This patch only moves the initial entry code which setups the mapping from what ever to KERNELBASE into a seperate file. No code change has been made here. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-05-24powerpc/fsl-booke: fix the case where we are not in the first pageSebastian Andrzej Siewior
During boot we change the mapping a few times until we have a "defined" mapping. During this procedure a small 4KiB mapping is created and after that one a 64MiB. Currently the offset of the 4KiB page in that we run is zero because the complete startup up code is in first page which starts at RPN zero. If the code is recycled and moved to another location then its execution will fail because the start address in 64 MiB mapping is computed wrongly. It does not consider the offset to the page from the begin of the memory. This patch fixes this. Usually (system boot) r25 is zero so this does not change anything unless the code is recycled. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-05-22Merge remote branch 'origin' into secretlab/next-devicetreeGrant Likely
Merging in current state of Linus' tree to deal with merge conflicts and build failures in vio.c after merge. Conflicts: drivers/i2c/busses/i2c-cpm.c drivers/i2c/busses/i2c-mpc.c drivers/net/gianfar.c Also fixed up one line in arch/powerpc/kernel/vio.c to use the correct node pointer. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-05-22of: Remove duplicate fields from of_platform_driverGrant Likely
.name, .match_table and .owner are duplicated in both of_platform_driver and device_driver. This patch is a removes the extra copies from struct of_platform_driver and converts all users to the device_driver members. This patch is a pretty mechanical change. The usage model doesn't change and if any drivers have been missed, or if anything has been fixed up incorrectly, then it will fail with a compile time error, and the fixup will be trivial. This patch looks big and scary because it touches so many files, but it should be pretty safe. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Sean MacLennan <smaclennan@pikatech.com>
2010-05-22drivercore: Add of_match_table to the common device driversGrant Likely
OF-style matching can be available to any device, on any type of bus. This patch allows any driver to provide an OF match table when CONFIG_OF is enabled so that drivers can be bound against devices described in the device tree. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-22arch/powerpc: Move dma_mask from of_device into pdev_archdataGrant Likely
By moving dma_mask into pdev_archdata, and adding archdata to struct of_device, it makes it possible to substitute of_device with struct platform_device, which is a stepping stone to removing the of_platform bus entirely. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-05-21Merge branch 'kvm-updates/2.6.35' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
* 'kvm-updates/2.6.35' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (269 commits) KVM: x86: Add missing locking to arch specific vcpu ioctls KVM: PPC: Add missing vcpu_load()/vcpu_put() in vcpu ioctls KVM: MMU: Segregate shadow pages with different cr0.wp KVM: x86: Check LMA bit before set_efer KVM: Don't allow lmsw to clear cr0.pe KVM: Add cpuid.txt file KVM: x86: Tell the guest we'll warn it about tsc stability x86, paravirt: don't compute pvclock adjustments if we trust the tsc x86: KVM guest: Try using new kvm clock msrs KVM: x86: export paravirtual cpuid flags in KVM_GET_SUPPORTED_CPUID KVM: x86: add new KVMCLOCK cpuid feature KVM: x86: change msr numbers for kvmclock x86, paravirt: Add a global synchronization point for pvclock x86, paravirt: Enable pvclock flags in vcpu_time_info structure KVM: x86: Inject #GP with the right rip on efer writes KVM: SVM: Don't allow nested guest to VMMCALL into host KVM: x86: Fix exception reinjection forced to true KVM: Fix wallclock version writing race KVM: MMU: Don't read pdptrs with mmu spinlock held in mmu_alloc_roots KVM: VMX: enable VMXON check with SMX enabled (Intel TXT) ...
2010-05-21Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (92 commits) powerpc: Remove unused 'protect4gb' boot parameter powerpc: Build-in e1000e for pseries & ppc64_defconfig powerpc/pseries: Make request_ras_irqs() available to other pseries code powerpc/numa: Use ibm,architecture-vec-5 to detect form 1 affinity powerpc/numa: Set a smaller value for RECLAIM_DISTANCE to enable zone reclaim powerpc: Use smt_snooze_delay=-1 to always busy loop powerpc: Remove check of ibm,smt-snooze-delay OF property powerpc/kdump: Fix race in kdump shutdown powerpc/kexec: Fix race in kexec shutdown powerpc/kexec: Speedup kexec hash PTE tear down powerpc/pseries: Add hcall to read 4 ptes at a time in real mode powerpc: Use more accurate limit for first segment memory allocations powerpc/kdump: Use chip->shutdown to disable IRQs powerpc/kdump: CPUs assume the context of the oopsing CPU powerpc/crashdump: Do not fail on NULL pointer dereferencing powerpc/eeh: Fix oops when probing in early boot powerpc/pci: Check devices status property when scanning OF tree powerpc/vio: Switch VIO Bus PM to use generic helpers powerpc: Avoid bad relocations in iSeries code powerpc: Use common cpu_die (fixes SMP+SUSPEND build) ...
2010-05-21powerpc/fsl-booke: Add hibernation support for FSL BookE processorsAnton Vorontsov
This is started as swsusp_32.S modifications, but the amount of #ifdefs made the whole file horribly unreadable, so let's put the support into its own separate file. The code should be relatively easy to modify to support 44x BookEs as well, but since I don't have any 44x to test, let's confine the code to FSL BookE. (The only FSL-specific part so far is 'flush_dcache_L1'.) Signed-off-by: Anton Vorontsov <avorontsov@mvista.com> Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-05-21powerpc/e500mc: Implement machine check handler.Scott Wood
Most of the MSCR bit assigments are different in e500mc versus e500, and they are now write-one-to-clear. Some e500mc machine check conditions are made recoverable (as long as they aren't stuck on), most notably L1 instruction cache parity errors. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-05-21powerpc: Remove unused 'protect4gb' boot parameterFUJITA Tomonori
'protect4gb' boot parameter was introduced to avoid allocating dma space acrossing 4GB boundary in 2007 (the commit 569975591c5530fdc9c7a3c45122e5e46f075a74). In 2008, the IOMMU was fixed to use the boundary_mask parameter per device properly. So 'protect4gb' workaround was removed (the 383af9525bb27f927511874f6306247ec13f1c28). But somehow I messed the 'protect4gb' boot parameter that was used to enable the workaround. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21powerpc: Use smt_snooze_delay=-1 to always busy loopAnton Blanchard
Right now if we want to busy loop and not give up any time to the hypervisor we put a very large value into smt_snooze_delay. This is sometimes useful when running a single partition and you want to avoid any latencies due to the hypervisor or CPU power state transitions. While this works, it's a bit ugly - how big a number is enough now we have NO_HZ and can be idle for a very long time. The patch below makes smt_snooze_delay signed, and a negative value means loop forever: echo -1 > /sys/devices/system/cpu/cpu0/smt_snooze_delay This change shouldn't affect the existing userspace tools (eg ppc64_cpu), but I'm cc-ing Nathan just to be sure. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21powerpc: Remove check of ibm,smt-snooze-delay OF propertyAnton Blanchard
I'm not sure why we have code for parsing an ibm,smt-snooze-delay OF property. Since we have a smt-snooze-delay= boot option and we can also set it at runtime via sysfs, it should be safe to get rid of this code. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21powerpc/kdump: Fix race in kdump shutdownMichael Neuling
When we are crashing, the crashing/primary CPU IPIs the secondaries to turn off IRQs, go into real mode and wait in kexec_wait. While this is happening, the primary tears down all the MMU maps. Unfortunately the primary doesn't check to make sure the secondaries have entered real mode before doing this. On PHYP machines, the secondaries can take a long time shutting down the IRQ controller as RTAS calls are need. These RTAS calls need to be serialised which resilts in the secondaries contending in lock_rtas() and hence taking a long time to shut down. We've hit this on large POWER7 machines, where some secondaries are still waiting in lock_rtas(), when the primary tears down the HPTEs. This patch makes sure all secondaries are in real mode before the primary tears down the MMU. It uses the new kexec_state entry in the paca. It times out if the secondaries don't reach real mode after 10sec. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21powerpc/kexec: Fix race in kexec shutdownMichael Neuling
In kexec_prepare_cpus, the primary CPU IPIs the secondary CPUs to kexec_smp_down(). kexec_smp_down() calls kexec_smp_wait() which sets the hw_cpu_id() to -1. The primary does this while leaving IRQs on which means the primary can take a timer interrupt which can lead to the IPIing one of the secondary CPUs (say, for a scheduler re-balance) but since the secondary CPU now has a hw_cpu_id = -1, we IPI CPU -1... Kaboom! We are hitting this case regularly on POWER7 machines. There is also a second race, where the primary will tear down the MMU mappings before knowing the secondaries have entered real mode. Also, the secondaries are clearing out any pending IPIs before guaranteeing that no more will be received. This changes kexec_prepare_cpus() so that we turn off IRQs in the primary CPU much earlier. It adds a paca flag to say that the secondaries have entered the kexec_smp_down() IPI and turned off IRQs, rather than overloading hw_cpu_id with -1. This new paca flag is again used to in indicate when the secondaries has entered real mode. It also ensures that all CPUs have their IRQs off before we clear out any pending IPI requests (in kexec_cpu_down()) to ensure there are no trailing IPIs left unacknowledged. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21powerpc: Use more accurate limit for first segment memory allocationsAnton Blanchard
Author: Milton Miller <miltonm@bga.com> On large machines we are running out of room below 256MB. In some cases we only need to ensure the allocation is in the first segment, which may be 256MB or 1TB. Add slb0_limit and use it to specify the upper limit for the irqstack and emergency stacks. On a large ppc64 box, this fixes a panic at boot when the crashkernel= option is specified (previously we would run out of memory below 256MB). Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21powerpc/kdump: Use chip->shutdown to disable IRQsAnton Blanchard
I saw this in a kdump kernel: IOMMU table initialized, virtual merging enabled Interrupt 155954 (real) is invalid, disabling it. Interrupt 155953 (real) is invalid, disabling it. ie we took some spurious interrupts. default_machine_crash_shutdown tries to disable all interrupt sources but uses chip->disable which maps to the default action of: static void default_disable(unsigned int irq) { } If we use chip->shutdown, then we actually mask the IRQ: static void default_shutdown(unsigned int irq) { struct irq_desc *desc = irq_to_desc(irq); desc->chip->mask(irq); desc->status |= IRQ_MASKED; } Not sure why we don't implement a ->disable action for xics.c, or why default_disable doesn't mask the interrupt. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21powerpc/kdump: CPUs assume the context of the oopsing CPUAnton Blanchard
We wrap the crash_shutdown_handles[] calls with longjmp/setjmp, so if any of them fault we can recover. The problem is we add a hook to the debugger fault handler hook which calls longjmp unconditionally. This first part of kdump is run before we marshall the other CPUs, so there is a very good chance some CPU on the box is going to page fault. And when it does it hits the longjmp code and assumes the context of the oopsing CPU. The machine gets very confused when it has 10 CPUs all with the same stack, all thinking they have the same CPU id. I get even more confused trying to debug it. The patch below adds crash_shutdown_cpu and uses it to specify which cpu is in the protected region. Since it can only be -1 or the oopsing CPU, we don't need to use memory barriers since it is only valid on the local CPU - no other CPU will ever see a value that matches it's local CPU id. Eventually we should switch the order and marshall all CPUs before doing the crash_shutdown_handles[] calls, but that is a bigger fix. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21powerpc/crashdump: Do not fail on NULL pointer dereferencingMaxim Uvarov
Signed-off-by: Maxim Uvarov <muvarov@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21powerpc/pci: Check devices status property when scanning OF treeSonny Rao
We ran into an issue where it looks like we're not properly ignoring a pci device with a non-good status property when we walk the device tree and instanciate the Linux side PCI devices. However, the EEH init code does look for the property and disables EEH on these devices. This leaves us in an inconsistent where we are poking at a supposedly bad piece of hardware and RTAS will block our config cycles because EEH isn't enabled anyway. Signed-of-by: Sonny Rao <sonnyrao@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21powerpc/vio: Switch VIO Bus PM to use generic helpersBrian King
Switch to use the generic power management helpers. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21powerpc: Use common cpu_die (fixes SMP+SUSPEND build)Milton Miller
Configuring a powerpc 32 bit kernel for both SMP and SUSPEND turns on CPU_HOTPLUG to enable disable_nonboot_cpus to be called by the common suspend code. Previously the definition of cpu_die for ppc32 was in the powermac platform code, causing it to be undefined if that platform as not selected. arch/powerpc/kernel/built-in.o: In function 'cpu_idle': arch/powerpc/kernel/idle.c:98: undefined reference to 'cpu_die' Move the code from setup_64 to smp.c and rename the power mac versions to their specific names. Note that this does not setup the cpu_die pointers in either smp_ops (request a given cpu die) or ppc_md (make this cpu die), for other platforms but there are generic versions in smp.c. Reported-by: Matt Sealey <matt@genesi-usa.com> Reported-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Anton Vorontsov <avorontsov@mvista.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21powerpc/rtasd: Don't start event scan if scan rate is zeroMichael Ellerman
There appear to be Pegasos systems which have the rtas-event-scan RTAS tokens, but on which the event scan always fails. They also have an event-scan-rate property containing 0, which means call event scan 0 times per minute. So interpret a scan rate of 0 to mean don't scan at all. This fixes the problem on the Pegasos machines and makes sense as well. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-20powerpc,kgdb: Introduce low level trap catchingJason Wessel
The only way the debugger can handle a trap in inside rcu_lock, notify_die, or atomic_notifier_call_chain without a recursive fault is to allow the kernel debugger to handle the exception first in program_check_exception(). The other change here is to make sure that kgdb_handle_exception() is called with correct parameters when catching an oops, because kdb needs to know if the entry was an oops, single step, or breakpoint exception. [benh@kernel.crashing.org: move debugger_bpt instead of #ifdef] CC: Paul Mackerras <paulus@samba.org> Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-20kgdb: core changes to support kdbJason Wessel
These are the minimum changes to the kgdb core in order to enable an API to connect a new front end (kdb) to the debug core. This patch introduces the dbg_kdb_mode variable controls where the user level I/O is routed. It will be routed to the gdbstub (kgdb) or to the kdb front end which is a simple shell available over the kgdboc connection. You can switch back and forth between kdb or the gdb stub mode of operation dynamically. From gdb stub mode you can blindly type "$3#33", or from the kdb mode you can enter "kgdb" to switch to the gdb stub. The logic in the debug core depends on kdb to look for the typical gdb connection sequences and return immediately with KGDB_PASS_EVENT if a gdb serial command sequence is detected. That should allow a reasonably seamless transition between kdb -> gdb without leaving the kernel exception state. The two gdb serial queries that kdb is responsible for detecting are the "?" and "qSupported" packets. CC: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Acked-by: Martin Hicks <mort@sgi.com>
2010-05-18of: eliminate of_device->node and dev_archdata->{of,prom}_nodeGrant Likely
This patch eliminates the node pointer from struct of_device and the of_node (or prom_node) pointer from struct dev_archdata since the node pointer is now part of struct device proper when CONFIG_OF is set, and all users of the old pointer locations have already been converted over to use device->of_node. Also remove dev_archdata_{get,set}_node() as it is no longer used by anything. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-05-18of: Always use 'struct device.of_node' to get device node pointer.Grant Likely
The following structure elements duplicate the information in 'struct device.of_node' and so are being eliminated. This patch makes all readers of these elements use device.of_node instead. (struct of_device *)->node (struct dev_archdata *)->prom_node (sparc) (struct dev_archdata *)->of_node (powerpc & microblaze) Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-05-18Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (311 commits) perf tools: Add mode to build without newt support perf symbols: symbol inconsistency message should be done only at verbose=1 perf tui: Add explicit -lslang option perf options: Type check all the remaining OPT_ variants perf options: Type check OPT_BOOLEAN and fix the offenders perf options: Check v type in OPT_U?INTEGER perf options: Introduce OPT_UINTEGER perf tui: Add workaround for slang < 2.1.4 perf record: Fix bug mismatch with -c option definition perf options: Introduce OPT_U64 perf tui: Add help window to show key associations perf tui: Make <- exit menus too perf newt: Add single key shortcuts for zoom into DSO and threads perf newt: Exit browser unconditionally when CTRL+C, q or Q is pressed perf newt: Fix the 'A'/'a' shortcut for annotate perf newt: Make <- exit the ui_browser x86, perf: P4 PMU - fix counters management logic perf newt: Make <- zoom out filters perf report: Report number of events, not samples perf hist: Clarify events_stats fields usage ... Fix up trivial conflicts in kernel/fork.c and tools/perf/builtin-record.c