Age | Commit message (Collapse) | Author |
|
The patch implements the EEH operation backend restore_config()
for PowerNV platform. That relies on OPAL API opal_pci_reinit()
where we reinitialize the error reporting properly after PE or
PHB reset.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
After reset on the specific PE or PHB, we never configure AER
correctly on PowerNV platform. We needn't care it on pSeries
platform. The patch introduces additional EEH operation eeh_ops::
restore_config() so that we have chance to configure AER correctly
for PowerNV platform.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
We don't have IO ports on PHB3 and the assignment of variable
"iomap_off" on PHB3 is meaningless. The patch just removes the
unnecessary assignment to the variable. The code change should
have been part of commit c35d2a8c ("powerpc/powernv: Needn't IO
segment map for PHB3").
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
After reverting 25ebc45b93452d0bc60271f178237123c4b26808
("powerpc/pseries/iommu: remove default window before attempting DDW
manipulation"), we no longer remove the base window in enable_ddw.
Therefore, we no longer need to reset the DMA window state in
find_existing_ddw_windows(). We can instead go back to what was done
before, which simply reuses the previous configuration, if any. Further,
this removes the final caller of the reset-pe-dma-windows call, so
remove those functions.
This fixes an EEH on kdump with the ipr driver. The EEH occurs, because
the initcall removes the DDW configuration (64-bit DMA window), but
doesn't ensure the ops are via the IOMMU -- a DMA operation occurs
during probe (still investigating this) and we EEH.
This reverts commit 14b6f00f8a4fdec5ccd45a0710284de301a61628.
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
manipulation"
Ben rightfully pointed out that there is a race in the "newer" DDW code.
Presuming we are running on recent enough firmware that supports the
"reset" DDW manipulation call, we currently always remove the base
32-bit DMA window in order to maximize the resources for Phyp when
creating the 64-bit window. However, this can be problematic for the
case where multiple functions are in the same PE (partitionable
endpoint), where some funtions might be 32-bit DMA only. All of a
sudden, the only functional DMA window for such functions is gone. We
will have serious errors in such situations. The best solution is simply
to revert the extension to the DDW code where we ever remove the base
DMA window.
This reverts commit 25ebc45b93452d0bc60271f178237123c4b26808.
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
None of these files are actually using any __init type directives
and hence don't need to include <linux/init.h>. Most are just a
left over from __devinit and __cpuinit removal, or simply due to
code getting copied from one driver to the next.
The one instance where we add an include for init.h covers off
a case where that file was implicitly getting it from another
header which itself didn't need it.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
improve the common clock support code for MPC512x
- expand the CCM register set declaration with MPC5125 related registers
(which reside in the previously "reserved" area)
- tell the MPC5121, MPC5123, and MPC5125 SoC variants apart, and derive
the availability of components and their clocks from the detected SoC
(MBX, AXE, VIU, SPDIF, PATA, SATA, PCI, second FEC, second SDHC,
number of PSC components, type of NAND flash controller,
interpretation of the CPMF bitfield, PSC/CAN mux0 stage input clocks,
output clocks on SoC pins)
- add backwards compatibility (allow operation against a device tree
which lacks clock related specs) for MPC5125 FECs, too
telling SoC variants apart and adjusting the clock tree's generation
occurs at runtime, a common generic binary supports all of the chips
the MPC5125 approach to the NFC clock (one register with two counters
for the high and low periods of the clock) is not implemented, as there
are no users and there is no common implementation which supports this
kind of clock -- the new implementation would be unused and could not
get verified, so it shall wait until there is demand
Signed-off-by: Gerhard Sittig <gsi@denx.de>
Acked-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
|
|
the SDHC clock is derived from CSB with a fractional divider which can
address "quarters"; the implementation multiplies CSB by 4 and divides
it by the (integer) divider value
a bug in the clock domain synchronisation requires that only even
divider values get setup; we achieve this by
- multiplying CSB by 2 only instead of 4
- registering with CCF the divider's bit field without bit0
- the divider's lowest bit remains clear as this is the reset value
and later operations won't touch it
this change keeps fully utilizing common clock primitives (needs no
additional support logic, and avoids an excessive divider table) and
satisfies the hardware's constraint of only supporting even divider
values
Signed-off-by: Gerhard Sittig <gsi@denx.de>
Acked-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
|
|
adjust (expand on or move) a few comments,
add markers for easier navigation around helpers
Signed-off-by: Gerhard Sittig <gsi@denx.de>
Acked-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
|
|
this change removes workarounds which have become obsolete after
migration to common clock support has completed
- remove clkdev registration calls (compatibility clock item aliases)
after all peripheral drivers were adjusted for device tree based
clock lookup
- remove pre-enable workarounds after all peripheral drivers were
adjusted to acquire their respective clock items
workarounds for these clock items get removed: FEC (ethernet), I2C,
PSC (UART, SPI), PSC FIFO, USB, NFC (NAND flash), VIU (video capture),
BDLC (CAN), CAN MCLK, DIU (video output)
these clkdev registered names won't be provided any longer by the
MPC512x platform's clock driver: "psc%d_mclk", "mscan%d_mclk",
"usb%d_clk", "nfc_clk", "viu_clk", "sys_clk", "ref_clk"
the pre-enable workaround for PCI remains, but depends on the presence
of PCI related device tree nodes (disables the PCI clock in the absence
of PCI nodes, keeps the PCI clock enabled in the presence of nodes) --
moving clock acquisition into the peripheral driver isn't possible for
PCI because its initialization takes place before the platform clock
driver gets initialized, thus the clock provider isn't available then
Cc: Mike Turquette <mturquette@linaro.org>
Cc: Anatolij Gustschin <agust@denx.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Gerhard Sittig <gsi@denx.de>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
|
|
adapt the DIU clock initialization to the COMMON_CLK approach:
device tree based clock lookup, prepare and unprepare for clocks,
work with frequencies not dividers, call the appropriate clk_*()
routines and don't access CCM registers
the "best clock" determination now completely relies on the
platform's clock driver to pick a frequency close to what the
caller requests, and merely checks whether the desired frequency
was met (fits the tolerance of the monitor)
this approach shall succeed upon first try in the usual case,
will test a few less desirable yet acceptable frequencies in
edge cases, and will fallback to "best effort" if none of the
previously tried frequencies pass the test
provide a fallback clock lookup approach in case the OF based clock
lookup for the DIU fails, this allows for successful operation in
the presence of an outdated device tree which lacks clock specs
Cc: Anatolij Gustschin <agust@denx.de>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Gerhard Sittig <gsi@denx.de>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
|
|
the setup before the change was
- arch/powerpc/Kconfig had the PPC_CLOCK option, off by default
- depending on the PPC_CLOCK option the arch/powerpc/kernel/clock.c file
was built, which implements the clk.h API but always returns -ENOSYS
unless a platform registers specific callbacks
- the MPC52xx platform selected PPC_CLOCK but did not register any
callbacks, thus all clk.h API calls keep resulting in -ENOSYS errors
(which is OK, all peripheral drivers deal with the situation)
- the MPC512x platform selected PPC_CLOCK and registered specific
callbacks implemented in arch/powerpc/platforms/512x/clock.c, thus
provided real support for the clock API
- no other powerpc platform did select PPC_CLOCK
the situation after the change is
- the MPC512x platform implements the COMMON_CLK interface, and thus the
PPC_CLOCK approach in arch/powerpc/platforms/512x/clock.c has become
obsolete
- the MPC52xx platform still lacks genuine support for the clk.h API
while this is not a change against the previous situation (the error
code returned from COMMON_CLK stubs differs but every call still
results in an error)
- with all references gone, the arch/powerpc/kernel/clock.c wrapper and
the PPC_CLOCK option have become obsolete, as did the clk_interface.h
header file
the switch from PPC_CLOCK to COMMON_CLK is done for all platforms within
the same commit such that multiplatform kernels (the combination of 512x
and 52xx within one executable) keep working
Cc: Mike Turquette <mturquette@linaro.org>
Cc: Anatolij Gustschin <agust@denx.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Gerhard Sittig <gsi@denx.de>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
|
|
extend the recently added COMMON_CLK platform support for MPC512x such
that it works with incomplete device tree data which lacks clock specs
Cc: Mike Turquette <mturquette@linaro.org>
Cc: Anatolij Gustschin <agust@denx.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Gerhard Sittig <gsi@denx.de>
[agust@denx.de: moved node macro definitions out of the function body]
Signed-off-by: Anatolij Gustschin <agust@denx.de>
|
|
this change implements a clock driver for the MPC512x PowerPC platform
which follows the COMMON_CLK approach and uses common clock drivers
shared with other platforms
this driver implements the publicly announced set of clocks (those
listed in the dt-bindings header file), as well as generates additional
'struct clk' items where the SoC hardware cannot easily get mapped to
the common primitives (shared code) of the clock API, or requires
"intermediate clock nodes" to represent clocks that have both gates and
dividers
the previous PPC_CLOCK implementation is kept in place and remains
active for the moment, the newly introduced CCF clock driver will
receive additional support for backwards compatibility in a subsequent
patch before it gets enabled and will replace the PPC_CLOCK approach
some of the clock items get pre-enabled in the clock driver to not have
them automatically disabled by the underlying clock subsystem because of
their being unused -- this approach is desirable because
- some of the clocks are useful to have for diagnostics and information
despite their not getting claimed by any drivers (CPU, internal and
external RAM, internal busses, boot media)
- some of the clocks aren't claimed by their peripheral drivers yet,
either because of missing driver support or because device tree specs
aren't available yet (but the workarounds will get removed as the
drivers get adjusted and the device tree provides the clock specs)
clkdev registration provides "alias names" for few clock items
- to not break those peripheral drivers which encode their component
index into the name that is used for clock lookup (UART, SPI, USB)
- to not break those drivers which use names for the clock lookup which
were encoded in the previous PPC_CLOCK implementation (NFC, VIU, CAN)
this workaround will get removed as these drivers get adjusted after
device tree based clock lookup has become available
the COMMON_CLK implementation copes with device trees which lack an
oscillator node (backwards compat), the REF clock is then derived from
the IPS bus frequency and multiplier values fetched from hardware
Cc: Mike Turquette <mturquette@linaro.org>
Cc: Anatolij Gustschin <agust@denx.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Gerhard Sittig <gsi@denx.de>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
|
|
It is now possible to use the common cpuidle_[un]register() routines
(instead of open-coding them) so do it.
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
pseries cpuidle driver sets dev->state_count to drv->state_count so
the default dev->state_count initialization in cpuidle_enable_device()
(called from cpuidle_register_device()) can be used instead.
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Add support for the Motorola/Emerson MVME5100 Single Board Computer.
The MVME5100 is a 6U form factor VME64 computer with:
- A single MPC7410 or MPC750 CPU
- A HAWK Processor Host Bridge (CPU to PCI) and
MultiProcessor Interrupt Controller (MPIC)
- Up to 500Mb of onboard memory
- A M48T37 Real Time Clock (RTC) and Non-Volatile Memory chip
- Two 16550 compatible UARTS
- Two Intel E100 Fast Ethernets
- Two PCI Mezzanine Card (PMC) Slots
- PPCBug Firmware
The HAWK PHB/MPIC is compatible with the MPC10x devices.
There is no onboard disk support. This is usually provided by installing a PMC
in first PMC slot.
This patch revives the board support, it was present in early 2.6
series kernels. The board support in those days was by Matt Porter of
MontaVista Software.
CSC Australia has around 31 of these boards in service. The kernel in use
for the boards is based on 2.6.31. The boards are operated without disks
from a file server.
This patch is based on linux-3.13-rc2 and has been boot tested.
Only boards with 512 Mb of memory are known to work.
Signed-off-by: Stephen Chivers <schivers@csc.com>
Tested-by: Alessio Igor Bogani <alessio.bogani@elettra.eu>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
TWR-P1025 Overview
-----------------
512Mbyte DDR3 (on board DDR)
64MB Nor Flash
eTSEC1: Connected to RGMII PHY AR8035
eTSEC3: Connected to RGMII PHY AR8035
Two USB2.0 Type A
One microSD Card slot
One mini-PCIe slot
One mini-USB TypeB dual UART
Signed-off-by: Michael Johnston <michael.johnston@freescale.com>
Signed-off-by: Xie Xiaobo <X.Xie@freescale.com>
[scottwood@freescale.com: use pr_info rather than KERN_INFO]
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
Define a QE init function in common file, and avoid
the same codes being duplicated in board files.
Signed-off-by: Xie Xiaobo <X.Xie@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
It makes no sense to initialize the mpic ipi for the SoC which has
doorbell support. So set the smp_85xx_ops.probe to NULL for this
case. Since the smp_85xx_ops.probe is also used in function
smp_85xx_setup_cpu() to check if we need to invoke
mpic_setup_this_cpu(), we introduce a new setup_cpu function
smp_85xx_basic_setup() to remove this dependency.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
This patch fixed several typo in printk from various
part of kernel source.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
Merge a pile of fixes that went into the "merge" branch (3.13-rc's) such
as Anton Little Endian fixes.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
This patch updates the generic iommu backend code to use the
it_page_shift field to determine the iommu page size instead of
using hardcoded values.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
This patch adds a it_page_shift field to struct iommu_table and
initiliases it to 4K for all platforms.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The powerpc iommu uses a hardcoded page size of 4K. This patch changes
the name of the IOMMU_PAGE_* macros to reflect the hardcoded values. A
future patch will use the existing names to support dynamic page
sizes.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
This removes the REDBOOT Kconfig parameter,
which was no longer used anywhere in the source code
and Makefiles.
Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Prevent ioda_eeh_hub_diag() from clobbering itself when called by supplying
a per-PHB buffer for P7IOC hub diagnostic data. Take care to inform OPAL of
the correct size for the buffer.
[Small style change to the use of sizeof -- BenH]
Signed-off-by: Brian W Hart <hartb@linux.vnet.ibm.com>
Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
PHB diagnostic buffer may be smaller than PAGE_SIZE, especially when
PAGE_SIZE > 4KB.
Signed-off-by: Brian W Hart <hartb@linux.vnet.ibm.com>
Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Correct spelling typo in various part of kernel
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
We are passing pointers to the firmware for reads, we need to properly
convert the result as OPAL is always BE.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
opal_xscom_read uses a pointer to return the data so we need
to byteswap it on LE builds.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The MSI code is miscalculating quotas in little endian mode.
Add required byteswaps to fix this.
Before we claimed a quota of 65536, after the patch we
see the correct value of 256.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
We need to byteswap ibm,pcie-link-speed-stats.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The NVRAM code has a number of endian issues. I noticed a very
confused error log count:
RTAS: 100663330 -------- RTAS event begin --------
100663330 == 0x06000022. 0x6 LE error logs and 0x22 BE error logs.
The pstore code has similar issues - if we write an oops in one
endian and attempt to read it in another we get junk.
Make both of these formats big endian, and byteswap as required.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Some obvious issues:
cat /proc/ppc64/lparcfg
...
partition_id=16777216
...
partition_potential_processors=268435456
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
I have recently found out that no iommu_groups could be found under
/sys/ on a P8. That prevents PCI passthrough from working.
During my investigation, I found out there seems to be a missing
iommu_register_group for PHB3. The following patch seems to fix the
problem. After applying it, I see iommu_groups under
/sys/kernel/iommu_groups/, and can also bind vfio-pci to an adapter,
which gives me a device at /dev/vfio/.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
infrastructure.
Get the memory errors reported by opal and plumb it into memory poison
infrastructure. This patch uses new messaging channel infrastructure to
pull the fsp memory errors to linux.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
We steal the _PAGE_COHERENCE bit and use that for indicating NUMA ptes.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Even though we have same value for linux PTE bits and hash PTE pits
use the hash pte bits wen updating hash pte
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The only external user of slb_shadow is the pseries lpar code, and it
can access through the paca array instead.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Currently we continue to poll get/set_rtc_time even when we know they
are not working.
This changes it so that if it fails at boot time we remove the ppc_md
get/set_rtc_time hooks so that we don't end up polling known broken
calls.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
When hitting frozen PE or fenced PHB, it's always indicative to
have dumped PHB diag-data for further analysis and diagnosis.
However, we never dump that for the cases. The patch intends to
dump PHB diag-data at the backend of eeh_ops::get_log() for PowerNV
platform.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Prior to the completion of PCI enumeration, we actively detects
EEH errors on PCI config cycles and dump PHB diag-data if necessary.
The EEH backend also dumps PHB diag-data in case of frozen PE or
fenced PHB. However, we are using different functions to dump the
PHB diag-data for those 2 cases.
The patch merges the functions for dumping PHB diag-data to one so
that we can avoid duplicate code. Also, we never dump PHB3 diag-data
during PCI config cycles with frozen PE. The patch fixes it as well.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The current implementation of IOMMU on sPAPR does not use iommu_ops
and therefore does not call IOMMU API's bus_set_iommu() which
1) sets iommu_ops for a bus
2) registers a bus notifier
Instead, PCI devices are added to IOMMU groups from
subsys_initcall_sync(tce_iommu_init) which does basically the same
thing without using iommu_ops callbacks.
However Freescale PAMU driver (https://lkml.org/lkml/2013/7/1/158)
implements iommu_ops and when tce_iommu_init is called, every PCI device
is already added to some group so there is a conflict.
This patch does 2 things:
1. removes the loop in which PCI devices were added to groups and
adds explicit iommu_add_device() calls to add devices as soon as they get
the iommu_table pointer assigned to them.
2. moves a bus notifier to powernv code in order to avoid conflict with
the notifier from Freescale driver.
iommu_add_device() and iommu_del_device() are public now.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Move SG list and entry structure to header file so that
it can be used in other places as well.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Opal now has a new messaging infrastructure to push the messages to
linux in a generic format for different type of messages using only one
event bit. The format of the opal message is as below:
struct opal_msg {
uint32_t msg_type;
uint32_t reserved;
uint64_t params[8];
};
This patch allows clients to subscribe for notification for specific
message type. It is upto the subscriber to decipher the messages who showed
interested in receiving specific message type.
The interface to subscribe for notification is:
int opal_message_notifier_register(enum OpalMessageType msg_type,
struct notifier_block *nb)
The notifier will fetch the opal message when available and notify the
subscriber with message type and the opal message. It is subscribers
responsibility to copy the message data before returning from notifier
callback.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Add basic error handling in machine check exception handler.
- If MSR_RI isn't set, we can not recover.
- Check if disposition set to OpalMCE_DISPOSITION_RECOVERED.
- Check if address at fault is inside kernel address space, if not then send
SIGBUS to process if we hit exception when in userspace.
- If address at fault is not provided then and if we get a synchronous machine
check while in userspace then kill the task.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Now that we are ready to handle machine check directly in linux, do not
register with firmware to handle machine check exception.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
When machine check real mode handler can not continue into host kernel
in V mode, it returns from the interrupt and we loose MCE event which
never gets logged. In such a situation queue up the MCE event so that
we can log it later when we get back into host kernel with r1 pointing to
kernel stack e.g. during syscall exit.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Now that we handle machine check in linux, the MCE decoding should also
take place in linux host. This info is crucial to log before we go down
in case we can not handle the machine check errors. This patch decodes
and populates a machine check event which contain high level meaning full
MCE information.
We do this in real mode C code with ME bit on. The MCE information is still
available on emergency stack (in pt_regs structure format). Even if we take
another exception at this point the MCE early handler will allocate a new
stack frame on top of current one. So when we return back here we still have
our MCE information safe on current stack.
We use per cpu buffer to save high level MCE information. Each per cpu buffer
is an array of machine check event structure indexed by per cpu counter
mce_nest_count. The mce_nest_count is incremented every time we enter
machine check early handler in real mode to get the current free slot
(index = mce_nest_count - 1). The mce_nest_count is decremented once the
MCE info is consumed by virtual mode machine exception handler.
This patch provides save_mce_event(), get_mce_event() and release_mce_event()
generic routines that can be used by machine check handlers to populate and
retrieve the event. The routine release_mce_event() will free the event slot so
that it can be reused. Caller can invoke get_mce_event() with a release flag
either to release the event slot immediately OR keep it so that it can be
fetched again. The event slot can be also released anytime by invoking
release_mce_event().
This patch also updates kvm code to invoke get_mce_event to retrieve generic
mce event rather than paca->opal_mce_evt.
The KVM code always calls get_mce_event() with release flags set to false so
that event is available for linus host machine
If machine check occurs while we are in guest, KVM tries to handle the error.
If KVM is able to handle MC error successfully, it enters the guest and
delivers the machine check to guest. If KVM is not able to handle MC error, it
exists the guest and passes the control to linux host machine check handler
which then logs MC event and decides how to handle it in linux host. In failure
case, KVM needs to make sure that the MC event is available for linux host to
consume. Hence KVM always calls get_mce_event() with release flags set to false
and later it invokes release_mce_event() only if it succeeds to handle error.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|