summaryrefslogtreecommitdiffstats
path: root/drivers/edac
AgeCommit message (Collapse)Author
2010-10-21EDAC, MCE: Add support for F11h MCEsBorislav Petkov
F11h has almost the same MCE signatures as K8 except DRAM ECC and MC5 bank errors. Reuse functionality from the other families. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Enable MCE decoding on F14hBorislav Petkov
Now that all decoders have been taught about F14h, models < 0x10 MCEs, enable decoding on this family of CPUs. Also, issue a short informational message upon boot that MCE decoding gets enabled. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Fix FR MCEs decodingBorislav Petkov
Those are N/A on K8, so don't decode them there. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Complete NB MCE decodersBorislav Petkov
Add support for decoding F14h BU MCEs and improve decoding of the remaining families. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Warn about LS MCEs on F14hBorislav Petkov
F14h CPUs do not generate LS MCEs so exit early and warn the user in case this path is ever hit that something else might be going haywire. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Adjust IC decoders to F14hBorislav Petkov
Add support for IC MCEs for F14h CPUs. K8 and F10h are almost identical so use one function for both. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Adjust DC decoders to F14hBorislav Petkov
Add a per-family data cache decoders. Since there is a certain overlap between the different DC MCE signatures, reuse functionality between the families as far as possible. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Rename filesBorislav Petkov
Drop "edac_" string from the filenames since they're prefixed with edac/ in their pathname anyway. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Rework MCE injectionBorislav Petkov
Add sysfs injection facilities for testing of the MCE decoding code. Remove large parts of amd64_edac_dbg.c, as a result, which did only NB MCE injection anyway and the new injection code supports that functionality already. Add an injection module so that MCE decoding code in production kernels like those in RHEL and SLES can be tested. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC: Export edac sysfs class to users.Borislav Petkov
Move toplevel sysfs class to the stub and make it available to non-modularized code too. Add proper refcounting of its users and move the registration functionality into the reference counting routines. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Pass complete MCE info to decodersBorislav Petkov
... instead of the MCi_STATUS info only for improved handling of certain types of errors later. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Sanitize error codesBorislav Petkov
Clean up error codes names, shorten to mnemonics, add RRRR boundary checking. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Remove unused function parameterBorislav Petkov
Remove remains from previous functionality. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Add HW_ERR prefixBorislav Petkov
.. so that the user knows what she's looking at there in dmesg. Also, fix a minor cosmetic output inconsistency. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC: Fix error returnBorislav Petkov
We should return a negative value when we cannot get the toplevel edac sysfs class. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-18Update broken web addresses in the kernel.Justin P. Mattock
The patch below updates broken web addresses in the kernel Signed-off-by: Justin P. Mattock <justinmattock@gmail.com> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Finn Thain <fthain@telegraphics.com.au> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Matt Turner <mattst88@gmail.com> Cc: Dimitry Torokhov <dmitry.torokhov@gmail.com> Cc: Mike Frysinger <vapier.adi@gmail.com> Acked-by: Ben Pfaff <blp@cs.stanford.edu> Acked-by: Hans J. Koch <hjk@linutronix.de> Reviewed-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-10-01i7core_edac: fix panic in udimm sysfs attributes registrationMarcin Slusarz
Array of udimm sysfs attributes was not ended with NULL marker, leading to dereference of random memory. EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm0 EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm1 EDAC DEBUG: edac_create_mci_instance_attributes: edac_create_mci_instance_attributes() file udimm2 BUG: unable to handle kernel NULL pointer dereference at 00000000000001a4 IP: [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1 Pid: 1, comm: swapper Not tainted 2.6.36-rc3-nv+ #483 P6T SE/System Product Name RIP: 0010:[<ffffffff81330b36>] [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1 (...) Call Trace: [<ffffffff81330b86>] edac_create_mci_instance_attributes+0x198/0x1f1 [<ffffffff81330c9a>] edac_create_sysfs_mci_device+0xbb/0x2b2 [<ffffffff8132f533>] edac_mc_add_mc+0x46b/0x557 [<ffffffff81428901>] i7core_probe+0xccf/0xec0 RIP [<ffffffff81330b36>] edac_create_mci_instance_attributes+0x148/0x1f1 ---[ end trace 20de320855b81d78 ]--- Kernel panic - not syncing: Attempted to kill init! Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Doug Thompson <dougthompson@xmission.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-27amd64_edac: Fix driver module removalBorislav Petkov
f4347553b30ec66530bfe63c84530afea3803396 removed the edac polling mechanism in favor of using a notifier chain for conveying MCE information to edac. However, the module removal path didn't test whether the driver had setup the polling function workqueue at all and the rmmod process was hanging in the kernel at try_to_del_timer_sync() in the cancel_delayed_work() path, trying to cancel an uninitialized work struct. Fix that by adding a balancing check to the workqueue removal path. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-09-24i7300_edac: Properly initialize per-csrow memory sizeMauro Carvalho Chehab
Due to the current edac-core limits, we cannot represent a per-channel memory size, for FB-DIMM drivers. So, we need to sum-up all values for each slot, in order to properly represent the total amount of memory found by the i7300 driver. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-09-24V4L/DVB: i7300_edac: better initialize page countsMauro Carvalho Chehab
It is still somewhat fake, as the pages may not be on this exact order, and may even be used in mirror mode, but this is a best guess than the other random fake values. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-09-20x86, k8: Rename k8.[ch] to amd_nb.[ch] and CONFIG_K8_NB to CONFIG_AMD_NBAndreas Herrmann
The file names are somehow misleading as the code is not specific to AMD K8 CPUs anymore. The files accomodate code for other AMD CPU northbridges as well. Same is true for the config option which is valid for AMD CPU northbridges in general and not specific to K8. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100917160343.GD4958@loge.amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-09-17x86, k8-gart: Decouple handling of garts and northbridgesAndreas Herrmann
So far we only provide num_k8_northbridges. This is required in different areas (e.g. L3 cache index disable, GART). But not all AMD CPUs provide a GART. Thus it is useful to split off the GART handling from the generic caching of AMD northbridge misc devices. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100917160254.GC4958@loge.amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-08-30i7300-edac: CodingStyle cleanupMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: Improve commentsMauro Carvalho Chehab
This is basically a cleanup patch, improving the comments for each function. While here, do a few cleanups. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: Cleanup: reorganize the file contentsMauro Carvalho Chehab
This change should do no functional change. It just rearranges the contents of the c file, in order to make easier to understand and maintain it. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: Properly detect channel on CE errorsMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: enrich FBD error info for corrected errorsMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: enrich FBD error info for fatal errorsMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: pre-allocate a buffer used to prepare err messagesMauro Carvalho Chehab
Instead of dynamically allocating a buffer for it where needed, just allocate it once. As we'll use the same buffer also during fatal and non-fatal errors, is is very risky to dynamically allocate it during an error. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: Fix MTR x4/x8 detection logicMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: Make the debug messages coherent with the othersMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: Cleanup: remove get_error_info logicMauro Carvalho Chehab
As the error logic in this driver came from i5400 driver, it were using one function to get errors, and another to display. Let's make it simpler and avoid doing it into two steps. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: Add a code to cleanup error registersMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: Add support for reporting FBD errorsMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: Properly detect the type of error correctionMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: Detect if the device is on single modeMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: Adds detection for enhanced scrub mode on x8Mauro Carvalho Chehab
While here, do some cleanup by adding some macros to check for device features. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: Clear the error bit after readingMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: Add error detection code for global errorsMauro Carvalho Chehab
There's no mention at the datasheet about how to enable global error reporting. So, I'm assuming that those errors are always enabled. Maybe I'm plain wrong about that ;) Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: Better name PCI devicesMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: Add a FIXME note about the error correction typeMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: add global error registersMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: display info if ECC is enabled or notMauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-30i7300_edac: start a driver for i7300 chipset (Clarksboro)Mauro Carvalho Chehab
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-08-26amd64_edac: Do not report error overflow as a separate errorBorislav Petkov
When the Overflow MCi_STATUS bit is set, EDAC reports the lost error with a "no information available" message which often puzzles users parsing the dmesg. This doesn't make much sense since this error has been lost anyway so no need for reporting it separately. Thus, report the overflow bit setting in the MCE dump instead. While at it, remove reporting of MiscV and ErrorEnable (en) which are superfluous. Now it looks like this: [ 1501.650024] MC4_STATUS: Corrected error, other errors lost: yes, CPU context corrupt: no, CECC Error [ 1501.666887] Northbridge Error, node 2 Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-08-24MCE, AMD: Limit MCE decoding to current families for nowBorislav Petkov
Limit MCE error decoding to current and older families only (K8-F11h). Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-08-12Merge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6Linus Torvalds
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6: mmc_spi: Fix unterminated of_match_table of/sparc: fix build regression from of_device changes of/device: Replace struct of_device with struct platform_device
2010-08-11edac: mpc85xx: add support for new MPCxxx/Pxxxx EDAC controllersAnton Vorontsov
Simply add proper IDs into the device table. Signed-off-by: Anton Vorontsov <avorontsov@mvista.com> Cc: Scott Wood <scottwood@freescale.com> Cc: Peter Tyser <ptyser@xes-inc.com> Cc: Dave Jiang <djiang@mvista.com> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-11edac: i5400: improve handling of pci_enable_device() return valueKulikov Vasiliy
-EIO is not the only error code that pci_enable_device() may return, also the set of errors can be enhanced in future. We should compare return code with zero, not with concrete error value. Signed-off-by: Kulikov Vasiliy <segooon@gmail.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Jeff Roberson <jroberson@jroberson.net> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-11edac: i5000: improve handling of pci_enable_device() return valueKulikov Vasiliy
-EIO is not the only error code that pci_enable_device() may return, also the set of errors can be enhanced in future. We should compare return code with zero, not with concrete error value. Signed-off-by: Kulikov Vasiliy <segooon@gmail.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Jeff Roberson <jroberson@jroberson.net> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>