summaryrefslogtreecommitdiffstats
path: root/drivers/ata/libata-eh.c
AgeCommit message (Collapse)Author
2007-10-29libata: track SLEEP state and issue SRST to wake it upTejun Heo
ATA devices in SLEEP mode don't respond to any commands. SRST is necessary to wake it up. Till now, when a command is issued to a device in SLEEP mode, the command times out, which makes EH reset the device and retry the command after that, causing a long delay. This patch makes libata track SLEEP state and issue SRST automatically if a command is about to be issued to a device in SLEEP. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Bruce Allen <ballen@gravity.phys.uwm.edu> Cc: Andrew Paprocki <andrew@ishiboo.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-25libata: cosmetic clean up in ata_eh_reset()Tejun Heo
Local variable @action usage in ata_eh_reset() is a bit confusing. It's used only to cache ehc->i.action to test reset masks after clearing it; however, due to the generic name "action", it's easy to misinterpret the local variable as containing the selected reset method later. Also, the reason for caching the original value is easy to miss. This patch renames @action to @tmp_action and make it buffer newly selected value instead to improve readability. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-23[libata] checkpatch-inspired cleanupsJeff Garzik
Tackle the relatively sane complaints of checkpatch --file. The vast majority is indentation and whitespace changes, the rest are * #include fixes * printk KERN_xxx prefix addition * BSS/initializer cleanups Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2007-10-12[libata] struct pci_dev related cleanupsJeff Garzik
* remove pointless pci_dev_to_dev() wrapper. Just directly reference the embedded struct device like everyone else does. * pata_cs5520: delete cs5520_remove_one(), it was a duplicate of ata_pci_remove_one() * linux/libata.h: don't bother including linux/pci.h, we don't need it. Simply declare 'struct pci_dev' and assume interested parties will include the header, as they should be doing anyway. * linux/libata.h: consolidate all CONFIG_PCI declarations into a single location in the header. Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2007-10-12libata: use ata_exec_internal() for PMP register accessTejun Heo
PMP registers used to be accessed with dedicated accessors ->pmp_read and ->pmp_write. During reset, those callbacks are called with the port frozen so they should be able to run without depending on interrupt delivery. To achieve this, they were implemented polling. However, as resetting the host port makes the PMP to isolate fan-out ports until SError.X is cleared, resetting fan-out ports while port is frozen doesn't buy much additional safety. This patch updates libata PMP support such that PMP registers are accessed using regular ata_exec_internal() mechanism and kills ->pmp_read/write() callbacks. The following changes are made. * PMP access helpers - sata_pmp_read_init_tf(), sata_pmp_read_val(), sata_pmp_write_init_tf() are folded into sata_pmp_read/write() which are now standalone PMP register access functions. * sata_pmp_read/write() returns err_mask instead of rc. This is consistent with other functions which issue internal commands and allows more detailed error reporting. * ahci interrupt handler is modified to ignore BAD_PMP and spurious/illegal completion IRQs while reset is in progress. These conditions are expected during reset. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata: implement ATA_PFLAG_RESETTINGTejun Heo
Implement ATA_PFLAG_RESETTING. This flag is set while reset is in progress. It's set before prereset is called and cleared after reset fails or postreset is finished. This flag itself doesn't have any function. It will be used by LLDs to tell whether reset is in progress if it needs to behave differently during reset. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata: add @timeout to ata_exec_internal[_sg]()Tejun Heo
Add @timeout argument to ata_exec_internal[_sg](). If 0, default timeout ata_probe_timeout is used. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata: wrap schedule_timeout_uninterruptible() in loopTejun Heo
Tasks in uninterruptible sleep might be woken up by unrelated events and should check whether the condition it was waiting for has actually triggered. Wrap schedule_timeout_uninterruptible() in loop to achieve it. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata: skip suppress reporting if ATA_EHI_QUIETTejun Heo
ATA_EHI_NO_AUTOPSY and ATA_EHI_QUIET are used during initial probing to skip exception analysis and reporting. Usually, there's nothing to report but on some allowed but rare corner cases (e.g. phy status changed interrupt when IRQ is enabled on frozen port - this happens if IRQ pending status isn't cleared in the IRQ router or controller) exception messages get printed. Skip reporting if ATA_EHI_QUIET is set. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata: add human-readable error value decodingRobert Hancock
This adds human-readable decoding of the ATA status and error registers (similar to what drivers/ide does) as well as the SATA Serror register to libata error handling output. This prevents the need to pore through standards documents to figure out the meaning of the bits in these registers when looking at error reports. Some bits that drivers/ide decoded are not decoded here, since the bits are either command-dependent or obsolete, and properly parsing them would add too much complexity. Signed-off-by: Robert Hancock <hancockr@shaw.ca> [edited slightly to make output a bit more symmetric] Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-pmp: hook PMP support and enable itTejun Heo
Hook PMP support into libata and enable it. Connect SCR and probing functions, and update ata_dev_classify() to detect PMP. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-pmp: update ata_eh_reset() for PMPTejun Heo
PMP always requires SRST to be enabled. Also, hardreset reports classification code from the first device when PMP is attached, not from the PMP. Update ata_eh_reset() such that followup softreset is performed if the controller is PMP capable and the host link is being reset. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-pmp-prep: implement sata_async_notification()Tejun Heo
AN serves multiple purposes. For ATAPI, it's used for media change notification. For PMP, for downstream PHY status change notification. Implement sata_async_notification() which demultiplexes AN. To avoid unnecessary port events, ATAPI AN is not enabled if PMP is attached but SNTF is not available. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Kriten Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-pmp-prep: implement EH fast-fail pathTejun Heo
If PMP itself becomes inaccessible while trying to link a downstream link, spending time to recover the downstream link doesn't make any sense. Make EH skip retry and fail fast if -ERESTART is received. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-pmp-prep: implement ATA_LFLAG_DISABLEDTejun Heo
Implement ATA_LFLAG_DISABLED. The flag indicates the link is disabled due to EH recovery failure. While a link is disabled, no EH action is taken on the link and suspend/resume become noop too. This will be used by PMP links to manage failed links. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-pmp-prep: implement ATA_LFLAG_NO_RETRYTejun Heo
Some PMP links are connected to internal pseudo devices which may come and go depending on situation. There's no reason to try hard to recover them. ATA_LFLAG_NO_RETRY tells EH to not retry if the device attached to the link fails. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-pmp-prep: implement ATA_LFLAG_NO_SRST, ASSUME_ATA and ASSUME_SEMBTejun Heo
Some links on some PMPs locks up on SRST and/or report incorrect device signature. Implement ATA_LFLAG_NO_SRST, ASSUME_ATA and ASSUME_SEMB to handle these quirky links. NO_SRST makes EH avoid SRST. ASSUME_ATA and SEMB forces class code to ATA and SEMB_UNSUP respectively. Note that SEMB isn't currently supported yet so the _UNSUP variant is used. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-pmp-prep: implement qc_defer helpersTejun Heo
Implement ap->nr_active_links (the number of links with active qcs), ap->excl_link (pointer to link which can be used by ->qc_defer and is cleared when a qc with ATA_QCFLAG_CLEAR_EXCL completes), and ata_link_active(). These can be used by ->qc_defer() to implement proper command exclusion. This set of helpers seem enough for both sil24 (ATAPI exclusion needed) and cmd-switching PMP. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-pmp-prep: make a number of functions global to libataTejun Heo
Make a number of functions from libata-core.c and libata-eh.c global to libata (drivers/ata/libata.h). These will be used by PMP. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-pmp-prep: add @new_class to ata_dev_revalidate()Tejun Heo
Consider newly found class code while revalidating. PMP resetting always results in valid class code and issuing PMP commands to ATA/ATAPI device isn't very attractive. Add @new_class to ata_dev_revalidate() and check class code for revalidation. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata: move EH repeat reporting into ata_eh_report()Tejun Heo
EH is sometimes repeated without any error or action. For example, this happens when probing IDENTIFY fails because of a phantom device. In these cases, all the repeated EH does is making sure there is no unhandled error or pending action and return. This repeation is necessary to avoid losing any event which occurred while EH was in progress. Unfortunately, this dry run causes annonying "EH pending after completion" message. This patch moves the repeat reporting into ata_eh_report() such that it's more compact and skipped on dry runs. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Mikael Pettersson <mikep@it.uu.se> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata: implement and use ata_port_desc() to report port configurationTejun Heo
Currently, port configuration reporting has the following problems. * iomapped address is reported instead of raw address * report contains irrelevant fields or lacks necessary fields for non-SFF controllers. * host->irq/irq2 are there just for reporting and hacky. This patch implements and uses ata_port_desc() and ata_port_pbar_desc(). ata_port_desc() is almost identical to ata_ehi_push_desc() except that it takes @ap instead of @ehi, has no locking requirement, can only be used during host initialization and " " is used as separator instead of ", ". ata_port_pbar_desc() is a helper to ease reporting of a PCI BAR or an offsetted address into it. LLD pushes whatever description it wants using the above two functions. The accumulated description is printed on host registration after "[S/P]ATA max MAX_XFERMODE ". SFF init helpers and ata_host_activate() automatically add descriptions for addresses and irq respectively, so only LLDs which isn't standard SFF need to add custom descriptions. In many cases, such controllers need to report different things anyway. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-link: update EH to deal with PMP linksTejun Heo
Update ata_eh_autopsy(), ata_eh_report(), ata_eh_revalidate_and_attach() and ata_eh_recover() to deal with PMP links. ata_eh_autopsy() and ata_eh_report() updates are straightforward. They just repeat the same operation over all configured links. The only change to ata_eh_revalidate_and_attach() is avoiding calling ->cable_select() on non-host ports. ata_eh_recover() update is more complex as it first processes all resets and then performs the rest. This is necessary as thawing with some links in unknown state can be dangerous. ehi->action is cleared on successful recovery of a link to avoid repeating recovery due to failures in other links. ata_eh_recover() iterates over only PMP links if PMP is attached, and, on failure, the failing link is returned in @failed_link instead of disabling devices directly. These are to integrate ata_eh_recover() into PMP EH later. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-link: update ata_scsi_error() to handle PMP linksTejun Heo
Update ata_scsi_error() to handle PMP links. As error conditions can occur on both host and PMP links, __ata_port_for_each_link() is used. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-link: implement ata_link_abort()Tejun Heo
Implement ata_link_abort(). Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-link: linkify config/EH related functionsTejun Heo
Make the following functions deal with ata_link instead of ata_port. * ata_set_mode() * ata_eh_autopsy() and related functions * ata_eh_report() and related functions * suspend/resume related functions * ata_eh_recover() and related functions Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-link: linkify resetTejun Heo
Make reset methods and related functions deal with ata_link instead of ata_port. * ata_do_reset() * ata_eh_reset() * all prereset/reset/postreset methods and related functions This patch introduces no behavior change. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-link: linkify EH action helpersTejun Heo
Make ata_eh_about_to_do() and ata_eh_done() deal with ata_link instead of ata_port. This patch introduces no behavior change. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-link: linkify PHY-related functionsTejun Heo
Make the following PHY-related functions to deal with ata_link instead of ata_port. * sata_print_link_status() * sata_down_spd_limit() * ata_set_sata_spd_limit() and friends * sata_link_debounce/resume() * sata_scr_valid/read/write/write_flush() * ata_link_on/offline() This patch introduces no behavior change. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-link: implement and use link/device iteratorsTejun Heo
Multiple links and different number of devices per link should be considered to iterate over links and devices. This patch implements and uses link and device iterators - ata_port_for_each_link() and ata_link_for_each_dev() - and ata_link_max_devices(). This change makes a lot of functions iterate over only possible devices instead of from dev 0 to dev ATA_MAX_DEVICES. All such changes have been examined and nothing should be broken. While at it, add a separating comment before device helpers to distinguish them better from link helpers and others. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-12libata-link: introduce ata_linkTejun Heo
Introduce ata_link. It abstracts PHY and sits between ata_port and ata_device. This new level of abstraction is necessary to support SATA Port Multiplier, which basically adds a bunch of links (PHYs) to a ATA host port. Fields related to command execution, spd_limit and EH are per-link and thus moved to ata_link. This patch only defines the host link. Multiple link handling will be added later. Also, a lot of ap->link derefences are added but many of them will be removed as each part is converted to deal directly with ata_link instead of ata_port. This patch introduces no behavior change. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: James Bottomley <James.Bottomley@SteelEye.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-20libata: implement EH fast drainTejun Heo
In most cases, when EH is scheduled, all in-flight commands are aborted causing EH to kick in immediately. However, in some cases (especially with PMP), it's unclear which commands are affected by the error condition and although aborting all in-flight commands work, it isn't optimal and may cause unnecessary disruption. On the other hand, waiting for in-flight commands to drain themselves can take up to 30seconds. This patch implements EH fast drain to handle such situations. It gives in-flight commands some time to finish up but doesn't wait for too long. After EH is scheduled, fast drain timer is started and if no other completion occurs in ATA_EH_FASTDRAIN_INTERVAL all in-flight commands are aborted. If any completion occurred in the interval, the port is given another interval to finish up itself. Currently ATA_EH_FASTDRAIN_INTERVAL is 3 secs which should be enough for finishing up most commands. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-20libata: schedule probing after SError access failure during autopsyTejun Heo
If SError isn't accessible, EH can't tell whether hotplug has happened or not. Report SError read failure with AC_ERR_OTHER and schedule probing with hardreset. This will be mainly useful for PMPs. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-20libata: clear HOTPLUG flag after a resetTejun Heo
ATA_EHI_HOTPLUGGED is a hint for reset functions indicating the the port might have gone through hotplug/unplug just before entering EH. Reset functions modify their behaviors a bit to handle the situation better - e.g. using longer debouncing delay. Currently, once HOTPLUG is set, it isn't cleared till the end of EH. This is unnecessary and makes EH take longer. Clear the HOTPLUGGED flag after a reset try (successful or not). Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-20libata: quickly trigger SATA SPD down after debouncing failedTejun Heo
Debouncing failure is a good indicator of basic link problem. Use -EPIPE to indicate debouncing failure and make ata_eh_reset() invoke sata_down_spd_limit() if the error occurs during reset. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-20libata: improve SATA PHY speed down logicTejun Heo
sata_down_spd_limit() first reads the current SPD from SStatus and limit the speed to the lower one of one below the current limit or one below the current SPD in SStatus. SPD may not be accessible or valid when SPD down is requested making sata_down_spd_limit() fail when it's most needed. This patch makes the current SPD cached after each successful reset and forces GEN I speed (1.5Gbps) if neither of SStatus or the cached value is valid, so sata_down_spd_limit() is now guaranteed to lower the speed limit if lower speed is available. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-20libata: implement AC_ERR_NCQTejun Heo
When an NCQ command fails, all commands in flight are aborted and the offending one is reported using log page 10h. Depending on controller characteristics and LLD implementation, all commands may appear as having a device error due to shared TF status making it hard to determine what's actually going on. This patch adds AC_ERR_NCQ, marks the command reported by log page 10h with it and print extra "<F>" after the error report for the command to help distinguishing the offending command. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-20libata: improve EH report formattingTejun Heo
Requiring LLDs to format multiple error description messages properly doesn't work too well. Help LLDs a bit by making ata_ehi_push_desc() insert ", " on each invocation. __ata_ehi_push_desc() is the raw version without the automatic separator. While at it, make ehi_desc interface proper functions instead of macros. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-10libata-link: separate out ata_eh_handle_dev_fail()Tejun Heo
Separate out ata_eh_handle_dev_fail() from ata_eh_recover(). This is in preparation of ata_link and PMP support. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-09libata-acpi: implement _GTM/_STM supportTejun Heo
Implement _GTM/_STM support. acpi_gtm is added to ata_port which stores _GTM parameters over suspend/resume cycle. A new hook ata_acpi_on_suspend() is responsible for storing _GTM parameters during suspend. _STM is executed in ata_acpi_on_resume(). With this change, invoking _GTF is safe on IDE hierarchy and acpi_sata check before _GTF is removed. ata_acpi_gtm() and ata_acpi_stm() implementation is taken from Alan Cox's pata_acpi implementation. ata_acpi_gtm() is fixed such that the result parameter is not shifted by sizeof(union acpi_object). Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-07-09libata: reimplement ACPI invocationTejun Heo
This patch reimplements ACPI invocation such that, instead of exporting ACPI details to the rest of libata, ACPI event handlers - ata_acpi_on_resume() and ata_acpi_on_devcfg() - are used. These two functions are responsible for determining whether specific ACPI method is used and when. On resume, _GTF is scheduled by setting ATA_DFLAG_ACPI_PENDING device flag. This is done this way to avoid performing the action on wrong device device (device swapping while suspended). On every ata_dev_configure(), ata_acpi_on_devcfg() is called, which performs _SDD and _GTF. _GTF is performed only after resuming and, if SATA, hardreset as the ACPI spec specifies. As _GTF may contain arbitrary commands, IDENTIFY page is re-read after _GTF taskfiles are executed. If one of ACPI methods fails, ata_acpi_on_devcfg() retries on the first failure. If it fails again on the second try, ACPI is disabled on the device. Note that successful configuration clears ACPI failed status. With all feature checks moved to the above two functions, do_drive_set_taskfiles() is trivial and thus collapsed into ata_acpi_exec_tfs(), which is now static and converted to return the number of executed taskfiles to be used by ata_acpi_on_resume(). As failures are handled properly, ata_acpi_push_id() now returns -errno on errors instead of unconditional zero. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-06-27libata: fix infinite EH waiting bugTejun Heo
When EH gives up after repeated exceptions, it doesn't't clear the PENDING bit on exit which leaves PENDING bit set without EH actually scheduled. This makes ata_port_wait_eh() to wait forever makes rmmod hang on such port. Fix it by clearing the flag. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-06-27libata: remove unused variable from ata_eh_reset()Tejun Heo
Removed unused variable did_followup_srst from ata_eh_reset(). Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-06-27libata: kill non-sense warning messageTejun Heo
prereset() is now allowed to set flag for unsupported reset method. EH layer is responsible for selecting the fallback. Remove non-sense warning message. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-05-21libata: Trim trailing whitespaceJeff Garzik
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-05-11libata: give devices one last chance even if recovery failed with -EINVALTejun Heo
After certain errors, some devices report complete garbage on IDENTIFY. This can cause ata_dev_read_id() to fail with -EINVAL resulting in immediate disabling of the device. Give the device one last chance after -EINVAL to allow recovery from such situations. As -EINVAL is triggered very rarely, this shouldn't cause any noticeable affect on more common error paths. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Harald Dunkel <harald.dunkel@t-online.de> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-05-11libata: ignore EH scheduling during initializationTejun Heo
libata enables SCSI host during ATA host activation which happens after IRQ handler is registered and IRQ is enabled. All ATA ports are in frozen state when IRQ is enabled but frozen ports may raise limited number of IRQs after being frozen - IOW, ->freeze() is not responsible for clearing pending IRQs. During normal operation, the IRQ handler is responsible for clearing spurious IRQs on frozen ports and it usually doesn't require any extra code. Unfortunately, during host initialization, the IRQ handler can end up scheduling EH for a port whose SCSI host isn't initialized yet. This results in OOPS in the SCSI midlayer. This is relatively short window and scheduling EH for probing is the first thing libata does after initialization, so ignoring EH scheduling until initialization is complete solves the problem nicely. This problem was spotted by Berck E. Nash in the following thread. http://thread.gmane.org/gmane.linux.kernel/519412 Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Berck E. Nash <flyboy@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-05-11libata: reimplement suspend/resume support using sdev->manage_start_stopTejun Heo
Reimplement suspend/resume support using sdev->manage_start_stop. * Device suspend/resume is now SCSI layer's responsibility and the code is simplified a lot. * DPM is dropped. This also simplifies code a lot. Suspend/resume status is port-wide now. * ata_scsi_device_suspend/resume() and ata_dev_ready() removed. * Resume now has to wait for disk to spin up before proceeding. I couldn't find easy way out as libata is in EH waiting for the disk to be ready and sd is waiting for EH to complete to issue START_STOP. * sdev->manage_start_stop is set to 1 in ata_scsi_slave_config(). This fixes spindown on shutdown and suspend-to-disk. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-05-01libata: reimplement reset sequencingTejun Heo
libata previously depended upon waits in prereset to get resets after hotplug right for both spin up and device ready wait. This was necessary both for reliablity and speed as reset was likely to fail if initiated too early and each try usually took more than 30secs to fail. Previous patches fixed the reliability part by fixing status and SCR handling in resets. This patch remedies the speed part by improving reset sequencing. Prereset waiting timeout is adjusted to 10s because spinup wait is replaced by reset sequencing and !BSY wait is not as important as before. During boot or module loading where the drive is already fully spun up, !BSY wait succeeds immediately, so 10s should be enough in most cases. It matters after hotplugging or other error conditions, but in those cases, !BSY wait in prereset simply can't be relied upon due to the varied and weird behaviors ATA controllers and devices show. Reset is now driven by ata_eh_reset_timeouts[] table which contains timeouts for each reset try. The first reset can be softreset but the following ones are always hardreset if available. Each timeout defines deadline for the reset try. If a reset try fails, reset is retried with the next timeout till the end of the timeout table is reached. If a reset try fails before the timeout with error, libata waits till the deadline of the failed try before retrying. IOW, the timeout table defines timetable of reset tries such that the n'th try always begins at least after the sum of all previous timeouts has passed. The current timetable defines 4 tries and takes around 1 minute. @0 : First try. This should succeed most of the time during boot. @10 : 10s is enough to spin up most consumer harddrives. Give it another shot. @20 : 20s should spin up > 99% of working drives. This has 30s timeout for retarded devices needing long idleness post reset. @55 : Final try with 5s timeout just in case. The above timetable is trade off between not annoying the device too much with frequent resets and taking reasonable amount of time in most cases. Some controllers may do better with shorter timeouts while others may fare better with longer but we just can't rely upon LLD writers to test each controller with wide variety of devices using various scenarios. We need default behavior which reasonably fits most cases. I've tested the above timetable on a dozen SATA controllers and a few PATA controllers with about a dozen different drives from all major vendors and 4 different ODDs from three different vendors for both boot and hotplug (if available) cases. Boot probing is not affected unless the device is broken in which cases new code gives up on the port after a minute rather than five or nine minutes. When hotplugging, most devices get detected on the first or second try. Multi-platter drives with long spin up time which sometimes took > 40 secs with the original code, now usually comes up during the second try and at least right after the third try @20. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-05-01libata: add deadline support to prereset and reset methodsTejun Heo
Add @deadline to prereset and reset methods and make them honor it. ata_wait_ready() which directly takes @deadline is implemented to be used as the wait function. This patch is in preparation for EH timing improvements. * ata_wait_ready() never does busy sleep. It's only used from EH and no wait in EH is that urgent. This function also prints 'be patient' message automatically after 5 secs of waiting if more than 3 secs is remaining till deadline. * ata_bus_post_reset() now fails with error code if any of its wait fails. This is important because earlier reset tries will have shorter timeout than the spec requires. If a device fails to respond before the short timeout, reset should be retried with longer timeout rather than silently ignoring the device. There are three behavior differences. 1. Timeout is applied to both devices at once, not separately. This is more consistent with what the spec says. 2. When a device passes devchk but fails to become ready before deadline. Previouly, post_reset would just succeed and let device classification remove the device. New code fails the reset thus causing reset retry. After a few times, EH will give up disabling the port. 3. When slave device passes devchk but fails to become accessible (TF-wise) after reset. Original code disables dev1 after 30s timeout and continues as if the device doesn't exist, while the patched code fails reset. When this happens, new code fails reset on whole port rather than proceeding with only the primary device. If the failing device is suffering transient problems, new code retries reset which is a better behavior. If the failing device is actually broken, the net effect is identical to it, but not to the other device sharing the channel. In the previous code, reset would have succeeded after 30s thus detecting the working one. In the new code, reset fails and whole port gets disabled. IMO, it's a pathological case anyway (broken device sharing bus with working one) and doesn't really matter. * ata_bus_softreset() is changed to return error code from ata_bus_post_reset(). It used to return 0 unconditionally. * Spin up waiting is to be removed and not converted to honor deadline. * To be on the safe side, deadline is set to 40s for the time being. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>