summaryrefslogtreecommitdiffstats
path: root/drivers
AgeCommit message (Collapse)Author
2011-05-24[SCSI] mpt2sas: Fix missing reference tag seed with Type 2 devicesMartin K. Petersen
Ensure that the initial reference tag is passed on to the HBA firmware for DIF Type 2 devices. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Kashyap Desai <Kashyap.Desai@lsi.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] sd: Unmap discard alignment needs to be converted to bytesMartin K. Petersen
The block layer discard alignment is reported in bytes, not in units of the logical block size. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] bfa: kdump fixJing Huang
Root cause: When kernel crashes, bfa IOC state machine and FW doesn't get a notification and hence are not cleanly shutdown. So registers holding driver/IOC state information are not reset back to valid disabled/parking values. This causes subsequent driver initialization to hang during kdump kernel boot. Fix description: during the initialization of first PCI function, reset corresponding register when unclean shutown is detect by reading chip registers. This will make sure that ioc/fw gets clean re-initialization. Signed-off-by: Jing Huang <huangj@brocade.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] ipr: fix possible false positive detection of stuck interruptWayne Boyer
If the driver is getting flooded with interrupts, there's a possibility that the interrupt service routine could falsely detect a stuck interrupt condition and reset the adapter. This patch changes the logic such that the routine will loop back into the command processing code one more time after detecting the stuck interrupt signature. If there are no commands to process after that pass, and the interrupt is still not cleared, then the driver will print the "Error clearing HRRQ" message and reset the adapter. Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com> Acked-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] libfcoe: Remove unnecessary module state checksRobert Love
libfcoe's interface consists of create, destroy, enable, disable and create_vn2vn. These are currently module paramaters added durring the module initialization. A concern arose that the module parameters were being added with write permissions before the module had completed initialization. The following code was added to each sysfs store file. * Make sure the module has been initialized, and is not about to be * removed. Module parameter sysfs files are writable before the * module_init function is called and after module_exit. */ if (THIS_MODULE->state != MODULE_STATE_LIVE) goto out_nodev; This check was called out as unhelpful as the module can go dead at any time and therefore its state isn't a reliable thing to look at as a sign of stability and initialization completion. Also, that functional interfaces like these should be added after module initialization. This patch removes the unnecessary checks and hopes to disprove the concern about initialization ordering. Recent fcoe transport rework changes now require fcoe transports to register with libfcoe before any operation can take place. libfcoe may access some static variables but nothing that could cause a problem. Once a fcoe transport is registered, libfcoe is usable and any interface calls will be functional. Signed-off-by: Robert Love <robert.w.love@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] libfc: do not immediately retry the cmd when seq_send fails in ↵Yi Zou
fc_fcp_send_data Currently, when seq_send() fails in fc_fcp_send_data(), fc_fcp_retry_cmd() would complete this failed I/O directly and let scsi-ml retry. However, target side is not notified which may hang the target. Instead, we should just bail out from from fc_fcp_send_data and let scsi-ml times it out and aborts this I/O instead. Signed-off-by: Yi Zou <yi.zou@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] libfc: fix race in SRR responseVasu Dev
In this case fsp was freed before error handler was invoked, this is fixed by having SRR fsp reference freed by exch destructor so that fsp will be always held until it exch is freed. Also don't reset fsp->recov_seq since this is needed by SRR error handler to do exch done. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] libfc: don't call resp handler after FC_EX_TIMEOUTVasu Dev
In cases exch is already timed out then exch layer could end up calling resp handler again for its response frame received after timeout, though in this case fc_exch_timeout handler would have already called resp with FC_EX_TIMEOUT. This would cause REC response handler to release its fsp pkt hold twice instead once and possibly similar issues with other ELS exchanges in this race. To avoid this race have resp updated under exch lock in rx path, the resp would get set to NULL in case of FC_EX_TIMEOUT under the same lock to prevent resp callback after FC_EX_TIMEOUT. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] libfc: release DDP context if frame_send() failsYi Zou
In case frame_send() fails, make sure to let the underlying HW release the DDP context that has already been set up before calling frame_send(). Signed-off-by: Yi Zou <yi.zou@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] libfc: fix mm leak in handling incoming request for target discoveryHillf Danton
When handling incoming request, if the operation code carried by the received frame is not RSCN, the frame should be freed as in the RSCN case, or there is memory leakage. Signed-off-by: Hillf Danton <dhillf@gmail.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] fcoe: Prevent creation of an NPIV port with duplicate WWPNNeerav Parikh
This patch adds a validation step before allowing creation of a new NPIV port. It checks whether the WWPN passed for the new NPIV port to be created is unique for the given physical port. Signed-off-by: Neerav Parikh <Neerav.Parikh@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] libfcoe: Incorrect CVL handling for NPIV portsBhanu Prakash Gollapudi
Host doesnt handle CVL to NPIV instantiated ports correctly. - As per FC-BB-5 Rev 2 CVLs with no VN_Port descriptors shall be treated as implicit logout of ALL vn_ports. - CVL for NPIV ports should be handled before physical port even if descriptor for physical port appears before NPIV ports Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] megaraid_sas: Version and Changelog updateadam radford
Signed-off-by: Adam Radford <aradford@gmail.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] megaraid_sas: Add 1078 OCR supportadam radford
Signed-off-by: Adam Radford <aradford@gmail.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] megaraid_sas: Convert 6,10,12 byte CDB's for FastPath IOadam radford
The following patch for megaraid_sas converts 6,10,12 byte CDB's to 16 byte CDB for large LBA's for FastPath IO. Signed-off-by: Adam Radford <aradford@gmail.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] megaraid_sas: Fix bug where AENs could be lost in probe() and resume()adam radford
Signed-off-by: Adam Radford <aradford@gmail.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] megaraid_sas: Disable interrupts/free_irq() in megasas_shutdown()adam radford
The following patch for megaraid_sas disables interrupts and free_irq() in megasas_shutdown(). Signed-off-by: Adam Radford <aradford@gmail.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] megaraid_sas: Check MFI_REG_STATE.fault.resetAdapteradam radford
The following patch for megaraid_sas fixes the function megasas_reset_fusion() and makes the reset code check MFI_REG_STATE.fault.resetAdapter. Signed-off-by: Adam Radford <aradford@gmail.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] megaraid_sas: Remove un-used functionadam radford
The following patch for megaraid_sas removes un-used function megasas_return_cmd_for_smid(). Signed-off-by: Adam Radford <aradford@gmail.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] megaraid_sas: Remove MSI-X black list, use MFI_REG_STATE insteadadam radford
This patch for megaraid_sas removes the MSI-X black list and uses MFI_REG_STATE.ready.msiEnable instead. Signed-off-by: Adam Radford <aradford@gmail.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] libsas: fix SATA NCQ errorXiangliang Yu
Current version of libsas can not handle SATA NCQ error. This patch handle SATA NCQ error as AHCI do. Signed-off-by: Xiangliang Yu <yuxiangl@marvell.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] mpt2sas: Driver version upgrade 08.100.00.02Kashyap, Desai
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24[SCSI] mpt2sas: move even handling of MPT2SAS_TURN_ON_FAULT_LED into process ↵Kashyap, Desai
context Driver was a sending a SEP request during interrupt context which required to go to sleep. The fix is to rearrange the code so a fake event MPT2SAS_TURN_ON_FAULT_LED is fired from interrupt context, then later during the kernel worker threads processing, the SEP request is issued to firmware. Cc: stable@kernel.org Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24loop: handle on-demand devices correctlyNamhyung Kim
When finding or allocating a loop device, loop_probe() did not take partition numbers into account so that it can result to a different device. Consider following example: $ sudo modprobe loop max_part=15 $ ls -l /dev/loop* brw-rw---- 1 root disk 7, 0 2011-05-24 22:16 /dev/loop0 brw-rw---- 1 root disk 7, 16 2011-05-24 22:16 /dev/loop1 brw-rw---- 1 root disk 7, 32 2011-05-24 22:16 /dev/loop2 brw-rw---- 1 root disk 7, 48 2011-05-24 22:16 /dev/loop3 brw-rw---- 1 root disk 7, 64 2011-05-24 22:16 /dev/loop4 brw-rw---- 1 root disk 7, 80 2011-05-24 22:16 /dev/loop5 brw-rw---- 1 root disk 7, 96 2011-05-24 22:16 /dev/loop6 brw-rw---- 1 root disk 7, 112 2011-05-24 22:16 /dev/loop7 $ sudo mknod /dev/loop8 b 7 128 $ sudo losetup /dev/loop8 ~/temp/disk-with-3-parts.img $ sudo losetup -a /dev/loop128: [0805]:278201 (/home/namhyung/temp/disk-with-3-parts.img) $ ls -l /dev/loop* brw-rw---- 1 root disk 7, 0 2011-05-24 22:16 /dev/loop0 brw-rw---- 1 root disk 7, 16 2011-05-24 22:16 /dev/loop1 brw-rw---- 1 root disk 7, 2048 2011-05-24 22:18 /dev/loop128 brw-rw---- 1 root disk 7, 2049 2011-05-24 22:18 /dev/loop128p1 brw-rw---- 1 root disk 7, 2050 2011-05-24 22:18 /dev/loop128p2 brw-rw---- 1 root disk 7, 2051 2011-05-24 22:18 /dev/loop128p3 brw-rw---- 1 root disk 7, 32 2011-05-24 22:16 /dev/loop2 brw-rw---- 1 root disk 7, 48 2011-05-24 22:16 /dev/loop3 brw-rw---- 1 root disk 7, 64 2011-05-24 22:16 /dev/loop4 brw-rw---- 1 root disk 7, 80 2011-05-24 22:16 /dev/loop5 brw-rw---- 1 root disk 7, 96 2011-05-24 22:16 /dev/loop6 brw-rw---- 1 root disk 7, 112 2011-05-24 22:16 /dev/loop7 brw-r--r-- 1 root root 7, 128 2011-05-24 22:17 /dev/loop8 After this patch, /dev/loop8 - instead of /dev/loop128 - was accessed correctly. In addition, 'range' passed to blk_register_region() should include all range of dev_t that LOOP_MAJOR can address. It does not need to be limited by partition numbers unless 'max_loop' param was specified. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Laurent Vivier <Laurent.Vivier@bull.net> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-24loop: limit 'max_part' module param to DISK_MAX_PARTSNamhyung Kim
The 'max_part' parameter controls the number of maximum partition a loop block device can have. However if a user specifies very large value it would exceed the limitation of device minor number and can cause a kernel panic (or, at least, produce invalid device nodes in some cases). On my desktop system, following command kills the kernel. On qemu, it triggers similar oops but the kernel was alive: $ sudo modprobe loop max_part0000 ------------[ cut here ]------------ kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65! invalid opcode: 0000 [#1] SMP last sysfs file: CPU 0 Modules linked in: loop(+) Pid: 43, comm: insmod Tainted: G W 2.6.39-qemu+ #155 Bochs Bochs RIP: 0010:[<ffffffff8113ce61>] [<ffffffff8113ce61>] internal_create_group= +0x2a/0x170 RSP: 0018:ffff880007b3fde8 EFLAGS: 00000246 RAX: 00000000ffffffef RBX: ffff880007b3d878 RCX: 00000000000007b4 RDX: ffffffff8152da50 RSI: 0000000000000000 RDI: ffff880007b3d878 RBP: ffff880007b3fe38 R08: ffff880007b3fde8 R09: 0000000000000000 R10: ffff88000783b4a8 R11: ffff880007b3d878 R12: ffffffff8152da50 R13: ffff880007b3d868 R14: 0000000000000000 R15: ffff880007b3d800 FS: 0000000002137880(0063) GS:ffff880007c00000(0000) knlGS:00000000000000= 00 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000422680 CR3: 0000000007b50000 CR4: 00000000000006b0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000 Process insmod (pid: 43, threadinfo ffff880007b3e000, task ffff880007afb9c= 0) Stack: ffff880007b3fe58 ffffffff811e66dd ffff880007b3fe58 ffffffff811e570b 0000000000000010 ffff880007b3d800 ffff880007a7b390 ffff880007b3d868 0000000000400920 ffff880007b3d800 ffff880007b3fe48 ffffffff8113cfc8 Call Trace: [<ffffffff811e66dd>] ? device_add+0x4bc/0x5af [<ffffffff811e570b>] ? dev_set_name+0x3c/0x3e [<ffffffff8113cfc8>] sysfs_create_group+0xe/0x12 [<ffffffff810b420e>] blk_trace_init_sysfs+0x14/0x16 [<ffffffff8116a090>] blk_register_queue+0x47/0xf7 [<ffffffff8116f527>] add_disk+0xdf/0x290 [<ffffffffa00060eb>] loop_init+0xeb/0x1b8 [loop] [<ffffffffa0006000>] ? 0xffffffffa0005fff [<ffffffff8100020a>] do_one_initcall+0x7a/0x12e [<ffffffff81096804>] sys_init_module+0x9c/0x1e0 [<ffffffff813329bb>] system_call_fastpath+0x16/0x1b Code: c3 55 48 89 e5 41 57 41 56 41 89 f6 41 55 41 54 49 89 d4 53 48 89 fb= 48 83 ec 28 48 85 ff 74 0b 85 f6 75 0b 48 83 7f 30 00 75 14 <0f> 0b eb fe = 48 83 7f 30 00 b9 ea ff ff ff 0f 84 18 01 00 00 49 RIP [<ffffffff8113ce61>] internal_create_group+0x2a/0x170 RSP <ffff880007b3fde8> ---[ end trace a123eb592043acad ]--- Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Laurent Vivier <Laurent.Vivier@bull.net> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-24video: mb862xxfb: add support for L1 displayingAnatolij Gustschin
Allow displaying L1 video data on top of the primary L0 layer. The L1 layer position and dimensions can be configured and the layer enabled/disabled by using the appropriate L1 controls added by this patch. Signed-off-by: Anatolij Gustschin <agust@denx.de>
2011-05-24video: mb862xx: add support for controller's I2C bus adapterAnatolij Gustschin
Add adapter driver for I2C adapter in Coral-P(A)/Lime GDCs. So we can easily access devices on controller's I2C bus using i2c-dev interface. Signed-off-by: Anatolij Gustschin <agust@denx.de>
2011-05-24video: mb862xxfb: relocate register space to get contiguous vramAnatolij Gustschin
By default the GDC registers are located in the middle of the 64MiB area for video RAM and registers. When 32MiB VRAM or more is used, relocate the register space to the top of the 64MiB space so that we get the contiguous VRAM for GDC frame buffer layers, drawing frames, capture and cursor buffers. Signed-off-by: Anatolij Gustschin <agust@denx.de>
2011-05-24video: mb862xxfb: use pre-initialized configuration for PCI GDCsAnatolij Gustschin
If the bootloader has already initialized the display controller, do not re-initialize it in the driver. Take over the bootloader's configuration instead. This is already supported for non PCI GDCs Lime and Mint. Add this functionality for PCI GDCs Coral-P and Coral-PA. It is useful to avoid flicker and also avoids unneeded init delays while booting. Signed-off-by: Anatolij Gustschin <agust@denx.de>
2011-05-24video: mb862xxfb: correct fix.smem_len field initializationAnatolij Gustschin
Initialize smem_len field to the actual frame buffer size and not to the whole video RAM size. This prevents overwriting other video memory (which could be used by other layers, cursors or accelerated drivers) by frame buffer applications relying on fix.smem_len. Signed-off-by: Anatolij Gustschin <agust@denx.de>
2011-05-24intel-iommu: Flush unmaps at domain_exitAlex Williamson
We typically batch unmaps to be lazily flushed out at regular intervals. When we destroy a domain, we need to force a flush of these lazy unmaps to be sure none reference the domain we're about to free. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=35062 Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: stable@kernel.org
2011-05-24intel-iommu: Remove obsolete comment from detect_intel_iommuJan Kiszka
Since cacd4213d8, this comment no longer applies. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-05-24intel-iommu: fix VT-d PMR disable for TXT on S3 resumeJoseph Cihula
This patch is a follow on to https://lkml.org/lkml/2011/3/21/239, which was merged as commit 51a63e67da6056c13b5b597dcc9e1b3bd7ceaa55. This patch adds support for S3, as pointed out by Chris Wright. Signed-off-by: Joseph Cihula <joseph.cihula@intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-05-24oprofile: Use linux/mutex.hAnton Blanchard
The oprofile code is still including asm/mutex.h instead of linux/mutex.h. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Robert Richter <robert.richter@amd.com>
2011-05-24video: s3c-fb: correct transparency checking in 32bppJingoo Han
32bpp means ARGB 8888 in the driver, therfore the transparency length and offset should be 8 and 24 respectively. However, the transparency length and offset were previously 0, which means that the driver supports RGB 888 without alpha blending when 32bpp is used. So, the transparency checking in 32bpp is corrected so that the transparency length and offset are 8 and 24 respectively. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2011-05-24video: s3c-fb: add gpio setup function to resume functionJingoo Han
This patch adds gpio setup function to resume function to ensure gpio used by FIMD IP and LCD panel during a resume. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2011-05-24drbd: fix warningAndrew Morton
In file included from drivers/block/drbd/drbd_main.c:54: drivers/block/drbd/drbd_int.h:1190: warning: parameter has incomplete type Forward declarations of enums do not work. Fix it unpleasantly by moving the prototype. Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Lars Ellenberg <drbd-dev@lists.linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2011-05-24drbd: fix warningPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2011-05-24drbd: Fix spellingBart Van Assche
Found these with the help of ispell -l. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2011-05-24drbd: fix schedule in atomicLars Ellenberg
An administrative detach used to request a state change directly to D_DISKLESS, first suspending IO to avoid the last put_ldev() occuring from an endio handler, potentially in irq context. This is not enough on the receiving side (typically secondary), we may miss some peer_req on the way to local disk, which then may do the last put_ldev() from their drbd_peer_request_endio(). This patch makes the detach always go through the intermediate D_FAILED state. We may consider to rename it D_DETACHING. Alternative approach would be to create yet an other work item to be scheduled on the worker, do the destructor work from there, and get the timing right. manually picked commit 564040f from the drbd 8.4 branch. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: Take a more conservative approach when deciding max_bio_sizePhilipp Reisner
The old (optimistic) implementation could shrink the bio size on an primary device. Shrinking the bio size on a primary device is bad. Since there we might get BIOs with the old (bigger) size shortly after we published the new size. The new implementation is more conservative, and eventually increases the max_bio_size on a primary device (which is valid). It does so, when it knows the local limit AND the remote limit. We cache the last seen max_bio_size of the peer in the meta data, and rely on that, to make the operation of single nodes more efficient. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: Fixed state transitions after async outdate-peer-handler returnedPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: Disallow the peer_disk_state to be D_OUTDATED while connectedPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: Fix for the connection problems on high latency linksPhilipp Reisner
It seems that the real cause of all the issues where that we did not noticed in drbd_try_connect() when the other guy closes one socket if the round trip time gets higher than 100ms. There were that 100ms hard coded! Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: fix potential activity log refcount imbalance in error pathLars Ellenberg
It is no longer sufficient to trigger on local WRITE, we need to check on (rq_state & RQ_IN_ACT_LOG) before calling drbd_al_complete_io also in the error path. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: Only downgrade the disk state in case of disk failuresPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: fix disconnect/reconnect loop, if ping-timeout == ping-intLars Ellenberg
If there is no replication traffic within the idle timeout (ping-int seconds), DRBD will send a P_PING, and adjust the timeout to ping-timeout. If there is no P_PING_ACK received within this ping-timeout, DRBD finally drops the connection, and tries to re-establish it. To decide which timeout was active, we compared the current timeout with the ping-timeout, and dropped the connection, if that was the case. By default, ping-int is 10 seconds, ping-timeout is 500 ms. Unfortunately, if you configure ping-timeout to be the same as ping-int, expiry of the idle-timeout had been mistaken for a missing ping ack, and caused an immediate reconnection attempt. Fix: Allow both timeouts to be equal, use a local variable to store which timeout is active. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: fix potential distributed deadlockLars Ellenberg
We limit ourselves to a configurable maximum number of pages used as temporary bio pages. If the configured "max_buffers" is not big enough to match the bandwidth of the respective deployment, a distributed deadlock could be triggered by e.g. fast online verify and heavy application IO. TCP connections would block on congestion, because both receivers would wait on pages to become available. Fortunately the respective senders in this case would be able to give back some pages already. So do that. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24drbd: Fix for application IO with the on-io-error=pass-on policyPhilipp Reisner
In case a write failes on the local disk, go into D_INCONSISTENT disk state. That causes future reads of that block to be shipped to the peer. Read retry remote was already in place. Actually the documentation needs to get fixed now. Since the application is still shielded from the error. (as long as we have only a single disk failing) The difference to detach is that we keep the disk. And therefore might keep all the other, still working sectors up to date. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24Merge branch 'for_2.6.40/pm-cleanup' of ↵Tony Lindgren
ssh://master.kernel.org/pub/scm/linux/kernel/git/khilman/linux-omap-pm into omap-for-linus