diff options
Diffstat (limited to 'Documentation/block')
-rw-r--r-- | Documentation/block/00-INDEX | 10 | ||||
-rw-r--r-- | Documentation/block/cfq-iosched.txt | 77 | ||||
-rw-r--r-- | Documentation/block/queue-sysfs.txt | 64 |
3 files changed, 149 insertions, 2 deletions
diff --git a/Documentation/block/00-INDEX b/Documentation/block/00-INDEX index d111e3b23db..d18ecd827c4 100644 --- a/Documentation/block/00-INDEX +++ b/Documentation/block/00-INDEX @@ -3,15 +3,21 @@ biodoc.txt - Notes on the Generic Block Layer Rewrite in Linux 2.5 capability.txt - - Generic Block Device Capability (/sys/block/<disk>/capability) + - Generic Block Device Capability (/sys/block/<device>/capability) +cfq-iosched.txt + - CFQ IO scheduler tunables +data-integrity.txt + - Block data integrity deadline-iosched.txt - Deadline IO scheduler tunables ioprio.txt - Block io priorities (in CFQ scheduler) +queue-sysfs.txt + - Queue's sysfs entries request.txt - The members of struct request (in include/linux/blkdev.h) stat.txt - - Block layer statistics in /sys/block/<dev>/stat + - Block layer statistics in /sys/block/<device>/stat switching-sched.txt - Switching I/O schedulers at runtime writeback_cache_control.txt diff --git a/Documentation/block/cfq-iosched.txt b/Documentation/block/cfq-iosched.txt index 6d670f57045..d89b4fe724d 100644 --- a/Documentation/block/cfq-iosched.txt +++ b/Documentation/block/cfq-iosched.txt @@ -1,3 +1,14 @@ +CFQ (Complete Fairness Queueing) +=============================== + +The main aim of CFQ scheduler is to provide a fair allocation of the disk +I/O bandwidth for all the processes which requests an I/O operation. + +CFQ maintains the per process queue for the processes which request I/O +operation(syncronous requests). In case of asynchronous requests, all the +requests from all the processes are batched together according to their +process's I/O priority. + CFQ ioscheduler tunables ======================== @@ -25,6 +36,72 @@ there are multiple spindles behind single LUN (Host based hardware RAID controller or for storage arrays), setting slice_idle=0 might end up in better throughput and acceptable latencies. +back_seek_max +------------- +This specifies, given in Kbytes, the maximum "distance" for backward seeking. +The distance is the amount of space from the current head location to the +sectors that are backward in terms of distance. + +This parameter allows the scheduler to anticipate requests in the "backward" +direction and consider them as being the "next" if they are within this +distance from the current head location. + +back_seek_penalty +----------------- +This parameter is used to compute the cost of backward seeking. If the +backward distance of request is just 1/back_seek_penalty from a "front" +request, then the seeking cost of two requests is considered equivalent. + +So scheduler will not bias toward one or the other request (otherwise scheduler +will bias toward front request). Default value of back_seek_penalty is 2. + +fifo_expire_async +----------------- +This parameter is used to set the timeout of asynchronous requests. Default +value of this is 248ms. + +fifo_expire_sync +---------------- +This parameter is used to set the timeout of synchronous requests. Default +value of this is 124ms. In case to favor synchronous requests over asynchronous +one, this value should be decreased relative to fifo_expire_async. + +slice_async +----------- +This parameter is same as of slice_sync but for asynchronous queue. The +default value is 40ms. + +slice_async_rq +-------------- +This parameter is used to limit the dispatching of asynchronous request to +device request queue in queue's slice time. The maximum number of request that +are allowed to be dispatched also depends upon the io priority. Default value +for this is 2. + +slice_sync +---------- +When a queue is selected for execution, the queues IO requests are only +executed for a certain amount of time(time_slice) before switching to another +queue. This parameter is used to calculate the time slice of synchronous +queue. + +time_slice is computed using the below equation:- +time_slice = slice_sync + (slice_sync/5 * (4 - prio)). To increase the +time_slice of synchronous queue, increase the value of slice_sync. Default +value is 100ms. + +quantum +------- +This specifies the number of request dispatched to the device queue. In a +queue's time slice, a request will not be dispatched if the number of request +in the device exceeds this parameter. This parameter is used for synchronous +request. + +In case of storage with several disk, this setting can limit the parallel +processing of request. Therefore, increasing the value can imporve the +performace although this can cause the latency of some I/O to increase due +to more number of requests. + CFQ IOPS Mode for group scheduling =================================== Basic CFQ design is to provide priority based time slices. Higher priority diff --git a/Documentation/block/queue-sysfs.txt b/Documentation/block/queue-sysfs.txt index 6518a55273e..e54ac1d5340 100644 --- a/Documentation/block/queue-sysfs.txt +++ b/Documentation/block/queue-sysfs.txt @@ -9,20 +9,71 @@ These files are the ones found in the /sys/block/xxx/queue/ directory. Files denoted with a RO postfix are readonly and the RW postfix means read-write. +add_random (RW) +---------------- +This file allows to trun off the disk entropy contribution. Default +value of this file is '1'(on). + +discard_granularity (RO) +----------------------- +This shows the size of internal allocation of the device in bytes, if +reported by the device. A value of '0' means device does not support +the discard functionality. + +discard_max_bytes (RO) +---------------------- +Devices that support discard functionality may have internal limits on +the number of bytes that can be trimmed or unmapped in a single operation. +The discard_max_bytes parameter is set by the device driver to the maximum +number of bytes that can be discarded in a single operation. Discard +requests issued to the device must not exceed this limit. A discard_max_bytes +value of 0 means that the device does not support discard functionality. + +discard_zeroes_data (RO) +------------------------ +When read, this file will show if the discarded block are zeroed by the +device or not. If its value is '1' the blocks are zeroed otherwise not. + hw_sector_size (RO) ------------------- This is the hardware sector size of the device, in bytes. +iostats (RW) +------------- +This file is used to control (on/off) the iostats accounting of the +disk. + +logical_block_size (RO) +----------------------- +This is the logcal block size of the device, in bytes. + max_hw_sectors_kb (RO) ---------------------- This is the maximum number of kilobytes supported in a single data transfer. +max_integrity_segments (RO) +--------------------------- +When read, this file shows the max limit of integrity segments as +set by block layer which a hardware controller can handle. + max_sectors_kb (RW) ------------------- This is the maximum number of kilobytes that the block layer will allow for a filesystem request. Must be smaller than or equal to the maximum size allowed by the hardware. +max_segments (RO) +----------------- +Maximum number of segments of the device. + +max_segment_size (RO) +--------------------- +Maximum segment size of the device. + +minimum_io_size (RO) +-------------------- +This is the smallest preferred io size reported by the device. + nomerges (RW) ------------- This enables the user to disable the lookup logic involved with IO @@ -45,11 +96,24 @@ per-block-cgroup request pool. IOW, if there are N block cgroups, each request queue may have upto N request pools, each independently regulated by nr_requests. +optimal_io_size (RO) +-------------------- +This is the optimal io size reported by the device. + +physical_block_size (RO) +------------------------ +This is the physical block size of device, in bytes. + read_ahead_kb (RW) ------------------ Maximum number of kilobytes to read-ahead for filesystems on this block device. +rotational (RW) +--------------- +This file is used to stat if the device is of rotational type or +non-rotational type. + rq_affinity (RW) ---------------- If this option is '1', the block layer will migrate request completions to the |