summaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/testing/sysfs-module12
-rw-r--r--Documentation/DocBook/kernel-api.tmpl3
-rw-r--r--Documentation/block/00-INDEX4
-rw-r--r--Documentation/block/barrier.txt261
-rw-r--r--Documentation/block/writeback_cache_control.txt86
-rw-r--r--Documentation/cgroups/blkio-controller.txt106
-rw-r--r--Documentation/devices.txt6
-rw-r--r--Documentation/dynamic-debug-howto.txt22
-rw-r--r--Documentation/filesystems/proc.txt32
-rw-r--r--Documentation/kernel-parameters.txt16
-rw-r--r--Documentation/lguest/lguest.c29
-rw-r--r--Documentation/powerpc/dts-bindings/fsl/usb.txt22
-rw-r--r--Documentation/scsi/st.txt15
-rw-r--r--Documentation/usb/proc_usb_info.txt34
-rw-r--r--Documentation/workqueue.txt29
15 files changed, 360 insertions, 317 deletions
diff --git a/Documentation/ABI/testing/sysfs-module b/Documentation/ABI/testing/sysfs-module
new file mode 100644
index 00000000000..cfcec3bffc0
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-module
@@ -0,0 +1,12 @@
+What: /sys/module/pch_phub/drivers/.../pch_mac
+Date: August 2010
+KernelVersion: 2.6.35
+Contact: masa-korg@dsn.okisemi.com
+Description: Write/read GbE MAC address.
+
+What: /sys/module/pch_phub/drivers/.../pch_firmware
+Date: August 2010
+KernelVersion: 2.6.35
+Contact: masa-korg@dsn.okisemi.com
+Description: Write/read Option ROM data.
+
diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl
index 6899f471fb1..6b4e07f28b6 100644
--- a/Documentation/DocBook/kernel-api.tmpl
+++ b/Documentation/DocBook/kernel-api.tmpl
@@ -257,7 +257,8 @@ X!Earch/x86/kernel/mca_32.c
!Iblock/blk-sysfs.c
!Eblock/blk-settings.c
!Eblock/blk-exec.c
-!Eblock/blk-barrier.c
+!Eblock/blk-flush.c
+!Eblock/blk-lib.c
!Eblock/blk-tag.c
!Iblock/blk-tag.c
!Eblock/blk-integrity.c
diff --git a/Documentation/block/00-INDEX b/Documentation/block/00-INDEX
index a406286f6f3..d111e3b23db 100644
--- a/Documentation/block/00-INDEX
+++ b/Documentation/block/00-INDEX
@@ -1,7 +1,5 @@
00-INDEX
- This file
-barrier.txt
- - I/O Barriers
biodoc.txt
- Notes on the Generic Block Layer Rewrite in Linux 2.5
capability.txt
@@ -16,3 +14,5 @@ stat.txt
- Block layer statistics in /sys/block/<dev>/stat
switching-sched.txt
- Switching I/O schedulers at runtime
+writeback_cache_control.txt
+ - Control of volatile write back caches
diff --git a/Documentation/block/barrier.txt b/Documentation/block/barrier.txt
deleted file mode 100644
index 2c2f24f634e..00000000000
--- a/Documentation/block/barrier.txt
+++ /dev/null
@@ -1,261 +0,0 @@
-I/O Barriers
-============
-Tejun Heo <htejun@gmail.com>, July 22 2005
-
-I/O barrier requests are used to guarantee ordering around the barrier
-requests. Unless you're crazy enough to use disk drives for
-implementing synchronization constructs (wow, sounds interesting...),
-the ordering is meaningful only for write requests for things like
-journal checkpoints. All requests queued before a barrier request
-must be finished (made it to the physical medium) before the barrier
-request is started, and all requests queued after the barrier request
-must be started only after the barrier request is finished (again,
-made it to the physical medium).
-
-In other words, I/O barrier requests have the following two properties.
-
-1. Request ordering
-
-Requests cannot pass the barrier request. Preceding requests are
-processed before the barrier and following requests after.
-
-Depending on what features a drive supports, this can be done in one
-of the following three ways.
-
-i. For devices which have queue depth greater than 1 (TCQ devices) and
-support ordered tags, block layer can just issue the barrier as an
-ordered request and the lower level driver, controller and drive
-itself are responsible for making sure that the ordering constraint is
-met. Most modern SCSI controllers/drives should support this.
-
-NOTE: SCSI ordered tag isn't currently used due to limitation in the
- SCSI midlayer, see the following random notes section.
-
-ii. For devices which have queue depth greater than 1 but don't
-support ordered tags, block layer ensures that the requests preceding
-a barrier request finishes before issuing the barrier request. Also,
-it defers requests following the barrier until the barrier request is
-finished. Older SCSI controllers/drives and SATA drives fall in this
-category.
-
-iii. Devices which have queue depth of 1. This is a degenerate case
-of ii. Just keeping issue order suffices. Ancient SCSI
-controllers/drives and IDE drives are in this category.
-
-2. Forced flushing to physical medium
-
-Again, if you're not gonna do synchronization with disk drives (dang,
-it sounds even more appealing now!), the reason you use I/O barriers
-is mainly to protect filesystem integrity when power failure or some
-other events abruptly stop the drive from operating and possibly make
-the drive lose data in its cache. So, I/O barriers need to guarantee
-that requests actually get written to non-volatile medium in order.
-
-There are four cases,
-
-i. No write-back cache. Keeping requests ordered is enough.
-
-ii. Write-back cache but no flush operation. There's no way to
-guarantee physical-medium commit order. This kind of devices can't to
-I/O barriers.
-
-iii. Write-back cache and flush operation but no FUA (forced unit
-access). We need two cache flushes - before and after the barrier
-request.
-
-iv. Write-back cache, flush operation and FUA. We still need one
-flush to make sure requests preceding a barrier are written to medium,
-but post-barrier flush can be avoided by using FUA write on the
-barrier itself.
-
-
-How to support barrier requests in drivers
-------------------------------------------
-
-All barrier handling is done inside block layer proper. All low level
-drivers have to are implementing its prepare_flush_fn and using one
-the following two functions to indicate what barrier type it supports
-and how to prepare flush requests. Note that the term 'ordered' is
-used to indicate the whole sequence of performing barrier requests
-including draining and flushing.
-
-typedef void (prepare_flush_fn)(struct request_queue *q, struct request *rq);
-
-int blk_queue_ordered(struct request_queue *q, unsigned ordered,
- prepare_flush_fn *prepare_flush_fn);
-
-@q : the queue in question
-@ordered : the ordered mode the driver/device supports
-@prepare_flush_fn : this function should prepare @rq such that it
- flushes cache to physical medium when executed
-
-For example, SCSI disk driver's prepare_flush_fn looks like the
-following.
-
-static void sd_prepare_flush(struct request_queue *q, struct request *rq)
-{
- memset(rq->cmd, 0, sizeof(rq->cmd));
- rq->cmd_type = REQ_TYPE_BLOCK_PC;
- rq->timeout = SD_TIMEOUT;
- rq->cmd[0] = SYNCHRONIZE_CACHE;
- rq->cmd_len = 10;
-}
-
-The following seven ordered modes are supported. The following table
-shows which mode should be used depending on what features a
-device/driver supports. In the leftmost column of table,
-QUEUE_ORDERED_ prefix is omitted from the mode names to save space.
-
-The table is followed by description of each mode. Note that in the
-descriptions of QUEUE_ORDERED_DRAIN*, '=>' is used whereas '->' is
-used for QUEUE_ORDERED_TAG* descriptions. '=>' indicates that the
-preceding step must be complete before proceeding to the next step.
-'->' indicates that the next step can start as soon as the previous
-step is issued.
-
- write-back cache ordered tag flush FUA
------------------------------------------------------------------------
-NONE yes/no N/A no N/A
-DRAIN no no N/A N/A
-DRAIN_FLUSH yes no yes no
-DRAIN_FUA yes no yes yes
-TAG no yes N/A N/A
-TAG_FLUSH yes yes yes no
-TAG_FUA yes yes yes yes
-
-
-QUEUE_ORDERED_NONE
- I/O barriers are not needed and/or supported.
-
- Sequence: N/A
-
-QUEUE_ORDERED_DRAIN
- Requests are ordered by draining the request queue and cache
- flushing isn't needed.
-
- Sequence: drain => barrier
-
-QUEUE_ORDERED_DRAIN_FLUSH
- Requests are ordered by draining the request queue and both
- pre-barrier and post-barrier cache flushings are needed.
-
- Sequence: drain => preflush => barrier => postflush
-
-QUEUE_ORDERED_DRAIN_FUA
- Requests are ordered by draining the request queue and
- pre-barrier cache flushing is needed. By using FUA on barrier
- request, post-barrier flushing can be skipped.
-
- Sequence: drain => preflush => barrier
-
-QUEUE_ORDERED_TAG
- Requests are ordered by ordered tag and cache flushing isn't
- needed.
-
- Sequence: barrier
-
-QUEUE_ORDERED_TAG_FLUSH
- Requests are ordered by ordered tag and both pre-barrier and
- post-barrier cache flushings are needed.
-
- Sequence: preflush -> barrier -> postflush
-
-QUEUE_ORDERED_TAG_FUA
- Requests are ordered by ordered tag and pre-barrier cache
- flushing is needed. By using FUA on barrier request,
- post-barrier flushing can be skipped.
-
- Sequence: preflush -> barrier
-
-
-Random notes/caveats
---------------------
-
-* SCSI layer currently can't use TAG ordering even if the drive,
-controller and driver support it. The problem is that SCSI midlayer
-request dispatch function is not atomic. It releases queue lock and
-switch to SCSI host lock during issue and it's possible and likely to
-happen in time that requests change their relative positions. Once
-this problem is solved, TAG ordering can be enabled.
-
-* Currently, no matter which ordered mode is used, there can be only
-one barrier request in progress. All I/O barriers are held off by
-block layer until the previous I/O barrier is complete. This doesn't
-make any difference for DRAIN ordered devices, but, for TAG ordered
-devices with very high command latency, passing multiple I/O barriers
-to low level *might* be helpful if they are very frequent. Well, this
-certainly is a non-issue. I'm writing this just to make clear that no
-two I/O barrier is ever passed to low-level driver.
-
-* Completion order. Requests in ordered sequence are issued in order
-but not required to finish in order. Barrier implementation can
-handle out-of-order completion of ordered sequence. IOW, the requests
-MUST be processed in order but the hardware/software completion paths
-are allowed to reorder completion notifications - eg. current SCSI
-midlayer doesn't preserve completion order during error handling.
-
-* Requeueing order. Low-level drivers are free to requeue any request
-after they removed it from the request queue with
-blkdev_dequeue_request(). As barrier sequence should be kept in order
-when requeued, generic elevator code takes care of putting requests in
-order around barrier. See blk_ordered_req_seq() and
-ELEVATOR_INSERT_REQUEUE handling in __elv_add_request() for details.
-
-Note that block drivers must not requeue preceding requests while
-completing latter requests in an ordered sequence. Currently, no
-error checking is done against this.
-
-* Error handling. Currently, block layer will report error to upper
-layer if any of requests in an ordered sequence fails. Unfortunately,
-this doesn't seem to be enough. Look at the following request flow.
-QUEUE_ORDERED_TAG_FLUSH is in use.
-
- [0] [1] [2] [3] [pre] [barrier] [post] < [4] [5] [6] ... >
- still in elevator
-
-Let's say request [2], [3] are write requests to update file system
-metadata (journal or whatever) and [barrier] is used to mark that
-those updates are valid. Consider the following sequence.
-
- i. Requests [0] ~ [post] leaves the request queue and enters
- low-level driver.
- ii. After a while, unfortunately, something goes wrong and the
- drive fails [2]. Note that any of [0], [1] and [3] could have
- completed by this time, but [pre] couldn't have been finished
- as the drive must process it in order and it failed before
- processing that command.
- iii. Error handling kicks in and determines that the error is
- unrecoverable and fails [2], and resumes operation.
- iv. [pre] [barrier] [post] gets processed.
- v. *BOOM* power fails
-
-The problem here is that the barrier request is *supposed* to indicate
-that filesystem update requests [2] and [3] made it safely to the
-physical medium and, if the machine crashes after the barrier is
-written, filesystem recovery code can depend on that. Sadly, that
-isn't true in this case anymore. IOW, the success of a I/O barrier
-should also be dependent on success of some of the preceding requests,
-where only upper layer (filesystem) knows what 'some' is.
-
-This can be solved by implementing a way to tell the block layer which
-requests affect the success of the following barrier request and
-making lower lever drivers to resume operation on error only after
-block layer tells it to do so.
-
-As the probability of this happening is very low and the drive should
-be faulty, implementing the fix is probably an overkill. But, still,
-it's there.
-
-* In previous drafts of barrier implementation, there was fallback
-mechanism such that, if FUA or ordered TAG fails, less fancy ordered
-mode can be selected and the failed barrier request is retried
-automatically. The rationale for this feature was that as FUA is
-pretty new in ATA world and ordered tag was never used widely, there
-could be devices which report to support those features but choke when
-actually given such requests.
-
- This was removed for two reasons 1. it's an overkill 2. it's
-impossible to implement properly when TAG ordering is used as low
-level drivers resume after an error automatically. If it's ever
-needed adding it back and modifying low level drivers accordingly
-shouldn't be difficult.
diff --git a/Documentation/block/writeback_cache_control.txt b/Documentation/block/writeback_cache_control.txt
new file mode 100644
index 00000000000..83407d36630
--- /dev/null
+++ b/Documentation/block/writeback_cache_control.txt
@@ -0,0 +1,86 @@
+
+Explicit volatile write back cache control
+=====================================
+
+Introduction
+------------
+
+Many storage devices, especially in the consumer market, come with volatile
+write back caches. That means the devices signal I/O completion to the
+operating system before data actually has hit the non-volatile storage. This
+behavior obviously speeds up various workloads, but it means the operating
+system needs to force data out to the non-volatile storage when it performs
+a data integrity operation like fsync, sync or an unmount.
+
+The Linux block layer provides two simple mechanisms that let filesystems
+control the caching behavior of the storage device. These mechanisms are
+a forced cache flush, and the Force Unit Access (FUA) flag for requests.
+
+
+Explicit cache flushes
+----------------------
+
+The REQ_FLUSH flag can be OR ed into the r/w flags of a bio submitted from
+the filesystem and will make sure the volatile cache of the storage device
+has been flushed before the actual I/O operation is started. This explicitly
+guarantees that previously completed write requests are on non-volatile
+storage before the flagged bio starts. In addition the REQ_FLUSH flag can be
+set on an otherwise empty bio structure, which causes only an explicit cache
+flush without any dependent I/O. It is recommend to use
+the blkdev_issue_flush() helper for a pure cache flush.
+
+
+Forced Unit Access
+-----------------
+
+The REQ_FUA flag can be OR ed into the r/w flags of a bio submitted from the
+filesystem and will make sure that I/O completion for this request is only
+signaled after the data has been committed to non-volatile storage.
+
+
+Implementation details for filesystems
+--------------------------------------
+
+Filesystems can simply set the REQ_FLUSH and REQ_FUA bits and do not have to
+worry if the underlying devices need any explicit cache flushing and how
+the Forced Unit Access is implemented. The REQ_FLUSH and REQ_FUA flags
+may both be set on a single bio.
+
+
+Implementation details for make_request_fn based block drivers
+--------------------------------------------------------------
+
+These drivers will always see the REQ_FLUSH and REQ_FUA bits as they sit
+directly below the submit_bio interface. For remapping drivers the REQ_FUA
+bits need to be propagated to underlying devices, and a global flush needs
+to be implemented for bios with the REQ_FLUSH bit set. For real device
+drivers that do not have a volatile cache the REQ_FLUSH and REQ_FUA bits
+on non-empty bios can simply be ignored, and REQ_FLUSH requests without
+data can be completed successfully without doing any work. Drivers for
+devices with volatile caches need to implement the support for these
+flags themselves without any help from the block layer.
+
+
+Implementation details for request_fn based block drivers
+--------------------------------------------------------------
+
+For devices that do not support volatile write caches there is no driver
+support required, the block layer completes empty REQ_FLUSH requests before
+entering the driver and strips off the REQ_FLUSH and REQ_FUA bits from
+requests that have a payload. For devices with volatile write caches the
+driver needs to tell the block layer that it supports flushing caches by
+doing:
+
+ blk_queue_flush(sdkp->disk->queue, REQ_FLUSH);
+
+and handle empty REQ_FLUSH requests in its prep_fn/request_fn. Note that
+REQ_FLUSH requests with a payload are automatically turned into a sequence
+of an empty REQ_FLUSH request followed by the actual write by the block
+layer. For devices that also support the FUA bit the block layer needs
+to be told to pass through the REQ_FUA bit using:
+
+ blk_queue_flush(sdkp->disk->queue, REQ_FLUSH | REQ_FUA);
+
+and the driver must handle write requests that have the REQ_FUA bit set
+in prep_fn/request_fn. If the FUA bit is not natively supported the block
+layer turns it into an empty REQ_FLUSH request after the actual write.
diff --git a/Documentation/cgroups/blkio-controller.txt b/Documentation/cgroups/blkio-controller.txt
index 6919d62591d..d6da611f8f6 100644
--- a/Documentation/cgroups/blkio-controller.txt
+++ b/Documentation/cgroups/blkio-controller.txt
@@ -8,12 +8,17 @@ both at leaf nodes as well as at intermediate nodes in a storage hierarchy.
Plan is to use the same cgroup based management interface for blkio controller
and based on user options switch IO policies in the background.
-In the first phase, this patchset implements proportional weight time based
-division of disk policy. It is implemented in CFQ. Hence this policy takes
-effect only on leaf nodes when CFQ is being used.
+Currently two IO control policies are implemented. First one is proportional
+weight time based division of disk policy. It is implemented in CFQ. Hence
+this policy takes effect only on leaf nodes when CFQ is being used. The second
+one is throttling policy which can be used to specify upper IO rate limits
+on devices. This policy is implemented in generic block layer and can be
+used on leaf nodes as well as higher level logical devices like device mapper.
HOWTO
=====
+Proportional Weight division of bandwidth
+-----------------------------------------
You can do a very simple testing of running two dd threads in two different
cgroups. Here is what you can do.
@@ -55,6 +60,35 @@ cgroups. Here is what you can do.
group dispatched to the disk. We provide fairness in terms of disk time, so
ideally io.disk_time of cgroups should be in proportion to the weight.
+Throttling/Upper Limit policy
+-----------------------------
+- Enable Block IO controller
+ CONFIG_BLK_CGROUP=y
+
+- Enable throttling in block layer
+ CONFIG_BLK_DEV_THROTTLING=y
+
+- Mount blkio controller
+ mount -t cgroup -o blkio none /cgroup/blkio
+
+- Specify a bandwidth rate on particular device for root group. The format
+ for policy is "<major>:<minor> <byes_per_second>".
+
+ echo "8:16 1048576" > /cgroup/blkio/blkio.read_bps_device
+
+ Above will put a limit of 1MB/second on reads happening for root group
+ on device having major/minor number 8:16.
+
+- Run dd to read a file and see if rate is throttled to 1MB/s or not.
+
+ # dd if=/mnt/common/zerofile of=/dev/null bs=4K count=1024
+ # iflag=direct
+ 1024+0 records in
+ 1024+0 records out
+ 4194304 bytes (4.2 MB) copied, 4.0001 s, 1.0 MB/s
+
+ Limits for writes can be put using blkio.write_bps_device file.
+
Various user visible config options
===================================
CONFIG_BLK_CGROUP
@@ -68,8 +102,13 @@ CONFIG_CFQ_GROUP_IOSCHED
- Enables group scheduling in CFQ. Currently only 1 level of group
creation is allowed.
+CONFIG_BLK_DEV_THROTTLING
+ - Enable block device throttling support in block layer.
+
Details of cgroup files
=======================
+Proportional weight policy files
+--------------------------------
- blkio.weight
- Specifies per cgroup weight. This is default weight of the group
on all the devices until and unless overridden by per device rule.
@@ -210,6 +249,67 @@ Details of cgroup files
and minor number of the device and third field specifies the number
of times a group was dequeued from a particular device.
+Throttling/Upper limit policy files
+-----------------------------------
+- blkio.throttle.read_bps_device
+ - Specifies upper limit on READ rate from the device. IO rate is
+ specified in bytes per second. Rules are per deivce. Following is
+ the format.
+
+ echo "<major>:<minor> <rate_bytes_per_second>" > /cgrp/blkio.read_bps_device
+
+- blkio.throttle.write_bps_device
+ - Specifies upper limit on WRITE rate to the device. IO rate is
+ specified in bytes per second. Rules are per deivce. Following is
+ the format.
+
+ echo "<major>:<minor> <rate_bytes_per_second>" > /cgrp/blkio.write_bps_device
+
+- blkio.throttle.read_iops_device
+ - Specifies upper limit on READ rate from the device. IO rate is
+ specified in IO per second. Rules are per deivce. Following is
+ the format.
+
+ echo "<major>:<minor> <rate_io_per_second>" > /cgrp/blkio.read_iops_device
+
+- blkio.throttle.write_iops_device
+ - Specifies upper limit on WRITE rate to the device. IO rate is
+ specified in io per second. Rules are per deivce. Following is
+ the format.
+
+ echo "<major>:<minor> <rate_io_per_second>" > /cgrp/blkio.write_iops_device
+
+Note: If both BW and IOPS rules are specified for a device, then IO is
+ subjectd to both the constraints.
+
+- blkio.throttle.io_serviced
+ - Number of IOs (bio) completed to/from the disk by the group (as
+ seen by throttling policy). These are further divided by the type
+ of operation - read or write, sync or async. First two fields specify
+ the major and minor number of the device, third field specifies the
+ operation type and the fourth field specifies the number of IOs.
+
+ blkio.io_serviced does accounting as seen by CFQ and counts are in
+ number of requests (struct request). On the other hand,
+ blkio.throttle.io_serviced counts number of IO in terms of number
+ of bios as seen by throttling policy. These bios can later be
+ merged by elevator and total number of requests completed can be
+ lesser.
+
+- blkio.throttle.io_service_bytes
+ - Number of bytes transferred to/from the disk by the group. These
+ are further divided by the type of operation - read or write, sync
+ or async. First two fields specify the major and minor number of the
+ device, third field specifies the operation type and the fourth field
+ specifies the number of bytes.
+
+ These numbers should roughly be same as blkio.io_service_bytes as
+ updated by CFQ. The difference between two is that
+ blkio.io_service_bytes will not be updated if CFQ is not operating
+ on request queue.
+
+Common files among various policies
+-----------------------------------
- blkio.reset_stats
- Writing an int to this file will result in resetting all the stats
for that cgroup.
diff --git a/Documentation/devices.txt b/Documentation/devices.txt
index d0d1df6cb5d..c58abf1ccc7 100644
--- a/Documentation/devices.txt
+++ b/Documentation/devices.txt
@@ -239,6 +239,7 @@ Your cooperation is appreciated.
0 = /dev/tty Current TTY device
1 = /dev/console System console
2 = /dev/ptmx PTY master multiplex
+ 3 = /dev/ttyprintk User messages via printk TTY device
64 = /dev/cua0 Callout device for ttyS0
...
255 = /dev/cua191 Callout device for ttyS191
@@ -2553,7 +2554,10 @@ Your cooperation is appreciated.
175 = /dev/usb/legousbtower15 16th USB Legotower device
176 = /dev/usb/usbtmc1 First USB TMC device
...
- 192 = /dev/usb/usbtmc16 16th USB TMC device
+ 191 = /dev/usb/usbtmc16 16th USB TMC device
+ 192 = /dev/usb/yurex1 First USB Yurex device
+ ...
+ 209 = /dev/usb/yurex16 16th USB Yurex device
240 = /dev/usb/dabusb0 First daubusb device
...
243 = /dev/usb/dabusb3 Fourth dabusb device
diff --git a/Documentation/dynamic-debug-howto.txt b/Documentation/dynamic-debug-howto.txt
index 674c5663d34..58ea64a9616 100644
--- a/Documentation/dynamic-debug-howto.txt
+++ b/Documentation/dynamic-debug-howto.txt
@@ -24,7 +24,7 @@ Dynamic debug has even more useful features:
read to display the complete list of known debug statements, to help guide you
Controlling dynamic debug Behaviour
-===============================
+===================================
The behaviour of pr_debug()/dev_debug()s are controlled via writing to a
control file in the 'debugfs' filesystem. Thus, you must first mount the debugfs
@@ -212,6 +212,26 @@ Note the regexp ^[-+=][scp]+$ matches a flags specification.
Note also that there is no convenient syntax to remove all
the flags at once, you need to use "-psc".
+
+Debug messages during boot process
+==================================
+
+To be able to activate debug messages during the boot process,
+even before userspace and debugfs exists, use the boot parameter:
+ddebug_query="QUERY"
+
+QUERY follows the syntax described above, but must not exceed 1023
+characters. The enablement of debug messages is done as an arch_initcall.
+Thus you can enable debug messages in all code processed after this
+arch_initcall via this boot parameter.
+On an x86 system for example ACPI enablement is a subsys_initcall and
+ddebug_query="file ec.c +p"
+will show early Embedded Controller transactions during ACPI setup if
+your machine (typically a laptop) has an Embedded Controller.
+PCI (or other devices) initialization also is a hot candidate for using
+this boot parameter for debugging purposes.
+
+
Examples
========
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index a6aca874088..98223a67694 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1075,6 +1075,7 @@ Table 1-11: Files in /proc/tty
drivers list of drivers and their usage
ldiscs registered line disciplines
driver/serial usage statistic and status of single tty lines
+ consoles registered system console lines
..............................................................................
To see which tty's are currently in use, you can simply look into the file
@@ -1093,6 +1094,37 @@ To see which tty's are currently in use, you can simply look into the file
/dev/tty /dev/tty 5 0 system:/dev/tty
unknown /dev/tty 4 1-63 console
+To see which character device lines are currently used for the system console
+/dev/console, you may simply look into the file /proc/tty/consoles:
+
+ > cat /proc/tty/consoles
+ tty0 -WU (ECp) 4:7
+ ttyS0 -W- (Ep) 4:64
+
+The columns are:
+
+ device name of the device
+ operations R = can do read operations
+ W = can do write operations
+ U = can do unblank
+ flags E = it is enabled
+ C = it is prefered console
+ B = it is primary boot console
+ p = it is used for printk buffer
+ b = it is not a TTY but a Braille device
+ a = it is safe to use when cpu is offline
+ * = it is standard input of the reading process
+ major:minor major and minor number of the device separated by a colon
+
+If the reading process holds /dev/console open at the regular standard input
+stream the active device will be marked by an asterisk:
+
+ > cat /proc/tty/consoles < /dev/console
+ tty0 -WU (ECp*) 4:7
+ ttyS0 -W- (Ep) 4:64
+ > tty
+ /dev/pts/3
+
1.8 Miscellaneous kernel statistics in /proc/stat
-------------------------------------------------
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 02f21d9220c..4cd8b86e00e 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -43,10 +43,11 @@ parameter is applicable:
AVR32 AVR32 architecture is enabled.
AX25 Appropriate AX.25 support is enabled.
BLACKFIN Blackfin architecture is enabled.
- DRM Direct Rendering Management support is enabled.
EDD BIOS Enhanced Disk Drive Services (EDD) is enabled
EFI EFI Partitioning (GPT) is enabled
EIDE EIDE/ATAPI support is enabled.
+ DRM Direct Rendering Management support is enabled.
+ DYNAMIC_DEBUG Build in debug messages and enable them at runtime
FB The frame buffer device is enabled.
GCOV GCOV profiling is enabled.
HW Appropriate hardware is enabled.
@@ -570,6 +571,10 @@ and is between 256 and 4096 characters. It is defined in the file
Format: <port#>,<type>
See also Documentation/input/joystick-parport.txt
+ ddebug_query= [KNL,DYNAMIC_DEBUG] Enable debug messages at early boot
+ time. See Documentation/dynamic-debug-howto.txt for
+ details.
+
debug [KNL] Enable kernel debugging (events log level).
debug_locks_verbose=
@@ -2370,6 +2375,15 @@ and is between 256 and 4096 characters. It is defined in the file
switches= [HW,M68k]
+ sysfs.deprecated=0|1 [KNL]
+ Enable/disable old style sysfs layout for old udev
+ on older distributions. When this option is enabled
+ very new udev will not work anymore. When this option
+ is disabled (or CONFIG_SYSFS_DEPRECATED not compiled)
+ in older udev will not work anymore.
+ Default depends on CONFIG_SYSFS_DEPRECATED_V2 set in
+ the kernel configuration.
+
sysrq_always_enabled
[KNL]
Ignore sysrq setting - this boot parameter will
diff --git a/Documentation/lguest/lguest.c b/Documentation/lguest/lguest.c
index 8a6a8c6d498..dc73bc54cc4 100644
--- a/Documentation/lguest/lguest.c
+++ b/Documentation/lguest/lguest.c
@@ -1640,15 +1640,6 @@ static void blk_request(struct virtqueue *vq)
off = out->sector * 512;
/*
- * The block device implements "barriers", where the Guest indicates
- * that it wants all previous writes to occur before this write. We
- * don't have a way of asking our kernel to do a barrier, so we just
- * synchronize all the data in the file. Pretty poor, no?
- */
- if (out->type & VIRTIO_BLK_T_BARRIER)
- fdatasync(vblk->fd);
-
- /*
* In general the virtio block driver is allowed to try SCSI commands.
* It'd be nice if we supported eject, for example, but we don't.
*/
@@ -1680,6 +1671,13 @@ static void blk_request(struct virtqueue *vq)
/* Die, bad Guest, die. */
errx(1, "Write past end %llu+%u", off, ret);
}
+
+ wlen = sizeof(*in);
+ *in = (ret >= 0 ? VIRTIO_BLK_S_OK : VIRTIO_BLK_S_IOERR);
+ } else if (out->type & VIRTIO_BLK_T_FLUSH) {
+ /* Flush */
+ ret = fdatasync(vblk->fd);
+ verbose("FLUSH fdatasync: %i\n", ret);
wlen = sizeof(*in);
*in = (ret >= 0 ? VIRTIO_BLK_S_OK : VIRTIO_BLK_S_IOERR);
} else {
@@ -1703,15 +1701,6 @@ static void blk_request(struct virtqueue *vq)
}
}
- /*
- * OK, so we noted that it was pretty poor to use an fdatasync as a
- * barrier. But Christoph Hellwig points out that we need a sync
- * *afterwards* as well: "Barriers specify no reordering to the front
- * or the back." And Jens Axboe confirmed it, so here we are:
- */
- if (out->type & VIRTIO_BLK_T_BARRIER)
- fdatasync(vblk->fd);
-
/* Finished that request. */
add_used(vq, head, wlen);
}
@@ -1736,8 +1725,8 @@ static void setup_block_file(const char *filename)
vblk->fd = open_or_die(filename, O_RDWR|O_LARGEFILE);
vblk->len = lseek64(vblk->fd, 0, SEEK_END);
- /* We support barriers. */
- add_feature(dev, VIRTIO_BLK_F_BARRIER);
+ /* We support FLUSH. */
+ add_feature(dev, VIRTIO_BLK_F_FLUSH);
/* Tell Guest how many sectors this device has. */
conf.capacity = cpu_to_le64(vblk->len / 512);
diff --git a/Documentation/powerpc/dts-bindings/fsl/usb.txt b/Documentation/powerpc/dts-bindings/fsl/usb.txt
index b0015240269..bd5723f0b67 100644
--- a/Documentation/powerpc/dts-bindings/fsl/usb.txt
+++ b/Documentation/powerpc/dts-bindings/fsl/usb.txt
@@ -8,6 +8,7 @@ and additions :
Required properties :
- compatible : Should be "fsl-usb2-mph" for multi port host USB
controllers, or "fsl-usb2-dr" for dual role USB controllers
+ or "fsl,mpc5121-usb2-dr" for dual role USB controllers of MPC5121
- phy_type : For multi port host USB controllers, should be one of
"ulpi", or "serial". For dual role USB controllers, should be
one of "ulpi", "utmi", "utmi_wide", or "serial".
@@ -33,6 +34,12 @@ Recommended properties :
- interrupt-parent : the phandle for the interrupt controller that
services interrupts for this device.
+Optional properties :
+ - fsl,invert-drvvbus : boolean; for MPC5121 USB0 only. Indicates the
+ port power polarity of internal PHY signal DRVVBUS is inverted.
+ - fsl,invert-pwr-fault : boolean; for MPC5121 USB0 only. Indicates
+ the PWR_FAULT signal polarity is inverted.
+
Example multi port host USB controller device node :
usb@22000 {
compatible = "fsl-usb2-mph";
@@ -57,3 +64,18 @@ Example dual role USB controller device node :
dr_mode = "otg";
phy = "ulpi";
};
+
+Example dual role USB controller device node for MPC5121ADS:
+
+ usb@4000 {
+ compatible = "fsl,mpc5121-usb2-dr";
+ reg = <0x4000 0x1000>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+ interrupt-parent = < &ipic >;
+ interrupts = <44 0x8>;
+ dr_mode = "otg";
+ phy_type = "utmi_wide";
+ fsl,invert-drvvbus;
+ fsl,invert-pwr-fault;
+ };
diff --git a/Documentation/scsi/st.txt b/Documentation/scsi/st.txt
index 40752602c05..691ca292c24 100644
--- a/Documentation/scsi/st.txt
+++ b/Documentation/scsi/st.txt
@@ -2,7 +2,7 @@ This file contains brief information about the SCSI tape driver.
The driver is currently maintained by Kai Mäkisara (email
Kai.Makisara@kolumbus.fi)
-Last modified: Sun Feb 24 21:59:07 2008 by kai.makisara
+Last modified: Sun Aug 29 18:25:47 2010 by kai.makisara
BASICS
@@ -85,6 +85,17 @@ writing and the last operation has been a write. Two filemarks can be
optionally written. In both cases end of data is signified by
returning zero bytes for two consecutive reads.
+Writing filemarks without the immediate bit set in the SCSI command block acts
+as a synchronization point, i.e., all remaining data form the drive buffers is
+written to tape before the command returns. This makes sure that write errors
+are caught at that point, but this takes time. In some applications, several
+consecutive files must be written fast. The MTWEOFI operation can be used to
+write the filemarks without flushing the drive buffer. Writing filemark at
+close() is always flushing the drive buffers. However, if the previous
+operation is MTWEOFI, close() does not write a filemark. This can be used if
+the program wants to close/open the tape device between files and wants to
+skip waiting.
+
If rewind, offline, bsf, or seek is done and previous tape operation was
write, a filemark is written before moving tape.
@@ -301,6 +312,8 @@ MTBSR Space backward over count records.
MTFSS Space forward over count setmarks.
MTBSS Space backward over count setmarks.
MTWEOF Write count filemarks.
+MTWEOFI Write count filemarks with immediate bit set (i.e., does not
+ wait until data is on tape)
MTWSM Write count setmarks.
MTREW Rewind tape.
MTOFFL Set device off line (often rewind plus eject).
diff --git a/Documentation/usb/proc_usb_info.txt b/Documentation/usb/proc_usb_info.txt
index fafcd472326..afe596d5f20 100644
--- a/Documentation/usb/proc_usb_info.txt
+++ b/Documentation/usb/proc_usb_info.txt
@@ -1,12 +1,17 @@
/proc/bus/usb filesystem output
===============================
-(version 2003.05.30)
+(version 2010.09.13)
The usbfs filesystem for USB devices is traditionally mounted at
/proc/bus/usb. It provides the /proc/bus/usb/devices file, as well as
the /proc/bus/usb/BBB/DDD files.
+In many modern systems the usbfs filsystem isn't used at all. Instead
+USB device nodes are created under /dev/usb/ or someplace similar. The
+"devices" file is available in debugfs, typically as
+/sys/kernel/debug/usb/devices.
+
**NOTE**: If /proc/bus/usb appears empty, and a host controller
driver has been linked, then you need to mount the
@@ -106,8 +111,8 @@ Legend:
Topology info:
-T: Bus=dd Lev=dd Prnt=dd Port=dd Cnt=dd Dev#=ddd Spd=ddd MxCh=dd
-| | | | | | | | |__MaxChildren
+T: Bus=dd Lev=dd Prnt=dd Port=dd Cnt=dd Dev#=ddd Spd=dddd MxCh=dd
+| | | | | | | | |__MaxChildren
| | | | | | | |__Device Speed in Mbps
| | | | | | |__DeviceNumber
| | | | | |__Count of devices at this level
@@ -120,8 +125,13 @@ T: Bus=dd Lev=dd Prnt=dd Port=dd Cnt=dd Dev#=ddd Spd=ddd MxCh=dd
Speed may be:
1.5 Mbit/s for low speed USB
12 Mbit/s for full speed USB
- 480 Mbit/s for high speed USB (added for USB 2.0)
+ 480 Mbit/s for high speed USB (added for USB 2.0);
+ also used for Wireless USB, which has no fixed speed
+ 5000 Mbit/s for SuperSpeed USB (added for USB 3.0)
+ For reasons lost in the mists of time, the Port number is always
+ too low by 1. For example, a device plugged into port 4 will
+ show up with "Port=03".
Bandwidth info:
B: Alloc=ddd/ddd us (xx%), #Int=ddd, #Iso=ddd
@@ -291,7 +301,7 @@ Here's an example, from a system which has a UHCI root hub,
an external hub connected to the root hub, and a mouse and
a serial converter connected to the external hub.
-T: Bus=00 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2
+T: Bus=00 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2
B: Alloc= 28/900 us ( 3%), #Int= 2, #Iso= 0
D: Ver= 1.00 Cls=09(hub ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1
P: Vendor=0000 ProdID=0000 Rev= 0.00
@@ -301,21 +311,21 @@ C:* #Ifs= 1 Cfg#= 1 Atr=40 MxPwr= 0mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub
E: Ad=81(I) Atr=03(Int.) MxPS= 8 Ivl=255ms
-T: Bus=00 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 4
+T: Bus=00 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 4
D: Ver= 1.00 Cls=09(hub ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1
P: Vendor=0451 ProdID=1446 Rev= 1.00
C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=100mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub
E: Ad=81(I) Atr=03(Int.) MxPS= 1 Ivl=255ms
-T: Bus=00 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#= 3 Spd=1.5 MxCh= 0
+T: Bus=00 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#= 3 Spd=1.5 MxCh= 0
D: Ver= 1.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1
P: Vendor=04b4 ProdID=0001 Rev= 0.00
C:* #Ifs= 1 Cfg#= 1 Atr=80 MxPwr=100mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=01 Prot=02 Driver=mouse
E: Ad=81(I) Atr=03(Int.) MxPS= 3 Ivl= 10ms
-T: Bus=00 Lev=02 Prnt=02 Port=02 Cnt=02 Dev#= 4 Spd=12 MxCh= 0
+T: Bus=00 Lev=02 Prnt=02 Port=02 Cnt=02 Dev#= 4 Spd=12 MxCh= 0
D: Ver= 1.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1
P: Vendor=0565 ProdID=0001 Rev= 1.08
S: Manufacturer=Peracom Networks, Inc.
@@ -330,12 +340,12 @@ E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl= 8ms
Selecting only the "T:" and "I:" lines from this (for example, by using
"procusb ti"), we have:
-T: Bus=00 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2
-T: Bus=00 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 4
+T: Bus=00 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2
+T: Bus=00 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 4
I: If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub
-T: Bus=00 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#= 3 Spd=1.5 MxCh= 0
+T: Bus=00 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#= 3 Spd=1.5 MxCh= 0
I: If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=01 Prot=02 Driver=mouse
-T: Bus=00 Lev=02 Prnt=02 Port=02 Cnt=02 Dev#= 4 Spd=12 MxCh= 0
+T: Bus=00 Lev=02 Prnt=02 Port=02 Cnt=02 Dev#= 4 Spd=12 MxCh= 0
I: If#= 0 Alt= 0 #EPs= 3 Cls=00(>ifc ) Sub=00 Prot=00 Driver=serial
diff --git a/Documentation/workqueue.txt b/Documentation/workqueue.txt
index e4498a2872c..996a27d9b8d 100644
--- a/Documentation/workqueue.txt
+++ b/Documentation/workqueue.txt
@@ -196,11 +196,11 @@ resources, scheduled and executed.
suspend operations. Work items on the wq are drained and no
new work item starts execution until thawed.
- WQ_RESCUER
+ WQ_MEM_RECLAIM
All wq which might be used in the memory reclaim paths _MUST_
- have this flag set. This reserves one worker exclusively for
- the execution of this wq under memory pressure.
+ have this flag set. The wq is guaranteed to have at least one
+ execution context regardless of memory pressure.
WQ_HIGHPRI
@@ -356,11 +356,11 @@ If q1 has WQ_CPU_INTENSIVE set,
6. Guidelines
-* Do not forget to use WQ_RESCUER if a wq may process work items which
- are used during memory reclaim. Each wq with WQ_RESCUER set has one
- rescuer thread reserved for it. If there is dependency among
- multiple work items used during memory reclaim, they should be
- queued to separate wq each with WQ_RESCUER.
+* Do not forget to use WQ_MEM_RECLAIM if a wq may process work items
+ which are used during memory reclaim. Each wq with WQ_MEM_RECLAIM
+ set has an execution context reserved for it. If there is
+ dependency among multiple work items used during memory reclaim,
+ they should be queued to separate wq each with WQ_MEM_RECLAIM.
* Unless strict ordering is required, there is no need to use ST wq.
@@ -368,12 +368,13 @@ If q1 has WQ_CPU_INTENSIVE set,
recommended. In most use cases, concurrency level usually stays
well under the default limit.
-* A wq serves as a domain for forward progress guarantee (WQ_RESCUER),
- flush and work item attributes. Work items which are not involved
- in memory reclaim and don't need to be flushed as a part of a group
- of work items, and don't require any special attribute, can use one
- of the system wq. There is no difference in execution
- characteristics between using a dedicated wq and a system wq.
+* A wq serves as a domain for forward progress guarantee
+ (WQ_MEM_RECLAIM, flush and work item attributes. Work items which
+ are not involved in memory reclaim and don't need to be flushed as a
+ part of a group of work items, and don't require any special
+ attribute, can use one of the system wq. There is no difference in
+ execution characteristics between using a dedicated wq and a system
+ wq.
* Unless work items are expected to consume a huge amount of CPU
cycles, using a bound wq is usually beneficial due to the increased