diff options
Diffstat (limited to 'Documentation')
28 files changed, 534 insertions, 166 deletions
diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci index 34f51100f02..dff1f48d252 100644 --- a/Documentation/ABI/testing/sysfs-bus-pci +++ b/Documentation/ABI/testing/sysfs-bus-pci @@ -210,3 +210,15 @@ Users: firmware assigned instance number of the PCI device that can help in understanding the firmware intended order of the PCI device. + +What: /sys/bus/pci/devices/.../d3cold_allowed +Date: July 2012 +Contact: Huang Ying <ying.huang@intel.com> +Description: + d3cold_allowed is bit to control whether the corresponding PCI + device can be put into D3Cold state. If it is cleared, the + device will never be put into D3Cold state. If it is set, the + device may be put into D3Cold state if other requirements are + satisfied too. Reading this attribute will show the current + value of d3cold_allowed bit. Writing this attribute will set + the value of d3cold_allowed bit. diff --git a/Documentation/ABI/testing/sysfs-class-regulator b/Documentation/ABI/testing/sysfs-class-regulator index e091fa87379..bc578bc6062 100644 --- a/Documentation/ABI/testing/sysfs-class-regulator +++ b/Documentation/ABI/testing/sysfs-class-regulator @@ -349,3 +349,24 @@ Description: This will be one of the same strings reported by the "state" attribute. + +What: /sys/class/regulator/.../bypass +Date: September 2012 +KernelVersion: 3.7 +Contact: Mark Brown <broonie@opensource.wolfsonmicro.com> +Description: + Some regulator directories will contain a field called + bypass. This indicates if the device is in bypass mode. + + This will be one of the following strings: + + 'enabled' + 'disabled' + 'unknown' + + 'enabled' means the regulator is in bypass mode. + + 'disabled' means that the regulator is regulating. + + 'unknown' means software cannot determine the state, or + the reported state is invalid. diff --git a/Documentation/ABI/testing/sysfs-driver-wacom b/Documentation/ABI/testing/sysfs-driver-wacom index 8d55a83d692..7fc781048b7 100644 --- a/Documentation/ABI/testing/sysfs-driver-wacom +++ b/Documentation/ABI/testing/sysfs-driver-wacom @@ -1,3 +1,16 @@ +WWhat: /sys/class/hidraw/hidraw*/device/oled*_img +Date: June 2012 +Contact: linux-bluetooth@vger.kernel.org +Description: + The /sys/class/hidraw/hidraw*/device/oled*_img files control + OLED mocro displays on Intuos4 Wireless tablet. Accepted image + has to contain 256 bytes (64x32 px 1 bit colour). The format + is the same as PBM image 62x32px without header (64 bits per + horizontal line, 32 lines). An example of setting OLED No. 0: + dd bs=256 count=1 if=img_file of=[path to oled0_img]/oled0_img + The attribute is read only and no local copy of the image is + stored. + What: /sys/class/hidraw/hidraw*/device/speed Date: April 2010 Kernel Version: 2.6.35 diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt index fc103d7a047..cdb20d41a44 100644 --- a/Documentation/RCU/checklist.txt +++ b/Documentation/RCU/checklist.txt @@ -310,6 +310,12 @@ over a rather long period of time, but improvements are always welcome! code under the influence of preempt_disable(), you instead need to use synchronize_irq() or synchronize_sched(). + This same limitation also applies to synchronize_rcu_bh() + and synchronize_srcu(), as well as to the asynchronous and + expedited forms of the three primitives, namely call_rcu(), + call_rcu_bh(), call_srcu(), synchronize_rcu_expedited(), + synchronize_rcu_bh_expedited(), and synchronize_srcu_expedited(). + 12. Any lock acquired by an RCU callback must be acquired elsewhere with softirq disabled, e.g., via spin_lock_irqsave(), spin_lock_bh(), etc. Failing to disable irq on a given diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt index 523364e4e1f..1927151b386 100644 --- a/Documentation/RCU/stallwarn.txt +++ b/Documentation/RCU/stallwarn.txt @@ -99,7 +99,7 @@ In kernels with CONFIG_RCU_FAST_NO_HZ, even more information is printed: INFO: rcu_preempt detected stall on CPU - 0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 drain=0 . timer=-1 + 0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 drain=0 . timer not pending (t=65000 jiffies) The "(64628 ticks this GP)" indicates that this CPU has taken more @@ -116,13 +116,13 @@ number between the two "/"s is the value of the nesting, which will be a small positive number if in the idle loop and a very large positive number (as shown above) otherwise. -For CONFIG_RCU_FAST_NO_HZ kernels, the "drain=0" indicates that the -CPU is not in the process of trying to force itself into dyntick-idle -state, the "." indicates that the CPU has not given up forcing RCU -into dyntick-idle mode (it would be "H" otherwise), and the "timer=-1" -indicates that the CPU has not recented forced RCU into dyntick-idle -mode (it would otherwise indicate the number of microseconds remaining -in this forced state). +For CONFIG_RCU_FAST_NO_HZ kernels, the "drain=0" indicates that the CPU is +not in the process of trying to force itself into dyntick-idle state, the +"." indicates that the CPU has not given up forcing RCU into dyntick-idle +mode (it would be "H" otherwise), and the "timer not pending" indicates +that the CPU has not recently forced RCU into dyntick-idle mode (it +would otherwise indicate the number of microseconds remaining in this +forced state). Multiple Warnings From One Stall diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt index f6f15ce3990..672d1908325 100644 --- a/Documentation/RCU/trace.txt +++ b/Documentation/RCU/trace.txt @@ -333,23 +333,23 @@ o Each element of the form "1/1 0:127 ^0" represents one struct The output of "cat rcu/rcu_pending" looks as follows: rcu_sched: - 0 np=255892 qsp=53936 rpq=85 cbr=0 cng=14417 gpc=10033 gps=24320 nf=6445 nn=146741 - 1 np=261224 qsp=54638 rpq=33 cbr=0 cng=25723 gpc=16310 gps=2849 nf=5912 nn=155792 - 2 np=237496 qsp=49664 rpq=23 cbr=0 cng=2762 gpc=45478 gps=1762 nf=1201 nn=136629 - 3 np=236249 qsp=48766 rpq=98 cbr=0 cng=286 gpc=48049 gps=1218 nf=207 nn=137723 - 4 np=221310 qsp=46850 rpq=7 cbr=0 cng=26 gpc=43161 gps=4634 nf=3529 nn=123110 - 5 np=237332 qsp=48449 rpq=9 cbr=0 cng=54 gpc=47920 gps=3252 nf=201 nn=137456 - 6 np=219995 qsp=46718 rpq=12 cbr=0 cng=50 gpc=42098 gps=6093 nf=4202 nn=120834 - 7 np=249893 qsp=49390 rpq=42 cbr=0 cng=72 gpc=38400 gps=17102 nf=41 nn=144888 + 0 np=255892 qsp=53936 rpq=85 cbr=0 cng=14417 gpc=10033 gps=24320 nn=146741 + 1 np=261224 qsp=54638 rpq=33 cbr=0 cng=25723 gpc=16310 gps=2849 nn=155792 + 2 np=237496 qsp=49664 rpq=23 cbr=0 cng=2762 gpc=45478 gps=1762 nn=136629 + 3 np=236249 qsp=48766 rpq=98 cbr=0 cng=286 gpc=48049 gps=1218 nn=137723 + 4 np=221310 qsp=46850 rpq=7 cbr=0 cng=26 gpc=43161 gps=4634 nn=123110 + 5 np=237332 qsp=48449 rpq=9 cbr=0 cng=54 gpc=47920 gps=3252 nn=137456 + 6 np=219995 qsp=46718 rpq=12 cbr=0 cng=50 gpc=42098 gps=6093 nn=120834 + 7 np=249893 qsp=49390 rpq=42 cbr=0 cng=72 gpc=38400 gps=17102 nn=144888 rcu_bh: - 0 np=146741 qsp=1419 rpq=6 cbr=0 cng=6 gpc=0 gps=0 nf=2 nn=145314 - 1 np=155792 qsp=12597 rpq=3 cbr=0 cng=0 gpc=4 gps=8 nf=3 nn=143180 - 2 np=136629 qsp=18680 rpq=1 cbr=0 cng=0 gpc=7 gps=6 nf=0 nn=117936 - 3 np=137723 qsp=2843 rpq=0 cbr=0 cng=0 gpc=10 gps=7 nf=0 nn=134863 - 4 np=123110 qsp=12433 rpq=0 cbr=0 cng=0 gpc=4 gps=2 nf=0 nn=110671 - 5 np=137456 qsp=4210 rpq=1 cbr=0 cng=0 gpc=6 gps=5 nf=0 nn=133235 - 6 np=120834 qsp=9902 rpq=2 cbr=0 cng=0 gpc=6 gps=3 nf=2 nn=110921 - 7 np=144888 qsp=26336 rpq=0 cbr=0 cng=0 gpc=8 gps=2 nf=0 nn=118542 + 0 np=146741 qsp=1419 rpq=6 cbr=0 cng=6 gpc=0 gps=0 nn=145314 + 1 np=155792 qsp=12597 rpq=3 cbr=0 cng=0 gpc=4 gps=8 nn=143180 + 2 np=136629 qsp=18680 rpq=1 cbr=0 cng=0 gpc=7 gps=6 nn=117936 + 3 np=137723 qsp=2843 rpq=0 cbr=0 cng=0 gpc=10 gps=7 nn=134863 + 4 np=123110 qsp=12433 rpq=0 cbr=0 cng=0 gpc=4 gps=2 nn=110671 + 5 np=137456 qsp=4210 rpq=1 cbr=0 cng=0 gpc=6 gps=5 nn=133235 + 6 np=120834 qsp=9902 rpq=2 cbr=0 cng=0 gpc=6 gps=3 nn=110921 + 7 np=144888 qsp=26336 rpq=0 cbr=0 cng=0 gpc=8 gps=2 nn=118542 As always, this is once again split into "rcu_sched" and "rcu_bh" portions, with CONFIG_TREE_PREEMPT_RCU kernels having an additional @@ -377,17 +377,6 @@ o "gpc" is the number of times that an old grace period had o "gps" is the number of times that a new grace period had started, but this CPU was not yet aware of it. -o "nf" is the number of times that this CPU suspected that the - current grace period had run for too long, and thus needed to - be forced. - - Please note that "forcing" consists of sending resched IPIs - to holdout CPUs. If that CPU really still is in an old RCU - read-side critical section, then we really do have to wait for it. - The assumption behing "forcing" is that the CPU is not still in - an old RCU read-side critical section, but has not yet responded - for some other reason. - o "nn" is the number of times that this CPU needed nothing. Alert readers will note that the rcu "nn" number for a given CPU very closely matches the rcu_bh "np" number for that same CPU. This diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt index 69ee188515e..bf0f6de2aa0 100644 --- a/Documentation/RCU/whatisRCU.txt +++ b/Documentation/RCU/whatisRCU.txt @@ -873,7 +873,7 @@ d. Do you need to treat NMI handlers, hardirq handlers, and code segments with preemption disabled (whether via preempt_disable(), local_irq_save(), local_bh_disable(), or some other mechanism) as if they were explicit RCU readers? - If so, you need RCU-sched. + If so, RCU-sched is the only choice that will work for you. e. Do you need RCU grace periods to complete even in the face of softirq monopolization of one or more of the CPUs? For @@ -884,7 +884,12 @@ f. Is your workload too update-intensive for normal use of RCU, but inappropriate for other synchronization mechanisms? If so, consider SLAB_DESTROY_BY_RCU. But please be careful! -g. Otherwise, use RCU. +g. Do you need read-side critical sections that are respected + even though they are in the middle of the idle loop, during + user-mode execution, or on an offlined CPU? If so, SRCU is the + only choice that will work for you. + +h. Otherwise, use RCU. Of course, this all assumes that you have determined that RCU is in fact the right tool for your job. diff --git a/Documentation/accounting/getdelays.c b/Documentation/accounting/getdelays.c index f6318f6d7ba..6f706aca204 100644 --- a/Documentation/accounting/getdelays.c +++ b/Documentation/accounting/getdelays.c @@ -98,10 +98,9 @@ static int create_nl_socket(int protocol) if (rcvbufsz) if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbufsz, sizeof(rcvbufsz)) < 0) { - fprintf(stderr, "Unable to set socket rcv buf size " - "to %d\n", + fprintf(stderr, "Unable to set socket rcv buf size to %d\n", rcvbufsz); - return -1; + goto error; } memset(&local, 0, sizeof(local)); diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt new file mode 100644 index 00000000000..9c4d388dadd --- /dev/null +++ b/Documentation/arm64/booting.txt @@ -0,0 +1,152 @@ + Booting AArch64 Linux + ===================== + +Author: Will Deacon <will.deacon@arm.com> +Date : 07 September 2012 + +This document is based on the ARM booting document by Russell King and +is relevant to all public releases of the AArch64 Linux kernel. + +The AArch64 exception model is made up of a number of exception levels +(EL0 - EL3), with EL0 and EL1 having a secure and a non-secure +counterpart. EL2 is the hypervisor level and exists only in non-secure +mode. EL3 is the highest priority level and exists only in secure mode. + +For the purposes of this document, we will use the term `boot loader' +simply to define all software that executes on the CPU(s) before control +is passed to the Linux kernel. This may include secure monitor and +hypervisor code, or it may just be a handful of instructions for +preparing a minimal boot environment. + +Essentially, the boot loader should provide (as a minimum) the +following: + +1. Setup and initialise the RAM +2. Setup the device tree +3. Decompress the kernel image +4. Call the kernel image + + +1. Setup and initialise RAM +--------------------------- + +Requirement: MANDATORY + +The boot loader is expected to find and initialise all RAM that the +kernel will use for volatile data storage in the system. It performs +this in a machine dependent manner. (It may use internal algorithms +to automatically locate and size all RAM, or it may use knowledge of +the RAM in the machine, or any other method the boot loader designer +sees fit.) + + +2. Setup the device tree +------------------------- + +Requirement: MANDATORY + +The device tree blob (dtb) must be no bigger than 2 megabytes in size +and placed at a 2-megabyte boundary within the first 512 megabytes from +the start of the kernel image. This is to allow the kernel to map the +blob using a single section mapping in the initial page tables. + + +3. Decompress the kernel image +------------------------------ + +Requirement: OPTIONAL + +The AArch64 kernel does not currently provide a decompressor and +therefore requires decompression (gzip etc.) to be performed by the boot +loader if a compressed Image target (e.g. Image.gz) is used. For +bootloaders that do not implement this requirement, the uncompressed +Image target is available instead. + + +4. Call the kernel image +------------------------ + +Requirement: MANDATORY + +The decompressed kernel image contains a 32-byte header as follows: + + u32 magic = 0x14000008; /* branch to stext, little-endian */ + u32 res0 = 0; /* reserved */ + u64 text_offset; /* Image load offset */ + u64 res1 = 0; /* reserved */ + u64 res2 = 0; /* reserved */ + +The image must be placed at the specified offset (currently 0x80000) +from the start of the system RAM and called there. The start of the +system RAM must be aligned to 2MB. + +Before jumping into the kernel, the following conditions must be met: + +- Quiesce all DMA capable devices so that memory does not get + corrupted by bogus network packets or disk data. This will save + you many hours of debug. + +- Primary CPU general-purpose register settings + x0 = physical address of device tree blob (dtb) in system RAM. + x1 = 0 (reserved for future use) + x2 = 0 (reserved for future use) + x3 = 0 (reserved for future use) + +- CPU mode + All forms of interrupts must be masked in PSTATE.DAIF (Debug, SError, + IRQ and FIQ). + The CPU must be in either EL2 (RECOMMENDED in order to have access to + the virtualisation extensions) or non-secure EL1. + +- Caches, MMUs + The MMU must be off. + Instruction cache may be on or off. + Data cache must be off and invalidated. + External caches (if present) must be configured and disabled. + +- Architected timers + CNTFRQ must be programmed with the timer frequency. + If entering the kernel at EL1, CNTHCTL_EL2 must have EL1PCTEN (bit 0) + set where available. + +- Coherency + All CPUs to be booted by the kernel must be part of the same coherency + domain on entry to the kernel. This may require IMPLEMENTATION DEFINED + initialisation to enable the receiving of maintenance operations on + each CPU. + +- System registers + All writable architected system registers at the exception level where + the kernel image will be entered must be initialised by software at a + higher exception level to prevent execution in an UNKNOWN state. + +The boot loader is expected to enter the kernel on each CPU in the +following manner: + +- The primary CPU must jump directly to the first instruction of the + kernel image. The device tree blob passed by this CPU must contain + for each CPU node: + + 1. An 'enable-method' property. Currently, the only supported value + for this field is the string "spin-table". + + 2. A 'cpu-release-addr' property identifying a 64-bit, + zero-initialised memory location. + + It is expected that the bootloader will generate these device tree + properties and insert them into the blob prior to kernel entry. + +- Any secondary CPUs must spin outside of the kernel in a reserved area + of memory (communicated to the kernel by a /memreserve/ region in the + device tree) polling their cpu-release-addr location, which must be + contained in the reserved region. A wfe instruction may be inserted + to reduce the overhead of the busy-loop and a sev will be issued by + the primary CPU. When a read of the location pointed to by the + cpu-release-addr returns a non-zero value, the CPU must jump directly + to this value. + +- Secondary CPU general-purpose register settings + x0 = 0 (reserved for future use) + x1 = 0 (reserved for future use) + x2 = 0 (reserved for future use) + x3 = 0 (reserved for future use) diff --git a/Documentation/arm64/memory.txt b/Documentation/arm64/memory.txt new file mode 100644 index 00000000000..dbbdcbba75a --- /dev/null +++ b/Documentation/arm64/memory.txt @@ -0,0 +1,73 @@ + Memory Layout on AArch64 Linux + ============================== + +Author: Catalin Marinas <catalin.marinas@arm.com> +Date : 20 February 2012 + +This document describes the virtual memory layout used by the AArch64 +Linux kernel. The architecture allows up to 4 levels of translation +tables with a 4KB page size and up to 3 levels with a 64KB page size. + +AArch64 Linux uses 3 levels of translation tables with the 4KB page +configuration, allowing 39-bit (512GB) virtual addresses for both user +and kernel. With 64KB pages, only 2 levels of translation tables are +used but the memory layout is the same. + +User addresses have bits 63:39 set to 0 while the kernel addresses have +the same bits set to 1. TTBRx selection is given by bit 63 of the +virtual address. The swapper_pg_dir contains only kernel (global) +mappings while the user pgd contains only user (non-global) mappings. +The swapper_pgd_dir address is written to TTBR1 and never written to +TTBR0. + + +AArch64 Linux memory layout: + +Start End Size Use +----------------------------------------------------------------------- +0000000000000000 0000007fffffffff 512GB user + +ffffff8000000000 ffffffbbfffcffff ~240GB vmalloc + +ffffffbbfffd0000 ffffffbcfffdffff 64KB [guard page] + +ffffffbbfffe0000 ffffffbcfffeffff 64KB PCI I/O space + +ffffffbbffff0000 ffffffbcffffffff 64KB [guard page] + +ffffffbc00000000 ffffffbdffffffff 8GB vmemmap + +ffffffbe00000000 ffffffbffbffffff ~8GB [guard, future vmmemap] + +ffffffbffc000000 ffffffbfffffffff 64MB modules + +ffffffc000000000 ffffffffffffffff 256GB memory + + +Translation table lookup with 4KB pages: + ++--------+--------+--------+--------+--------+--------+--------+--------+ +|63 56|55 48|47 40|39 32|31 24|23 16|15 8|7 0| ++--------+--------+--------+--------+--------+--------+--------+--------+ + | | | | | | + | | | | | v + | | | | | [11:0] in-page offset + | | | | +-> [20:12] L3 index + | | | +-----------> [29:21] L2 index + | | +---------------------> [38:30] L1 index + | +-------------------------------> [47:39] L0 index (not used) + +-------------------------------------------------> [63] TTBR0/1 + + +Translation table lookup with 64KB pages: + ++--------+--------+--------+--------+--------+--------+--------+--------+ +|63 56|55 48|47 40|39 32|31 24|23 16|15 8|7 0| ++--------+--------+--------+--------+--------+--------+--------+--------+ + | | | | | + | | | | v + | | | | [15:0] in-page offset + | | | +----------> [28:16] L3 index + | | +--------------------------> [41:29] L2 index (only 38:29 used) + | +-------------------------------> [47:42] L1 index (not used) + +-------------------------------------------------> [63] TTBR0/1 diff --git a/Documentation/block/00-INDEX b/Documentation/block/00-INDEX index d111e3b23db..d18ecd827c4 100644 --- a/Documentation/block/00-INDEX +++ b/Documentation/block/00-INDEX @@ -3,15 +3,21 @@ biodoc.txt - Notes on the Generic Block Layer Rewrite in Linux 2.5 capability.txt - - Generic Block Device Capability (/sys/block/<disk>/capability) + - Generic Block Device Capability (/sys/block/<device>/capability) +cfq-iosched.txt + - CFQ IO scheduler tunables +data-integrity.txt + - Block data integrity deadline-iosched.txt - Deadline IO scheduler tunables ioprio.txt - Block io priorities (in CFQ scheduler) +queue-sysfs.txt + - Queue's sysfs entries request.txt - The members of struct request (in include/linux/blkdev.h) stat.txt - - Block layer statistics in /sys/block/<dev>/stat + - Block layer statistics in /sys/block/<device>/stat switching-sched.txt - Switching I/O schedulers at runtime writeback_cache_control.txt diff --git a/Documentation/block/cfq-iosched.txt b/Documentation/block/cfq-iosched.txt index 6d670f57045..d89b4fe724d 100644 --- a/Documentation/block/cfq-iosched.txt +++ b/Documentation/block/cfq-iosched.txt @@ -1,3 +1,14 @@ +CFQ (Complete Fairness Queueing) +=============================== + +The main aim of CFQ scheduler is to provide a fair allocation of the disk +I/O bandwidth for all the processes which requests an I/O operation. + +CFQ maintains the per process queue for the processes which request I/O +operation(syncronous requests). In case of asynchronous requests, all the +requests from all the processes are batched together according to their +process's I/O priority. + CFQ ioscheduler tunables ======================== @@ -25,6 +36,72 @@ there are multiple spindles behind single LUN (Host based hardware RAID controller or for storage arrays), setting slice_idle=0 might end up in better throughput and acceptable latencies. +back_seek_max +------------- +This specifies, given in Kbytes, the maximum "distance" for backward seeking. +The distance is the amount of space from the current head location to the +sectors that are backward in terms of distance. + +This parameter allows the scheduler to anticipate requests in the "backward" +direction and consider them as being the "next" if they are within this +distance from the current head location. + +back_seek_penalty +----------------- +This parameter is used to compute the cost of backward seeking. If the +backward distance of request is just 1/back_seek_penalty from a "front" +request, then the seeking cost of two requests is considered equivalent. + +So scheduler will not bias toward one or the other request (otherwise scheduler +will bias toward front request). Default value of back_seek_penalty is 2. + +fifo_expire_async +----------------- +This parameter is used to set the timeout of asynchronous requests. Default +value of this is 248ms. + +fifo_expire_sync +---------------- +This parameter is used to set the timeout of synchronous requests. Default +value of this is 124ms. In case to favor synchronous requests over asynchronous +one, this value should be decreased relative to fifo_expire_async. + +slice_async +----------- +This parameter is same as of slice_sync but for asynchronous queue. The +default value is 40ms. + +slice_async_rq +-------------- +This parameter is used to limit the dispatching of asynchronous request to +device request queue in queue's slice time. The maximum number of request that +are allowed to be dispatched also depends upon the io priority. Default value +for this is 2. + +slice_sync +---------- +When a queue is selected for execution, the queues IO requests are only +executed for a certain amount of time(time_slice) before switching to another +queue. This parameter is used to calculate the time slice of synchronous +queue. + +time_slice is computed using the below equation:- +time_slice = slice_sync + (slice_sync/5 * (4 - prio)). To increase the +time_slice of synchronous queue, increase the value of slice_sync. Default +value is 100ms. + +quantum +------- +This specifies the number of request dispatched to the device queue. In a +queue's time slice, a request will not be dispatched if the number of request +in the device exceeds this parameter. This parameter is used for synchronous +request. + +In case of storage with several disk, this setting can limit the parallel +processing of request. Therefore, increasing the value can imporve the +performace although this can cause the latency of some I/O to increase due +to more number of requests. + CFQ IOPS Mode for group scheduling =================================== Basic CFQ design is to provide priority based time slices. Higher priority diff --git a/Documentation/block/queue-sysfs.txt b/Documentation/block/queue-sysfs.txt index 6518a55273e..e54ac1d5340 100644 --- a/Documentation/block/queue-sysfs.txt +++ b/Documentation/block/queue-sysfs.txt @@ -9,20 +9,71 @@ These files are the ones found in the /sys/block/xxx/queue/ directory. Files denoted with a RO postfix are readonly and the RW postfix means read-write. +add_random (RW) +---------------- +This file allows to trun off the disk entropy contribution. Default +value of this file is '1'(on). + +discard_granularity (RO) +----------------------- +This shows the size of internal allocation of the device in bytes, if +reported by the device. A value of '0' means device does not support +the discard functionality. + +discard_max_bytes (RO) +---------------------- +Devices that support discard functionality may have internal limits on +the number of bytes that can be trimmed or unmapped in a single operation. +The discard_max_bytes parameter is set by the device driver to the maximum +number of bytes that can be discarded in a single operation. Discard +requests issued to the device must not exceed this limit. A discard_max_bytes +value of 0 means that the device does not support discard functionality. + +discard_zeroes_data (RO) +------------------------ +When read, this file will show if the discarded block are zeroed by the +device or not. If its value is '1' the blocks are zeroed otherwise not. + hw_sector_size (RO) ------------------- This is the hardware sector size of the device, in bytes. +iostats (RW) +------------- +This file is used to control (on/off) the iostats accounting of the +disk. + +logical_block_size (RO) +----------------------- +This is the logcal block size of the device, in bytes. + max_hw_sectors_kb (RO) ---------------------- This is the maximum number of kilobytes supported in a single data transfer. +max_integrity_segments (RO) +--------------------------- +When read, this file shows the max limit of integrity segments as +set by block layer which a hardware controller can handle. + max_sectors_kb (RW) ------------------- This is the maximum number of kilobytes that the block layer will allow for a filesystem request. Must be smaller than or equal to the maximum size allowed by the hardware. +max_segments (RO) +----------------- +Maximum number of segments of the device. + +max_segment_size (RO) +--------------------- +Maximum segment size of the device. + +minimum_io_size (RO) +-------------------- +This is the smallest preferred io size reported by the device. + nomerges (RW) ------------- This enables the user to disable the lookup logic involved with IO @@ -45,11 +96,24 @@ per-block-cgroup request pool. IOW, if there are N block cgroups, each request queue may have upto N request pools, each independently regulated by nr_requests. +optimal_io_size (RO) +-------------------- +This is the optimal io size reported by the device. + +physical_block_size (RO) +------------------------ +This is the physical block size of device, in bytes. + read_ahead_kb (RW) ------------------ Maximum number of kilobytes to read-ahead for filesystems on this block device. +rotational (RW) +--------------- +This file is used to stat if the device is of rotational type or +non-rotational type. + rq_affinity (RW) ---------------- If this option is '1', the block layer will migrate request completions to the diff --git a/Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.txt b/Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.txt index 70cd49b1caa..1dd622546d0 100644 --- a/Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.txt +++ b/Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.txt @@ -10,8 +10,8 @@ Required properties: - compatible : Should be "fsl,<chip>-esdhc" Optional properties: -- fsl,cd-internal : Indicate to use controller internal card detection -- fsl,wp-internal : Indicate to use controller internal write protection +- fsl,cd-controller : Indicate to use controller internal card detection +- fsl,wp-controller : Indicate to use controller internal write protection Examples: @@ -19,8 +19,8 @@ esdhc@70004000 { compatible = "fsl,imx51-esdhc"; reg = <0x70004000 0x4000>; interrupts = <1>; - fsl,cd-internal; - fsl,wp-internal; + fsl,cd-controller; + fsl,wp-controller; }; esdhc@70008000 { diff --git a/Documentation/devicetree/bindings/regulator/regulator.txt b/Documentation/devicetree/bindings/regulator/regulator.txt index 66ece3f87bb..ecfc6ccd67e 100644 --- a/Documentation/devicetree/bindings/regulator/regulator.txt +++ b/Documentation/devicetree/bindings/regulator/regulator.txt @@ -11,10 +11,13 @@ Optional properties: - regulator-boot-on: bootloader/firmware enabled regulator - <name>-supply: phandle to the parent supply/regulator node - regulator-ramp-delay: ramp delay for regulator(in uV/uS) + +Deprecated properties: - regulator-compatible: If a regulator chip contains multiple regulators, and if the chip's binding contains a child node that describes each regulator, then this property indicates which regulator - this child node is intended to configure. + this child node is intended to configure. If this property is missing, + the node's name will be used instead. Example: diff --git a/Documentation/devicetree/bindings/regulator/tps65217.txt b/Documentation/devicetree/bindings/regulator/tps65217.txt index 0487e9675ba..d316fb895da 100644 --- a/Documentation/devicetree/bindings/regulator/tps65217.txt +++ b/Documentation/devicetree/bindings/regulator/tps65217.txt @@ -22,66 +22,49 @@ Example: compatible = "ti,tps65217"; regulators { - #address-cells = <1>; - #size-cells = <0>; - - dcdc1_reg: regulator@0 { - reg = <0>; - regulator-compatible = "dcdc1"; + dcdc1_reg: dcdc1 { regulator-min-microvolt = <900000>; regulator-max-microvolt = <1800000>; regulator-boot-on; regulator-always-on; }; - dcdc2_reg: regulator@1 { - reg = <1>; - regulator-compatible = "dcdc2"; + dcdc2_reg: dcdc2 { regulator-min-microvolt = <900000>; regulator-max-microvolt = <3300000>; regulator-boot-on; regulator-always-on; }; - dcdc3_reg: regulator@2 { - reg = <2>; - regulator-compatible = "dcdc3"; + dcdc3_reg: dcc3 { regulator-min-microvolt = <900000>; regulator-max-microvolt = <1500000>; regulator-boot-on; regulator-always-on; }; - ldo1_reg: regulator@3 { - reg = <3>; - regulator-compatible = "ldo1"; + ldo1_reg: ldo1 { regulator-min-microvolt = <1000000>; regulator-max-microvolt = <3300000>; regulator-boot-on; regulator-always-on; }; - ldo2_reg: regulator@4 { - reg = <4>; - regulator-compatible = "ldo2"; + ldo2_reg: ldo2 { regulator-min-microvolt = <900000>; regulator-max-microvolt = <3300000>; regulator-boot-on; regulator-always-on; }; - ldo3_reg: regulator@5 { - reg = <5>; - regulator-compatible = "ldo3"; + ldo3_reg: ldo3 { regulator-min-microvolt = <1800000>; regulator-max-microvolt = <3300000>; regulator-boot-on; regulator-always-on; }; - ldo4_reg: regulator@6 { - reg = <6>; - regulator-compatible = "ldo4"; + ldo4_reg: ldo4 { regulator-min-microvolt = <1800000>; regulator-max-microvolt = <3300000>; regulator-boot-on; diff --git a/Documentation/devicetree/bindings/regulator/tps6586x.txt b/Documentation/devicetree/bindings/regulator/tps6586x.txt index da80c2ae091..07b9ef6e49d 100644 --- a/Documentation/devicetree/bindings/regulator/tps6586x.txt +++ b/Documentation/devicetree/bindings/regulator/tps6586x.txt @@ -6,9 +6,13 @@ Required properties: - interrupts: the interrupt outputs of the controller - #gpio-cells: number of cells to describe a GPIO - gpio-controller: mark the device as a GPIO controller -- regulators: list of regulators provided by this controller, must have - property "regulator-compatible" to match their hardware counterparts: - sm[0-2], ldo[0-9] and ldo_rtc +- regulators: A node that houses a sub-node for each regulator within the + device. Each sub-node is identified using the node's name (or the deprecated + regulator-compatible property if present), with valid values listed below. + The content of each sub-node is defined by the standard binding for + regulators; see regulator.txt. + sys, sm[0-2], ldo[0-9] and ldo_rtc +- sys-supply: The input supply for SYS. - vin-sm0-supply: The input supply for the SM0. - vin-sm1-supply: The input supply for the SM1. - vin-sm2-supply: The input supply for the SM2. @@ -20,6 +24,9 @@ Required properties: Each regulator is defined using the standard binding for regulators. +Note: LDO5 and LDO_RTC is supplied by SYS regulator internally and driver + take care of making proper parent child relationship. + Example: pmu: tps6586x@34 { @@ -30,6 +37,7 @@ Example: #gpio-cells = <2>; gpio-controller; + sys-supply = <&some_reg>; vin-sm0-supply = <&some_reg>; vin-sm1-supply = <&some_reg>; vin-sm2-supply = <&some_reg>; @@ -40,103 +48,80 @@ Example: vinldo9-supply = <...>; regulators { - #address-cells = <1>; - #size-cells = <0>; + sys_reg: sys { + regulator-name = "vdd_sys"; + regulator-boot-on; + regulator-always-on; + }; - sm0_reg: regulator@0 { - reg = <0>; - regulator-compatible = "sm0"; + sm0_reg: sm0 { regulator-min-microvolt = < 725000>; regulator-max-microvolt = <1500000>; regulator-boot-on; regulator-always-on; }; - sm1_reg: regulator@1 { - reg = <1>; - regulator-compatible = "sm1"; + sm1_reg: sm1 { regulator-min-microvolt = < 725000>; regulator-max-microvolt = <1500000>; regulator-boot-on; regulator-always-on; }; - sm2_reg: regulator@2 { - reg = <2>; - regulator-compatible = "sm2"; + sm2_reg: sm2 { regulator-min-microvolt = <3000000>; regulator-max-microvolt = <4550000>; regulator-boot-on; regulator-always-on; }; - ldo0_reg: regulator@3 { - reg = <3>; - regulator-compatible = "ldo0"; + ldo0_reg: ldo0 { regulator-name = "PCIE CLK"; regulator-min-microvolt = <3300000>; regulator-max-microvolt = <3300000>; }; - ldo1_reg: regulator@4 { - reg = <4>; - regulator-compatible = "ldo1"; + ldo1_reg: ldo1 { regulator-min-microvolt = < 725000>; regulator-max-microvolt = <1500000>; }; - ldo2_reg: regulator@5 { - reg = <5>; - regulator-compatible = "ldo2"; + ldo2_reg: ldo2 { regulator-min-microvolt = < 725000>; regulator-max-microvolt = <1500000>; }; - ldo3_reg: regulator@6 { - reg = <6>; - regulator-compatible = "ldo3"; + ldo3_reg: ldo3 { regulator-min-microvolt = <1250000>; regulator-max-microvolt = <3300000>; }; - ldo4_reg: regulator@7 { - reg = <7>; - regulator-compatible = "ldo4"; + ldo4_reg: ldo4 { regulator-min-microvolt = <1700000>; regulator-max-microvolt = <2475000>; }; - ldo5_reg: regulator@8 { - reg = <8>; - regulator-compatible = "ldo5"; + ldo5_reg: ldo5 { regulator-min-microvolt = <1250000>; regulator-max-microvolt = <3300000>; }; - ldo6_reg: regulator@9 { - reg = <9>; - regulator-compatible = "ldo6"; + ldo6_reg: ldo6 { regulator-min-microvolt = <1250000>; regulator-max-microvolt = <3300000>; }; - ldo7_reg: regulator@10 { - reg = <10>; - regulator-compatible = "ldo7"; + ldo7_reg: ldo7 { regulator-min-microvolt = <1250000>; regulator-max-microvolt = <3300000>; }; - ldo8_reg: regulator@11 { - reg = <11>; - regulator-compatible = "ldo8"; + ldo8_reg: ldo8 { regulator-min-microvolt = <1250000>; regulator-max-microvolt = <3300000>; }; - ldo9_reg: regulator@12 { - reg = <12>; - regulator-compatible = "ldo9"; + ldo9_reg: ldo9 { regulator-min-microvolt = <1250000>; regulator-max-microvolt = <3300000>; }; diff --git a/Documentation/dontdiff b/Documentation/dontdiff index 39462cf35cd..74c25c8d888 100644 --- a/Documentation/dontdiff +++ b/Documentation/dontdiff @@ -162,7 +162,6 @@ mach-types.h machtypes.h map map_hugetlb -maui_boot.h media mconf miboot* diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index afaff312bf4..bc55d38081d 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -253,38 +253,6 @@ Who: Dave Jones <davej@redhat.com>, Matthew Garrett <mjg@redhat.com> ----------------------------- -What: fakephp and associated sysfs files in /sys/bus/pci/slots/ -When: 2011 -Why: In 2.6.27, the semantics of /sys/bus/pci/slots was redefined to - represent a machine's physical PCI slots. The change in semantics - had userspace implications, as the hotplug core no longer allowed - drivers to create multiple sysfs files per physical slot (required - for multi-function devices, e.g.). fakephp was seen as a developer's - tool only, and its interface changed. Too late, we learned that - there were some users of the fakephp interface. - - In 2.6.30, the original fakephp interface was restored. At the same - time, the PCI core gained the ability that fakephp provided, namely - function-level hot-remove and hot-add. - - Since the PCI core now provides the same functionality, exposed in: - - /sys/bus/pci/rescan - /sys/bus/pci/devices/.../remove - /sys/bus/pci/devices/.../rescan - - there is no functional reason to maintain fakephp as well. - - We will keep the existing module so that 'modprobe fakephp' will - present the old /sys/bus/pci/slots/... interface for compatibility, - but users are urged to migrate their applications to the API above. - - After a reasonable transition period, we will remove the legacy - fakephp interface. -Who: Alex Chiang <achiang@hp.com> - ---------------------------- - What: CONFIG_RFKILL_INPUT When: 2.6.33 Why: Should be implemented in userspace, policy daemon. @@ -579,7 +547,7 @@ Why: KVM tracepoints provide mostly equivalent information in a much more ---------------------------- What: at91-mci driver ("CONFIG_MMC_AT91") -When: 3.7 +When: 3.8 Why: There are two mci drivers: at91-mci and atmel-mci. The PDC support was added to atmel-mci as a first step to support more chips. Then at91-mci was kept only for old IP versions (on at91rm9200 and diff --git a/Documentation/i2c/busses/i2c-i801 b/Documentation/i2c/busses/i2c-i801 index 615142da4ef..157416e78cc 100644 --- a/Documentation/i2c/busses/i2c-i801 +++ b/Documentation/i2c/busses/i2c-i801 @@ -21,6 +21,7 @@ Supported adapters: * Intel DH89xxCC (PCH) * Intel Panther Point (PCH) * Intel Lynx Point (PCH) + * Intel Lynx Point-LP (PCH) Datasheets: Publicly available at the Intel website On Intel Patsburg and later chipsets, both the normal host SMBus controller diff --git a/Documentation/ia64/aliasing-test.c b/Documentation/ia64/aliasing-test.c index 5caa2af3320..62a190d45f3 100644 --- a/Documentation/ia64/aliasing-test.c +++ b/Documentation/ia64/aliasing-test.c @@ -132,6 +132,7 @@ static int read_rom(char *path) rc = write(fd, "1", 2); if (rc <= 0) { + close(fd); perror("write"); return -1; } diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index ad7e2e5088c..df551dfa8e5 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1833,6 +1833,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted. and restore using xsave. The kernel will fallback to enabling legacy floating-point and sse state. + eagerfpu= [X86] + on enable eager fpu restore + off disable eager fpu restore + auto selects the default scheme, which automatically + enables eagerfpu restore for xsaveopt. + nohlt [BUGS=ARM,SH] Tells the kernel that the sleep(SH) or wfi(ARM) instruction doesn't work correctly and not to use it. This is also useful when using JTAG debugger. @@ -2385,6 +2391,17 @@ bytes respectively. Such letter suffixes can also be entirely omitted. rcutree.rcu_cpu_stall_timeout= [KNL,BOOT] Set timeout for RCU CPU stall warning messages. + rcutree.jiffies_till_first_fqs= [KNL,BOOT] + Set delay from grace-period initialization to + first attempt to force quiescent states. + Units are jiffies, minimum value is zero, + and maximum value is HZ. + + rcutree.jiffies_till_next_fqs= [KNL,BOOT] + Set delay between subsequent attempts to force + quiescent states. Units are jiffies, minimum + value is one, and maximum value is HZ. + rcutorture.fqs_duration= [KNL,BOOT] Set duration of force_quiescent_state bursts. @@ -2638,9 +2655,6 @@ bytes respectively. Such letter suffixes can also be entirely omitted. smart2= [HW] Format: <io1>[,<io2>[,...,<io8>]] - smp-alt-once [X86-32,SMP] On a hotplug CPU system, only - attempt to substitute SMP alternatives once at boot. - smsc-ircc2.nopnp [HW] Don't use PNP to discover SMC devices smsc-ircc2.ircc_cfg= [HW] Device configuration I/O port smsc-ircc2.ircc_sir= [HW] SIR base I/O port diff --git a/Documentation/power/swsusp.txt b/Documentation/power/swsusp.txt index 92341b84250..0b4b63e7e9b 100644 --- a/Documentation/power/swsusp.txt +++ b/Documentation/power/swsusp.txt @@ -53,7 +53,7 @@ before suspend (it is limited to 500 MB by default). Article about goals and implementation of Software Suspend for Linux ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Author: Gábor Kuti +Author: Gábor Kuti Last revised: 2003-10-20 by Pavel Machek Idea and goals to achieve diff --git a/Documentation/scheduler/sched-arch.txt b/Documentation/scheduler/sched-arch.txt index 28aa1075e29..b1b8587b86f 100644 --- a/Documentation/scheduler/sched-arch.txt +++ b/Documentation/scheduler/sched-arch.txt @@ -17,16 +17,6 @@ you must `#define __ARCH_WANT_UNLOCKED_CTXSW` in a header file Unlocked context switches introduce only a very minor performance penalty to the core scheduler implementation in the CONFIG_SMP case. -2. Interrupt status -By default, the switch_to arch function is called with interrupts -disabled. Interrupts may be enabled over the call if it is likely to -introduce a significant interrupt latency by adding the line -`#define __ARCH_WANT_INTERRUPTS_ON_CTXSW` in the same place as for -unlocked context switches. This define also implies -`__ARCH_WANT_UNLOCKED_CTXSW`. See arch/arm/include/asm/system.h for an -example. - - CPU idle ======== Your cpu_idle routines need to obey the following rules: diff --git a/Documentation/trace/kprobetrace.txt b/Documentation/trace/kprobetrace.txt index d0d0bb9e3e2..d68ea5fc812 100644 --- a/Documentation/trace/kprobetrace.txt +++ b/Documentation/trace/kprobetrace.txt @@ -12,7 +12,7 @@ kprobes can probe (this means, all functions body except for __kprobes functions). Unlike the Tracepoint based event, this can be added and removed dynamically, on the fly. -To enable this feature, build your kernel with CONFIG_KPROBE_TRACING=y. +To enable this feature, build your kernel with CONFIG_KPROBE_EVENT=y. Similar to the events tracer, this doesn't need to be activated via current_tracer. Instead of that, add probe points via diff --git a/Documentation/vfio.txt b/Documentation/vfio.txt index 0cb6685c802..8eda3635a17 100644 --- a/Documentation/vfio.txt +++ b/Documentation/vfio.txt @@ -133,7 +133,7 @@ character devices for this group: $ lspci -n -s 0000:06:0d.0 06:0d.0 0401: 1102:0002 (rev 08) # echo 0000:06:0d.0 > /sys/bus/pci/devices/0000:06:0d.0/driver/unbind -# echo 1102 0002 > /sys/bus/pci/drivers/vfio/new_id +# echo 1102 0002 > /sys/bus/pci/drivers/vfio-pci/new_id Now we need to look at what other devices are in the group to free it for use by VFIO: diff --git a/Documentation/watchdog/src/watchdog-test.c b/Documentation/watchdog/src/watchdog-test.c index 73ff5cc93e0..3da822967ee 100644 --- a/Documentation/watchdog/src/watchdog-test.c +++ b/Documentation/watchdog/src/watchdog-test.c @@ -31,7 +31,7 @@ static void keep_alive(void) * or "-e" to enable the card. */ -void term(int sig) +static void term(int sig) { close(fd); fprintf(stderr, "Stopping watchdog ticks...\n"); diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt index c54b4f503e2..de38429beb7 100644 --- a/Documentation/x86/x86_64/boot-options.txt +++ b/Documentation/x86/x86_64/boot-options.txt @@ -50,6 +50,13 @@ Machine check monarchtimeout: Sets the time in us to wait for other CPUs on machine checks. 0 to disable. + mce=bios_cmci_threshold + Don't overwrite the bios-set CMCI threshold. This boot option + prevents Linux from overwriting the CMCI threshold set by the + bios. Without this option, Linux always sets the CMCI + threshold to 1. Enabling this may make memory predictive failure + analysis less effective if the bios sets thresholds for memory + errors since we will not see details for all errors. nomce (for compatibility with i386): same as mce=off |