diff options
Diffstat (limited to 'Documentation')
78 files changed, 2575 insertions, 691 deletions
diff --git a/Documentation/.gitignore b/Documentation/.gitignore new file mode 100644 index 00000000000..bcd907b4141 --- /dev/null +++ b/Documentation/.gitignore @@ -0,0 +1,7 @@ +filesystems/dnotify_test +laptops/dslm +timers/hpet_example +vm/hugepage-mmap +vm/hugepage-shm +vm/map_hugetlb + diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX index dd10b51b4e6..5405f7aecef 100644 --- a/Documentation/00-INDEX +++ b/Documentation/00-INDEX @@ -32,8 +32,6 @@ DocBook/ - directory with DocBook templates etc. for kernel documentation. HOWTO - the process and procedures of how to do Linux kernel development. -IO-mapping.txt - - how to access I/O mapped memory from within device drivers. IPMI.txt - info on Linux Intelligent Platform Management Interface (IPMI) Driver. IRQ-affinity.txt @@ -84,6 +82,8 @@ blockdev/ - info on block devices & drivers btmrvl.txt - info on Marvell Bluetooth driver usage. +bus-virt-phys-mapping.txt + - how to access I/O mapped memory from within device drivers. cachetlb.txt - describes the cache/TLB flushing interfaces Linux uses. cdrom/ @@ -168,6 +168,8 @@ initrd.txt - how to use the RAM disk as an initial/temporary root filesystem. input/ - info on Linux input device support. +io-mapping.txt + - description of io_mapping functions in linux/io-mapping.h io_ordering.txt - info on ordering I/O writes to memory-mapped addresses. ioctl/ diff --git a/Documentation/ABI/testing/debugfs-ec b/Documentation/ABI/testing/debugfs-ec new file mode 100644 index 00000000000..6546115a94d --- /dev/null +++ b/Documentation/ABI/testing/debugfs-ec @@ -0,0 +1,20 @@ +What: /sys/kernel/debug/ec/*/{gpe,use_global_lock,io} +Date: July 2010 +Contact: Thomas Renninger <trenn@suse.de> +Description: + +General information like which GPE is assigned to the EC and whether +the global lock should get used. +Knowing the EC GPE one can watch the amount of HW events related to +the EC here (XY -> GPE number from /sys/kernel/debug/ec/*/gpe): +/sys/firmware/acpi/interrupts/gpeXY + +The io file is binary and a userspace tool located here: +ftp://ftp.suse.com/pub/people/trenn/sources/ec/ +should get used to read out the 256 Embedded Controller registers +or writing to them. + +CAUTION: Do not write to the Embedded Controller if you don't know +what you are doing! Rebooting afterwards also is a good idea. +This can influence the way your machine is cooled and fans may +not get switched on again after you did a wrong write. diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci index 428676cfa61..25be3250f7d 100644 --- a/Documentation/ABI/testing/sysfs-bus-pci +++ b/Documentation/ABI/testing/sysfs-bus-pci @@ -133,46 +133,6 @@ Description: The symbolic link points to the PCI device sysfs entry of the Physical Function this device associates with. - -What: /sys/bus/pci/slots/... -Date: April 2005 (possibly older) -KernelVersion: 2.6.12 (possibly older) -Contact: linux-pci@vger.kernel.org -Description: - When the appropriate driver is loaded, it will create a - directory per claimed physical PCI slot in - /sys/bus/pci/slots/. The names of these directories are - specific to the driver, which in turn, are specific to the - platform, but in general, should match the label on the - machine's physical chassis. - - The drivers that can create slot directories include the - PCI hotplug drivers, and as of 2.6.27, the pci_slot driver. - - The slot directories contain, at a minimum, a file named - 'address' which contains the PCI bus:device:function tuple. - Other files may appear as well, but are specific to the - driver. - -What: /sys/bus/pci/slots/.../function[0-7] -Date: March 2010 -KernelVersion: 2.6.35 -Contact: linux-pci@vger.kernel.org -Description: - If PCI slot directories (as described above) are created, - and the physical slot is actually populated with a device, - symbolic links in the slot directory pointing to the - device's PCI functions are created as well. - -What: /sys/bus/pci/devices/.../slot -Date: March 2010 -KernelVersion: 2.6.35 -Contact: linux-pci@vger.kernel.org -Description: - If PCI slot directories (as described above) are created, - a symbolic link pointing to the slot directory will be - created as well. - What: /sys/bus/pci/slots/.../module Date: June 2009 Contact: linux-pci@vger.kernel.org diff --git a/Documentation/ABI/testing/sysfs-firmware-sfi b/Documentation/ABI/testing/sysfs-firmware-sfi new file mode 100644 index 00000000000..4be7d44aeac --- /dev/null +++ b/Documentation/ABI/testing/sysfs-firmware-sfi @@ -0,0 +1,15 @@ +What: /sys/firmware/sfi/tables/ +Date: May 2010 +Contact: Len Brown <lenb@kernel.org> +Description: + SFI defines a number of small static memory tables + so the kernel can get platform information from firmware. + + The tables are defined in the latest SFI specification: + http://simplefirmware.org/documentation + + While the tables are used by the kernel, user-space + can observe them this way: + + # cd /sys/firmware/sfi/tables + # cat $TABLENAME > $TABLENAME.bin diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power index d6a801f45b4..2875f1f74a0 100644 --- a/Documentation/ABI/testing/sysfs-power +++ b/Documentation/ABI/testing/sysfs-power @@ -114,3 +114,18 @@ Description: if this file contains "1", which is the default. It may be disabled by writing "0" to this file, in which case all devices will be suspended and resumed synchronously. + +What: /sys/power/wakeup_count +Date: July 2010 +Contact: Rafael J. Wysocki <rjw@sisk.pl> +Description: + The /sys/power/wakeup_count file allows user space to put the + system into a sleep state while taking into account the + concurrent arrival of wakeup events. Reading from it returns + the current number of registered wakeup events and it blocks if + some wakeup events are being processed at the time the file is + read from. Writing to it will only succeed if the current + number of wakeup events is equal to the written value and, if + successful, will make the kernel abort a subsequent transition + to a sleep state if any wakeup events are reported after the + write has returned. diff --git a/Documentation/DMA-API-HOWTO.txt b/Documentation/DMA-API-HOWTO.txt index 2e435adfbd6..98ce51796f7 100644 --- a/Documentation/DMA-API-HOWTO.txt +++ b/Documentation/DMA-API-HOWTO.txt @@ -639,6 +639,36 @@ is planned to completely remove virt_to_bus() and bus_to_virt() as they are entirely deprecated. Some ports already do not provide these as it is impossible to correctly support them. + Handling Errors + +DMA address space is limited on some architectures and an allocation +failure can be determined by: + +- checking if dma_alloc_coherent returns NULL or dma_map_sg returns 0 + +- checking the returned dma_addr_t of dma_map_single and dma_map_page + by using dma_mapping_error(): + + dma_addr_t dma_handle; + + dma_handle = dma_map_single(dev, addr, size, direction); + if (dma_mapping_error(dev, dma_handle)) { + /* + * reduce current DMA mapping usage, + * delay and try again later or + * reset driver. + */ + } + +Networking drivers must call dev_kfree_skb to free the socket buffer +and return NETDEV_TX_OK if the DMA mapping fails on the transmit hook +(ndo_start_xmit). This means that the socket buffer is just dropped in +the failure case. + +SCSI drivers must return SCSI_MLQUEUE_HOST_BUSY if the DMA mapping +fails in the queuecommand hook. This means that the SCSI subsystem +passes the command to the driver again later. + Optimizing Unmap State Space Consumption On many platforms, dma_unmap_{single,page}() is simply a nop. @@ -703,42 +733,25 @@ to "Closing". 1) Struct scatterlist requirements. - Struct scatterlist must contain, at a minimum, the following - members: - - struct page *page; - unsigned int offset; - unsigned int length; - - The base address is specified by a "page+offset" pair. - - Previous versions of struct scatterlist contained a "void *address" - field that was sometimes used instead of page+offset. As of Linux - 2.5., page+offset is always used, and the "address" field has been - deleted. - -2) More to come... - - Handling Errors - -DMA address space is limited on some architectures and an allocation -failure can be determined by: - -- checking if dma_alloc_coherent returns NULL or dma_map_sg returns 0 - -- checking the returned dma_addr_t of dma_map_single and dma_map_page - by using dma_mapping_error(): - - dma_addr_t dma_handle; - - dma_handle = dma_map_single(dev, addr, size, direction); - if (dma_mapping_error(dev, dma_handle)) { - /* - * reduce current DMA mapping usage, - * delay and try again later or - * reset driver. - */ - } + Don't invent the architecture specific struct scatterlist; just use + <asm-generic/scatterlist.h>. You need to enable + CONFIG_NEED_SG_DMA_LENGTH if the architecture supports IOMMUs + (including software IOMMU). + +2) ARCH_KMALLOC_MINALIGN + + Architectures must ensure that kmalloc'ed buffer is + DMA-safe. Drivers and subsystems depend on it. If an architecture + isn't fully DMA-coherent (i.e. hardware doesn't ensure that data in + the CPU cache is identical to data in main memory), + ARCH_KMALLOC_MINALIGN must be set so that the memory allocator + makes sure that kmalloc'ed buffer doesn't share a cache line with + the others. See arch/arm/include/asm/cache.h as an example. + + Note that ARCH_KMALLOC_MINALIGN is about DMA memory alignment + constraints. You don't need to worry about the architecture data + alignment constraints (e.g. the alignment constraints about 64-bit + objects). Closing diff --git a/Documentation/DocBook/drm.tmpl b/Documentation/DocBook/drm.tmpl index 7583dc7cf64..910c923a9b8 100644 --- a/Documentation/DocBook/drm.tmpl +++ b/Documentation/DocBook/drm.tmpl @@ -389,7 +389,7 @@ </para> <para> If your driver supports memory management (it should!), you'll - need to set that up at load time as well. How you intialize + need to set that up at load time as well. How you initialize it depends on which memory manager you're using, TTM or GEM. </para> <sect3> @@ -399,7 +399,7 @@ aperture space for graphics devices. TTM supports both UMA devices and devices with dedicated video RAM (VRAM), i.e. most discrete graphics devices. If your device has dedicated RAM, supporting - TTM is desireable. TTM also integrates tightly with your + TTM is desirable. TTM also integrates tightly with your driver specific buffer execution function. See the radeon driver for examples. </para> @@ -443,7 +443,7 @@ likely eventually calling ttm_bo_global_init and ttm_bo_global_release, respectively. Also like the previous object, ttm_global_item_ref is used to create an initial reference - count for the TTM, which will call your initalization function. + count for the TTM, which will call your initialization function. </para> </sect3> <sect3> @@ -557,7 +557,7 @@ void intel_crt_init(struct drm_device *dev) CRT connector and encoder combination is created. A device specific i2c bus is also created, for fetching EDID data and performing monitor detection. Once the process is complete, - the new connector is regsitered with sysfs, to make its + the new connector is registered with sysfs, to make its properties available to applications. </para> <sect4> @@ -581,12 +581,12 @@ void intel_crt_init(struct drm_device *dev) <para> For each encoder, CRTC and connector, several functions must be provided, depending on the object type. Encoder objects - need should provide a DPMS (basically on/off) function, mode fixup + need to provide a DPMS (basically on/off) function, mode fixup (for converting requested modes into native hardware timings), and prepare, set and commit functions for use by the core DRM helper functions. Connector helpers need to provide mode fetch and validity functions as well as an encoder matching function for - returing an ideal encoder for a given connector. The core + returning an ideal encoder for a given connector. The core connector functions include a DPMS callback, (deprecated) save/restore routines, detection, mode probing, property handling, and cleanup functions. diff --git a/Documentation/DocBook/dvb/dvbapi.xml b/Documentation/DocBook/dvb/dvbapi.xml index 63c528fee62..e3a97fdd62a 100644 --- a/Documentation/DocBook/dvb/dvbapi.xml +++ b/Documentation/DocBook/dvb/dvbapi.xml @@ -12,10 +12,12 @@ <othername role="mi">O. C.</othername> <affiliation><address><email>rjkm@metzlerbros.de</email></address></affiliation> </author> +</authorgroup> +<authorgroup> <author> <firstname>Mauro</firstname> -<surname>Chehab</surname> <othername role="mi">Carvalho</othername> +<surname>Chehab</surname> <affiliation><address><email>mchehab@redhat.com</email></address></affiliation> <contrib>Ported document to Docbook XML.</contrib> </author> @@ -23,13 +25,24 @@ <copyright> <year>2002</year> <year>2003</year> - <year>2009</year> <holder>Convergence GmbH</holder> </copyright> +<copyright> + <year>2009-2010</year> + <holder>Mauro Carvalho Chehab</holder> +</copyright> <revhistory> <!-- Put document revisions here, newest first. --> <revision> + <revnumber>2.0.3</revnumber> + <date>2010-07-03</date> + <authorinitials>mcc</authorinitials> + <revremark> + Add some frontend capabilities flags, present on kernel, but missing at the specs. + </revremark> +</revision> +<revision> <revnumber>2.0.2</revnumber> <date>2009-10-25</date> <authorinitials>mcc</authorinitials> @@ -63,7 +76,7 @@ Added ISDB-T test originally written by Patrick Boettcher <title>LINUX DVB API</title> -<subtitle>Version 3</subtitle> +<subtitle>Version 5.2</subtitle> <!-- ADD THE CHAPTERS HERE --> <chapter id="dvb_introdution"> &sub-intro; diff --git a/Documentation/DocBook/dvb/frontend.h.xml b/Documentation/DocBook/dvb/frontend.h.xml index b99644f5340..d08e0d40141 100644 --- a/Documentation/DocBook/dvb/frontend.h.xml +++ b/Documentation/DocBook/dvb/frontend.h.xml @@ -63,6 +63,7 @@ typedef enum fe_caps { FE_CAN_8VSB = 0x200000, FE_CAN_16VSB = 0x400000, FE_HAS_EXTENDED_CAPS = 0x800000, /* We need more bitspace for newer APIs, indicate this. */ + FE_CAN_TURBO_FEC = 0x8000000, /* frontend supports "turbo fec modulation" */ FE_CAN_2G_MODULATION = 0x10000000, /* frontend supports "2nd generation modulation" (DVB-S2) */ FE_NEEDS_BENDING = 0x20000000, /* not supported anymore, don't use (frontend requires frequency bending) */ FE_CAN_RECOVER = 0x40000000, /* frontend can recover from a cable unplug automatically */ diff --git a/Documentation/DocBook/dvb/frontend.xml b/Documentation/DocBook/dvb/frontend.xml index 300ba1f0417..78d756de590 100644 --- a/Documentation/DocBook/dvb/frontend.xml +++ b/Documentation/DocBook/dvb/frontend.xml @@ -64,8 +64,14 @@ a specific frontend type.</para> FE_CAN_BANDWIDTH_AUTO = 0x40000, FE_CAN_GUARD_INTERVAL_AUTO = 0x80000, FE_CAN_HIERARCHY_AUTO = 0x100000, - FE_CAN_MUTE_TS = 0x80000000, - FE_CAN_CLEAN_SETUP = 0x40000000 + FE_CAN_8VSB = 0x200000, + FE_CAN_16VSB = 0x400000, + FE_HAS_EXTENDED_CAPS = 0x800000, + FE_CAN_TURBO_FEC = 0x8000000, + FE_CAN_2G_MODULATION = 0x10000000, + FE_NEEDS_BENDING = 0x20000000, + FE_CAN_RECOVER = 0x40000000, + FE_CAN_MUTE_TS = 0x80000000 } fe_caps_t; </programlisting> </section> diff --git a/Documentation/DocBook/media-entities.tmpl b/Documentation/DocBook/media-entities.tmpl index 5d4d40f429a..6ae97157b1c 100644 --- a/Documentation/DocBook/media-entities.tmpl +++ b/Documentation/DocBook/media-entities.tmpl @@ -218,6 +218,7 @@ <!ENTITY sub-dev-teletext SYSTEM "v4l/dev-teletext.xml"> <!ENTITY sub-driver SYSTEM "v4l/driver.xml"> <!ENTITY sub-libv4l SYSTEM "v4l/libv4l.xml"> +<!ENTITY sub-lirc_device_interface SYSTEM "v4l/lirc_device_interface.xml"> <!ENTITY sub-remote_controllers SYSTEM "v4l/remote_controllers.xml"> <!ENTITY sub-fdl-appendix SYSTEM "v4l/fdl-appendix.xml"> <!ENTITY sub-close SYSTEM "v4l/func-close.xml"> diff --git a/Documentation/DocBook/media.tmpl b/Documentation/DocBook/media.tmpl index eea564bb12c..f11048d4053 100644 --- a/Documentation/DocBook/media.tmpl +++ b/Documentation/DocBook/media.tmpl @@ -28,7 +28,7 @@ <title>LINUX MEDIA INFRASTRUCTURE API</title> <copyright> - <year>2009</year> + <year>2009-2010</year> <holder>LinuxTV Developers</holder> </copyright> @@ -61,7 +61,7 @@ Foundation. A copy of the license is included in the chapter entitled in fact it covers several different video standards including DVB-T, DVB-S, DVB-C and ATSC. The API is currently being updated to documment support also for DVB-S2, ISDB-T and ISDB-S.</para> - <para>The third part covers other API's used by all media infrastructure devices</para> + <para>The third part covers Remote Controller API</para> <para>For additional information and for the latest development code, see: <ulink url="http://linuxtv.org">http://linuxtv.org</ulink>.</para> <para>For discussing improvements, reporting troubles, sending new drivers, etc, please mail to: <ulink url="http://vger.kernel.org/vger-lists.html#linux-media">Linux Media Mailing List (LMML).</ulink>.</para> @@ -86,7 +86,7 @@ Foundation. A copy of the license is included in the chapter entitled </author> </authorgroup> <copyright> - <year>2009</year> + <year>2009-2010</year> <holder>Mauro Carvalho Chehab</holder> </copyright> @@ -101,7 +101,7 @@ Foundation. A copy of the license is included in the chapter entitled </revhistory> </partinfo> -<title>Other API's used by media infrastructure drivers</title> +<title>Remote Controller API</title> <chapter id="remote_controllers"> &sub-remote_controllers; </chapter> diff --git a/Documentation/DocBook/v4l/lirc_device_interface.xml b/Documentation/DocBook/v4l/lirc_device_interface.xml new file mode 100644 index 00000000000..0413234023d --- /dev/null +++ b/Documentation/DocBook/v4l/lirc_device_interface.xml @@ -0,0 +1,235 @@ +<section id="lirc_dev"> +<title>LIRC Device Interface</title> + + +<section id="lirc_dev_intro"> +<title>Introduction</title> + +<para>The LIRC device interface is a bi-directional interface for +transporting raw IR data between userspace and kernelspace. Fundamentally, +it is just a chardev (/dev/lircX, for X = 0, 1, 2, ...), with a number +of standard struct file_operations defined on it. With respect to +transporting raw IR data to and fro, the essential fops are read, write +and ioctl.</para> + +<para>Example dmesg output upon a driver registering w/LIRC:</para> + <blockquote> + <para>$ dmesg |grep lirc_dev</para> + <para>lirc_dev: IR Remote Control driver registered, major 248</para> + <para>rc rc0: lirc_dev: driver ir-lirc-codec (mceusb) registered at minor = 0</para> + </blockquote> + +<para>What you should see for a chardev:</para> + <blockquote> + <para>$ ls -l /dev/lirc*</para> + <para>crw-rw---- 1 root root 248, 0 Jul 2 22:20 /dev/lirc0</para> + </blockquote> +</section> + +<section id="lirc_read"> +<title>LIRC read fop</title> + +<para>The lircd userspace daemon reads raw IR data from the LIRC chardev. The +exact format of the data depends on what modes a driver supports, and what +mode has been selected. lircd obtains supported modes and sets the active mode +via the ioctl interface, detailed at <xref linkend="lirc_ioctl"/>. The generally +preferred mode is LIRC_MODE_MODE2, in which packets containing an int value +describing an IR signal are read from the chardev.</para> + +<para>See also <ulink url="http://www.lirc.org/html/technical.html">http://www.lirc.org/html/technical.html</ulink> for more info.</para> +</section> + +<section id="lirc_write"> +<title>LIRC write fop</title> + +<para>The data written to the chardev is a pulse/space sequence of integer +values. Pulses and spaces are only marked implicitly by their position. The +data must start and end with a pulse, therefore, the data must always include +an unevent number of samples. The write function must block until the data has +been transmitted by the hardware.</para> +</section> + +<section id="lirc_ioctl"> +<title>LIRC ioctl fop</title> + +<para>The LIRC device's ioctl definition is bound by the ioctl function +definition of struct file_operations, leaving us with an unsigned int +for the ioctl command and an unsigned long for the arg. For the purposes +of ioctl portability across 32-bit and 64-bit, these values are capped +to their 32-bit sizes.</para> + +<para>The following ioctls can be used to change specific hardware settings. +In general each driver should have a default set of settings. The driver +implementation is expected to re-apply the default settings when the device +is closed by user-space, so that every application opening the device can rely +on working with the default settings initially.</para> + +<variablelist> + <varlistentry> + <term>LIRC_GET_FEATURES</term> + <listitem> + <para>Obviously, get the underlying hardware device's features. If a driver + does not announce support of certain features, calling of the corresponding + ioctls is undefined.</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_GET_SEND_MODE</term> + <listitem> + <para>Get supported transmit mode. Only LIRC_MODE_PULSE is supported by lircd.</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_GET_REC_MODE</term> + <listitem> + <para>Get supported receive modes. Only LIRC_MODE_MODE2 and LIRC_MODE_LIRCCODE + are supported by lircd.</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_GET_SEND_CARRIER</term> + <listitem> + <para>Get carrier frequency (in Hz) currently used for transmit.</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_GET_REC_CARRIER</term> + <listitem> + <para>Get carrier frequency (in Hz) currently used for IR reception.</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_{G,S}ET_{SEND,REC}_DUTY_CYCLE</term> + <listitem> + <para>Get/set the duty cycle (from 0 to 100) of the carrier signal. Currently, + no special meaning is defined for 0 or 100, but this could be used to switch + off carrier generation in the future, so these values should be reserved.</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_GET_REC_RESOLUTION</term> + <listitem> + <para>Some receiver have maximum resolution which is defined by internal + sample rate or data format limitations. E.g. it's common that signals can + only be reported in 50 microsecond steps. This integer value is used by + lircd to automatically adjust the aeps tolerance value in the lircd + config file.</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_GET_M{IN,AX}_TIMEOUT</term> + <listitem> + <para>Some devices have internal timers that can be used to detect when + there's no IR activity for a long time. This can help lircd in detecting + that a IR signal is finished and can speed up the decoding process. + Returns an integer value with the minimum/maximum timeout that can be + set. Some devices have a fixed timeout, in that case both ioctls will + return the same value even though the timeout cannot be changed.</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_GET_M{IN,AX}_FILTER_{PULSE,SPACE}</term> + <listitem> + <para>Some devices are able to filter out spikes in the incoming signal + using given filter rules. These ioctls return the hardware capabilities + that describe the bounds of the possible filters. Filter settings depend + on the IR protocols that are expected. lircd derives the settings from + all protocols definitions found in its config file.</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_GET_LENGTH</term> + <listitem> + <para>Retrieves the code length in bits (only for LIRC_MODE_LIRCCODE). + Reads on the device must be done in blocks matching the bit count. + The bit could should be rounded up so that it matches full bytes.</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_SET_{SEND,REC}_MODE</term> + <listitem> + <para>Set send/receive mode. Largely obsolete for send, as only + LIRC_MODE_PULSE is supported.</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_SET_{SEND,REC}_CARRIER</term> + <listitem> + <para>Set send/receive carrier (in Hz).</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_SET_TRANSMITTER_MASK</term> + <listitem> + <para>This enables the given set of transmitters. The first transmitter + is encoded by the least significant bit, etc. When an invalid bit mask + is given, i.e. a bit is set, even though the device does not have so many + transitters, then this ioctl returns the number of available transitters + and does nothing otherwise.</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_SET_REC_TIMEOUT</term> + <listitem> + <para>Sets the integer value for IR inactivity timeout (cf. + LIRC_GET_MIN_TIMEOUT and LIRC_GET_MAX_TIMEOUT). A value of 0 (if + supported by the hardware) disables all hardware timeouts and data should + be reported as soon as possible. If the exact value cannot be set, then + the next possible value _greater_ than the given value should be set.</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_SET_REC_TIMEOUT_REPORTS</term> + <listitem> + <para>Enable (1) or disable (0) timeout reports in LIRC_MODE_MODE2. By + default, timeout reports should be turned off.</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_SET_REC_FILTER_{,PULSE,SPACE}</term> + <listitem> + <para>Pulses/spaces shorter than this are filtered out by hardware. If + filters cannot be set independently for pulse/space, the corresponding + ioctls must return an error and LIRC_SET_REC_FILTER shall be used instead.</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_SET_MEASURE_CARRIER_MODE</term> + <listitem> + <para>Enable (1)/disable (0) measure mode. If enabled, from the next key + press on, the driver will send LIRC_MODE2_FREQUENCY packets. By default + this should be turned off.</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_SET_REC_{DUTY_CYCLE,CARRIER}_RANGE</term> + <listitem> + <para>To set a range use LIRC_SET_REC_DUTY_CYCLE_RANGE/LIRC_SET_REC_CARRIER_RANGE + with the lower bound first and later LIRC_SET_REC_DUTY_CYCLE/LIRC_SET_REC_CARRIER + with the upper bound.</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_NOTIFY_DECODE</term> + <listitem> + <para>This ioctl is called by lircd whenever a successful decoding of an + incoming IR signal could be done. This can be used by supporting hardware + to give visual feedback to the user e.g. by flashing a LED.</para> + </listitem> + </varlistentry> + <varlistentry> + <term>LIRC_SETUP_{START,END}</term> + <listitem> + <para>Setting of several driver parameters can be optimized by encapsulating + the according ioctl calls with LIRC_SETUP_START/LIRC_SETUP_END. When a + driver receives a LIRC_SETUP_START ioctl it can choose to not commit + further setting changes to the hardware until a LIRC_SETUP_END is received. + But this is open to the driver implementation and every driver must also + handle parameter changes which are not encapsulated by LIRC_SETUP_START + and LIRC_SETUP_END. Drivers can also choose to ignore these ioctls.</para> + </listitem> + </varlistentry> +</variablelist> + +</section> +</section> diff --git a/Documentation/DocBook/v4l/remote_controllers.xml b/Documentation/DocBook/v4l/remote_controllers.xml index 73f5eab091f..3c3b667b28e 100644 --- a/Documentation/DocBook/v4l/remote_controllers.xml +++ b/Documentation/DocBook/v4l/remote_controllers.xml @@ -173,3 +173,5 @@ keymapping.</para> <para>This program demonstrates how to replace the keymap tables.</para> &sub-keytable-c; </section> + +&sub-lirc_device_interface; diff --git a/Documentation/DocBook/v4l/v4l2.xml b/Documentation/DocBook/v4l/v4l2.xml index 9737243377a..7c3c098d5d0 100644 --- a/Documentation/DocBook/v4l/v4l2.xml +++ b/Documentation/DocBook/v4l/v4l2.xml @@ -58,7 +58,7 @@ MPEG stream embedded, sliced VBI data format in this specification. </contrib> <affiliation> <address> - <email>awalls@radix.net</email> + <email>awalls@md.metrocast.net</email> </address> </affiliation> </author> diff --git a/Documentation/DocBook/v4l/vidioc-query-dv-preset.xml b/Documentation/DocBook/v4l/vidioc-query-dv-preset.xml index 87e4f0f6151..402229ee06f 100644 --- a/Documentation/DocBook/v4l/vidioc-query-dv-preset.xml +++ b/Documentation/DocBook/v4l/vidioc-query-dv-preset.xml @@ -53,8 +53,10 @@ input</refpurpose> automatically, similar to sensing the video standard. To do so, applications call <constant> VIDIOC_QUERY_DV_PRESET</constant> with a pointer to a &v4l2-dv-preset; type. Once the hardware detects a preset, that preset is -returned in the preset field of &v4l2-dv-preset;. When detection is not -possible or fails, the value V4L2_DV_INVALID is returned.</para> +returned in the preset field of &v4l2-dv-preset;. If the preset could not be +detected because there was no signal, or the signal was unreliable, or the +signal did not map to a supported preset, then the value V4L2_DV_INVALID is +returned.</para> </refsect1> <refsect1> diff --git a/Documentation/SubmittingDrivers b/Documentation/SubmittingDrivers index 99e72a81fa2..4947fd8fb18 100644 --- a/Documentation/SubmittingDrivers +++ b/Documentation/SubmittingDrivers @@ -130,6 +130,8 @@ Linux kernel master tree: ftp.??.kernel.org:/pub/linux/kernel/... ?? == your country code, such as "us", "uk", "fr", etc. + http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git + Linux kernel mailing list: linux-kernel@vger.kernel.org [mail majordomo@vger.kernel.org to subscribe] @@ -160,3 +162,6 @@ How to NOT write kernel driver by Arjan van de Ven: Kernel Janitor: http://janitor.kernelnewbies.org/ + +GIT, Fast Version Control System: + http://git-scm.com/ diff --git a/Documentation/acpi/apei/einj.txt b/Documentation/acpi/apei/einj.txt new file mode 100644 index 00000000000..dfab71848dc --- /dev/null +++ b/Documentation/acpi/apei/einj.txt @@ -0,0 +1,59 @@ + APEI Error INJection + ~~~~~~~~~~~~~~~~~~~~ + +EINJ provides a hardware error injection mechanism +It is very useful for debugging and testing of other APEI and RAS features. + +To use EINJ, make sure the following are enabled in your kernel +configuration: + +CONFIG_DEBUG_FS +CONFIG_ACPI_APEI +CONFIG_ACPI_APEI_EINJ + +The user interface of EINJ is debug file system, under the +directory apei/einj. The following files are provided. + +- available_error_type + Reading this file returns the error injection capability of the + platform, that is, which error types are supported. The error type + definition is as follow, the left field is the error type value, the + right field is error description. + + 0x00000001 Processor Correctable + 0x00000002 Processor Uncorrectable non-fatal + 0x00000004 Processor Uncorrectable fatal + 0x00000008 Memory Correctable + 0x00000010 Memory Uncorrectable non-fatal + 0x00000020 Memory Uncorrectable fatal + 0x00000040 PCI Express Correctable + 0x00000080 PCI Express Uncorrectable fatal + 0x00000100 PCI Express Uncorrectable non-fatal + 0x00000200 Platform Correctable + 0x00000400 Platform Uncorrectable non-fatal + 0x00000800 Platform Uncorrectable fatal + + The format of file contents are as above, except there are only the + available error type lines. + +- error_type + This file is used to set the error type value. The error type value + is defined in "available_error_type" description. + +- error_inject + Write any integer to this file to trigger the error + injection. Before this, please specify all necessary error + parameters. + +- param1 + This file is used to set the first error parameter value. Effect of + parameter depends on error_type specified. For memory error, this is + physical memory address. + +- param2 + This file is used to set the second error parameter value. Effect of + parameter depends on error_type specified. For memory error, this is + physical memory address mask. + +For more information about EINJ, please refer to ACPI specification +version 4.0, section 17.5. diff --git a/Documentation/apparmor.txt b/Documentation/apparmor.txt new file mode 100644 index 00000000000..93c1fd7d063 --- /dev/null +++ b/Documentation/apparmor.txt @@ -0,0 +1,39 @@ +--- What is AppArmor? --- + +AppArmor is MAC style security extension for the Linux kernel. It implements +a task centered policy, with task "profiles" being created and loaded +from user space. Tasks on the system that do not have a profile defined for +them run in an unconfined state which is equivalent to standard Linux DAC +permissions. + +--- How to enable/disable --- + +set CONFIG_SECURITY_APPARMOR=y + +If AppArmor should be selected as the default security module then + set CONFIG_DEFAULT_SECURITY="apparmor" + and CONFIG_SECURITY_APPARMOR_BOOTPARAM_VALUE=1 + +Build the kernel + +If AppArmor is not the default security module it can be enabled by passing +security=apparmor on the kernel's command line. + +If AppArmor is the default security module it can be disabled by passing +apparmor=0, security=XXXX (where XXX is valid security module), on the +kernel's command line + +For AppArmor to enforce any restrictions beyond standard Linux DAC permissions +policy must be loaded into the kernel from user space (see the Documentation +and tools links). + +--- Documentation --- + +Documentation can be found on the wiki. + +--- Links --- + +Mailing List - apparmor@lists.ubuntu.com +Wiki - http://apparmor.wiki.kernel.org/ +User space tools - https://launchpad.net/apparmor +Kernel module - git://git.kernel.org/pub/scm/linux/kernel/git/jj/apparmor-dev.git diff --git a/Documentation/arm/Samsung-S3C24XX/GPIO.txt b/Documentation/arm/Samsung-S3C24XX/GPIO.txt index 2af2cf39915..816d6071669 100644 --- a/Documentation/arm/Samsung-S3C24XX/GPIO.txt +++ b/Documentation/arm/Samsung-S3C24XX/GPIO.txt @@ -12,6 +12,8 @@ Introduction of the s3c2410 GPIO system, please read the Samsung provided data-sheet/users manual to find out the complete list. + See Documentation/arm/Samsung/GPIO.txt for the core implemetation. + GPIOLIB ------- @@ -24,8 +26,60 @@ GPIOLIB listed below will be removed (they may be marked as __deprecated in the near future). - - s3c2410_gpio_getpin - - s3c2410_gpio_setpin + The following functions now either have a s3c_ specific variant + or are merged into gpiolib. See the definitions in + arch/arm/plat-samsung/include/plat/gpio-cfg.h: + + s3c2410_gpio_setpin() gpio_set_value() or gpio_direction_output() + s3c2410_gpio_getpin() gpio_get_value() or gpio_direction_input() + s3c2410_gpio_getirq() gpio_to_irq() + s3c2410_gpio_cfgpin() s3c_gpio_cfgpin() + s3c2410_gpio_getcfg() s3c_gpio_getcfg() + s3c2410_gpio_pullup() s3c_gpio_setpull() + + +GPIOLIB conversion +------------------ + +If you need to convert your board or driver to use gpiolib from the exiting +s3c2410 api, then here are some notes on the process. + +1) If your board is exclusively using an GPIO, say to control peripheral + power, then it will require to claim the gpio with gpio_request() before + it can use it. + + It is recommended to check the return value, with at least WARN_ON() + during initialisation. + +2) The s3c2410_gpio_cfgpin() can be directly replaced with s3c_gpio_cfgpin() + as they have the same arguments, and can either take the pin specific + values, or the more generic special-function-number arguments. + +3) s3c2410_gpio_pullup() changs have the problem that whilst the + s3c2410_gpio_pullup(x, 1) can be easily translated to the + s3c_gpio_setpull(x, S3C_GPIO_PULL_NONE), the s3c2410_gpio_pullup(x, 0) + are not so easy. + + The s3c2410_gpio_pullup(x, 0) case enables the pull-up (or in the case + of some of the devices, a pull-down) and as such the new API distinguishes + between the UP and DOWN case. There is currently no 'just turn on' setting + which may be required if this becomes a problem. + +4) s3c2410_gpio_setpin() can be replaced by gpio_set_value(), the old call + does not implicitly configure the relevant gpio to output. The gpio + direction should be changed before using gpio_set_value(). + +5) s3c2410_gpio_getpin() is replaceable by gpio_get_value() if the pin + has been set to input. It is currently unknown what the behaviour is + when using gpio_get_value() on an output pin (s3c2410_gpio_getpin + would return the value the pin is supposed to be outputting). + +6) s3c2410_gpio_getirq() should be directly replacable with the + gpio_to_irq() call. + +The s3c2410_gpio and gpio_ calls have always operated on the same gpio +numberspace, so there is no problem with converting the gpio numbering +between the calls. Headers @@ -54,6 +108,11 @@ PIN Numbers eg S3C2410_GPA(0) or S3C2410_GPF(1). These defines are used to tell the GPIO functions which pin is to be used. + With the conversion to gpiolib, there is no longer a direct conversion + from gpio pin number to register base address as in earlier kernels. This + is due to the number space required for newer SoCs where the later + GPIOs are not contiguous. + Configuring a pin ----------------- @@ -71,6 +130,8 @@ Configuring a pin which would turn GPA(0) into the lowest Address line A0, and set GPE(8) to be connected to the SDIO/MMC controller's SDDAT1 line. + The s3c_gpio_cfgpin() call is a functional replacement for this call. + Reading the current configuration --------------------------------- @@ -82,6 +143,9 @@ Reading the current configuration The return value will be from the same set of values which can be passed to s3c2410_gpio_cfgpin(). + The s3c_gpio_getcfg() call should be a functional replacement for + this call. + Configuring a pull-up resistor ------------------------------ @@ -95,6 +159,10 @@ Configuring a pull-up resistor Where the to value is zero to set the pull-up off, and 1 to enable the specified pull-up. Any other values are currently undefined. + The s3c_gpio_setpull() offers similar functionality, but with the + ability to encode whether the pull is up or down. Currently there + is no 'just on' state, so up or down must be selected. + Getting the state of a PIN -------------------------- @@ -106,6 +174,9 @@ Getting the state of a PIN This will return either zero or non-zero. Do not count on this function returning 1 if the pin is set. + This call is now implemented by the relevant gpiolib calls, convert + your board or driver to use gpiolib. + Setting the state of a PIN -------------------------- @@ -117,6 +188,9 @@ Setting the state of a PIN Which sets the given pin to the value. Use 0 to write 0, and 1 to set the output to 1. + This call is now implemented by the relevant gpiolib calls, convert + your board or driver to use gpiolib. + Getting the IRQ number associated with a PIN -------------------------------------------- @@ -128,6 +202,9 @@ Getting the IRQ number associated with a PIN Note, not all pins have an IRQ. + This call is now implemented by the relevant gpiolib calls, convert + your board or driver to use gpiolib. + Authour ------- diff --git a/Documentation/arm/Samsung-S3C24XX/Overview.txt b/Documentation/arm/Samsung-S3C24XX/Overview.txt index 081892df4fd..c12bfc1a00c 100644 --- a/Documentation/arm/Samsung-S3C24XX/Overview.txt +++ b/Documentation/arm/Samsung-S3C24XX/Overview.txt @@ -8,10 +8,16 @@ Introduction The Samsung S3C24XX range of ARM9 System-on-Chip CPUs are supported by the 's3c2410' architecture of ARM Linux. Currently the S3C2410, - S3C2412, S3C2413, S3C2440, S3C2442 and S3C2443 devices are supported. + S3C2412, S3C2413, S3C2416 S3C2440, S3C2442, S3C2443 and S3C2450 devices + are supported. Support for the S3C2400 and S3C24A0 series are in progress. + The S3C2416 and S3C2450 devices are very similar and S3C2450 support is + included under the arch/arm/mach-s3c2416 directory. Note, whilst core + support for these SoCs is in, work on some of the extra peripherals + and extra interrupts is still ongoing. + Configuration ------------- @@ -209,6 +215,13 @@ GPIO Newer kernels carry GPIOLIB, and support is being moved towards this with some of the older support in line to be removed. + As of v2.6.34, the move towards using gpiolib support is almost + complete, and very little of the old calls are left. + + See Documentation/arm/Samsung-S3C24XX/GPIO.txt for the S3C24XX specific + support and Documentation/arm/Samsung/GPIO.txt for the core Samsung + implementation. + Clock Management ---------------- diff --git a/Documentation/arm/Samsung/GPIO.txt b/Documentation/arm/Samsung/GPIO.txt new file mode 100644 index 00000000000..05850c62abe --- /dev/null +++ b/Documentation/arm/Samsung/GPIO.txt @@ -0,0 +1,42 @@ + Samsung GPIO implementation + =========================== + +Introduction +------------ + +This outlines the Samsung GPIO implementation and the architecture +specfic calls provided alongisde the drivers/gpio core. + + +S3C24XX (Legacy) +---------------- + +See Documentation/arm/Samsung-S3C24XX/GPIO.txt for more information +about these devices. Their implementation is being brought into line +with the core samsung implementation described in this document. + + +GPIOLIB integration +------------------- + +The gpio implementation uses gpiolib as much as possible, only providing +specific calls for the items that require Samsung specific handling, such +as pin special-function or pull resistor control. + +GPIO numbering is synchronised between the Samsung and gpiolib system. + + +PIN configuration +----------------- + +Pin configuration is specific to the Samsung architecutre, with each SoC +registering the necessary information for the core gpio configuration +implementation to configure pins as necessary. + +The s3c_gpio_cfgpin() and s3c_gpio_setpull() provide the means for a +driver or machine to change gpio configuration. + +See arch/arm/plat-samsung/include/plat/gpio-cfg.h for more information +on these functions. + + diff --git a/Documentation/arm/Samsung/Overview.txt b/Documentation/arm/Samsung/Overview.txt index 7cced1fea9c..c3094ea51aa 100644 --- a/Documentation/arm/Samsung/Overview.txt +++ b/Documentation/arm/Samsung/Overview.txt @@ -13,9 +13,10 @@ Introduction - S3C24XX: See Documentation/arm/Samsung-S3C24XX/Overview.txt for full list - S3C64XX: S3C6400 and S3C6410 - - S5PC6440 - - S5PC100 and S5PC110 support is currently being merged + - S5P6440 + - S5P6442 + - S5PC100 + - S5PC110 / S5PV210 S3C24XX Systems @@ -35,7 +36,10 @@ Configuration unifying all the SoCs into one kernel. s5p6440_defconfig - S5P6440 specific default configuration + s5p6442_defconfig - S5P6442 specific default configuration s5pc100_defconfig - S5PC100 specific default configuration + s5pc110_defconfig - S5PC110 specific default configuration + s5pv210_defconfig - S5PV210 specific default configuration Layout @@ -50,18 +54,27 @@ Layout specific information. It contains the base clock, GPIO and device definitions to get the system running. - plat-s3c is the s3c24xx/s3c64xx platform directory, although it is currently - involved in other builds this will be phased out once the relevant code is - moved elsewhere. - plat-s3c24xx is for s3c24xx specific builds, see the S3C24XX docs. - plat-s3c64xx is for the s3c64xx specific bits, see the S3C24XX docs. + plat-s5p is for s5p specific builds, and contains common support for the + S5P specific systems. Not all S5Ps use all the features in this directory + due to differences in the hardware. + + +Layout changes +-------------- + + The old plat-s3c and plat-s5pc1xx directories have been removed, with + support moved to either plat-samsung or plat-s5p as necessary. These moves + where to simplify the include and dependency issues involved with having + so many different platform directories. - plat-s5p is for s5p specific builds, more to be added. + It was decided to remove plat-s5pc1xx as some of the support was already + in plat-s5p or plat-samsung, with the S5PC110 support added with S5PV210 + the only user was the S5PC100. The S5PC100 specific items where moved to + arch/arm/mach-s5pc100. - [ to finish ] Port Contributors diff --git a/Documentation/arm/memory.txt b/Documentation/arm/memory.txt index eb0fae18ffb..771d48d3b33 100644 --- a/Documentation/arm/memory.txt +++ b/Documentation/arm/memory.txt @@ -33,7 +33,13 @@ ffff0000 ffff0fff CPU vector page. fffe0000 fffeffff XScale cache flush area. This is used in proc-xscale.S to flush the whole data - cache. Free for other usage on non-XScale. + cache. (XScale does not have TCM.) + +fffe8000 fffeffff DTCM mapping area for platforms with + DTCM mounted inside the CPU. + +fffe0000 fffe7fff ITCM mapping area for platforms with + ITCM mounted inside the CPU. fff00000 fffdffff Fixmap mapping region. Addresses provided by fix_to_virt() will be located here. diff --git a/Documentation/arm/tcm.txt b/Documentation/arm/tcm.txt index 77fd9376e6d..7c15871c188 100644 --- a/Documentation/arm/tcm.txt +++ b/Documentation/arm/tcm.txt @@ -19,8 +19,8 @@ defines a CPUID_TCM register that you can read out from the system control coprocessor. Documentation from ARM can be found at http://infocenter.arm.com, search for "TCM Status Register" to see documents for all CPUs. Reading this register you can -determine if ITCM (bit 0) and/or DTCM (bit 16) is present in the -machine. +determine if ITCM (bits 1-0) and/or DTCM (bit 17-16) is present +in the machine. There is further a TCM region register (search for "TCM Region Registers" at the ARM site) that can report and modify the location @@ -35,7 +35,15 @@ The TCM memory can then be remapped to another address again using the MMU, but notice that the TCM if often used in situations where the MMU is turned off. To avoid confusion the current Linux implementation will map the TCM 1 to 1 from physical to virtual -memory in the location specified by the machine. +memory in the location specified by the kernel. Currently Linux +will map ITCM to 0xfffe0000 and on, and DTCM to 0xfffe8000 and +on, supporting a maximum of 32KiB of ITCM and 32KiB of DTCM. + +Newer versions of the region registers also support dividing these +TCMs in two separate banks, so for example an 8KiB ITCM is divided +into two 4KiB banks with its own control registers. The idea is to +be able to lock and hide one of the banks for use by the secure +world (TrustZone). TCM is used for a few things: @@ -65,18 +73,18 @@ in <asm/tcm.h>. Using this interface it is possible to: memory. Such a heap is great for things like saving device state when shutting off device power domains. -A machine that has TCM memory shall select HAVE_TCM in -arch/arm/Kconfig for itself, and then the -rest of the functionality will depend on the physical -location and size of ITCM and DTCM to be defined in -mach/memory.h for the machine. Code that needs to use -TCM shall #include <asm/tcm.h> If the TCM is not located -at the place given in memory.h it will be moved using -the TCM Region registers. +A machine that has TCM memory shall select HAVE_TCM from +arch/arm/Kconfig for itself. Code that needs to use TCM shall +#include <asm/tcm.h> Functions to go into itcm can be tagged like this: int __tcmfunc foo(int bar); +Since these are marked to become long_calls and you may want +to have functions called locally inside the TCM without +wasting space, there is also the __tcmlocalfunc prefix that +will make the call relative. + Variables to go into dtcm can be tagged like this: int __tcmdata foo; diff --git a/Documentation/IO-mapping.txt b/Documentation/bus-virt-phys-mapping.txt index 1b5aa10df84..1b5aa10df84 100644 --- a/Documentation/IO-mapping.txt +++ b/Documentation/bus-virt-phys-mapping.txt diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt index 57444c2609f..b34823ff164 100644 --- a/Documentation/cgroups/cgroups.txt +++ b/Documentation/cgroups/cgroups.txt @@ -339,7 +339,7 @@ To mount a cgroup hierarchy with all available subsystems, type: The "xxx" is not interpreted by the cgroup code, but will appear in /proc/mounts so may be any useful identifying string that you like. -To mount a cgroup hierarchy with just the cpuset and numtasks +To mount a cgroup hierarchy with just the cpuset and memory subsystems, type: # mount -t cgroup -o cpuset,memory hier1 /dev/cgroup diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt index 6cab1f29da4..7781857dc94 100644 --- a/Documentation/cgroups/memory.txt +++ b/Documentation/cgroups/memory.txt @@ -1,18 +1,15 @@ Memory Resource Controller NOTE: The Memory Resource Controller has been generically been referred -to as the memory controller in this document. Do not confuse memory controller -used here with the memory controller that is used in hardware. + to as the memory controller in this document. Do not confuse memory + controller used here with the memory controller that is used in hardware. -Salient features - -a. Enable control of Anonymous, Page Cache (mapped and unmapped) and - Swap Cache memory pages. -b. The infrastructure allows easy addition of other types of memory to control -c. Provides *zero overhead* for non memory controller users -d. Provides a double LRU: global memory pressure causes reclaim from the - global LRU; a cgroup on hitting a limit, reclaims from the per - cgroup LRU +(For editors) +In this document: + When we mention a cgroup (cgroupfs's directory) with memory controller, + we call it "memory cgroup". When you see git-log and source code, you'll + see patch's title and function names tend to use "memcg". + In this document, we avoid using it. Benefits and Purpose of the memory controller @@ -33,6 +30,45 @@ d. A CD/DVD burner could control the amount of memory used by the e. There are several other use cases, find one or use the controller just for fun (to learn and hack on the VM subsystem). +Current Status: linux-2.6.34-mmotm(development version of 2010/April) + +Features: + - accounting anonymous pages, file caches, swap caches usage and limiting them. + - private LRU and reclaim routine. (system's global LRU and private LRU + work independently from each other) + - optionally, memory+swap usage can be accounted and limited. + - hierarchical accounting + - soft limit + - moving(recharging) account at moving a task is selectable. + - usage threshold notifier + - oom-killer disable knob and oom-notifier + - Root cgroup has no limit controls. + + Kernel memory and Hugepages are not under control yet. We just manage + pages on LRU. To add more controls, we have to take care of performance. + +Brief summary of control files. + + tasks # attach a task(thread) and show list of threads + cgroup.procs # show list of processes + cgroup.event_control # an interface for event_fd() + memory.usage_in_bytes # show current memory(RSS+Cache) usage. + memory.memsw.usage_in_bytes # show current memory+Swap usage + memory.limit_in_bytes # set/show limit of memory usage + memory.memsw.limit_in_bytes # set/show limit of memory+Swap usage + memory.failcnt # show the number of memory usage hits limits + memory.memsw.failcnt # show the number of memory+Swap hits limits + memory.max_usage_in_bytes # show max memory usage recorded + memory.memsw.usage_in_bytes # show max memory+Swap usage recorded + memory.soft_limit_in_bytes # set/show soft limit of memory usage + memory.stat # show various statistics + memory.use_hierarchy # set/show hierarchical account enabled + memory.force_empty # trigger forced move charge to parent + memory.swappiness # set/show swappiness parameter of vmscan + (See sysctl's vm.swappiness) + memory.move_charge_at_immigrate # set/show controls of moving charges + memory.oom_control # set/show oom controls. + 1. History The memory controller has a long history. A request for comments for the memory @@ -106,14 +142,14 @@ the necessary data structures and check if the cgroup that is being charged is over its limit. If it is then reclaim is invoked on the cgroup. More details can be found in the reclaim section of this document. If everything goes well, a page meta-data-structure called page_cgroup is -allocated and associated with the page. This routine also adds the page to -the per cgroup LRU. +updated. page_cgroup has its own LRU on cgroup. +(*) page_cgroup structure is allocated at boot/memory-hotplug time. 2.2.1 Accounting details All mapped anon pages (RSS) and cache pages (Page Cache) are accounted. -(some pages which never be reclaimable and will not be on global LRU - are not accounted. we just accounts pages under usual vm management.) +Some pages which are never reclaimable and will not be on the global LRU +are not accounted. We just account pages under usual VM management. RSS pages are accounted at page_fault unless they've already been accounted for earlier. A file page will be accounted for as Page Cache when it's @@ -121,12 +157,19 @@ inserted into inode (radix-tree). While it's mapped into the page tables of processes, duplicate accounting is carefully avoided. A RSS page is unaccounted when it's fully unmapped. A PageCache page is -unaccounted when it's removed from radix-tree. +unaccounted when it's removed from radix-tree. Even if RSS pages are fully +unmapped (by kswapd), they may exist as SwapCache in the system until they +are really freed. Such SwapCaches also also accounted. +A swapped-in page is not accounted until it's mapped. + +Note: The kernel does swapin-readahead and read multiple swaps at once. +This means swapped-in pages may contain pages for other tasks than a task +causing page fault. So, we avoid accounting at swap-in I/O. At page migration, accounting information is kept. -Note: we just account pages-on-lru because our purpose is to control amount -of used pages. not-on-lru pages are tend to be out-of-control from vm view. +Note: we just account pages-on-LRU because our purpose is to control amount +of used pages; not-on-LRU pages tend to be out-of-control from VM view. 2.3 Shared Page Accounting @@ -143,6 +186,7 @@ caller of swapoff rather than the users of shmem. 2.4 Swap Extension (CONFIG_CGROUP_MEM_RES_CTLR_SWAP) + Swap Extension allows you to record charge for swap. A swapped-in page is charged back to original page allocator if possible. @@ -150,13 +194,20 @@ When swap is accounted, following files are added. - memory.memsw.usage_in_bytes. - memory.memsw.limit_in_bytes. -usage of mem+swap is limited by memsw.limit_in_bytes. +memsw means memory+swap. Usage of memory+swap is limited by +memsw.limit_in_bytes. -* why 'mem+swap' rather than swap. +Example: Assume a system with 4G of swap. A task which allocates 6G of memory +(by mistake) under 2G memory limitation will use all swap. +In this case, setting memsw.limit_in_bytes=3G will prevent bad use of swap. +By using memsw limit, you can avoid system OOM which can be caused by swap +shortage. + +* why 'memory+swap' rather than swap. The global LRU(kswapd) can swap out arbitrary pages. Swap-out means to move account from memory to swap...there is no change in usage of -mem+swap. In other words, when we want to limit the usage of swap without -affecting global LRU, mem+swap limit is better than just limiting swap from +memory+swap. In other words, when we want to limit the usage of swap without +affecting global LRU, memory+swap limit is better than just limiting swap from OS point of view. * What happens when a cgroup hits memory.memsw.limit_in_bytes @@ -168,12 +219,12 @@ it by cgroup. 2.5 Reclaim -Each cgroup maintains a per cgroup LRU that consists of an active -and inactive list. When a cgroup goes over its limit, we first try +Each cgroup maintains a per cgroup LRU which has the same structure as +global VM. When a cgroup goes over its limit, we first try to reclaim memory from the cgroup so as to make space for the new pages that the cgroup has touched. If the reclaim is unsuccessful, an OOM routine is invoked to select and kill the bulkiest task in the -cgroup. +cgroup. (See 10. OOM Control below.) The reclaim algorithm has not been modified for cgroups, except that pages that are selected for reclaiming come from the per cgroup LRU @@ -184,13 +235,22 @@ limits on the root cgroup. Note2: When panic_on_oom is set to "2", the whole system will panic. -2. Locking +When oom event notifier is registered, event will be delivered. +(See oom_control section) + +2.6 Locking -The memory controller uses the following hierarchy + lock_page_cgroup()/unlock_page_cgroup() should not be called under + mapping->tree_lock. -1. zone->lru_lock is used for selecting pages to be isolated -2. mem->per_zone->lru_lock protects the per cgroup LRU (per zone) -3. lock_page_cgroup() is used to protect page->page_cgroup + Other lock order is following: + PG_locked. + mm->page_table_lock + zone->lru_lock + lock_page_cgroup. + In many cases, just lock_page_cgroup() is called. + per-zone-per-cgroup LRU (cgroup's private LRU) is just guarded by + zone->lru_lock, it has no lock of its own. 3. User Interface @@ -199,6 +259,7 @@ The memory controller uses the following hierarchy a. Enable CONFIG_CGROUPS b. Enable CONFIG_RESOURCE_COUNTERS c. Enable CONFIG_CGROUP_MEM_RES_CTLR +d. Enable CONFIG_CGROUP_MEM_RES_CTLR_SWAP (to use swap extension) 1. Prepare the cgroups # mkdir -p /cgroups @@ -206,31 +267,28 @@ c. Enable CONFIG_CGROUP_MEM_RES_CTLR 2. Make the new group and move bash into it # mkdir /cgroups/0 -# echo $$ > /cgroups/0/tasks +# echo $$ > /cgroups/0/tasks -Since now we're in the 0 cgroup, -We can alter the memory limit: +Since now we're in the 0 cgroup, we can alter the memory limit: # echo 4M > /cgroups/0/memory.limit_in_bytes NOTE: We can use a suffix (k, K, m, M, g or G) to indicate values in kilo, -mega or gigabytes. +mega or gigabytes. (Here, Kilo, Mega, Giga are Kibibytes, Mebibytes, Gibibytes.) + NOTE: We can write "-1" to reset the *.limit_in_bytes(unlimited). NOTE: We cannot set limits on the root cgroup any more. # cat /cgroups/0/memory.limit_in_bytes 4194304 -NOTE: The interface has now changed to display the usage in bytes -instead of pages - We can check the usage: # cat /cgroups/0/memory.usage_in_bytes 1216512 A successful write to this file does not guarantee a successful set of -this limit to the value written into the file. This can be due to a +this limit to the value written into the file. This can be due to a number of factors, such as rounding up to page boundaries or the total -availability of memory on the system. The user is required to re-read +availability of memory on the system. The user is required to re-read this file after a write to guarantee the value committed by the kernel. # echo 1 > memory.limit_in_bytes @@ -245,15 +303,23 @@ caches, RSS and Active pages/Inactive pages are shown. 4. Testing -Balbir posted lmbench, AIM9, LTP and vmmstress results [10] and [11]. -Apart from that v6 has been tested with several applications and regular -daily use. The controller has also been tested on the PPC64, x86_64 and -UML platforms. +For testing features and implementation, see memcg_test.txt. + +Performance test is also important. To see pure memory controller's overhead, +testing on tmpfs will give you good numbers of small overheads. +Example: do kernel make on tmpfs. + +Page-fault scalability is also important. At measuring parallel +page fault test, multi-process test may be better than multi-thread +test because it has noise of shared objects/status. + +But the above two are testing extreme situations. +Trying usual test under memory controller is always helpful. 4.1 Troubleshooting Sometimes a user might find that the application under a cgroup is -terminated. There are several causes for this: +terminated by OOM killer. There are several causes for this: 1. The cgroup limit is too low (just too low to do anything useful) 2. The user is using anonymous memory and swap is turned off or too low @@ -261,6 +327,9 @@ terminated. There are several causes for this: A sync followed by echo 1 > /proc/sys/vm/drop_caches will help get rid of some of the pages cached in the cgroup (page cache pages). +To know what happens, disable OOM_Kill by 10. OOM Control(see below) and +seeing what happens will be helpful. + 4.2 Task migration When a task migrates from one cgroup to another, its charge is not @@ -268,16 +337,19 @@ carried forward by default. The pages allocated from the original cgroup still remain charged to it, the charge is dropped when the page is freed or reclaimed. -Note: You can move charges of a task along with task migration. See 8. +You can move charges of a task along with task migration. +See 8. "Move charges at task migration" 4.3 Removing a cgroup A cgroup can be removed by rmdir, but as discussed in sections 4.1 and 4.2, a cgroup might have some charge associated with it, even though all -tasks have migrated away from it. -Such charges are freed(at default) or moved to its parent. When moved, -both of RSS and CACHES are moved to parent. -If both of them are busy, rmdir() returns -EBUSY. See 5.1 Also. +tasks have migrated away from it. (because we charge against pages, not +against tasks.) + +Such charges are freed or moved to their parent. At moving, both of RSS +and CACHES are moved to parent. +rmdir() may return -EBUSY if freeing/moving fails. See 5.1 also. Charges recorded in swap information is not updated at removal of cgroup. Recorded information is discarded and a cgroup which uses swap (swapcache) @@ -293,10 +365,10 @@ will be charged as a new owner of it. # echo 0 > memory.force_empty - Almost all pages tracked by this memcg will be unmapped and freed. Some of - pages cannot be freed because it's locked or in-use. Such pages are moved - to parent and this cgroup will be empty. But this may return -EBUSY in - some too busy case. + Almost all pages tracked by this memory cgroup will be unmapped and freed. + Some pages cannot be freed because they are locked or in-use. Such pages are + moved to parent and this cgroup will be empty. This may return -EBUSY if + VM is too busy to free/move all pages immediately. Typical use case of this interface is that calling this before rmdir(). Because rmdir() moves all pages to parent, some out-of-use page caches can be @@ -306,19 +378,41 @@ will be charged as a new owner of it. memory.stat file includes following statistics +# per-memory cgroup local status cache - # of bytes of page cache memory. rss - # of bytes of anonymous and swap cache memory. +mapped_file - # of bytes of mapped file (includes tmpfs/shmem) pgpgin - # of pages paged in (equivalent to # of charging events). pgpgout - # of pages paged out (equivalent to # of uncharging events). -active_anon - # of bytes of anonymous and swap cache memory on active - lru list. +swap - # of bytes of swap usage inactive_anon - # of bytes of anonymous memory and swap cache memory on - inactive lru list. -active_file - # of bytes of file-backed memory on active lru list. -inactive_file - # of bytes of file-backed memory on inactive lru list. + LRU list. +active_anon - # of bytes of anonymous and swap cache memory on active + inactive LRU list. +inactive_file - # of bytes of file-backed memory on inactive LRU list. +active_file - # of bytes of file-backed memory on active LRU list. unevictable - # of bytes of memory that cannot be reclaimed (mlocked etc). -The following additional stats are dependent on CONFIG_DEBUG_VM. +# status considering hierarchy (see memory.use_hierarchy settings) + +hierarchical_memory_limit - # of bytes of memory limit with regard to hierarchy + under which the memory cgroup is +hierarchical_memsw_limit - # of bytes of memory+swap limit with regard to + hierarchy under which memory cgroup is. + +total_cache - sum of all children's "cache" +total_rss - sum of all children's "rss" +total_mapped_file - sum of all children's "cache" +total_pgpgin - sum of all children's "pgpgin" +total_pgpgout - sum of all children's "pgpgout" +total_swap - sum of all children's "swap" +total_inactive_anon - sum of all children's "inactive_anon" +total_active_anon - sum of all children's "active_anon" +total_inactive_file - sum of all children's "inactive_file" +total_active_file - sum of all children's "active_file" +total_unevictable - sum of all children's "unevictable" + +# The following additional stats are dependent on CONFIG_DEBUG_VM. inactive_ratio - VM internal parameter. (see mm/page_alloc.c) recent_rotated_anon - VM internal parameter. (see mm/vmscan.c) @@ -327,24 +421,37 @@ recent_scanned_anon - VM internal parameter. (see mm/vmscan.c) recent_scanned_file - VM internal parameter. (see mm/vmscan.c) Memo: - recent_rotated means recent frequency of lru rotation. - recent_scanned means recent # of scans to lru. + recent_rotated means recent frequency of LRU rotation. + recent_scanned means recent # of scans to LRU. showing for better debug please see the code for meanings. Note: Only anonymous and swap cache memory is listed as part of 'rss' stat. This should not be confused with the true 'resident set size' or the - amount of physical memory used by the cgroup. Per-cgroup rss - accounting is not done yet. + amount of physical memory used by the cgroup. + 'rss + file_mapped" will give you resident set size of cgroup. + (Note: file and shmem may be shared among other cgroups. In that case, + file_mapped is accounted only when the memory cgroup is owner of page + cache.) 5.3 swappiness - Similar to /proc/sys/vm/swappiness, but affecting a hierarchy of groups only. - Following cgroups' swappiness can't be changed. - - root cgroup (uses /proc/sys/vm/swappiness). - - a cgroup which uses hierarchy and it has child cgroup. - - a cgroup which uses hierarchy and not the root of hierarchy. +Similar to /proc/sys/vm/swappiness, but affecting a hierarchy of groups only. +Following cgroups' swappiness can't be changed. +- root cgroup (uses /proc/sys/vm/swappiness). +- a cgroup which uses hierarchy and it has other cgroup(s) below it. +- a cgroup which uses hierarchy and not the root of hierarchy. + +5.4 failcnt + +A memory cgroup provides memory.failcnt and memory.memsw.failcnt files. +This failcnt(== failure count) shows the number of times that a usage counter +hit its limit. When a memory cgroup hits a limit, failcnt increases and +memory under it will be reclaimed. + +You can reset failcnt by writing 0 to failcnt file. +# echo 0 > .../memory.failcnt 6. Hierarchy support @@ -363,13 +470,13 @@ hierarchy In the diagram above, with hierarchical accounting enabled, all memory usage of e, is accounted to its ancestors up until the root (i.e, c and root), -that has memory.use_hierarchy enabled. If one of the ancestors goes over its +that has memory.use_hierarchy enabled. If one of the ancestors goes over its limit, the reclaim algorithm reclaims from the tasks in the ancestor and the children of the ancestor. 6.1 Enabling hierarchical accounting and reclaim -The memory controller by default disables the hierarchy feature. Support +A memory cgroup by default disables the hierarchy feature. Support can be enabled by writing 1 to memory.use_hierarchy file of the root cgroup # echo 1 > memory.use_hierarchy @@ -379,10 +486,10 @@ The feature can be disabled by # echo 0 > memory.use_hierarchy NOTE1: Enabling/disabling will fail if the cgroup already has other -cgroups created below it. + cgroups created below it. NOTE2: When panic_on_oom is set to "2", the whole system will panic in -case of an oom event in any cgroup. + case of an OOM event in any cgroup. 7. Soft limits @@ -392,7 +499,7 @@ is to allow control groups to use as much of the memory as needed, provided a. There is no memory contention b. They do not exceed their hard limit -When the system detects memory contention or low memory control groups +When the system detects memory contention or low memory, control groups are pushed back to their soft limits. If the soft limit of each control group is very high, they are pushed back as much as possible to make sure that one control group does not starve the others of memory. @@ -406,7 +513,7 @@ it gets invoked from balance_pgdat (kswapd). 7.1 Interface Soft limits can be setup by using the following commands (in this example we -assume a soft limit of 256 megabytes) +assume a soft limit of 256 MiB) # echo 256M > memory.soft_limit_in_bytes @@ -442,7 +549,7 @@ Note: Charges are moved only when you move mm->owner, IOW, a leader of a thread Note: If we cannot find enough space for the task in the destination cgroup, we try to make space by reclaiming memory. Task migration may fail if we cannot make enough space. -Note: It can take several seconds if you move charges in giga bytes order. +Note: It can take several seconds if you move charges much. And if you want disable it again: @@ -451,21 +558,27 @@ And if you want disable it again: 8.2 Type of charges which can be move Each bits of move_charge_at_immigrate has its own meaning about what type of -charges should be moved. +charges should be moved. But in any cases, it must be noted that an account of +a page or a swap can be moved only when it is charged to the task's current(old) +memory cgroup. bit | what type of charges would be moved ? -----+------------------------------------------------------------------------ 0 | A charge of an anonymous page(or swap of it) used by the target task. | Those pages and swaps must be used only by the target task. You must | enable Swap Extension(see 2.4) to enable move of swap charges. - -Note: Those pages and swaps must be charged to the old cgroup. -Note: More type of pages(e.g. file cache, shmem,) will be supported by other - bits in future. + -----+------------------------------------------------------------------------ + 1 | A charge of file pages(normal file, tmpfs file(e.g. ipc shared memory) + | and swaps of tmpfs file) mmapped by the target task. Unlike the case of + | anonymous pages, file pages(and swaps) in the range mmapped by the task + | will be moved even if the task hasn't done page fault, i.e. they might + | not be the task's "RSS", but other task's "RSS" that maps the same file. + | And mapcount of the page is ignored(the page can be moved even if + | page_mapcount(page) > 1). You must enable Swap Extension(see 2.4) to + | enable move of swap charges. 8.3 TODO -- Add support for other types of pages(e.g. file cache, shmem, etc.). - Implement madvise(2) to let users decide the vma to be moved or not to be moved. - All of moving charge operations are done under cgroup_mutex. It's not good @@ -473,22 +586,61 @@ Note: More type of pages(e.g. file cache, shmem,) will be supported by other 9. Memory thresholds -Memory controler implements memory thresholds using cgroups notification +Memory cgroup implements memory thresholds using cgroups notification API (see cgroups.txt). It allows to register multiple memory and memsw thresholds and gets notifications when it crosses. To register a threshold application need: - - create an eventfd using eventfd(2); - - open memory.usage_in_bytes or memory.memsw.usage_in_bytes; - - write string like "<event_fd> <memory.usage_in_bytes> <threshold>" to - cgroup.event_control. +- create an eventfd using eventfd(2); +- open memory.usage_in_bytes or memory.memsw.usage_in_bytes; +- write string like "<event_fd> <fd of memory.usage_in_bytes> <threshold>" to + cgroup.event_control. Application will be notified through eventfd when memory usage crosses threshold in any direction. It's applicable for root and non-root cgroup. -10. TODO +10. OOM Control + +memory.oom_control file is for OOM notification and other controls. + +Memory cgroup implements OOM notifier using cgroup notification +API (See cgroups.txt). It allows to register multiple OOM notification +delivery and gets notification when OOM happens. + +To register a notifier, application need: + - create an eventfd using eventfd(2) + - open memory.oom_control file + - write string like "<event_fd> <fd of memory.oom_control>" to + cgroup.event_control + +Application will be notified through eventfd when OOM happens. +OOM notification doesn't work for root cgroup. + +You can disable OOM-killer by writing "1" to memory.oom_control file, as: + + #echo 1 > memory.oom_control + +This operation is only allowed to the top cgroup of sub-hierarchy. +If OOM-killer is disabled, tasks under cgroup will hang/sleep +in memory cgroup's OOM-waitqueue when they request accountable memory. + +For running them, you have to relax the memory cgroup's OOM status by + * enlarge limit or reduce usage. +To reduce usage, + * kill some tasks. + * move some tasks to other group with account migration. + * remove some files (on tmpfs?) + +Then, stopped tasks will work again. + +At reading, current status of OOM is shown. + oom_kill_disable 0 or 1 (if 1, oom-killer is disabled) + under_oom 0 or 1 (if 1, the memory cgroup is under OOM, tasks may + be stopped.) + +11. TODO 1. Add support for accounting huge pages (as a separate controller) 2. Make per-cgroup scanner reclaim not-shared pages first diff --git a/Documentation/credentials.txt b/Documentation/credentials.txt index a2db3528700..995baf379c0 100644 --- a/Documentation/credentials.txt +++ b/Documentation/credentials.txt @@ -417,6 +417,9 @@ reference on them using: This does all the RCU magic inside of it. The caller must call put_cred() on the credentials so obtained when they're finished with. + [*] Note: The result of __task_cred() should not be passed directly to + get_cred() as this may race with commit_cred(). + There are a couple of convenience functions to access bits of another task's credentials, hiding the RCU magic from the caller: diff --git a/Documentation/dvb/get_dvb_firmware b/Documentation/dvb/get_dvb_firmware index 239cbdbf4d1..350959f4e41 100644 --- a/Documentation/dvb/get_dvb_firmware +++ b/Documentation/dvb/get_dvb_firmware @@ -26,7 +26,7 @@ use IO::Handle; "dec3000s", "vp7041", "dibusb", "nxt2002", "nxt2004", "or51211", "or51132_qam", "or51132_vsb", "bluebird", "opera1", "cx231xx", "cx18", "cx23885", "pvrusb2", "mpc718", - "af9015", "ngene"); + "af9015", "ngene", "az6027"); # Check args syntax() if (scalar(@ARGV) != 1); @@ -518,11 +518,11 @@ sub bluebird { sub af9015 { my $sourcefile = "download.ashx?file=57"; my $url = "http://www.ite.com.tw/EN/Services/$sourcefile"; - my $hash = "ff5b096ed47c080870eacdab2de33ad6"; + my $hash = "e3f08935158038d385ad382442f4bb2d"; my $outfile = "dvb-usb-af9015.fw"; my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1); - my $fwoffset = 0x22708; - my $fwlength = 18225; + my $fwoffset = 0x25690; + my $fwlength = 18725; my ($chunklength, $buf, $rcount); checkstandard(); @@ -567,6 +567,23 @@ sub ngene { "$file1, $file2"; } +sub az6027{ + my $file = "AZ6027_Linux_Driver.tar.gz"; + my $url = "http://linux.terratec.de/files/$file"; + my $firmware = "dvb-usb-az6027-03.fw"; + + wgetfile($file, $url); + + #untar + if( system("tar xzvf $file $firmware")){ + die "failed to untar firmware"; + } + if( system("rm $file")){ + die ("unable to remove unnecessary files"); + } + + $firmware; +} # --------------------------------------------------------------- # Utilities diff --git a/Documentation/edac.txt b/Documentation/edac.txt index 79c53322376..0b875e8da96 100644 --- a/Documentation/edac.txt +++ b/Documentation/edac.txt @@ -6,6 +6,8 @@ Written by Doug Thompson <dougthompson@xmission.com> 7 Dec 2005 17 Jul 2007 Updated +(c) Mauro Carvalho Chehab <mchehab@redhat.com> +05 Aug 2009 Nehalem interface EDAC is maintained and written by: @@ -717,3 +719,153 @@ unique drivers for their hardware systems. The 'test_device_edac' sample driver is located at the bluesmoke.sourceforge.net project site for EDAC. +======================================================================= +NEHALEM USAGE OF EDAC APIs + +This chapter documents some EXPERIMENTAL mappings for EDAC API to handle +Nehalem EDAC driver. They will likely be changed on future versions +of the driver. + +Due to the way Nehalem exports Memory Controller data, some adjustments +were done at i7core_edac driver. This chapter will cover those differences + +1) On Nehalem, there are one Memory Controller per Quick Patch Interconnect + (QPI). At the driver, the term "socket" means one QPI. This is + associated with a physical CPU socket. + + Each MC have 3 physical read channels, 3 physical write channels and + 3 logic channels. The driver currenty sees it as just 3 channels. + Each channel can have up to 3 DIMMs. + + The minimum known unity is DIMMs. There are no information about csrows. + As EDAC API maps the minimum unity is csrows, the driver sequencially + maps channel/dimm into different csrows. + + For example, suposing the following layout: + Ch0 phy rd0, wr0 (0x063f4031): 2 ranks, UDIMMs + dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400 + dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400 + Ch1 phy rd1, wr1 (0x063f4031): 2 ranks, UDIMMs + dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400 + Ch2 phy rd3, wr3 (0x063f4031): 2 ranks, UDIMMs + dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400 + The driver will map it as: + csrow0: channel 0, dimm0 + csrow1: channel 0, dimm1 + csrow2: channel 1, dimm0 + csrow3: channel 2, dimm0 + +exports one + DIMM per csrow. + + Each QPI is exported as a different memory controller. + +2) Nehalem MC has the hability to generate errors. The driver implements this + functionality via some error injection nodes: + + For injecting a memory error, there are some sysfs nodes, under + /sys/devices/system/edac/mc/mc?/: + + inject_addrmatch/*: + Controls the error injection mask register. It is possible to specify + several characteristics of the address to match an error code: + dimm = the affected dimm. Numbers are relative to a channel; + rank = the memory rank; + channel = the channel that will generate an error; + bank = the affected bank; + page = the page address; + column (or col) = the address column. + each of the above values can be set to "any" to match any valid value. + + At driver init, all values are set to any. + + For example, to generate an error at rank 1 of dimm 2, for any channel, + any bank, any page, any column: + echo 2 >/sys/devices/system/edac/mc/mc0/inject_addrmatch/dimm + echo 1 >/sys/devices/system/edac/mc/mc0/inject_addrmatch/rank + + To return to the default behaviour of matching any, you can do: + echo any >/sys/devices/system/edac/mc/mc0/inject_addrmatch/dimm + echo any >/sys/devices/system/edac/mc/mc0/inject_addrmatch/rank + + inject_eccmask: + specifies what bits will have troubles, + + inject_section: + specifies what ECC cache section will get the error: + 3 for both + 2 for the highest + 1 for the lowest + + inject_type: + specifies the type of error, being a combination of the following bits: + bit 0 - repeat + bit 1 - ecc + bit 2 - parity + + inject_enable starts the error generation when something different + than 0 is written. + + All inject vars can be read. root permission is needed for write. + + Datasheet states that the error will only be generated after a write on an + address that matches inject_addrmatch. It seems, however, that reading will + also produce an error. + + For example, the following code will generate an error for any write access + at socket 0, on any DIMM/address on channel 2: + + echo 2 >/sys/devices/system/edac/mc/mc0/inject_addrmatch/channel + echo 2 >/sys/devices/system/edac/mc/mc0/inject_type + echo 64 >/sys/devices/system/edac/mc/mc0/inject_eccmask + echo 3 >/sys/devices/system/edac/mc/mc0/inject_section + echo 1 >/sys/devices/system/edac/mc/mc0/inject_enable + dd if=/dev/mem of=/dev/null seek=16k bs=4k count=1 >& /dev/null + + For socket 1, it is needed to replace "mc0" by "mc1" at the above + commands. + + The generated error message will look like: + + EDAC MC0: UE row 0, channel-a= 0 channel-b= 0 labels "-": NON_FATAL (addr = 0x0075b980, socket=0, Dimm=0, Channel=2, syndrome=0x00000040, count=1, Err=8c0000400001009f:4000080482 (read error: read ECC error)) + +3) Nehalem specific Corrected Error memory counters + + Nehalem have some registers to count memory errors. The driver uses those + registers to report Corrected Errors on devices with Registered Dimms. + + However, those counters don't work with Unregistered Dimms. As the chipset + offers some counters that also work with UDIMMS (but with a worse level of + granularity than the default ones), the driver exposes those registers for + UDIMM memories. + + They can be read by looking at the contents of all_channel_counts/ + + $ for i in /sys/devices/system/edac/mc/mc0/all_channel_counts/*; do echo $i; cat $i; done + /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm0 + 0 + /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm1 + 0 + /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm2 + 0 + + What happens here is that errors on different csrows, but at the same + dimm number will increment the same counter. + So, in this memory mapping: + csrow0: channel 0, dimm0 + csrow1: channel 0, dimm1 + csrow2: channel 1, dimm0 + csrow3: channel 2, dimm0 + The hardware will increment udimm0 for an error at the first dimm at either + csrow0, csrow2 or csrow3; + The hardware will increment udimm1 for an error at the second dimm at either + csrow0, csrow2 or csrow3; + The hardware will increment udimm2 for an error at the third dimm at either + csrow0, csrow2 or csrow3; + +4) Standard error counters + + The standard error counters are generated when an mcelog error is received + by the driver. Since, with udimm, this is counted by software, it is + possible that some errors could be lost. With rdimm's, they displays the + contents of the registers diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index a86152ae2f6..2f1e5b621d0 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -303,15 +303,6 @@ Who: Johannes Berg <johannes@sipsolutions.net> --------------------------- -What: CONFIG_NF_CT_ACCT -When: 2.6.29 -Why: Accounting can now be enabled/disabled without kernel recompilation. - Currently used only to set a default value for a feature that is also - controlled by a kernel/module/sysfs/sysctl parameter. -Who: Krzysztof Piotr Oledzki <ole@ans.pl> - ---------------------------- - What: sysfs ui for changing p4-clockmod parameters When: September 2009 Why: See commits 129f8ae9b1b5be94517da76009ea956e89104ce8 and @@ -377,16 +368,6 @@ Who: Eric Paris <eparis@redhat.com> ---------------------------- -What: lock_policy_rwsem_* and unlock_policy_rwsem_* will not be - exported interface anymore. -When: 2.6.33 -Why: cpu_policy_rwsem has a new cleaner definition making it local to - cpufreq core and contained inside cpufreq.c. Other dependent - drivers should not use it in order to safely avoid lockdep issues. -Who: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> - ----------------------------- - What: sound-slot/service-* module aliases and related clutters in sound/sound_core.c When: August 2010 @@ -459,57 +440,6 @@ Who: Corentin Chary <corentin.chary@gmail.com> ---------------------------- -What: usbvideo quickcam_messenger driver -When: 2.6.35 -Files: drivers/media/video/usbvideo/quickcam_messenger.[ch] -Why: obsolete v4l1 driver replaced by gspca_stv06xx -Who: Hans de Goede <hdegoede@redhat.com> - ----------------------------- - -What: ov511 v4l1 driver -When: 2.6.35 -Files: drivers/media/video/ov511.[ch] -Why: obsolete v4l1 driver replaced by gspca_ov519 -Who: Hans de Goede <hdegoede@redhat.com> - ----------------------------- - -What: w9968cf v4l1 driver -When: 2.6.35 -Files: drivers/media/video/w9968cf*.[ch] -Why: obsolete v4l1 driver replaced by gspca_ov519 -Who: Hans de Goede <hdegoede@redhat.com> - ----------------------------- - -What: ovcamchip sensor framework -When: 2.6.35 -Files: drivers/media/video/ovcamchip/* -Why: Only used by obsoleted v4l1 drivers -Who: Hans de Goede <hdegoede@redhat.com> - ----------------------------- - -What: stv680 v4l1 driver -When: 2.6.35 -Files: drivers/media/video/stv680.[ch] -Why: obsolete v4l1 driver replaced by gspca_stv0680 -Who: Hans de Goede <hdegoede@redhat.com> - ----------------------------- - -What: zc0301 v4l driver -When: 2.6.35 -Files: drivers/media/video/zc0301/* -Why: Duplicate functionality with the gspca_zc3xx driver, zc0301 only - supports 2 USB-ID's (because it only supports a limited set of - sensors) wich are also supported by the gspca_zc3xx driver - (which supports 53 USB-ID's in total) -Who: Hans de Goede <hdegoede@redhat.com> - ----------------------------- - What: sysfs-class-rfkill state file When: Feb 2014 Files: net/rfkill/core.c @@ -538,17 +468,6 @@ Who: Jan Kiszka <jan.kiszka@web.de> ---------------------------- -What: KVM memory aliases support -When: July 2010 -Why: Memory aliasing support is used for speeding up guest vga access - through the vga windows. - - Modern userspace no longer uses this feature, so it's just bitrotted - code and can be removed with no impact. -Who: Avi Kivity <avi@redhat.com> - ----------------------------- - What: xtime, wall_to_monotonic When: 2.6.36+ Files: kernel/time/timekeeping.c include/linux/time.h @@ -559,16 +478,6 @@ Who: John Stultz <johnstul@us.ibm.com> ---------------------------- -What: KVM kernel-allocated memory slots -When: July 2010 -Why: Since 2.6.25, kvm supports user-allocated memory slots, which are - much more flexible than kernel-allocated slots. All current userspace - supports the newer interface and this code can be removed with no - impact. -Who: Avi Kivity <avi@redhat.com> - ----------------------------- - What: KVM paravirt mmu host support When: January 2011 Why: The paravirt mmu host support is slower than non-paravirt mmu, both @@ -578,15 +487,6 @@ Who: Avi Kivity <avi@redhat.com> ---------------------------- -What: "acpi=ht" boot option -When: 2.6.35 -Why: Useful in 2003, implementation is a hack. - Generally invoked by accident today. - Seen as doing more harm than good. -Who: Len Brown <len.brown@intel.com> - ----------------------------- - What: iwlwifi 50XX module parameters When: 2.6.40 Why: The "..50" modules parameters were used to configure 5000 series and @@ -646,3 +546,20 @@ Who: Thomas Gleixner <tglx@linutronix.de> ---------------------------- +What: old ieee1394 subsystem (CONFIG_IEEE1394) +When: 2.6.37 +Files: drivers/ieee1394/ except init_ohci1394_dma.c +Why: superseded by drivers/firewire/ (CONFIG_FIREWIRE) which offers more + features, better performance, and better security, all with smaller + and more modern code base +Who: Stefan Richter <stefanr@s5r6.in-berlin.de> + +---------------------------- + +What: The acpi_sleep=s4_nonvs command line option +When: 2.6.37 +Files: arch/x86/kernel/acpi/sleep.c +Why: superseded by acpi_sleep=nonvs +Who: Rafael J. Wysocki <rjw@sisk.pl> + +---------------------------- diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index af1608070cd..96d4293607e 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -380,7 +380,7 @@ prototypes: int (*open) (struct inode *, struct file *); int (*flush) (struct file *); int (*release) (struct inode *, struct file *); - int (*fsync) (struct file *, struct dentry *, int datasync); + int (*fsync) (struct file *, int datasync); int (*aio_fsync) (struct kiocb *, int datasync); int (*fasync) (int, struct file *, int); int (*lock) (struct file *, int, struct file_lock *); @@ -429,8 +429,9 @@ check_flags: no implementations. If your fs is not using generic_file_llseek, you need to acquire and release the appropriate locks in your ->llseek(). For many filesystems, it is probably safe to acquire the inode -mutex. Note some filesystems (i.e. remote ones) provide no -protection for i_size so you will need to use the BKL. +mutex or just to use i_size_read() instead. +Note: this does not protect the file->f_pos against concurrent modifications +since this is something the userspace has to take care about. Note: ext2_release() was *the* source of contention on fs-intensive loads and dropping BKL on ->release() helps to get rid of that (we still diff --git a/Documentation/filesystems/nfs/nfsroot.txt b/Documentation/filesystems/nfs/nfsroot.txt index 3ba0b945aaf..f2430a7974e 100644 --- a/Documentation/filesystems/nfs/nfsroot.txt +++ b/Documentation/filesystems/nfs/nfsroot.txt @@ -124,6 +124,8 @@ ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf> <hostname> Name of the client. May be supplied by autoconfiguration, but its absence will not trigger autoconfiguration. + If specified and DHCP is used, the user provided hostname will + be carried in the DHCP request to hopefully update DNS record. Default: Client IP address is used in ASCII notation. diff --git a/Documentation/filesystems/squashfs.txt b/Documentation/filesystems/squashfs.txt index b324c033035..203f7202cc9 100644 --- a/Documentation/filesystems/squashfs.txt +++ b/Documentation/filesystems/squashfs.txt @@ -38,7 +38,8 @@ Hard link support: yes no Real inode numbers: yes no 32-bit uids/gids: yes no File creation time: yes no -Xattr and ACL support: no no +Xattr support: yes no +ACL support: no no Squashfs compresses data, inodes and directories. In addition, inode and directory data are highly compacted, and packed on byte boundaries. Each @@ -58,7 +59,7 @@ obtained from this site also. 3. SQUASHFS FILESYSTEM DESIGN ----------------------------- -A squashfs filesystem consists of seven parts, packed together on a byte +A squashfs filesystem consists of a maximum of eight parts, packed together on a byte alignment: --------------- @@ -80,6 +81,9 @@ alignment: |---------------| | uid/gid | | lookup table | + |---------------| + | xattr | + | table | --------------- Compressed data blocks are written to the filesystem as files are read from @@ -192,6 +196,26 @@ This table is stored compressed into metadata blocks. A second index table is used to locate these. This second index table for speed of access (and because it is small) is read at mount time and cached in memory. +3.7 Xattr table +--------------- + +The xattr table contains extended attributes for each inode. The xattrs +for each inode are stored in a list, each list entry containing a type, +name and value field. The type field encodes the xattr prefix +("user.", "trusted." etc) and it also encodes how the name/value fields +should be interpreted. Currently the type indicates whether the value +is stored inline (in which case the value field contains the xattr value), +or if it is stored out of line (in which case the value field stores a +reference to where the actual value is stored). This allows large values +to be stored out of line improving scanning and lookup performance and it +also allows values to be de-duplicated, the value being stored once, and +all other occurences holding an out of line reference to that value. + +The xattr lists are packed into compressed 8K metadata blocks. +To reduce overhead in inodes, rather than storing the on-disk +location of the xattr list inside each inode, a 32-bit xattr id +is stored. This xattr id is mapped into the location of the xattr +list using a second xattr id lookup table. 4. TODOS AND OUTSTANDING ISSUES ------------------------------- @@ -199,9 +223,7 @@ it is small) is read at mount time and cached in memory. 4.1 Todo list ------------- -Implement Xattr and ACL support. The Squashfs 4.0 filesystem layout has hooks -for these but the code has not been written. Once the code has been written -the existing layout should not require modification. +Implement ACL support. 4.2 Squashfs internal cache --------------------------- diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt index b66858538df..94677e7dcb1 100644 --- a/Documentation/filesystems/vfs.txt +++ b/Documentation/filesystems/vfs.txt @@ -401,11 +401,16 @@ otherwise noted. started might not be in the page cache at the end of the walk). - truncate: called by the VFS to change the size of a file. The + truncate: Deprecated. This will not be called if ->setsize is defined. + Called by the VFS to change the size of a file. The i_size field of the inode is set to the desired size by the VFS before this method is called. This method is called by the truncate(2) system call and related functionality. + Note: ->truncate and vmtruncate are deprecated. Do not add new + instances/calls of these. Filesystems should be converted to do their + truncate sequence via ->setattr(). + permission: called by the VFS to check for access rights on a POSIX-like filesystem. @@ -729,7 +734,7 @@ struct file_operations { int (*open) (struct inode *, struct file *); int (*flush) (struct file *); int (*release) (struct inode *, struct file *); - int (*fsync) (struct file *, struct dentry *, int datasync); + int (*fsync) (struct file *, int datasync); int (*aio_fsync) (struct kiocb *, int datasync); int (*fasync) (int, struct file *, int); int (*lock) (struct file *, int, struct file_lock *); diff --git a/Documentation/filesystems/xfs-delayed-logging-design.txt b/Documentation/filesystems/xfs-delayed-logging-design.txt index d8119e9d2d6..96d0df28bed 100644 --- a/Documentation/filesystems/xfs-delayed-logging-design.txt +++ b/Documentation/filesystems/xfs-delayed-logging-design.txt @@ -794,11 +794,6 @@ designed. Roadmap: -2.6.35 Inclusion in mainline as an experimental mount option - => approximately 2-3 months to merge window - => needs to be in xfs-dev tree in 4-6 weeks - => code is nearing readiness for review - 2.6.37 Remove experimental tag from mount option => should be roughly 6 months after initial merge => enough time to: diff --git a/Documentation/filesystems/xfs.txt b/Documentation/filesystems/xfs.txt index 9878f50d6ed..7bff3e4f35d 100644 --- a/Documentation/filesystems/xfs.txt +++ b/Documentation/filesystems/xfs.txt @@ -131,17 +131,6 @@ When mounting an XFS filesystem, the following options are accepted. Don't check for double mounted file systems using the file system uuid. This is useful to mount LVM snapshot volumes. - osyncisosync - Make O_SYNC writes implement true O_SYNC. WITHOUT this option, - Linux XFS behaves as if an "osyncisdsync" option is used, - which will make writes to files opened with the O_SYNC flag set - behave as if the O_DSYNC flag had been used instead. - This can result in better performance without compromising - data safety. - However if this option is not in effect, timestamp updates from - O_SYNC writes can be lost if the system crashes. - If timestamp updates are critical, use the osyncisosync option. - uquota/usrquota/uqnoenforce/quota User disk quota accounting enabled, and limits (optionally) enforced. Refer to xfs_quota(8) for further details. diff --git a/Documentation/hwmon/dme1737 b/Documentation/hwmon/dme1737 index 001d2e70bc1..fc5df7654d6 100644 --- a/Documentation/hwmon/dme1737 +++ b/Documentation/hwmon/dme1737 @@ -9,11 +9,15 @@ Supported chips: * SMSC SCH3112, SCH3114, SCH3116 Prefix: 'sch311x' Addresses scanned: none, address read from Super-I/O config space - Datasheet: http://www.nuhorizons.com/FeaturedProducts/Volume1/SMSC/311x.pdf + Datasheet: Available on the Internet * SMSC SCH5027 Prefix: 'sch5027' Addresses scanned: I2C 0x2c, 0x2d, 0x2e Datasheet: Provided by SMSC upon request and under NDA + * SMSC SCH5127 + Prefix: 'sch5127' + Addresses scanned: none, address read from Super-I/O config space + Datasheet: Provided by SMSC upon request and under NDA Authors: Juerg Haefliger <juergh@gmail.com> @@ -36,8 +40,8 @@ Description ----------- This driver implements support for the hardware monitoring capabilities of the -SMSC DME1737 and Asus A8000 (which are the same), SMSC SCH5027, and SMSC -SCH311x Super-I/O chips. These chips feature monitoring of 3 temp sensors +SMSC DME1737 and Asus A8000 (which are the same), SMSC SCH5027, SCH311x, +and SCH5127 Super-I/O chips. These chips feature monitoring of 3 temp sensors temp[1-3] (2 remote diodes and 1 internal), 7 voltages in[0-6] (6 external and 1 internal) and up to 6 fan speeds fan[1-6]. Additionally, the chips implement up to 5 PWM outputs pwm[1-3,5-6] for controlling fan speeds both manually and @@ -48,14 +52,14 @@ Fan[3-6] and pwm[3,5-6] are optional features and their availability depends on the configuration of the chip. The driver will detect which features are present during initialization and create the sysfs attributes accordingly. -For the SCH311x, fan[1-3] and pwm[1-3] are always present and fan[4-6] and -pwm[5-6] don't exist. +For the SCH311x and SCH5127, fan[1-3] and pwm[1-3] are always present and +fan[4-6] and pwm[5-6] don't exist. The hardware monitoring features of the DME1737, A8000, and SCH5027 are only -accessible via SMBus, while the SCH311x only provides access via the ISA bus. -The driver will therefore register itself as an I2C client driver if it detects -a DME1737, A8000, or SCH5027 and as a platform driver if it detects a SCH311x -chip. +accessible via SMBus, while the SCH311x and SCH5127 only provide access via +the ISA bus. The driver will therefore register itself as an I2C client driver +if it detects a DME1737, A8000, or SCH5027 and as a platform driver if it +detects a SCH311x or SCH5127 chip. Voltage Monitoring @@ -76,7 +80,7 @@ DME1737, A8000: in6: Vbat (+3.0V) 0V - 4.38V SCH311x: - in0: +2.5V 0V - 6.64V + in0: +2.5V 0V - 3.32V in1: Vccp (processor core) 0V - 2V in2: VCC (internal +3.3V) 0V - 4.38V in3: +5V 0V - 6.64V @@ -93,6 +97,15 @@ SCH5027: in5: VTR (+3.3V standby) 0V - 4.38V in6: Vbat (+3.0V) 0V - 4.38V +SCH5127: + in0: +2.5 0V - 3.32V + in1: Vccp (processor core) 0V - 3V + in2: VCC (internal +3.3V) 0V - 4.38V + in3: V2_IN 0V - 1.5V + in4: V1_IN 0V - 1.5V + in5: VTR (+3.3V standby) 0V - 4.38V + in6: Vbat (+3.0V) 0V - 4.38V + Each voltage input has associated min and max limits which trigger an alarm when crossed. @@ -293,3 +306,21 @@ pwm[1-3]_auto_point1_pwm RW Auto PWM pwm point. Auto_point1 is the pwm[1-3]_auto_point2_pwm RO Auto PWM pwm point. Auto_point2 is the full-speed duty-cycle which is hard- wired to 255 (100% duty-cycle). + +Chip Differences +---------------- + +Feature dme1737 sch311x sch5027 sch5127 +------------------------------------------------------- +temp[1-3]_offset yes yes +vid yes +zone3 yes yes yes +zone[1-3]_hyst yes yes +pwm min/off yes yes +fan3 opt yes opt yes +pwm3 opt yes opt yes +fan4 opt opt +fan5 opt opt +pwm5 opt opt +fan6 opt opt +pwm6 opt opt diff --git a/Documentation/hwmon/lm63 b/Documentation/hwmon/lm63 index 31660bf9797..b9843eab1af 100644 --- a/Documentation/hwmon/lm63 +++ b/Documentation/hwmon/lm63 @@ -7,6 +7,11 @@ Supported chips: Addresses scanned: I2C 0x4c Datasheet: Publicly available at the National Semiconductor website http://www.national.com/pf/LM/LM63.html + * National Semiconductor LM64 + Prefix: 'lm64' + Addresses scanned: I2C 0x18 and 0x4e + Datasheet: Publicly available at the National Semiconductor website + http://www.national.com/pf/LM/LM64.html Author: Jean Delvare <khali@linux-fr.org> @@ -55,3 +60,5 @@ The lm63 driver will not update its values more frequently than every second; reading them more often will do no harm, but will return 'old' values. +The LM64 is effectively an LM63 with GPIO lines. The driver does not +support these GPIO lines at present. diff --git a/Documentation/hwmon/ltc4245 b/Documentation/hwmon/ltc4245 index 02838a47d86..86b5880d850 100644 --- a/Documentation/hwmon/ltc4245 +++ b/Documentation/hwmon/ltc4245 @@ -72,9 +72,7 @@ in6_min_alarm 5v output undervoltage alarm in7_min_alarm 3v output undervoltage alarm in8_min_alarm Vee (-12v) output undervoltage alarm -in9_input GPIO #1 voltage data -in10_input GPIO #2 voltage data -in11_input GPIO #3 voltage data +in9_input GPIO voltage data power1_input 12v power usage (mW) power2_input 5v power usage (mW) diff --git a/Documentation/hwmon/sysfs-interface b/Documentation/hwmon/sysfs-interface index 3de6b0bcb14..d4e2917c6f1 100644 --- a/Documentation/hwmon/sysfs-interface +++ b/Documentation/hwmon/sysfs-interface @@ -80,9 +80,9 @@ All entries (except name) are optional, and should only be created in a given driver if the chip has the feature. -******** -* Name * -******** +********************* +* Global attributes * +********************* name The chip name. This should be a short, lowercase string, not containing @@ -91,6 +91,13 @@ name The chip name. I2C devices get this attribute created automatically. RO +update_rate The rate at which the chip will update readings. + Unit: millisecond + RW + Some devices have a variable update rate. This attribute + can be used to change the update rate to the desired + frequency. + ************ * Voltages * diff --git a/Documentation/hwmon/tmp102 b/Documentation/hwmon/tmp102 new file mode 100644 index 00000000000..8454a776312 --- /dev/null +++ b/Documentation/hwmon/tmp102 @@ -0,0 +1,26 @@ +Kernel driver tmp102 +==================== + +Supported chips: + * Texas Instruments TMP102 + Prefix: 'tmp102' + Addresses scanned: none + Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp102.html + +Author: + Steven King <sfking@fdwdc.com> + +Description +----------- + +The Texas Instruments TMP102 implements one temperature sensor. Limits can be +set through the Overtemperature Shutdown register and Hysteresis register. The +sensor is accurate to 0.5 degree over the range of -25 to +85 C, and to 1.0 +degree from -40 to +125 C. Resolution of the sensor is 0.0625 degree. The +operating temperature has a minimum of -55 C and a maximum of +150 C. + +The TMP102 has a programmable update rate that can select between 8, 4, 1, and +0.5 Hz. (Currently the driver only supports the default of 4 Hz). + +The driver provides the common sysfs-interface for temperatures (see +Documentation/hwmon/sysfs-interface under Temperatures). diff --git a/Documentation/i2c/busses/i2c-ali1535 b/Documentation/i2c/busses/i2c-ali1535 index 0db3b4c74ad..acbc65a0809 100644 --- a/Documentation/i2c/busses/i2c-ali1535 +++ b/Documentation/i2c/busses/i2c-ali1535 @@ -6,12 +6,12 @@ Supported adapters: http://www.ali.com.tw/eng/support/datasheet_request.php Authors: - Frodo Looijaard <frodol@dds.nl>, + Frodo Looijaard <frodol@dds.nl>, Philip Edelbrock <phil@netroedge.com>, Mark D. Studebaker <mdsxyz123@yahoo.com>, Dan Eaton <dan.eaton@rocketlogix.com>, Stephen Rousset<stephen.rousset@rocketlogix.com> - + Description ----------- diff --git a/Documentation/i2c/busses/i2c-ali1563 b/Documentation/i2c/busses/i2c-ali1563 index 99ad4b9bcc3..54691698d2d 100644 --- a/Documentation/i2c/busses/i2c-ali1563 +++ b/Documentation/i2c/busses/i2c-ali1563 @@ -18,7 +18,7 @@ For an overview of these chips see http://www.acerlabs.com The M1563 southbridge is deceptively similar to the M1533, with a few notable exceptions. One of those happens to be the fact they upgraded the i2c core to be SMBus 2.0 compliant, and happens to be almost identical to -the i2c controller found in the Intel 801 south bridges. +the i2c controller found in the Intel 801 south bridges. Features -------- diff --git a/Documentation/i2c/busses/i2c-ali15x3 b/Documentation/i2c/busses/i2c-ali15x3 index ff28d381beb..600da90b8f1 100644 --- a/Documentation/i2c/busses/i2c-ali15x3 +++ b/Documentation/i2c/busses/i2c-ali15x3 @@ -6,8 +6,8 @@ Supported adapters: http://www.ali.com.tw/eng/support/datasheet_request.php Authors: - Frodo Looijaard <frodol@dds.nl>, - Philip Edelbrock <phil@netroedge.com>, + Frodo Looijaard <frodol@dds.nl>, + Philip Edelbrock <phil@netroedge.com>, Mark D. Studebaker <mdsxyz123@yahoo.com> Module Parameters @@ -40,10 +40,10 @@ M1541 and M1543C South Bridges. The M1543C is a South bridge for desktop systems. The M1541 is a South bridge for portable systems. They are part of the following ALI chipsets: - - * "Aladdin Pro 2" includes the M1621 Slot 1 North bridge with AGP and + + * "Aladdin Pro 2" includes the M1621 Slot 1 North bridge with AGP and 100MHz CPU Front Side bus - * "Aladdin V" includes the M1541 Socket 7 North bridge with AGP and 100MHz + * "Aladdin V" includes the M1541 Socket 7 North bridge with AGP and 100MHz CPU Front Side bus Some Aladdin V motherboards: Asus P5A @@ -77,7 +77,7 @@ output of lspci will show something similar to the following: ** then run lspci. ** If you see the 1533 and 5229 devices but NOT the 7101 device, ** then you must enable ACPI, the PMU, SMB, or something similar -** in the BIOS. +** in the BIOS. ** The driver won't work if it can't find the M7101 device. The SMB controller is part of the M7101 device, which is an ACPI-compliant @@ -87,8 +87,8 @@ The whole M7101 device has to be enabled for the SMB to work. You can't just enable the SMB alone. The SMB and the ACPI have separate I/O spaces. We make sure that the SMB is enabled. We leave the ACPI alone. -Features --------- +Features +-------- This driver controls the SMB Host only. The SMB Slave controller on the M15X3 is not enabled. This driver does not use diff --git a/Documentation/i2c/busses/i2c-pca-isa b/Documentation/i2c/busses/i2c-pca-isa index 6fc8f4c27c3..b044e526548 100644 --- a/Documentation/i2c/busses/i2c-pca-isa +++ b/Documentation/i2c/busses/i2c-pca-isa @@ -1,10 +1,10 @@ Kernel driver i2c-pca-isa Supported adapters: -This driver supports ISA boards using the Philips PCA 9564 -Parallel bus to I2C bus controller +This driver supports ISA boards using the Philips PCA 9564 +Parallel bus to I2C bus controller -Author: Ian Campbell <icampbell@arcom.com>, Arcom Control Systems +Author: Ian Campbell <icampbell@arcom.com>, Arcom Control Systems Module Parameters ----------------- @@ -12,12 +12,12 @@ Module Parameters * base int I/O base address * irq int - IRQ interrupt -* clock int + IRQ interrupt +* clock int Clock rate as described in table 1 of PCA9564 datasheet Description ----------- -This driver supports ISA boards using the Philips PCA 9564 -Parallel bus to I2C bus controller +This driver supports ISA boards using the Philips PCA 9564 +Parallel bus to I2C bus controller diff --git a/Documentation/i2c/busses/i2c-sis5595 b/Documentation/i2c/busses/i2c-sis5595 index cc47db7d00a..ecd21fb49a8 100644 --- a/Documentation/i2c/busses/i2c-sis5595 +++ b/Documentation/i2c/busses/i2c-sis5595 @@ -1,41 +1,41 @@ Kernel driver i2c-sis5595 -Authors: +Authors: Frodo Looijaard <frodol@dds.nl>, Mark D. Studebaker <mdsxyz123@yahoo.com>, - Philip Edelbrock <phil@netroedge.com> + Philip Edelbrock <phil@netroedge.com> Supported adapters: * Silicon Integrated Systems Corp. SiS5595 Southbridge Datasheet: Publicly available at the Silicon Integrated Systems Corp. site. -Note: all have mfr. ID 0x1039. - - SUPPORTED PCI ID - 5595 0008 - - Note: these chips contain a 0008 device which is incompatible with the - 5595. We recognize these by the presence of the listed - "blacklist" PCI ID and refuse to load. - - NOT SUPPORTED PCI ID BLACKLIST PCI ID - 540 0008 0540 - 550 0008 0550 - 5513 0008 5511 - 5581 0008 5597 - 5582 0008 5597 - 5597 0008 5597 - 5598 0008 5597/5598 - 630 0008 0630 - 645 0008 0645 - 646 0008 0646 - 648 0008 0648 - 650 0008 0650 - 651 0008 0651 - 730 0008 0730 - 735 0008 0735 - 745 0008 0745 - 746 0008 0746 +Note: all have mfr. ID 0x1039. + + SUPPORTED PCI ID + 5595 0008 + + Note: these chips contain a 0008 device which is incompatible with the + 5595. We recognize these by the presence of the listed + "blacklist" PCI ID and refuse to load. + + NOT SUPPORTED PCI ID BLACKLIST PCI ID + 540 0008 0540 + 550 0008 0550 + 5513 0008 5511 + 5581 0008 5597 + 5582 0008 5597 + 5597 0008 5597 + 5598 0008 5597/5598 + 630 0008 0630 + 645 0008 0645 + 646 0008 0646 + 648 0008 0648 + 650 0008 0650 + 651 0008 0651 + 730 0008 0730 + 735 0008 0735 + 745 0008 0745 + 746 0008 0746 Module Parameters ----------------- diff --git a/Documentation/i2c/busses/i2c-sis630 b/Documentation/i2c/busses/i2c-sis630 index 9aca6889f74..629ea2c356f 100644 --- a/Documentation/i2c/busses/i2c-sis630 +++ b/Documentation/i2c/busses/i2c-sis630 @@ -14,9 +14,9 @@ Module Parameters * force = [1|0] Forcibly enable the SIS630. DANGEROUS! This can be interesting for chipsets not named above to check if it works for you chipset, but DANGEROUS! - -* high_clock = [1|0] Forcibly set Host Master Clock to 56KHz (default, - what your BIOS use). DANGEROUS! This should be a bit + +* high_clock = [1|0] Forcibly set Host Master Clock to 56KHz (default, + what your BIOS use). DANGEROUS! This should be a bit faster, but freeze some systems (i.e. my Laptop). @@ -44,6 +44,6 @@ Philip Edelbrock <phil@netroedge.com> - testing SiS730 support Mark M. Hoffman <mhoffman@lightlink.com> - bug fixes - + To anyone else which I forgot here ;), thanks! diff --git a/Documentation/i2c/ten-bit-addresses b/Documentation/i2c/ten-bit-addresses index 200074f8136..e9890709c50 100644 --- a/Documentation/i2c/ten-bit-addresses +++ b/Documentation/i2c/ten-bit-addresses @@ -1,17 +1,17 @@ -The I2C protocol knows about two kinds of device addresses: normal 7 bit +The I2C protocol knows about two kinds of device addresses: normal 7 bit addresses, and an extended set of 10 bit addresses. The sets of addresses do not intersect: the 7 bit address 0x10 is not the same as the 10 bit address 0x10 (though a single device could respond to both of them). You select a 10 bit address by adding an extra byte after the address byte: - S Addr7 Rd/Wr .... + S Addr7 Rd/Wr .... becomes S 11110 Addr10 Rd/Wr S is the start bit, Rd/Wr the read/write bit, and if you count the number of bits, you will see the there are 8 after the S bit for 7 bit addresses, and 16 after the S bit for 10 bit addresses. -WARNING! The current 10 bit address support is EXPERIMENTAL. There are +WARNING! The current 10 bit address support is EXPERIMENTAL. There are several places in the code that will cause SEVERE PROBLEMS with 10 bit addresses, even though there is some basic handling and hooks. Also, almost no supported adapter handles the 10 bit addresses correctly. diff --git a/Documentation/input/multi-touch-protocol.txt b/Documentation/input/multi-touch-protocol.txt index c0fc1c75fd8..bdcba154b83 100644 --- a/Documentation/input/multi-touch-protocol.txt +++ b/Documentation/input/multi-touch-protocol.txt @@ -6,31 +6,149 @@ Multi-touch (MT) Protocol Introduction ------------ -In order to utilize the full power of the new multi-touch devices, a way to -report detailed finger data to user space is needed. This document -describes the multi-touch (MT) protocol which allows kernel drivers to -report details for an arbitrary number of fingers. +In order to utilize the full power of the new multi-touch and multi-user +devices, a way to report detailed data from multiple contacts, i.e., +objects in direct contact with the device surface, is needed. This +document describes the multi-touch (MT) protocol which allows kernel +drivers to report details for an arbitrary number of contacts. + +The protocol is divided into two types, depending on the capabilities of the +hardware. For devices handling anonymous contacts (type A), the protocol +describes how to send the raw data for all contacts to the receiver. For +devices capable of tracking identifiable contacts (type B), the protocol +describes how to send updates for individual contacts via event slots. + + +Protocol Usage +-------------- + +Contact details are sent sequentially as separate packets of ABS_MT +events. Only the ABS_MT events are recognized as part of a contact +packet. Since these events are ignored by current single-touch (ST) +applications, the MT protocol can be implemented on top of the ST protocol +in an existing driver. + +Drivers for type A devices separate contact packets by calling +input_mt_sync() at the end of each packet. This generates a SYN_MT_REPORT +event, which instructs the receiver to accept the data for the current +contact and prepare to receive another. + +Drivers for type B devices separate contact packets by calling +input_mt_slot(), with a slot as argument, at the beginning of each packet. +This generates an ABS_MT_SLOT event, which instructs the receiver to +prepare for updates of the given slot. + +All drivers mark the end of a multi-touch transfer by calling the usual +input_sync() function. This instructs the receiver to act upon events +accumulated since last EV_SYN/SYN_REPORT and prepare to receive a new set +of events/packets. + +The main difference between the stateless type A protocol and the stateful +type B slot protocol lies in the usage of identifiable contacts to reduce +the amount of data sent to userspace. The slot protocol requires the use of +the ABS_MT_TRACKING_ID, either provided by the hardware or computed from +the raw data [5]. + +For type A devices, the kernel driver should generate an arbitrary +enumeration of the full set of anonymous contacts currently on the +surface. The order in which the packets appear in the event stream is not +important. Event filtering and finger tracking is left to user space [3]. + +For type B devices, the kernel driver should associate a slot with each +identified contact, and use that slot to propagate changes for the contact. +Creation, replacement and destruction of contacts is achieved by modifying +the ABS_MT_TRACKING_ID of the associated slot. A non-negative tracking id +is interpreted as a contact, and the value -1 denotes an unused slot. A +tracking id not previously present is considered new, and a tracking id no +longer present is considered removed. Since only changes are propagated, +the full state of each initiated contact has to reside in the receiving +end. Upon receiving an MT event, one simply updates the appropriate +attribute of the current slot. + + +Protocol Example A +------------------ + +Here is what a minimal event sequence for a two-contact touch would look +like for a type A device: + + ABS_MT_POSITION_X x[0] + ABS_MT_POSITION_Y y[0] + SYN_MT_REPORT + ABS_MT_POSITION_X x[1] + ABS_MT_POSITION_Y y[1] + SYN_MT_REPORT + SYN_REPORT +The sequence after moving one of the contacts looks exactly the same; the +raw data for all present contacts are sent between every synchronization +with SYN_REPORT. -Usage ------ +Here is the sequence after lifting the first contact: + + ABS_MT_POSITION_X x[1] + ABS_MT_POSITION_Y y[1] + SYN_MT_REPORT + SYN_REPORT + +And here is the sequence after lifting the second contact: + + SYN_MT_REPORT + SYN_REPORT + +If the driver reports one of BTN_TOUCH or ABS_PRESSURE in addition to the +ABS_MT events, the last SYN_MT_REPORT event may be omitted. Otherwise, the +last SYN_REPORT will be dropped by the input core, resulting in no +zero-contact event reaching userland. -Anonymous finger details are sent sequentially as separate packets of ABS -events. Only the ABS_MT events are recognized as part of a finger -packet. The end of a packet is marked by calling the input_mt_sync() -function, which generates a SYN_MT_REPORT event. This instructs the -receiver to accept the data for the current finger and prepare to receive -another. The end of a multi-touch transfer is marked by calling the usual -input_sync() function. This instructs the receiver to act upon events -accumulated since last EV_SYN/SYN_REPORT and prepare to receive a new -set of events/packets. + +Protocol Example B +------------------ + +Here is what a minimal event sequence for a two-contact touch would look +like for a type B device: + + ABS_MT_SLOT 0 + ABS_MT_TRACKING_ID 45 + ABS_MT_POSITION_X x[0] + ABS_MT_POSITION_Y y[0] + ABS_MT_SLOT 1 + ABS_MT_TRACKING_ID 46 + ABS_MT_POSITION_X x[1] + ABS_MT_POSITION_Y y[1] + SYN_REPORT + +Here is the sequence after moving contact 45 in the x direction: + + ABS_MT_SLOT 0 + ABS_MT_POSITION_X x[0] + SYN_REPORT + +Here is the sequence after lifting the contact in slot 0: + + ABS_MT_TRACKING_ID -1 + SYN_REPORT + +The slot being modified is already 0, so the ABS_MT_SLOT is omitted. The +message removes the association of slot 0 with contact 45, thereby +destroying contact 45 and freeing slot 0 to be reused for another contact. + +Finally, here is the sequence after lifting the second contact: + + ABS_MT_SLOT 1 + ABS_MT_TRACKING_ID -1 + SYN_REPORT + + +Event Usage +----------- A set of ABS_MT events with the desired properties is defined. The events are divided into categories, to allow for partial implementation. The minimum set consists of ABS_MT_POSITION_X and ABS_MT_POSITION_Y, which -allows for multiple fingers to be tracked. If the device supports it, the +allows for multiple contacts to be tracked. If the device supports it, the ABS_MT_TOUCH_MAJOR and ABS_MT_WIDTH_MAJOR may be used to provide the size -of the contact area and approaching finger, respectively. +of the contact area and approaching contact, respectively. The TOUCH and WIDTH parameters have a geometrical interpretation; imagine looking through a window at someone gently holding a finger against the @@ -41,56 +159,26 @@ ABS_MT_TOUCH_MAJOR, the diameter of the outer region is ABS_MT_WIDTH_MAJOR. Now imagine the person pressing the finger harder against the glass. The inner region will increase, and in general, the ratio ABS_MT_TOUCH_MAJOR / ABS_MT_WIDTH_MAJOR, which is always smaller than -unity, is related to the finger pressure. For pressure-based devices, +unity, is related to the contact pressure. For pressure-based devices, ABS_MT_PRESSURE may be used to provide the pressure on the contact area instead. -In addition to the MAJOR parameters, the oval shape of the finger can be +In addition to the MAJOR parameters, the oval shape of the contact can be described by adding the MINOR parameters, such that MAJOR and MINOR are the major and minor axis of an ellipse. Finally, the orientation of the oval shape can be describe with the ORIENTATION parameter. The ABS_MT_TOOL_TYPE may be used to specify whether the touching tool is a -finger or a pen or something else. Devices with more granular information +contact or a pen or something else. Devices with more granular information may specify general shapes as blobs, i.e., as a sequence of rectangular shapes grouped together by an ABS_MT_BLOB_ID. Finally, for the few devices that currently support it, the ABS_MT_TRACKING_ID event may be used to -report finger tracking from hardware [5]. +report contact tracking from hardware [5]. -Here is what a minimal event sequence for a two-finger touch would look -like: - - ABS_MT_POSITION_X - ABS_MT_POSITION_Y - SYN_MT_REPORT - ABS_MT_POSITION_X - ABS_MT_POSITION_Y - SYN_MT_REPORT - SYN_REPORT - -Here is the sequence after lifting one of the fingers: - - ABS_MT_POSITION_X - ABS_MT_POSITION_Y - SYN_MT_REPORT - SYN_REPORT - -And here is the sequence after lifting the remaining finger: - - SYN_MT_REPORT - SYN_REPORT - -If the driver reports one of BTN_TOUCH or ABS_PRESSURE in addition to the -ABS_MT events, the last SYN_MT_REPORT event may be omitted. Otherwise, the -last SYN_REPORT will be dropped by the input core, resulting in no -zero-finger event reaching userland. Event Semantics --------------- -The word "contact" is used to describe a tool which is in direct contact -with the surface. A finger, a pen or a rubber all classify as contacts. - ABS_MT_TOUCH_MAJOR The length of the major axis of the contact. The length should be given in @@ -157,15 +245,16 @@ MT_TOOL_PEN [2]. ABS_MT_BLOB_ID The BLOB_ID groups several packets together into one arbitrarily shaped -contact. This is a low-level anonymous grouping, and should not be confused -with the high-level trackingID [5]. Most kernel drivers will not have blob -capability, and can safely omit the event. +contact. This is a low-level anonymous grouping for type A devices, and +should not be confused with the high-level trackingID [5]. Most type A +devices do not have blob capability, so drivers can safely omit this event. ABS_MT_TRACKING_ID The TRACKING_ID identifies an initiated contact throughout its life cycle -[5]. There are currently only a few devices that support it, so this event -should normally be omitted. +[5]. This event is mandatory for type B devices. The value range of the +TRACKING_ID should be large enough to ensure unique identification of a +contact maintained over an extended period of time. Event Computation @@ -192,20 +281,11 @@ finger along the X axis (1). Finger Tracking --------------- -The kernel driver should generate an arbitrary enumeration of the set of -anonymous contacts currently on the surface. The order in which the packets -appear in the event stream is not important. - The process of finger tracking, i.e., to assign a unique trackingID to each -initiated contact on the surface, is left to user space; preferably the -multi-touch X driver [3]. In that driver, the trackingID stays the same and -unique until the contact vanishes (when the finger leaves the surface). The -problem of assigning a set of anonymous fingers to a set of identified -fingers is a euclidian bipartite matching problem at each event update, and -relies on a sufficiently rapid update rate. - -There are a few devices that support trackingID in hardware. User space can -make use of these native identifiers to reduce bandwidth and cpu usage. +initiated contact on the surface, is a Euclidian Bipartite Matching +problem. At each event synchronization, the set of actual contacts is +matched to the set of contacts from the previous synchronization. A full +implementation can be found in [3]. Gestures diff --git a/Documentation/isdn/INTERFACE.CAPI b/Documentation/isdn/INTERFACE.CAPI index f172091fb7c..309eb5ed942 100644 --- a/Documentation/isdn/INTERFACE.CAPI +++ b/Documentation/isdn/INTERFACE.CAPI @@ -113,12 +113,16 @@ char *driver_name int (*load_firmware)(struct capi_ctr *ctrlr, capiloaddata *ldata) (optional) pointer to a callback function for sending firmware and configuration data to the device + The function may return before the operation has completed. + Completion must be signalled by a call to capi_ctr_ready(). Return value: 0 on success, error code on error Called in process context. void (*reset_ctr)(struct capi_ctr *ctrlr) - (optional) pointer to a callback function for performing a reset on - the device, releasing all registered applications + (optional) pointer to a callback function for stopping the device, + releasing all registered applications + The function may return before the operation has completed. + Completion must be signalled by a call to capi_ctr_down(). Called in process context. void (*register_appl)(struct capi_ctr *ctrlr, u16 applid, diff --git a/Documentation/isdn/README.gigaset b/Documentation/isdn/README.gigaset index e472df84232..ef3343eaa00 100644 --- a/Documentation/isdn/README.gigaset +++ b/Documentation/isdn/README.gigaset @@ -47,9 +47,9 @@ GigaSet 307x Device Driver 1.2. Software -------- - The driver works with ISDN4linux and so can be used with any software - which is able to use ISDN4linux for ISDN connections (voice or data). - Experimental Kernel CAPI support is available as a compilation option. + The driver works with the Kernel CAPI subsystem as well as the old + ISDN4Linux subsystem, so it can be used with any software which is able + to use CAPI 2.0 or ISDN4Linux for ISDN connections (voice or data). There are some user space tools available at http://sourceforge.net/projects/gigaset307x/ @@ -152,61 +152,42 @@ GigaSet 307x Device Driver - GIGVER_FWBASE: retrieve the firmware version of the base Upon return, version[] is filled with the requested version information. -2.3. ISDN4linux - ---------- - This is the "normal" mode of operation. After loading the module you can - set up the ISDN system just as you'd do with any ISDN card supported by - the ISDN4Linux subsystem. Most distributions provide some configuration - utility. If not, you can use some HOWTOs like - http://www.linuxhaven.de/dlhp/HOWTO/DE-ISDN-HOWTO-5.html - If this doesn't work, because you have some device like SX100 where - debug output (see section 3.2.) shows something like this when dialing - CMD Received: ERROR - Available Params: 0 - Connection State: 0, Response: -1 - gigaset_process_response: resp_code -1 in ConState 0 ! - Timeout occurred - you probably need to use unimodem mode. (see section 2.5.) - -2.4. CAPI +2.3. CAPI ---- If the driver is compiled with CAPI support (kernel configuration option - GIGASET_CAPI, experimental) it can also be used with CAPI 2.0 kernel and - user space applications. For user space access, the module capi.ko must - be loaded. The capiinit command (included in the capi4k-utils package) - does this for you. - - The CAPI variant of the driver supports legacy ISDN4Linux applications - via the capidrv compatibility driver. The kernel module capidrv.ko must - be loaded explicitly with the command + GIGASET_CAPI) the devices will show up as CAPI controllers as soon as the + corresponding driver module is loaded, and can then be used with CAPI 2.0 + kernel and user space applications. For user space access, the module + capi.ko must be loaded. + + Legacy ISDN4Linux applications are supported via the capidrv + compatibility driver. The kernel module capidrv.ko must be loaded + explicitly with the command modprobe capidrv if needed, and cannot be unloaded again without unloading the driver first. (These are limitations of capidrv.) - The note about unimodem mode in the preceding section applies here, too. - -2.5. Unimodem mode - ------------- - This is needed for some devices [e.g. SX100] as they have problems with - the "normal" commands. + Most distributions handle loading and unloading of the various CAPI + modules automatically via the command capiinit(1) from the capi4k-utils + package or a similar mechanism. Note that capiinit(1) cannot unload the + Gigaset drivers because it doesn't support more than one module per + driver. - If you have installed the command line tool gigacontr, you can enter - unimodem mode using - gigacontr --mode unimodem - You can switch back using - gigacontr --mode isdn +2.4. ISDN4Linux + ---------- + If the driver is compiled without CAPI support (native ISDN4Linux + variant), it registers the device with the legacy ISDN4Linux subsystem + after loading the module. It can then be used with ISDN4Linux + applications only. Most distributions provide some configuration utility + for setting up that subsystem. Otherwise you can use some HOWTOs like + http://www.linuxhaven.de/dlhp/HOWTO/DE-ISDN-HOWTO-5.html - You can also put the driver directly into Unimodem mode when it's loaded, - by passing the module parameter startmode=0 to the hardware specific - module, e.g. - modprobe usb_gigaset startmode=0 - or by adding a line like - options usb_gigaset startmode=0 - to an appropriate module configuration file, like /etc/modprobe.d/gigaset - or /etc/modprobe.conf.local. +2.5. Unimodem mode + ------------- In this mode the device works like a modem connected to a serial port (the /dev/ttyGU0, ... mentioned above) which understands the commands + ATZ init, reset => OK or ERROR ATD @@ -234,6 +215,31 @@ GigaSet 307x Device Driver to an appropriate module configuration file, like /etc/modprobe.d/gigaset or /etc/modprobe.conf.local. + Unimodem mode is needed for making some devices [e.g. SX100] work which + do not support the regular Gigaset command set. If debug output (see + section 3.2.) shows something like this when dialing: + CMD Received: ERROR + Available Params: 0 + Connection State: 0, Response: -1 + gigaset_process_response: resp_code -1 in ConState 0 ! + Timeout occurred + then switching to unimodem mode may help. + + If you have installed the command line tool gigacontr, you can enter + unimodem mode using + gigacontr --mode unimodem + You can switch back using + gigacontr --mode isdn + + You can also put the driver directly into Unimodem mode when it's loaded, + by passing the module parameter startmode=0 to the hardware specific + module, e.g. + modprobe usb_gigaset startmode=0 + or by adding a line like + options usb_gigaset startmode=0 + to an appropriate module configuration file, like /etc/modprobe.d/gigaset + or /etc/modprobe.conf.local. + 2.6. Call-ID (CID) mode ------------------ Call-IDs are numbers used to tag commands to, and responses from, the @@ -263,7 +269,22 @@ GigaSet 307x Device Driver change its CID mode while the driver is loaded, eg. echo 0 > /sys/class/tty/ttyGU0/cidmode -2.7. Unregistered Wireless Devices (M101/M105) +2.7. Dialing Numbers + --------------- + The called party number provided by an application for dialing out must + be a public network number according to the local dialing plan, without + any dial prefix for getting an outside line. + + Internal calls can be made by providing an internal extension number + prefixed with "**" (two asterisks) as the called party number. So to dial + eg. the first registered DECT handset, give "**11" as the called party + number. Dialing "***" (three asterisks) calls all extensions + simultaneously (global call). + + This holds for both CAPI 2.0 and ISDN4Linux applications. Unimodem mode + does not support internal calls. + +2.8. Unregistered Wireless Devices (M101/M105) ----------------------------------------- The main purpose of the ser_gigaset and usb_gigaset drivers is to allow the M101 and M105 wireless devices to be used as ISDN devices for ISDN diff --git a/Documentation/kbuild/kbuild.txt b/Documentation/kbuild/kbuild.txt index 6f8c1cabbc5..634c625da8c 100644 --- a/Documentation/kbuild/kbuild.txt +++ b/Documentation/kbuild/kbuild.txt @@ -65,7 +65,7 @@ CROSS_COMPILE Specify an optional fixed part of the binutils filename. CROSS_COMPILE can be a part of the filename or the full path. -CROSS_COMPILE is also used for ccache is some setups. +CROSS_COMPILE is also used for ccache in some setups. CF -------------------------------------------------- @@ -162,3 +162,7 @@ For tags/TAGS/cscope targets, you can specify more than one arch to be included in the databases, separated by blank space. E.g.: $ make ALLSOURCE_ARCHS="x86 mips arm" tags + +To get all available archs you can also specify all. E.g.: + + $ make ALLSOURCE_ARCHS=all tags diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index b56ea860da2..d9239d5f3ad 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -93,6 +93,7 @@ parameter is applicable: Documentation/scsi/. SECURITY Different security models are enabled. SELINUX SELinux support is enabled. + APPARMOR AppArmor support is enabled. SERIAL Serial support is enabled. SH SuperH architecture is enabled. SMP The kernel is an SMP kernel. @@ -145,11 +146,10 @@ and is between 256 and 4096 characters. It is defined in the file acpi= [HW,ACPI,X86] Advanced Configuration and Power Interface - Format: { force | off | ht | strict | noirq | rsdt } + Format: { force | off | strict | noirq | rsdt } force -- enable ACPI if default was off off -- disable ACPI if default was on noirq -- do not use ACPI for IRQ routing - ht -- run only enough ACPI to enable Hyper Threading strict -- Be less tolerant of platforms that are not strictly ACPI specification compliant. rsdt -- prefer RSDT over (default) XSDT @@ -255,8 +255,8 @@ and is between 256 and 4096 characters. It is defined in the file control method, with respect to putting devices into low power states, to be enforced (the ACPI 2.0 ordering of _PTS is used by default). - s4_nonvs prevents the kernel from saving/restoring the - ACPI NVS memory during hibernation. + nonvs prevents the kernel from saving/restoring the + ACPI NVS memory during suspend/hibernation and resume. sci_force_enable causes the kernel to set SCI_EN directly on resume from S1/S3 (which is against the ACPI spec, but some broken systems don't work without it). @@ -758,6 +758,10 @@ and is between 256 and 4096 characters. It is defined in the file Default value is 0. Value can be changed at runtime via /selinux/enforce. + erst_disable [ACPI] + Disable Error Record Serialization Table (ERST) + support. + ether= [HW,NET] Ethernet cards parameters This option is obsoleted by the "netdev=" option, which has equivalent usage. See its documentation for details. @@ -852,6 +856,11 @@ and is between 256 and 4096 characters. It is defined in the file hd= [EIDE] (E)IDE hard drive subsystem geometry Format: <cyl>,<head>,<sect> + hest_disable [ACPI] + Disable Hardware Error Source Table (HEST) support; + corresponding firmware-first mode error processing + logic will be disabled. + highmem=nn[KMG] [KNL,BOOT] forces the highmem zone to have an exact size of <nn>. This works even on boxes that have no highmem otherwise. This also works to reduce highmem @@ -1252,10 +1261,12 @@ and is between 256 and 4096 characters. It is defined in the file * nohrst, nosrst, norst: suppress hard, soft and both resets. + * dump_id: dump IDENTIFY data. + If there are multiple matching configurations changing the same attribute, the last one is used. - lmb=debug [KNL] Enable lmb debug messages. + memblock=debug [KNL] Enable memblock debug messages. load_ramdisk= [RAM] List of ramdisks to load from floppy See Documentation/blockdev/ramdisk.txt. @@ -1587,8 +1598,7 @@ and is between 256 and 4096 characters. It is defined in the file [NETFILTER] Enable connection tracking flow accounting 0 to disable accounting 1 to enable accounting - Default value depends on CONFIG_NF_CT_ACCT that is - going to be removed in 2.6.29. + Default value is 0. nfsaddrs= [NFS] See Documentation/filesystems/nfs/nfsroot.txt. @@ -2038,7 +2048,9 @@ and is between 256 and 4096 characters. It is defined in the file WARNING: Forcing ASPM on may cause system lockups. pcie_pme= [PCIE,PM] Native PCIe PME signaling options: - off Do not use native PCIe PME signaling. + Format: {auto|force}[,nomsi] + auto Use native PCIe PME signaling if the BIOS allows the + kernel to control PCIe config registers of root ports. force Use native PCIe PME signaling even if the BIOS refuses to allow the kernel to control the relevant PCIe config registers. @@ -2300,6 +2312,13 @@ and is between 256 and 4096 characters. It is defined in the file If enabled at boot time, /selinux/disable can be used later to disable prior to initial policy load. + apparmor= [APPARMOR] Disable or enable AppArmor at boot time + Format: { "0" | "1" } + See security/apparmor/Kconfig help text + 0 -- disable. + 1 -- enable. + Default value is set via kernel config option. + serialnumber [BUGS=X86-32] shapers= [NET] diff --git a/Documentation/kvm/api.txt b/Documentation/kvm/api.txt index a237518e51b..5f5b64982b1 100644 --- a/Documentation/kvm/api.txt +++ b/Documentation/kvm/api.txt @@ -126,6 +126,10 @@ user fills in the size of the indices array in nmsrs, and in return kvm adjusts nmsrs to reflect the actual number of msrs and fills in the indices array with their numbers. +Note: if kvm indicates supports MCE (KVM_CAP_MCE), then the MCE bank MSRs are +not returned in the MSR list, as different vcpus can have a different number +of banks, as set via the KVM_X86_SETUP_MCE ioctl. + 4.4 KVM_CHECK_EXTENSION Capability: basic @@ -160,29 +164,7 @@ Type: vm ioctl Parameters: struct kvm_memory_region (in) Returns: 0 on success, -1 on error -struct kvm_memory_region { - __u32 slot; - __u32 flags; - __u64 guest_phys_addr; - __u64 memory_size; /* bytes */ -}; - -/* for kvm_memory_region::flags */ -#define KVM_MEM_LOG_DIRTY_PAGES 1UL - -This ioctl allows the user to create or modify a guest physical memory -slot. When changing an existing slot, it may be moved in the guest -physical memory space, or its flags may be modified. It may not be -resized. Slots may not overlap. - -The flags field supports just one flag, KVM_MEM_LOG_DIRTY_PAGES, which -instructs kvm to keep track of writes to memory within the slot. See -the KVM_GET_DIRTY_LOG ioctl. - -It is recommended to use the KVM_SET_USER_MEMORY_REGION ioctl instead -of this API, if available. This newer API allows placing guest memory -at specified locations in the host address space, yielding better -control and easy access. +This ioctl is obsolete and has been removed. 4.6 KVM_CREATE_VCPU @@ -226,17 +208,7 @@ Type: vm ioctl Parameters: struct kvm_memory_alias (in) Returns: 0 (success), -1 (error) -struct kvm_memory_alias { - __u32 slot; /* this has a different namespace than memory slots */ - __u32 flags; - __u64 guest_phys_addr; - __u64 memory_size; - __u64 target_phys_addr; -}; - -Defines a guest physical address space region as an alias to another -region. Useful for aliased address, for example the VGA low memory -window. Should not be used with userspace memory. +This ioctl is obsolete and has been removed. 4.9 KVM_RUN @@ -892,6 +864,174 @@ arguments. This ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel irqchip, the multiprocessing state must be maintained by userspace. +4.39 KVM_SET_IDENTITY_MAP_ADDR + +Capability: KVM_CAP_SET_IDENTITY_MAP_ADDR +Architectures: x86 +Type: vm ioctl +Parameters: unsigned long identity (in) +Returns: 0 on success, -1 on error + +This ioctl defines the physical address of a one-page region in the guest +physical address space. The region must be within the first 4GB of the +guest physical address space and must not conflict with any memory slot +or any mmio address. The guest may malfunction if it accesses this memory +region. + +This ioctl is required on Intel-based hosts. This is needed on Intel hardware +because of a quirk in the virtualization implementation (see the internals +documentation when it pops into existence). + +4.40 KVM_SET_BOOT_CPU_ID + +Capability: KVM_CAP_SET_BOOT_CPU_ID +Architectures: x86, ia64 +Type: vm ioctl +Parameters: unsigned long vcpu_id +Returns: 0 on success, -1 on error + +Define which vcpu is the Bootstrap Processor (BSP). Values are the same +as the vcpu id in KVM_CREATE_VCPU. If this ioctl is not called, the default +is vcpu 0. + +4.41 KVM_GET_XSAVE + +Capability: KVM_CAP_XSAVE +Architectures: x86 +Type: vcpu ioctl +Parameters: struct kvm_xsave (out) +Returns: 0 on success, -1 on error + +struct kvm_xsave { + __u32 region[1024]; +}; + +This ioctl would copy current vcpu's xsave struct to the userspace. + +4.42 KVM_SET_XSAVE + +Capability: KVM_CAP_XSAVE +Architectures: x86 +Type: vcpu ioctl +Parameters: struct kvm_xsave (in) +Returns: 0 on success, -1 on error + +struct kvm_xsave { + __u32 region[1024]; +}; + +This ioctl would copy userspace's xsave struct to the kernel. + +4.43 KVM_GET_XCRS + +Capability: KVM_CAP_XCRS +Architectures: x86 +Type: vcpu ioctl +Parameters: struct kvm_xcrs (out) +Returns: 0 on success, -1 on error + +struct kvm_xcr { + __u32 xcr; + __u32 reserved; + __u64 value; +}; + +struct kvm_xcrs { + __u32 nr_xcrs; + __u32 flags; + struct kvm_xcr xcrs[KVM_MAX_XCRS]; + __u64 padding[16]; +}; + +This ioctl would copy current vcpu's xcrs to the userspace. + +4.44 KVM_SET_XCRS + +Capability: KVM_CAP_XCRS +Architectures: x86 +Type: vcpu ioctl +Parameters: struct kvm_xcrs (in) +Returns: 0 on success, -1 on error + +struct kvm_xcr { + __u32 xcr; + __u32 reserved; + __u64 value; +}; + +struct kvm_xcrs { + __u32 nr_xcrs; + __u32 flags; + struct kvm_xcr xcrs[KVM_MAX_XCRS]; + __u64 padding[16]; +}; + +This ioctl would set vcpu's xcr to the value userspace specified. + +4.45 KVM_GET_SUPPORTED_CPUID + +Capability: KVM_CAP_EXT_CPUID +Architectures: x86 +Type: system ioctl +Parameters: struct kvm_cpuid2 (in/out) +Returns: 0 on success, -1 on error + +struct kvm_cpuid2 { + __u32 nent; + __u32 padding; + struct kvm_cpuid_entry2 entries[0]; +}; + +#define KVM_CPUID_FLAG_SIGNIFCANT_INDEX 1 +#define KVM_CPUID_FLAG_STATEFUL_FUNC 2 +#define KVM_CPUID_FLAG_STATE_READ_NEXT 4 + +struct kvm_cpuid_entry2 { + __u32 function; + __u32 index; + __u32 flags; + __u32 eax; + __u32 ebx; + __u32 ecx; + __u32 edx; + __u32 padding[3]; +}; + +This ioctl returns x86 cpuid features which are supported by both the hardware +and kvm. Userspace can use the information returned by this ioctl to +construct cpuid information (for KVM_SET_CPUID2) that is consistent with +hardware, kernel, and userspace capabilities, and with user requirements (for +example, the user may wish to constrain cpuid to emulate older hardware, +or for feature consistency across a cluster). + +Userspace invokes KVM_GET_SUPPORTED_CPUID by passing a kvm_cpuid2 structure +with the 'nent' field indicating the number of entries in the variable-size +array 'entries'. If the number of entries is too low to describe the cpu +capabilities, an error (E2BIG) is returned. If the number is too high, +the 'nent' field is adjusted and an error (ENOMEM) is returned. If the +number is just right, the 'nent' field is adjusted to the number of valid +entries in the 'entries' array, which is then filled. + +The entries returned are the host cpuid as returned by the cpuid instruction, +with unknown or unsupported features masked out. The fields in each entry +are defined as follows: + + function: the eax value used to obtain the entry + index: the ecx value used to obtain the entry (for entries that are + affected by ecx) + flags: an OR of zero or more of the following: + KVM_CPUID_FLAG_SIGNIFCANT_INDEX: + if the index field is valid + KVM_CPUID_FLAG_STATEFUL_FUNC: + if cpuid for this function returns different values for successive + invocations; there will be several entries with the same function, + all with this flag set + KVM_CPUID_FLAG_STATE_READ_NEXT: + for KVM_CPUID_FLAG_STATEFUL_FUNC entries, set if this entry is + the first entry to be read by a cpu + eax, ebx, ecx, edx: the values returned by the cpuid instruction for + this function/index combination + 5. The kvm_run structure Application code obtains a pointer to the kvm_run structure by diff --git a/Documentation/kvm/mmu.txt b/Documentation/kvm/mmu.txt index aaed6ab9d7a..142cc513665 100644 --- a/Documentation/kvm/mmu.txt +++ b/Documentation/kvm/mmu.txt @@ -77,10 +77,10 @@ Memory Guest memory (gpa) is part of the user address space of the process that is using kvm. Userspace defines the translation between guest addresses and user -addresses (gpa->hva); note that two gpas may alias to the same gva, but not +addresses (gpa->hva); note that two gpas may alias to the same hva, but not vice versa. -These gvas may be backed using any method available to the host: anonymous +These hvas may be backed using any method available to the host: anonymous memory, file backed memory, and device memory. Memory might be paged by the host at any time. @@ -161,7 +161,7 @@ Shadow pages contain the following information: role.cr4_pae: Contains the value of cr4.pae for which the page is valid (e.g. whether 32-bit or 64-bit gptes are in use). - role.cr4_nxe: + role.nxe: Contains the value of efer.nxe for which the page is valid. role.cr0_wp: Contains the value of cr0.wp for which the page is valid. @@ -180,7 +180,9 @@ Shadow pages contain the following information: guest pages as leaves. gfns: An array of 512 guest frame numbers, one for each present pte. Used to - perform a reverse map from a pte to a gfn. + perform a reverse map from a pte to a gfn. When role.direct is set, any + element of this array can be calculated from the gfn field when used, in + this case, the array of gfns is not allocated. See role.direct and gfn. slot_bitmap: A bitmap containing one bit per memory slot. If the page contains a pte mapping a page from memory slot n, then bit n of slot_bitmap will be set @@ -296,6 +298,48 @@ Host translation updates: - look up affected sptes through reverse map - drop (or update) translations +Emulating cr0.wp +================ + +If tdp is not enabled, the host must keep cr0.wp=1 so page write protection +works for the guest kernel, not guest guest userspace. When the guest +cr0.wp=1, this does not present a problem. However when the guest cr0.wp=0, +we cannot map the permissions for gpte.u=1, gpte.w=0 to any spte (the +semantics require allowing any guest kernel access plus user read access). + +We handle this by mapping the permissions to two possible sptes, depending +on fault type: + +- kernel write fault: spte.u=0, spte.w=1 (allows full kernel access, + disallows user access) +- read fault: spte.u=1, spte.w=0 (allows full read access, disallows kernel + write access) + +(user write faults generate a #PF) + +Large pages +=========== + +The mmu supports all combinations of large and small guest and host pages. +Supported page sizes include 4k, 2M, 4M, and 1G. 4M pages are treated as +two separate 2M pages, on both guest and host, since the mmu always uses PAE +paging. + +To instantiate a large spte, four constraints must be satisfied: + +- the spte must point to a large host page +- the guest pte must be a large pte of at least equivalent size (if tdp is + enabled, there is no guest pte and this condition is satisified) +- if the spte will be writeable, the large page frame may not overlap any + write-protected pages +- the guest page must be wholly contained by a single memory slot + +To check the last two conditions, the mmu maintains a ->write_count set of +arrays for each memory slot and large page size. Every write protected page +causes its write_count to be incremented, thus preventing instantiation of +a large spte. The frames at the end of an unaligned memory slot have +artificically inflated ->write_counts so they can never be instantiated. + Further reading =============== diff --git a/Documentation/kvm/msr.txt b/Documentation/kvm/msr.txt new file mode 100644 index 00000000000..8ddcfe84c09 --- /dev/null +++ b/Documentation/kvm/msr.txt @@ -0,0 +1,153 @@ +KVM-specific MSRs. +Glauber Costa <glommer@redhat.com>, Red Hat Inc, 2010 +===================================================== + +KVM makes use of some custom MSRs to service some requests. +At present, this facility is only used by kvmclock. + +Custom MSRs have a range reserved for them, that goes from +0x4b564d00 to 0x4b564dff. There are MSRs outside this area, +but they are deprecated and their use is discouraged. + +Custom MSR list +-------- + +The current supported Custom MSR list is: + +MSR_KVM_WALL_CLOCK_NEW: 0x4b564d00 + + data: 4-byte alignment physical address of a memory area which must be + in guest RAM. This memory is expected to hold a copy of the following + structure: + + struct pvclock_wall_clock { + u32 version; + u32 sec; + u32 nsec; + } __attribute__((__packed__)); + + whose data will be filled in by the hypervisor. The hypervisor is only + guaranteed to update this data at the moment of MSR write. + Users that want to reliably query this information more than once have + to write more than once to this MSR. Fields have the following meanings: + + version: guest has to check version before and after grabbing + time information and check that they are both equal and even. + An odd version indicates an in-progress update. + + sec: number of seconds for wallclock. + + nsec: number of nanoseconds for wallclock. + + Note that although MSRs are per-CPU entities, the effect of this + particular MSR is global. + + Availability of this MSR must be checked via bit 3 in 0x4000001 cpuid + leaf prior to usage. + +MSR_KVM_SYSTEM_TIME_NEW: 0x4b564d01 + + data: 4-byte aligned physical address of a memory area which must be in + guest RAM, plus an enable bit in bit 0. This memory is expected to hold + a copy of the following structure: + + struct pvclock_vcpu_time_info { + u32 version; + u32 pad0; + u64 tsc_timestamp; + u64 system_time; + u32 tsc_to_system_mul; + s8 tsc_shift; + u8 flags; + u8 pad[2]; + } __attribute__((__packed__)); /* 32 bytes */ + + whose data will be filled in by the hypervisor periodically. Only one + write, or registration, is needed for each VCPU. The interval between + updates of this structure is arbitrary and implementation-dependent. + The hypervisor may update this structure at any time it sees fit until + anything with bit0 == 0 is written to it. + + Fields have the following meanings: + + version: guest has to check version before and after grabbing + time information and check that they are both equal and even. + An odd version indicates an in-progress update. + + tsc_timestamp: the tsc value at the current VCPU at the time + of the update of this structure. Guests can subtract this value + from current tsc to derive a notion of elapsed time since the + structure update. + + system_time: a host notion of monotonic time, including sleep + time at the time this structure was last updated. Unit is + nanoseconds. + + tsc_to_system_mul: a function of the tsc frequency. One has + to multiply any tsc-related quantity by this value to get + a value in nanoseconds, besides dividing by 2^tsc_shift + + tsc_shift: cycle to nanosecond divider, as a power of two, to + allow for shift rights. One has to shift right any tsc-related + quantity by this value to get a value in nanoseconds, besides + multiplying by tsc_to_system_mul. + + With this information, guests can derive per-CPU time by + doing: + + time = (current_tsc - tsc_timestamp) + time = (time * tsc_to_system_mul) >> tsc_shift + time = time + system_time + + flags: bits in this field indicate extended capabilities + coordinated between the guest and the hypervisor. Availability + of specific flags has to be checked in 0x40000001 cpuid leaf. + Current flags are: + + flag bit | cpuid bit | meaning + ------------------------------------------------------------- + | | time measures taken across + 0 | 24 | multiple cpus are guaranteed to + | | be monotonic + ------------------------------------------------------------- + + Availability of this MSR must be checked via bit 3 in 0x4000001 cpuid + leaf prior to usage. + + +MSR_KVM_WALL_CLOCK: 0x11 + + data and functioning: same as MSR_KVM_WALL_CLOCK_NEW. Use that instead. + + This MSR falls outside the reserved KVM range and may be removed in the + future. Its usage is deprecated. + + Availability of this MSR must be checked via bit 0 in 0x4000001 cpuid + leaf prior to usage. + +MSR_KVM_SYSTEM_TIME: 0x12 + + data and functioning: same as MSR_KVM_SYSTEM_TIME_NEW. Use that instead. + + This MSR falls outside the reserved KVM range and may be removed in the + future. Its usage is deprecated. + + Availability of this MSR must be checked via bit 0 in 0x4000001 cpuid + leaf prior to usage. + + The suggested algorithm for detecting kvmclock presence is then: + + if (!kvm_para_available()) /* refer to cpuid.txt */ + return NON_PRESENT; + + flags = cpuid_eax(0x40000001); + if (flags & 3) { + msr_kvm_system_time = MSR_KVM_SYSTEM_TIME_NEW; + msr_kvm_wall_clock = MSR_KVM_WALL_CLOCK_NEW; + return PRESENT; + } else if (flags & 0) { + msr_kvm_system_time = MSR_KVM_SYSTEM_TIME; + msr_kvm_wall_clock = MSR_KVM_WALL_CLOCK; + return PRESENT; + } else + return NON_PRESENT; diff --git a/Documentation/kvm/review-checklist.txt b/Documentation/kvm/review-checklist.txt new file mode 100644 index 00000000000..730475ae1b8 --- /dev/null +++ b/Documentation/kvm/review-checklist.txt @@ -0,0 +1,38 @@ +Review checklist for kvm patches +================================ + +1. The patch must follow Documentation/CodingStyle and + Documentation/SubmittingPatches. + +2. Patches should be against kvm.git master branch. + +3. If the patch introduces or modifies a new userspace API: + - the API must be documented in Documentation/kvm/api.txt + - the API must be discoverable using KVM_CHECK_EXTENSION + +4. New state must include support for save/restore. + +5. New features must default to off (userspace should explicitly request them). + Performance improvements can and should default to on. + +6. New cpu features should be exposed via KVM_GET_SUPPORTED_CPUID2 + +7. Emulator changes should be accompanied by unit tests for qemu-kvm.git + kvm/test directory. + +8. Changes should be vendor neutral when possible. Changes to common code + are better than duplicating changes to vendor code. + +9. Similarly, prefer changes to arch independent code than to arch dependent + code. + +10. User/kernel interfaces and guest/host interfaces must be 64-bit clean + (all variables and sizes naturally aligned on 64-bit; use specific types + only - u64 rather than ulong). + +11. New guest visible features must either be documented in a hardware manual + or be accompanied by documentation. + +12. Features must be robust against reset and kexec - for example, shared + host/guest memory must be unshared to prevent the host from writing to + guest memory that the guest has not reserved for this purpose. diff --git a/Documentation/laptops/thinkpad-acpi.txt b/Documentation/laptops/thinkpad-acpi.txt index fc15538d8b4..f6f80257add 100644 --- a/Documentation/laptops/thinkpad-acpi.txt +++ b/Documentation/laptops/thinkpad-acpi.txt @@ -960,70 +960,21 @@ Sysfs notes: subsystem, and follow all of the hwmon guidelines at Documentation/hwmon. +EXPERIMENTAL: Embedded controller register dump +----------------------------------------------- -EXPERIMENTAL: Embedded controller register dump -- /proc/acpi/ibm/ecdump ------------------------------------------------------------------------- - -This feature is marked EXPERIMENTAL because the implementation -directly accesses hardware registers and may not work as expected. USE -WITH CAUTION! To use this feature, you need to supply the -experimental=1 parameter when loading the module. - -This feature dumps the values of 256 embedded controller -registers. Values which have changed since the last time the registers -were dumped are marked with a star: - -[root@x40 ibm-acpi]# cat /proc/acpi/ibm/ecdump -EC +00 +01 +02 +03 +04 +05 +06 +07 +08 +09 +0a +0b +0c +0d +0e +0f -EC 0x00: a7 47 87 01 fe 96 00 08 01 00 cb 00 00 00 40 00 -EC 0x10: 00 00 ff ff f4 3c 87 09 01 ff 42 01 ff ff 0d 00 -EC 0x20: 00 00 00 00 00 00 00 00 00 00 00 03 43 00 00 80 -EC 0x30: 01 07 1a 00 30 04 00 00 *85 00 00 10 00 50 00 00 -EC 0x40: 00 00 00 00 00 00 14 01 00 04 00 00 00 00 00 00 -EC 0x50: 00 c0 02 0d 00 01 01 02 02 03 03 03 03 *bc *02 *bc -EC 0x60: *02 *bc *02 00 00 00 00 00 00 00 00 00 00 00 00 00 -EC 0x70: 00 00 00 00 00 12 30 40 *24 *26 *2c *27 *20 80 *1f 80 -EC 0x80: 00 00 00 06 *37 *0e 03 00 00 00 0e 07 00 00 00 00 -EC 0x90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 -EC 0xa0: *ff 09 ff 09 ff ff *64 00 *00 *00 *a2 41 *ff *ff *e0 00 -EC 0xb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 -EC 0xc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 -EC 0xd0: 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 -EC 0xe0: 00 00 00 00 00 00 00 00 11 20 49 04 24 06 55 03 -EC 0xf0: 31 55 48 54 35 38 57 57 08 2f 45 73 07 65 6c 1a - -This feature can be used to determine the register holding the fan -speed on some models. To do that, do the following: +This feature is not included in the thinkpad driver anymore. +Instead the EC can be accessed through /sys/kernel/debug/ec with +a userspace tool which can be found here: +ftp://ftp.suse.com/pub/people/trenn/sources/ec +Use it to determine the register holding the fan +speed on some models. To do that, do the following: - make sure the battery is fully charged - make sure the fan is running - - run 'cat /proc/acpi/ibm/ecdump' several times, once per second or so - -The first step makes sure various charging-related values don't -vary. The second ensures that the fan-related values do vary, since -the fan speed fluctuates a bit. The third will (hopefully) mark the -fan register with a star: - -[root@x40 ibm-acpi]# cat /proc/acpi/ibm/ecdump -EC +00 +01 +02 +03 +04 +05 +06 +07 +08 +09 +0a +0b +0c +0d +0e +0f -EC 0x00: a7 47 87 01 fe 96 00 08 01 00 cb 00 00 00 40 00 -EC 0x10: 00 00 ff ff f4 3c 87 09 01 ff 42 01 ff ff 0d 00 -EC 0x20: 00 00 00 00 00 00 00 00 00 00 00 03 43 00 00 80 -EC 0x30: 01 07 1a 00 30 04 00 00 85 00 00 10 00 50 00 00 -EC 0x40: 00 00 00 00 00 00 14 01 00 04 00 00 00 00 00 00 -EC 0x50: 00 c0 02 0d 00 01 01 02 02 03 03 03 03 bc 02 bc -EC 0x60: 02 bc 02 00 00 00 00 00 00 00 00 00 00 00 00 00 -EC 0x70: 00 00 00 00 00 12 30 40 24 27 2c 27 21 80 1f 80 -EC 0x80: 00 00 00 06 *be 0d 03 00 00 00 0e 07 00 00 00 00 -EC 0x90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 -EC 0xa0: ff 09 ff 09 ff ff 64 00 00 00 a2 41 ff ff e0 00 -EC 0xb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 -EC 0xc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 -EC 0xd0: 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 -EC 0xe0: 00 00 00 00 00 00 00 00 11 20 49 04 24 06 55 03 -EC 0xf0: 31 55 48 54 35 38 57 57 08 2f 45 73 07 65 6c 1a - -Another set of values that varies often is the temperature + - use above mentioned tool to read out the EC + +Often fan and temperature values vary between readings. Since temperatures don't change vary fast, you can take several quick dumps to eliminate them. diff --git a/Documentation/mutex-design.txt b/Documentation/mutex-design.txt index aa60d1f627e..c91ccc0720f 100644 --- a/Documentation/mutex-design.txt +++ b/Documentation/mutex-design.txt @@ -66,14 +66,14 @@ of advantages of mutexes: c0377ccb <mutex_lock>: c0377ccb: f0 ff 08 lock decl (%eax) - c0377cce: 78 0e js c0377cde <.text.lock.mutex> + c0377cce: 78 0e js c0377cde <.text..lock.mutex> c0377cd0: c3 ret the unlocking fastpath is equally tight: c0377cd1 <mutex_unlock>: c0377cd1: f0 ff 00 lock incl (%eax) - c0377cd4: 7e 0f jle c0377ce5 <.text.lock.mutex+0x7> + c0377cd4: 7e 0f jle c0377ce5 <.text..lock.mutex+0x7> c0377cd6: c3 ret - 'struct mutex' semantics are well-defined and are enforced if diff --git a/Documentation/networking/README.ipw2200 b/Documentation/networking/README.ipw2200 index 80c728522c4..e4d3267071e 100644 --- a/Documentation/networking/README.ipw2200 +++ b/Documentation/networking/README.ipw2200 @@ -171,7 +171,7 @@ Where the supported parameter are: led Can be used to turn on experimental LED code. - 0 = Off, 1 = On. Default is 0. + 0 = Off, 1 = On. Default is 1. mode Can be used to set the default mode of the adapter. diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt index 61f516b135b..d0914781830 100644 --- a/Documentation/networking/bonding.txt +++ b/Documentation/networking/bonding.txt @@ -49,6 +49,7 @@ Table of Contents 3.3 Configuring Bonding Manually with Ifenslave 3.3.1 Configuring Multiple Bonds Manually 3.4 Configuring Bonding Manually via Sysfs +3.5 Overriding Configuration for Special Cases 4. Querying Bonding Configuration 4.1 Bonding Configuration @@ -1318,8 +1319,87 @@ echo 2000 > /sys/class/net/bond1/bonding/arp_interval echo +eth2 > /sys/class/net/bond1/bonding/slaves echo +eth3 > /sys/class/net/bond1/bonding/slaves - -4. Querying Bonding Configuration +3.5 Overriding Configuration for Special Cases +---------------------------------------------- +When using the bonding driver, the physical port which transmits a frame is +typically selected by the bonding driver, and is not relevant to the user or +system administrator. The output port is simply selected using the policies of +the selected bonding mode. On occasion however, it is helpful to direct certain +classes of traffic to certain physical interfaces on output to implement +slightly more complex policies. For example, to reach a web server over a +bonded interface in which eth0 connects to a private network, while eth1 +connects via a public network, it may be desirous to bias the bond to send said +traffic over eth0 first, using eth1 only as a fall back, while all other traffic +can safely be sent over either interface. Such configurations may be achieved +using the traffic control utilities inherent in linux. + +By default the bonding driver is multiqueue aware and 16 queues are created +when the driver initializes (see Documentation/networking/multiqueue.txt +for details). If more or less queues are desired the module parameter +tx_queues can be used to change this value. There is no sysfs parameter +available as the allocation is done at module init time. + +The output of the file /proc/net/bonding/bondX has changed so the output Queue +ID is now printed for each slave: + +Bonding Mode: fault-tolerance (active-backup) +Primary Slave: None +Currently Active Slave: eth0 +MII Status: up +MII Polling Interval (ms): 0 +Up Delay (ms): 0 +Down Delay (ms): 0 + +Slave Interface: eth0 +MII Status: up +Link Failure Count: 0 +Permanent HW addr: 00:1a:a0:12:8f:cb +Slave queue ID: 0 + +Slave Interface: eth1 +MII Status: up +Link Failure Count: 0 +Permanent HW addr: 00:1a:a0:12:8f:cc +Slave queue ID: 2 + +The queue_id for a slave can be set using the command: + +# echo "eth1:2" > /sys/class/net/bond0/bonding/queue_id + +Any interface that needs a queue_id set should set it with multiple calls +like the one above until proper priorities are set for all interfaces. On +distributions that allow configuration via initscripts, multiple 'queue_id' +arguments can be added to BONDING_OPTS to set all needed slave queues. + +These queue id's can be used in conjunction with the tc utility to configure +a multiqueue qdisc and filters to bias certain traffic to transmit on certain +slave devices. For instance, say we wanted, in the above configuration to +force all traffic bound to 192.168.1.100 to use eth1 in the bond as its output +device. The following commands would accomplish this: + +# tc qdisc add dev bond0 handle 1 root multiq + +# tc filter add dev bond0 protocol ip parent 1: prio 1 u32 match ip dst \ + 192.168.1.100 action skbedit queue_mapping 2 + +These commands tell the kernel to attach a multiqueue queue discipline to the +bond0 interface and filter traffic enqueued to it, such that packets with a dst +ip of 192.168.1.100 have their output queue mapping value overwritten to 2. +This value is then passed into the driver, causing the normal output path +selection policy to be overridden, selecting instead qid 2, which maps to eth1. + +Note that qid values begin at 1. Qid 0 is reserved to initiate to the driver +that normal output policy selection should take place. One benefit to simply +leaving the qid for a slave to 0 is the multiqueue awareness in the bonding +driver that is now present. This awareness allows tc filters to be placed on +slave devices as well as bond devices and the bonding driver will simply act as +a pass-through for selecting output queues on the slave device rather than +output port selection. + +This feature first appeared in bonding driver version 3.7.0 and support for +output slave selection was limited to round-robin and active-backup modes. + +4 Querying Bonding Configuration ================================= 4.1 Bonding Configuration diff --git a/Documentation/networking/caif/spi_porting.txt b/Documentation/networking/caif/spi_porting.txt new file mode 100644 index 00000000000..61d7c924745 --- /dev/null +++ b/Documentation/networking/caif/spi_porting.txt @@ -0,0 +1,208 @@ +- CAIF SPI porting - + +- CAIF SPI basics: + +Running CAIF over SPI needs some extra setup, owing to the nature of SPI. +Two extra GPIOs have been added in order to negotiate the transfers + between the master and the slave. The minimum requirement for running +CAIF over SPI is a SPI slave chip and two GPIOs (more details below). +Please note that running as a slave implies that you need to keep up +with the master clock. An overrun or underrun event is fatal. + +- CAIF SPI framework: + +To make porting as easy as possible, the CAIF SPI has been divided in +two parts. The first part (called the interface part) deals with all +generic functionality such as length framing, SPI frame negotiation +and SPI frame delivery and transmission. The other part is the CAIF +SPI slave device part, which is the module that you have to write if +you want to run SPI CAIF on a new hardware. This part takes care of +the physical hardware, both with regard to SPI and to GPIOs. + +- Implementing a CAIF SPI device: + + - Functionality provided by the CAIF SPI slave device: + + In order to implement a SPI device you will, as a minimum, + need to implement the following + functions: + + int (*init_xfer) (struct cfspi_xfer * xfer, struct cfspi_dev *dev): + + This function is called by the CAIF SPI interface to give + you a chance to set up your hardware to be ready to receive + a stream of data from the master. The xfer structure contains + both physical and logical adresses, as well as the total length + of the transfer in both directions.The dev parameter can be used + to map to different CAIF SPI slave devices. + + void (*sig_xfer) (bool xfer, struct cfspi_dev *dev): + + This function is called by the CAIF SPI interface when the output + (SPI_INT) GPIO needs to change state. The boolean value of the xfer + variable indicates whether the GPIO should be asserted (HIGH) or + deasserted (LOW). The dev parameter can be used to map to different CAIF + SPI slave devices. + + - Functionality provided by the CAIF SPI interface: + + void (*ss_cb) (bool assert, struct cfspi_ifc *ifc); + + This function is called by the CAIF SPI slave device in order to + signal a change of state of the input GPIO (SS) to the interface. + Only active edges are mandatory to be reported. + This function can be called from IRQ context (recommended in order + not to introduce latency). The ifc parameter should be the pointer + returned from the platform probe function in the SPI device structure. + + void (*xfer_done_cb) (struct cfspi_ifc *ifc); + + This function is called by the CAIF SPI slave device in order to + report that a transfer is completed. This function should only be + called once both the transmission and the reception are completed. + This function can be called from IRQ context (recommended in order + not to introduce latency). The ifc parameter should be the pointer + returned from the platform probe function in the SPI device structure. + + - Connecting the bits and pieces: + + - Filling in the SPI slave device structure: + + Connect the necessary callback functions. + Indicate clock speed (used to calculate toggle delays). + Chose a suitable name (helps debugging if you use several CAIF + SPI slave devices). + Assign your private data (can be used to map to your structure). + + - Filling in the SPI slave platform device structure: + Add name of driver to connect to ("cfspi_sspi"). + Assign the SPI slave device structure as platform data. + +- Padding: + +In order to optimize throughput, a number of SPI padding options are provided. +Padding can be enabled independently for uplink and downlink transfers. +Padding can be enabled for the head, the tail and for the total frame size. +The padding needs to be correctly configured on both sides of the link. +The padding can be changed via module parameters in cfspi_sspi.c or via +the sysfs directory of the cfspi_sspi driver (before device registration). + +- CAIF SPI device template: + +/* + * Copyright (C) ST-Ericsson AB 2010 + * Author: Daniel Martensson / Daniel.Martensson@stericsson.com + * License terms: GNU General Public License (GPL), version 2. + * + */ + +#include <linux/init.h> +#include <linux/module.h> +#include <linux/device.h> +#include <linux/wait.h> +#include <linux/interrupt.h> +#include <linux/dma-mapping.h> +#include <net/caif/caif_spi.h> + +MODULE_LICENSE("GPL"); + +struct sspi_struct { + struct cfspi_dev sdev; + struct cfspi_xfer *xfer; +}; + +static struct sspi_struct slave; +static struct platform_device slave_device; + +static irqreturn_t sspi_irq(int irq, void *arg) +{ + /* You only need to trigger on an edge to the active state of the + * SS signal. Once a edge is detected, the ss_cb() function should be + * called with the parameter assert set to true. It is OK + * (and even advised) to call the ss_cb() function in IRQ context in + * order not to add any delay. */ + + return IRQ_HANDLED; +} + +static void sspi_complete(void *context) +{ + /* Normally the DMA or the SPI framework will call you back + * in something similar to this. The only thing you need to + * do is to call the xfer_done_cb() function, providing the pointer + * to the CAIF SPI interface. It is OK to call this function + * from IRQ context. */ +} + +static int sspi_init_xfer(struct cfspi_xfer *xfer, struct cfspi_dev *dev) +{ + /* Store transfer info. For a normal implementation you should + * set up your DMA here and make sure that you are ready to + * receive the data from the master SPI. */ + + struct sspi_struct *sspi = (struct sspi_struct *)dev->priv; + + sspi->xfer = xfer; + + return 0; +} + +void sspi_sig_xfer(bool xfer, struct cfspi_dev *dev) +{ + /* If xfer is true then you should assert the SPI_INT to indicate to + * the master that you are ready to recieve the data from the master + * SPI. If xfer is false then you should de-assert SPI_INT to indicate + * that the transfer is done. + */ + + struct sspi_struct *sspi = (struct sspi_struct *)dev->priv; +} + +static void sspi_release(struct device *dev) +{ + /* + * Here you should release your SPI device resources. + */ +} + +static int __init sspi_init(void) +{ + /* Here you should initialize your SPI device by providing the + * necessary functions, clock speed, name and private data. Once + * done, you can register your device with the + * platform_device_register() function. This function will return + * with the CAIF SPI interface initialized. This is probably also + * the place where you should set up your GPIOs, interrupts and SPI + * resources. */ + + int res = 0; + + /* Initialize slave device. */ + slave.sdev.init_xfer = sspi_init_xfer; + slave.sdev.sig_xfer = sspi_sig_xfer; + slave.sdev.clk_mhz = 13; + slave.sdev.priv = &slave; + slave.sdev.name = "spi_sspi"; + slave_device.dev.release = sspi_release; + + /* Initialize platform device. */ + slave_device.name = "cfspi_sspi"; + slave_device.dev.platform_data = &slave.sdev; + + /* Register platform device. */ + res = platform_device_register(&slave_device); + if (res) { + printk(KERN_WARNING "sspi_init: failed to register dev.\n"); + return -ENODEV; + } + + return res; +} + +static void __exit sspi_exit(void) +{ + platform_device_del(&slave_device); +} + +module_init(sspi_init); +module_exit(sspi_exit); diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index d0536b5a4e0..f350c69b2bb 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -903,7 +903,7 @@ arp_ignore - INTEGER arp_notify - BOOLEAN Define mode for notification of address and device changes. 0 - (default): do nothing - 1 - Generate gratuitous arp replies when device is brought up + 1 - Generate gratuitous arp requests when device is brought up or hardware address changes. arp_accept - BOOLEAN diff --git a/Documentation/networking/packet_mmap.txt b/Documentation/networking/packet_mmap.txt index 98f71a5cef0..2546aa4dc23 100644 --- a/Documentation/networking/packet_mmap.txt +++ b/Documentation/networking/packet_mmap.txt @@ -493,6 +493,32 @@ The user can also use poll() to check if a buffer is available: pfd.events = POLLOUT; retval = poll(&pfd, 1, timeout); +------------------------------------------------------------------------------- ++ PACKET_TIMESTAMP +------------------------------------------------------------------------------- + +The PACKET_TIMESTAMP setting determines the source of the timestamp in +the packet meta information. If your NIC is capable of timestamping +packets in hardware, you can request those hardware timestamps to used. +Note: you may need to enable the generation of hardware timestamps with +SIOCSHWTSTAMP. + +PACKET_TIMESTAMP accepts the same integer bit field as +SO_TIMESTAMPING. However, only the SOF_TIMESTAMPING_SYS_HARDWARE +and SOF_TIMESTAMPING_RAW_HARDWARE values are recognized by +PACKET_TIMESTAMP. SOF_TIMESTAMPING_SYS_HARDWARE takes precedence over +SOF_TIMESTAMPING_RAW_HARDWARE if both bits are set. + + int req = 0; + req |= SOF_TIMESTAMPING_SYS_HARDWARE; + setsockopt(fd, SOL_PACKET, PACKET_TIMESTAMP, (void *) &req, sizeof(req)) + +If PACKET_TIMESTAMP is not set, a software timestamp generated inside +the networking stack is used (the behavior before this setting was added). + +See include/linux/net_tstamp.h and Documentation/networking/timestamping +for more information on hardware timestamps. + -------------------------------------------------------------------------------- + THANKS -------------------------------------------------------------------------------- diff --git a/Documentation/networking/pktgen.txt b/Documentation/networking/pktgen.txt index 61bb645d50e..75e4fd708cc 100644 --- a/Documentation/networking/pktgen.txt +++ b/Documentation/networking/pktgen.txt @@ -151,6 +151,8 @@ Examples: pgset stop aborts injection. Also, ^C aborts generator. + pgset "rate 300M" set rate to 300 Mb/s + pgset "ratep 1000000" set rate to 1Mpps Example scripts =============== @@ -241,6 +243,9 @@ src6 flows flowlen +rate +ratep + References: ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/ ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/examples/ diff --git a/Documentation/scsi/hpsa.txt b/Documentation/scsi/hpsa.txt new file mode 100644 index 00000000000..dca658362cb --- /dev/null +++ b/Documentation/scsi/hpsa.txt @@ -0,0 +1,107 @@ + +HPSA - Hewlett Packard Smart Array driver +----------------------------------------- + +This file describes the hpsa SCSI driver for HP Smart Array controllers. +The hpsa driver is intended to supplant the cciss driver for newer +Smart Array controllers. The hpsa driver is a SCSI driver, while the +cciss driver is a "block" driver. Actually cciss is both a block +driver (for logical drives) AND a SCSI driver (for tape drives). This +"split-brained" design of the cciss driver is a source of excess +complexity and eliminating that complexity is one of the reasons +for hpsa to exist. + +Supported devices: +------------------ + +Smart Array P212 +Smart Array P410 +Smart Array P410i +Smart Array P411 +Smart Array P812 +Smart Array P712m +Smart Array P711m +StorageWorks P1210m + +Additionally, older Smart Arrays may work with the hpsa driver if the kernel +boot parameter "hpsa_allow_any=1" is specified, however these are not tested +nor supported by HP with this driver. For older Smart Arrays, the cciss +driver should still be used. + +HPSA specific entries in /sys +----------------------------- + + In addition to the generic SCSI attributes available in /sys, hpsa supports + the following attributes: + + HPSA specific host attributes: + ------------------------------ + + /sys/class/scsi_host/host*/rescan + /sys/class/scsi_host/host*/firmware_revision + + the host "rescan" attribute is a write only attribute. Writing to this + attribute will cause the driver to scan for new, changed, or removed devices + (e.g. hot-plugged tape drives, or newly configured or deleted logical drives, + etc.) and notify the SCSI midlayer of any changes detected. Normally this is + triggered automatically by HP's Array Configuration Utility (either the GUI or + command line variety) so for logical drive changes, the user should not + normally have to use this. It may be useful when hot plugging devices like + tape drives, or entire storage boxes containing pre-configured logical drives. + + The "firmware_revision" attribute contains the firmware version of the Smart Array. + For example: + + root@host:/sys/class/scsi_host/host4# cat firmware_revision + 7.14 + + HPSA specific disk attributes: + ------------------------------ + + /sys/class/scsi_disk/c:b:t:l/device/unique_id + /sys/class/scsi_disk/c:b:t:l/device/raid_level + /sys/class/scsi_disk/c:b:t:l/device/lunid + + (where c:b:t:l are the controller, bus, target and lun of the device) + + For example: + + root@host:/sys/class/scsi_disk/4:0:0:0/device# cat unique_id + 600508B1001044395355323037570F77 + root@host:/sys/class/scsi_disk/4:0:0:0/device# cat lunid + 0x0000004000000000 + root@host:/sys/class/scsi_disk/4:0:0:0/device# cat raid_level + RAID 0 + +HPSA specific ioctls: +--------------------- + + For compatibility with applications written for the cciss driver, many, but + not all of the ioctls supported by the cciss driver are also supported by the + hpsa driver. The data structures used by these are described in + include/linux/cciss_ioctl.h + + CCISS_DEREGDISK + CCISS_REGNEWDISK + CCISS_REGNEWD + + The above three ioctls all do exactly the same thing, which is to cause the driver + to rescan for new devices. This does exactly the same thing as writing to the + hpsa specific host "rescan" attribute. + + CCISS_GETPCIINFO + + Returns PCI domain, bus, device and function and "board ID" (PCI subsystem ID). + + CCISS_GETDRIVVER + + Returns driver version in three bytes encoded as: + (major_version << 16) | (minor_version << 8) | (subminor_version) + + CCISS_PASSTHRU + CCISS_BIG_PASSTHRU + + Allows "BMIC" and "CISS" commands to be passed through to the Smart Array. + These are used extensively by the HP Array Configuration Utility, SNMP storage + agents, etc. See cciss_vol_status at http://cciss.sf.net for some examples. + diff --git a/Documentation/timers/Makefile b/Documentation/timers/Makefile index c85625f4ab2..73f75f8a87d 100644 --- a/Documentation/timers/Makefile +++ b/Documentation/timers/Makefile @@ -2,7 +2,7 @@ obj- := dummy.o # List of programs to build -hostprogs-y := hpet_example +hostprogs-$(CONFIG_X86) := hpet_example # Tell kbuild to always build the programs always := $(hostprogs-y) diff --git a/Documentation/tomoyo.txt b/Documentation/tomoyo.txt index b3a232cae7f..200a2d37cbc 100644 --- a/Documentation/tomoyo.txt +++ b/Documentation/tomoyo.txt @@ -3,8 +3,8 @@ TOMOYO is a name-based MAC extension (LSM module) for the Linux kernel. LiveCD-based tutorials are available at -http://tomoyo.sourceforge.jp/en/1.6.x/1st-step/ubuntu8.04-live/ -http://tomoyo.sourceforge.jp/en/1.6.x/1st-step/centos5-live/ . +http://tomoyo.sourceforge.jp/1.7/1st-step/ubuntu10.04-live/ +http://tomoyo.sourceforge.jp/1.7/1st-step/centos5-live/ . Though these tutorials use non-LSM version of TOMOYO, they are useful for you to know what TOMOYO is. @@ -13,12 +13,12 @@ to know what TOMOYO is. Build the kernel with CONFIG_SECURITY_TOMOYO=y and pass "security=tomoyo" on kernel's command line. -Please see http://tomoyo.sourceforge.jp/en/2.2.x/ for details. +Please see http://tomoyo.sourceforge.jp/2.3/ for details. --- Where is documentation? --- User <-> Kernel interface documentation is available at -http://tomoyo.sourceforge.jp/en/2.2.x/policy-reference.html . +http://tomoyo.sourceforge.jp/2.3/policy-reference.html . Materials we prepared for seminars and symposiums are available at http://sourceforge.jp/projects/tomoyo/docs/?category_id=532&language_id=1 . @@ -50,6 +50,6 @@ multiple LSM modules at the same time. We feel sorry that you have to give up SELinux/SMACK/AppArmor etc. when you want to use TOMOYO. We hope that LSM becomes stackable in future. Meanwhile, you can use non-LSM -version of TOMOYO, available at http://tomoyo.sourceforge.jp/en/1.6.x/ . +version of TOMOYO, available at http://tomoyo.sourceforge.jp/1.7/ . LSM version of TOMOYO is a subset of non-LSM version of TOMOYO. We are planning to port non-LSM version's functionalities to LSM versions. diff --git a/Documentation/video4linux/CARDLIST.cx23885 b/Documentation/video4linux/CARDLIST.cx23885 index 16ca030e118..87c46347bd6 100644 --- a/Documentation/video4linux/CARDLIST.cx23885 +++ b/Documentation/video4linux/CARDLIST.cx23885 @@ -17,9 +17,9 @@ 16 -> DVBWorld DVB-S2 2005 [0001:2005] 17 -> NetUP Dual DVB-S2 CI [1b55:2a2c] 18 -> Hauppauge WinTV-HVR1270 [0070:2211] - 19 -> Hauppauge WinTV-HVR1275 [0070:2215] - 20 -> Hauppauge WinTV-HVR1255 [0070:2251] - 21 -> Hauppauge WinTV-HVR1210 [0070:2291,0070:2295] + 19 -> Hauppauge WinTV-HVR1275 [0070:2215,0070:221d,0070:22f2] + 20 -> Hauppauge WinTV-HVR1255 [0070:2251,0070:2259,0070:22f1] + 21 -> Hauppauge WinTV-HVR1210 [0070:2291,0070:2295,0070:2299,0070:229d,0070:22f0,0070:22f3,0070:22f4,0070:22f5] 22 -> Mygica X8506 DMB-TH [14f1:8651] 23 -> Magic-Pro ProHDTV Extreme 2 [14f1:8657] 24 -> Hauppauge WinTV-HVR1850 [0070:8541] diff --git a/Documentation/video4linux/CARDLIST.em28xx b/Documentation/video4linux/CARDLIST.em28xx index 3a623aaeae5..5c568757c30 100644 --- a/Documentation/video4linux/CARDLIST.em28xx +++ b/Documentation/video4linux/CARDLIST.em28xx @@ -72,3 +72,4 @@ 73 -> Reddo DVB-C USB TV Box (em2870) 74 -> Actionmaster/LinXcel/Digitus VC211A (em2800) 75 -> Dikom DK300 (em2882) + 76 -> KWorld PlusTV 340U or UB435-Q (ATSC) (em2870) [1b80:a340] diff --git a/Documentation/video4linux/CARDLIST.saa7134 b/Documentation/video4linux/CARDLIST.saa7134 index 070f2576707..4000c29fcfb 100644 --- a/Documentation/video4linux/CARDLIST.saa7134 +++ b/Documentation/video4linux/CARDLIST.saa7134 @@ -176,5 +176,7 @@ 175 -> Leadtek Winfast DTV1000S [107d:6655] 176 -> Beholder BeholdTV 505 RDS [0000:5051] 177 -> Hawell HW-404M7 -179 -> Beholder BeholdTV H7 [5ace:7190] -180 -> Beholder BeholdTV A7 [5ace:7090] +178 -> Beholder BeholdTV H7 [5ace:7190] +179 -> Beholder BeholdTV A7 [5ace:7090] +180 -> Avermedia PCI M733A [1461:4155,1461:4255] +181 -> TechoTrend TT-budget T-3000 [13c2:2804] diff --git a/Documentation/video4linux/CARDLIST.tuner b/Documentation/video4linux/CARDLIST.tuner index 9b2e0dd6017..e67c1db9685 100644 --- a/Documentation/video4linux/CARDLIST.tuner +++ b/Documentation/video4linux/CARDLIST.tuner @@ -82,3 +82,4 @@ tuner=81 - Partsnic (Daewoo) PTI-5NF05 tuner=82 - Philips CU1216L tuner=83 - NXP TDA18271 tuner=84 - Sony BTF-Pxn01Z +tuner=85 - Philips FQ1236 MK5 diff --git a/Documentation/video4linux/gspca.txt b/Documentation/video4linux/gspca.txt index 8f3f5d33327..56ba7bba716 100644 --- a/Documentation/video4linux/gspca.txt +++ b/Documentation/video4linux/gspca.txt @@ -29,8 +29,12 @@ zc3xx 041e:4029 Creative WebCam Vista Pro zc3xx 041e:4034 Creative Instant P0620 zc3xx 041e:4035 Creative Instant P0620D zc3xx 041e:4036 Creative Live ! +sq930x 041e:4038 Creative Joy-IT zc3xx 041e:403a Creative Nx Pro 2 spca561 041e:403b Creative Webcam Vista (VF0010) +sq930x 041e:403c Creative Live! Ultra +sq930x 041e:403d Creative Live! Ultra for Notebooks +sq930x 041e:4041 Creative Live! Motion zc3xx 041e:4051 Creative Live!Cam Notebook Pro (VF0250) ov519 041e:4052 Creative Live! VISTA IM zc3xx 041e:4053 Creative Live!Cam Video IM @@ -138,6 +142,7 @@ finepix 04cb:013d Fujifilm FinePix unknown model finepix 04cb:013f Fujifilm FinePix F420 sunplus 04f1:1001 JVC GC A50 spca561 04fc:0561 Flexcam 100 +spca1528 04fc:1528 Sunplus MD80 clone sunplus 04fc:500c Sunplus CA500C sunplus 04fc:504a Aiptek Mini PenCam 1.3 sunplus 04fc:504b Maxell MaxPocket LE 1.3 @@ -253,6 +258,7 @@ pac7302 093a:2620 Apollo AC-905 pac7302 093a:2621 PAC731x pac7302 093a:2622 Genius Eye 312 pac7302 093a:2624 PAC7302 +pac7302 093a:2625 Genius iSlim 310 pac7302 093a:2626 Labtec 2200 pac7302 093a:2628 Genius iLook 300 pac7302 093a:2629 Genious iSlim 300 @@ -290,6 +296,7 @@ sonixb 0c45:602e Genius VideoCam Messenger sonixj 0c45:6040 Speed NVC 350K sonixj 0c45:607c Sonix sn9c102p Hv7131R sonixj 0c45:60c0 Sangha Sn535 +sonixj 0c45:60ce USB-PC-Camera-168 (TALK-5067) sonixj 0c45:60ec SN9C105+MO4000 sonixj 0c45:60fb Surfer NoName sonixj 0c45:60fc LG-LIC300 @@ -361,6 +368,8 @@ sq905c 2770:9052 Disney pix micro 2 (VGA) sq905c 2770:905c All 11 known cameras with this ID sq905 2770:9120 All 24 known cameras with this ID sq905c 2770:913d All 4 known cameras with this ID +sq930x 2770:930b Sweex Motion Tracking / I-Tec iCam Tracer +sq930x 2770:930c Trust WB-3500T / NSG Robbie 2.0 spca500 2899:012c Toptro Industrial ov519 8020:ef04 ov519 spca508 8086:0110 Intel Easy PC Camera diff --git a/Documentation/vm/numa b/Documentation/vm/numa index e93ad9425e2..a200a386429 100644 --- a/Documentation/vm/numa +++ b/Documentation/vm/numa @@ -1,41 +1,149 @@ Started Nov 1999 by Kanoj Sarcar <kanoj@sgi.com> -The intent of this file is to have an uptodate, running commentary -from different people about NUMA specific code in the Linux vm. - -What is NUMA? It is an architecture where the memory access times -for different regions of memory from a given processor varies -according to the "distance" of the memory region from the processor. -Each region of memory to which access times are the same from any -cpu, is called a node. On such architectures, it is beneficial if -the kernel tries to minimize inter node communications. Schemes -for this range from kernel text and read-only data replication -across nodes, and trying to house all the data structures that -key components of the kernel need on memory on that node. - -Currently, all the numa support is to provide efficient handling -of widely discontiguous physical memory, so architectures which -are not NUMA but can have huge holes in the physical address space -can use the same code. All this code is bracketed by CONFIG_DISCONTIGMEM. - -The initial port includes NUMAizing the bootmem allocator code by -encapsulating all the pieces of information into a bootmem_data_t -structure. Node specific calls have been added to the allocator. -In theory, any platform which uses the bootmem allocator should -be able to put the bootmem and mem_map data structures anywhere -it deems best. - -Each node's page allocation data structures have also been encapsulated -into a pg_data_t. The bootmem_data_t is just one part of this. To -make the code look uniform between NUMA and regular UMA platforms, -UMA platforms have a statically allocated pg_data_t too (contig_page_data). -For the sake of uniformity, the function num_online_nodes() is also defined -for all platforms. As we run benchmarks, we might decide to NUMAize -more variables like low_on_memory, nr_free_pages etc into the pg_data_t. - -The NUMA aware page allocation code currently tries to allocate pages -from different nodes in a round robin manner. This will be changed to -do concentratic circle search, starting from current node, once the -NUMA port achieves more maturity. The call alloc_pages_node has been -added, so that drivers can make the call and not worry about whether -it is running on a NUMA or UMA platform. +What is NUMA? + +This question can be answered from a couple of perspectives: the +hardware view and the Linux software view. + +From the hardware perspective, a NUMA system is a computer platform that +comprises multiple components or assemblies each of which may contain 0 +or more CPUs, local memory, and/or IO buses. For brevity and to +disambiguate the hardware view of these physical components/assemblies +from the software abstraction thereof, we'll call the components/assemblies +'cells' in this document. + +Each of the 'cells' may be viewed as an SMP [symmetric multi-processor] subset +of the system--although some components necessary for a stand-alone SMP system +may not be populated on any given cell. The cells of the NUMA system are +connected together with some sort of system interconnect--e.g., a crossbar or +point-to-point link are common types of NUMA system interconnects. Both of +these types of interconnects can be aggregated to create NUMA platforms with +cells at multiple distances from other cells. + +For Linux, the NUMA platforms of interest are primarily what is known as Cache +Coherent NUMA or ccNUMA systems. With ccNUMA systems, all memory is visible +to and accessible from any CPU attached to any cell and cache coherency +is handled in hardware by the processor caches and/or the system interconnect. + +Memory access time and effective memory bandwidth varies depending on how far +away the cell containing the CPU or IO bus making the memory access is from the +cell containing the target memory. For example, access to memory by CPUs +attached to the same cell will experience faster access times and higher +bandwidths than accesses to memory on other, remote cells. NUMA platforms +can have cells at multiple remote distances from any given cell. + +Platform vendors don't build NUMA systems just to make software developers' +lives interesting. Rather, this architecture is a means to provide scalable +memory bandwidth. However, to achieve scalable memory bandwidth, system and +application software must arrange for a large majority of the memory references +[cache misses] to be to "local" memory--memory on the same cell, if any--or +to the closest cell with memory. + +This leads to the Linux software view of a NUMA system: + +Linux divides the system's hardware resources into multiple software +abstractions called "nodes". Linux maps the nodes onto the physical cells +of the hardware platform, abstracting away some of the details for some +architectures. As with physical cells, software nodes may contain 0 or more +CPUs, memory and/or IO buses. And, again, memory accesses to memory on +"closer" nodes--nodes that map to closer cells--will generally experience +faster access times and higher effective bandwidth than accesses to more +remote cells. + +For some architectures, such as x86, Linux will "hide" any node representing a +physical cell that has no memory attached, and reassign any CPUs attached to +that cell to a node representing a cell that does have memory. Thus, on +these architectures, one cannot assume that all CPUs that Linux associates with +a given node will see the same local memory access times and bandwidth. + +In addition, for some architectures, again x86 is an example, Linux supports +the emulation of additional nodes. For NUMA emulation, linux will carve up +the existing nodes--or the system memory for non-NUMA platforms--into multiple +nodes. Each emulated node will manage a fraction of the underlying cells' +physical memory. NUMA emluation is useful for testing NUMA kernel and +application features on non-NUMA platforms, and as a sort of memory resource +management mechanism when used together with cpusets. +[see Documentation/cgroups/cpusets.txt] + +For each node with memory, Linux constructs an independent memory management +subsystem, complete with its own free page lists, in-use page lists, usage +statistics and locks to mediate access. In addition, Linux constructs for +each memory zone [one or more of DMA, DMA32, NORMAL, HIGH_MEMORY, MOVABLE], +an ordered "zonelist". A zonelist specifies the zones/nodes to visit when a +selected zone/node cannot satisfy the allocation request. This situation, +when a zone has no available memory to satisfy a request, is called +"overflow" or "fallback". + +Because some nodes contain multiple zones containing different types of +memory, Linux must decide whether to order the zonelists such that allocations +fall back to the same zone type on a different node, or to a different zone +type on the same node. This is an important consideration because some zones, +such as DMA or DMA32, represent relatively scarce resources. Linux chooses +a default zonelist order based on the sizes of the various zone types relative +to the total memory of the node and the total memory of the system. The +default zonelist order may be overridden using the numa_zonelist_order kernel +boot parameter or sysctl. [see Documentation/kernel-parameters.txt and +Documentation/sysctl/vm.txt] + +By default, Linux will attempt to satisfy memory allocation requests from the +node to which the CPU that executes the request is assigned. Specifically, +Linux will attempt to allocate from the first node in the appropriate zonelist +for the node where the request originates. This is called "local allocation." +If the "local" node cannot satisfy the request, the kernel will examine other +nodes' zones in the selected zonelist looking for the first zone in the list +that can satisfy the request. + +Local allocation will tend to keep subsequent access to the allocated memory +"local" to the underlying physical resources and off the system interconnect-- +as long as the task on whose behalf the kernel allocated some memory does not +later migrate away from that memory. The Linux scheduler is aware of the +NUMA topology of the platform--embodied in the "scheduling domains" data +structures [see Documentation/scheduler/sched-domains.txt]--and the scheduler +attempts to minimize task migration to distant scheduling domains. However, +the scheduler does not take a task's NUMA footprint into account directly. +Thus, under sufficient imbalance, tasks can migrate between nodes, remote +from their initial node and kernel data structures. + +System administrators and application designers can restrict a task's migration +to improve NUMA locality using various CPU affinity command line interfaces, +such as taskset(1) and numactl(1), and program interfaces such as +sched_setaffinity(2). Further, one can modify the kernel's default local +allocation behavior using Linux NUMA memory policy. +[see Documentation/vm/numa_memory_policy.] + +System administrators can restrict the CPUs and nodes' memories that a non- +privileged user can specify in the scheduling or NUMA commands and functions +using control groups and CPUsets. [see Documentation/cgroups/CPUsets.txt] + +On architectures that do not hide memoryless nodes, Linux will include only +zones [nodes] with memory in the zonelists. This means that for a memoryless +node the "local memory node"--the node of the first zone in CPU's node's +zonelist--will not be the node itself. Rather, it will be the node that the +kernel selected as the nearest node with memory when it built the zonelists. +So, default, local allocations will succeed with the kernel supplying the +closest available memory. This is a consequence of the same mechanism that +allows such allocations to fallback to other nearby nodes when a node that +does contain memory overflows. + +Some kernel allocations do not want or cannot tolerate this allocation fallback +behavior. Rather they want to be sure they get memory from the specified node +or get notified that the node has no free memory. This is usually the case when +a subsystem allocates per CPU memory resources, for example. + +A typical model for making such an allocation is to obtain the node id of the +node to which the "current CPU" is attached using one of the kernel's +numa_node_id() or CPU_to_node() functions and then request memory from only +the node id returned. When such an allocation fails, the requesting subsystem +may revert to its own fallback path. The slab kernel memory allocator is an +example of this. Or, the subsystem may choose to disable or not to enable +itself on allocation failure. The kernel profiling subsystem is an example of +this. + +If the architecture supports--does not hide--memoryless nodes, then CPUs +attached to memoryless nodes would always incur the fallback path overhead +or some subsystems would fail to initialize if they attempted to allocated +memory exclusively from a node without memory. To support such +architectures transparently, kernel subsystems can use the numa_mem_id() +or cpu_to_mem() function to locate the "local memory node" for the calling or +specified CPU. Again, this is the same node from which default, local page +allocations will be attempted. diff --git a/Documentation/watchdog/watchdog-parameters.txt b/Documentation/watchdog/watchdog-parameters.txt index 41c95cc1dc1..17ddd822b45 100644 --- a/Documentation/watchdog/watchdog-parameters.txt +++ b/Documentation/watchdog/watchdog-parameters.txt @@ -125,6 +125,11 @@ ibmasr: nowayout: Watchdog cannot be stopped once started (default=kernel config parameter) ------------------------------------------------- +imx2_wdt: +timeout: Watchdog timeout in seconds (default 60 s) +nowayout: Watchdog cannot be stopped once started + (default=kernel config parameter) +------------------------------------------------- indydog: nowayout: Watchdog cannot be stopped once started (default=kernel config parameter) |