summaryrefslogtreecommitdiffstats
path: root/Documentation/filesystems
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/filesystems')
-rw-r--r--Documentation/filesystems/gfs2-uevents.txt100
-rw-r--r--Documentation/filesystems/nfs.txt98
-rw-r--r--Documentation/filesystems/seq_file.txt2
3 files changed, 199 insertions, 1 deletions
diff --git a/Documentation/filesystems/gfs2-uevents.txt b/Documentation/filesystems/gfs2-uevents.txt
new file mode 100644
index 00000000000..fd966dc9979
--- /dev/null
+++ b/Documentation/filesystems/gfs2-uevents.txt
@@ -0,0 +1,100 @@
+ uevents and GFS2
+ ==================
+
+During the lifetime of a GFS2 mount, a number of uevents are generated.
+This document explains what the events are and what they are used
+for (by gfs_controld in gfs2-utils).
+
+A list of GFS2 uevents
+-----------------------
+
+1. ADD
+
+The ADD event occurs at mount time. It will always be the first
+uevent generated by the newly created filesystem. If the mount
+is successful, an ONLINE uevent will follow. If it is not successful
+then a REMOVE uevent will follow.
+
+The ADD uevent has two environment variables: SPECTATOR=[0|1]
+and RDONLY=[0|1] that specify the spectator status (a read-only mount
+with no journal assigned), and read-only (with journal assigned) status
+of the filesystem respectively.
+
+2. ONLINE
+
+The ONLINE uevent is generated after a successful mount or remount. It
+has the same environment variables as the ADD uevent. The ONLINE
+uevent, along with the two environment variables for spectator and
+RDONLY are a relatively recent addition (2.6.32-rc+) and will not
+be generated by older kernels.
+
+3. CHANGE
+
+The CHANGE uevent is used in two places. One is when reporting the
+successful mount of the filesystem by the first node (FIRSTMOUNT=Done).
+This is used as a signal by gfs_controld that it is then ok for other
+nodes in the cluster to mount the filesystem.
+
+The other CHANGE uevent is used to inform of the completion
+of journal recovery for one of the filesystems journals. It has
+two environment variables, JID= which specifies the journal id which
+has just been recovered, and RECOVERY=[Done|Failed] to indicate the
+success (or otherwise) of the operation. These uevents are generated
+for every journal recovered, whether it is during the initial mount
+process or as the result of gfs_controld requesting a specific journal
+recovery via the /sys/fs/gfs2/<fsname>/lock_module/recovery file.
+
+Because the CHANGE uevent was used (in early versions of gfs_controld)
+without checking the environment variables to discover the state, we
+cannot add any more functions to it without running the risk of
+someone using an older version of the user tools and breaking their
+cluster. For this reason the ONLINE uevent was used when adding a new
+uevent for a successful mount or remount.
+
+4. OFFLINE
+
+The OFFLINE uevent is only generated due to filesystem errors and is used
+as part of the "withdraw" mechanism. Currently this doesn't give any
+information about what the error is, which is something that needs to
+be fixed.
+
+5. REMOVE
+
+The REMOVE uevent is generated at the end of an unsuccessful mount
+or at the end of a umount of the filesystem. All REMOVE uevents will
+have been preceeded by at least an ADD uevent for the same fileystem,
+and unlike the other uevents is generated automatically by the kernel's
+kobject subsystem.
+
+
+Information common to all GFS2 uevents (uevent environment variables)
+----------------------------------------------------------------------
+
+1. LOCKTABLE=
+
+The LOCKTABLE is a string, as supplied on the mount command
+line (locktable=) or via fstab. It is used as a filesystem label
+as well as providing the information for a lock_dlm mount to be
+able to join the cluster.
+
+2. LOCKPROTO=
+
+The LOCKPROTO is a string, and its value depends on what is set
+on the mount command line, or via fstab. It will be either
+lock_nolock or lock_dlm. In the future other lock managers
+may be supported.
+
+3. JOURNALID=
+
+If a journal is in use by the filesystem (journals are not
+assigned for spectator mounts) then this will give the
+numeric journal id in all GFS2 uevents.
+
+4. UUID=
+
+With recent versions of gfs2-utils, mkfs.gfs2 writes a UUID
+into the filesystem superblock. If it exists, this will
+be included in every uevent relating to the filesystem.
+
+
+
diff --git a/Documentation/filesystems/nfs.txt b/Documentation/filesystems/nfs.txt
new file mode 100644
index 00000000000..f50f26ce6cd
--- /dev/null
+++ b/Documentation/filesystems/nfs.txt
@@ -0,0 +1,98 @@
+
+The NFS client
+==============
+
+The NFS version 2 protocol was first documented in RFC1094 (March 1989).
+Since then two more major releases of NFS have been published, with NFSv3
+being documented in RFC1813 (June 1995), and NFSv4 in RFC3530 (April
+2003).
+
+The Linux NFS client currently supports all the above published versions,
+and work is in progress on adding support for minor version 1 of the NFSv4
+protocol.
+
+The purpose of this document is to provide information on some of the
+upcall interfaces that are used in order to provide the NFS client with
+some of the information that it requires in order to fully comply with
+the NFS spec.
+
+The DNS resolver
+================
+
+NFSv4 allows for one server to refer the NFS client to data that has been
+migrated onto another server by means of the special "fs_locations"
+attribute. See
+ http://tools.ietf.org/html/rfc3530#section-6
+and
+ http://tools.ietf.org/html/draft-ietf-nfsv4-referrals-00
+
+The fs_locations information can take the form of either an ip address and
+a path, or a DNS hostname and a path. The latter requires the NFS client to
+do a DNS lookup in order to mount the new volume, and hence the need for an
+upcall to allow userland to provide this service.
+
+Assuming that the user has the 'rpc_pipefs' filesystem mounted in the usual
+/var/lib/nfs/rpc_pipefs, the upcall consists of the following steps:
+
+ (1) The process checks the dns_resolve cache to see if it contains a
+ valid entry. If so, it returns that entry and exits.
+
+ (2) If no valid entry exists, the helper script '/sbin/nfs_cache_getent'
+ (may be changed using the 'nfs.cache_getent' kernel boot parameter)
+ is run, with two arguments:
+ - the cache name, "dns_resolve"
+ - the hostname to resolve
+
+ (3) After looking up the corresponding ip address, the helper script
+ writes the result into the rpc_pipefs pseudo-file
+ '/var/lib/nfs/rpc_pipefs/cache/dns_resolve/channel'
+ in the following (text) format:
+
+ "<ip address> <hostname> <ttl>\n"
+
+ Where <ip address> is in the usual IPv4 (123.456.78.90) or IPv6
+ (ffee:ddcc:bbaa:9988:7766:5544:3322:1100, ffee::1100, ...) format.
+ <hostname> is identical to the second argument of the helper
+ script, and <ttl> is the 'time to live' of this cache entry (in
+ units of seconds).
+
+ Note: If <ip address> is invalid, say the string "0", then a negative
+ entry is created, which will cause the kernel to treat the hostname
+ as having no valid DNS translation.
+
+
+
+
+A basic sample /sbin/nfs_cache_getent
+=====================================
+
+#!/bin/bash
+#
+ttl=600
+#
+cut=/usr/bin/cut
+getent=/usr/bin/getent
+rpc_pipefs=/var/lib/nfs/rpc_pipefs
+#
+die()
+{
+ echo "Usage: $0 cache_name entry_name"
+ exit 1
+}
+
+[ $# -lt 2 ] && die
+cachename="$1"
+cache_path=${rpc_pipefs}/cache/${cachename}/channel
+
+case "${cachename}" in
+ dns_resolve)
+ name="$2"
+ result="$(${getent} hosts ${name} | ${cut} -f1 -d\ )"
+ [ -z "${result}" ] && result="0"
+ ;;
+ *)
+ die
+ ;;
+esac
+echo "${result} ${name} ${ttl}" >${cache_path}
+
diff --git a/Documentation/filesystems/seq_file.txt b/Documentation/filesystems/seq_file.txt
index b843743aa0b..0d15ebccf5b 100644
--- a/Documentation/filesystems/seq_file.txt
+++ b/Documentation/filesystems/seq_file.txt
@@ -46,7 +46,7 @@ better to do. The file is seekable, in that one can do something like the
following:
dd if=/proc/sequence of=out1 count=1
- dd if=/proc/sequence skip=1 out=out2 count=1
+ dd if=/proc/sequence skip=1 of=out2 count=1
Then concatenate the output files out1 and out2 and get the right
result. Yes, it is a thoroughly useless module, but the point is to show