diff options
Diffstat (limited to 'Documentation/fault-injection/fault-injection.txt')
-rw-r--r-- | Documentation/fault-injection/fault-injection.txt | 225 |
1 files changed, 225 insertions, 0 deletions
diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt new file mode 100644 index 00000000000..b7ca560b934 --- /dev/null +++ b/Documentation/fault-injection/fault-injection.txt @@ -0,0 +1,225 @@ +Fault injection capabilities infrastructure +=========================================== + +See also drivers/md/faulty.c and "every_nth" module option for scsi_debug. + + +Available fault injection capabilities +-------------------------------------- + +o failslab + + injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...) + +o fail_page_alloc + + injects page allocation failures. (alloc_pages(), get_free_pages(), ...) + +o fail_make_request + + injects disk IO errors on devices permitted by setting + /sys/block/<device>/make-it-fail or + /sys/block/<device>/<partition>/make-it-fail. (generic_make_request()) + +Configure fault-injection capabilities behavior +----------------------------------------------- + +o debugfs entries + +fault-inject-debugfs kernel module provides some debugfs entries for runtime +configuration of fault-injection capabilities. + +- /debug/fail*/probability: + + likelihood of failure injection, in percent. + Format: <percent> + + Note that one-failure-per-hundred is a very high error rate + for some testcases. Consider setting probability=100 and configure + /debug/fail*/interval for such testcases. + +- /debug/fail*/interval: + + specifies the interval between failures, for calls to + should_fail() that pass all the other tests. + + Note that if you enable this, by setting interval>1, you will + probably want to set probability=100. + +- /debug/fail*/times: + + specifies how many times failures may happen at most. + A value of -1 means "no limit". + +- /debug/fail*/space: + + specifies an initial resource "budget", decremented by "size" + on each call to should_fail(,size). Failure injection is + suppressed until "space" reaches zero. + +- /debug/fail*/verbose + + Format: { 0 | 1 | 2 } + specifies the verbosity of the messages when failure is + injected. '0' means no messages; '1' will print only a single + log line per failure; '2' will print a call trace too -- useful + to debug the problems revealed by fault injection. + +- /debug/fail*/task-filter: + + Format: { 'Y' | 'N' } + A value of 'N' disables filtering by process (default). + Any positive value limits failures to only processes indicated by + /proc/<pid>/make-it-fail==1. + +- /debug/fail*/require-start: +- /debug/fail*/require-end: +- /debug/fail*/reject-start: +- /debug/fail*/reject-end: + + specifies the range of virtual addresses tested during + stacktrace walking. Failure is injected only if some caller + in the walked stacktrace lies within the required range, and + none lies within the rejected range. + Default required range is [0,ULONG_MAX) (whole of virtual address space). + Default rejected range is [0,0). + +- /debug/fail*/stacktrace-depth: + + specifies the maximum stacktrace depth walked during search + for a caller within [require-start,require-end) OR + [reject-start,reject-end). + +- /debug/fail_page_alloc/ignore-gfp-highmem: + + Format: { 'Y' | 'N' } + default is 'N', setting it to 'Y' won't inject failures into + highmem/user allocations. + +- /debug/failslab/ignore-gfp-wait: +- /debug/fail_page_alloc/ignore-gfp-wait: + + Format: { 'Y' | 'N' } + default is 'N', setting it to 'Y' will inject failures + only into non-sleep allocations (GFP_ATOMIC allocations). + +o Boot option + +In order to inject faults while debugfs is not available (early boot time), +use the boot option: + + failslab= + fail_page_alloc= + fail_make_request=<interval>,<probability>,<space>,<times> + +How to add new fault injection capability +----------------------------------------- + +o #include <linux/fault-inject.h> + +o define the fault attributes + + DECLARE_FAULT_INJECTION(name); + + Please see the definition of struct fault_attr in fault-inject.h + for details. + +o provide a way to configure fault attributes + +- boot option + + If you need to enable the fault injection capability from boot time, you can + provide boot option to configure it. There is a helper function for it: + + setup_fault_attr(attr, str); + +- debugfs entries + + failslab, fail_page_alloc, and fail_make_request use this way. + Helper functions: + + init_fault_attr_entries(entries, attr, name); + void cleanup_fault_attr_entries(entries); + +- module parameters + + If the scope of the fault injection capability is limited to a + single kernel module, it is better to provide module parameters to + configure the fault attributes. + +o add a hook to insert failures + + Upon should_fail() returning true, client code should inject a failure. + + should_fail(attr, size); + +Application Examples +-------------------- + +o inject slab allocation failures into module init/cleanup code + +------------------------------------------------------------------------------ +#!/bin/bash + +FAILCMD=Documentation/fault-injection/failcmd.sh +BLACKLIST="root_plug evbug" + +FAILNAME=failslab +echo Y > /debug/$FAILNAME/task-filter +echo 10 > /debug/$FAILNAME/probability +echo 100 > /debug/$FAILNAME/interval +echo -1 > /debug/$FAILNAME/times +echo 2 > /debug/$FAILNAME/verbose +echo 1 > /debug/$FAILNAME/ignore-gfp-wait + +blacklist() +{ + echo $BLACKLIST | grep $1 > /dev/null 2>&1 +} + +oops() +{ + dmesg | grep BUG > /dev/null 2>&1 +} + +find /lib/modules/`uname -r` -name '*.ko' -exec basename {} .ko \; | + while read i + do + oops && exit 1 + + if ! blacklist $i + then + echo inserting $i... + bash $FAILCMD modprobe $i + fi + done + +lsmod | awk '{ if ($3 == 0) { print $1 } }' | + while read i + do + oops && exit 1 + + if ! blacklist $i + then + echo removing $i... + bash $FAILCMD modprobe -r $i + fi + done + +------------------------------------------------------------------------------ + +o inject slab allocation failures only for a specific module + +------------------------------------------------------------------------------ +#!/bin/bash + +FAILMOD=Documentation/fault-injection/failmodule.sh + +echo injecting errors into the module $1... + +modprobe $1 +bash $FAILMOD failslab $1 10 +echo 25 > /debug/failslab/probability + +------------------------------------------------------------------------------ + |