summaryrefslogtreecommitdiffstats
path: root/Documentation/sysctl
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/sysctl')
-rw-r--r--Documentation/sysctl/kernel.txt21
-rw-r--r--Documentation/sysctl/vm.txt26
2 files changed, 37 insertions, 10 deletions
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 9886c3d57fc..708bb7f1b7e 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -77,6 +77,7 @@ show up in /proc/sys/kernel:
- shmmni
- stop-a [ SPARC only ]
- sysrq ==> Documentation/sysrq.txt
+- sysctl_writes_strict
- tainted
- threads-max
- unknown_nmi_panic
@@ -762,6 +763,26 @@ without users and with a dead originative process will be destroyed.
==============================================================
+sysctl_writes_strict:
+
+Control how file position affects the behavior of updating sysctl values
+via the /proc/sys interface:
+
+ -1 - Legacy per-write sysctl value handling, with no printk warnings.
+ Each write syscall must fully contain the sysctl value to be
+ written, and multiple writes on the same sysctl file descriptor
+ will rewrite the sysctl value, regardless of file position.
+ 0 - (default) Same behavior as above, but warn about processes that
+ perform writes to a sysctl file descriptor when the file position
+ is not 0.
+ 1 - Respect file position when writing sysctl strings. Multiple writes
+ will append to the sysctl value buffer. Anything past the max length
+ of the sysctl value buffer will be ignored. Writes to numeric sysctl
+ entries must always be at file position 0 and the value must be
+ fully contained in the buffer sent in the write syscall.
+
+==============================================================
+
tainted:
Non-zero if the kernel has been tainted. Numeric values, which
diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index dd9d0e33b44..bd4b34c0373 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -746,8 +746,8 @@ Changing this takes effect whenever an application requests memory.
vfs_cache_pressure
------------------
-Controls the tendency of the kernel to reclaim the memory which is used for
-caching of directory and inode objects.
+This percentage value controls the tendency of the kernel to reclaim
+the memory which is used for caching of directory and inode objects.
At the default value of vfs_cache_pressure=100 the kernel will attempt to
reclaim dentries and inodes at a "fair" rate with respect to pagecache and
@@ -757,6 +757,11 @@ never reclaim dentries and inodes due to memory pressure and this can easily
lead to out-of-memory conditions. Increasing vfs_cache_pressure beyond 100
causes the kernel to prefer to reclaim dentries and inodes.
+Increasing vfs_cache_pressure significantly beyond 100 may have negative
+performance impact. Reclaim code needs to take various locks to find freeable
+directory and inode objects. With vfs_cache_pressure=1000, it will look for
+ten times more freeable objects than there are.
+
==============================================================
zone_reclaim_mode:
@@ -772,16 +777,17 @@ This is value ORed together of
2 = Zone reclaim writes dirty pages out
4 = Zone reclaim swaps pages
-zone_reclaim_mode is set during bootup to 1 if it is determined that pages
-from remote zones will cause a measurable performance reduction. The
-page allocator will then reclaim easily reusable pages (those page
-cache pages that are currently not used) before allocating off node pages.
-
-It may be beneficial to switch off zone reclaim if the system is
-used for a file server and all of memory should be used for caching files
-from disk. In that case the caching effect is more important than
+zone_reclaim_mode is disabled by default. For file servers or workloads
+that benefit from having their data cached, zone_reclaim_mode should be
+left disabled as the caching effect is likely to be more important than
data locality.
+zone_reclaim may be enabled if it's known that the workload is partitioned
+such that each partition fits within a NUMA node and that accessing remote
+memory would cause a measurable performance reduction. The page allocator
+will then reclaim easily reusable pages (those page cache pages that are
+currently not used) before allocating off node pages.
+
Allowing zone reclaim to write out pages stops processes that are
writing large amounts of data from dirtying pages on other nodes. Zone
reclaim will write out dirty pages if a zone fills up and so effectively