Task Control Groups: basic task cgroup framework

Generic Process Control Groups -------------------------- There have recently been various proposals floating around for resource management/accounting and other task grouping subsystems in the kernel, including ResGroups, User BeanCounters, NSProxy cgroups, and others. These all need the basic abstraction of being able to group together multiple processes in an aggregate, in order to track/limit the resources permitted to those processes, or control other behaviour of the processes, and all implement this grouping in different ways. This patchset provides a framework for tracking and grouping processes into arbitrary "cgroups" and assigning arbitrary state to those groupings, in order to control the behaviour of the cgroup as an aggregate. The intention is that the various resource management and virtualization/cgroup efforts can also become task cgroup clients, with the result that: - the userspace APIs are (somewhat) normalised - it's easier to test e.g. the ResGroups CPU controller in conjunction with the BeanCounters memory controller, or use either of them as the resource-control portion of a virtual server system. - the additional kernel footprint of any of the competing resource management systems is substantially reduced, since it doesn't need to provide process grouping/containment, hence improving their chances of getting into the kernel This patch: Add the main task cgroups framework - the cgroup filesystem, and the basic structures for tracking membership and associating subsystem state objects to tasks. Signed-off-by: Paul Menage <menage@google.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Paul Jackson <pj@sgi.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
author: Paul Menage <menage@google.com> 2007-10-18 23:39:30 -0700
committer: Linus Torvalds <torvalds@woody.linux-foundation.org> 2007-10-19 11:53:36 -0700
commit: ddbcc7e8e50aefe467c01cac3dec71f118cd8ac2 (patch)
tree: 0881a031e669582f819d572339e955b04abfc3d2 /include/linux/sched.h
parent: 55a230aae650157720becc09cadb7d10efbf5013 (diff)
1 files changed, 33 insertions, 1 deletions
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 10a83d8d577..af2ed4bae67 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -894,6 +894,34 @@ struct sched_entity {
 #endif
 };
 
+#ifdef CONFIG_CGROUPS
+
+#define SUBSYS(_x) _x ## _subsys_id,
+enum cgroup_subsys_id {
+#include <linux/cgroup_subsys.h>
+	CGROUP_SUBSYS_COUNT
+};
+#undef SUBSYS
+
+/* A css_set is a structure holding pointers to a set of
+ * cgroup_subsys_state objects.
+ */
+
+struct css_set {
+
+	/* Set of subsystem states, one for each subsystem. NULL for
+	 * subsystems that aren't part of this hierarchy. These
+	 * pointers reduce the number of dereferences required to get
+	 * from a task to its state for a given cgroup, but result
+	 * in increased space usage if tasks are in wildly different
+	 * groupings across different hierarchies. This array is
+	 * immutable after creation */
+	struct cgroup_subsys_state *subsys[CGROUP_SUBSYS_COUNT];
+
+};
+
+#endif /* CONFIG_CGROUPS */
+
 struct task_struct {
 	volatile long state;	/* -1 unrunnable, 0 runnable, >0 stopped */
 	void *stack;
@@ -1130,6 +1158,9 @@ struct task_struct {
 	int cpuset_mems_generation;
 	int cpuset_mem_spread_rotor;
 #endif
+#ifdef CONFIG_CGROUPS
+	struct css_set cgroups;
+#endif
 #ifdef CONFIG_FUTEX
 	struct robust_list_head __user *robust_list;
 #ifdef CONFIG_COMPAT
@@ -1625,7 +1656,8 @@ static inline int thread_group_empty(struct task_struct *p)
 /*
  * Protects ->fs, ->files, ->mm, ->group_info, ->comm, keyring
  * subscriptions and synchronises with wait4().  Also used in procfs.  Also
- * pins the final release of task.io_context.  Also protects ->cpuset.
+ * pins the final release of task.io_context.  Also protects ->cpuset and
+ * ->cgroup.subsys[].
  *
  * Nests both inside and outside of read_lock(&tasklist_lock).
  * It must not be nested with write_lock_irq(&tasklist_lock),
author	Paul Menage <menage@google.com>	2007-10-18 23:39:30 -0700
committer	Linus Torvalds <torvalds@woody.linux-foundation.org>	2007-10-19 11:53:36 -0700
commit	ddbcc7e8e50aefe467c01cac3dec71f118cd8ac2 (patch)
tree	0881a031e669582f819d572339e955b04abfc3d2 /include/linux/sched.h
parent	55a230aae650157720becc09cadb7d10efbf5013 (diff)