summaryrefslogtreecommitdiffstats
path: root/Documentation/slow-work.txt
blob: a9d1b0ffdded2f379908cfeb74a971f5f7ad2981 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
		     ====================================
		     SLOW WORK ITEM EXECUTION THREAD POOL
		     ====================================

By: David Howells <dhowells@redhat.com>

The slow work item execution thread pool is a pool of threads for performing
things that take a relatively long time, such as making mkdir calls.
Typically, when processing something, these items will spend a lot of time
blocking a thread on I/O, thus making that thread unavailable for doing other
work.

The standard workqueue model is unsuitable for this class of work item as that
limits the owner to a single thread or a single thread per CPU.  For some
tasks, however, more threads - or fewer - are required.

There is just one pool per system.  It contains no threads unless something
wants to use it - and that something must register its interest first.  When
the pool is active, the number of threads it contains is dynamic, varying
between a maximum and minimum setting, depending on the load.


====================
CLASSES OF WORK ITEM
====================

This pool support two classes of work items:

 (*) Slow work items.

 (*) Very slow work items.

The former are expected to finish much quicker than the latter.

An operation of the very slow class may do a batch combination of several
lookups, mkdirs, and a create for instance.

An operation of the ordinarily slow class may, for example, write stuff or
expand files, provided the time taken to do so isn't too long.

Operations of both types may sleep during execution, thus tying up the thread
loaned to it.

A further class of work item is available, based on the slow work item class:

 (*) Delayed slow work items.

These are slow work items that have a timer to defer queueing of the item for
a while.


THREAD-TO-CLASS ALLOCATION
--------------------------

Not all the threads in the pool are available to work on very slow work items.
The number will be between one and one fewer than the number of active threads.
This is configurable (see the "Pool Configuration" section).

All the threads are available to work on ordinarily slow work items, but a
percentage of the threads will prefer to work on very slow work items.

The configuration ensures that at least one thread will be available to work on
very slow work items, and at least one thread will be available that won't work
on very slow work items at all.


=====================
USING SLOW WORK ITEMS
=====================

Firstly, a module or subsystem wanting to make use of slow work items must
register its interest:

	 int ret = slow_work_register_user(struct module *module);

This will return 0 if successful, or a -ve error upon failure.  The module
pointer should be the module interested in using this facility (almost
certainly THIS_MODULE).


Slow work items may then be set up by:

 (1) Declaring a slow_work struct type variable:

	#include <linux/slow-work.h>

	struct slow_work myitem;

 (2) Declaring the operations to be used for this item:

	struct slow_work_ops myitem_ops = {
		.get_ref = myitem_get_ref,
		.put_ref = myitem_put_ref,
		.execute = myitem_execute,
	};

     [*] For a description of the ops, see section "Item Operations".

 (3) Initialising the item:

	slow_work_init(&myitem, &myitem_ops);

     or:

	delayed_slow_work_init(&myitem, &myitem_ops);

     or:

	vslow_work_init(&myitem, &myitem_ops);

     depending on its class.

A suitably set up work item can then be enqueued for processing:

	int ret = slow_work_enqueue(&myitem);

This will return a -ve error if the thread pool is unable to gain a reference
on the item, 0 otherwise, or (for delayed work):

	int ret = delayed_slow_work_enqueue(&myitem, my_jiffy_delay);


The items are reference counted, so there ought to be no need for a flush
operation.  But as the reference counting is optional, means to cancel
existing work items are also included:

	cancel_slow_work(&myitem);
	cancel_delayed_slow_work(&myitem);

can be used to cancel pending work.  The above cancel function waits for
existing work to have been executed (or prevent execution of them, depending
on timing).


When all a module's slow work items have been processed, and the
module has no further interest in the facility, it should unregister its
interest:

	slow_work_unregister_user(struct module *module);

The module pointer is used to wait for all outstanding work items for that
module before completing the unregistration.  This prevents the put_ref() code
from being taken away before it completes.  module should almost certainly be
THIS_MODULE.


===============
ITEM OPERATIONS
===============

Each work item requires a table of operations of type struct slow_work_ops.
Only ->execute() is required, getting and putting of a reference are optional.

 (*) Get a reference on an item:

	int (*get_ref)(struct slow_work *work);

     This allows the thread pool to attempt to pin an item by getting a
     reference on it.  This function should return 0 if the reference was
     granted, or a -ve error otherwise.  If an error is returned,
     slow_work_enqueue() will fail.

     The reference is held whilst the item is queued and whilst it is being
     executed.  The item may then be requeued with the same reference held, or
     the reference will be released.

 (*) Release a reference on an item:

	void (*put_ref)(struct slow_work *work);

     This allows the thread pool to unpin an item by releasing the reference on
     it.  The thread pool will not touch the item again once this has been
     called.

 (*) Execute an item:

	void (*execute)(struct slow_work *work);

     This should perform the work required of the item.  It may sleep, it may
     perform disk I/O and it may wait for locks.


==================
POOL CONFIGURATION
==================

The slow-work thread pool has a number of configurables:

 (*) /proc/sys/kernel/slow-work/min-threads

     The minimum number of threads that should be in the pool whilst it is in
     use.  This may be anywhere between 2 and max-threads.

 (*) /proc/sys/kernel/slow-work/max-threads

     The maximum number of threads that should in the pool.  This may be
     anywhere between min-threads and 255 or NR_CPUS * 2, whichever is greater.

 (*) /proc/sys/kernel/slow-work/vslow-percentage

     The percentage of active threads in the pool that may be used to execute
     very slow work items.  This may be between 1 and 99.  The resultant number
     is bounded to between 1 and one fewer than the number of active threads.
     This ensures there is always at least one thread that can process very
     slow work items, and always at least one thread that won't.